﻿

This article is part of a set of four; the common factor is Calculus of Variations. In classical physics Calculus of Variations is applied in three areas: Optics, Statics, and Dynamics. Each article in the set is written as a standalone article, resulting in some degree of overlap.
The other three articles:

- Optics: Fermat's stationary time
- Statics: The catenary
- Dynamics: The Energy-Position Equation

# Overview of the basics of Calculus of Variations, as applied in physics

## 1   Motivating example

Image credit: Susan Schwartzenberg - Exploratorium

As a motivating example for this demonstration of how Calculus of Variations works we will take the case of a soap film that stretches between two parallel rings, as shown in the image.

The shape of the soap film is a minimal surface. The equation for that shape will be derived using a form of differential calculus.

The strategy:
The surface from ring to ring is a surface of revolution, so we can treat the problem as a two-dimensional problem. Approximate the continuous surface in the form of a stack of truncated cones. In the limit of making those truncated cones infinitesimally thin the resultant surface converges onto the solution to the problem.

Section 2 works towards graphlet 2.5, an interactive diagram for the purpose of demonstrating what it takes to solve the minimal surface problem.
Section 3 opens by presenting an economical way of constructing a general differential equation for solving variational problems (section 3.1). Section 3.2 discusses the derivation of the Euler-Lagrange equation that the majority of sources offer.
Section 4 presents a problem that is closely related to the soap film problem: the catenary problem. The catenary problem and the soap film problem are siblings; they have the same solution.
Section 5 gives the key points from the full discussion of Hamilton's stationary action that is available in an article of its own.

## 2   Simplest case: two truncated cones

In preparation for later use:

Picture 2.1 Image

Image credit: Elaine Dawe - Quora

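The lateral area of a truncated cone with end radii $r_1$ and $r_2$ and height $h$ (measured along the axis) is given by the following expression:

$$A = \pi\,(r_1 + r_2)\,\sqrt{(r_1 - r_2)^2 + h^2} \tag{2.1}$$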

We start with exploration of the simplest case that is an instance of the type of problem that we want to solve:

Picture 2.2 Graphlet

The surface of a cone is a surface of revolution. So we can use an xy-coordinate system, with the y-coordinate representing the radial distance. Graphlet 2.2 shows 2 cones, projected onto the xy-plane. The three numbers underneath display the following:
- area of the cone marked '1'
- total area of the two conical surfaces
- area of the cone marked '2'

Moving the slider changes the circumference of the circle where the two cones are adjoined. The value displayed in the slider knob is the radius of that circle.

For each cone: the area is a function of the radius in two ways:
- The area is proportional to the circumference
- The area is proportional to the width.

About the width: the steeper the slope, the larger the width. That is: the width is a function of the derivative of the curve.
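The two-cone case can be reproduced numerically. The following is a minimal sketch, not the code behind the graphlet: the outer radii, the cone width along the x-axis, and all function names are hypothetical choices made for illustration. It scans the shared radius for the smallest total area and then compares the rates of change of the two individual areas at that radius.

```python
import math

def cone_area(r1, r2, h):
    """Lateral area of a truncated cone: end radii r1, r2, axial height h."""
    return math.pi * (r1 + r2) * math.hypot(r1 - r2, h)

# Hypothetical setup: outer radii 1.0 and 2.0, each cone 0.5 wide along x.
R_LEFT, R_RIGHT, H = 1.0, 2.0, 0.5

def area_1(r):
    return cone_area(R_LEFT, r, H)

def area_2(r):
    return cone_area(r, R_RIGHT, H)

def sweet_spot(lo=0.0, hi=3.0, steps=30001):
    """Brute-force scan of the shared radius for the smallest total area."""
    candidates = (lo + (hi - lo) * i / (steps - 1) for i in range(steps))
    return min(candidates, key=lambda r: area_1(r) + area_2(r))

def rates_at(r, eps=1e-6):
    """Central-difference rates of change of the two areas at radius r."""
    d1 = (area_1(r + eps) - area_1(r - eps)) / (2 * eps)
    d2 = (area_2(r + eps) - area_2(r - eps)) / (2 * eps)
    return d1, d2

r_star = sweet_spot()
d1, d2 = rates_at(r_star)
print(f"sweet spot r = {r_star:.4f}, dA1/dr = {d1:.4f}, dA2/dr = {d2:.4f}")
```

At the radius found by the scan the two rates have opposite signs and (to within the resolution of the scan) equal magnitude.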

Picture 2.3 Graphlet

Picture 2.4 Graphlet

In graphlet 2.3 the right hand panel gives an overview of how the areas of the two cones respond to sweeping out variation. I will refer to the point where the total surface area is minimal as 'the sweet spot'. As you are sweeping out variation, A1 and A2 are changing in opposite directions. At the sweet spot the two are changing at the same rate.

The curve '-A2' is the mirrored counterpart of curve 'A2'. At the sweet spot the curves A1 and -A2 have the same slope; the tangents are parallel to each other.

In graphlet 2.4 the view of the right hand panel is zoomed in on curve A1. The other two curves have been shifted vertically to bring them in close proximity to curve A1.

Picture 2.5 Graphlet

In Graphlet 2.5 there are 4 sliders, giving 4 cones. (I capitalize on symmetry; to the left and to the right of the y-axis the cones are mirrored.)

I encourage you to go through the process of using the sliders to converge onto the minimal surface. First move the sliders way out of position, and then just eyeball a first try. Then you go back and forth in adjusting the sliders, until you have reached the point in variation space where there is no more room for improvement.

(As a time saver: only shift the first three sliders, and leave the fourth slider at its default position. Then it is quicker to converge back to the equilibrium positions.)

The radio buttons labeled 'x 1', 'x 0.1' and 'x 0.01' toggle between three sets of 4 sliders. The second and third set of sliders are for fine adjustment. Clicking the button 'Consolidate' does the following two things:
- The primary sliders are incremented with the value of the secondary and tertiary sliders
- The secondary and tertiary sliders are reset to their zero position

#### Discussion

An essential feature of the process that is implemented in graphlet 2.5 is this: every time you adjust a point the change affects the state of the neighbouring points. So you proceed to adjust those points, but that affects the next neighbours, and so on. Every local change propagates out, eventually to the entire curve. The process is global in the sense that in order to find the minimal surface all the sliders must be at their equilibrium points concurrently.

In the graphlet the total surface area is shown. However, the graphlet could also have been implemented as follows: when a particular slider is being moved: display the combined surface area of just the two cones that are changed by the adjustment of that particular slider. Then the graphlet never shows the total surface area, but the user can still home in on the minimal surface area.

In the graphlet you home in on a state of equal rate of change. In the case depicted in graphlet 2.5 there is a single extremum, and that extremum is a minimum. The actual process is one of homing in on the equal-rate-of-change state. Finding the minimum is a side-effect: since the minimum is the only point in the variation space where the rates of change match everywhere, identifying the point of equal rate of change everywhere coincides with identifying the minimum.
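The slider process can be imitated in code. The following is a rough sketch under stated assumptions, not the graphlet's implementation: the rings are placed at x = -0.5 and x = 0.5 with radius cosh(0.5) (chosen so that, anticipating Section 4, the exact profile is y = cosh(x)), and the names `argmin` and `relax` are hypothetical. Each node is adjusted using only the combined area of its two adjacent cones; the total area is never computed.

```python
import math

def cone_area(r1, r2, h):
    """Lateral area of a truncated cone: end radii r1, r2, axial height h."""
    return math.pi * (r1 + r2) * math.hypot(r1 - r2, h)

def argmin(f, lo, hi, iters=40):
    """Ternary search for the minimum of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def relax(n=16, half_width=0.5, sweeps=300):
    """Repeatedly move each interior node to the minimum of its two-cone area."""
    h = 2.0 * half_width / n
    xs = [-half_width + i * h for i in range(n + 1)]
    ys = [math.cosh(half_width)] * (n + 1)  # start from a cylinder
    for _ in range(sweeps):
        for i in range(1, n):
            left, right = ys[i - 1], ys[i + 1]
            ys[i] = argmin(
                lambda y: cone_area(left, y, h) + cone_area(y, right, h),
                ys[i] - 0.1, ys[i] + 0.1)
    return xs, ys

xs, ys = relax()
max_dev = max(abs(y - math.cosh(x)) for x, y in zip(xs, ys))
print(f"largest deviation from cosh(x): {max_dev:.2e}")
```

Every sweep moves each node only on the basis of its two neighbours, yet the profile as a whole settles onto the minimal surface: local adjustments propagate along the curve until all nodes are at equilibrium concurrently.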

To express that equal-rate-of-change property at infinitesimal level: construct a differential equation. That is the subject of Section 3.1.

## 3   The differential equation

### 3.1 Constructing the differential equation

Picture 3.1.1 Graphlet

I will refer to the triplet of points in graphlet 3.1.1 as the unit of operation. The unit of operation is applied concurrently along the entire curve.

We make the intervals from x1 to x2 and from x2 to x3 the same length, so that they can be generically referred to as 'Δx'. The finalizing step of the construction will be to take the limit of Δx → 0.

The purpose of showing the construction process is to show how the resulting equation expresses the process that is implemented in graphlet 2.5.

In section 3.2 of this article the derivation that uses integration by parts is discussed.

In graphlet 3.1.1: the labels 'A' and 'B' refer both to the midpoints of the respective line elements, and to the two line elements themselves.

For the slope of each line element we use Lagrange notation, indicating the derivative of y with respect to x as y' (3.1.1)

To set up for later use we take the derivative with respect to y (3.1.2)

With the above preparation in place:

Let CA be the surface area of cone A, CB the surface area of cone B.

For the pair of adjacent cones of the unit of operation we have the condition: the derivatives with respect to change of the y-coordinate must match, as depicted in graphlets 2.3 and 2.4. The following construction expresses that condition: (3.1.3)

Notice what is not there: (3.1.3) does not produce the total area. At this stage it is not necessary to consider the total area. (3.1.3) expresses a criterion of equality: a condition of matching rate of change.

From (3.1.3) a differential equation is constructed. Solving that differential equation gives the shape of the soap film. With the equation for the shape obtained the actual surface area can be calculated.

(3.1.3) is stated in terms of elements with a finite size, the cones CA and CB. The final step towards the differential equation will be to take the limit of Δx → 0

Incidentally: while our goal is to find a function that gives the y-coordinate as a function of the x-coordinate, equation (3.1.3) states differentiation with respect to y instead of differentiation with respect to x. That makes (3.1.3) a distinct type of differential equation.

We have to accommodate that the area of the truncated cone involves multiplying the y-coordinate with (a function of) the derivative of the y-coordinate. That means that in order to evaluate (3.1.3) we must expand into partial differentiation. (3.1.4)

The leading part of each of the four terms of the equation is there because the chain rule has been applied.

First step of developing (3.1.4): substitute the terms of (3.1.2) into it. (3.1.5)

Distribute the minus sign. (3.1.6)

Move the terms on the right hand side to the left and restructure. In (3.1.6) the terms with CA and CB were grouped together, in (3.1.7) the terms with y and y' are together. (3.1.7) (3.1.8)

The expression is now at a point where it is ready to take the limit of Δx → 0

In the limit of Δx → 0:
- the term on the left hand side approaches the partial derivative of C with respect to y.
- The term on the right hand side approaches the derivative with respect to x of the partial derivative of C with respect to y' (3.1.9)
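The construction can be condensed into a few lines. The following is a sketch in assumed notation, not necessarily the layout of the numbered equations: $y_1, y_2, y_3$ are the heights of the triplet of points, $y_A$ and $y_B$ are the heights at the midpoints of the two line elements, and $C$ is taken as the area per unit of length along the x-axis.

```latex
% Assumed definitions for the two line elements A and B
\[
y_A = \tfrac{1}{2}(y_1 + y_2), \quad y'_A = \frac{y_2 - y_1}{\Delta x}, \qquad
y_B = \tfrac{1}{2}(y_2 + y_3), \quad y'_B = \frac{y_3 - y_2}{\Delta x}
\]
% Matching rate of change when the middle point y_2 is varied
\[
\frac{dC_A}{dy_2} = -\frac{dC_B}{dy_2}
\]
% Chain rule on both sides (four terms)
\[
\frac{\partial C_A}{\partial y_A}\cdot\frac{1}{2}
  + \frac{\partial C_A}{\partial y'_A}\cdot\frac{1}{\Delta x}
= -\frac{\partial C_B}{\partial y_B}\cdot\frac{1}{2}
  + \frac{\partial C_B}{\partial y'_B}\cdot\frac{1}{\Delta x}
\]
% Group the y-terms on the left, the y'-terms on the right
\[
\frac{1}{2}\left(\frac{\partial C_A}{\partial y_A}
  + \frac{\partial C_B}{\partial y_B}\right)
= \frac{1}{\Delta x}\left(\frac{\partial C_B}{\partial y'_B}
  - \frac{\partial C_A}{\partial y'_A}\right)
\]
% Limit of Delta x -> 0
\[
\frac{\partial C}{\partial y} = \frac{d}{dx}\,\frac{\partial C}{\partial y'}
\]
```

The left hand side is an average of two neighbouring values, which converges to the local value; the right hand side is a difference quotient over Δx, which converges to a derivative with respect to x.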

### Dimensional analysis

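The operation performed by the first term of (3.1.9) is straightforward: differentiation with respect to the y-coordinate. It looks as if the second term of (3.1.9) is doing something different, so let's take a closer look at that.

The starting point was (3.1.3), which specifies differentiation with respect to the y-coordinate. We use dimensional analysis as a form of checksum. The slope y' is a ratio of two lengths, so it is dimensionless, and the partial derivative of C with respect to y' has the same dimensions as C itself. The subsequent differentiation with respect to x then divides by a length, just as differentiation with respect to y does. The dimensional analysis corroborates that the second term performs the same operation as the first term: differentiation with respect to the y-coordinate.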

### 3.2 Integration of a test function

The majority of Calculus of Variations sources give the derivation in which variation is applied to a test function, evaluating how the integral responds to that. The following is a demonstration of how that proceeds.

Notation:

- y(x): the function that we want to solve for
- ε: multiplication factor
- yε(x): test function to execute the variation

Picture 3.2.1 Graphlet
Depiction of the test function yε(x)

Graphlet 3.2.1 demonstrates the way to implement the test function yε(x). A single parameter is used to sweep out variation. Variation of the multiplication factor ε sweeps out the variation of the test function yε(x).

It is sufficient for the test function yε(x) to have the following property:
yε(a) = yε(b) = 0

Picture 3.2.2 Graphlet

Noteworthy: the start point and end point used in this derivation do not have to correspond to some physical point of the actual problem setting. Working out how to fit the curve to the problem setting comes only later.

In the course of the derivation we will see that the logic has no dependence on:
- which section along the curve the points 'a' and 'b' are located.
- the distance between the points 'a' and 'b'.

We have the option of thinking of the variation implemented as a single variation, spanning a significant length, but we can also think of the variation as multiple instances, with arbitrarily short span, distributed over the length of the curve. The logic of the derivation is independent of those implementation details.

The setup: (3.2.1)

Here the integrand of the integration is stated with a capital letter F, to express that it is distinct from the function we are trying to obtain.

#### To dismiss the integration

The goal now is to arrive at an expression that without ever evaluating the integral will allow us to solve for the curve we are looking for; the goal is to get to a point where we can dismiss the integration.

We set up a derivative with respect to the multiplication factor ε. Note that while we do need the ability to take the derivative with respect to ε, we don't need it for a range of values of ε. The derivative with respect to ε at the point where ε is zero is all we need. (3.2.2)

Applying the chain rule: (3.2.3)

In order to make progress (3.2.3) must be brought to a form where the test function yε(x) can be dismissed.

In (3.2.3) the obstacle to progress is the fact that it contains the derivative of yε(x) with respect to x. We will use the product rule of differentiation to transfer that differentiation from the test function yε(x) to the expression F(y(x),y'(x)).

The following term from (3.2.3) is the one that will be transformed: (3.2.4)

To save space: from here (3.2.4) will be notated as follows: (3.2.5)

The expressions (3.2.6), (3.2.7), (3.2.8), and (3.2.9) cover that transformation process, which converts (3.2.3) to (3.2.10).

The tool that will be used is the product rule of differentiation: (3.2.6)

In (3.2.7) the term we want to transform is the first term on the right hand side of the expression. Everything around that is arranged in such a way that the form of (3.2.7) matches the pattern of (3.2.6). (3.2.7)

Set up integration of both sides of (3.2.7), from point a to point b: (3.2.8)

(3.2.8) is the reason why at the start the test function yε(x) was specified to be zero at x=a and x=b.

Given that yε(a) = yε(b) = 0 it follows that the left hand side of (3.2.8) evaluates to zero, hence: (3.2.9)

Noteworthy: this transfer of the differentiation is the only point in the derivation where yε(a) = yε(b) = 0 is used. It is not used anywhere else. And again: the logic of this transfer step has no dependence on the positions of a and b. They can be positioned anywhere along the curve, including arbitrarily close together.

The relation (3.2.9) allows us to transform (3.2.3) into the following: (3.2.10)

Now that the differentiation has been transferred we can factor out the term yε(x): (3.2.11)

At the start it was announced: we want to get to a point where we can dismiss the test function yε(x), and the integration. With (3.2.11) we have reached that point.

In order to satisfy (3.2.11) we must have: (3.2.12)
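For reference, the chain of steps can be written out compactly. This is a sketch that assumes the varied curve is written as $y(x) + \varepsilon\,y_\varepsilon(x)$ and that the integral is called $S$; both are notational assumptions and may differ from the numbered equations.

```latex
% Setup: the integral as a function of the multiplication factor
\[
S(\varepsilon) = \int_a^b F\big(y + \varepsilon y_\varepsilon,\;
  y' + \varepsilon y'_\varepsilon\big)\,dx
\]
% Derivative with respect to epsilon at epsilon = 0, chain rule applied
\[
\left.\frac{dS}{d\varepsilon}\right|_{\varepsilon=0}
= \int_a^b \left(\frac{\partial F}{\partial y}\,y_\varepsilon
  + \frac{\partial F}{\partial y'}\,y'_\varepsilon\right)dx = 0
\]
% Product rule, arranged so that the term to be transformed comes first
\[
\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\,y_\varepsilon\right)
= \frac{\partial F}{\partial y'}\,y'_\varepsilon
  + \left(\frac{d}{dx}\frac{\partial F}{\partial y'}\right)y_\varepsilon
\]
% Integrate from a to b; the boundary term vanishes since y_eps(a) = y_eps(b) = 0
\[
\int_a^b \frac{\partial F}{\partial y'}\,y'_\varepsilon\,dx
= -\int_a^b \left(\frac{d}{dx}\frac{\partial F}{\partial y'}\right)
  y_\varepsilon\,dx
\]
% Substitute and factor out the test function
\[
\int_a^b \left(\frac{\partial F}{\partial y}
  - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)y_\varepsilon\,dx = 0
\]
% Required for every admissible test function
\[
\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'} = 0
\]
```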

#### Discussion

The most important feature of the derivation of the Euler-Lagrange equation is this: the integration is dismissed. Dismissing the integration is the whole point of the derivation.

That raises the question: how can it be that while the integration is dismissed we are still able to solve the problem? The explanation: the expression has been shifted to something of equal mathematical power: a differential equation.

A differential equation asserts a global condition in the sense that the solution to the differential equation is a function that satisfies the differential condition concurrently everywhere along the curve.

This was first demonstrated in graphlet 2.5: the process of finding the point of global concurrent equilibrium.

### 3.3 The Euler-Lagrange equation

The standard Euler-Lagrange equation, for an x,y coordinate system:

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'} = 0 \tag{3.3.1}$$

For the case of the minimal surface problem: once the curve is known the total surface area, with the rings located at $x_1$ and $x_2$, is calculated by evaluating the following integral:

$$A = 2\pi \int_{x_1}^{x_2} y\,\sqrt{1 + (y')^2}\;dx \tag{3.3.2}$$

The form of the integrand in (3.3.2) comes from the expression for the lateral area of a truncated cone: (2.1)

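The Euler-Lagrange equation, with the above integrand $y\sqrt{1+(y')^2}$ (constant factor omitted), is as follows:

$$\sqrt{1 + (y')^2} - \frac{d}{dx}\left(\frac{y\,y'}{\sqrt{1 + (y')^2}}\right) = 0 \tag{3.3.3}$$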

This would be very difficult to solve, but fortunately in this situation the Euler-Lagrange equation can be reduced to a simpler equation.

To show why such a reduction is possible the following section is about the catenary problem, which turns out to be closely related to the minimal surface problem. So much so that the two problems have the same solution!

In section 4 the catenary problem is solved twice, first using differential calculus, and then using Calculus of Variations. After that we will be in a position to return to the minimal surface problem.

## 4   The catenary

### 4.1 Introduction

Picture 4.1.1 Graphlet
Catenary

In Graphlet 4.1.1 the length of chain between the cusps is in the catenary shape. I will refer to that specific section - between the cusps - as 'the catenary'. The vertical lines left and right represent chain length hanging down from the respective cusps, providing tensioning force. I will refer to those sections as 'tensioning chain'.

I will refer to the total tension force exerted by the catenary (at the cusp) as the cusp tension. In the diagram the force that the catenary exerts at the cusp is decomposed in two components. The magnitude of the vertical component is equal to the weight of the length of chain that is being suspended in between the cusps. The magnitude of the horizontal component follows from the angle of the chain (at the cusp).

The two piles of chain left and right represent that surplus chain piles up at a set height below the cusp. Only the free hanging length of chain contributes to tensioning, therefore the tension force is for all lengths of the catenary the same. I will refer to this tension as the provided tension.

The cusp tension and the provided tension are acting in opposition to each other; I will refer to the resultant effect as non-equilibrium.

With the checkbox 'Non-equilibrium' checked: the number displayed indicates for the given length of the catenary the state of the opposing forces. As you move the slider left and right: a negative value of the non-equilibrium state means there is not enough provided tension, and when released from that position the catenary will sag. Interestingly, it turns out there are two cross-over points. As you move the slider: the second cross-over point is at slider value 1.44.

With the checkbox 'Length' checked: the value that is displayed is the length of chain from cusp to midpoint. In the units of the diagram the vertical component of the cusp tension is numerically equal to that length.

When the catenary is at its equilibrium position: the provided tension is counteracting both the vertical component and the horizontal component of the cusp tension. That is why the provided tension has to be larger than the vertical component of the cusp tension.

#### Coordinate system of the diagram

The coordinate system is chosen such that the two cusps are located at x=-1 and x=1 respectively. The mass of the chain is set to one unit of mass per unit of length. The provided tension is set up such that at the equilibrium position the horizontal component of the tension in the catenary comes out as 1 unit of force. That is why when the slider is in the 1.00 position the line that represents the horizontal component of the tension force is 1 unit long; it represents 1 unit of force.

### 4.2 The Catenary in terms of force equilibrium

Graphlet 4.2.1 illustrates why the solution of the catenary problem can be found with a differential equation.

Picture 4.2.1 Graphlet
The tension along the catenary

Since the shape is symmetric it is sufficient to evaluate from the midpoint to the cusp.

With:

- TH: the horizontal component of the tension
- λ: the weight per unit of length
- L: the length of the chain from the midpoint to the x-coordinate

The weight that has to be supported at coordinate x is given by multiplying the length L with the weight per unit of length: λL

In graphlet 4.2.1: move the slider and pay attention to the force component in horizontal direction. Everywhere along the curve that horizontal component has the same magnitude. (The reason that component is a constant: it is perpendicular to the direction of gravity.) For later reference: we can think of this constant as a conserved quantity. In the calculation: as the evaluation traverses the x-coordinate there is a conserved quantity.

#### Constructing the differential equation

To prepare for later use: from midpoint to cusp the slope of the curve increases; the length of chain per unit of x-coordinate increases accordingly. (4.2.1) gives an expression for dL/dx.

$$\frac{dL}{dx} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2} \tag{4.2.1}$$

The equilibrium shape has the following property: at every point along the length of the chain the tension force is tangent to the local slope.

At every point, from the midpoint to the cusp, the chain above that point is providing the required force to support the weight of the length of chain below that point.

It follows that at every point along the curve the slope of the curve (the tangent) equals the ratio of the vertical tension component to the horizontal tension component:

$$\frac{dy}{dx} = \frac{\lambda L}{T_H} \tag{4.2.2}$$

On how to proceed from here: we need to work towards an expression that is purely in terms of the cartesian coordinates x and y. That is why (4.2.1) was prepared; by combining (4.2.2) with (4.2.1) L will be replaced with an expression that is in terms of the cartesian coordinates x and y.

(4.2.1) gives the derivative of L with respect to x, so in order to combine we need to adapt (4.2.2).

At this point we take advantage of the following: the horizontal tension component is a constant.

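We differentiate both sides of (4.2.2) with respect to x, with TH a constant.

$$\frac{d^2y}{dx^2} = \frac{\lambda}{T_H}\,\frac{dL}{dx} \tag{4.2.3}$$

Combining (4.2.3) and (4.2.1) achieves the goal of converting the quantity dL: (4.2.4) is in terms of the cartesian coordinates x and y only:

$$\frac{T_H}{\lambda}\,\frac{d^2y}{dx^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2} \tag{4.2.4}$$

(Thanks to Daniel Rubin for pointing out the following strategy to solve (4.2.4).)

(4.2.5) is (4.2.4) with the factor TH/λ omitted.

$$\frac{d^2y}{dx^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2} \tag{4.2.5}$$

We make the substitution $u = dy/dx$, and we square both sides. Squaring both sides introduces an extraneous solution, so at a later stage we must discard that.

$$\left(\frac{du}{dx}\right)^2 = 1 + u^2 \tag{4.2.6}$$

Next we take the derivative with respect to x:

$$2\,\frac{du}{dx}\,\frac{d^2u}{dx^2} = 2u\,\frac{du}{dx} \tag{4.2.7}$$

Dividing both sides by 2du/dx:

$$\frac{d^2u}{dx^2} = u \tag{4.2.8}$$

So the solution to the equation is a function with the property that if you differentiate it twice you are back to the original function. That narrows the options down to the following two expressions, which are named 'hyperbolic cosine' and 'hyperbolic sine' respectively:

$$\cosh(x) = \frac{e^{x} + e^{-x}}{2} \qquad \sinh(x) = \frac{e^{x} - e^{-x}}{2} \tag{4.2.9}$$

Of these two the first one satisfies (4.2.5): with y = cosh(x) we have dy/dx = sinh(x) and d²y/dx² = cosh(x) = √(1 + sinh²(x)).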
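The selection between the two candidates can be checked numerically. A minimal sketch, assuming that (4.2.5) has the form d²y/dx² = √(1 + (dy/dx)²); the helper name `residual` is hypothetical. The derivatives of cosh and sinh are supplied analytically.

```python
import math

def residual(dy, d2y, x):
    """Left side minus right side of y'' = sqrt(1 + y'^2) at position x."""
    return d2y(x) - math.sqrt(1.0 + dy(x) ** 2)

xs = [i / 10 for i in range(-10, 11)]

# y = cosh(x): y' = sinh(x), y'' = cosh(x)
cosh_residual = max(abs(residual(math.sinh, math.cosh, x)) for x in xs)

# y = sinh(x): y' = cosh(x), y'' = sinh(x)
sinh_residual = max(abs(residual(math.cosh, math.sinh, x)) for x in xs)

print(f"cosh: {cosh_residual:.2e}   sinh: {sinh_residual:.2e}")
```

The residual for cosh vanishes to rounding precision, while the residual for sinh is of order one: sinh is the extraneous candidate.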

### 4.3 The Catenary in terms of minimized potential energy

We will need an expression for the potential energy.

When height difference is small compared to the Earth's radius we can treat the Earth's gravity as a uniform force.

We have that potential energy is defined as the negative of work done; the work done is obtained by integrating force over distance.

$$\Delta E_p = -\int_{s_0}^{s} F\,ds \tag{4.3.1}$$

In the case of a uniform force that integration simplifies to multiplication. For gravity the change in potential energy from height h0 to height h:

$$\Delta E_p = m g\,(h - h_0) \tag{4.3.2}$$

In order to keep the expression simple we set the value of all the constants to 1 unit, and we set h0 to zero. Then the value of the potential energy is equal to the value of height h

In a diagram we will use the y-coordinate for the height h

At this point we are in a position to see what is going to happen.

We have that the Euler-Lagrange equation implements the process that is depicted in the graphlets 2.3, 2.4, and 2.5. The Euler-Lagrange equation acts as an operator that performs differentiation with respect to the y-coordinate.

(4.3.1) expresses the definition of potential energy: the integral of force with respect to the y-coordinate.

As we know: the operations of differentiation and integration are each other's inverse.

In the case of the catenary: the Euler-Lagrange operator will convert the potential energy to the corresponding force.

Resuming the minimized potential energy approach:
The integral of the potential energy, from the midpoint to a cusp located at coordinate $x_c$, comes out as follows:

$$E_p = \int_{0}^{x_c} y\,\sqrt{1 + (y')^2}\;dx \tag{4.3.3}$$

The factor y of the integrand is for the height above zero, and the factor √ (1+(y')²) is for the amount of chain length per unit of distance along the x-axis

As announced earlier, the catenary problem is a sibling of the minimal surface problem; (4.3.3) has the same form as (3.3.2), the integral for the surface area of the soap film.

To solve for the curve of minimal potential energy we use the same strategy as in the case of the differential equation approach: we take advantage of the catenary's property that the horizontal tension component is a constant.

In (4.3.3) the integrand has the terms 'y' and 'dy/dx', but no term with the x-coordinate by itself. That circumstance allows a way to reduce the Euler-Lagrange equation down to a simpler expression. That simpler expression is named 'Beltrami identity'.

Derivation: Appendix I: the Beltrami identity

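The Beltrami identity: if $\partial F/\partial x = 0$ then:

$$F - y'\,\frac{\partial F}{\partial y'} = C \tag{4.3.4}$$

where C is a constant.

Inserting the integrand of (4.3.3) into (4.3.4) gives (4.3.5). The expression looks difficult, but many of the terms drop away against each other.

$$y\,\sqrt{1 + (y')^2} - y'\,\frac{y\,y'}{\sqrt{1 + (y')^2}} = \frac{y}{\sqrt{1 + (y')^2}} \tag{4.3.5}$$

Set equal to a constant C:

$$\frac{y}{\sqrt{1 + (y')^2}} = C \tag{4.3.6}$$

For the time being we set the value of the constant C to '1'.

From there we use the same pattern as was used from (4.2.5) to (4.2.9). To do that we switch from Lagrange notation for derivatives to Leibniz notation.

Square both sides. Squaring both sides introduces an extraneous solution, so at a later stage we must discard that.

$$y^2 = 1 + \left(\frac{dy}{dx}\right)^2 \tag{4.3.7}$$

Take the derivative with respect to x:

$$2y\,\frac{dy}{dx} = 2\,\frac{dy}{dx}\,\frac{d^2y}{dx^2} \tag{4.3.8}$$

Divide both sides by 2(dy/dx):

$$y = \frac{d^2y}{dx^2} \tag{4.3.9}$$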
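The conserved quantity can be checked along a candidate curve. A minimal sketch, assuming the Beltrami quantity for this integrand reduces to y/√(1 + (y')²) and that a catenary with scale parameter a is y = a·cosh(x/a); the function names are hypothetical, and the slope is estimated with a central difference.

```python
import math

def beltrami_quantity(y, x, dx=1e-5):
    """y / sqrt(1 + y'^2), with y' estimated by a central difference."""
    dy = (y(x + dx) - y(x - dx)) / (2 * dx)
    return y(x) / math.sqrt(1.0 + dy * dy)

def catenary(a):
    """Catenary with scale parameter a (an assumed generalisation of C = 1)."""
    return lambda x: a * math.cosh(x / a)

def parabola(x):
    """A look-alike curve, for comparison."""
    return 1.0 + 0.5 * x * x

xs = [i / 10 for i in range(-10, 11)]
catenary_values = [beltrami_quantity(catenary(2.0), x) for x in xs]
parabola_values = [beltrami_quantity(parabola, x) for x in xs]

print("catenary:", min(catenary_values), max(catenary_values))
print("parabola:", min(parabola_values), max(parabola_values))
```

Along the catenary the quantity stays at the value of the scale parameter; along the parabola it drifts, so the parabola is not a solution.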

### 4.4 Discussion: relation between force equilibrium approach and energy minimization approach

In order to state the catenary problem as a problem of minimal potential energy: the potential energy (as a function of the height 'h') was obtained from the gravitational force (as a function of the height 'h').

Then you insert that expression in the Euler-Lagrange equation: the Euler-Lagrange equation immediately recovers the force by differentiating with respect to the y-coordinate.

That is: while it appears as if there are two distinct approaches to solving the catenary problem:
- evaluating force equilibrium,
- minimizing potential energy,
in actual fact the two approaches are one and the same.

The Euler-Lagrange equation performs the type of operation that is visualized in graphlet 2.5; in the limit of the increments along the x-axis approaching zero you get the Euler-Lagrange equation.

The Euler-Lagrange operator converts the potential energy to the corresponding force. The resulting equation is a force equilibrium equation.

## 5   Classical Mechanics

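In preparation for later use we will first work out the case of integrating a non-uniform acceleration a from a starting point s0 to an end point s:

$$\begin{aligned}
\int_{s_0}^{s} a\,ds &= \int_{s_0}^{s} \frac{dv}{dt}\,ds \\
&= \int_{t_0}^{t} \frac{dv}{dt}\,v\,dt \\
&= \int_{v_0}^{v} v\,dv = \tfrac{1}{2}v^2 - \tfrac{1}{2}v_0^2
\end{aligned}$$

The second row marks the change of differential. For each change of differential the limits change accordingly.

With the intermediate steps omitted:

$$\int_{s_0}^{s} a\,ds = \tfrac{1}{2}v^2 - \tfrac{1}{2}v_0^2 \tag{5.1}$$

Incidentally: a remarkable property of (5.1) is this: the form of the right hand side is identical to the case of uniform acceleration:

$$a\,(s - s_0) = \tfrac{1}{2}v^2 - \tfrac{1}{2}v_0^2 \tag{5.2}$$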
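The relation between the integral of acceleration over distance and the change of ½v² can be illustrated with a small simulation. This is a sketch under stated assumptions: the acceleration profile, the initial velocity, and the function name `integrate_a_ds` are arbitrary choices for illustration.

```python
import math

def integrate_a_ds(accel, v0, t_end, steps=100_000):
    """Accumulate the integral of a ds along a motion with acceleration accel(t).

    Returns the accumulated integral and the final velocity.
    """
    dt = t_end / steps
    v = v0
    integral = 0.0
    for i in range(steps):
        a = accel((i + 0.5) * dt)    # acceleration sampled at mid-step
        v_avg = v + 0.5 * a * dt     # average velocity during the step
        integral += a * v_avg * dt   # a ds, with ds = v_avg dt
        v += a * dt
    return integral, v

v0 = 1.0
integral, v_end = integrate_a_ds(lambda t: 2.0 + math.sin(3.0 * t), v0, t_end=2.0)
print(integral, 0.5 * v_end**2 - 0.5 * v0**2)
```

Within each step, a·ds = a·dt·(v + ½·a·dt) equals the change of ½v² over that step, so the sum telescopes regardless of how the acceleration varies. That is the discrete counterpart of the change of differential from ds to dv.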

In order to use the Euler-Lagrange equation in classical mechanics we must construct quantities such that when they are inserted in the Euler-Lagrange equation the Euler-Lagrange equation will recover F=ma.

We have that the Euler-Lagrange operator performs differentiation with respect to the position coordinate. Therefore we start with F=ma and we integrate both sides with respect to the position coordinate.

$$\int_{s_0}^{s} F\,ds = \int_{s_0}^{s} m\,a\,ds \tag{5.3}$$

We use (5.1) to develop the right hand side:

$$\int_{s_0}^{s} F\,ds = \tfrac{1}{2}mv^2 - \tfrac{1}{2}mv_0^2 \tag{5.4}$$

(5.4) is the work-energy theorem. The left hand side of (5.4) is the work done, and the right hand side is the change of kinetic energy.

Potential energy is defined as the negative of work done.

$$\Delta E_p = -\int_{s_0}^{s} F\,ds \tag{5.5}$$

About the concept of kinetic energy: there was a precursor concept, which was named vis viva, 'the living force', defined as mv². Around the mid 1800's the physics community shifted to a kinetic energy defined as ½mv².

Clearly the shift to ½mv² was bound to happen: defining kinetic energy in accordance with the work-energy theorem makes everything fit together.

A prominent example of this everything-fits-together: with potential energy and kinetic energy defined in accordance with the work-energy theorem, in any interconversion of potential energy and kinetic energy the amounts of change match:

$$\Delta E_k = -\Delta E_p \tag{5.6}$$

Since potential energy and kinetic energy are obtained by integration with respect to the position coordinate: differentiating with respect to the position coordinate will recover F=ma: (5.7)

(5.7) can be restated in a form that coincides with the form of the Euler-Lagrange equation. (5.8)

(5.9) and (5.10) demonstrate the equivalence of (5.7) and (5.8): (5.9) and (5.10) both evaluate to ma. (5.9) (5.10)
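A worked version of the two evaluations, as a sketch: it assumes that the kinetic energy is written as ½mv² with v = ds/dt, and it is not necessarily the layout used in (5.9) and (5.10).

```latex
% Differentiating the kinetic energy with respect to the position coordinate
\[
\frac{d}{ds}\left(\tfrac{1}{2} m v^2\right)
= m v\,\frac{dv}{ds}
= m\,\frac{ds}{dt}\,\frac{dv}{ds}
= m\,\frac{dv}{dt}
= m a
\]
% The Euler-Lagrange form: differentiate with respect to v, then with respect to t
\[
\frac{d}{dt}\left(\frac{\partial}{\partial v}\,\tfrac{1}{2} m v^2\right)
= \frac{d}{dt}\,(m v)
= m\,\frac{dv}{dt}
= m a
\]
```

Both routes end at the same expression, which is why the two forms are interchangeable for the kinetic energy term.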

#### Discussion

The Euler-Lagrange equation is an operator that performs differentiation with respect to the y-coordinate.

In the context of the catenary problem it is customary to treat the horizontal position coordinate as the x-coordinate, making the y-coordinate the height coordinate h. Thus in the case of the catenary problem the Euler-Lagrange equation performs differentiation with respect to the height coordinate, converting the potential energy to force.

In the context of classical mechanics the goal is to obtain the position coordinate of some object as a function of the time-coordinate. Thus in the case of classical mechanics the Euler-Lagrange equation performs differentiation with respect to the position coordinate, converting potential energy to force, and the ½v² part of the kinetic energy to acceleration.

In the case of Classical Mechanics: to apply the Euler-Lagrange equation is to find the point in variation space such that everywhere along the trajectory the rate of change of kinetic energy matches the rate of change of potential energy.

In the case of problems such as the soap film problem or the catenary problem the goal is to take advantage of the side-effect: to find a minimum. In the case of Classical Mechanics that side-effect does not come into play: it's only about identifying the trajectory such that everywhere along the trajectory the rate of change of kinetic energy matches the rate of change of potential energy.

In the case where the potential energy is expressed in terms of some form of generalized coordinates, see Appendix II: generalized force

### Appendix I: the Beltrami identity

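In the case of the soap film minimal surface problem and the catenary problem: there is no direct dependence on the x-coordinate. The integrand has the terms 'y' and 'dy/dx', but no term with the x-coordinate by itself.

We use the expression for the catenary problem as example:

$$F(y, y') = y\,\sqrt{1 + (y')^2} \tag{I.1}$$

The general expression for the derivative of F with respect to x:

$$\frac{dF}{dx} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\,y' + \frac{\partial F}{\partial y'}\,y'' \tag{I.2}$$

The partial derivative with respect to x is zero, of course, and the expression reduces to:

$$\frac{dF}{dx} = \frac{\partial F}{\partial y}\,y' + \frac{\partial F}{\partial y'}\,y'' \tag{I.3}$$

We use the following form of the Euler-Lagrange equation to do a substitution:

$$\frac{\partial F}{\partial y} = \frac{d}{dx}\frac{\partial F}{\partial y'} \tag{I.4}$$

After the substitution, and some rearranging:

$$\frac{dF}{dx} = y'\,\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + y''\,\frac{\partial F}{\partial y'} \tag{I.5}$$

It looks as if we made things worse for ourselves. However, (I.5) is a form that can be collapsed. Take the right hand side of (I.6) and perform the differentiation according to the product rule. The result is the right hand side of (I.5). In other words: the operation that takes (I.5) to (I.6) is the product rule applied in reverse.

$$\frac{dF}{dx} = \frac{d}{dx}\left(y'\,\frac{\partial F}{\partial y'}\right) \tag{I.6}$$

Since differentiation is a linear operation we can factor it out:

$$\frac{d}{dx}\left(F - y'\,\frac{\partial F}{\partial y'}\right) = 0 \tag{I.7}$$

In order for that derivative to be zero the term inside the parentheses must be a constant. This is the Beltrami identity. The Beltrami identity is a specific instance of the Euler-Lagrange equation.

$$F - y'\,\frac{\partial F}{\partial y'} = C \tag{I.8}$$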

### Appendix II: generalized force

When the potential energy is expressed in terms of some form of generalized coordinate(s) the result of differentiation with respect to the position coordinate will be an expression in terms of a generalized force.

The concept of generalized force is to be understood as follows:

Example:
Let's say that we are modeling the oscillation of the balance wheel of a watch, using:
- polar coordinates
- rotational kinetic energy
- potential energy of a spiral shaped spring.

Then the Euler-Lagrange equation does the following conversions:
- potential energy to torque
- the ½ω² part of rotational kinetic energy to angular acceleration.

Torque is an example of a generalized force. It is of course the most widely known example of it.

At this point we consider the form of F=ma.
F=ma expresses a relation between the following three types of entity:
- tendency to cause change of state (force)
- coefficient of opposition to change of state (inertia)
- second time derivative of the position coordinate

The ratio of the force and the opposition to change gives the resulting acceleration, the second time derivative of position.

$$\frac{d^2x}{dt^2} = \frac{F}{m} \tag{II.1}$$

In the case of a balance wheel it is convenient to use polar coordinates, and then the dynamic entities come out as torque, moment of inertia, and angular acceleration.

The corresponding equation that gives the angular acceleration as a function of the ratio of torque and moment of inertia:

$$\frac{d^2\varphi}{dt^2} = \frac{\tau}{I} \tag{II.2}$$

- τ: torque
- I: moment of inertia
- φ: angle
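The balance wheel example can be sketched in code. The following assumes a linear spiral spring, so that the torque is τ = -κφ; the spring constant κ, the parameter values, and the function name are hypothetical. The angular acceleration is obtained as the ratio of torque to moment of inertia, and the measured period is compared with the textbook value 2π√(I/κ) for a harmonic oscillator.

```python
import math

def balance_wheel_period(inertia=1.0, stiffness=4.0, phi0=0.3, dt=1e-4):
    """Integrate the angular motion and return the oscillation period.

    The period is measured as twice the time between two successive
    zero crossings of the angle.
    """
    phi, omega, t = phi0, 0.0, 0.0
    crossings = []
    while len(crossings) < 2:
        torque = -stiffness * phi              # generalized force
        omega += (torque / inertia) * dt       # angular acceleration = torque / I
        phi_new = phi + omega * dt
        if phi * phi_new < 0.0:                # sign change: zero crossing
            frac = phi / (phi - phi_new)       # linear interpolation within the step
            crossings.append(t + frac * dt)
        phi, t = phi_new, t + dt
    return 2.0 * (crossings[1] - crossings[0])

period = balance_wheel_period()
expected = 2.0 * math.pi * math.sqrt(1.0 / 4.0)
print(period, expected)
```

The update rule has the same structure as a simulation of a mass on a linear spring: only the names of the three dynamic entities differ.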

This pattern generalizes to all forms of generalized coordinates.

As we know: torque does not have the dimensions of force. For self-consistency: for a given choice of coordinate the corresponding generalized force must be such that the product of generalized force and its corresponding generalized coordinate has the dimensions of work.

Reference:
Richard Fitzpatrick, Classical Mechanics course:
Generalized forces

It is to be expected that the form of the fundamental equation is independent of the choice between cartesian coordinates and some form of generalized coordinates. Indeed: in classical mechanics the fundamental equation has the same form independent of the choice of coordinate system. The three dynamics entities are:
- agent of change (force, torque, etc.)
- opposition to change (inertia, moment of inertia, etc.)
- second time derivative of position coordinate

The expressions (II.1) and (II.2) have been set up to illustrate that the form of the fundamental equation is independent of the choice between cartesian and (some form of) generalized coordinate(s). 