This article is part of a set of four; the common factor is Calculus of Variations. In classical physics Calculus of Variations is applied in three areas: Optics, Statics, and Dynamics. Each article in the set is written as a standalone article, resulting in some degree of overlap.
The other three articles:
Foundation:  Calculus of Variations  
Optics:  Fermat's stationary time  
Statics:  The catenary 
The Energy-Position equation
1. From F = ma to the Work-Energy theorem
1.1 Torricelli's formula
1.2 The Work-Energy theorem
1.3 Significance of energy derivative
1.4 Derivative with respect to position
2. From Work-Energy to stationary action
2.1 Calculus, unit of operation
2.2 Unit of operation: evaluating area
2.3 From unit of operation to integral
3. Trial trajectories: response to variation
3.1 Potential increases linear with height
3.2 Potential increases quadratic with displacement
3.3 Potential increases proportional to cube of displacement
Appendix I Verification
Appendix II The meaning of stationary
Appendix III Jacob's Lemma
This exposition is about Hamilton's stationary action. The usual presentation is to posit Hamilton's stationary action, and to proceed with showing that F = ma can be recovered from it.
However, it is also possible to proceed in the other direction. You can start with F = ma, and proceed to Hamilton's stationary action in all forward steps. The development is in two stages:
· Derivation of the Work-Energy theorem from F = ma
· Demonstration that in cases where the Work-Energy theorem holds good, Hamilton's stationary action will hold good also.
The Work-Energy theorem and Hamilton's stationary action have the following in common: the physics taking place is described in terms of kinetic energy and potential energy. As an overarching name for both the Work-Energy theorem and Hamilton's stationary action I will use the expression 'energy mechanics'.
Interactive diagrams
The intention of this article is to let the interactive diagrams tell the story. You can choose to jump ahead to the interactive diagrams. However, I strongly recommend you return to these paragraphs at some point; they are necessary for overall understanding.
Requirement for well defined potential energy
Given a particular force: in order to have a well defined expression for potential energy the force must have the property that the work done in moving from some point A to some point B is independent of the path taken between the two points. The validity of energy mechanics is limited to forces with that property.
(It may be that in terms of a deeper theory such as quantum physics all interactions actually have that potential-is-independent-of-the-path property, but we should not blindly assume that.)
As we know: the principle of conservation of energy is asserted without any restriction. That's why the concept is referred to as a 'principle'; it is a blanket statement.
The discussion in this article avoids the blanket statement. The discussion in this article is limited to the classes of cases where by way of experimental corroboration it is evident that the work done in moving from some point A to some point B is independent of the path taken between the two points.
1. From F = ma to the Work-Energy theorem
1.1 Torricelli's formula
I will refer to the interactive diagrams as 'graphlets' (contraction of 'graphical applet').
In the following discussion I first obtain an expression that is known as Torricelli's formula (not to be confused with Torricelli's law), which is one of the two elements of the work-energy theorem. Each step in the discussion is a step up in generalization.
Using the standard letters for time, position, velocity and acceleration:
t time
s position
v velocity
a acceleration
In graphlet 1.1, use the radio button to switch to 'uniform velocity'. With uniform velocity the distance covered after a time interval of 't' units of time is equal to the area of the shaded rectangle. Significantly, this area property generalizes to nonuniform velocity. Use the radio button to switch to 'uniform acceleration'.
The area of the shaded region increases through successive addition of rectangular strips. For each increment Δt there is a corresponding rectangular strip. (In the graphlet the time increments are small, resulting in an appearance of smoothly increasing area.) For each added strip: the width is Δt and the height is given by (1.4): a·t. This demonstrates: after a time interval of t units of time the distance covered is ½at².
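The strip-by-strip build-up of area can also be checked numerically. The following is a minimal sketch, not part of the graphlet; the function name and the values for a and t are illustrative assumptions. Each strip has width dt and height a·t, evaluated at the middle of the strip.

```python
def distance_by_strips(a, t_end, n):
    """Add up n rectangular strips, each of width dt and height a*t."""
    dt = t_end / n
    total = 0.0
    for i in range(n):
        t_mid = (i + 0.5) * dt      # time at the middle of strip i
        total += (a * t_mid) * dt   # height (the velocity a*t) times width dt
    return total


a = 2.0       # uniform acceleration, illustrative value
t_end = 3.0   # elapsed time, illustrative value

approx = distance_by_strips(a, t_end, 1000)
exact = 0.5 * a * t_end**2          # the closed form: one half a t squared
print(approx, exact)
```

The two printed numbers should agree to within floating point rounding, since the velocity is linear in time and the mid-strip height then gives the exact area of each strip.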
We have the relation between force and acceleration:
$$ F = ma \tag{1.1} $$
The overall objective of this discussion is to show why it is that the Work-Energy theorem follows from F = ma. A key element is: acceleration is a second derivative.
Taking a second derivative means you perform the same operation twice, in a way that creates a recurring relation: acceleration is to velocity what velocity is to position.
$$ v = \frac{ds}{dt} \tag{1.2} $$
$$ a = \frac{dv}{dt} \tag{1.3} $$
The pattern that is laid out by (1.2) and (1.3) has surprisingly far reaching ramifications. That is the subject of the rest of this section.
Uniform acceleration
We will first explore uniform acceleration, and after that we will proceed to arbitrary acceleration.
For the case of uniform acceleration a we have:
$$ v = at \tag{1.4} $$
As demonstrated in graphlet 1.1: the distance s covered after t units of time, according to (1.4):
$$ s = \tfrac{1}{2}at^2 \tag{1.5} $$
To obtain Torricelli's formula for the case of uniform acceleration: multiply both sides of (1.5) with acceleration a, and substitute according to (1.4):
$$ a s = \tfrac{1}{2}a^2t^2 = \tfrac{1}{2}v^2 \tag{1.6} $$
(1.6) states a transformation of the product a·s to the product ½v². That transformation is through transferring a time factor t.
 The position coordinate s is rescaled to velocity by dividing it with the time factor t
 The acceleration value is rescaled to velocity by multiplying it with the time factor t.
For continuity with the form for nonuniform acceleration: accommodate nonzero value of initial position and nonzero value of initial velocity.
$$ a(s - s_0) = \tfrac{1}{2}v^2 - \tfrac{1}{2}v_0^2 \tag{1.7} $$
s_{0}  initial position
s  final position
v_{0}  initial velocity
v  final velocity
In graphlet 1.2 the left hand side of (1.7) is represented. The value in the slider knob is the time coordinate. The position is a function of time according to (1.5). That is, in graphlet 1.2 the y-coordinate is not a function of the x-coordinate of the displayed point. Instead time is used as a parameter and both the horizontal and vertical coordinate are a function of this common parameter.
In graphlet 1.3 both sides of (1.7) are represented:
In graphlet 1.3 the area of the shaded region increases through successive addition of rectangular strips. For each time increment Δt there is a corresponding rectangular strip.
Next step: generalization to nonuniform acceleration by evaluating an integral.
In the development of the integral the strategy represented in graphlet 1.3 is used:
· transformation of position coordinate to velocity coordinate
· transformation of acceleration coordinate to velocity coordinate.
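$$ \int_{s_0}^{s} a \, ds = \tfrac{1}{2}v^2 - \tfrac{1}{2}v_0^2 \tag{1.8} $$
$$ \int_{s_0}^{s} a \, ds = \int_{t_0}^{t} a \, v \, dt = \int_{t_0}^{t} v \, \frac{dv}{dt} \, dt = \int_{v_0}^{v} v \, dv = \tfrac{1}{2}v^2 - \tfrac{1}{2}v_0^2 $$
The second line shows the changes of differential, first from ds to dt, and then from dt to dv, each time with corresponding change of limits.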
This is quite a windfall: (1.8) has the same form as (1.7). That is: the generalization to nonuniform acceleration did not result in a more complicated expression. From here on I will use the name Torricelli's formula inclusively, using it to refer to the case of nonuniform acceleration as well.
Comparison of the left hand panel and right hand panel of graphlet 1.3:
· the integral of a ds specifies adding rectangular strips with height a and width ds
· the integral of v dv specifies adding rectangular strips with height v and width dv
In the process of performing the integration: the dimensions of the rectangular strips are transformed. From the left hand panel to the right hand panel: the height of each strip is multiplied with the current value of the factor t for time, and the width is divided by the current factor t. It follows that in the left hand panel and the right hand panel the areas are the same.
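That the areas in the two panels agree for nonuniform acceleration can be checked numerically. This is a minimal sketch under stated assumptions: the acceleration profile a(t) = 1 + 0.5t, the initial velocity, and the function names are all hypothetical choices for illustration. The integral of a ds is accumulated strip by strip with ds = v dt and compared with ½v² − ½v_{0}².

```python
def accel(t):
    return 1.0 + 0.5 * t                 # hypothetical nonuniform acceleration


def velocity(t, v0):
    return v0 + t + 0.25 * t**2          # antiderivative of accel, with v(0) = v0


def integral_a_ds(v0, t_end, n):
    """Sum strips of height a and width ds, using time as the parameter."""
    dt = t_end / n
    total = 0.0
    for i in range(n):
        t_mid = (i + 0.5) * dt
        ds = velocity(t_mid, v0) * dt    # width of the strip: ds = v dt
        total += accel(t_mid) * ds       # height a times width ds
    return total


v0 = 0.5
t_end = 2.0

lhs = integral_a_ds(v0, t_end, 10000)
v_end = velocity(t_end, v0)
rhs = 0.5 * v_end**2 - 0.5 * v0**2
print(lhs, rhs)
```

The agreement is approximate here (limited by the strip width), and improves as the number of strips is increased.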
1.2 The Work-Energy theorem
To obtain the work-energy theorem: set up the integral of force with respect to the position coordinate, substitute the force F with ma, and then restate according to Torricelli's formula:
$$ \int_{s_0}^{s} F \, ds = \int_{s_0}^{s} ma \, ds = \tfrac{1}{2}mv^2 - \tfrac{1}{2}mv_0^2 \tag{1.9} $$
(About the quantity ½mv^{2} on the right hand side of (1.9): earlier in the history of the theory of motion a quantity of the form mv^{2} was recognized in another context: collisions. In elastic collisions a quantity proportional to mv^{2} is conserved. This quantity-proportional-to-mv^{2} was referred to as 'the living force'. As we know: today a single concept is used, kinetic energy, with the value supplied by the Work-Energy theorem: ½mv^{2}.)
To emphasize the structure of the Work-Energy theorem I give the elements arranged in a table. In each row the statement follows mathematically from the statements in the row(s) above it.







Potential energy
Historically the concept of potential energy arose before the work-energy theorem reached today's standard form. With the benefit of hindsight we recognize that the concept of potential energy capitalizes on the work-energy theorem.
Potential energy is defined as the negative of the left hand side of (1.9); the integral of force over distance.
$$ \Delta E_p = -\int_{s_0}^{s} F \, ds \tag{1.10} $$
In order for the potential energy to be well defined a specific condition must be met: the value of the integral from point s_{0} to point s must be independent of how the test object moves from point s_{0} to point s.
When the potential energy is well defined:
$$ \Delta E_k = -\Delta E_p \tag{1.11} $$
That is: with potential energy defined according to (1.10): when an object is moving down a potential gradient the kinetic energy will increase by the amount that the potential energy decreases. (With everything reversed when moving in the opposite direction, of course.)
(1.12) combines three statements. I present these three statements as a unit to emphasize the interconnection between them. While the statements are different mathematically, the physics content of these three is the same.
From this point on all equations/statements in this article will be in terms of potential energy.
1.3 Significance of energy derivative
Potential energy and kinetic energy are both entities that do not have an intrinsic zero point. Whenever a measurement of energy is made it is a measurement of energy difference.
The only way to define potential energy at all is to define the potential difference between some start point and some end point.
Kinetic energy satisfies Galilean relativity, so it is definable only as a difference between some starting velocity and some end velocity.
In working with equations the value of the energy is not the relevant property. Instead the derivative of the energy is the relevant property. Note especially: the fact that energy doesn't have an intrinsic zero point makes no difference for the value of the derivative of the energy.
1.4 Derivative with respect to position
With (1.12) established we would like to use that as an energy mechanics tool.
The validity of (1.12) extends down to infinitesimal change:
$$ d(E_k) = -d(E_p) \tag{1.13} $$
The way to use (1.13) as the basis of an equation of motion is to take the derivative with respect to position.
Taking the derivative with respect to position is the inverse of the process from (1.1) to (1.9). That is: taking the derivative with respect to position recovers F = ma. (See Appendix I: Verification.)
$$ \frac{d(E_k)}{ds} = -\frac{d(E_p)}{ds} \tag{1.14} $$
I will refer to (1.14) as the 'Energy-Position equation', as it takes the derivative of the energy with respect to position.
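The claim that differentiating the kinetic energy with respect to position gives back ma can be tested with finite differences. The sketch below assumes a hypothetical trajectory s(t) = sin(t) and an illustrative mass; none of these names or values come from the graphlets. The derivative with respect to position is estimated as the ratio of a small change in kinetic energy to the corresponding small change in position.

```python
import math

m = 1.5   # mass, illustrative value


def position(t):
    return math.sin(t)            # hypothetical trajectory s(t)


def velocity(t):
    return math.cos(t)            # ds/dt for that trajectory


def acceleration(t):
    return -math.sin(t)           # dv/dt for that trajectory


def kinetic_energy(t):
    return 0.5 * m * velocity(t)**2


def dEk_ds(t, h=1e-5):
    """Finite-difference estimate of d(E_k)/ds at time t."""
    d_energy = kinetic_energy(t + h) - kinetic_energy(t - h)
    d_position = position(t + h) - position(t - h)
    return d_energy / d_position


t = 0.7
lhs = dEk_ds(t)               # derivative of kinetic energy with respect to position
rhs = m * acceleration(t)     # mass times acceleration
print(lhs, rhs)
```

The estimate is only usable at instants where the velocity is nonzero, since the change in position appears in the denominator.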
2. From Work-Energy to stationary action
2.1 Calculus, unit of operation
In the following section I discuss both Differential Calculus and Calculus of Variations, covering the parallels and the differences.
In this graphlet, 2.1, and likewise in 2.2, the shape of the dashed line is a parabola. That is: the curve represents the trajectory of an object that is launched upward, and is from that point on subject to uniform (downward) acceleration.
Differential equations are a higher level of equation than basic level equations. By 'basic level equation' I mean: an equation that has a value as its solution. Example: solving for the point where a graph crosses the x-axis.
As we know: the concept of a differential equation is that its solution is a function. Whereas a basic level equation is about a single point a differential equation is about being satisfied concurrently everywhere in the domain.
Graphlet 2.1 demonstrates the unit of operation of differential calculus: a pair of points. The unit of operation is concatenated over the entire domain. In the limit of making the concatenated units of operation infinitesimally small: the solution to the differential equation is a function.
In the case of calculus of variations the unit of operation is a triplet of points. The outer points of the triplet are treated as fixed points, and the effect of a small vertical shift of the middle point is evaluated.
The concatenation of the unit of operation is such that units are overlapping: each point is participating in the variation. The solution to a variational problem is a function such that the unit of operation is satisfied everywhere concurrently.
This connection between differential calculus and calculus of variations was first recognized by Jacob Bernoulli, published in 1697. See Appendix III: Jacob's Lemma.
Use the mouse to move the vertical slider. As you are sweeping out variation you are looking for the sweet spot: the point where as the object is moving the rate of change of kinetic energy matches the rate of change of potential energy.
(About the case in this specific diagram: a uniform force gives rise to a linear potential, and because of that: as you move the vertical slider the value of ΔE_{p} remains the same. With any other function for the potential energy the ΔE_{p} will change upon variation of the middle point.)
The diagram starts with the unit of operation depicted large, for the purpose of exploration. Of course: for the purpose of calculus the validity of the logic must extend all the way to infinitesimally small scale. This is visualized when the horizontal slider is moved.
The graphlets 3.1, 3.2 and 3.3 provide an implementation where in a concatenation of subsections each subsection can be changed individually.
2.2 Unit of operation: evaluating area
Hamilton's action is defined as the integral with respect to time of the Lagrangian (E_{k}-E_{p}), and the geometric interpretation of integration is to evaluate an area.
The following graphlet takes the unit of operation as starting point, setting up area evaluation. Specifically: change of area in response to variation sweep. The reasoning is set up to be valid at any scale. Hence the reasoning is valid down to infinitesimally small intervals.
The two time intervals t_{1,2} and t_{2,3} are displayed in three of the four subpanels: upper-left, upper-right, and lower-left. The time intervals are set up to be of equal duration.
In the upper-right subpanel: the height of each column represents the value of the energy corresponding to that time interval. The columns are of equal width, therefore the area of each column is in exact proportion to the corresponding energy value. Each column occupies half the width of Δt (the time interval), so upon concatenation of all time intervals the summed area of all the columns is equal to the integral over time.
The number displayed in red is the summed area of the two red columns; the number displayed in green is the summed area of the two green columns.
Hypothetical versus actual
When sweeping out variation the change of energy is hypothetical change of energy, not actual change of energy. In the diagram: at each position of the movable point the diagram shows what the energies would be if the object would move along that particular trial trajectory.
Stating the two different kinds of change explicitly:
 Rate of change of actual energy as an object is moving along the true trajectory.
 Rate of change of hypothetical energy (as a function of variation sweep)
In the case of the actual motion, the motion along the true trajectory, one form of energy is transformed into the other; the energies are counter-changing: ΔE_{k} = -ΔE_{p}
In the case of sweeping out variation, as illustrated in graphlet 2.2: the (hypothetical) energies are co-changing.
Subtraction
In the hypothetical variation sweep the energies are co-changing, so in order to compare them we need to subtract one from the other. (By convention it is the potential energy that gets the minus sign.)
In the lower-right subpanel: to express that the potential energy is subtracted from the kinetic energy the green area is displayed as area below the coordinate's zero line. This is the concept of signed area; counting area below the zero point of the coordinate system in the negative.
The lower-right subpanel represents in blue the result of the subtraction.
Stationary
The motion of the blue dot over the diagram represents the response of the value 'area(E_{k}-E_{p})' to variation sweep.
At the point where the variation hits the sweet spot the value of 'area(E_{k}-E_{p})' is stationary. (See Appendix II: The meaning of 'stationary'.) That means: at the sweet spot the rate of change of summed red area matches the rate of change of summed green area.
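The matching of the two rates of change at the sweet spot can be reproduced with a few lines of code. This is a minimal sketch of one triplet, under assumptions of my own: uniform acceleration g, two equal time intervals, velocity taken as constant within each interval, and potential energy evaluated at the mean height of each interval. The names and values are illustrative and are not taken from the graphlet.

```python
m, g = 1.0, 2.0        # mass and uniform acceleration, illustrative values
dt = 0.5               # duration of each of the two time intervals
h1, h3 = 0.0, 2.0      # fixed outer points of the triplet


def area_kinetic(h2):
    """Summed area of the two kinetic energy columns (energy times duration)."""
    v12 = (h2 - h1) / dt
    v23 = (h3 - h2) / dt
    return (0.5 * m * v12**2 + 0.5 * m * v23**2) * dt


def area_potential(h2):
    """Summed area of the two potential energy columns (mean height per interval)."""
    return (m * g * 0.5 * (h1 + h2) + m * g * 0.5 * (h2 + h3)) * dt


def slope(f, x, eps=1e-6):
    """Central-difference rate of change of f with respect to the middle point."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Middle point of the parabola h(t) = 3t - t^2, which passes through h1 and h3
h2_true = 3 * 0.5 - 0.5**2

for h2 in (1.0, h2_true, 1.5):
    print(h2, slope(area_kinetic, h2), slope(area_potential, h2))
```

In this sketch the slope of the potential energy area is the same at every position of the middle point (the potential is linear), while the slope of the kinetic energy area changes; the two slopes coincide only where the middle point lies on the parabola.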
Matching derivatives
In graphlet 2.2, when the variation hits the true trajectory the sum of kinetic energy and potential energy is a constant. In ascending motion along the true trajectory: the amount of decrease of kinetic energy is matched by the amount of increase of potential energy.
Expressed in the form of differentiation with respect to time, the true trajectory has the following property everywhere:
$$ \frac{d(E_k)}{dt} = \frac{d(-E_p)}{dt} $$
When the derivatives with respect to time are matching the derivatives with respect to the position coordinate are matching also.
For the true trajectory we have:
$$ \frac{d(E_k)}{ds} = \frac{d(-E_p)}{ds} $$
Variation of position
In classical mechanics when variation is applied it is variation of position. That means that evaluating the derivative of (E_{k}-E_{p}) with respect to variation and evaluating the derivative of (E_{k}-E_{p}) with respect to position is the same thing.
2.3 From unit of operation to integral
In graphlet 2.2 the width of each of the two bars is half the width of the time interval, which means that when concatenating units of operation the bars end up exactly adjacent. Hence we can proceed as follows: we divide the total duration in equal time intervals, and then replicate this bar configuration end to end, covering the entire duration.
With the entire duration divided in equal subintervals:
 The stationary property propagates from the subintervals to the whole duration; when the summed area is stationary in each and every subinterval then the summed area along the entire duration will be stationary.
 The time intervals can be made arbitrarily short.
Summing bars of signed area, in the limit of subdividing into infinitely many bars: that is evaluating the integral. That integral is Hamilton's action: the integral with respect to time of the Lagrangian (E_{k}-E_{p}).
Hence: when the circumstances are such that the Work-Energy theorem holds good it follows that Hamilton's stationary action will hold good.
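The propagation from subintervals to the whole duration can be illustrated with a discretized action. The sketch below is an assumption-laden illustration, not the graphlet's implementation: the duration is divided in n equal intervals, the velocity is taken as constant per interval, the potential energy uses the mean height per interval, and the values of m, g and T are illustrative. The derivative of the summed signed area with respect to each interior node is estimated by central differences.

```python
m, g = 1.0, 2.0     # illustrative values
n = 10              # number of equal time intervals
T = 2.0             # total duration
dt = T / n


def action(h):
    """Discretized action: sum over all intervals of (E_k - E_p) * dt."""
    total = 0.0
    for i in range(n):
        v = (h[i + 1] - h[i]) / dt
        e_k = 0.5 * m * v**2
        e_p = m * g * 0.5 * (h[i] + h[i + 1])   # mean height on the interval
        total += (e_k - e_p) * dt
    return total


def action_gradient(h, eps=1e-6):
    """Central-difference derivative of the action with respect to each interior node."""
    grad = []
    for i in range(1, n):
        up = h.copy()
        up[i] += eps
        down = h.copy()
        down[i] -= eps
        grad.append((action(up) - action(down)) / (2 * eps))
    return grad


# Samples of the parabola h(t) = 3t - t^2, the true trajectory for these values
true_path = [3 * (i * dt) - (i * dt)**2 for i in range(n + 1)]
# A straight line between the same end points, as a trial trajectory
line_path = [true_path[-1] * i / n for i in range(n + 1)]

max_true = max(abs(x) for x in action_gradient(true_path))
max_line = max(abs(x) for x in action_gradient(line_path))
print(max_true, max_line)
```

Along the sampled parabola every node-wise derivative comes out as zero to within rounding, so the summed area is stationary as a whole; along the straight line none of them do.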
Differential
The standard presentation of Hamilton's action is to state it in integral form:
$$ S = \int_{t_0}^{t_1} (E_k - E_p) \, dt $$
The graphlet set 2.1, 2.2, 2.3 shows that the response of the value of this integral to variation arises from response down at the infinitesimals. The response propagates from the infinitesimal scale to the value of the integral.
Therefore: stating the application of variational calculus in integral form is not a necessity.
3. Trial trajectories: response to variation
The following three graphlets are three instances of the same graphlet, for three successive classes of cases.
As you manipulate the sliders: the process to home in on the true trajectory is to manipulate the trial worldline such that over the entire trajectory the slope of the kinetic energy curve (red) matches the slope of the potential energy curve (green). When the trial trajectory is at the point in variation space where those two slopes match, the derivative of Hamilton's action is zero.
3.1 Potential increases linear with height
Graphlet 3.1 is for the case of a uniform force, causing an acceleration of 2 m/s^{2}.
As we know: with a uniform force the curve that represents the height as a function of time is a parabola.
In the 'energy' subpanel the green curve represents the minus potential energy. In effect the potential energy curve has been flipped upside down. That way you can see directly that when the trial trajectory coincides with the true trajectory the red and green curve are parallel to each other everywhere.
(Any form of calculating the true trajectory uses the derivative of the potential energy. That is, in calculation the value of the potential energy itself is not used. Because of that the choice of zero point of potential energy is arbitrary.)
In the lower right subpanel:
The blue dot represents Hamilton's action.
The label of the horizontal axis is p_{v}, which stands for 'variational parameter'. The positioning of the dots corresponds to the value of the main slider.
In the case of linear potential: when the trial trajectory coincides with the true trajectory the value of Hamilton's action is minimal/minimized. In the case implemented in this graphlet: any change of the trial trajectory results in raising the value of Hamilton's action.
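That any change of the trial trajectory raises the action in the linear case can be tried out numerically. This is a sketch under assumptions of my own: a discretized action with velocity constant per interval and potential energy at the mean height, the acceleration of 2 m/s^{2} from graphlet 3.1, and a hypothetical sine-shaped bump that leaves the end points in place.

```python
import math

m, g = 1.0, 2.0     # mass (illustrative) and the acceleration used in graphlet 3.1
n, T = 100, 2.0     # number of intervals and total duration (illustrative)
dt = T / n


def action(h):
    """Discretized action: sum over all intervals of (E_k - E_p) * dt."""
    total = 0.0
    for i in range(n):
        v = (h[i + 1] - h[i]) / dt
        total += (0.5 * m * v**2 - m * g * 0.5 * (h[i] + h[i + 1])) * dt
    return total


def trial_path(eps):
    """Parabola h(t) = 3t - t^2 plus a bump eps*sin(pi t/T) that vanishes at both ends."""
    times = [i * dt for i in range(n + 1)]
    return [3 * t - t**2 + eps * math.sin(math.pi * t / T) for t in times]


s_true = action(trial_path(0.0))
s_plus = action(trial_path(0.1))     # trial trajectory bulging upward
s_minus = action(trial_path(-0.1))   # trial trajectory bulging downward
print(s_true, s_plus, s_minus)
```

Both deformed trajectories give a larger value than the undeformed parabola, consistent with the action being at a minimum there.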
3.2 Potential increases quadratic with displacement
Graphlet 3.2 is for the case where the force increases in linear proportion to the displacement. This case is commonly referred to as Hooke's law.
As we know: With Hooke's law the resulting motion is harmonic oscillation and the curve that represents the displacement as a function of time is the sine function.
With Hooke's law the potential energy increases with the square of the displacement. That is: with Hooke's law we have that the rate of change of potential energy is on par with that of the kinetic energy: with Hooke's law the expressions for the energies are both quadratic expressions.
As we know: with Hooke's law the amplitude and period of the resulting oscillation are independent from each other. Hamilton's stationary action corroborates that: when the trial trajectory is the harmonic oscillation function then if you change the amplitude the value of the action remains the same.
(In the diagram the actual curve is the cosine function; the point is that it is the harmonic oscillation function.)
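The amplitude independence can be checked by integrating (E_{k}-E_{p}) along harmonic trial trajectories of different amplitude. The sketch below rests on assumptions that are mine rather than the graphlet's: the trajectory family x(t) = A sin(ωt), a time span of half a period (so that both end points stay at zero displacement for every amplitude), and illustrative values for m and k.

```python
import math

m, k = 1.0, 4.0                  # illustrative mass and spring constant
omega = math.sqrt(k / m)         # angular frequency of the harmonic oscillation
T_half = math.pi / omega         # half a period (assumed time span)


def action(amplitude, n=20000):
    """Midpoint-rule integral of (E_k - E_p) along x(t) = A sin(omega t)."""
    dt = T_half / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x = amplitude * math.sin(omega * t)
        v = amplitude * omega * math.cos(omega * t)
        total += (0.5 * m * v**2 - 0.5 * k * x**2) * dt
    return total


actions = [action(a) for a in (0.5, 1.0, 2.0)]
print(actions)
```

Over this time span the kinetic and potential contributions cancel for every amplitude, so the three values coincide. For a time span that is not a multiple of a quarter period the cancellation is incomplete, and this sketch would then show a dependence on amplitude.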
3.3 Potential increases proportional to cube of displacement
Graphlet 3.3 is for the case where the force increases proportional to the square of the displacement, hence the potential energy increases with the cube.
Click the button 'Show numerical' to show how the curve displayed in graphlet 3.3 was independently verified.
Comparing the trajectories
In graphlet 3.1 the potential energy as a function of position is linear. Whenever the potential energy as a function of position is of lower order than the expression for the kinetic energy the action reaches a minimum when the derivative of the action is zero. In graphlet 3.2 the potential energy function and the kinetic energy function are both quadratic expressions, so the action is neither a minimum nor a maximum. Here in graphlet 3.3 the potential energy as a function of position is a higher order expression than the one for the kinetic energy (cubic versus quadratic), and consequently when the derivative of the action is zero the action reaches a maximum.
I recommend enlarging the graphlet to the full width of the browser window. Visibility of the navigation column of this page can be toggled. When the navigation column has been hidden: use the button 'Larger' and zoom in on the page as a whole to make the graphlet fill the entire width of the browser window.
How to operate the demonstration:
The graphlet set 3.1, 3.2, and 3.3 is designed to be discoverable, but for the sake of completeness I provide a description.
The main slider, located at the bottom of the graphlet, executes a global variation sweep. The value displayed in the "knob" of the main slider is a value to implement the variation; from here on I will refer to it as the 'variational parameter p_{v}'. In the 'integral' panel the variational parameter p_{v} is along the horizontal axis.
Names for the 4 subpanels:
 upper left: Control panel
 lower left: Height panel
 upper right: Energies panel
 lower right: Integrals panel
Note: The 3 subpanels with a coordinate system are named after their vertical axis name: Height, Energy, Integral.
In the starting configuration the trajectory points (height panel) have been placed such that they coincide (to a very good approximation) with the true trajectory of the object.
When the object is moving along the true trajectory the rate of change of kinetic energy matches the rate of change of potential energy (work-energy theorem). This match of rate of change has been emphasized by turning the graph of the potential energy (green) upside down. When the object is moving along the true trajectory the red and green curve have the same slope along the entire trajectory.
The energies panel is where it happens. The energies panel represents how the equation that you are using solves for the true trajectory.
Hamilton's Action
Hamilton's Action is represented by the blue dot in the Integrals panel. Use the main slider to sweep out variation. The blue dot follows a curve; it is a curve in variation space. The slope of that curve represents the derivative of Hamilton's Action with respect to variation.
At the point where over the entire trajectory the derivative of the kinetic energy is equal to the derivative of the minus potential energy the derivative of the blue dot is zero.
In the control panel the two sliders on the far left are adjusters for the trajectory. The upper adjuster morphs the trajectory towards a curve that is more blunt than the true trajectory; the lower adjuster morphs the trajectory towards a triangle shape.
These adjusters allow morphing of the trial trajectory while maintaining that during the entire ascent the velocity never reverses from decreasing to increasing again, which would be unphysical.
The row of ten sliders is for adjustment of individual nodes. The three radio buttons toggle between three sets of node adjusting sliders.
The '× 1' button: for a ratio of 1-to-1 of moving the slider and movement of the corresponding node of the trajectory.
The '× 0.1' button: ratio of 10 to 1
The '× 0.01' button: ratio of 100 to 1, for fine adjustment.
The button 'Consolidate':
Clicking the button 'Consolidate' does the following: the current value of the '× 0.1' slider is transferred to the '× 1' slider. That is: after clicking the button 'Consolidate' the position of the '× 0.1' slider is reset to its zero position, and the '× 1' slider has been incremented accordingly.
In the energies panel: the grey point is draggable, and it drags the entire curve of the kinetic energy (red) with it. By shifting the kinetic energy curve over to the potential energy curve the user can verify that the red and green curve are parallel to each other along the entire trajectory. (The evaluation of the integral uses the unshifted position of the kinetic energy curve.)
In the integrals panel:
· Red/Green point: value of the integral of the red/green curve of the energies panel
· Blue point: the sum of the values of the red point and the green point
Finding the true trajectory by adjusting the sliders
As a seed move the '× 1' sliders so that the nodes are positioned along a straight line, say at 45 degrees.
In the 'Energies' subpanel, move the graph of the kinetic energy (red dots) down, on average a bit below the potential energy graph (green dots)
Use the '× 0.1' sliders to move the nodes upward bringing the red dots and green dots into alignment.
The true trajectory has the property that the rate of change of kinetic energy matches the rate of change of potential energy, so you want to make those two curves parallel to each other.
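The node-by-node adjustment described above can be mimicked in code. The following sketch replaces the manual slider work with an automatic relaxation, which is a technique I am substituting for illustration and not how the graphlet operates. It assumes the linear potential case with an acceleration of 2 m/s^{2}, velocity constant per interval, and potential energy at the mean height per interval; under those assumptions a triplet is stationary when the middle node sits at the mean of its neighbours plus ½·g·dt².

```python
g = 2.0            # uniform downward acceleration, as in graphlet 3.1
n = 10             # number of time intervals (illustrative)
T = 2.0            # total duration (illustrative)
dt = T / n

# Seed: nodes along a straight line between the fixed end points
h = [2.0 * i / n for i in range(n + 1)]

# Repeatedly move each interior node to the stationary point of its own triplet.
for sweep in range(500):
    for i in range(1, n):
        h[i] = 0.5 * (h[i - 1] + h[i + 1]) + 0.5 * g * dt**2

# The parabola through the same end points, for comparison
parabola = [3 * (i * dt) - (i * dt)**2 for i in range(n + 1)]
max_error = max(abs(a - b) for a, b in zip(h, parabola))
print(max_error)
```

Each pass brings neighbouring nodes into better agreement, in the same way that repeated small slider adjustments bring the red and green curves into alignment; after enough passes the nodes settle on the parabola.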
Appendix I: Verification
The Work-Energy theorem is constructed by stating the integral with respect to position of F=ma
Hence taking the derivative with respect to position will recover F=ma.
Simplify to initial position coordinate zero and initial velocity zero:
the derivative with respect to position:
In the Euler-Lagrange equation, (I.10) and (I.11), it looks as if the way the kinetic energy term is processed is different from how the kinetic energy term is processed in the Energy-Position equation, but that is not actually the case.
In classical mechanics the Euler-Lagrange equation takes the derivative of the energy with respect to the position coordinate. Proof: (I.13) proceeds with the differentiation of kinetic energy specified by the Euler-Lagrange equation:
(I.14) evaluates to ma, showing that in classical mechanics the Energy-Position equation (1.14), and the Euler-Lagrange equation (I.10)/(I.11) are the same equation.
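The two ways of processing the kinetic energy term can be written out side by side. This is a sketch in generic notation, using only v = ds/dt and a = dv/dt; it does not reproduce the numbered equations of this appendix. The first line differentiates the kinetic energy with respect to position, the second applies the kinetic energy operation of the Euler-Lagrange equation.

```latex
\begin{align*}
\frac{d}{ds}\left(\tfrac{1}{2}mv^2\right)
  &= mv\,\frac{dv}{ds}
   = m\,\frac{ds}{dt}\,\frac{dv}{ds}
   = m\,\frac{dv}{dt}
   = ma \\
\frac{d}{dt}\left(\frac{\partial}{\partial v}\,\tfrac{1}{2}mv^2\right)
  &= \frac{d}{dt}\left(mv\right)
   = m\,\frac{dv}{dt}
   = ma
\end{align*}
```

Both lines arrive at ma, which is the sense in which the two treatments of the kinetic energy term coincide.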
Appendix II: The meaning of 'stationary'
Red and green are both ascending functions. We want to identify the coordinate where the rate of change of the red value matches the rate of change of the green value. The shortest way to get there is by setting up an equation with the derivative of the red curve on one side and the derivative of the green curve on the other side.
In the case of Hamilton's stationary action the same comparison is performed, but in a form that takes more steps. We see the following: a third function is defined, which is called the Lagrangian L: L=(E_{k}-E_{p}), and the point to be identified is called the point of 'stationary action'. 'Stationary action' is another way of saying: identify the point where the derivative of the Lagrangian is zero.
Summarizing:
Hamilton's stationary action works by imposing the constraint that was established with (1.11): the rate of change of kinetic energy must match the rate of change of potential energy.
Appendix III: Jacob's Lemma
When Johann Bernoulli had presented the Brachistochrone problem to the mathematicians of the time, Jacob Bernoulli was among the few who were able to find the solution independently. The treatment by Jacob Bernoulli is in the Acta Eruditorum, May 1697, pp. 211-217.
Jacob opens his treatment with an observation concerning the fact that the curve that is sought is a minimum curve.
Lemma. Let ACEDB be the desired curve along which a heavy point falls from A to B in the shortest time, and let C and D be two points on it as close together as we like. Then the segment of arc CED is among all segments of arc with C and D as end points the segment that a heavy point falling from A traverses in the shortest time. Indeed, if another segment of arc CFD were traversed in a shorter time, then the point would move along ACFDB in a shorter time than along ACEDB, which is contrary to our supposition.
Jacob's lemma generalizes to all cases where the curve that you want to find is an extremum; either a maximum or a minimum. If the evaluation is an extremum for the entire curve, then it is also an extremum for any subsection of the curve, down to infinitesimally short subsections.
The interactive diagrams on this page are created with the Javascript Library JSXGraph. JSXGraph is developed at the Lehrstuhl für Mathematik und ihre Didaktik, University of Bayreuth, Germany.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.