Poincaré’s 1901 paper[^1] introduces (in a mere three pages) the Euler-Poincaré equations, which are the specialization of the Euler-Lagrange equations to the case where a Lie group $G$ acts on the manifold $\mathcal Q$. In this case, vector fields on $\mathcal Q$ can be expressed in terms of the infinitesimal actions of the group, and the Euler-Lagrange equations reduce to the Euler-Poincaré equations. In what follows, I work through Poincaré’s paper without making too many identifications.
Paths in $\mathcal Q$
Think of the space $\mathcal Q$ as being the $n$-dimensional configuration space of some dynamical system. It is assumed that an $r$-dimensional Lie group $G$ acts transitively on $\mathcal Q$, i.e., for any two points $q, q^\prime \in \mathcal Q$, there exists a (not necessarily unique) $g\in G$ such that $g\cdot q = q^\prime$. Given a basis $\lbrace E_i \rbrace_{i=1}^r$ for $\mathfrak g$, the group action induces a set of vector fields $\lbrace \overline E_i\rbrace_{i=1}^r$ on $\mathcal Q$ which are sometimes called ‘infinitesimal generators’. These vector fields may be defined in terms of their actions on an arbitrary function $f\in C^\infty(\mathcal Q)$:
$$[\overline E_i f](q) = \lim_{\epsilon\rightarrow 0} \frac{f(\exp(\epsilon E_i)\cdot q) - f(q)}{\epsilon}.$$We will also use $\Phi(g, q) \coloneq g\cdot q$ and $\Phi^{(q)}(g) \coloneq \Phi(g, q)$ to denote the group action. Then, $\overline E_{i,q}$ is nothing but the pushforward vector $d\Phi^{(q)}_e( E_i)$.
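To make the definition concrete, here is a minimal numerical sketch (my own illustration, not taken from Poincaré's paper). It assumes $G = SO(3)$ acting on $\mathbb R^3$ by rotation, with $E_i$ taken to be the skew-symmetric matrix associated with the $i$-th standard basis vector; the action is transitive only on each sphere, but the definition of the generators does not need transitivity. The helper names `hat`, `exp_so3`, and `generator_fd` are hypothetical. The sketch compares a finite-difference version of the limit above with the closed form $[\overline E_i f](q) = \nabla f(q)\cdot(e_i\times q)$, which holds for this particular action.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix with hat(w) @ q == np.cross(w, q)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Matrix exponential of hat(w) via Rodrigues' formula."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-12:
        return np.eye(3) + K  # first-order approximation near the identity
    return (np.eye(3) + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def generator_fd(i, f, q, eps=1e-6):
    """Central-difference estimate of [E_i-bar f](q) for the rotation action."""
    e = np.zeros(3)
    e[i] = 1.0
    return (f(exp_so3(eps * e) @ q) - f(exp_so3(-eps * e) @ q)) / (2.0 * eps)

# An arbitrary smooth test function and its gradient (assumptions for the demo).
f = lambda x: x[0] * x[1] + np.sin(x[2])
grad_f = lambda x: np.array([x[1], x[0], np.cos(x[2])])

q = np.array([0.3, -0.5, 0.8])
fd = np.array([generator_fd(i, f, q) for i in range(3)])
exact = np.array([grad_f(q) @ np.cross(np.eye(3)[i], q) for i in range(3)])
print(np.max(np.abs(fd - exact)))  # should be tiny
```

The central difference is used only for numerical accuracy; it approximates the same derivative as the one-sided limit in the definition.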
Let the map $\gamma:[0,1] \rightarrow \mathcal Q$ describe a path in $\mathcal Q$ (e.g., the trajectory of a dynamical system whose state at time $t$ is $\gamma(t)$), starting at time $t=0$ and ending at time $t=1$. Due to transitivity of the group action, we can express the “velocity” vectors of $\gamma$ as follows:
$$ \begin{align*} \dot\gamma(s) \coloneq d \gamma_s \left(\frac{\partial}{\partial t}\Big\rvert_{t=s}\right) &= \eta^i(s) \overline E_i\rvert_{\gamma(s)} \in T_{\gamma(s)}\mathcal Q, \end{align*} $$where $\eta^i(s)$ are some coefficient/component functions that describe the velocity of $\gamma$. Note that the coefficients $\left\lbrace \eta^i\right\rbrace_{i=1}^r$ are not unique when $n < r$, since $\left\lbrace \overline E_{i,\gamma(s)}\right\rbrace_{i=1}^r$ is then an over-complete (linearly dependent) spanning set for $T_{\gamma(s)}\mathcal Q$.
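As a quick illustration of this non-uniqueness (my own example, under the assumption that $G = SO(3)$ acts on the unit sphere $\mathcal Q = S^2 \subset \mathbb R^3$ by rotations, so that $r = 3$ and $n = 2$): identify $\mathfrak{so}(3)$ with $\mathbb R^3$ and take $E_i$ to be the generator of rotations about the $i$-th axis. The infinitesimal generators are then $\overline E_{i}\rvert_q = e_i \times q$, and so

$$ \eta^i\, \overline E_i\rvert_{q} = \boldsymbol\eta \times q = (\boldsymbol\eta + c\, q)\times q \quad \text{for every } c \in \mathbb R. $$

The component of $\boldsymbol\eta$ along $q$ generates a rotation that fixes $q$, so it cannot be recovered from the velocity.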
⌁ Variation of $\gamma$
Denote the space of paths on $\mathcal Q$ as $\mathcal P\mathcal Q$, so that $\gamma \in \mathcal P\mathcal Q$. Consider a family of paths $\Gamma(\ \cdot\ , \lambda)$ such that $\Gamma(t, 0) = \gamma(t)$. As $\lambda$ is varied, we think of $\Gamma(\ \cdot\ , \lambda)$ as a path in $\mathcal P \mathcal Q$, i.e., a path through path space! We know that the derivatives of such paths-of-paths at $\lambda = 0$ should correspond bijectively to the tangent vectors in $T_{\gamma}\mathcal P \mathcal Q$:
$$ \frac{\partial}{\partial\lambda} \Gamma (\ \cdot\ , \lambda)\Big\rvert_{\lambda=0} \in T_{\gamma}\mathcal P \mathcal Q. $$
For each $t\in[0, 1]$, this is nothing but a tangent vector of $\mathcal Q$:
\[ \frac{\partial}{\partial\lambda} \Gamma (t, \lambda)\Big\rvert_{\lambda=0} \in T_{\gamma(t)}\mathcal Q. \]
Thus, as we vary $t$, we obtain a vector field along $\gamma$.
An element $\delta\hspace{-1.5pt}\gamma \in T_{\gamma} \mathcal P \mathcal Q$ of this tangent space is commonly referred to as a variation of $\gamma$.[^2] It is both a vector in $T_{\gamma} \mathcal P \mathcal Q$ and a vector field (along $\gamma$) on $\mathcal Q$. Since it’s a vector field, we know that $\delta\hspace{-1.5pt}\gamma$ can be expressed as
\[ \delta\hspace{-1.5pt}\gamma(t) \coloneq \frac{\partial}{\partial \lambda} \Gamma(t, \lambda)\Big\rvert_{\lambda=0} =\xi^i(t) \overline E_{i,\gamma(t)}. \]
Meanwhile, the $t$-derivative of $\Gamma$ also defines a vector field:
$$ \dot\Gamma(s,\lambda) \coloneq \frac{\partial}{\partial t} \Gamma(t, \lambda)\Big\rvert_{t=s} = \eta^i(s, \lambda)\overline E_{i,\Gamma(s,\lambda)}.$$Since $\Gamma(\hspace{1pt}\cdot\hspace{1pt},\lambda)$ coincides with $\gamma$ at $\lambda=0$, we have the compatibility condition $\eta^i(t,0)=\eta^i(t)$.
| Notation | Meaning |
|---|---|
| $\Gamma(\cdot, \lambda)$ | perturbed version of $\gamma$ |
| $\delta\hspace{-1.5pt}\gamma$ | direction in which $\gamma$ is perturbed (i.e., variation of $\gamma$) |
| $\dot\Gamma(\cdot, \lambda)$ | velocity vector field of $\Gamma(\cdot, \lambda)$ |
| $\left\lbrace \xi^i,\eta^i \right\rbrace_{i=1}^r$ | coefficients of $\delta\hspace{-1.5pt}\gamma$ and $\dot\Gamma$, respectively |
| $\dot{\square}$, $\delta\hspace{0pt}\square$ | derivatives of $\square$ with respect to $t$ and $\lambda$, resp. |
Now for the tricky part – any variation of $\gamma$ induces a variation of $\dot\gamma$, which can be described by the functions $\lbrace\delta\hspace{-1.5pt}\eta^i\rbrace_{i=1}^r$ defined as follows:
$$ \begin{align*} \delta\hspace{-1.5pt}\eta^i(s) &\coloneq \frac{\partial}{\partial\lambda} \eta^i(s, \lambda)\Big\rvert_{\lambda=0}. \end{align*} $$As the variational principle requires us to pass from the $n$-dimensional space $\mathcal Q$ to the $(n+r)$-dimensional space $\mathcal Q \times \mathfrak g$, we need additional constraints that relate $\delta\hspace{-1.5pt}\gamma$ to $\lbrace\delta\hspace{-1.5pt}\eta^i\rbrace_{i=1}^r$, ensuring that the variations of $\gamma$ and $\dot \gamma$ are compatible.[^3] To put it differently, notice that since $\delta\hspace{-1.5pt}\gamma$ determines the perturbed curve $\Gamma(\hspace{1pt}\cdot\hspace{1pt}, \lambda)$ for vanishingly small values of $\lambda$, the velocity vectors of the perturbed curve are already determined by $\delta\hspace{-1.5pt}\gamma$, meaning that we don’t have the freedom to also specify $\lbrace\delta\hspace{-1.5pt}\eta^i\rbrace_{i=1}^r$ arbitrarily.
Annoyingly, Poincaré considers this matter a triviality. His (translated) paper reads “… and [one] can easily find…” before the result is presented. It is indeed a triviality when $G$ is abelian[^4], in which case we can use what Marsden & Ratiu like to call the equality of mixed partials: $\frac{\partial^2}{\partial t \partial \lambda} = \frac{\partial^2}{\partial \lambda \partial t}$. Much of what follows is a discussion of how this equality manifests in the non-abelian case, without resorting to the convenience of matrix Lie groups.
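For instance (a sketch under the assumption that $G = \mathbb R^n$ acts on $\mathcal Q = \mathbb R^n$ by translations, so that $\overline E_i = \partial/\partial q^i$), the coefficients are just coordinate derivatives of $\Gamma$, and the compatibility condition is literally the equality of mixed partials:

$$ \eta^i(t,\lambda) = \frac{\partial \Gamma^i}{\partial t}(t,\lambda), \quad \xi^i(t) = \frac{\partial\Gamma^i}{\partial\lambda}(t, 0) \quad\Longrightarrow\quad \delta\hspace{-1.5pt}\eta^i(t) = \frac{\partial^2\Gamma^i}{\partial\lambda\,\partial t}(t,0) = \frac{\partial^2\Gamma^i}{\partial t\,\partial \lambda}(t,0) = \dot\xi^i(t). $$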
⌁ Variation of $\dot\gamma$
The reader may want to specialize to matrix Lie groups and finish the argument algebraically, as done in Theorem 13.5.3 of Introduction to Mechanics and Symmetry by Marsden & Ratiu. Here, I will do a slightly more general version of the proof. What follows can be supplemented with Lee’s Introduction to Riemannian Manifolds (Lee IRM).
Consider an affine torsion-free connection $\nabla$ on $\mathcal Q$. The curve $\gamma(t)$ and the connection $\nabla$ together define the covariant derivative operator, $D_t(\hspace{1pt}\cdot\hspace{1pt})$. Using the Symmetry Lemma [Lemma 6.2, Lee IRM], we have
$$ \begin{align*} D_t\left[\frac{\partial}{\partial \lambda}\Gamma(t, \lambda)\Big|_{\lambda=0}\right](s) = D_{\lambda} \left[\frac{\partial}{\partial t}\Gamma(t, \lambda)\Big|_{t=s}\right](0). \end{align*} $$Either side of this equation is a vector at $\Gamma(s,0)$. The idea is that we need to relate the left-hand side to $\dot\xi^i$ and the right-hand side to $\delta\hspace{-1.5pt}\eta^i$. The left-hand side is[^5]
$$ \begin{align*} D_t\delta\hspace{-1.5pt}\gamma(s) &= \dot\xi^i(s)\overline E_{i,\gamma(s)} + \xi^i(s)\big[\nabla_{\dot\gamma(s)}\overline E_{i}\big]_{\gamma(s)}\\ &= \dot\xi^i(s)\overline E_{i,\gamma(s)} + \xi^i(s)\eta^j(s)\big[\nabla_{\overline E_{j}}\overline E_{i}\big]_{\gamma(s)}, \end{align*} $$whereas the right-hand side is
$$ \begin{align*} D_{\lambda} &\left[\eta^i(s, \lambda)\overline E_{i,\Gamma(s,\lambda)}\right](0) \\&= \frac{\partial }{\partial\lambda} \eta^i(s, \lambda)\Big|_{\lambda=0}\overline E_{i,\Gamma(s,0)} + \eta^i(s, 0) \big[\nabla_{\delta\hspace{-1.5pt}\gamma(s)}\overline E_{i}\big]_{\gamma(s)} \\&= \delta\hspace{-1.5pt}\eta^i(s)\overline E_{i,\gamma(s)} + \eta^i(s)\big[\nabla_{\delta\hspace{-1.5pt}\gamma(s)}\overline E_{i}\big]_{\gamma(s)}\\&= \delta\hspace{-1.5pt}\eta^i(s)\overline E_{i,\gamma(s)} + \xi^j(s)\eta^i(s)\big[\nabla_{\overline E_{j}}\overline E_{i}\big]_{\gamma(s)} \\&= \delta\hspace{-1.5pt}\eta^i(s)\overline E_{i,\gamma(s)} \\&\qquad+ \xi^j(s)\eta^i(s)\hspace{2pt}\big(\hspace{1pt}\nabla_{\overline E_{i}}\overline E_{j} - [{\overline E_{i}}, \overline E_{j} ]\big)_{\gamma(s)}. \end{align*} $$The last equality (as well as the symmetry lemma itself) follows from the connection being torsion-free. Note that $[{\overline E_{i}}, \overline E_{j} ] = L_{\overline E_{i}} \overline E_{j}$ is the Lie derivative; we will return to this point shortly. For now, we observe that the symmetry lemma yields
$$ \begin{align*} \dot\xi^i(s)\overline E_{i,\gamma(s)} &= \delta\hspace{-1.5pt}\eta^i(s)\overline E_{i,\gamma(s)} - \eta^i(s)\xi^j(s)[{\overline E_{i,\gamma(s)}}, \overline E_{j,\gamma(s)} ]. \end{align*} $$
⌁ Returning to $\mathfrak g$
We are not quite done; the reader will notice that our expression appears to agree with Marsden and Ratiu’s, but has a sign-difference when compared to Poincaré’s. Actually, our equations are closer to Poincaré’s. The apparent discrepancy is due to the fact that the vector fields $\lbrace \overline E_i\rbrace_{i=1}^r$ are more closely related to the right-invariant vector fields (RIVFs) on $G$ than they are to the left-invariant vector fields (LIVFs). In particular, there is a Lie algebra anti-homomorphism: $[{\overline E_i, \overline E_j}] = -\overline{[ E_i, E_j]}$ (proven in the appendix), since the usual Lie bracket on $\mathfrak g$ is also defined via LIVFs. Our urge to make everything “act from the left” in mathematical notation has led us to consider left group actions on $\mathcal Q$, and I suppose the same urge has made LIVFs the predominant choice on $G$.
Let $\xi(s) \coloneq \xi^i(s) E_i$, $\eta(s) \coloneq \eta^i(s) E_i$, and $\delta\hspace{-1.5pt}\eta(s) \coloneq \delta\hspace{-1.5pt}\eta^i(s) E_i$ be curves in $\mathfrak g$. Note that $\overline{\small\dot\xi(s)} = \dot\xi^i(s)\overline E_{i,\gamma(s)}$ and that, at each fixed $s$, $\eta^i(s)\xi^j(s)[\overline E_i, \overline E_j] = [\hspace{1pt}\overline{\eta(s)}, \overline{\xi(s)}\hspace{1pt}]$. We have,
$$ \begin{align*} \overline{\small\dot\xi} = \overline{\delta\hspace{-1.5pt}\eta}- \big[\hspace{2pt}\overline{\small \eta}\hspace{0.5pt},\hspace{1pt}&\overline{\small \xi}\hspace{2pt}\big]= \overline{\delta\hspace{-1.5pt}\eta} + \overline{[\hspace{1.5pt}{\small\eta}\hspace{0.5pt},\hspace{1pt} {\small \xi\hspace{1.5pt}}]}\\ &\Downarrow \\ \dot\xi={\delta\hspace{-1.5pt}\eta} &+ {[\hspace{1.5pt}{\small\eta}\hspace{0.5pt},\hspace{1pt} {\small \xi\hspace{1.5pt}}]}. \end{align*} $$Finally, we have something that agrees with Poincaré’s note. It agrees with Marsden & Ratiu’s too, but corresponds to the right-invariant version of their result, which in their book is left as an exercise to the reader. In the proof presented by M & R, they consider the right group action of $G$ on $\mathcal Q$, whereas we have considered a left action. Consequently, M & R describe $\dot\Gamma$ and $\delta\hspace{-1.5pt}\gamma$ using their left-invariant velocities, while we had to work with their right-invariant counterparts.
In the basis we chose for $\mathfrak g$, we can compute the structure constants $\lbrace c_{ij}^k\rbrace_{i,j,k=1}^r$, defined by $[E_i, E_j] = c_{ij}^k E_k$, after which we can write
$$ \dot\xi^i = {\delta\hspace{-1.5pt}\eta}^i \ +\ \eta^{\hspace{0.5pt}j}\, \xi^k \,c_{jk}^i. $$
Euler-Poincaré Equations
Let $\mathscr L:T\mathcal Q \rightarrow \mathbb R$ be a smooth function, called the Lagrangian (recall that a Hamiltonian $\mathscr H$ is instead a function on $T^\ast \mathcal Q$). Let $\eta(t)\coloneq \eta^i(t)E_i$. Given a tuple $(\gamma(t),\eta(t))$, we can map it to a unique point in $T\mathcal Q$:
$$ (\gamma(t),\eta(t))\mapsto\big(\gamma(t),\overline{\eta(t)}_{\gamma(t)}\big) $$(this is not precisely the pushforward map of $\Phi$, but it’s very close!) Consequently, we can pull back $\mathscr L$ to define a function on $\mathcal Q \times \mathfrak g$, denoted as $\mathscr L'$. The function $\mathscr L'$ will be invariant under the transformations of $\eta$ that leave $\overline{\eta}$ unchanged.
Now, $\mathscr L'$ can be further pulled back under $\tilde \gamma : t\mapsto(\gamma(t),\eta(t))$ to define $\mathscr L' \circ \tilde\gamma$, a function on $[0,1]$ that can be integrated! Consider the mapping $\mathscr A:\mathcal P(\mathcal Q\times \mathfrak g) \rightarrow \mathbb R$ defined by
$$ \begin{align} \mathscr A(\tilde\gamma) &= \int_0^1 \mathscr L' \circ \tilde\gamma\,(t)\hspace{1pt} dt = \int_0^1 \mathscr L'\big(\gamma(t), \eta(t)\big)\hspace{1pt} dt, \end{align} $$called the action functional.[^6] Even before we consider the variation in $\gamma$, we already see that $\gamma(t)$ and $\eta(t)$ should satisfy a constraint, namely that $\dot\gamma(t)=\eta^i(t)\overline{E_i}_{\gamma(t)}$. In the classical derivation, this was replaced by the “identity” $\dot q(t) = \frac{d}{dt} q(t)$.
$$ \require{amscd} \begin{CD} \mathcal Q \times \mathfrak g @>{(p,\, X) \mapsto (p,\,\overline{X}_p)}>> T\mathcal Q \\ @A{\tilde\gamma}AA @VV{\mathscr L}V \\ [0,1] @>{{\mathscr L}'\circ\tilde\gamma}>> \mathbb{R} \end{CD} $$The variations in $\gamma$ and $\eta$ will introduce another constraint: $\delta\hspace{-1pt}\eta$ should be compatible with $\delta\hspace{-1pt}\gamma$ (as was made precise in the preceding section). These variations will together induce a variation in $\mathscr L'$, and consequently in $\mathscr A$, the last of which we will set to $0$. I will write this using coordinates (see Appendix A) because (i) we can, and (ii) I don’t know how to do this in a coordinate-free way (yet):
$$ \begin{align} {d\mathscr A}_\gamma \big((\delta\hspace{-1pt}\gamma, \delta\hspace{-1.5pt}\eta)\big) &= \int_0^1 \left[\frac{\partial \mathscr L'}{\partial\,\gamma^i\,}\delta\hspace{-1pt}\gamma^i + \frac{\partial \mathscr L'}{\partial\,\eta^i\,}\delta\hspace{-1pt}\eta^i\right]\big(\tilde\gamma(t)\big)\hspace{1pt} dt =0. \end{align} $$The object ${d\mathscr A}_\gamma $ is written by Marsden & Ratiu as $\frac{\delta \mathscr A}{\delta \gamma\ }$ – it is the exterior derivative of $\mathscr A$. Using the compatibility condition from the previous section (as well as App. A), we have
$$ \begin{align} &\int_0^1 \left[\frac{\partial \mathscr L'}{\partial\,\gamma^i\,}\xi^j \overline E_j^i + \frac{\partial \mathscr L'}{\partial\,\eta^j\,}\dot\xi^j \ +\ \frac{\partial \mathscr L'}{\partial\,\eta^i\,}\eta^{k} \xi^j c_{jk}^i\right]\hspace{1pt} dt\\ &=\int_0^1 \xi^j \left[\frac{\partial \mathscr L'}{\partial\,\gamma^i\,} \overline E_j^i - \frac{d}{dt}\left(\frac{\partial \mathscr L'}{\partial\,\eta^j\,}\right)+ \frac{\partial \mathscr L'}{\partial\,\eta^i\,} \eta^{k}c_{jk}^i\right]\hspace{1pt} dt\\ &\qquad\qquad\qquad +\int_0^1 \frac{d}{dt}\left(\frac{\partial \mathscr L'}{\partial\,\eta^j\,} \xi^j\right) dt =0. \end{align} $$Here we substituted $\delta\hspace{-1pt}\gamma^i = \xi^j \overline E_j^i$ and $\delta\hspace{-1.5pt}\eta^i = \dot\xi^i - \eta^j\xi^k c_{jk}^i = \dot\xi^i + \eta^k\xi^j c_{jk}^i$ (using the antisymmetry $c_{jk}^i = -c_{kj}^i$), and then integrated the $\dot\xi^j$ term by parts. But the fundamental theorem of calculus tells us that
$$ \int_0^1 \frac{d}{dt}\left(\frac{\partial \mathscr L'}{\partial\,\eta^j\,} \xi^j\right) dt = \left[\frac{\partial \mathscr L'}{\partial\,\eta^j\,} \xi^j\right]_{t=0}^{t=1}. $$For variations that fix the endpoints (I explain in the footnotes why we need this constraint), the above term is $0$, and we can localize the integral to obtain the Euler-Poincaré equations:
$$ \begin{align} \frac{d}{dt}\left(\frac{\partial \mathscr L'}{\partial\,\eta^j\,}\right) = \frac{\partial \mathscr L'}{\partial\,\eta^i\,} \eta^{k}c_{jk}^i + \frac{\partial \mathscr L'}{\partial\,\gamma^i\,} \overline E_j^i . \end{align} $$And there it is, une forme nouvelle des équations de la mécanique. As Poincaré points out, this is especially of interest when $\mathscr L'$ only depends on $\eta$ (e.g., when computing geodesic motion).
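To see the $\eta$-only case in action, here is a minimal sketch (my own illustration, not from Poincaré's paper). It assumes $G = SO(3)$ with a basis satisfying $[E_j, E_k] = \epsilon_{jki}E_i$, so that $c^i_{jk} = \epsilon_{ijk}$, and a hypothetical Lagrangian $\mathscr L' = \tfrac12 \sum_i I_i (\eta^i)^2$ with made-up weights $I_i$. Writing $m_j \coloneq \partial \mathscr L'/\partial \eta^j = I_j\eta^j$, and taking the structure-constant term to be $\frac{\partial \mathscr L'}{\partial \eta^i}\eta^k c^i_{jk}$, the equations become $\dot m_j = m_i\,\eta^k \epsilon_{ijk} = (\boldsymbol\eta \times \mathbf m)_j$. The sketch integrates this with a hand-written RK4 step and checks that the energy and $\lVert \mathbf m\rVert$ stay constant, as they should since $\boldsymbol\eta\cdot(\boldsymbol\eta\times\mathbf m) = \mathbf m\cdot(\boldsymbol\eta\times\mathbf m) = 0$.

```python
import numpy as np

I_diag = np.array([1.0, 2.0, 3.0])  # hypothetical weights I_i in L' (an assumption)

def m_dot(m):
    """Right-hand side of dm/dt = eta x m, with eta recovered from m = I eta."""
    eta = m / I_diag
    return np.cross(eta, m)

def rk4_step(m, dt):
    """One classical fourth-order Runge-Kutta step for dm/dt = m_dot(m)."""
    k1 = m_dot(m)
    k2 = m_dot(m + 0.5 * dt * k1)
    k3 = m_dot(m + 0.5 * dt * k2)
    k4 = m_dot(m + dt * k3)
    return m + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def energy(m):
    """L' evaluated along the solution: (1/2) * sum_i m_i^2 / I_i."""
    return 0.5 * np.dot(m, m / I_diag)

m0 = np.array([0.4, -1.0, 0.7])  # arbitrary initial condition
m = m0.copy()
for _ in range(2000):
    m = rk4_step(m, 1e-3)

print(abs(energy(m) - energy(m0)), abs(np.linalg.norm(m) - np.linalg.norm(m0)))
```

Both printed drifts should be at the level of the integrator's truncation error. A structure-preserving integrator would be the better choice for long simulations; RK4 is used here only to keep the sketch short.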
Appendices
A. Computation in Coordinates
Let $\lbrace q^i\rbrace_{i=1}^n$ be coordinates on a subset of $\mathcal Q$. We can express $\overline E_i$ in terms of the coordinate frame $\lbrace {\partial}/{\partial q^i}\rbrace_{i=1}^n$, as
$$\overline E_i = \overline E_i^j \frac{\partial}{\partial q^j},$$where each $\overline E_i^j \in C^\infty(\mathcal Q)$ is a coordinate function. Letting $\overline E_i$ act on the coordinate function $q^j$, we get
$$ \overline E_i q^j = \overline E_i^k \frac{\partial q^j}{\partial q^k} = \overline E_i^j, $$which tells us how to compute the coefficients of $\overline E_i$ – just feed it the coordinate functions. Returning to the definition of $\dot\gamma(s)$, we get:
$$\dot\gamma(s) = \eta^i(s) \overline E_i^j(\gamma(s)) \frac{\partial}{\partial q^j}\Big\vert_{\gamma(s)}.$$Letting the left-hand side act on the coordinate function $q^k$, we get
$$ \begin{align*} \big(\dot\gamma(s)\big) (q^k) &= d \gamma_s \left(\frac{\partial}{\partial t}\Big\rvert_{t=s}\right)(q^k)=\frac{d}{dt}(q^k\circ \gamma)(t)\Big|_{t=s}, \end{align*} $$whereas letting the right-hand side eat $q^k$, we get
$$ \begin{align*} \eta^i(s) \overline E_i^j(\gamma(s)) \frac{\partial q^k}{\partial q^j}\Big\vert_{\gamma(s)} = \eta^i(s) \overline E_i^k(\gamma(s)). \end{align*} $$Here, $i$ sums from $1$ to $r$ whereas $j$ sums (and $k$ ranges) from $1$ to $n$. Also, observe that $r\geq n$ due to transitivity of the group action. Denoting $q^j \circ \gamma$ as $\gamma^j$, we can finally put everything together:
$$ \frac{d \gamma^k}{dt}(s) = \eta^i(s)\, \overline E_{i,\gamma(s)} (q^k) = \eta^i(s)\, \overline E_i^k(\gamma(s)). $$This expression can be found in Poincaré’s paper (linked at the beginning of the post). Similarly, we have
$$ \begin{align*} \delta\hspace{-1.5pt}\gamma(s) =\delta\hspace{-1.5pt}\gamma^i(s)\frac{\partial}{\partial q^i}\Big\vert_{\gamma(s)} = \xi^j(s) \overline E_j^i(\gamma(s)) \frac{\partial}{\partial q^i}\Big\vert_{\gamma(s)}. \end{align*} $$
B. Computation using Matrix Algebra
We can choose $\Gamma$ to be
$$\Gamma(t, \lambda) = \exp(\lambda \xi^i(t) E_i)\cdot\gamma(t)$$since it satisfies the conditions for being a representative of $\delta\hspace{-1.5pt}\gamma$. Actually, we can do something more; we can express $\gamma(t)$ as $g(t) \cdot p$, where $p\in \mathcal Q$ is some distinguished point (that may be called the “origin” of $\mathcal Q$) and $g(t)$ is a curve in $G$. This means that
$$\Gamma(t, \lambda) = \exp(\lambda \xi^i(t) E_i)g(t)\cdot p.$$The curve $g$ that generates $\gamma$ in this manner is not unique even after we have fixed some point $p$. To see why, one can consider (the rather silly example of) $G=\mathbb R^3 \times \mathbb R^3$ and $\mathcal Q = \mathbb R^3$. Nevertheless, since we are going to probe all possible variations of $\gamma$, we only need to worry about the “surjectivity” of this formulation, rather than its “injectivity”. Surjectivity follows from our assumption of the group action being transitive.
If $G$ is a matrix Lie group, the expression
\[ \begin{align*} \frac{d}{d t}&\left(\left( \frac{\partial}{\partial \lambda} \Gamma(t, \lambda)\Big\rvert_{\lambda=0}\right)g(t)^{-1}\right) \end{align*} \]
can be viewed from a purely matrix-algebraic lens. This is what Marsden and Ratiu do in their book, so I leave the remaining details to them.
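As a numerical sanity check of the compatibility condition in the form $\dot\xi = \delta\hspace{-1.5pt}\eta + [\eta, \xi]$, here is a minimal sketch (my own, not Marsden and Ratiu's argument). It assumes $G = SO(3)$, works with the group-valued curve $\exp(\lambda\xi(t))\,g(t)$ rather than its action on $p$, and uses the right-invariant velocity $\eta = \dot\Gamma\,\Gamma^{-1}$ with the matrix commutator as the bracket. The curves `xi_vec` and `g` are arbitrary made-up choices, and all derivatives except $\dot\xi$ are taken by central differences.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix with hat(w) @ q == np.cross(w, q)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Matrix exponential of hat(w) via Rodrigues' formula."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-12:
        return np.eye(3) + K  # first-order approximation near the identity
    return (np.eye(3) + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

# Hypothetical variation coefficients xi^i(t), their exact t-derivative, and a curve g(t).
xi_vec = lambda t: np.array([np.sin(t), t**2, np.cos(2.0 * t)])
xi_dot_vec = lambda t: np.array([np.cos(t), 2.0 * t, -2.0 * np.sin(2.0 * t)])
g = lambda t: exp_so3(np.array([t, 0.5 * t**2, -0.3 * t]))

def Gamma(t, lam):
    """Group-level version of Gamma(t, lam) = exp(lam * xi(t)) g(t)."""
    return exp_so3(lam * xi_vec(t)) @ g(t)

def eta(t, lam, h=1e-4):
    """Right-invariant velocity (d/dt Gamma) Gamma^{-1}, by central differences."""
    dG = (Gamma(t + h, lam) - Gamma(t - h, lam)) / (2.0 * h)
    return dG @ Gamma(t, lam).T  # the inverse of a rotation is its transpose

s, k = 0.7, 1e-4
delta_eta = (eta(s, k) - eta(s, -k)) / (2.0 * k)   # lambda-derivative of eta at 0
eta0, xi0 = eta(s, 0.0), hat(xi_vec(s))
lhs = hat(xi_dot_vec(s))                            # xi-dot
rhs = delta_eta + (eta0 @ xi0 - xi0 @ eta0)         # delta-eta + [eta, xi]
residual = np.max(np.abs(lhs - rhs))
print(residual)
```

The residual should be limited only by finite-difference error. Flipping the sign of the commutator makes it visibly nonzero, which is a quick way to convince oneself of the sign convention for a left action.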
C. Proof of $[{\overline X, \overline Y}] = -\overline{[ X, Y]}$
Let $\bar L^{(g)} (q) \coloneq \Phi(g,q) = \Phi^{(q)}(g)= g\cdot q$. We will reuse this notation for left and right multiplication on $G$, so that $L^{(g)}(h)=R^{(h)}(g) = gh$. The following equalities hold:
$$ \begin{align*} g\cdot(h\cdot q)&=gh\cdot q\\ \quad\bar L^{(g)}\circ \bar L^{(h)} &= \bar L^{(gh)}\\ \bar L^{(g)} \circ \Phi^{(q)} &= \Phi^{(q)}\circ L^{(g)}\ \ \\ \Phi^{(h\cdot q)} &= \Phi^{(q)}\circ R^{(h)} \end{align*} $$
As a mnemonic, we can think of $\Phi^{(q)}$ as “right-multiplication by $q$”, so that it “commutes” with $\bar L^{(g)}$. Next, we need to demonstrate the fact that $\overline{\mathrm{Ad}_g Y}={\bar L^{(g^{-1})}}^\ast\hspace{1pt} \overline Y$. The vector field on the left has, at $q\in\mathcal Q$, the value
$$ \begin{align*} d\Phi^{(q)}_e (\mathrm{Ad}_g Y) &= d\Phi^{(q)}_e dR^{(g^{-1})}_{g} dL^{(g)}_e Y\\ &= d\Phi^{(g^{-1}\cdot \hspace{1pt}q)}_{g} dL^{(g)}_e Y. \end{align*} $$The corresponding vector on the right is
$$ ({\bar L^{(g^{-1})}}^\ast\hspace{1pt} \overline Y)_q = (\bar L^{(g)}_\ast\hspace{1pt} \overline Y)_q = d \bar L^{(g)}_{g^{-1}\cdot \hspace{1pt}q}\, d\Phi_{e}^{(g^{-1} \cdot\hspace{1pt} q)} Y, $$and the “commutativity” of $\bar L^{(g)}$ and $\Phi^{(q)}$ completes the argument.
Next, choose $g=\exp(tX)$ and differentiate with respect to $t$ at $t=0$ (i.e., evaluate the pushforward of $\frac{\partial}{\partial t}$):
$$ \begin{align*} \overline{\frac{d}{dt}\mathrm{Ad}_{\exp(t X)} Y\Big\rvert_{t=0}} &= \frac{d}{dt} {\bar L^{(\exp(-tX))}}^\ast \overline Y\Big\rvert_{t=0}\\ \overline{\mathrm{ad}_X Y} &= -\frac{d}{dt} {\bar L^{(\exp(tX))}}^\ast \overline Y\Big\rvert_{t=0}\\ \overline{[X,Y]} &= -[\overline{X}, \overline{Y}]. \end{align*} $$The last line follows from the fact that $\bar L^{(\exp(tX))}$ is the flow of $\overline{X}$, and that the Lie bracket of vector fields is (by definition) the Lie derivative.
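The sign can also be checked numerically in a simple case. The following is a minimal sketch (my own illustration), assuming $G = SO(3)$ acting linearly on $\mathbb R^3$, so that $\overline X_q = Xq$ for a skew-symmetric matrix $X$. The helper names `jacobian_fd` and `field_bracket` are hypothetical; the bracket of vector fields on $\mathbb R^3$ is computed from the coordinate formula $[U, V] = (DV)\,U - (DU)\,V$.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix with hat(w) @ q == np.cross(w, q)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def jacobian_fd(V, q, h=1e-6):
    """Central-difference Jacobian of a vector field V: R^3 -> R^3 at q."""
    J = np.zeros((3, 3))
    for j in range(3):
        e = np.zeros(3)
        e[j] = h
        J[:, j] = (V(q + e) - V(q - e)) / (2.0 * h)
    return J

def field_bracket(U, V, q):
    """Lie bracket of vector fields on R^3: [U, V](q) = DV(q) U(q) - DU(q) V(q)."""
    return jacobian_fd(V, q) @ U(q) - jacobian_fd(U, q) @ V(q)

X = hat(np.array([1.0, 0.0, 0.0]))
Y = hat(np.array([0.0, 1.0, 0.0]))
X_bar = lambda q: X @ q   # infinitesimal generator of X for the linear action
Y_bar = lambda q: Y @ q

q = np.array([0.2, -0.7, 1.1])
lhs = field_bracket(X_bar, Y_bar, q)   # [X-bar, Y-bar] at q
rhs = -(X @ Y - Y @ X) @ q             # minus the generator of [X, Y] at q
print(np.max(np.abs(lhs - rhs)))
```

For linear vector fields the finite-difference Jacobian is exact up to rounding, so the two sides should agree to many digits.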
[^1]: I’m grateful to my colleague Jöel Bensoam for introducing me to this paper (and to variational calculus)!

[^2]: $\Gamma$ is in fact a representative of the equivalence class defined by $\delta\hspace{-1.5pt}\gamma$. Moreover, note that since the variation lives in a tangent space of $\mathcal P \mathcal Q$, it should act on an object of the type $f:\mathcal P\mathcal Q \rightarrow \mathbb R$ (called a functional: a function that eats a path and spits out a number). This action is defined as follows: $$\delta\hspace{-1.5pt}\gamma\hspace{1pt}(f) = \frac{\partial}{\partial\lambda} f\left(\Gamma(\ \cdot\ , \lambda)\right)\Big\rvert_{\lambda=0} = \mathbf D f(\gamma) \cdot \delta\hspace{-1.5pt}\gamma,$$ wherein the last piece of notation is explained in Marsden and Ratiu’s books, as well as in my other post.

[^3]: An example showing the importance of constraints is as follows. We can use variational calculus to show that the shortest path between two points $\mathbf p,\mathbf p^\prime \in \mathbb R^n$ is a straight line. Since perturbations of the straight line $\gamma(t)= \mathbf p + t(\mathbf p^\prime-\mathbf p)$ that keep $\gamma(0)$ and $\gamma(1)$ fixed can only increase the length, we conclude that the straight line is indeed the shortest path. However, if we drop the constraint on $\gamma(0)$ and $\gamma(1)$, then there exists a perturbation that moves the points closer. Basically, we need to impose the constraints $\delta\hspace{-1.5pt}\gamma(0) = 0$ and $\delta\hspace{-1.5pt}\gamma(1)=0$ to properly formulate the geometric problem we have in mind.

[^4]: I recommend that the reader go to Chapter 1 of Mechanics & Symmetry and identify the point in the proof of the Euler-Poincaré equations where the relationship between $\delta\hspace{-1.5pt}\gamma$ and $\delta\hspace{-1.5pt}\eta^i$ is used. There, $\gamma(t)$ is described by $q^i(t)$ and $\dot\gamma(t)$ by $\dot q^i(t)$.

[^5]: The formula for the covariant derivative of a vector field along $\gamma$ is given in Lee IRM. Instead of writing $\nabla_{\dot\gamma(t)}\overline E_{i,\gamma(t)}$, we should properly extend either vector field to an open set of $\mathcal Q$ before evaluating the derivative. We will ignore this technicality to economize on our notation.

[^6]: Note that the (translation-invariant) integration measure on $[0,1]$ is unique up to scaling; choosing a scaling is like choosing how fast (and in which direction!) time flows, and does not influence the minimizer. Secondly, the definition of $\mathscr A$ relies on the fact that we can canonically lift $\gamma$ to define $\tilde\gamma$ (such a lifting doesn’t come from a choice of connection on $T\mathcal Q$).