Algorithms for Walking, Running, Swimming, Flying, and Manipulation
© Russ Tedrake, 2024
How to cite these notes, use annotations, and give feedback.
Note: These are working notes used for a course being taught at MIT. They will be updated throughout the Spring 2024 semester. Lecture videos are available on YouTube.
Previous Chapter | Table of contents | Next Chapter |
Our goals for this chapter are modest: we'd like to understand the dynamics of a pendulum.
Why a pendulum? In part, because the dynamics of a majority of our multi-link robotic manipulators are simply the dynamics of a large number of coupled pendula. Also, the dynamics of a single pendulum are rich enough to introduce most of the concepts from nonlinear dynamics that we will use in this text, but tractable enough for us to (mostly) understand in the next few pages.
The Lagrangian derivation of the equations of motion (as described in the appendix) of the simple pendulum yields: \begin{equation*} m l^2 \ddot\theta(t) + mgl\sin{\theta(t)} = Q. \end{equation*} We'll consider the case where the generalized force, $Q$, models a damping torque (from friction) plus a control torque input, $u(t)$: $$Q = -b\dot\theta(t) + u(t).$$
Let us first consider the dynamics of the pendulum if it is driven in a particular simple way: a torque which does not vary with time: \begin{equation} ml^2 \ddot\theta + b\dot\theta + mgl \sin\theta = u_0. \end{equation}
You can experiment with this system in the interactive examples accompanying these notes.
These are relatively simple differential equations, so if I give you $\theta(0)$ and $\dot\theta(0)$, then you should be able to integrate them to obtain $\theta(t)$... right? Although it is possible, integrating even the simplest case ($b = u = 0$) involves elliptic integrals of the first kind; there is relatively little intuition to be gained here.
This is in stark contrast to the case of linear systems, where much of our understanding comes from being able to explicitly integrate the equations. For instance, for a simple linear system we have $$\dot{q} = a q \quad \rightarrow \quad q(t) = q(0) e^{at},$$ and we can immediately understand that the long-term behavior of the system is a (stable) decaying exponential if $a<0$, an (unstable) growing exponential if $a>0$, and that the system does nothing if $a=0$. Here we have what is certainly one of the simplest nonlinear systems we can imagine, and we can't even solve it?
All is not lost. If what we care about is the long-term behavior of the
system, then there are a number of techniques we can apply. In this
chapter, we will start by investigating graphical solution methods. These
methods are described beautifully in Steve Strogatz's book, Nonlinear Dynamics and Chaos.
Let's start by studying a special case -- intuitively when $b\dot\theta \gg ml^2\ddot\theta$ -- which via dimensional analysis (using the natural frequency $\sqrt{\frac{g}{l}}$ to match units) occurs when $b \sqrt\frac{l}{g} \gg ml^2$. This is the case of heavy damping, for instance if the pendulum was moving in molasses. In this case, the damping term dominates the acceleration term, and we have: $$ml^2 \ddot\theta + b\dot\theta \approx b\dot\theta = u_0 - mgl\sin\theta.$$ In other words, in the case of heavy damping, the system looks approximately first order. This is a general property of heavily-damped systems, such as fluids at very low Reynolds number.
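To sanity-check this approximation numerically, here is a minimal sketch in plain Python/NumPy (the parameter values are illustrative choices of mine, not from the text) comparing the full second-order dynamics against the first-order overdamped model, starting from rest:

```python
import numpy as np

# Illustrative parameters chosen so that b*sqrt(l/g) >> m*l^2 (heavy damping).
m, g, l, b, u0 = 1.0, 9.81, 1.0, 50.0, 0.0

def full_rhs(x):
    """Second-order dynamics written in first-order form, x = [theta, thetadot]."""
    return np.array([x[1], (u0 - m * g * l * np.sin(x[0]) - b * x[1]) / (m * l**2)])

def simulate_full(theta0, dt=1e-3, T=5.0):
    """RK4 integration of the full dynamics, released from rest."""
    x = np.array([theta0, 0.0])
    for _ in range(int(T / dt)):
        k1 = full_rhs(x)
        k2 = full_rhs(x + 0.5 * dt * k1)
        k3 = full_rhs(x + 0.5 * dt * k2)
        k4 = full_rhs(x + dt * k3)
        x = x + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x[0]

def simulate_overdamped(theta0, dt=1e-3, T=5.0):
    """Euler integration of the first-order model b*thetadot = u0 - m*g*l*sin(theta)."""
    th = theta0
    for _ in range(int(T / dt)):
        th += dt * (u0 - m * g * l * np.sin(th)) / b
    return th

# After the fast transient decays, the two models agree closely.
theta_full = simulate_full(1.0)
theta_approx = simulate_overdamped(1.0)
```

The fast mode (at rate roughly $b/ml^2$) dies out almost immediately, after which the full system tracks the first-order approximation.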
I'd like to ignore one detail for a moment: the fact that $\theta$ wraps around on itself every $2\pi$. To be clear, let's write the system without the wrap-around as: \begin{equation}b\dot{x} = u_0 - mgl\sin{x}.\label{eq:overdamped_pend_ct}\end{equation} Our goal is to understand the long-term behavior of this system: to find $x(\infty)$ given $x(0)$. Let's start by plotting $\dot{x}$ vs $x$ for the case when $u_0=0$:
The first thing to notice is that the system has a number of fixed points or steady states, which occur whenever $\dot{x} = 0$. In this simple example, the zero-crossings are $x^* = \{..., -\pi, 0, \pi, 2\pi, ...\}$. When the system is in one of these states, it will never leave that state. If the initial conditions are at a fixed point, we know that $x(\infty)$ will be at the same fixed point.
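As a concrete companion to the graphical analysis, here is a minimal sketch (plain Python/NumPy, with illustrative parameter values of my choosing) that locates the fixed points of the overdamped model as zero-crossings of $\dot{x}$:

```python
import numpy as np

# Illustrative parameters for the overdamped pendulum b*xdot = u0 - m*g*l*sin(x).
m, g, l, b, u0 = 1.0, 9.81, 1.0, 10.0, 0.0

def xdot(x):
    """Right-hand side of the first-order overdamped dynamics."""
    return (u0 - m * g * l * np.sin(x)) / b

def find_fixed_points(lo=-2 * np.pi, hi=2 * np.pi, n=2000):
    """Locate fixed points by sign changes of xdot on a grid, refined by bisection."""
    xs = np.linspace(lo, hi, n)
    ys = xdot(xs)
    roots = []
    for i in range(n - 1):
        if ys[i] == 0.0:
            roots.append(xs[i])
        elif ys[i] * ys[i + 1] < 0:
            a, c = xs[i], xs[i + 1]
            for _ in range(60):  # bisection to high precision
                mid = 0.5 * (a + c)
                if xdot(a) * xdot(mid) <= 0:
                    c = mid
                else:
                    a = mid
            roots.append(0.5 * (a + c))
    return np.array(roots)

# For u0 = 0 the fixed points are the multiples of pi, as in the text.
fixed_points = find_fixed_points()
```

With $u_0 = 0$ this recovers $x^* = \{-\pi, 0, \pi, \ldots\}$; changing $u_0$ shifts the zero-crossings, which previews the bifurcation discussion below.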
Next let's investigate the behavior of the system in the local vicinity of the fixed points. Examining the fixed point at $x^* = \pi$, if the system starts just to the right of the fixed point, then $\dot{x}$ is positive, so the system will move away from the fixed point. If it starts to the left, then $\dot{x}$ is negative, and the system will move away in the opposite direction. We'll call fixed-points which have this property unstable. If we look at the fixed point at $x^* = 0$, then the story is different: trajectories starting to the right or to the left will move back towards the fixed point. We will call this fixed point locally stable. More specifically, we'll distinguish between multiple types of stability (where $\epsilon$ is used to denote an arbitrary small scalar quantity):
An initial condition near a fixed point that is stable in the sense of
Lyapunov may never reach the fixed point (but it won't diverge), near an
asymptotically stable fixed point will reach the fixed point as $t
\rightarrow \infty$, and near an exponentially stable fixed point will
reach the fixed point with a bounded rate. An exponentially stable fixed
point is also an asymptotically stable fixed point, but the converse is
not true. Attractivity does not actually imply Lyapunov stability; we will see the promised counterexample of this later in the chapter.
Our graph of $\dot{x}$ vs. $x$ can be used to convince ourselves of i.s.L. and asymptotic stability by visually inspecting $\dot{x}$ in the vicinity of a fixed point. Even exponential stability can be inferred if we can find a negatively-sloped line passing through the equilibrium point which separates the graph of $f(x)$ from the horizontal axis. The existence of this separating line implies that the nonlinear system will converge at least as fast as the linear system represented by the straight line. I will graphically illustrate unstable fixed points with open circles and stable fixed points (i.s.L.) with filled circles.
Next, we need to consider what happens to initial conditions which begin farther from the fixed points. If we think of the dynamics of the system as a flow on the $x$-axis, then we know that anytime $\dot{x} > 0$, the flow is moving to the right, and $\dot{x} < 0$, the flow is moving to the left. If we further annotate our graph with arrows indicating the direction of the flow, then the entire (long-term) system behavior becomes clear:
For instance, we can see that any initial condition $x(0) \in (-\pi,\pi)$ will result in $\lim_{t\rightarrow \infty} x(t) = 0$. This region is called the region of attraction (or basin of attraction) of the fixed point at $x^* = 0$. Basins of attraction of two fixed points cannot overlap, and the manifold separating two basins of attraction is called the separatrix. Here the unstable fixed points, at $x^* = \{..., -\pi, \pi, 3\pi, ...\}$ form the separatrix between the basins of attraction of the stable fixed points. In general, regions of attraction are always open, connected, invariant sets, with boundaries formed by trajectories.
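We can confirm this basin-of-attraction claim by simulation. A minimal sketch (plain Python/NumPy, illustrative parameters of my choosing) flows a handful of initial conditions forward:

```python
import numpy as np

# Illustrative parameters; overdamped model with u0 = 0.
m, g, l, b = 1.0, 9.81, 1.0, 10.0

def flow(x0, dt=1e-3, T=30.0):
    """Euler-integrate b*xdot = -m*g*l*sin(x) and return x(T)."""
    x = x0
    for _ in range(int(T / dt)):
        x += dt * (-m * g * l * np.sin(x)) / b
    return x

# Initial conditions strictly inside (-pi, pi) flow to the stable fixed
# point at 0; one starting just past pi lies in the basin of the fixed
# point at 2*pi instead.
endpoints = [flow(x0) for x0 in (-3.0, -1.0, 0.5, 3.0, np.pi + 0.1)]
```

Note that the initial condition at $3.0$, just inside the separatrix at $\pi$, lingers near the unstable fixed point before eventually flowing to $0$.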
As these plots demonstrate, the behavior of a first-order one dimensional system on a line is relatively constrained. The system will either monotonically approach a fixed-point or monotonically move toward $\pm \infty$. There are no other possibilities. Oscillations, for example, are impossible. Graphical analysis is a fantastic analysis tool for many first-order nonlinear systems (not just pendula); as illustrated by the following example:
These equations are not arbitrary - they are actually a model for one
of the simplest neural networks, and one of the simplest models of
persistent memory.
Experiment with it for yourself:
I've also provided the equations of an LSTM unit that you can experiment with. See if you can figure it out!
Architectures like the Transformer for sequence modeling can also be understood through the lens of dynamical systems theory (as autoregressive models), but we will defer that discussion until later in the notes.
One last piece of terminology. In the neuron example, and in many dynamical systems, the dynamics were parameterized; in this case by a single parameter, $w$. As we varied $w$, the fixed points of the system moved around. In fact, if we increase $w$ through $w=1$, something dramatic happens - the system goes from having one fixed point to having three fixed points. This is called a bifurcation. This particular bifurcation is called a pitchfork bifurcation. We often draw bifurcation diagrams which plot the fixed points of the system as a function of the parameters, with solid lines indicating stable fixed points and dashed lines indicating unstable fixed points, as seen in the figure:
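Since the neuron equations themselves are not reproduced here, let me use a standard single-neuron ("autapse") model with exactly this pitchfork structure, $\dot{x} = -x + \tanh(wx)$, which I am assuming matches the example in the text. A minimal sketch counts the fixed points on either side of $w = 1$:

```python
import numpy as np

def autapse_fixed_points(w):
    """Count fixed points of xdot = -x + tanh(w*x) via sign changes on a grid.

    (A standard single-neuron model assumed to match the text's example.)
    """
    grid = np.linspace(-3, 3, 6001)
    y = -grid + np.tanh(w * grid)
    count = 0
    for i in range(len(grid) - 1):
        if y[i] == 0.0:
            count += 1
        elif y[i] * y[i + 1] < 0:
            count += 1
    return count

# For w < 1 the only fixed point is x* = 0; increasing w through 1
# produces three fixed points -- the pitchfork bifurcation.
```

The slope of $\tanh(wx)$ at the origin is $w$, so the origin loses stability exactly when $w$ crosses $1$, and the two new stable fixed points branch off symmetrically.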
Our pendulum equations also have a (saddle-node) bifurcation when we change the constant torque input, $u_0$. Finally, let's return to the original equations in $\theta$, instead of in $x$. Only one point to make: because of the wrap-around, this system will appear to have oscillations. In fact, the graphical analysis reveals that the pendulum will turn forever whenever $|u_0| > mgl$, but now you understand that this is not an oscillation, but an instability with $\theta \rightarrow \pm \infty$.
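The "turning forever" claim is easy to verify numerically for the overdamped model. A minimal sketch (illustrative parameters of my choosing, with $u_0$ chosen larger than $mgl$):

```python
import numpy as np

# Illustrative parameters; constant torque exceeding the max gravity torque.
m, g, l, b = 1.0, 9.81, 1.0, 10.0
u0 = 1.5 * m * g * l  # > m*g*l, so xdot > 0 everywhere

x, dt = 0.0, 1e-3
for _ in range(int(20.0 / dt)):  # Euler-integrate b*xdot = u0 - m*g*l*sin(x)
    x += dt * (u0 - m * g * l * np.sin(x)) / b

# Since xdot >= (u0 - m*g*l)/b > 0 for all x, there are no fixed points
# left and x grows without bound: the pendulum spins forever.
```

In the wrapped $\theta$ coordinate this looks like a periodic orbit, but in the unwrapped coordinate it is a divergence to $+\infty$, exactly as the text describes.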
You can find another beautiful example of these concepts (fixed points, basins of attraction, bifurcations) from recurrent neural networks in the exercise below on Hopfield networks.
Consider again the system $$ml^2 \ddot\theta = u_0 - mgl \sin\theta - b\dot\theta,$$ this time with $b = 0$. This time the system dynamics are truly second-order. We can always think of any second-order system, $$\ddot{\theta} = f(\theta,\dot\theta),$$ as (coupled) first-order system with twice as many variables. This system is equivalent to the two-dimensional first-order system \begin{align*} \dot x_1 =& x_2 \\ \dot x_2 =& f(x_1,x_2), \end{align*} where $x_1 = \theta$ and $x_2 = \dot \theta$. Therefore, the graphical depiction of this system is not a line, but a vector field where the vectors $[\dot x_1, \dot x_2]^T$ are plotted over the domain $(x_1,x_2)$. This vector field is known as the phase portrait of the system.
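The second-order-to-first-order conversion is worth seeing in code. Here is a minimal sketch (plain Python/NumPy, illustrative parameters, RK4 as my choice of integrator) of the undamped, unforced pendulum in state-space form $x = [\theta, \dot\theta]$:

```python
import numpy as np

# Illustrative parameters; u0 = b = 0 in this section.
m, g, l = 1.0, 9.81, 1.0

def f(x):
    """First-order form of m*l^2*thetaddot = -m*g*l*sin(theta), x = [theta, thetadot]."""
    return np.array([x[1], -(g / l) * np.sin(x[0])])

def rk4_step(x, dt):
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(x0, dt=1e-3, T=10.0):
    """Return the trajectory as an array of [theta, thetadot] rows."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(int(T / dt)):
        x = rk4_step(x, dt)
        traj.append(x.copy())
    return np.array(traj)

# A small release angle produces a bounded oscillation around (0, 0);
# plotting traj[:, 0] vs traj[:, 1] traces a closed orbit in the phase portrait.
traj = simulate([0.01, 0.0])
```

Each row of `traj` is a point $(x_1, x_2)$ in the phase portrait; the vector field evaluated on a grid of such points is exactly what the figures in this section depict.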
In this section we restrict ourselves to the simplest case when $u_0 = 0$. Let's sketch the phase portrait, starting along the $\theta$-axis (where $\dot\theta = 0$). The horizontal component of the vector field here is zero, and the vertical component is $-\frac{g}{l}\sin\theta.$ As expected, we have fixed points at $\theta^* = 0, \pm\pi, \pm 2\pi, \ldots$ Now sketch the rest of the vector field. Can you tell me which fixed points are stable? Some of them are stable i.s.L., none are asymptotically stable.
You might wonder how we drew the black contour lines in the figure above. We could have obtained them by simulating the system numerically, but those lines can be easily obtained in closed-form. Directly integrating the equations of motion is difficult, but at least for the case when $u_0 = 0$, we have some additional physical insight for this problem that we can take advantage of. The kinetic energy, $T$, and potential energy, $U$, of the pendulum are given by $$T = \frac{1}{2}I\dot\theta^2, \quad U = -mgl\cos(\theta),$$ where $I=ml^2$ and the total energy is $E(\theta,\dot\theta) = T(\dot\theta)+U(\theta)$. The undamped pendulum is a conservative system: total energy is a constant over system trajectories. Using conservation of energy, we have: \begin{gather*} E(\theta(t),\dot\theta(t)) = E(\theta(0),\dot\theta(0)) = E_0 \\ \frac{1}{2} I \dot\theta^2(t) - mgl\cos(\theta(t)) = E_0 \\ \dot\theta(t) = \pm \sqrt{\frac{2}{I}\left[E_0 + mgl\cos\left(\theta(t)\right)\right]} \end{gather*} Using this, if you tell me $\theta$ I can determine one of two possible values for $\dot\theta$, and the solution has all of the richness of the black contour lines from the plot. This equation has a real solution when $\cos(\theta) > \cos(\theta_{max})$, where $$\theta_{max} = \begin{cases} \cos^{-1}\left( -\frac{E_0}{mgl} \right), & E_0 < mgl \\ \pi, & \text{otherwise}. \end{cases}$$ Of course this is just the intuitive notion that the pendulum will not swing above the height where the total energy equals the potential energy. As an exercise, you can verify that differentiating this equation with respect to time indeed results in the equations of motion.
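A short numerical check of conservation of energy (illustrative parameters, RK4 as my choice of integrator -- the small residual drift below is purely numerical):

```python
import numpy as np

# Illustrative parameters; undamped, unforced pendulum.
m, g, l = 1.0, 9.81, 1.0
I = m * l * l

def energy(theta, thetadot):
    """Total energy E = T + U = (1/2)*I*thetadot^2 - m*g*l*cos(theta)."""
    return 0.5 * I * thetadot**2 - m * g * l * np.cos(theta)

def f(x):
    return np.array([x[1], -(g / l) * np.sin(x[0])])

def rk4_step(x, dt):
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

x = np.array([2.0, 0.0])  # released from rest at theta = 2 rad
E0 = energy(*x)
drift = 0.0
for _ in range(20000):  # 20 seconds at dt = 1e-3
    x = rk4_step(x, 1e-3)
    drift = max(drift, abs(energy(*x) - E0))
# Along the trajectory, thetadot always satisfies
# thetadot = +/- sqrt((2/I)*(E0 + m*g*l*cos(theta))),
# which is exactly the contour-line equation derived in the text.
```

The trajectory stays on a single level set of $E$, which is why the black contour lines can be drawn in closed form without ever integrating the equations of motion.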
The particular orbit defined by $E = mgl$ is special -- this is the orbit that visits the (unstable) equilibrium at the upright. This is the homoclinic orbit of the pendulum.
Now what happens if we add a constant torque? If you visualize the bifurcation diagram, you'll see that as $u_0$ increases, the pairs of fixed points (one stable, one unstable) move toward each other, meeting at $\theta = \frac{\pi}{2}, \frac{5\pi}{2}, ...$ when $u_0 = mgl$, and disappearing entirely for larger torques.
Before we continue, let me now give you the promised example of a system that is not stable i.s.L., but which attracts all trajectories as time goes to infinity. We can accomplish this with a very pendulum-like example (written here in polar coordinates):
Consider the system \begin{align*} \dot{r} &= r(1-r), \\ \dot\theta &= \sin^2 \left( \frac{\theta}{2} \right).\end{align*} This system has two equilibrium points, one at $r^*=0,\theta^*=0$, and the other at $r^* = 1,\theta^*=0$. The fixed point at zero is clearly unstable. The fixed point with $r^*=1$ attracts all other trajectories, but it is not stable by any of our definitions.
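A minimal simulation (my own Euler-integration sketch) makes the "attractive but not stable" behavior vivid: a trajectory starting arbitrarily close to the equilibrium at $(r, \theta) = (1, 0)$ travels all the way around the circle, passing through the antipodal point, before returning:

```python
import numpy as np

def simulate(theta0, r0=1.0, dt=0.01, T=400.0):
    """Euler-integrate rdot = r*(1 - r), thetadot = sin^2(theta/2).

    Starting exactly on the circle r = 1 keeps r = 1 for all time.
    Returns the final theta and the maximum Cartesian distance from the
    equilibrium at (x, y) = (1, 0) along the way.
    """
    r, th = r0, theta0
    max_dist = 0.0
    for _ in range(int(T / dt)):
        r += dt * r * (1 - r)
        th += dt * np.sin(th / 2) ** 2
        d = np.hypot(r * np.cos(th) - 1.0, r * np.sin(th))
        max_dist = max(max_dist, d)
    return th, max_dist

# Start a distance ~0.1 from the equilibrium; the excursion reaches
# distance ~2 before theta converges to 2*pi (the same point on the circle).
theta_final, max_dist = simulate(0.1)
```

No matter how small the initial perturbation in $\theta$, the excursion reaches the far side of the circle, so no $\delta$ can be found for the $\epsilon$ in the i.s.L. definition, even though every trajectory eventually returns.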
Although the equations are simpler in polar coordinates, it's useful to draw the vector field in Cartesian coordinates:
Take a minute to examine the vector field of this to make sure you understand. This example may seem contrived, but we will soon see this same phenomenon happen in practice when we introduce the energy-shaping swing-up controller for the pendulum in the next chapter.
Now let's add damping back. You can still add torque to move the fixed points (in the same way).
With damping, the downright fixed point of the pendulum now becomes asymptotically stable (in addition to stable i.s.L.). Is it also exponentially stable? How can we tell? One technique is to linearize the system at the fixed point. A smooth, time-invariant, nonlinear system that is locally exponentially stable must have a stable linearization; we'll discuss linearization more in the next chapter.
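We can peek ahead at that linearization technique with a minimal sketch (illustrative parameters of my choosing): linearizing $ml^2\ddot\theta + b\dot\theta + mgl\sin\theta = 0$ about $(\theta, \dot\theta) = (0, 0)$ using $\sin\theta \approx \theta$ and checking the eigenvalues of the resulting linear system.

```python
import numpy as np

# Illustrative parameters with nonzero damping.
m, g, l, b = 1.0, 9.81, 1.0, 0.5

# Linearization about the downright fixed point, with state x = [theta, thetadot]:
#   thetadot_dot = -(g/l)*theta - (b/(m*l^2))*thetadot,  i.e.  xdot = A x.
A = np.array([[0.0, 1.0],
              [-g / l, -b / (m * l * l)]])
eigvals = np.linalg.eigvals(A)
# Both eigenvalues have strictly negative real part, so the downright
# equilibrium of the damped pendulum is locally exponentially stable.
```

With $b = 0$ the eigenvalues would sit on the imaginary axis, consistent with the undamped fixed points being stable i.s.L. but not asymptotically stable.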
Here's a thought exercise. If $u$ is no longer a constant, but a function $\pi(\theta,\dot{\theta})$, then how would you choose $\pi$ to stabilize the vertical position. Feedback cancellation is the trivial solution, for example: $$u = \pi(\theta,\dot{\theta}) = 2mgl\sin\theta.$$ But these plots we've been making tell a different story. How would you shape the natural dynamics - at each point pick a $u$ from the stack of phase plots - to stabilize the vertical fixed point with minimal torque effort? This is exactly the way that I would like you to think about control system design. And in the dynamic programming chapter, we'll start to develop our computational solution techniques.
The simple pendulum is fully actuated. Given enough torque, we can produce any number of control solutions to stabilize the originally unstable fixed point at the top (such as designing a feedback controller to effectively invert gravity).
The problem begins to get interesting (a.k.a. becomes underactuated) if we impose a torque-limit constraint, $|u|\le u_{max}$. Looking at the phase portraits again, you can now visualize the control problem. Via feedback, you are allowed to change the direction of the vector field at each point, but only by a fixed amount. Clearly, if the maximum torque is small (smaller than $mgl$), then there are some states which cannot be driven directly to the goal, but must pump up energy to reach the goal. Furthermore, if the torque-limit is too severe and the system has damping, then it may be impossible to swing up to the top. The existence of a solution, and number of pumps required to reach the top, is a non-trivial function of the initial conditions and the torque-limits.
Although this system is very simple, obtaining a solution with general-purpose algorithms requires much of the same reasoning necessary for controlling much more complex underactuated systems; this problem will be a work-horse for us as we introduce new algorithms throughout this book.
In the specific case of the pendulum, we can give a satisfying hand-designed nonlinear controller based on our intuition of pumping energy into the system. We have already observed that the total energy of the pendulum is given by $$E = \frac{1}{2} m l^2 \dot\theta^2 - mgl\cos\theta.$$ To understand how to control the energy, let us examine how that energy changes with time: \begin{align*} \dot{E} =& ml^2\dot\theta \ddot\theta + \dot\theta mgl\sin\theta \\ =& \dot\theta \left[ u - mgl\sin\theta \right] + \dot\theta mgl\sin\theta \\ =& u\dot\theta. \end{align*} In words, adding energy to the system is simple - apply torque in the same direction as $\dot\theta$. To remove energy, apply torque in the opposite direction (e.g., damping).
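The derivation $\dot{E} = u\dot\theta$ can be checked numerically: simulate the undamped pendulum with some test input and compare a finite-difference estimate of $\dot{E}$ against $u\dot\theta$ at every sample. This sketch uses illustrative parameters and an arbitrary constant test torque of my choosing:

```python
import numpy as np

m, g, l = 1.0, 9.81, 1.0
I = m * l * l
u0 = 0.3  # small, constant test torque (arbitrary choice)

def f(x):
    """Undamped pendulum with input u0, x = [theta, thetadot]."""
    return np.array([x[1], (u0 - m * g * l * np.sin(x[0])) / I])

def energy(x):
    return 0.5 * I * x[1]**2 - m * g * l * np.cos(x[0])

dt, x = 1e-3, np.array([0.2, 0.0])
traj, E = [x.copy()], [energy(x)]
for _ in range(5000):  # RK4, 5 seconds
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    x = x + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    traj.append(x.copy())
    E.append(energy(x))

traj, E = np.array(traj), np.array(E)
# Central-difference dE/dt should match u*thetadot at every interior sample.
dEdt = (E[2:] - E[:-2]) / (2 * dt)
power = u0 * traj[1:-1, 1]
max_mismatch = np.max(np.abs(dEdt - power))
```

The mismatch is limited only by finite-difference and integration error; the gravity terms cancel identically, just as in the derivation above.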
To swing up the pendulum, even with torque limits, let us use this observation to drive the system to its homoclinic orbit, and then let the dynamics of the pendulum carry us to the upright equilibrium. Recall that the homoclinic orbit has energy $mgl$ -- let's call this our desired energy:$$E^d = mgl.$$ Furthermore, let's define the difference between our current energy and this desired energy as $\tilde{E} = E - E^d$, and note that we still have $$\dot{\tilde{E}} = \dot{E} = u\dot\theta.$$ Now consider the feedback controller of the form $$u = -k \dot\theta \tilde{E},\quad k>0.$$ I admit that this looks a bit strange; it was chosen for the simple reason that it turns the resulting "error dynamics" into something simple: $$\dot{\tilde{E}} = - k \dot\theta^2 \tilde{E}.$$ Think about the graphical analysis of this system if you were to draw $\dot{\tilde{E}}$ vs. $\tilde{E}$ for any fixed $\dot\theta$ -- it's a straight line separated from the horizontal axis which would imply an exponential convergence: $\tilde{E} \rightarrow 0.$ This is true for any $\dot\theta$, except for $\dot\theta=0$ (so it will not actually swing us up from the downright fixed point... but if you nudge the system just a bit, then it will start pumping energy and will swing all of the way up). The essential property is that when $E > E^d$, we should remove energy from the system (damping) and when $E < E^d$, we should add energy (negative damping). Even if the control actions are bounded, the convergence is easily preserved.
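Here is a minimal closed-loop simulation of this energy-shaping controller (plain Python/NumPy with illustrative parameters and gain; the course's own notebooks implement this in Drake, which is not reproduced here):

```python
import numpy as np

# Illustrative parameters and gain; b = 0 (undamped plant).
m, g, l, k = 1.0, 9.81, 1.0, 0.1
I = m * l * l
Ed = m * g * l  # energy of the homoclinic orbit

def closed_loop(x):
    """Closed-loop dynamics under u = -k*thetadot*(E - Ed)."""
    th, thd = x
    E = 0.5 * I * thd**2 - m * g * l * np.cos(th)
    u = -k * thd * (E - Ed)
    return np.array([thd, (u - m * g * l * np.sin(th)) / I])

x, dt = np.array([0.1, 0.0]), 1e-3  # nudged off the downright fixed point
for _ in range(30000):              # RK4, 30 seconds
    k1 = closed_loop(x)
    k2 = closed_loop(x + 0.5 * dt * k1)
    k3 = closed_loop(x + 0.5 * dt * k2)
    k4 = closed_loop(x + dt * k3)
    x = x + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

E_final = 0.5 * I * x[1]**2 - m * g * l * np.cos(x[0])
# E converges to Ed: the trajectory is pumped up onto the homoclinic orbit.
```

As predicted by the error dynamics $\dot{\tilde{E}} = -k\dot\theta^2\tilde{E}$, the small initial nudge is enough: the energy error collapses toward zero, and the trajectory hovers ever longer near the upright on each pass.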
This is a nonlinear controller that will push all system trajectories to the unstable equilibrium. But does it make the unstable equilibrium locally stable? With only this controller, the fixed point is attractive, but is not stable -- just like our example above. Small perturbations may cause the system to drive all of the way around the circle in order to once again return to the unstable equilibrium. For this reason, to actually balance the system, we'll have to switch to a different controller once we get near the fixed point (an idea that we'll discuss more in the next chapter).
Take a minute to play around with the energy-shaping controller for swinging up the pendulum.
Make sure that you take a minute to look at the code which runs during these examples. Note the somewhat arbitrary threshold for switching to the balancing controller. We'll give a much more satisfying answer for that in the chapter on Lyapunov methods.
There are a few things that are extremely nice about this controller. Not only is it simple, but it is actually incredibly robust to errors we might have in our estimate of the model parameters, $m,$ $l,$ and $g.$ In fact, the only place that the model enters our control equation is in the calculation of $\tilde{E} = \frac{1}{2} m l^2 \dot\theta^2 - mgl(1 + \cos\theta)$, and the only property of this estimate that impacts stability is the location of the orbit when $\tilde{E} = 0$, which is $\frac{1}{2}l\dot\theta^2 = g(\cos\theta + 1).$ Therefore, we could have used $\frac{E - E_d}{m} = \frac{1}{2} l^2 \dot\theta^2 - gl(1 + \cos\theta)$ in our controller and obtained the same result. This doesn't depend at all on our estimate of the mass, and only linearly on our estimates of the length and gravity (and if one cannot measure the length of the pendulum accurately, then a proper measuring device would make an excellent investment). This is somewhat amazing; we will develop many optimization-based algorithms capable of swinging up the pendulum, but it will take a lot of work to find one that is as insensitive to the model parameters.
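The mass-independence claim is a one-line identity, verified here numerically on random states (my own sketch; the random sampling is just for the check):

```python
import numpy as np

m, g, l = 1.0, 9.81, 1.0   # true parameters
m_hat = 3.0                # a (badly) mis-estimated mass used by the controller

def E_tilde(theta, thetadot, mass):
    """Energy error relative to the homoclinic orbit, E - E^d."""
    return 0.5 * mass * l**2 * thetadot**2 - mass * g * l * (1 + np.cos(theta))

# The mis-modeled energy error is just a positive rescaling of the true one,
# so u = -k*thetadot*E_tilde keeps the same sign everywhere and the same zero
# level set (the homoclinic orbit); only the effective gain changes.
states = np.random.default_rng(0).uniform(-5, 5, size=(100, 2))
E_hat_vals = np.array([E_tilde(th, thd, m_hat) for th, thd in states])
E_true_vals = np.array([E_tilde(th, thd, m) for th, thd in states])
```

Because $\tilde{E}$ is linear in the mass, a wrong mass estimate only rescales the controller gain $k$, leaving the sign of the control action and the target orbit untouched.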
If there is damping in the original system, of course we can cancel this out, too, using $u = -k \dot\theta \tilde{E} + b\dot\theta.$ And once again, the controller is quite robust if we have errors in the estimated damping ratio.