Saturday 21 September 2013

Perturbation Theory part III

In my previous post I introduced singular equations and how perturbation theory can be applied to solve them. You may have noticed that applications to differential equations were conspicuously absent from that post; this is because singular differential equations are more complicated than their algebraic counterparts. They share many similarities, however, notably that if we set $\epsilon = 0$ we do not capture the true behaviour of the solution. The naive approach of simply setting $\epsilon = 0$ loses some of the behaviour at the edges of the domain, so we introduce what's called a boundary layer to capture the behaviour of the solution around the boundaries.

As is often the case an example will illustrate this best, so consider the simple linear ODE \[ L: \quad \left\{ \begin{array}{l l} \epsilon y''(x) + y'(x) + y(x) = 0 \quad x \in [0,1] \\ y(0)=0 \\ y(1)=1 \end{array} \right.\] Now let's apply regular perturbation theory and see what happens; setting $\epsilon = 0$ we have \[ L_0 : \quad \left\{ \begin{array}{l l} y'(x) + y(x) = 0 \quad x \in [0,1] \\ y(0)=0 \\ y(1)=1 \end{array} \right.\] The solution to this is $y = A e^{-x}$. Due to the reduction in the order of the ODE, the solution we have found from perturbation theory cannot in general satisfy both boundary conditions, as we have only one undetermined constant.
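
To make the failure concrete, here is a small sympy sketch (with the illustrative choice $\epsilon = 1/100$) comparing the exact solution of the full problem $L$ against the naive $\epsilon = 0$ solution that keeps only the right-hand boundary condition (which, as we shall see, turns out to be the correct one to keep):

```python
import sympy as sp

x, r, c1, c2 = sp.symbols('x r c1 c2')
eps = sp.Rational(1, 100)   # a concrete small epsilon for illustration

# Exact solution: roots of the characteristic equation eps*r**2 + r + 1 = 0,
# with the constants fixed by y(0) = 0 and y(1) = 1.
r1, r2 = sp.solve(eps * r**2 + r + 1, r)
y = c1 * sp.exp(r1 * x) + c2 * sp.exp(r2 * x)
exact = y.subs(sp.solve([y.subs(x, 0), y.subs(x, 1) - 1], [c1, c2]))

# Naive eps=0 solution keeping only y(1) = 1: y = e^(1-x).
naive = sp.exp(1 - x)

for xv in [0, 0.005, 0.02, 0.1, 0.5, 1]:
    print(f"x={xv}: exact={float(exact.subs(x, xv)): .4f}"
          f"  naive={float(naive.subs(x, xv)): .4f}")
```

The two agree over most of the interval but differ by $\mathcal{O}(1)$ in a thin region near $x = 0$: this is the boundary layer.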

If $y$ is the true solution to the ODE, then one possible behaviour is:

  • Over most of the domain we expect $\epsilon y''$ to be small, and thus $L_0$ captures the approximate behaviour of $y$
  • In certain areas of the domain we find that $\epsilon y''$ is no longer small, i.e. $y'' \sim 1/\epsilon$. In these areas $L_0$ is no longer a good approximation. We expect these areas to lie close to the boundaries of the domain, where $y$ must adjust rapidly to match the boundary conditions.
There is a procedure we can follow to solve such problems:
  1. Determine the scaling of the boundary layer. We do this just as we did in algebraic singular perturbation theory: determine the scaling factor $\delta$ that removes the singular terms.
  2. Rescale the independent variable in the boundary layer using the change of variables $\xi =\delta (x-x_0) $, where $x_0$ is the location of the boundary layer.
  3. Find the perturbation series solutions to $y$ inside the boundary layers and outside the boundary layers. We term these the 'inner' and 'outer' solutions respectively. (We're not the most creative sorts when it comes to naming things)
  4. Determine all of the arbitrary constants in our solutions. We do this by matching the boundary conditions in the inner solutions and 'matching' the inner and outer solutions so they join up in the overlapping regions.

Actual problem

We will attempt to solve the above ODE using boundary layers. We have once again chosen a problem that does have an analytic solution; however, we shall proceed as though this were not the case.

Scaling

Let's now attempt to solve our example problem using this method, starting on the left side of the domain. First we want to determine the scaling factor that will remove the singular term; to do this we introduce the change of variables $x_L = \delta x$ near the point $x=0$. Using the chain rule and applying the change of variables to $L$ we find \[ \epsilon \delta^2 y''(x_L) + \delta y'(x_L) + y(x_L) = 0 \] We now compare the terms pairwise to try to find a balance, as we did before.

  1. When the first and second terms are in balance, $\delta = \epsilon^{-1}$. The first and second terms are then of order $\mathcal{O}(\epsilon^{-1})$, which dominates the third term, of order $\mathcal{O}(1)$. This result is sensible and leads to the distinguished limit.
  2. When the first and third terms are in balance, $\delta = \epsilon^{-1/2}$. The first and third terms are then of order $\mathcal{O}(1)$ and are dominated by the second term, which is of order $\mathcal{O}(\epsilon^{-1/2})$. A single term dominates; it is of higher order than the balancing terms, so this is not a dominant balance and we obtain no new solutions.
  3. When the second and third terms are in balance, $\delta = 1$. The second and third terms are then of order $\mathcal{O}(1)$ and dominate the first term, which is of order $\mathcal{O}(\epsilon)$. Of course here we have just recovered the original singular solution that we obtained by setting $\epsilon=0$.
Now we have determined that $\delta = \epsilon^{-1}$ is the appropriate scaling factor. The scaling of the boundary layer also tells us the width of the layer, $\epsilon$, and hence the range over which $\epsilon y''$ is large.
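
This pairwise comparison is mechanical enough to script. A small sympy sketch of the three balances (after rescaling, the terms $\epsilon y''$, $y'$ and $y$ acquire the orders $\epsilon \delta^2$, $\delta$ and $1$ respectively, with $\delta = \epsilon^{-p}$):

```python
import sympy as sp

eps = sp.symbols('epsilon', positive=True)

# p = 1, 1/2, 0 correspond to the three pairwise balances above.
for p in [sp.Integer(1), sp.Rational(1, 2), sp.Integer(0)]:
    delta = eps**(-p)
    orders = [eps * delta**2, delta, sp.Integer(1)]
    print(f"p = {p}: term orders = {orders}")
```

Here $p = 1$ balances the first two terms at a dominant order (the boundary-layer scaling), $p = 0$ recovers the outer balance, and $p = 1/2$ leaves a single dominant term, so it is no balance at all.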

With this in mind we have three expansions to determine, one for each of the regions we have divided the interval into, using the appropriate scaling factor.

  1. Near $x = 0$, the left hand boundary, we have rescaled the equation so that $x_L = x/\epsilon$ \[ y_L(x_L) = y_{L_0}(x_L) + \epsilon y_{L_1}(x_L) + \cdots \]
  2. In the middle region outside the boundary regions we can use regular perturbation theory and thus need no rescaling \[ y_M(x) = y_{M_0}(x) + \epsilon y_{M_1}(x) + \cdots \]
  3. Near $x = 1$, the right hand boundary, we have rescaled the equation so that $x_R = (x-1)/\epsilon$ \[ y_R(x_R) = y_{R_0}(x_R) + \epsilon y_{R_1}(x_R) + \cdots \]

Solving

First we attempt to solve the equation around the left boundary \[ y_{L}''(x_L) + y_{L}'(x_L) + \epsilon y_{L}(x_L) = 0 \] Applying regular perturbation theory to this equation we find \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: y_{L_0}'' + y_{L_0}' = 0\\ \mathcal{O}(\epsilon) &: y_{L_1}'' + y_{L_1}' + y_{L_0} = 0 \\ \end{array} \right.\] and hence the zeroth order solution, $\mathcal{O}(1)$, is \[ y_{L_0} = A_{L_0} + B_{L_0} e^{-x_L} \] This is the solution at the left-most boundary of the interval, so it must satisfy $y(0) = 0$, which leads to the condition $A_{L_0} + B_{L_0} = 0$.

Next we attempt to solve the middle equation \[ \epsilon y''(x) + y'(x) + y(x) = 0 \] Applying regular perturbation theory to this equation we find \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: y_{M_0}' + y_{M_0} = 0\\ \mathcal{O}(\epsilon) &: y_{M_0}'' + y_{M_1}' + y_{M_1} = 0 \\ \end{array} \right.\] and hence the zeroth order solution, $\mathcal{O}(1)$, is \[ y_{M_0} = A_{M_0} e^{-x} \] This solution does not need to satisfy either of the ODE's boundary conditions; instead $A_{M_0}$ will be determined by matching with the two inner solutions.

Finally we attempt to solve the equation around the right boundary \[ y_{R}''(x_R) + y_{R}'(x_R) + \epsilon y_{R}(x_R) = 0 \] Applying regular perturbation theory to this equation we find \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: y_{R_0}'' + y_{R_0}' = 0\\ \mathcal{O}(\epsilon) &: y_{R_1}'' + y_{R_1}' + y_{R_0} = 0 \\ \end{array} \right.\] and hence the zeroth order solution, $\mathcal{O}(1)$, is \[ y_{R_0} = A_{R_0} + B_{R_0} e^{-x_R} \] This is the solution at the right-most boundary of the interval, so it must satisfy $y(1) = 1$, which leads to the condition $A_{R_0} + B_{R_0} = 1$; check the change of variables for $x_R$ to see that $y(1)=1$ in $x$ corresponds to $y_R(0)=1$ in the $x_R$ variable.
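
Each of these leading-order equations is elementary, but as a sanity check here is a sympy sketch solving both forms (a generic variable $s$ and function $f$ stand in for all three regions):

```python
import sympy as sp

s = sp.symbols('s')
f = sp.Function('f')

# Leading order inside a boundary layer: f'' + f' = 0.
print(sp.dsolve(sp.Eq(f(s).diff(s, 2) + f(s).diff(s), 0), f(s)))
# -> Eq(f(s), C1 + C2*exp(-s))

# Leading order in the middle (outer) region: f' + f = 0.
print(sp.dsolve(sp.Eq(f(s).diff(s) + f(s), 0), f(s)))
# -> Eq(f(s), C1*exp(-s))
```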

Matching

Currently we have 5 arbitrary constants to determine; the boundary conditions provide two equations and the matching provides the remaining three. The idea behind matching is that there is some overlap region between each pair of expansions where $y(x)$ is accurately approximated by both $y_{M_0}(x)$ and $y_{L_0}(x_L)$. In this overlap region we expect \[ y_{M_0}(x) \sim y_{L_0}(x_L) \ \text{as} \ x \to 0, \ \text{and} \ x_L = x/\epsilon \to \infty \] One method of doing this is to introduce an intermediate scaling factor $\tilde{x} = x/\epsilon^{\alpha}$ with $ 0 < \alpha < 1$. Then as $\epsilon \to 0$ with fixed $\tilde{x}$ we have $x = \epsilon^{\alpha} \tilde{x} \to 0$ and $x_L = \epsilon^{\alpha-1} \tilde{x} \to \infty$. Now matching the left side \[ \left\{ \begin{array}{l l} y_{L_0}(x_L) = A_{L_0} + B_{L_0} e^{-\epsilon^{\alpha-1} \tilde{x} } + \mathcal{O}(\epsilon) \\ y_{L_0}(x_L) \sim A_{L_0} + \mathcal{O}(\epsilon) \\ \end{array} \right.\] We were able to neglect the exponential terms in $y_{L_0}$ as these go to $0$ as $\epsilon \to 0$. For the middle terms we find \[ \left\{ \begin{array}{l l} y_{M_0} = A_{M_0} e^{- \epsilon^{\alpha} \tilde{x}} + \mathcal{O}(\epsilon) \\ y_{M_0} \sim A_{M_0} - \tilde{x} \epsilon^{\alpha} A_{M_0} + \mathcal{O}(\epsilon) \\ \end{array} \right.\] Here we have written the exponential term as a power series in $\epsilon^{\alpha} \tilde{x}$, up to first order. For both of these approximations to $y$ to match as $\epsilon \to 0$ we require $A_{L_0} = A_{M_0}$.

Now we match the right side, using the scaling $\tilde{x} = (x-1)/\epsilon^{\alpha}$; be careful, as I have used the same notation $\tilde{x}$, but the definition here is different. \[ y_{R_0} = A_{R_0} + B_{R_0} e^{-\epsilon^{\alpha-1} \tilde{x} } + \mathcal{O}(\epsilon) \] Here we can no longer neglect the exponential terms as we did on the left; this is because our scaling results in $\tilde{x} = (x-1)/\epsilon^{\alpha} \le 0$, so the exponential here grows without limit. Now for the middle terms \[ \left\{ \begin{array}{l l} y_{M_0} = A_{M_0} e^{-1 - \epsilon^{\alpha} \tilde{x}} + \mathcal{O}(\epsilon) \\ y_{M_0} \sim A_{M_0}e^{-1} - \tilde{x} \epsilon^{\alpha} A_{M_0} e^{-1} + \mathcal{O}(\epsilon) \\ \end{array} \right.\] As we noticed above, the second term in $y_R$ grows exponentially, which cannot be matched by anything in $y_M$, so we require $B_{R_0} = 0$ and find \[ B_{R_0} = 0, \quad A_{R_0} = A_{M_0}e^{-1} \] The boundary conditions combined with the matching leave us 5 equations in 5 unknowns \[ \left\{ \begin{array}{l l} A_{L_0} + B_{L_0} &= 0 \\ A_{R_0} + B_{R_0} &= 1 \\ A_{L_0} &= A_{M_0} \\ B_{R_0} &= 0 \\ A_{R_0} &= A_{M_0} e^{-1} \end{array} \right.\] Solving this system of equations we find \[ A_{L_0} = e, \quad B_{L_0} = -e, \quad A_{M_0} = e, \quad A_{R_0} = 1, \quad B_{R_0} = 0 \] This gives us the perturbation series solutions to zeroth order, $\mathcal{O}(1)$ \[ \left\{ \begin{array}{l l} y_L(x_L) = e - e^{1-x_L} \\ y_M(x) = e^{1-x} \\ y_R(x_R) = 1 \end{array} \right.\] Notice that we didn't actually need a boundary layer at $x = 1$: to this order $y_R$ is just the constant boundary value, so the outer solution already satisfies the right-hand boundary condition.
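
The bookkeeping in that $5 \times 5$ linear system is easy to fumble by hand; a sympy sketch that solves it:

```python
import sympy as sp

AL0, BL0, AM0, AR0, BR0 = sp.symbols('A_L0 B_L0 A_M0 A_R0 B_R0')

eqs = [
    sp.Eq(AL0 + BL0, 0),           # boundary condition y(0) = 0
    sp.Eq(AR0 + BR0, 1),           # boundary condition y(1) = 1
    sp.Eq(AL0, AM0),               # matching on the left
    sp.Eq(BR0, 0),                 # matching on the right (no growth)
    sp.Eq(AR0, AM0 * sp.exp(-1)),  # matching on the right
]
print(sp.solve(eqs, [AL0, BL0, AM0, AR0, BR0]))
# -> {A_L0: E, B_L0: -E, A_M0: E, A_R0: 1, B_R0: 0}
```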

Composite expansion

Now all that remains is to join together the inner and outer solutions to get an approximation to $y$. We could construct a piecewise function; however, as the exact edges of boundary layers are fuzzy, we adopt a different approach. We construct a uniformly valid approximation to $y$ by adding together the inner and outer solutions and subtracting the solution in the overlap region; we subtract it because where the solutions overlap we have double counted the value of the solution. To determine the value of the overlap we write the outer solution in terms of the inner variables and the inner solution in terms of the outer variables. \[ \left\{ \begin{array}{l l} y_L(x_L) = e - e^{1 - x_L} \\ y_L(x) = e - e^{1 - x/\epsilon} \sim e + \mathcal{O}(\epsilon) \end{array} \right.\] \[ \left\{ \begin{array}{l l} y_M(x) = e^{1-x} \\ y_M(x_L) = e^{1 - \epsilon x_L} \sim e + \mathcal{O}(\epsilon) \end{array} \right.\] The overlap term to zeroth order, $\mathcal{O}(1)$, is $y_{overlap} = e$, so the composite solution is \[ y(x) = e^{1-x} + e - e^{1-x/\epsilon} - e = e^{1-x} - e^{1-x/\epsilon} \] This is the uniform approximation of $y$ to zeroth order.
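
Since our example is a linear constant-coefficient ODE we can compare the composite expansion directly against the exact solution; a numpy sketch with $\epsilon = 0.01$:

```python
import numpy as np

eps = 0.01

def exact(x):
    # y = c2*(exp(r2*x) - exp(r1*x)) with r1, r2 the roots of
    # eps*r**2 + r + 1 = 0 and c2 fixed by y(0)=0, y(1)=1.
    r1 = (-1 + np.sqrt(1 - 4 * eps)) / (2 * eps)
    r2 = (-1 - np.sqrt(1 - 4 * eps)) / (2 * eps)
    c2 = 1.0 / (np.exp(r2) - np.exp(r1))
    return c2 * (np.exp(r2 * x) - np.exp(r1 * x))

def composite(x):
    # Zeroth-order uniform approximation derived above.
    return np.exp(1 - x) - np.exp(1 - x / eps)

for xv in [0.0, 0.005, 0.05, 0.5, 1.0]:
    print(f"x={xv:5}: exact={exact(xv): .4f}  composite={composite(xv): .4f}")
```

The composite approximation tracks the exact solution both inside and outside the layer, to within the expected $\mathcal{O}(\epsilon)$.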

Higher order approximations

Naturally we can continue this method up to higher orders of $\epsilon$ to obtain better approximations to $y$. The algebra, however, is much more tedious.

Solving

To construct a higher order approximation we proceed in the same way as before; the method is unchanged, however the algebra is usually much messier. For anything other than simple equations a CAS is a must. Collecting the first order terms, $\mathcal{O}(\epsilon)$ \[ \left\{ \begin{array}{l l} y_{L_1}(x_L) = -e x_L - x_L e^{1-x_L} + A_{L_1} + B_{L_1} e^{-x_L}\\ y_{M_1}(x) = -e x e^{-x} + A_{M_1} e^{-x} \\ y_{R_1}(x_R) = -x_R + A_{R_1} + B_{R_1} e^{-x_R} \\ \end{array} \right.\] The associated boundary conditions are \[ \left\{ \begin{array}{l l} y_{L_1}(0) = 0 \\ y_{R_1}(0) = 0 \end{array} \right.\] which gives us the conditions \[ A_{L_1} + B_{L_1} = 0, \quad A_{R_1} + B_{R_1} = 0 \]
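
These solutions are worth verifying by substitution; for example, a sympy sketch checking the middle solution against its first-order equation (with $A_{M_0} = e$ already fixed by the zeroth-order matching):

```python
import sympy as sp

x, AM1 = sp.symbols('x A_M1')

yM0 = sp.exp(1 - x)                          # zeroth-order outer solution
yM1 = -x * sp.exp(1 - x) + AM1 * sp.exp(-x)  # candidate first-order term

# First-order outer equation: y_M0'' + y_M1' + y_M1 = 0.
print(sp.simplify(yM0.diff(x, 2) + yM1.diff(x) + yM1))   # -> 0
```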

Matching
As before, the remaining 3 conditions are determined from the matching. We use the same change of variables $\tilde{x} = x/\epsilon^{\alpha}$ \[ \left\{ \begin{array}{l l} y_L(x_L) = e - e^{1 - \epsilon^{\alpha - 1} \tilde{x}} + \epsilon (-e \epsilon^{\alpha - 1} \tilde{x} -e \epsilon^{\alpha - 1} \tilde{x} e^{-\epsilon^{\alpha-1}\tilde{x}} + A_{L_1} + B_{L_1} e^{-\epsilon^{\alpha - 1} \tilde{x}}) + \mathcal{O}(\epsilon^2) \\ y_L \sim e - e \epsilon^{\alpha} \tilde{x} + \epsilon A_{L_1} + \mathcal{O}(\epsilon^2) \end{array} \right.\] Here again we were able to neglect the exponentials for the same reason as before. Proceeding to the middle solution \[ \left\{ \begin{array}{l l} y_M(x) = e^{1-\epsilon^{\alpha} \tilde{x}} + \epsilon (-e \epsilon^{\alpha} \tilde{x} e^{-\epsilon^{\alpha} \tilde{x}} + A_{M_1} e^{-\epsilon^{\alpha} \tilde{x}}) + \mathcal{O}(\epsilon^2) \\ y_M \sim e - e \epsilon^{\alpha} \tilde{x} + e \epsilon^{2 \alpha} \tilde{x}^{2}/2 + \cdots - e \epsilon^{\alpha + 1} \tilde{x} + e \epsilon^{2\alpha + 1} \tilde{x}^{2} + \cdots + \epsilon A_{M_1} - \epsilon^{\alpha + 1} A_{M_1} \tilde{x} + \mathcal{O}(\epsilon^2) \end{array} \right.\] Matching these together gives \[ A_{L_1} = A_{M_1} \] To determine the right hand side we use the same change of variables as before, $\tilde{x} = (x-1)/\epsilon^{\alpha}$, and find \[ \left\{ \begin{array}{l l} y_R = 1 + \epsilon (-\epsilon^{\alpha - 1} \tilde{x} + A_{R_1} + B_{R_1} e^{-\epsilon^{\alpha - 1} \tilde{x}} ) + \mathcal{O}(\epsilon^2) \\ y_R \sim 1 - \epsilon^{\alpha} \tilde{x} + \epsilon A_{R_1} + \epsilon B_{R_1} e^{-\epsilon^{\alpha - 1} \tilde{x}} + \mathcal{O}(\epsilon^2) \end{array} \right.\] and the middle is \[ \left\{ \begin{array}{l l} y_M = e^{-\epsilon^{\alpha} \tilde{x}} + \epsilon (-e(1+ \epsilon^{\alpha} \tilde{x}) e^{-1-\epsilon^{\alpha} \tilde{x}} + A_{M_1} e^{-1-\epsilon^{\alpha} \tilde{x}}) + \mathcal{O}(\epsilon^2) \\ y_M \sim 1 - \epsilon^{\alpha} \tilde{x} + \epsilon^{2 \alpha} \tilde{x}^{2}/2 + \cdots - (\epsilon + \epsilon^{\alpha + 1} \tilde{x} ) (1 - \epsilon^{\alpha} \tilde{x} + \cdots) + \epsilon A_{M_1} e^{-1} (1 - \epsilon^{\alpha} \tilde{x} + \cdots ) + \mathcal{O}(\epsilon^2) \end{array} \right.\] Matching yields \[ B_{R_1} = 0, \quad A_{R_1} = A_{M_1} e^{-1} - 1 \] So again we have 5 equations in 5 unknowns \[ \left\{ \begin{array}{l l} A_{L_1} + B_{L_1} &= 0 \\ A_{R_1} + B_{R_1} &= 0 \\ A_{L_1} &= A_{M_1} \\ B_{R_1} &= 0 \\ A_{R_1} &= A_{M_1} e^{-1} - 1 \end{array} \right.\] with solutions given by \[ A_{L_1} = e, \quad B_{L_1} = -e, \quad A_{M_1} = e, \quad A_{R_1} = 0, \quad B_{R_1} = 0 \] This gives us the first order terms of the perturbation series of $y$, \[ \left\{ \begin{array}{l l} y_{L_1}(x_L) = -e x_L - x_L e^{1-x_L} + e - e^{1-x_L}\\ y_{M_1}(x) = -x e^{1-x} + e^{1-x} \\ y_{R_1}(x_R) = -x_R \\ \end{array} \right.\]
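
As at zeroth order, the final linear system can be checked mechanically; a sympy sketch:

```python
import sympy as sp

AL1, BL1, AM1, AR1, BR1 = sp.symbols('A_L1 B_L1 A_M1 A_R1 B_R1')

eqs = [
    sp.Eq(AL1 + BL1, 0),               # y_L1(0) = 0
    sp.Eq(AR1 + BR1, 0),               # y_R1(0) = 0
    sp.Eq(AL1, AM1),                   # matching on the left
    sp.Eq(BR1, 0),                     # matching on the right
    sp.Eq(AR1, AM1 * sp.exp(-1) - 1),  # matching on the right
]
print(sp.solve(eqs, [AL1, BL1, AM1, AR1, BR1]))
# -> {A_L1: E, B_L1: -E, A_M1: E, A_R1: 0, B_R1: 0}
```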

Composite solution

We want to find a solution of the form $y = y_L + y_M - y_{overlap}$, as we did above. First we need to determine the value of the overlap; re-expressing the inner and outer solutions in the appropriate variables we have on the left side \[ \left\{ \begin{array}{l l} y_L = e - e^{1- x/\epsilon} + \epsilon \Big( -e \dfrac{x}{\epsilon} - \dfrac{x}{\epsilon} e^{1-x/\epsilon} + e - e^{1-x/\epsilon} \Big) + \cdots \\ y_L \sim e - e x + \epsilon e + \mathcal{O}(\epsilon^2) \end{array} \right.\] and the middle solution \[ \left\{ \begin{array}{l l} y_M = e^{1-\epsilon x_L} + \epsilon \Big( -\epsilon x_L e^{1-\epsilon x_L} + e^{1-\epsilon x_L} \Big) + \cdots \\ y_M \sim e - e \epsilon x_L + \epsilon e + \mathcal{O}(\epsilon^2) \end{array} \right.\] Hence we see that the overlap is given by \[ y_{overlap} = e - e x +\epsilon e \] So after all that algebra we've determined, to first order, an approximate solution for $y$. \[ y(x) = -e^{1-x/\epsilon} - x e^{1-x/\epsilon} - \epsilon e^{1-x/\epsilon} + e^{1-x} - \epsilon x e^{1-x} + \epsilon e^{1-x} + \mathcal{O}(\epsilon^2) \] Whilst we could have found the analytic solution to this problem, by choosing something simple we were able to reasonably deconstruct the problem and carry out the algebra relatively easily.
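
To see that the extra order genuinely buys accuracy, compare the zeroth- and first-order composites against the exact solution; a numpy sketch with $\epsilon = 0.01$:

```python
import numpy as np

eps = 0.01

def exact(x):
    # Exact solution of eps*y'' + y' + y = 0 with y(0)=0, y(1)=1.
    r1 = (-1 + np.sqrt(1 - 4 * eps)) / (2 * eps)
    r2 = (-1 - np.sqrt(1 - 4 * eps)) / (2 * eps)
    c2 = 1.0 / (np.exp(r2) - np.exp(r1))
    return c2 * (np.exp(r2 * x) - np.exp(r1 * x))

def comp0(x):   # zeroth-order composite
    return np.exp(1 - x) - np.exp(1 - x / eps)

def comp1(x):   # first-order composite derived above
    inner, outer = np.exp(1 - x / eps), np.exp(1 - x)
    return (outer - inner - x * inner - eps * inner
            - eps * x * outer + eps * outer)

for xv in [0.005, 0.05, 0.5, 1.0]:
    print(f"x={xv:5}: |err0|={abs(exact(xv) - comp0(xv)):.2e}"
          f"  |err1|={abs(exact(xv) - comp1(xv)):.2e}")
```

The first-order composite is noticeably closer to the exact solution, both inside the layer and in the interior.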

Sunday 15 September 2013

Perturbation Theory part II

Now that we understand the basics of perturbation theory we can address some of the shortcomings in our original formulation. For example, as it currently stands our theory of perturbation cannot be used for so-called singular problems, ones where we cannot simply set $\epsilon = 0$. Problems like this usually have the small parameter $\epsilon$ on the leading term. We are also unable to deal with secular terms in our perturbation series, terms that grow without bound. The original example I presented in my previous post was such a problem, and we shall address how to deal with it here.

Singular Perturbations

We start by considering a singular perturbation problem; first we shall attempt to use regular perturbation theory as we did previously and see how and why it fails. Consider the quadratic equation $\epsilon x^2 - x + 1 = 0, \ \epsilon \ll 1$. Using the usual quadratic formula the solution to this problem can be found trivially and is \[ x = \dfrac{1 \pm \sqrt{1 - 4 \epsilon}}{2 \epsilon} \] Again, as we have done previously, let's assume we couldn't have just used the quadratic formula and attempt to find a solution using perturbation theory, following the algorithm.

  1. First let's identify the unperturbed problem: just set $\epsilon = 0$ and see what we're left with \[ -x + 1 = 0 \] The solution to this linear equation is \[ x = 1 \] Things already seem to be going wrong; by setting $\epsilon = 0$ we have lost a solution to the problem. Let's continue and see what happens.
  2. Let's assume we can write $x$ in a perturbation series as \[x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots \] Next we substitute this into our original equation to give us \[ \epsilon (x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots)^2 - (x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots) + 1 = 0 \] We now have the full problem written in terms of a perturbation series, so let's proceed to the final step.
  3. Now we solve the perturbation series by collecting terms of equal order in $\epsilon$ and solving the resulting system of equations. Collecting terms of equal orders \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: - x_0 + 1 &=& 0 \\ \mathcal{O}(\epsilon) &: {x_0}^2 - x_1 &=& 0 \\ \mathcal{O}(\epsilon^2) &: 2 x_0 x_1 - x_2 &=& 0 \\ \vdots \end{array} \right.\] We solve the system of equations up to second order in $\epsilon$, $\mathcal{O}(\epsilon^2)$, and find that the solution to the system is given by \[ x_0 = 1 , \quad x_1 = 1, \quad x_2 = 2 \] So putting this all together we find \[ x = 1 + \epsilon + 2\epsilon^2 + \cdots \] This is a catastrophic failure: by applying regular perturbation to this problem we've only managed to find one solution, when there must exist two.
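
The order-collection step is easy to automate, and mechanising it makes the failure plain; a sympy sketch that reproduces the system above:

```python
import sympy as sp

eps, x0, x1, x2 = sp.symbols('epsilon x0 x1 x2')

x = x0 + eps * x1 + eps**2 * x2
expr = sp.expand(eps * x**2 - x + 1)

for n in range(3):
    print(f"O(eps^{n}):", expr.coeff(eps, n))
# O(eps^0): 1 - x0
# O(eps^1): x0**2 - x1
# O(eps^2): 2*x0*x1 - x2
```

However we proceed, the $\mathcal{O}(1)$ equation only ever produces the single root $x_0 = 1$.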

So what went wrong? By setting $\epsilon=0$ we have reduced our equation from a quadratic to a linear one, and thus lost a solution. To get around this issue we need to find a way to apply perturbation theory without lowering the degree of the equation we want to solve. Fortunately there is a simple way to do this: we introduce a scaling factor.

Let's introduce the scaling factor $\delta(\epsilon)$ through the change of variables $x = \delta(\epsilon) y$. If we apply this to our previous problem we find \[ \epsilon \delta^2 y^2 - \delta y + 1 = 0 \] Now we need to work out the scaling factor $\delta$. There are two methods to do this. We can either vary the size of $\delta$ from small to large until we find the size of $\delta$ that balances the equation and gives us both solutions; this method is something of an art, as spotting the right intervals to choose can be difficult, and it works best with larger problems. Here, however, we shall use the simpler method, which is better suited to smaller problems: we compare pairwise all combinations of terms until we find a meaningful balance.

  1. When the first and second terms are in balance, $\delta = \epsilon^{-1}$. The first and second terms are then of order $\mathcal{O}(\epsilon^{-1})$, which dominates the third term, of order $\mathcal{O}(1)$. This result is sensible and leads to two solutions.
  2. When the first and third terms are in balance, $\delta = \epsilon^{-1/2}$. The first and third terms are then of order $\mathcal{O}(1)$ and are dominated by the second term, which is of order $\mathcal{O}(\epsilon^{-1/2})$. A single term dominates; it is of higher order than the balancing terms, so this is not a dominant balance and we obtain no new solutions.
  3. When the second and third terms are in balance, $\delta = 1$. The second and third terms are then of order $\mathcal{O}(1)$ and dominate the first term, which is of order $\mathcal{O}(\epsilon)$. Of course here we have just recovered the original singular solution that we obtained by setting $\epsilon=0$.

Our analysis here leads us to conclude that we should rescale our equation using the change of variables $x = y/\epsilon$. This is the result of the dominant balance where we find two solutions. Rescaling our original equation and simplifying we find \[ y^2 - y + \epsilon = 0 \] Through our rescaling we've changed our singular equation into one that can be readily tackled using regular perturbation theory. For brevity I shall not go through solving this regular perturbation problem; following the method from my previous post we find that the solutions, to second order, are given by \[ \left\{ \begin{array}{l l} y_+ = 1 - \epsilon - \epsilon^2 \\ y_- = \epsilon + \epsilon^2 \end{array} \right.\] and if we transform back into our original variable $x$ we find \[ \left\{ \begin{array}{l l} x_+ = 1/\epsilon - 1 - \epsilon \\ x_- = 1 + \epsilon \end{array} \right.\] These are the perturbation series solutions of our original equation to second order in $\epsilon$, $\mathcal{O}(\epsilon^2)$. Of course now that we have two solutions, we should check them against the actual answer to make sure our method gives us the correct solution. The analytic solution is \[ x = \dfrac{1 \pm \sqrt{1 - 4 \epsilon}}{2 \epsilon} \] Writing out the square root term as a power series gives us \[ \dfrac{1 \pm (1 - 2\epsilon - 2\epsilon^2 + \cdots)}{2\epsilon} \] which is the same as the perturbation series solution. This is an excellent result: through a simple rescaling and balancing act we have found a way to solve singular problems and vastly expand the repertoire of problems we can solve.
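
A quick numerical sanity check of these series against the exact roots (a numpy sketch with $\epsilon = 0.01$):

```python
import numpy as np

eps = 0.01

exact = np.roots([eps, -1.0, 1.0])       # roots of eps*x**2 - x + 1 = 0
approx = [1 / eps - 1 - eps, 1 + eps]    # second-order perturbation series

print(sorted(exact))    # [1.0102..., 98.9897...]
print(sorted(approx))   # [1.01, 98.99]
```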

Let's analyse more thoroughly why regular perturbation failed. Examining the solutions we can see that the standard ansatz of regular perturbation theory failed to capture the $1/ \epsilon$ term that appears in the solution. If instead we had chosen an ansatz of the form $x = \epsilon^{-1}x_{-1} + x_0 + \epsilon x_1 + \cdots$ we would have found both solutions, however without considerable insight or an exact solution it is very difficult to predict an expansion that will cover all of the solutions; so the more general method is to rescale and then apply regular perturbation theory. An interpretation of the root that we initially failed to capture is that as we let $\epsilon \to 0$ the root diverges and $x \to \infty$. Be careful as this interpretation is only true for algebraic equations.

Multiple scale analysis

As we are addressing singular perturbation theory, now is an appropriate time to talk about dealing with secular terms. A secular term is one that grows without bound. These often appear when applying perturbation theory to differential equations with oscillatory solutions. The technique we introduce here is very important for extending the range over which solutions are good; in some cases it lifts the restriction that a perturbation series is only valid up until $t \sim 1/\epsilon$. A prime example of an equation that needs this sort of treatment is the nonlinear Duffing equation. This is the differential equation I originally presented at the beginning of my previous post, and here we will develop the tools to solve it.

The nonlinear Duffing equation with appropriate boundary values is \[ \left\{ \begin{array}{l l} y''(t) + y(t) + \epsilon y^3(t) = 0\\ y(0) = a \\ y'(0) = 0 \\ \end{array} \right.\] We initially approach this problem using regular perturbation theory, following the usual method:

  1. Let's set $\epsilon = 0$ and identify the unperturbed problem \[ \left\{ \begin{array}{l l} y'' + y = 0\\ y(0) = a \\ y'(0) = 0 \\ \end{array} \right.\] This is a standard second order ODE with constant coefficients which can be solved easily via the usual methods; it has the solution \[ y(t) = a \cos t \] Now we move on to identify the perturbed problem.
  2. We assume that we can write $y(t)$ as a perturbation series \[ y(t) = y_0 + \epsilon y_1 + \epsilon^2 y_2 + \cdots \] We now substitute this into our original equation, including the boundary conditions and find \[ \left\{ \begin{array}{l l} (y_0 + \epsilon y_1 + \epsilon^2 y_2 + \cdots)'' + (y_0 + \epsilon y_1 + \epsilon^2 y_2 + \cdots) + \epsilon (y_0 + \epsilon y_1 + \epsilon^2 y_2 + \cdots)^3 = 0 \\ y_0(0) + \epsilon y_1(0) + \epsilon^2 y_2(0) + \cdots = a \\ {y_0}'(0) + \epsilon {y_1}'(0) + \epsilon^2 {y_2}'(0) + \cdots = 0 \end{array} \right.\] Remember that only the zeroth order boundary conditions are inhomogeneous in this problem. Now we've formulated the problem in terms of a perturbation series let's proceed to the final step.
  3. Now we collect all the terms of equal order in $\epsilon$ and form a system of equations that we need to solve. The system is \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: {y_0}'' + {y_0} &=& 0 \\ \mathcal{O}(\epsilon) &: {y_1}'' + {y_1} + {y_0}^3 &=& 0 \\ \mathcal{O}(\epsilon^2) &: {y_2}'' + {y_2} + 3{y_0}^2{y_1} &=& 0 \\ \vdots \end{array} \right.\] and has the corresponding system of boundary conditions \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: y_0(0) = a, \quad {y_0}'(0) = 0 \\ \mathcal{O}(\epsilon) &: y_1(0) = 0, \quad {y_1}'(0) = 0 \\ \mathcal{O}(\epsilon^2) &: y_2(0) = 0, \quad {y_2}'(0) = 0 \\ \vdots \end{array} \right.\] This system is very tedious to solve by hand, though it can be done using the method of undetermined coefficients and the usual techniques for solving homogeneous linear constant coefficient ODEs. I would recommend using a computer algebra system, such as Mathematica, to solve this system; in particular if you want to consider solving for higher order terms then a CAS really is a must. Regardless, the solution to first order is \[ y(t) = a \cos t + \dfrac{\epsilon a^3}{32}(\cos 3t - \cos t - 12 t \sin t) + \mathcal{O}(\epsilon^2) \] We could do all of this before, but here we can identify the secular term, $t \sin t$; this term grows without limit and causes the perturbation series to diverge from the actual solution as $t \sim 1/\epsilon$.
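
The first-order solution, secular term included, can be verified by direct substitution; a sympy sketch:

```python
import sympy as sp

t, a = sp.symbols('t a')

y0 = a * sp.cos(t)
# First-order correction; note the secular t*sin(t) term.
y1 = a**3 / 32 * (sp.cos(3 * t) - sp.cos(t) - 12 * t * sp.sin(t))

# Check the O(eps) equation y1'' + y1 + y0**3 = 0 and the homogeneous BCs.
print(sp.simplify(y1.diff(t, 2) + y1 + y0**3))     # -> 0
print(y1.subs(t, 0), y1.diff(t).subs(t, 0))        # -> 0 0
```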

Now we introduce the idea of multiple scale analysis: we assume the existence of many time scales in our problem, $( t, \tau, \tau_1, \cdots )$, and treat each of these time scales as its own independent variable. This is the general procedure; for simplicity let's consider only two timescales, with the two variables $t$ and $\tau = \epsilon t$, and seek a perturbative solution of the form \[ y(t) = Y_0(t,\tau) + \epsilon Y_1(t,\tau) + \epsilon^2 Y_2(t,\tau) + \cdots \] Substituting this two-scale expansion into our original differential equation, and remembering that the chain rule gives $\dfrac{d}{dt} = \dfrac{\partial}{\partial t} + \epsilon \dfrac{\partial}{\partial \tau}$, we find \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: \dfrac{\partial^2 Y_0}{\partial t^2} + Y_0 &=& 0 \\ \mathcal{O}(\epsilon) &: \dfrac{\partial^2 Y_1}{\partial t^2} + Y_1 + {Y_0}^3 + 2 \dfrac{\partial^2 Y_0}{\partial t \partial \tau} &=& 0 \\ \vdots \end{array} \right.\] Solving this system of equations, starting at zeroth order $\mathcal{O}(1)$, we find \[ Y_0 = A(\tau) e^{i t} + B(\tau) e^{-i t} \] where $A(\tau)$ and $B(\tau)$ are functions we have yet to determine. We determine the unknown functions by imposing the condition that no secular terms should appear in $Y_1$; when we solved this problem using regular perturbation there were of course secular terms at this order. We are free to impose this condition because of the functional freedom we have in $A$ and $B$ thanks to our new timescale. Before we move on we should examine our solution more thoroughly: since $y$ is real, $Y_0$ must be real, which forces $B = A^*$ and \[ Y_0 = A(\tau) e^{i t} + A^*(\tau) e^{-i t} \] This lowers the number of unknowns we need to solve for, which will simplify our calculations. Next we substitute this result into the first order equation, $\mathcal{O}(\epsilon)$, to find \[ \dfrac{\partial^2 Y_1}{\partial t^2} + Y_1 = - A^3 e^{3 i t} - (A^*)^3 e^{-3 i t} - \Big( 2i \dfrac{\partial A}{\partial \tau} + 3 A^2 A^* \Big) e^{i t} + \Big( 2i \dfrac{\partial A^*}{\partial \tau} - 3 A (A^*)^2 \Big) e^{-i t} \] Examining the above equation, the homogeneous equation has solutions that go like $Y_1 \sim e^{\pm i t}$, so we expect secular terms to arise in the solution due to the presence of inhomogeneous terms that go like $e^{\pm i t}$. However, as we have the functional freedom to choose $A$, we demand that the following system hold, removing the secular terms \[ \left\{ \begin{array}{l l} 2i \dfrac{\partial A}{\partial \tau} + 3 A^2 A^* &=& 0 \\ 2i \dfrac{\partial A^*}{\partial \tau} - 3 A (A^*)^2 &=& 0 \end{array} \right.\] These equations are simply the complex conjugates of one another, so we need only solve one of them. To do so we express the complex-valued coefficient function $A$ as $A(\tau) = r(\tau) e^{i \theta(\tau)} $, where $r$ and $\theta$ are real valued functions of $\tau$. Using the chain rule and the product rule we find \[ \Big( 2i \dfrac{dr}{d\tau} - 2 r(\tau) \dfrac{d \theta}{d \tau} + 3r^3(\tau) \Big) e^{i \theta(\tau)} = 0 \] Equating the real and imaginary parts we find \[ \left\{ \begin{array}{l l} \dfrac{d r}{d \tau} &= 0 \\ \dfrac{d\theta}{d\tau} &= \dfrac{3}{2} r^2(\tau) \end{array} \right.\] The solutions to these are found simply: $r(\tau) = r_0$ and $\theta(\tau) = 3/2 \ {r_0}^2 \tau + \theta_0 $.
This gives us the general solution \[ A(\tau) = r_0 e^{i(\theta_0 + \frac{3}{2} {r_0}^2 \tau)} \] Substituting this solution into our equation for $Y_0$ we get \[ Y_0 = 2 r_0 \cos \Big( \theta_0 + \dfrac{3}{2} {r_0}^2 \tau + t \Big) \] and now we just need to match the BCs to complete our solution on the timescale $\tau = \epsilon t$. The BCs to match are $y(0) = a, \quad y'(0) = 0$, from which we find that \[ r_0 = a/2, \quad \theta_0 = 0 \] and we arrive at our two-scale approximation to the nonlinear Duffing equation \[ y(t) = a \cos \Big( \Big( 1 + \epsilon \dfrac{3a^2}{8} \Big) t \Big) + \mathcal{O}(\epsilon) \] We can repeat this process to determine the higher order terms, eliminating all the secular terms that appear via this method. The result of this procedure is that our perturbation series is now valid over a much larger range: as $\tau = \epsilon t$, our expansion remains good as long as $t \ll 1/\epsilon^2 $.
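
As a check on this result we can integrate the Duffing equation numerically and compare; a scipy sketch with the illustrative values $\epsilon = 0.1$ and $a = 1$, evaluated well past $t \sim 1/\epsilon$:

```python
import numpy as np
from scipy.integrate import solve_ivp

eps, a = 0.1, 1.0

# y'' + y + eps*y**3 = 0 written as a first-order system.
rhs = lambda t, s: [s[1], -s[0] - eps * s[0]**3]
sol = solve_ivp(rhs, [0, 50], [a, 0.0], dense_output=True,
                rtol=1e-10, atol=1e-10)

two_scale = lambda t: a * np.cos((1 + 3 * eps * a**2 / 8) * t)

for tv in [1.0, 10.0, 25.0, 50.0]:
    print(f"t={tv:5}: numeric={sol.sol(tv)[0]: .4f}"
          f"  two-scale={two_scale(tv): .4f}")
```

The two stay in step out to large $t$, up to the expected $\mathcal{O}(\epsilon)$ error, whereas the secular series above would have drifted off to infinity.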

Final comment

When we've talked about perturbation theory it has always been for equations where there is some term that is small and can be set to zero. However, perturbation theory also works when we have a very large term which dominates the equation. I mention this here because, now that we've talked about rescaling, we can see why. Consider a quadratic equation like the one we worked with earlier, $ x^2 - x + \alpha = 0$, except this time $\alpha$ is a very large parameter, $\alpha \gg 1$. We can transform this equation by making the variable change $x = \alpha y$ to get \[ y^2 - \dfrac{y}{\alpha} + \dfrac{1}{\alpha} = 0 \] If we now define $\epsilon = 1/\alpha$ we have a regular perturbation theory problem \[ y^2 - \epsilon y + \epsilon = 0 \] Through a simple rescaling we can extend perturbation theory to deal with the very small and the very large.
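
The rescaling itself is a one-line sympy check:

```python
import sympy as sp

y, alpha = sp.symbols('y alpha', positive=True)

# Substitute x = alpha*y into x**2 - x + alpha and divide through by alpha**2.
print(sp.expand(((alpha * y)**2 - alpha * y + alpha) / alpha**2))
# -> y**2 - y/alpha + 1/alpha
```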

Friday 13 September 2013

Perturbation Theory Part I

So what is perturbation theory, and why should you care? Perturbation theory comprises a whole host of mathematical techniques that can be applied to many classes of problems to find approximate solutions. Typically, problems that can be solved through perturbation theory are close to a problem with a closed form, analytic solution but have some additional small term, say for example: \[ \left\{ \begin{array}{l l} u'' + u + \epsilon u^3 = 0\\ u(0) = a \\ u'(0) = 0 \\ \end{array} \right.\] This is a nonlinear ODE with no elementary analytic solution. However, what if $\epsilon$ takes a small value, say $\epsilon \ll 1$? Then maybe we can make some progress. It seems like an intuitive guess that when $\epsilon$ is sufficiently small the solution to this ODE isn't going to be influenced very much by this term. This allows us to think about the problem in a different way: as a simple analytic problem that we can solve, but with some additional 'perturbation' due to the small nonlinear term. So in the case of this example we expect that the solution to this equation should be similar to the solution of \[ \left\{ \begin{array}{l l} u'' + u = 0\\ u(0) = a \\ u'(0) = 0 \\ \end{array} \right.\] which is of course $u(x) = a \cos x$.

So perturbation theory gives us solutions to difficult problems in terms of a solution to an easy problem with some additional correction terms. This approach is very fruitful in subjects like physics, where any real, non-idealised system is likely to have irregularities that lead to non-linear differential equations. Perturbative solutions also carry some advantages over numerical methods: we do not need to fix numerical values for any parameters, and perturbative solutions offer more physical insight into a given problem.

Perturbation theory: ideas and theorems

So now that we can see the uses and motivations behind perturbation theory, we need to introduce the framework in which it sits. But before we do that we need to introduce some new notation.

Let $f(x)$ and $g(x)$ be two functions defined around the point $x=x_0$, then
$f = o(g)$ as $x \to x_0$ if $\lim_{x \to x_0} f / g = 0$
$f = \mathcal{O}(g)$ as $x \to x_0$ if $\lim_{x \to x_0} f / g = \text{const} \ne 0$
$f \sim g$ as $x \to x_0$ if $\lim_{x \to x_0} f / g = 1$

Regular perturbation

There are two important ideas we need to do perturbation theory: first, the perturbation series, which we shall use as an approximation tool; and second, the fundamental theorem of perturbation theory, which allows us to determine the functions in a perturbation series.

First, the perturbation series. Perturbation theory borrows heavily from power series: we want to expand our difficult problem in terms of a small parameter $\epsilon$ around $\epsilon =0$. Naturally there are caveats; however, for sufficiently 'nice' functions this is fine. \[ f(x;\epsilon) = f_0(x) + \epsilon f_1(x) + \epsilon^2 f_2(x) + \cdots \] where the $\{ f_i(x) \}$ are functions to be determined.

Second, the fundamental theorem of perturbation theory, which states: if an asymptotic expansion satisfies \[ A_0 + A_1 \epsilon + \cdots + A_N \epsilon^N \equiv \mathcal{O}(\epsilon^{N+1}) \] for all sufficiently small $\epsilon$, and all the coefficients $A_i$ are independent of $\epsilon$, then \[ A_0 = A_1 = \cdots = A_N = 0 \] This is a very important result as it allows us to solve a perturbation series separately at each order of $\epsilon$ and thus determine all of the functions $\{ f_i(x) \}$ in our perturbation series.

An algorithmic approach

With the important ideas of perturbation theory in hand we turn our attention towards how to solve a general perturbation problem. The following approach will work for most perturbation problems.

  1. Set $\epsilon = 0$ and solve the resulting, unperturbed, equation
  2. Allow $\epsilon$ to be non-zero and formulate the equation in terms of a perturbation series; namely, let $f(x)$ be specified by the power series \[ f = f_0 + \epsilon f_1 + \epsilon^2 f_2 + \cdots \] and substitute this into all the $f(x)$ terms in the original equation.
  3. Now collect terms of equal order in $\epsilon$ and solve the resulting system of equations up to a suitable order in $\epsilon$; here we have made use of the fundamental theorem of perturbation theory. The approximate solution to the original equation is found by placing the results of this system into the perturbation series.

Some examples
Polynomials

Lots of talk and no real examples doesn't help understanding, so let's choose a really simple example, one that we could solve exactly. Consider the quadratic equation $x^2 - 4x + 3 + \epsilon = 0, \ \epsilon \ll 1 $. The solution to this can be found trivially via the quadratic formula and is \[ x = 2 \pm \sqrt{1-\epsilon} \] But what if we didn't know that? Let's instead apply perturbation theory to this problem and see what we get, working through the problem using the algorithm I suggested above.

  1. First let's identify the unperturbed problem. Of course we just set $\epsilon = 0$ and see what we're left with \[ x^2 - 4x + 3 = 0 \] Using the quadratic formula we find the solution to this problem is \[ x = 2 \pm 1 \] We've identified and solved the unperturbed problem, so let's proceed to the next stage.
  2. Let's assume we can write $x$ in a perturbation series as \[x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots \] Next we substitute this into our original equation to give us \[ (x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots)^2 - 4(x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots) + 3 + \epsilon = 0 \] We now have the full problem written in terms of a perturbation series, so let's proceed to the final step.
  3. Now we solve the perturbation series by collecting terms of equal order in $\epsilon$ and solving the resulting system of equations. Collecting terms of equal orders \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: {x_0}^2 - 4 x_0 + 3 &=& 0 \\ \mathcal{O}(\epsilon) &: 2x_0 x_1 - 4x_1 + 1 &=& 0 \\ \mathcal{O}(\epsilon^2) &: {x_1}^2 + 2 x_2 x_0 - 4 x_2 &=& 0 \\ \vdots \end{array} \right.\] Solving perturbation series problems by hand can be rather tedious, so we shall only proceed up to second order in $\epsilon$. The solution to this system of equations can be computed iteratively very simply and we find that \[ x_0 = 2 \pm 1 , \quad x_1 = \mp 1/2, \quad x_2 = \mp 1/8 \] So letting $x_{\pm}$ denote the larger and smaller roots respectively we find our approximate solutions, \[ \left\{ \begin{array}{l l} x_+ = 3 - \epsilon/2 - \epsilon^2/8 + \cdots \\ x_- = 1 + \epsilon/2 + \epsilon^2/8 + \cdots \\ \end{array} \right.\] Perturbation success!
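
This order-by-order bookkeeping is exactly what a CAS is good at; a sympy sketch that reproduces both roots:

```python
import sympy as sp

eps, x0, x1, x2 = sp.symbols('epsilon x0 x1 x2')

x = x0 + eps * x1 + eps**2 * x2
expr = sp.expand(x**2 - 4 * x + 3 + eps)

# Solve order by order, feeding each result into the next equation.
roots = []
for r0 in sp.solve(expr.coeff(eps, 0), x0):      # O(1): x0 = 1 or 3
    vals = {x0: r0}
    vals[x1] = sp.solve(expr.coeff(eps, 1).subs(vals), x1)[0]
    vals[x2] = sp.solve(expr.coeff(eps, 2).subs(vals), x2)[0]
    roots.append(sp.expand(x.subs(vals)))
print(roots)   # [eps**2/8 + eps/2 + 1, -eps**2/8 - eps/2 + 3]
```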

The next question we should ask is how good this approximate result is compared to the actual solution. That of course is why I picked such a simple example, as we can directly compare our perturbation solution to the analytic one. The analytic solution from before is \[ x = 2 \pm \sqrt{1-\epsilon} \] To compare this to our approximation we write $\sqrt{1-\epsilon}$ as a Taylor series in $\epsilon$: \[ \sqrt{1-\epsilon} = 1 - \epsilon/2 - \epsilon^2/8 - \epsilon^3/16 + \mathcal{O}(\epsilon^4) \] Putting this expansion into the analytic solution we find \[ x = 2 \pm 1 \mp \epsilon/2 \mp \epsilon^2/8 \mp \epsilon^3/16 + \mathcal{O}(\epsilon^4) \] Comparing the solutions, they match up to $\mathcal{O}(\epsilon^3)$, which was as far as we chose to continue our perturbation series. In this case the perturbation series actually converges to the correct solution given an infinite number of terms; our truncated series provides a very good approximation to the actual result.
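
The Taylor expansion itself is a one-line sympy check:

```python
import sympy as sp

eps = sp.symbols('epsilon')
print(sp.series(sp.sqrt(1 - eps), eps, 0, 4))
# -> 1 - eps/2 - eps**2/8 - eps**3/16 + O(eps**4)
```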

Differential equations

As I implied earlier, the ideas in perturbation theory have very broad areas of application; we can also use them to find approximate solutions to differential equations. Consider the boundary value problem \[ \left\{ \begin{array}{l l} u'(x) = \epsilon u^2, \quad x>0 \\ u(0) = 1 \end{array} \right.\] This is a very simple example but it demonstrates the ideas involved in applying regular perturbation theory to differential equations. Again let us use the algorithm I originally proposed to solve this ODE using perturbation theory.

  1. Let's set $\epsilon = 0$ and identify the unperturbed problem \[ \left\{ \begin{array}{l l} u'(x) = 0 \\ u(0) = 1 \end{array} \right.\] The solution to the unperturbed problem is trivially \[ u(x) = 1 \] Now we move on to identify the perturbed problem.
  2. We assume that we can write $u(x)$ as a perturbation series \[ u(x) = u_0 + \epsilon u_1 + \epsilon^2 u_2 + \cdots \] We now substitute this into our original equation, including the boundary conditions, and find \[ \left\{ \begin{array}{l l} (u_0 + \epsilon u_1 + \epsilon^2 u_2 + \cdots)' = \epsilon (u_0 + \epsilon u_1 + \epsilon^2 u_2 + \cdots)^2 \\ u_0(0) + \epsilon u_1(0) + \epsilon^2 u_2(0) + \cdots = 1 \end{array} \right.\] It's important to note here that the boundary conditions depend on the order of $\epsilon$ that we are interested in. In this case only the zeroth order boundary condition is inhomogeneous, $u_0(0)=1$; all the higher order terms are homogeneous, having $u_n(0)=0$ for $n \ne 0$. Now that we've formulated the problem in terms of a perturbation series let's proceed to the final step.
  3. Now we collect all the terms of equal order in $\epsilon$ and form a system of equations that we need to solve. The system is \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: {u_0}' &=& 0 \\ \mathcal{O}(\epsilon) &: {u_1}' - {u_0}^2 &=& 0 \\ \mathcal{O}(\epsilon^2) &: {u_2}' - 2{u_0}{u_1} &=& 0 \\ \vdots \end{array} \right.\] and has the corresponding system of boundary conditions \[ \left\{ \begin{array}{l l} \mathcal{O}(1) &: u_0(0) = 1 \\ \mathcal{O}(\epsilon) &: u_1(0) = 0 \\ \mathcal{O}(\epsilon^2) &: u_2(0) = 0 \\ \vdots \end{array} \right.\] As with our previous example it is not difficult to solve this system of equations; proceeding iteratively as before we find that the solutions are \[ u_0 = 1, \quad u_1 = x, \quad u_2 = x^2 \] Substituting these values into the perturbation series we find the approximate solution to the ODE is given by \[ u(x) = 1 + \epsilon x + \epsilon^2 x^2 + \cdots \] Perturbation success!
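
A sympy sketch confirms that the truncated series satisfies the ODE up to the order we solved for:

```python
import sympy as sp

x, eps = sp.symbols('x epsilon')

u = 1 + eps * x + eps**2 * x**2   # the truncated perturbation series

# The residual of u' = eps*u**2 should contain only O(eps**3) terms.
print(sp.expand(u.diff(x) - eps * u**2))
# -> -3*eps**3*x**2 - 2*eps**4*x**3 - eps**5*x**4
```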

Again we ask ourselves how good this approximation is compared to the actual solution and, as this is a differential equation and $u$ is a function of $x$, whether there is a limit on the values of $x$ for which our approximate solution remains good. First we need to determine the exact solution to the problem; this can be done in a number of ways, and the result is \[ u(x) = \dfrac{1}{1-\epsilon x} \] Examining the behaviour of these solutions over a range of values provides a number of insights. First, the analytic solution has a singularity when $\epsilon x = 1$; we cannot find this behaviour in our perturbative solution. However, is this a reasonable expectation, and over what range do we expect our expansion to be valid?

It turns out in this case that our perturbation series is identical to the power series in $\epsilon x$ of $1/(1-\epsilon x)$, whose convergence requires $\epsilon x < 1$. So with the small number of terms we have in our approximate solution, the only range over which we expect our solution to be good is one where $\epsilon x \ll 1$. Solving this for $x$ leads to the range over which our expansion is good, namely $ 0< x \ll 1/\epsilon $.
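
That identity is again a one-line sympy check:

```python
import sympy as sp

x, eps = sp.symbols('x epsilon')
print(sp.series(1 / (1 - eps * x), eps, 0, 3))
# -> 1 + eps*x + eps**2*x**2 + O(eps**3)
```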

But what about the more general case, when we can't find a power series to check where our solution is valid? It's often problem specific; however, another way to see why our approximation fails is to look at the perturbation series itself. The second term in the series is $\epsilon x$, which is of order $\mathcal{O}(\epsilon)$, so all is good; but when $x \sim 1/\epsilon$ our first order perturbation is now of zeroth order, $\mathcal{O}(1)$. The way in which we separated out the perturbation terms into orders of $\epsilon$ is no longer correct and so our solution fails.