Numerical Methods 3

Takeaways

  • Finite difference formulas approximate derivatives by relating function values in a discretized domain
  • Direct method is for solving boundary value problems. We discretize the domain, insert finite difference formulas into the ODE, and use these to assemble a system of equations we can solve for the solution.
  • Multistep methods use multiple previous solution points to solve for the next point, and are good for solutions with lots of oscillations
  • Shooting method is a way of solving BVPs where we treat it as an IVP, then find the initial conditions that give a solution which satisfies the boundary conditions

Overview

In this last set of notes on numerical methods, we address a very important class of problems: boundary value problems. These problems are extremely important both for ordinary differential equations and for partial differential equations (which are covered in CME 104). We first go through the basics of these problems, and then a method for solving them known as the direct method.

After this, we cover a couple of remaining schemes that are important in the broader field of numerical mathematics, and are good for you to be exposed to at this early point in your engineering career. The discussion of these methods, though, will be relatively high-level, since they are much more advanced than the other numerical methods in this course.

Boundary Value Problems

A boundary value problem is a different type of ODE problem, but before we dive into them, let’s talk a bit about initial value problems, the type of ODE problem we’ve been studying up until now in this course. By drawing parallels between initial value problems (IVPs) and boundary value problems (BVPs), hopefully BVPs will be easier to understand. Do not think of these as interchangeable, though; at the bottom of these notes we briefly discuss some of the fundamental differences between BVPs and IVPs.

IVPs are ODEs where all of the information for the problem is prescribed at the beginning of the domain. An example would be:

\[y'' + 4y = \cos(x), \qquad y(0) = 5, \; y'(0) = 0\]

Recall that initial conditions “pin down” the solution to a single solution, since we use initial conditions to solve for constants of integration. In a differential equation, the order of the equation is equal to the number of constants of integration we will have. The constants of integration function as extra degrees of freedom, and the initial conditions add extra constraints on the problem so we do not have any free parameters.

In BVPs, instead of prescribing the information at the beginning of the domain, we prescribe it at the boundaries of the domain, hence why it is called a boundary value problem. An example of a BVP would be:

\[y'' + 4y = \cos(x), \qquad y(0) = 5, \; y'(\pi) = 0\]

See how we have defined two constraints, one at each boundary of the domain? These are known as boundary conditions, and understanding the different types is fundamental to working with BVPs.

As a point to make before we move on: when it comes to solving BVPs analytically, keep in mind that we can use the exact same solution methods as we used for IVPs. For instance, the above example would be solved using the characteristic equation and undetermined coefficients. The solution methods get you to the solution with the constants of integration; BVPs just define the information used to solve for these constants differently from how it’s defined for IVPs.
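To make this concrete, here is the example above worked out. The characteristic equation \(r^2 + 4 = 0\) gives the homogeneous solution \(A\cos(2x) + B\sin(2x)\), and undetermined coefficients with the guess \(y_p = a\cos(x)\) gives the particular solution \(\frac{1}{3}\cos(x)\):

\[\begin{gather} y = A\cos(2x) + B\sin(2x) + \tfrac{1}{3}\cos(x) \\ y(0) = A + \tfrac{1}{3} = 5 \;\Rightarrow\; A = \tfrac{14}{3} \\ y'(\pi) = -2A\sin(2\pi) + 2B\cos(2\pi) - \tfrac{1}{3}\sin(\pi) = 2B = 0 \;\Rightarrow\; B = 0 \end{gather}\]

The boundary conditions pin down both constants of integration, just as initial conditions would.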

Boundary Conditions

There are three types of boundary conditions for BVPs, appropriately named Type I, II, and III. It is important to be able to quickly identify these not so much for analytical solutions, but for the numerical solutions. Specifically, you will need to treat Type II and III conditions quite differently from a Type I condition when using some of the numerical methods discussed here. For this reason, it is worth going over each condition.

Type I boundary conditions (BCs), also known as Dirichlet (pronounced “der-i-shlay”) BCs, are where the function value is fixed at one end of the domain. In the example in the previous section, the left BC at \(x = 0\) is a Dirichlet condition since we are fixing the function value there. For a physical example, this would be like holding the temperature at one end of a heated rod at some constant, or grounding the voltage so that \(V = 0\) at the boundary.

Type II BCs, known as Neumann BCs, are where we define the derivative at one end of the domain. In the above example, this is the type of BC we impose at \(x = \pi\). A physical version of this would be if we were modeling the temperature of a heated rod, and one end of the rod was perfectly insulated so that \(T' = 0\) at that boundary.

Type III BCs, known as Robin BCs, are where the boundary condition is defined as some relationship between the function value and its derivative at the boundary. (In CME 102, this relationship will always be linear, but in practice it may not be.) An example of a problem with a Robin BC is:

\[y'' + 4y = \cos(x),\qquad y(0) = 5,\; 2y(\pi) + y'(\pi) = 5\]

It’s difficult to define an exact physical analogue for a Robin BC, but they often arise in thermal and optical systems.

Finally, keep in mind that we need at least one boundary condition that involves the function value itself, i.e. a Type I condition or a Type III condition. If both ends are defined just in terms of derivatives (two Type II conditions), we can be back to where we started: if the ODE has no undifferentiated \(y\) term, adding a constant to any solution gives another solution, so we still have a degree of freedom in the system.

Finite Difference Formulas

The next major topic in numerical methods we need to discuss is finite difference schemes. Finite difference schemes, in one sentence, use the differences between function values at discrete points to approximate the function’s derivatives. These are incredibly important in solving differential equations, as will become apparent very shortly.

Finite differences are in many ways a dual to the other numerical methods we’ve discussed so far. Previously (e.g. in forward Euler), we assumed that we could calculate the derivative at a given point, that we had the function value at that point, and that by combining these we could calculate the next function value. In finite difference schemes, we assume we have the function values at discrete points (we’ll call these points nodes), and we use these to estimate the derivative at a point.

The simplest finite difference schemes are known as forward and backward differences. The forward difference formula is:

\[\begin{gather} y'_i = \frac{y_{i+1} - y_i}{h} \end{gather}\]

It is a forward difference because we’re estimating the derivative with the next function value. The backward difference formula is similarly:

\[\begin{gather} y'_i = \frac{y_{i} - y_{i-1}}{h} \end{gather}\]

We won’t dive into the accuracy analysis here (it is given in the course reader or easily found in any introductory textbook on numerical methods), but these schemes are both first order. However, similarly to when we derived the trapezoidal method, if we average these together we can obtain a second order accurate method:

\[\begin{gather} y'_i = \frac{y_{i+1} - y_{i-1}}{2h} \end{gather}\]

This is known as a central difference scheme, since we are using the function values at the nodes directly adjacent to the current node.

We can also estimate second derivatives using a central difference scheme:

\[\begin{gather} y''_i = \frac{y_{i+1} - 2y_i + y_{i-1}}{h^2} \end{gather}\]

This scheme is second order accurate. The error analysis is a bit involved and unimportant to what we’ll be doing with it here. The easiest way to think of deriving this formula is applying the backward difference formula to \(y'\) computed with forward differences:

\[y''_i = \frac{y'_i - y'_{i-1}}{h} = \frac{\frac{y_{i+1} - y_i}{h} - \frac{y_i - y_{i-1}}{h}}{h} = \frac{y_{i+1} - 2y_i + y_{i-1}}{h^2}\]
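To see these orders of accuracy in action, here is a short Python sketch (hypothetical; the course itself uses MATLAB) that applies each formula to \(\sin(x)\) and compares against the exact derivatives:

```python
import numpy as np

# Compare each finite difference formula against exact derivatives of sin(x).
f = np.sin
x0, h = 1.0, 0.01
exact_d1, exact_d2 = np.cos(x0), -np.sin(x0)

forward   = (f(x0 + h) - f(x0)) / h                     # first order
backward  = (f(x0) - f(x0 - h)) / h                     # first order
central   = (f(x0 + h) - f(x0 - h)) / (2 * h)           # second order
central_2 = (f(x0 + h) - 2 * f(x0) + f(x0 - h)) / h**2  # second order, y''

print(f"forward error:   {abs(forward - exact_d1):.2e}")
print(f"backward error:  {abs(backward - exact_d1):.2e}")
print(f"central error:   {abs(central - exact_d1):.2e}")
print(f"2nd deriv error: {abs(central_2 - exact_d2):.2e}")
```

Halving \(h\) should roughly halve the one-sided errors, but cut the central-difference errors by a factor of four.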

A couple notes before we move on to the next (much harder) part:

  1. We can actually derive finite difference schemes for any higher order derivatives—we are not just limited to first and second derivatives (these schemes would be pretty useless if they had this limitation).
  2. Not all higher order schemes are central difference schemes; indeed, there are many schemes known as one-sided schemes that only use values to the left or right.

With that said, let the fun part begin.

Putting It Together: Direct Method

Now it’s time to put together all this stuff about boundary conditions and finite differences into a method for solving BVPs. Finite differences form a linear relationship between nodes in the domain, which suggests that we might be able to form a linear system of equations for the function values at the nodes.

This is exactly what we do with direct method: we plug the finite difference approximations to the derivatives into the ODE, rearrange these into a linear system, then solve the linear system for the function values at all of the nodes. This method is called direct method because we are finding the node function values all at once, as opposed to finding them sequentially like we did with forward Euler or RK4.

To see how this is done, we’ll work through the example from above. Let’s start with the simple case, where we have two type I boundary conditions:

\[\begin{gather} y'' + 4y = \cos(x),\qquad y(0) = 5,\; y(\pi) = 0 \end{gather}\]

We first need to insert the finite difference approximations into the equations. (Unless otherwise stated, it is safe to assume you should be using central difference schemes, since they are more accurate than the one-sided schemes we’ve studied here.) To do this, simply substitute the finite differences for the derivatives:

\[\begin{gather} \frac{y_{i+1} - 2y_i + y_{i-1}}{h^2} + 4y_i = \cos(x_i) \end{gather}\]

Note that we are only writing the equation out for a single arbitrary node \(i\). This is because the above equation applies to all nodes in the domain. (We just need to treat the boundary nodes slightly differently, as you’ll see below.) Next, we need to rearrange the equation so that the coefficients for the \(y_i\)’s are all grouped together:

\[\begin{gather} y_{i-1} + (4h^2-2)y_i + y_{i+1} = h^2 \cos(x_i) \end{gather}\]

This equation is known as the recursive equation for direct method, because it recursively relates all non-boundary nodes in our domain. Be sure to know this term; it will come up a lot when you do direct method.

Now, we need to treat the boundary conditions. When we have a type I boundary condition, the function value at the boundary node is fixed, so we do not solve for it. Instead, we plug this known value into the recursive equation for the node adjacent to the boundary.

Suppose we’re solving for \(N = 5\) nodes (two boundary nodes and three interior) in this example. Then, we would plug the boundary value \(y_1 = 5\) into the equation for \(i = 2\), and the value \(y_5 = 0\) into the equation for \(i = 4\). Doing this:

\[\begin{gather} y_1 + (4h^2-2)y_2 + y_3 = h^2 \cos(x_2) \\ 5 + (4h^2-2)y_2 + y_3 = h^2 \cos(x_2) \\ (4h^2-2)y_2 + y_3 = h^2 \cos(x_2) - 5\\ y_3 + (4h^2-2)y_4 + y_5 = h^2 \cos(x_4) \\ y_3 + (4h^2-2)y_4 = h^2 \cos(x_4) \end{gather}\]

If we also write down the equation at \(i = 3\):

\[\begin{gather} y_2 + (4h^2-2)y_3 + y_4 = h^2 \cos(x_3) \end{gather}\]

These two boundary equations, together with this recursive equation at node \(i = 3\), give us a full linear system for \(y_2\), \(y_3\), \(y_4\):

\[\begin{bmatrix} (4h^2-2) & 1 & 0 \\ 1 & (4h^2-2) & 1 \\ 0 & 1 & (4h^2-2) \end{bmatrix} \begin{bmatrix} y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} h^2 \cos(x_2) - 5 \\ h^2 \cos(x_3) \\ h^2 \cos(x_4) \end{bmatrix}\]

We can easily solve this linear system to find the numerical solution.
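As a sanity check, here is a minimal Python sketch (hypothetical, mirroring what you might write in MATLAB; the variable names are my own) that assembles and solves this system for a general number of nodes \(N\):

```python
import numpy as np

# Direct method for y'' + 4y = cos(x), y(0) = 5, y(pi) = 0, on a uniform
# grid. N counts all nodes, including the two boundary nodes.
N = 5
x = np.linspace(0.0, np.pi, N)
h = x[1] - x[0]

n = N - 2                       # number of interior (unknown) nodes
A = np.zeros((n, n))
b = h**2 * np.cos(x[1:-1])      # right-hand side at the interior nodes

for k in range(n):
    A[k, k] = 4 * h**2 - 2      # coefficient on y_i in the recursive equation
    if k > 0:
        A[k, k - 1] = 1.0       # coefficient on y_{i-1}
    if k < n - 1:
        A[k, k + 1] = 1.0       # coefficient on y_{i+1}

# Move the known boundary values over to the right-hand side.
b[0]  -= 5.0                    # y(0) = 5 enters the first interior equation
b[-1] -= 0.0                    # y(pi) = 0 enters the last interior equation

y_interior = np.linalg.solve(A, b)
y = np.concatenate(([5.0], y_interior, [0.0]))
print(y)
```

With \(N = 5\) this reproduces exactly the \(3 \times 3\) system above.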

A Harder Case: Neumann and Robin Boundaries

Discouragingly, this last example was the easy case since this problem had two type I BCs. What if we had a type II or type III boundary condition? Problems with Neumann and Robin boundary conditions prove to be trickier, but if we are careful about things I think you’ll find it’s actually reasonably straightforward.

Let’s consider a modified version of the previous example, where we change the ODE to have a Robin boundary condition:

\[y'' + 4y = \cos(x),\qquad y(0) = 5,\; 2y(\pi) + y'(\pi) = 5\]

First of all, changing the boundary conditions does not change the recursive equation. This is the same ODE as in the last part, so it has the same recursive equation:

\[\begin{gather} y_{i-1} + (4h^2-2)y_i + y_{i+1} = h^2 \cos(x_i) \end{gather}\]

Let’s suppose we are still using \(N = 5\) nodes to solve for the numerical solution. We next need to derive the equation at the boundaries. Note that the left boundary is unchanged, so this equation is also the same:

\[\begin{gather} (4h^2-2)y_2 + y_3 = h^2 \cos(x_2) - 5 \end{gather}\]

Now comes the hard part: we need to find the equation at the right boundary. The right boundary is a type III BC, so the function value is not explicitly prescribed and we therefore need to solve for it. Let’s start by writing out the equation at the right boundary node:

\[\begin{gather} y_4 + (4h^2-2)y_5 + y_6 = h^2 \cos(x_5) \end{gather}\]

Uh-oh, what’s wrong with this equation? We have \(y_6\), but our domain is only defined up to \(y_5\). We need to come up with a way to eliminate this point outside the domain.

To do this, we use the boundary condition \(2y(\pi) + y'(\pi) = 5\). To treat a type III (or II) BC, we start by inserting our finite difference schemes into the formula just like we did to derive the recursive equation:

\[\begin{gather} 2y_5 + \frac{y_6 - y_4}{2h} = 5 \end{gather}\]

Aha! The same point outside our domain appears in this equation too. This means we can solve for this point as a function of points within the domain using the boundary condition equation, then plug this expression back into the recursive equation at \(i = 5\):

\[\begin{gather} y_6 = y_4 - 4hy_5 + 10h \end{gather}\]

Plugging into the recursive equation above and rearranging:

\[\begin{gather} y_4 + (4h^2-2)y_5 + (y_4 - 4hy_5 + 10h) = h^2 \cos(x_5) \\ 2y_4 + (4h^2 - 4h - 2)y_5 = h^2 \cos(x_5) - 10h \end{gather}\]

Now we have a nice equation that is defined only in terms of nodes that lie within the domain. Basically, we introduced a point outside the domain, which we call a ghost point, and used this point to substitute the boundary condition into the recursive equation, thereby enforcing the BC.

Now that we have the boundaries taken care of, we can assemble the final linear system:

\[\begin{gather} \begin{bmatrix} (4h^2-2) & 1 & 0 & 0 \\ 1 & (4h^2-2) & 1 & 0 \\ 0 & 1 & (4h^2-2) & 1 \\ 0 & 0 & 2 & (4h^2 - 4h - 2)\end{bmatrix} \begin{bmatrix} y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix} = \begin{bmatrix} h^2 \cos(x_2) - 5 \\ h^2 \cos(x_3) \\ h^2 \cos(x_4) \\ h^2 \cos(x_5) - 10h \end{bmatrix} \end{gather}\]

which we can solve pretty easily in MATLAB (or by hand if you’re really enterprising).
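Here is how that might look in code: a hypothetical Python sketch that reuses the assembly from before, but replaces the last row with the ghost-point equation:

```python
import numpy as np

# Direct method for y'' + 4y = cos(x), y(0) = 5, 2y(pi) + y'(pi) = 5,
# eliminating the ghost point y_6 = y_4 - 4*h*y_5 + 10*h. N = 5 as above.
N = 5
x = np.linspace(0.0, np.pi, N)
h = x[1] - x[0]

n = N - 1                           # unknowns y_2 ... y_5 (y_1 = 5 is known)
A = np.zeros((n, n))
b = h**2 * np.cos(x[1:])            # right-hand side at nodes 2..5

for k in range(n - 1):              # interior rows use the recursive equation
    A[k, k] = 4 * h**2 - 2
    if k > 0:
        A[k, k - 1] = 1.0
    A[k, k + 1] = 1.0
b[0] -= 5.0                         # Dirichlet value folded into the first row

# Last row: the recursive equation at i = 5 after substituting the ghost
# point expression derived from the discretized Robin condition.
A[-1, -2] = 2.0
A[-1, -1] = 4 * h**2 - 4 * h - 2
b[-1] -= 10 * h

y = np.concatenate(([5.0], np.linalg.solve(A, b)))
print(y)
```

With \(N = 5\) this reproduces the \(4 \times 4\) system above.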

This has been a lot, so to close let’s summarize the steps to solving a direct method problem:

  1. Insert finite difference formulas directly into the ODE.
  2. Find the recursive equation by rearranging things so that all the coefficients for each \(y_i\) value are grouped together. This is the equation you will use for the interior nodes.
  3. Find the equation at the boundaries. If it is a type I BC, we just plug the value directly into the equation. If it is a type II or III BC, we need to insert finite difference formulas into the expression for the BC, solve for the ghost points, and substitute this into the equation for the boundary node.
  4. Use the equations derived in steps 2 and 3 to assemble a linear system that can be solved for the function values at the nodes.

Direct method is one of the trickiest topics in introductory numerical analysis because it is so different from things you’ve seen before. However, this is an incredibly powerful and efficient method for solving spatial problems with constraints defined at the boundaries, and is something you will see everywhere as you progress in your engineering career. For this reason, be sure to take direct method to heart and really understand it on a conceptual and mathematical level.

Multistep Methods

The remainder of these notes is devoted to two ancillary topics in numerical solutions: multistep methods, and the shooting method. Don’t worry, though: we do not cover these in deep detail, as these topics are relatively advanced and are really only mentioned here for completeness.

Multistep methods are similar to the Euler methods or RK4 in that they are moving forward in time. The difference is that in the previous forward-propagating algorithms, we were only using \(y_i\) to solve for \(y_{i+1}\), but in multistep methods we’re using two or more previous \(y_i\)’s to find the numerical solution.

The first multistep method is the Adams-Bashforth method. There are two main versions of the Adams-Bashforth algorithm we study here: the second order formula given by:

\[y_{n+1} = y_n + \frac{h}{2} \left[3f(x_n, y_n) - f(x_{n-1}, y_{n-1})\right]\]

and the fourth order version:

\[y_{n+1} = y_n + \frac{h}{24} \left[55f(x_n, y_n) - 59f(x_{n-1}, y_{n-1}) + 37f(x_{n-2}, y_{n-2}) - 9f(x_{n-3}, y_{n-3})\right]\]

In both variants, notice that we use only previous function values (or evaluations of the derivative) to calculate the next function value. We are using two or more previous values, so it is a multistep method.
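Here is a hypothetical Python sketch of second order Adams-Bashforth (not course-provided code). Since the method needs two previous values of \(f\), the first step is bootstrapped with improved Euler; we say more about this below:

```python
import numpy as np

# Second order Adams-Bashforth for y' = f(x, y) -- a sketch, assuming a
# fixed step size h and a scalar ODE.
def ab2(f, x0, y0, h, n_steps):
    x = x0 + h * np.arange(n_steps + 1)
    y = np.zeros(n_steps + 1)
    y[0] = y0

    # Bootstrap: improved Euler is self-starting and also second order.
    k1 = f(x[0], y[0])
    k2 = f(x[0] + h, y[0] + h * k1)
    y[1] = y[0] + h * (k1 + k2) / 2

    f_prev = k1                         # f at the previous node, from memory
    for n in range(1, n_steps):
        f_curr = f(x[n], y[n])          # the only NEW f evaluation this step
        y[n + 1] = y[n] + h * (3 * f_curr - f_prev) / 2
        f_prev = f_curr
    return x, y

# Example: y' = -2y, y(0) = 1, exact solution exp(-2x).
x, y = ab2(lambda x, y: -2.0 * y, 0.0, 1.0, 0.05, 100)
print(abs(y[-1] - np.exp(-2.0 * x[-1])))  # global error, should be small
```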

Why use this method over, say, RK4? The main advantage of these methods is that we are able to achieve second order (or higher) accuracy while only needing to compute a single new function evaluation at each step. Remember that RK4 is extremely costly because it requires four separate evaluations of \(f(x,y)\) at every iteration. Here, we can simply store previous evaluations of \(f(x,y)\) and call them up from memory instead of recomputing. (For most problems, calling from memory is far cheaper than recomputing, so this is a viable strategy.)

The significant downside to Adams-Bashforth and other multistep methods is that they are not self-starting. This means that when we start calculating the numerical solution at the first time step, we need previous time steps that simply do not exist. In practice, we calculate the first few steps using a self-starting method like one of the single step methods we studied previously, but these can be inaccurate or very costly depending on the problem at hand.

The other multistep method is the Adams-Moulton method, which is a combination of the trapezoidal method and Adams-Bashforth. Adams-Moulton is a predictor-corrector method, in that we make a prediction of \(y_{n+1}\) using Adams-Bashforth, then correct it by plugging this prediction into a trapezoidal-style update. This follows the same train of thought as improved Euler, where we predicted \(y_{n+1}\) using forward Euler, then used this prediction to make a more accurate final estimate.

The formulas for second order Adams-Moulton are:

\[\begin{gather} y_{n+1}^* = y_n + \frac{h}{2} [3f(x_n,y_n) - f(x_{n-1}, y_{n-1})] \\ y_{n+1} = y_n + \frac{h}{2} [f(x_{n+1}, y_{n+1}^*) + f(x_n, y_n)] \end{gather}\]

See? It looks exactly like trapezoidal method, but we’re using Adams-Bashforth instead of forward Euler. There’s also a fourth order version of Adams-Moulton:

\[\begin{gather} y_{n+1}^* = y_n + \frac{h}{24} [55f(x_n, y_n) - 59f(x_{n-1}, y_{n-1}) + 37f(x_{n-2}, y_{n-2}) - 9f(x_{n-3}, y_{n-3})] \\ y_{n+1} = y_n + \frac{h}{24} [9f(x_{n+1}, y_{n+1}^*) + 19f(x_{n}, y_{n}) - 5f(x_{n-1}, y_{n-1}) + f(x_{n-2}, y_{n-2})] \end{gather}\]

where we use fourth order Adams-Bashforth as the predictor, and a different formula for the corrector. Notice that for both the second and fourth order variants, we need to compute \(f(x,y)\) twice (compared to just once for Adams-Bashforth), but the advantage of Adams-Moulton is we get slightly better accuracy and much improved stability.
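A sketch of the second order predictor-corrector, under the same assumptions as the Adams-Bashforth code above (improved Euler bootstraps the first step):

```python
import numpy as np

# Second order Adams-Moulton predictor-corrector for y' = f(x, y).
def am2(f, x0, y0, h, n_steps):
    x = x0 + h * np.arange(n_steps + 1)
    y = np.zeros(n_steps + 1)
    y[0] = y0

    k1 = f(x[0], y[0])
    k2 = f(x[0] + h, y[0] + h * k1)
    y[1] = y[0] + h * (k1 + k2) / 2      # bootstrap the first step

    f_prev = k1
    for n in range(1, n_steps):
        f_curr = f(x[n], y[n])
        # Predict with AB2, then correct with the trapezoidal-style update.
        y_star = y[n] + h * (3 * f_curr - f_prev) / 2
        y[n + 1] = y[n] + h * (f(x[n + 1], y_star) + f_curr) / 2
        f_prev = f_curr
    return x, y

# Same test problem as before: two f evaluations per step, better stability.
x, y = am2(lambda x, y: -2.0 * y, 0.0, 1.0, 0.05, 100)
print(abs(y[-1] - np.exp(-2.0 * x[-1])))
```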

In practice, we mainly use multistep methods because they are better at handling equations that have oscillations in the solution. (These methods are very commonly used in weather prediction, for example.) Intuitively this makes sense: we’re taking more of the previous points in time into account, so we should be able to more accurately determine oscillations. The mathematical details of this are actually pretty involved though, so we leave them out here.

Shooting Method

The last numerical method we will talk about in these notes is the shooting method. The shooting method is actually a surprisingly simple method when approached from the right perspective. We won’t go too deeply into the mathematical details or implementation, but hopefully this will shed some light on it.

The idea of the shooting method is to use solvers for IVPs to solve BVPs. To do this, we guess initial conditions, then iteratively solve the IVP starting at the left boundary using the guessed initial conditions and adjust the initial conditions based on the error at the right boundary. The motivation behind doing this is that we have lots of very fast and very accurate solvers for IVPs, so it seems natural to try to apply these to solving BVPs, since doing so would avoid us having to solve potentially very large systems of linear equations.
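Here is a minimal sketch of the idea in Python. Note the problem solved here, \(y'' + 4y = \cos(x)\) on \([0, 1]\) with \(y(0) = 5\), \(y(1) = 0\), is a hypothetical variant of our running example, chosen for illustration so that the BVP has a unique solution; the RK4 integrator and secant iteration are standard pieces, but this pairing is just one reasonable choice:

```python
import numpy as np

# Shooting method sketch: guess the unknown slope s = y'(0), integrate the
# IVP with RK4, and use the secant method to drive the right-boundary error
# to zero.
def y_at_right(s, n_steps=200):
    """Integrate the IVP with initial slope s; return y at x = 1."""
    h = 1.0 / n_steps
    f = lambda x, u: np.array([u[1], np.cos(x) - 4.0 * u[0]])  # u = [y, y']
    u, x = np.array([5.0, s]), 0.0
    for _ in range(n_steps):
        k1 = f(x, u)
        k2 = f(x + h / 2, u + h / 2 * k1)
        k3 = f(x + h / 2, u + h / 2 * k2)
        k4 = f(x + h, u + h * k3)
        u, x = u + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4), x + h
    return u[0]

# Secant iteration on the boundary residual r(s) = y(1; s) - 0.
s_prev, s = 0.0, 1.0
r_prev, r = y_at_right(s_prev), y_at_right(s)
for _ in range(50):
    if abs(r) < 1e-10:
        break
    s_prev, s = s, s - r * (s - s_prev) / (r - r_prev)
    r_prev, r = r, y_at_right(s)
print(f"converged slope y'(0) = {s:.6f}")
```

Because this ODE is linear, the boundary residual is a linear function of the guessed slope, so the secant iteration converges essentially in a single update.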

The shooting method has its share of limitations (particularly that the simple version described here only works reliably for linear ODEs; nonlinear problems require an iteration on the initial guess that may not converge). Furthermore, IVPs and BVPs are different in some very fundamental ways. In particular, IVPs evolve as we move forward in the domain, and all of the information that determines the solution is available at the beginning of the solution, whereas in BVPs, the system is constrained from both ends of the domain and information is shaping the solution from both ends simultaneously. Nevertheless, shooting method is a very interesting idea, and does find some applications in practice.