PHYS 251 - Introduction to Computer Techniques in Physics

Numerical Analysis - Algebraic Equations

Numerical Solution Using a Generalized Newton's Method


Notation: Equation numbers are placed in parentheses, (Eq. 1), at the end of the sentence preceding the mathematical presentation of the equation. They should not be read as if they are part of the sentence. When reference is made to an equation, the word "equation" will be spelled out and not abbreviated. Mathematical and Greek symbols are not in all cases precise, since they may depend on the browser used.


Mathematical Functions

It is assumed throughout this discussion that the reader is familiar with the concept of mathematical functions, such as, f(x) = x2 - 2x + 36, and f(x) = sin(x). If this is not the case, then the reader should consult an introductory calculus textbook.


System of Linear Algebraic Equations

It is natural to begin a consideration of numerical methods with algebraic and transcendental systems, since the arithmetical operations performed by modern computers are those of classical algebra.

The general system of n linear algebraic equations of nth order can be written as (Eq. 1)

a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1,
a21 x1 + a22 x2 + a23 x3 + ... + a2n xn = b2,
a31 x1 + a32 x2 + a33 x3 + ... + a3n xn = b3,
...
an1 x1 + an2 x2 + an3 x3 + ... + ann xn = bn.

If we let A be an ( n x n ) matrix of the coefficients, ann, x be the column vector of the variables, xn, and b be the column vector of the right-hand side, bn, the system can be written in compact vector notation as (Eq. 2)

A x = b.

For theoretical discussions Equation 2 is more convenient even though Equation 1 and 2 are equivalent.

For computational purposes, one wants to know that our system of equations, Equation 1, has one and only one solution before investing time and effort searching for something that does not exist. The basic theorem says that in order for Equation 1 to have one and only one solution, the determinant of A, written det ( A ), must be different from zero (Eq. 3)

det ( A ) = | A | not= 0.


Cramer's Rule for Solving a System of Linear Algebraic Equations

If one is dealing with systems of 2, 3, or 4 linear algebraic equations, a method, known as Cramer's rule, is a reasonable technique for finding a numerical solution. Let us demonstrate the technique through an example (Eq. 4)

x1 + x2 - x3 = 2,
x1 - x2 + x3 = 0,
- x1 + x2 + x3 = 0,

whose solution is x1 = 1, x2 = 1, and x3 = 0.

For this example the determinant of the coefficients ann is (Eq. 5a)

| A | = | ( a11, a12, a13 ), ( a21, a22, a23 ), ( a31, a32, a33 ) |
= a11 ( a22 a33 - a23 a32 ) + a12 ( a23 a31 - a21 a33 ) + a13 ( a21 a32 - a22 a31 ) = - 4.

Thus the solution for x1 using Cramer's rule is (Eq. 5b)

x1 = | A1 | / | A | = ( -4 ) / ( -4 ) = 1,

where | A1 | is the determinant of A in which the column of coefficients, a11 = 1, a21 = 1, and a31 = -1, is replaced by the column of b's, b1 = 2, b2 = 0, and b3 = 0. In a similar fashion x2 is given by (Eq. 5c)

x2 = | A2 | / | A | = ( -4 ) / ( -4 ) = 1,

where | A2 | is the determinant of A in which the column of coefficients, a12 = 1, a22 = -1, and a32 = 1, is replaced by the column of b's, b1 = 2, b2 = 0, and b3 = 0. Finally, the solution for x3 is given by (Eq. 5d)

x3 = | A3 | / | A | = ( 0 ) / ( -4 ) = 0,

where | A32 | is the determinant of A in which the column of coefficients, a13 = -1, a23 = 1, and a33 = 1, is replaced by the column of b's, b1 = 2, b2 = 0, and b3 = 0.


Example Using Cramer's Rule for Solving a System of Linear Algebraic Equations

For an example of using the Excel function MDETERM to do this problem follow the link Cramer's Rule. As should be fairly evident, Cramer's rule becomes cumbersome when the system consists of more than about four equations.


Newton's Method for Solving One Equation in One Unknown

In addition to Cramer's rule there are other methods for the solution of systems of linear algebraic equations, such as Gaussian Elimination and LU Decomposition. They are called direct methods in as much as, barring roundoff error accumulation, they lead to a numerical solution in a prescriptive fashion with a fixed number of steps. However, in general, they do not apply to the solution of nonlinear algebraic and transcendental systems of equations. The generalized Newton's method is one that can be applied to both linear and nonlinear systems. However, before considering the generalized Newton's method for systems of equations, let us consider the classical Newton's method, often referred to as a Newton-Raphson technique, applied to one equation in one unknown.

Let us begin with the equation

y = f(x) = 0,

where f(x) is some function, such as the polynomial (Eq. 6)

y = f(x) = x3 + (31/2) x2 - 2 x - 2 (31/2) = 0.

The problem is one of finding the value or values of x such that y = 0, that is we are searching for the zeros of y = f(x). Figure 1 shows the function f(x) graphed for 0 < x < 2.5.

Note that the function changes sign between the values of x = 1 and x = 1.5. Thus the value of x that makes the function zero is approximately x = 1.4. The Newton's method is an iterative scheme that utilizes the tangent to the curve at some point, say x = a. Regardless of whether x = a > xsolution or x = a < xsolution, the tangent line crosses, in general, the x-axis at some point, say x0. Since, in general, f( x0 ) not= 0, x0 is only a zero approximation to the root and we wish to develop a means of generating a new approximation, call it x1.

To do so, let us note that the derivative of the function f ( x = a ), denoted by f '(a), is tangent to the curve at the point x = a. Hence, we can use the definition of the derivative as the limit of the divided difference (Eq. 7)

( dy /dx )x=a = f '(x) = [ f(a) - f(x0) ] / ( a - x0 ),

in order to obtain an expression for the value of x0 by setting f ( x0 ) = 0 and re-arranging Equation 7, which yields (Eq. 8)

x0 = a - f(a) / f '(x).

Similarly for any iteration, Equation 8 can be generalized to produce an iterative relation (Eq. 9)

xn+1 = xn - f ( xn ) / f '( x n ) ,

which provides an expression for calculating an improved estimate of the root in terms of the preceding estimate, the function evaluated for the preceding estimate, and the derivative of the function evaluated for the preceding estimate.

As for issues of convergence, we find that

  1. Newton's method converges rapidly, but
  2. it does not always converge.

An alternative route to Equation 9 is through a Taylor expansion

f(x) = f ( xn ) + ( x - xn ) f '( xn ) + [ ( x - xn )2 / 2 ] f ''( xn ) + [ ( x - xn )3 / 6 ] f '''( xn ) + ...

in which we keep the first two terms only, set f(x) = 0, and re-label x as xn+1. The result is again Equation 9.

In order to obtain an estimate of the error made in each iteration, let us subtract the i+1 iteration from the actual value of the root, x, and change the iteration index to i (Eq. 10)

( x - xi+1 ) = ( x - xi ) + f ( xi ) / f '( x i ).

The difference ( x - xi+1 ) is the error in the i+1 iteration, call it ei+1, while ( x - xi ) is the error in the i iteration, call it ei, or

ei+1 = ei + f ( xi ) / f '( x i ) ,

which characterizes the error at one interation to the next in terms of the function and its derivative. However, we wold like to specify the error in terms of the tangent and its rate of change, f ''(x). We can do this by using the first three terms in the Taylor expansion, which yields the result that (Eq. 11)

ei+1 = - ei2 [ f ''( xi ) / f '( x i ) ] .

Thus for the tangent and its rate of change being reasonably constant near a root, the convergence according to Equation 11 is quadratic. But, note that if one makes an initial bad guess for the starting value x = a, so that e0 is greater than one, then the method diverges. Hence graphing the function in order to ascertain the approximate value of the root is almost a necessity in order converge the method.


Example of Newton's Method for One Equation in One Unknown

As an example let us consider the function given in Equation 6 above, whose solution is x = 21/2 = 1.41421 to five decimals. The solution using the Newton's method is given in the following document Newton's Method.


Generalized Newton's Method for One Equation in One Unknown

In general, the criterion for terminating the iterations in Newton's method is that the following expression has been satisfied (Eq. 12)

| xi+1 - xi | < e ,

where e is a small quantity that establishes the accuracy of the calculation and is dependent on the requirements of the application. It should be obvious that depending on the nature of the function f ( x ), such as being in the neighborhood of a local maximum or minimum where the tangent becomes nearly parallel to the x axis, the convergence can require a large number of iterations to achieve any reasonable accuracy.

For those functions requiring a large number of iterations, one can attempt to construct a line rather than the tangent at x0 = a for which xsolution < x+ < x1. Thus x+ is better approximation to the root than is x1. The equation of such a line can be written as (Eq. 13)

y = f ( x0 ) + t f '( x0 ) ( x - x0 ) ,

where t is a number close to unity. Setting y = 0 and x = x1 = x+, Equation 13 yields

x1 = x0 - [ f ( x0 ) / t f '( x0 ) ] ,

which can be generalized to produce a recursion relation for x. However, for notational purpose, let us set w = 1 / t with 0 < w < 2 so that the recursion relation becomes (Eq. 14)

xi+1 = xi - w [ f ( xi ) / f '( xi ) ] .

Equation 14 is known as a generalized Newton's method and w is called an over-relaxation parameter. Of course we recover the usual Newton's method when we set w = 1.


Generalized Newton's Method for a System of Two Algebraic Equations in Two Unknowns

Suppose one wishes to solve a system of two equations in two unknowns, such as (Eq. 15)

y1 = f1( x1, x2 ) = 0 ,
y2 = f2( x1, x2 ) = 0 .

A generalization of Equation 14 for iterative relations for x1 and x2 in Equation 15 will require us to change notation, since we are identifying the various independent variables x with a subscript. Thus, in order to identify the number of iterations, we will use the notation, xi(k), for the kth iteration of the ith variable. Also, since we are dealing with functions that are dependent on more than one independent variable, we must take the partial derivative with respect to each of the variables rather than the derivative df/dx = f '(x). The symbol we will use for the partial derivative is df1( x1, x2 )/dx1. Using this notation, the generalization of the generalized Newton's method for two functions in two unknowns is (Eq. 16)

x1(k+1) = x1(k) - w [ f1 ( x1(k), x2(k) ) / df1( x1(k), x2(k) )/dx1 ] ,
x2(k+1) = x2(k) - w [ f2 ( x1(k+1), x2(k) ) / df2( x1(k+1), x2(k) )/dx2 ].

It is reasonably obvious as to what would be the generalization of Equation 16 to a system of three algebraic equations in three unknowns. The point to note is that the new estimate for a variable is used in subsequent calculations involving that variable as is evident in Equation 16. That is to say, the k+1st estimate of x3 would use the k+1st estimates for x1 and x2, but the kth estimate for x3.


Example of Generalized Newton's Method for Two Equations in Two Unknowns

As an example let us consider the functions given below


Physics & Astronomy Department, George Mason University
Maintained by Amin Jazaeri, amin@physics.gmu.edu