A Solution of Project Three: Iterative Methods in Numerical Linear Algebra

Below is a possible solution to the project.

1. The methods of Jacobi and Gauss-Seidel

The methods of Jacobi and Gauss-Seidel are iterative techniques for solving linear systems of equations A*x = b.
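
Written out componentwise (in the usual notation, added here only for reference), one sweep of either method updates the j-th component of the approximation x by

$$ x_j^{\text{new}} \;=\; \frac{1}{a_{jj}} \Bigl( b_j - \sum_{k \ne j} a_{jk}\, x_k \Bigr), $$

where Jacobi evaluates the sum with the values of the previous sweep only, while Gauss-Seidel immediately uses the components x_1, ..., x_{j-1} that have already been updated during the current sweep.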

1.1 An implementation of the method of Gauss-Seidel in MATLAB

The following should be saved as gauss_seidel.m:
function [x,dx] = gauss_seidel(A,b,x,eps,N)
%
%  The function gauss_seidel applies the method of Gauss-Seidel
%  to solve A*x = b.
%  On entry:
%    A       coefficient matrix of the linear system;
%    b       right-hand side vector of the linear system;
%    x       initial approximation;
%    eps     accuracy requirement: stop when norm(dx) < eps;
%    N       maximal number of steps allowed.
%  On return:
%    x       approximation for the solution of A*x = b;
%    dx      vector last used to update x, 
%            if success, then norm(dx) < eps.
%
%  Example:
%    Q = orth(rand(5)); D = diag(rand(1,5));
%    A = Q*D*Q'; z = rand(5,1); b = A*z;
%    x = z + rand(5,1)*1e-4;
%    [x,dx] = gauss_seidel(A,b,x,1e-8,50)
%
n = size(A,1);
dx = zeros(n,1);                        % preallocate the update vector
fprintf('Running the method of Gauss-Seidel...\n');
for i = 1:N
   for j = 1:n
      dx(j) = b(j);                     % residual of equation j ...
      for k = 1:n
         dx(j) = dx(j) - A(j,k)*x(k);   % ... using the newest values of x
      end;
      dx(j) = dx(j)/A(j,j);             % scale by the diagonal entry
      x(j) = x(j) + dx(j);              % update x(j) immediately (Gauss-Seidel)
   end;
   fprintf('  norm(dx) = %.4e\n', norm(dx));
   if (norm(dx) < eps)
      fprintf('Succeeded in %d steps\n', i);
      return;
   end;
end;
fprintf('Failed to reach accuracy requirement in %d steps.\n', N);
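
The comparison script in the next subsection also calls a function jacobi with the same calling sequence as gauss_seidel. Its listing is not reproduced here; a minimal sketch of such an implementation, kept deliberately parallel to gauss_seidel.m and saved as jacobi.m (an assumption about its contents, not the original file), could look like:

function [x,dx] = jacobi(A,b,x,eps,N)
%
%  Sketch of the method of Jacobi with the same calling sequence
%  as gauss_seidel.  The only difference is that all components
%  are updated from the values of the previous sweep, instead of
%  from the newest available values.
%
n = size(A,1);
dx = zeros(n,1);
fprintf('Running the method of Jacobi...\n');
for i = 1:N
   xold = x;                              % keep the previous iterate
   for j = 1:n
      dx(j) = b(j);
      for k = 1:n
         dx(j) = dx(j) - A(j,k)*xold(k);  % use only the old values of x
      end;
      dx(j) = dx(j)/A(j,j);
      x(j) = xold(j) + dx(j);
   end;
   fprintf('  norm(dx) = %.4e\n', norm(dx));
   if (norm(dx) < eps)
      fprintf('Succeeded in %d steps\n', i);
      return;
   end;
end;
fprintf('Failed to reach accuracy requirement in %d steps.\n', N);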

1.2 Comparison of the Convergence

We generated ten random test cases with the following script (stored in the file random_test.m):

function random_test
%
%  This function generates ten random test cases to run the
%  method of Jacobi and Gauss-Seidel, with the aim of showing
%  the superior rate of convergence of the method of Gauss-Seidel
%  over the method of Jacobi.
%
%  The output is placed in the file "random_test_output".
%
%  In case the method of Jacobi diverges and Gauss-Seidel converges,
%  the method of Jacobi is re-executed with more iterations, to
%  make the divergence more dramatic.
%
diary random_test_output;
for i = 1:10
  Q = orth(rand(5)) 
  D = diag(rand(1,5))
  A = Q*D*Q' 
  z = rand(5,1)
  b = A*z
  x0 = z + rand(5,1)*1e-4
  [x,jdx] = jacobi(A,b,x0,1e-8,50)
  [x,gdx] = gauss_seidel(A,b,x0,1e-8,50)
  if (norm(jdx) > 1e-8) && (norm(gdx) < 1e-8)
     [x,jdx] = jacobi(A,b,x0,1e-8,500)
     if (norm(jdx) > 1e+10)
        fprintf('found case diverging for Jacobi, Gauss-Seidel converges\n');
     end;
  end;
end;
diary off;
The output of this run is summarized in the following table:
      +--------+-------------------------+
      | random |     number of steps     |
      |  case  +----------+--------------+
      | number |  Jacobi  | Gauss-Seidel | 
      +========+==========+==============+
      |   1    |    50    |      50      |
      |   2    |    50    |      50      |
      |   3    |    31    |      25      |
      |   4    |    18    |       9      |
      |   5    |    50    |      31      |
      |   6    |    50    |      24      |
      |   7    |    22    |      13      |
      |   8    |    50    |      50      |
      |   9    |    19    |       7      |
      |  10    |    25    |      11      |
      +--------+----------+--------------+
We see that in all cases the number of iterations required by the method of Gauss-Seidel is less than or equal to the number required by the method of Jacobi. The cases with an equal count of 50 are those in which both methods used up the maximum of 50 allowed steps without reaching the accuracy requirement, because convergence was too slow.

During this run, two cases were discovered where the method of Jacobi diverged and Gauss-Seidel converged, thus answering the next question:

1.3 A Diverging Case for Jacobi, but not for Gauss-Seidel

A system for which the method of Jacobi diverges but Gauss-Seidel converges was discovered by the script random_test.m; here are the data:
Q =

   -0.4888    0.6382   -0.1493    0.5743   -0.0396
   -0.5844   -0.7385   -0.1753    0.2815    0.0553
   -0.3614    0.1276    0.4569   -0.2788    0.7527
   -0.3123    0.1696   -0.6880   -0.6327    0.0045
   -0.4374    0.0469    0.5148   -0.3358   -0.6548


D =

    0.0574         0         0         0         0
         0    0.2910         0         0         0
         0         0    0.6761         0         0
         0         0         0    0.9522         0
         0         0         0         0    0.0403


A =

    0.4614    0.0508   -0.1659   -0.2364   -0.2136
    0.0508    0.2747   -0.1425   -0.1140   -0.1479
   -0.1659   -0.1425    0.2502   -0.0316    0.2391
   -0.2364   -0.1140   -0.0316    0.7152   -0.0270
   -0.2136   -0.1479    0.2391   -0.0270    0.3154


z =

    0.8216
    0.2236
    0.9997
    0.2605
    0.2303


b =

    0.1139
   -0.1031
    0.1287
   -0.0712
    0.0961


x0 =

    0.8217
    0.2237
    0.9998
    0.2606
    0.2303
Follow the link for the full output of this run. We see that the method of Gauss-Seidel reaches the required accuracy in 31 steps, while the method of Jacobi clearly diverges.
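
This behavior can be confirmed with a standard check (added here; it was not part of the original run): a stationary iterative method converges for every start vector if and only if the spectral radius of its iteration matrix is strictly below one. Writing DA, LA and UA for the diagonal, strictly lower triangular and strictly upper triangular parts of A, the iteration matrices are -inv(DA)*(LA+UA) for Jacobi and -inv(DA+LA)*UA for Gauss-Seidel:

% Spectral radii of the iteration matrices for the matrix A above.
DA = diag(diag(A));                 % diagonal part of A
LA = tril(A,-1);                    % strictly lower triangular part
UA = triu(A,+1);                    % strictly upper triangular part
rho_jacobi       = max(abs(eig(-(DA \ (LA + UA)))))
rho_gauss_seidel = max(abs(eig(-((DA + LA) \ UA))))

For the matrix A listed above, rho_jacobi should come out larger than one, consistent with the observed divergence, while rho_gauss_seidel is below one. In fact, since A was constructed as Q*D*Q' with an orthogonal Q and positive diagonal entries in D, it is symmetric positive definite, and for such matrices the convergence of Gauss-Seidel is guaranteed; no such guarantee exists for Jacobi.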

2. The Power Method

The Power Method is a simple algorithm to approximate the dominant eigenvalue of a matrix, together with a corresponding eigenvector.
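
The runs below use a function powers with the calling sequence [lambda,x,error] = powers(A,x0,eps,N); its listing is not part of this section. A minimal sketch consistent with that calling sequence, under the assumptions that lambda is estimated by the Rayleigh quotient and that the reported error is the norm of the change in the normalized vector, could be saved as powers.m:

function [lambda,x,error] = powers(A,x0,eps,N)
%
%  Sketch of the power method: repeatedly multiply by A and
%  normalize, until the normalized vector stops changing.
%  The internal details are assumptions; the original powers.m
%  may differ.
%
x = x0/norm(x0);
fprintf('Running the power method...\n');
for i = 1:N
   y = A*x;                        % one multiplication by A
   lambda = x'*y;                  % Rayleigh quotient estimate
   xnew = y/norm(y);               % normalize the new vector
   error = norm(xnew - x);         % change in the normalized vector
   x = xnew;
   fprintf('  error : %.4e\n', error);
   if (error < eps)
      fprintf('Succeeded in %d steps.\n', i);
      return;
   end;
end;
fprintf('Failed to reach accuracy requirement in %d steps.\n', N);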

2.1 The Order of Convergence

We ran the power method on a random 5-by-5 matrix with a random start vector:
>> A = rand(5);
>> x0 = rand(5,1);
>> [lambda,x,error] = powers(A,x0,1.0e-13,50)
Running the power method...
  error : 3.7249e-01
  error : 9.3438e-02
  error : 2.1424e-02
  error : 4.9355e-03
  error : 8.2906e-04
  error : 1.9342e-04
  error : 3.3982e-05
  error : 7.7752e-06
  error : 1.3945e-06
  error : 3.0911e-07
  error : 5.7668e-08
  error : 1.2228e-08
  error : 2.3913e-09
  error : 4.8203e-10
  error : 9.9130e-11
  error : 1.8989e-11
  error : 4.0962e-12
  error : 7.4987e-13
  error : 1.6830e-13
  error : 2.9831e-14
Succeeded in 20 steps.
From the numerical output above and from the plot of the logarithms of the errors, we see that the method converges linearly: every iteration reduces the error by roughly the same factor (here about 0.2). In particular, getting the 10th decimal place correct takes about as many additional steps as getting the 3rd decimal place correct.
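
One way to quantify this linear convergence (an added check, assuming the printed errors have been collected in a vector err) is to compare the ratios of consecutive errors with the ratio of the two largest eigenvalues in absolute value, which governs the rate of the power method:

ratios = err(2:end)./err(1:end-1)       % should settle near a constant
d = sort(abs(eig(A)),'descend');
predicted_ratio = d(2)/d(1)             % convergence factor |lambda_2|/|lambda_1|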

2.2 Starting from an Eigenvector

We conducted the following experiment:
>> A = rand(5);
>> [v,d] = eig(A);
>> x0 = v(:,2);
>> [lambda,x,error] = powers(A,x0,eps,50)
Running the power method...
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 2.0000e+00
  error : 1.9999e+00
  error : 1.9969e+00
  error : 1.9138e+00
  error : 1.1828e+00
  error : 3.3731e-01
  error : 5.6304e-02
  error : 9.4574e-03
  error : 1.5784e-03
  error : 2.6368e-04
  error : 4.4042e-05
  error : 7.3564e-06
  error : 1.2287e-06
  error : 2.0524e-07
  error : 3.4281e-08
  error : 5.7260e-09
  error : 9.5641e-10
  error : 1.5975e-10
  error : 2.6683e-11
  error : 4.4569e-12
  error : 7.4440e-13
  error : 1.2427e-13
  error : 2.0784e-14
  error : 3.5353e-15
  error : 6.8213e-16
  error : 1.6653e-16
Succeeded in 43 steps.
In this experiment we start with the eigenvector corresponding to the second eigenvalue. From the numerical output above and the plot of the logarithms of the errors, we see that the method does not converge at first: for more than twenty iterations the error stays near 2. Only then do the errors go down, with linear convergence similar to that of the random start vector.

The reason for this strange convergence pattern is that the start vector has no component in the direction of the eigenvector corresponding to the dominant eigenvalue. The justification of the power method (see the project description) relies on decomposing the start vector as a linear combination of the eigenvectors and observing that the first component of the decomposition (i.e., the component parallel to the eigenvector corresponding to the dominant eigenvalue) grows relative to the others after each iteration. If the start vector has no component in that direction, then there can be no such growth, which explains the stagnation of the errors in the first twenty iterations. But why does the method then start to converge after all? Here we see the beneficial effect of round-off errors at work. First, we do not have the exact eigenvector (we computed it via MATLAB's eig command). Second, during the iterations tiny errors creep in, causing the iterates to deviate from the direction of the second eigenvector. When this accumulation of round-off error becomes large enough, the vector gains a noticeable component in the direction of the dominant eigenvector and starts to converge linearly to it.
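
This explanation can be verified numerically (an added experiment, not part of the original run) by expanding the start vector in the basis of the computed eigenvectors and inspecting its coefficient along the dominant eigenvector, which is typically at round-off level rather than exactly zero:

[v,d] = eig(A);
[~,k] = max(abs(diag(d)));   % index of the dominant eigenvalue
c = v \ x0;                  % coefficients of x0 in the computed eigenbasis
c(k)                         % tiny, but nonzero in floating-point arithmetic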