Domain Decomposition Methods
============================

The method of Jacobi is an iterative method which does not work in place:
we do not overwrite the current solution with new components
as soon as these become available.
In contrast, the method of Gauss-Seidel does update the current solution
with newly computed components as soon as they become available.
Domain decomposition methods to solve partial differential equations
are another important class of synchronized parallel computations,
and they explain where the need to solve large linear systems originates.
This chapter ends with an introduction to the software PETSc,
the Portable, Extensible Toolkit for Scientific Computation.

Gauss-Seidel Relaxation
-----------------------

The method of Gauss-Seidel is an iterative method for solving linear systems.
We want to solve :math:`A {\bf x} = {\bf b}` for a very large dimension *n*.
Writing the method of Jacobi componentwise:

.. math::

   x_i^{(k+1)} := x_i^{(k)} + \frac{1}{a_{i,i}}
   \left( b_i - \sum_{j=1}^n a_{i,j} x_j^{(k)} \right),
   \quad i=1,2,\ldots,n.

We observe that we can already use :math:`x_j^{(k+1)}` for :math:`j < i`.
This leads to the following formula:

.. math::

   x_i^{(k+1)} := x_i^{(k)} + \frac{1}{a_{i,i}}
   \left( b_i - \sum_{j=1}^{i-1} a_{i,j} x_j^{(k+1)}
              - \sum_{j=i}^{n} a_{i,j} x_j^{(k)} \right),
   \quad i=1,2,\ldots,n.

C code for the Gauss-Seidel method is below.

::

   void run_gauss_seidel_method
    ( int n, double **A, double *b,
      double epsilon, int maxit,
      int *numit, double *x )
   /*
    * Runs the method of Gauss-Seidel for A*x = b.
    *
    * ON ENTRY :
    *   n        the dimension of the system;
    *   A        an n-by-n matrix with A[i][i] /= 0;
    *   b        an n-dimensional vector;
    *   epsilon  accuracy requirement;
    *   maxit    maximal number of iterations;
    *   x        start vector for the iteration.
    *
    * ON RETURN :
    *   numit    number of iterations used;
    *   x        approximate solution to A*x = b. */
   {
      double *dx = (double*) calloc(n,sizeof(double));
      int i,j,k;

      for(k=0; k<maxit; k++)
      {
         double sum = 0.0;
         for(i=0; i<n; i++)
         {
            dx[i] = b[i];              /* compute the i-th residual ... */
            for(j=0; j<n; j++)
               dx[i] -= A[i][j]*x[j];  /* ... as b[i] - (A*x)[i] */
            dx[i] /= A[i][i];          /* correction for x[i] */
            x[i] += dx[i];             /* update x[i] immediately */
            sum += ( (dx[i] >= 0.0) ? dx[i] : -dx[i]);
         }
         printf("%4d : %.3e\n",k,sum);
         if(sum <= epsilon) break;
      }
      *numit = k+1;
      free(dx);
   }

Running on the same example as in the previous chapter goes much faster:

::

   $ time /tmp/gauss_seidel 1000
      0 : 1.264e+03
      1 : 3.831e+02
      2 : 6.379e+01
      3 : 1.394e+01
      4 : 3.109e+00
      5 : 5.800e-01
      6 : 1.524e-01
      7 : 2.521e-02
      8 : 7.344e-03
      9 : 1.146e-03
     10 : 3.465e-04
     11 : 5.419e-05
   computed 12 iterations    <----- 8407 with Jacobi
   error : 1.477e-05

   real    0m0.069s          <----- 0m42.411s
   user    0m0.063s          <----- 0m42.377s
   sys     0m0.005s          <----- 0m0.028s

Parallel Gauss-Seidel with OpenMP
---------------------------------

Using ``p`` threads, every thread computes a slice of the inner product
needed to update one component, and the partial sums are subtracted from
the right hand side in a critical section:

::

   void run_gauss_seidel_method
    ( int p, int n, double **A, double *b,
      double epsilon, int maxit,
      int *numit, double *x )
   {
      double *dx;
      dx = (double*) calloc(n,sizeof(double));
      int i,j,k,id,jstart,jstop;
      int dnp = n/p;                 /* assumes p divides n */
      double dxi;

      for(k=0; k<maxit; k++)
      {
         double sum = 0.0;
         for(i=0; i<n; i++)
         {
            dx[i] = b[i];
            #pragma omp parallel num_threads(p) shared(A,x) \
               private(id,jstart,jstop,j,dxi)
            {
               id = omp_get_thread_num();
               jstart = id*dnp;            /* each thread handles dnp columns */
               jstop = jstart + dnp;
               dxi = 0.0;
               for(j=jstart; j<jstop; j++) /* partial inner product */
                  dxi += A[i][j]*x[j];
               #pragma omp critical
                  dx[i] -= dxi;            /* subtract the partial sum */
            }
            dx[i] /= A[i][i];
            x[i] += dx[i];
            sum += ( (dx[i] >= 0.0) ? dx[i] : -dx[i]);
         }
         printf("%4d : %.3e\n",k,sum);
         if(sum <= epsilon) break;
      }
      *numit = k+1;
      free(dx);
   }

PETSc: the Portable, Extensible Toolkit for Scientific Computation
-------------------------------------------------------------------

PETSc, the Portable, Extensible Toolkit for Scientific Computation,
provides data structures and routines for the scalable solution of
applications modeled by partial differential equations.
PETSc is installed on ``kepler`` in the directory
``/home/jan/Downloads/petsc-3.7.4``.
The source code contains the directory ``ts`` on time stepping methods.
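Every PETSc program follows the same basic structure: initialize the library
(which also initializes MPI), do the computational work, and finalize.
The sketch below is not one of the PETSc examples; it only illustrates that
minimal structure, assuming a standard PETSc installation, and it would be
compiled and linked in the same way as ``ex3`` in the makefile entry shown
further below.

::

   #include <petscsys.h>

   int main(int argc, char **argv)
   {
      PetscErrorCode ierr;

      /* initialize PETSc (and MPI) with the command line arguments */
      ierr = PetscInitialize(&argc, &argv, NULL, NULL);
      if(ierr != 0) return ierr;

      /* PetscPrintf writes only once, from the first process
         in the communicator PETSC_COMM_WORLD */
      ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSc is initialized.\n");
      CHKERRQ(ierr);

      ierr = PetscFinalize();
      return ierr;
   }

The tutorial codes do their actual work, such as defining and solving the
time stepping problem in ``ex3``, between the initialize and finalize calls.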
We will run ``ex3`` from
``/home/jan/Downloads/petsc-3.7.4/src/ts/examples/tutorials``.
After making the code, a sequential run goes as follows:

::

   $ /tmp/ex3 -draw_pause 1

The entry in a ``makefile`` to compile ``ex3`` is below.

::

   PETSC_DIR=/usr/local/petsc-3.7.4
   PETSC_ARCH=arch-linux2-c-debug

   ex3:
   	$(PETSC_DIR)/$(PETSC_ARCH)/bin/mpicc ex3.c \
   	   -I$(PETSC_DIR)/include \
   	   $(PETSC_DIR)/$(PETSC_ARCH)/lib/libpetsc.so \
   	   $(PETSC_DIR)/$(PETSC_ARCH)/lib/libopa.so \
   	   $(PETSC_DIR)/$(PETSC_ARCH)/lib/libfblas.a \
   	   $(PETSC_DIR)/$(PETSC_ARCH)/lib/libflapack.a \
   	   /usr/lib64/libblas.so.3 \
   	   -L/usr/X11R6/lib -lX11 -lgfortran -o /tmp/ex3

On a Mac OS X laptop, the ``PETSC_ARCH`` will be
``PETSC_ARCH=arch-darwin-c-debug``.

Bibliography
------------

1. S. Balay, J. Brown, K. Buschelman, V. Eijkhout, W. Gropp, D. Kaushik,
   M. Knepley, L. Curfman McInnes, B. Smith, and H. Zhang.
   *PETSc Users Manual. Revision 3.2*.
   Mathematics and Computer Science Division,
   Argonne National Laboratory, September 2011.

2. Ronald F. Boisvert, L. A. Drummond, and Osni A. Marques.
   Introduction to the special issue on the Advanced CompuTational
   Software (ACTS) collection.
   *ACM TOMS* 31(3):281--281, 2005.

Exercises
---------

1. Take the running times of the OpenMP version of the method of
   Gauss-Seidel and compute the efficiency for each of the 9 cases.
   What can you conclude about the scalability?

2. Use MPI to write a parallel version of the method of Gauss-Seidel.
   Compare the speedups with the OpenMP version.

3. Run an example of the PETSc tutorials collection with an increasing
   number of processes to investigate the speedup.
   A possible launch sequence is sketched after this list.
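For the last exercise, note that a PETSc executable is an ordinary MPI
program, so it can be started through the MPI launcher.  A possible sequence
of timed runs, with process counts that are only a suggestion, is:

::

   $ time mpiexec -n 1 /tmp/ex3
   $ time mpiexec -n 2 /tmp/ex3
   $ time mpiexec -n 4 /tmp/ex3

The speedup on ``p`` processes is then the ratio of the wall clock time on
one process over the wall clock time on ``p`` processes.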