This description consists of the following parts:

1. Introduction

2. Special Features

3. Solver and Test Program Overview

4. Solver Selection

5. Parallel Performance via OpenMP

6. Information on Earlier Versions

7. MUDPACK and Multigrid References

8. Obtaining MUDPACK Software

9. Acknowledgements

## 1. Introduction

MUDPACK is a collection of portable Fortran 77 subprograms, with a few Fortran90 extensions, for efficiently solving linear elliptic Partial Differential Equations (PDEs) using multigrid iteration. The most frequent Fortran90 extension used is the DO-END DO loop. Users may compile with any Fortran90 compiler, but Fortran77 compilers usually accept the code without complaint. In Fortran90 terminology, the source code is fixed-form, not free-form. OpenMP directives are included to enable shared memory parallelism. The package was created to make multigrid iteration available in user-friendly form. The software is written in much the same format as the separable elliptic PDE package FISHPACK [5]. It extends the domain of solvable problems to include both separable and nonseparable PDEs. Detailed descriptions of earlier versions of MUDPACK are given in [2,9].

Multigrid iteration (see [13,14,16,18,19,21]) combines classical iterative techniques, such as Gauss-Seidel line or point relaxation, with subgrid refinement procedures to yield a method superior to the iterative techniques alone. By iterating and transferring approximations and corrections at subgrid levels, a good initial guess and rapid convergence at the fine grid level can be achieved. Multigrid iteration requires less storage and computation than direct methods for nonseparable elliptic PDEs (e.g., see [7]) and is competitive with direct methods such as cyclic reduction [5,15,25,26] for separable equations. In particular, three-dimensional problems can often be handled at reasonable computational cost. Achieving optimal multigrid performance requires hand-tailored coding for certain problems. The generality of the equations solved by MUDPACK software may sometimes result in loss of efficiency. It is hoped that this is compensated for by the package's ease of use, applicability to a wide range of real problems (including those typically encountered in the atmospheric sciences at NCAR [8,12]), and avoidance of repeated "re-inventions of the wheel." Savings in human code development time can be at least as important as economic use of machine cycles. With careful selection of relaxation and multigrid parameters, optimal performance can often be realized using MUDPACK software. See [1,3,9,10] for a variety of problems where discretization level error (i.e., the same error that a direct method will produce) is reached in only one full multigrid cycle using MUDPACK solvers. Supercomputer performance from a decade ago [23] was measured with the examples in [1,7,9,10].

Reference [1] below is http://nldr.library.ucar.edu/repository/assets/technotes/asset-000-000-000-167.pdf. Readers may wish to also refer to it in learning about MUDPACK, since it has some information not provided otherwise on this website.

## 2. Special Features

* **Solving Linear Elliptic PDEs in a Variety of Forms**

These forms include real and complex, two- and three-dimensional, self-adjoint, separable and nonseparable, and PDEs with cross derivative terms (see part 3 of this file).

* **Solving PDEs in Curvilinear Coordinate Systems**

The solution regions are rectangular regions in the sense that the domain of each independent variable must be a bounded interval on the real line. This means that curvilinear coordinate systems such as spherical or cylindrical coordinates are acceptable. The codes are not restricted to Cartesian coordinates.

* **Generating Second- and Fourth-order Approximations**

Standard second-order finite difference approximations are generated on uniform grids superimposed on the solution region. These can be improved to fourth-order estimates using "deferred corrections" ([22,24]).

* **Handling of General Boundary Conditions**

Any combination of periodic, specified (Dirichlet), and mixed derivative boundary conditions is allowed. Some of the solvers allow oblique (non-normal) derivative boundary conditions.

* **Ease of Input of the Continuous Problem**

User defined input subroutines are the mechanisms for passing PDE coefficients and boundary conditions.

* **Automatic Discretization of the Continuous Problem**

The discretization is transparent to a user, who only needs to supply the PDE, boundary conditions, and grid size information. Standard second-order finite difference formulas are used to approximate first and second partial derivatives. The result is a linear block tridiagonal system of equations. More complex difference formulas (asymmetric near boundaries [4]) are used with the fourth-order solvers. The coefficients multiplying the second partial derivatives in the PDE are adjusted during discretization at coarser grid levels if there are nonzero first-order coefficients which would destroy diagonal dominance. This is necessary to preserve convergence of iterative schemes.

* **Use of Multigrid Iteration to Approximate the Discretization Equations**

This is the essential feature of the MUDPACK software. It makes this complex collection of integrated numerical procedures available in friendly form.

* **Flexibility in Choosing Grid Size**

Second and fourth order approximations are generated on uniform l by m by n grids superimposed on boxes in three dimensions or l by m grids superimposed on rectangles in two dimensions. The grid sizes have the form:

l = p * 2^{(i-1)} + 1

m = q * 2^{(j-1)} + 1

n = r * 2^{(k-1)} + 1

where p,q,r,i,j,k are positive integers. The exponents i,j,k determine the number and size of the subgrid levels employed by multigrid cycling. Values for p,q,r should be chosen as small as possible (typically 2, 3, or 5) and values for i,j,k as large as possible within grid size requirements for efficient cycling. In particular, larger values for p, q, or r can cause algorithm deterioration. For 2-d and 3-d nonseparable PDEs this can be bypassed by using one of the "hybrid" solvers described below.
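The grid size constraint can be checked with a few lines of arithmetic. The following Python sketch lists every way a grid dimension can be written as p * 2^(i-1) + 1. The function name `grid_factorizations` is hypothetical and is not part of MUDPACK; it only illustrates the formula above.

```python
def grid_factorizations(npts):
    """Return all (p, i) with p * 2**(i - 1) + 1 == npts, p >= 1, i >= 1.

    The list is sorted so that the smallest p (and therefore the largest
    number of multigrid levels i) comes first.
    """
    if npts < 2:
        raise ValueError("a grid dimension needs at least 2 points")
    intervals = npts - 1                      # number of grid intervals
    pairs = []
    i = 1
    while intervals % (2 ** (i - 1)) == 0:    # 2**(i-1) must divide the intervals
        pairs.append((intervals // (2 ** (i - 1)), i))
        i += 1
    return sorted(pairs)


# Grid dimensions taken from the timing examples later in this document.
best_769 = grid_factorizations(769)[0]        # 768 = 3 * 2**8
best_49 = grid_factorizations(49)[0]          # 48 = 3 * 2**4
only_100 = grid_factorizations(100)           # 99 is odd: a single level only
```

A dimension such as 100 admits only p = 99 with i = 1, which is the kind of large p the text warns against.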

Let G denote the finest l by m by n grid. In MUDPACK, multigrid cycling is implemented on the ascending chain of grids

G(1) < ... < G(s) < ... < G(t) = G

where t = max0(i,j,k) and each G(s) (s=1,...,t) has l(s) by m(s) by n(s) grid points given by:

l(s) = p * 2^{(max0(i+s-t,1)-1)} + 1

m(s) = q * 2^{(max0(j+s-t,1)-1)} + 1

n(s) = r * 2^{(max0(k+s-t,1)-1)} + 1
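The chain of grids can be tabulated directly. The sketch below is an illustration only: `grid_chain` is a hypothetical helper, and it assumes the level exponent is max0(i+s-t,1) - 1, which is the reading under which the finest level G(t) reproduces l = p * 2^(i-1) + 1.

```python
def grid_chain(p, q, r, i, j, k):
    """Return [(l(s), m(s), n(s)) for s = 1..t], coarsest grid first."""
    t = max(i, j, k)

    def npts(base, expo, s):
        # Assumed exponent: max0(expo + s - t, 1) - 1, so level t matches
        # the fine-grid formula base * 2**(expo - 1) + 1.
        level = max(expo + s - t, 1)
        return base * 2 ** (level - 1) + 1

    return [(npts(p, i, s), npts(q, j, s), npts(r, k, s))
            for s in range(1, t + 1)]


# p=2, q=3, r=2 with i=3, j=2, k=3 gives a 9 x 7 x 9 fine grid.
chain = grid_chain(2, 3, 2, 3, 2, 3)
```

In this example the m direction has fewer levels than the other two, so it stops coarsening one level earlier while l and n continue.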

When grid size requirements cannot be met with MUDPACK software (even with one of the hybrid solvers described below) then one option is to choose a grid which does satisfy the constraints which is as close as possible to the required grid and solve the problem there. The approximation can then be transferred to the required grid using multidimensional cubic interpolation.

* **Selection of Multigrid Options**

MUDPACK has options for implementing variants of multigrid iteration and default options for those preferring black box solvers. The default options (chosen for robustness) set cubic prolongation, fully weighted residual restriction, and W(2,1) cycling. The earlier version of MUDPACK described in [2,3] only allowed V(2,1) cycling with linear prolongation. This is still available as a possibly more efficient choice for certain problems.

* **Selection of the Relaxation Method used within Multigrid Iteration**

A relaxation menu is provided. It includes vectorized Gauss-Seidel schemes [17] on alternating points (red/black), lines (in any combination of directions) and planes (for three-dimensional anisotropic elliptic PDEs [27]). Choice of the correct relaxation method for a particular problem can be crucial. It depends on the relative grid and PDE coefficient size. Usually this can be pre-determined. Sometimes experimentation is required. Advice on method selection is given in the documentation.
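To make the red/black ordering concrete, here is a minimal Python sketch of one Gauss-Seidel sweep on alternating points. It is not MUDPACK code: it assumes the model problem -(u_xx + u_yy) = f with the standard five-point stencil and Dirichlet boundary values, and the name `red_black_sweep` is hypothetical. With this stencil each red point depends only on black neighbors and vice versa, which is why each half-sweep can be vectorized.

```python
def red_black_sweep(u, f, h):
    """One red/black Gauss-Seidel sweep for -(u_xx + u_yy) = f.

    u and f are lists of rows on a uniform grid with spacing h.
    Boundary entries of u are Dirichlet data and are left unchanged.
    """
    rows, cols = len(u), len(u[0])
    for color in (0, 1):                      # 0: red (i+j even), 1: black
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if (i + j) % 2 == color:
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1]
                                      + h * h * f[i][j])
    return u


# 5 x 5 grid: zero boundary values, zero right-hand side, interior guess of 1.
n = 5
u = [[1.0 if 0 < i < n - 1 and 0 < j < n - 1 else 0.0 for j in range(n)]
     for i in range(n)]
f = [[0.0] * n for _ in range(n)]
for _ in range(2):
    red_black_sweep(u, f, h=0.25)
```

The exact solution of this toy problem is zero, so the interior values shrink toward zero with each sweep.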

* **Availability of "hybrid" Multigrid/Direct Method Solvers**

The certainty of direct methods is combined with the efficiency of multigrid iteration by providing "hybrid" solvers for 2-d and 3-d nonseparable PDEs. Gaussian elimination is used whenever the coarsest grid is encountered within multigrid cycling. This eliminates the usual constraint that the coarsest grid must have "few" points thus giving additional flexibility in choosing grid size. It also provides a way to compare approximations from multigrid and direct method solutions. The hybrid codes become full direct method solvers replacing the codes described in [6] if grid size arguments are chosen so that the coarse and fine grids coincide. Large storage and computational requirements make the use of the 3-d hybrid codes **muh3,cuh3** as direct methods possible only for very coarse grids.

* **Availability of Subroutines to Compute Residuals**

Subroutines to compute fine grid residual after calling any of the second-order solvers are provided. The residual measures how well the current approximation satisfies the linear system of equations coming from the discretization. Residual ratios can be used to estimate the convergence rate of multigrid iteration.
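As an illustration of how residual ratios estimate the convergence rate, the following Python sketch takes a list of residual norms recorded after successive multigrid cycles. The helper names and the sample norms are hypothetical; the norms would come from whatever residual routine is called after each cycle.

```python
def convergence_factors(residual_norms):
    """Ratios r[k+1] / r[k] of successive residual norms."""
    return [b / a for a, b in zip(residual_norms, residual_norms[1:])]


def average_factor(residual_norms):
    """Geometric mean residual reduction per cycle over the whole history."""
    cycles = len(residual_norms) - 1
    if cycles < 1:
        raise ValueError("need at least two residual norms")
    return (residual_norms[-1] / residual_norms[0]) ** (1.0 / cycles)


# Made-up norms after 0, 1, and 2 cycles, for illustration only.
history = [8.0, 2.0, 0.5]
factors = convergence_factors(history)
mean_factor = average_factor(history)
```

A factor well below 1 per cycle indicates effective cycling, while factors creeping toward 1 suggest the relaxation method or cycling options deserve another look.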

* **No Initial Guess Requirement**

Unlike the case with classical iterative schemes, initial guesses are not necessary and should not be supplied unless they are very good (as, for example, when restarting multigrid iteration using an approximation generated earlier). Full multigrid cycling [13], beginning at the coarsest grid level, is used when there is no initial guess. Advice on how to use initial guesses within a time marching problem is given in the documentation.

* **Non-initialization Calls**

Redundant discretization and matrix factorization processes can (and should) be bypassed on recalls to the software. For example, this happens when only the right-hand side array has changed from a previous call or when more multigrid cycles are needed for additional accuracy.

* **Error Control**

Maximum relative error can be used to monitor convergence. Use of error control is optional and requires additional storage and computation.
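The sketch below shows one way such a test might look. It is an assumption, not the package's documented formula: it measures the largest change between two successive approximations relative to the largest magnitude of the newer one. It also shows where the extra storage comes from, since the previous approximation must be kept for comparison. The function names are hypothetical.

```python
def max_relative_difference(phi_new, phi_old):
    """max|phi_new - phi_old| / max|phi_new| over flat sequences of grid values.

    Works for real or complex values because only abs() is used.
    """
    largest = max(abs(v) for v in phi_new)
    if largest == 0.0:
        raise ValueError("relative difference is undefined for a zero approximation")
    change = max(abs(a - b) for a, b in zip(phi_new, phi_old))
    return change / largest


def converged(phi_new, phi_old, tol):
    """True when the assumed relative difference falls below tol."""
    return max_relative_difference(phi_new, phi_old) < tol


# Illustrative values only.
rel = max_relative_difference([1.0, 2.5, 4.0], [1.0, 2.0, 4.0])
```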

* **Flagging of Errors involving Input Parameters**

This includes detection of singular and/or nonelliptic PDEs. Fatal and nonfatal errors are flagged.

* **Output of Exact Minimal Work Space Requirements**

This is especially important with three-dimensional problems where central memory is easily exhausted.

* **Extensive Documentation and Test Programs**

Users are encouraged to carefully read the documentation and execute the test program for the solver to be used. The next section provides links to documentation and Fortran test program files.

## 3. Solver and Test Program Overview

Table 1 below lists all MUDPACK two- and three-dimensional, second- and fourth-order solvers for real and complex elliptic partial differential equations with and without cross derivative terms. Clicking on a solver will bring up its documentation file.

Table 2 provides a list of the test and residual codes for each solver. These codes are meant as tests but also as example codes in guiding users through the setup and calling of MUDPACK routines in their own applications.

Table 1 An overview of MUDPACK solvers

| computation | subprograms |
|---|---|
| 2nd order/real 2D self-adjoint nonseparable | mud2sa |
| 2nd order/real 2D separable | mud2sp |
| 2nd order/real 2D nonseparable | muh2, mud2 |
| 2nd order/real 2D with cross term | muh2cr, mud2cr |
| 4th order/real 2D separable | mud24sp |
| 4th order/real 2D nonseparable | muh24, mud24 |
| 4th order/real 2D with cross term | muh24cr, mud24cr |
| 2nd order/real 3D self-adjoint nonseparable | mud3sa |
| 2nd order/real 3D separable | mud3sp |
| 2nd order/real 3D nonseparable | muh3, mud3 |
| 2nd order/real 3D with cross term | mud3cr |
| 4th order/real 3D separable | mud34sp |
| 4th order/real 3D nonseparable | mud34, muh34 |
| 2nd order/complex 2D separable | cud2sp |
| 2nd order/complex 2D nonseparable | cuh2, cud2 |
| 2nd order/complex 2D with cross term | cuh2cr, cud2cr |
| 4th order/complex 2D separable | cud24sp |
| 4th order/complex 2D nonseparable | cud24, cuh24 |
| 4th order/complex 2D with cross term | cud24cr, cuh24cr |
| 2nd order/complex 3D separable | cud3sp |
| 2nd order/complex 3D nonseparable | cuh3, cud3 |
| 2nd order/complex 3D with cross term | cud3cr |
| 4th order/complex 3D separable | cud34sp |
| 4th order/complex 3D nonseparable | cud34 |

Table 2

| solver | test & residual codes |
|---|---|
| mud2sa | tmud2sa |
| mud2sp | tmud2sp, resm2sp |
| mud2, muh2 | tmud2, tmuh2, resm2 |
| mud2cr, muh2cr | tmud2cr, tmuh2cr, resm2cr |
| mud24sp | tmud24sp |
| mud24, muh24 | tmud24, tmuh24 |
| mud24cr, muh24cr | tmud24cr, tmuh24cr |
| mud3sa | tmud3sa |
| mud3sp | tmud3sp, resm3sp |
| mud3, muh3 | tmud3, tmuh3, resm3 |
| mud3cr | tmud3cr |
| mud34sp | tmud34sp |
| mud34, muh34 | tmud34, tmuh34 |
| cud2sp | tcud2sp, resc2sp |
| cud2, cuh2 | tcud2, tcuh2, resc2 |
| cud2cr, cuh2cr | tcud2cr, tcuh2cr, resc2cr |
| cud24sp | tcud24sp |
| cud24, cuh24 | tcud24, tcuh24 |
| cud24cr, cuh24cr | tcud24cr, tcuh24cr |
| cud3sp | tcud3sp, resc3sp |
| cud3, cuh3 | tcud3, tcuh3, resc3 |
| cud34sp | tcud34sp |
| cud34 | tcud34 |

## 4. Solver Selection

The following "flow chart" can be used in selecting the appropriate second-order software for the elliptic PDE to be solved:

(1) If the PDE is complex go to (9) else go to (2)

(2) If the PDE is three-dimensional go to (6) else go to (3)

(3) If the PDE is separable use **mud2sp** else go to (4)

(4) If the PDE has a cross derivative use **muh2cr** or **mud2cr** else go to (5)

(5) If the PDE is self-adjoint use **mud2sa** else use **muh2** or **mud2**.

(6) If the PDE is separable use **mud3sp** else go to (7)

(7) If the PDE is self-adjoint use **mud3sa** else go to (8)

(8) If the PDE has cross derivatives use **mud3cr** else use **muh3** or **mud3**.

(9) If the PDE is three-dimensional go to (13) else go to (10)

(10) If the PDE is separable use **cud2sp** else go to (11)

(11) If the PDE has a cross derivative use **cuh2cr** or **cud2cr** else go to (12)

(12) Use **cuh2** or **cud2**

(13) If the PDE is separable use **cud3sp** else use **cuh3** or **cud3**.
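The thirteen steps above can be sketched as a small Python function. The function name and its boolean arguments are hypothetical, but the branching follows the numbered steps in the order given, and the returned names are the solvers listed there.

```python
def select_solver(complex_pde, three_d, separable,
                  cross_derivative=False, self_adjoint=False):
    """Return the second-order solver name(s) suggested by steps (1)-(13)."""
    if not complex_pde:                                   # step (1)
        if not three_d:                                   # step (2)
            if separable:                                 # step (3)
                return ("mud2sp",)
            if cross_derivative:                          # step (4)
                return ("muh2cr", "mud2cr")
            if self_adjoint:                              # step (5)
                return ("mud2sa",)
            return ("muh2", "mud2")
        if separable:                                     # step (6)
            return ("mud3sp",)
        if self_adjoint:                                  # step (7)
            return ("mud3sa",)
        if cross_derivative:                              # step (8)
            return ("mud3cr",)
        return ("muh3", "mud3")
    if not three_d:                                       # step (9)
        if separable:                                     # step (10)
            return ("cud2sp",)
        if cross_derivative:                              # step (11)
            return ("cuh2cr", "cud2cr")
        return ("cuh2", "cud2")                           # step (12)
    if separable:                                         # step (13)
        return ("cud3sp",)
    return ("cuh3", "cud3")


# A real, two-dimensional, nonseparable PDE with a cross derivative term.
choice = select_solver(complex_pde=False, three_d=False, separable=False,
                       cross_derivative=True)
```

Where two names are returned, the first is the hybrid multigrid/direct solver and the second is its multigrid-only counterpart.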

Fourth-order solvers can improve the approximation if the corresponding second-order solver has reached discretization level error (i.e., the same error level that a direct method will reach) [1,3,10].

## 5. Parallel Performance via OpenMP

In single processor mode, the OpenMP statements in version 5.0.1 of MUDPACK are interpreted as comment cards and do not affect execution. To ensure this is the case, users should check whether their compilers recognize OpenMP directives by default. If they do, the directives can be turned off with compiler flags or removed by passing the MUDPACK source code through an appropriate sed or awk script which removes lines beginning with C$OMP.
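For readers without sed or awk at hand, the same filtering can be sketched in Python. This is an illustration under stated assumptions: the helper name and the file names in the usage note are hypothetical, and the match is made case-insensitively on lines that begin in column 1 with the C$OMP sentinel.

```python
def strip_omp_directives(lines):
    """Drop fixed-form source lines that begin with the C$OMP sentinel.

    Continuation lines of a directive also begin with C$OMP, so they are
    removed as well. Ordinary comment cards are kept.
    """
    return [ln for ln in lines if not ln.upper().startswith("C$OMP")]


sample = [
    "C$OMP PARALLEL DO PRIVATE(i,j)",
    "c$omp+ shared(phi)",
    "      do j=1,ny",
    "c     an ordinary comment card",
    "      end do",
]
cleaned = strip_omp_directives(sample)
```

Applied to a file, the function could be used as `open("out.f", "w").writelines(strip_omp_directives(open("in.f").readlines()))`, where both file names are placeholders.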

Parallel performance was measured on a Cray J9, an SGI Origin, and a two-processor IBM SP computer, using the three MUDPACK solvers **mud2**, **mud3**, and **mud3cr**. The tables below record measured wall clock time in seconds for an increasing number of processors, **mp**. For each example and grid size, either the least expensive relaxation method (point relaxation with 5 multigrid cycles) or the more expensive and robust relaxation method (line relaxations in all directions with 3 multigrid cycles) is executed. This is typical of the amount of computation needed to solve elliptic problems with the numerical methods embedded in MUDPACK.

The second example illustrates that the cost overhead for parallelization of medium resolution two-dimensional problems can cancel any advantage gained by selecting more than one processor. Once a resolution is chosen, some preliminary timings should be made before using a MUDPACK solver with more than one processor.

**Example 1**: (513 X 769 grid) executing 3 multigrid cycles using bi-directional line relaxations with the multigrid solver **mud2**.

| mp | Cray J9 time (s) | SGI Origin time (s) | IBM SP time (s) |
|---|---|---|---|
| 1 | 16.86 | 8.78 | 9.93 |
| 2 | 9.76 | 6.88 | 5.61 |
| 4 | 6.46 | 5.29 | |
| 8 | 5.33 | 4.30 | |
| 16 | 3.90 | 3.89 | |

**Example 2**: (257 X 193 grid) executing 5 multigrid cycles using red/black Gauss-Seidel point relaxation with the multigrid solver **mud2**.

| mp | Cray J9 time (s) | SGI Origin time (s) | IBM SP time (s) |
|---|---|---|---|
| 1 | 0.38 | 0.27 | 0.17 |
| 2 | 0.42 | 0.49 | 0.15 |
| 4 | 0.44 | 0.95 | |
| 8 | 0.44 | 0.99 | |
| 16 | 0.45 | 1.16 | |

**Example 3**: (95 X 65 X 129 grid) executing 5 multigrid cycles using red/black Gauss-Seidel point relaxation with the multigrid solver **mud3**.

| mp | Cray J9 time (s) | SGI Origin time (s) | IBM SP time (s) |
|---|---|---|---|
| 1 | 8.21 | 9.83 | 6.63 |
| 2 | 4.97 | 8.46 | 4.03 |
| 4 | 3.33 | 5.17 | |
| 8 | 2.62 | 4.22 | |
| 16 | 2.54 | 3.13 | |

**Example 4**: (49 X 257 X 49 grid) executing to a prescribed error tolerance with point relaxation and then with line relaxations in 3 directions with the quasi-multigrid solver **mud3cr** (see the documentation **mud3cr.d** and the test program **tmud3cr.f** for a description of the problem approximated).

| mp | SGI Origin (point) time (s) | SGI Origin (lines) time (s) |
|---|---|---|
| 1 | 39.6 | 74.2 |
| 2 | 28.7 | 41.1 |
| 4 | 12.0 | 26.8 |
| 8 | 9.8 | 20.0 |
| 16 | 8.9 | 15.7 |
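The timings above are easier to compare as speedup and parallel efficiency. The following Python sketch computes both from a table of wall clock times. The helper name is hypothetical, and the two data sets are copied from the Cray J9 column of Example 1 and the SGI Origin column of Example 2.

```python
def speedup_table(times_by_mp):
    """Map mp -> (speedup, efficiency) relative to the smallest mp listed.

    speedup    = time at smallest mp / time at mp
    efficiency = speedup / (mp / smallest mp)
    """
    base_mp = min(times_by_mp)
    base_time = times_by_mp[base_mp]
    table = {}
    for mp, t in sorted(times_by_mp.items()):
        speedup = base_time / t
        table[mp] = (speedup, speedup / (mp / base_mp))
    return table


cray_example1 = {1: 16.86, 2: 9.76, 4: 6.46, 8: 5.33, 16: 3.90}
sgi_example2 = {1: 0.27, 2: 0.49, 4: 0.95, 8: 0.99, 16: 1.16}

cray = speedup_table(cray_example1)
sgi = speedup_table(sgi_example2)
```

For Example 1 on the Cray J9, 16 processors give a speedup of roughly 4.3, while for Example 2 on the SGI Origin every multiprocessor run has a speedup below 1, which is the overhead effect described earlier for medium resolution two-dimensional problems.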

## 6. Information on Earlier Versions

The Fortran subroutines in MUDPACK discretize a variety of elliptic Partial Differential Equations (PDEs) and boundary conditions using finite difference formulas on grids superimposed on the rectangular solution regions. Then multigrid iterative techniques are used with point, line, or planar relaxation to generate second- and fourth-order approximations to the underlying real and complex, two- and three-dimensional continuous problems.

The MUDPACK software package includes 124 files containing over 100,000 lines of Fortran77 code and documentation. The code may also be compiled with Fortran90 and 95 compilers, but you may need to provide correct flags since the source code is fixed rather than free source form.

The subroutine names, functionality, argument lists, test programs, and documentation files for version 5.0.1 and version 5.0 of MUDPACK are identical. 5.0.1 corrects an error in subroutine mud2sp, namely, an incorrect call to routine rcd2spp is replaced by a call to rmd2spp.

**VERSION 5.0.1 INCOMPATIBLE WITH VERSIONS EARLIER THAN 4.0**

Version 5.0.1 of MUDPACK is the latest in a series of major revisions since the software was first released in 1990. Changes in argument lists, work space requirements, and file organization make the current version incompatible with versions of MUDPACK built before version 4.0. The functionality has been expanded by adding new solvers. To ensure portability, all MUDPACK 5.0.1 software has been passed through Fortran verification software and compiled and executed on different platforms with both Fortran77 and Fortran90 compilers.

**OUTLINE OF MAJOR CHANGES BETWEEN VERSIONS 4.0 AND 5.0**

(1) **OpenMP directives**

OpenMP directives have been inserted in the time-critical portions of each of 44 files within MUDPACK. This was done for do loops where red/black Gauss-Seidel point, line, and planar relaxation (for 3-d problems) are executed at different grid levels and in loops where weighted residual restrictions occur. The relaxation and residual restriction portions of multigrid codes account for at least 90% of the execution time with most problems. It was necessary to restructure line relaxation kernels to allow more efficient parallelization when OpenMP directives are inserted. The changes did not adversely affect single processor vector performance. See the parallel performance tables above, on this page, for improvement measurements on a variety of machines at the National Center for Atmospheric Research.

(2) **Code Simplification and modernization.**

Code simplification has been achieved by expanding internal arrays to include virtual boundaries. This has reduced the code complexity required to handle the variety of boundary conditions MUDPACK software allows. Tests have indicated the resulting streamlining has yielded a 10-30% speedup for a single processor depending on the problem and resolution. Use of equivalencing between arrays has been eliminated. Statement labels have been eliminated, all variable types declared, and nested loops have been streamlined and indented.

(3) **Separation of discretization and approximation calls**

The first call to a solver discretizes the continuous elliptic PDE and boundary conditions using standard second-order finite difference formulas. Afterwards, calls to generate approximations can be made. This is a natural separation analogous to separating the LU factorization and solution phases in matrix solvers. Further, it allows more appropriate efficiency monitoring of MUDPACK software since the discretization phase is heavily dependent upon user-provided subroutines for inputting coefficients and boundary conditions. Only the approximation phase should be monitored for efficiency. Earlier versions of MUDPACK allowed discretization and approximation to occur with the same call.

(4) **Simplification of multigrid options**

The variants in multigrid cycling have been simplified in version 5.0. F cycling and unweighted or half weighted residual restriction were deemed unnecessary and have been eliminated. V cycles or W cycles with fully weighted residual restriction are retained as the optimal choices. More general multigrid cycling is allowed but flagged as a nonfatal error indicating probable inefficient use of multigrid. The prolongation operator can still be either linear or cubic interpolation.

(5) **New three-dimensional solvers**

New "hybrid" three-dimensional multigrid/direct method solvers **muh3** and **cuh3**, along with a quasi-multigrid solver, **mud3cr**, for three-dimensional problems with cross derivative terms and possibly oblique derivative boundary conditions, have been added to the package. Use of hybrid solvers for nonsingular two- and three-dimensional problems is encouraged since they can be more robust than their "multigrid only" counterparts and they cost only marginally more in computation and storage. The hybrid solvers also provide more choices of grid resolution.

(6) **Four color relaxation**

Four color relaxation has replaced red/black relaxation for PDEs with cross derivative terms. This can provide better convergence rates.

(7) **Correction in planar relaxation**

If planar relaxation is selected for three-dimensional anisotropic PDEs then full two-dimensional multigrid cycling is implemented for each plane visited during three dimensional multigrid cycling. Earlier versions of MUDPACK did not use full two-dimensional cycling.

## 7. MUDPACK and Multigrid References

[1] J. Adams, "Multigrid Software for Elliptic Partial Differential Equations: MUDPACK," NCAR Technical Note-357+STR, Feb. 1991, 51 pages. The link for this document is http://nldr.library.ucar.edu/repository/assets/technotes/asset-000-000-000-167.pdf.

[2] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient Solution of Linear Elliptic Partial Differential Equations," Applied Math. and Comput. Vol.34, Nov 1989, pp.113-146.

[3] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK," Proceedings of the Fourth Copper Mountain Conference on Multigrid, SIAM, 1989, pp.1-12.

[4] J. Adams, "Fortran Subprograms for Finite Difference Formula," J. Comp. Phys.,Vol 26, Jan 1978, pp. 113-116.

[5] J. Adams, P. Swarztrauber, R. Sweet, "Efficient Fortran Subprograms for the Solution of Elliptic Partial Differential Equations," Elliptic Problem Solvers, Academic Press, 1982, pp.333-390.

[6] J. Adams, "New Software for Elliptic Partial Differential Equations," Computing Facility Notes 55, November 1978

[7] J. Adams, "Comparison of direct and iterative methods for approximating nonseparable elliptic PDEs at NCAR," SCD Computing News, Vol 10, Nov. 1989, pp.12-14.

[8] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo, "Applications of Multigrid Software in the Atmospheric Sciences," Monthly Weather Review,Vol. 120 # 7, July 1992, pp. 1447-1458.

[9] J. Adams, "MUDPACK: Multigrid Software for Linear Elliptic Partial Differential Equations," SCD UserDoc, Version 2.0, NCAR,February 1990.

[10] J. Adams, "Recent Enhancements in MUDPACK, A Multigrid Software Package for Elliptic Partial Differential Equations," Applied Math. and Comp., Vol. 43, May 1991, pp.79-94.

[11] J. Adams, "MUDPACK-2: Multigrid Software for Elliptic Partial Differential Equations on Uniform Grids with any Resolution," Applied. Math. and Comp., Vol. 53, February 1993, pp. 235-249.

[12] J. Adams, and P. Smolarkiewicz, "Modified multigrid for 3D elliptic equations with cross derivatives", Applied Math. Comput., Vol. 121, 2001, pp. 301-312.

[13] A. Brandt, "Multi-level Adaptive Solutions to Boundary Value Problems," Math. Comp., 31, 1977, pp.333-390.

[14] W. Briggs, "A Multigrid Tutorial," SIAM, Philadelphia,1987.

[15] B. Buzbee, G. Golub, and C. Nielson, "On direct methods for solving Poisson's equations," SIAM J. Numer. Anal., 7, 1970, pp.627-656.

[16] S. Fulton, R. Ciesielski, and W. Schubert, Multigrid methods for elliptic problems: a review. Monthly Weather Review, 114:943-959 (1986).

[17] W. Gentzsch, "Vectorization of Computer Programs with Applications to Computational Fluid Dynamics," Vieweg & Sohn, 1984 (246 pages).

[18] W. Hackbusch and U. Trottenberg, "Multigrid Methods," Springer-Verlag, Berlin, 1982.

[19] D. Jespersen, "Multigrid Methods for Partial Differential Equations." Studies in Numerical Analysis, Vol.24, MAA, 1984.

[20] J. Mandel and S. Parter, "On the Multigrid F-Cycle," Applied Math. and Comput., Vol 37, 1990, pp.19-36.

[21] S. McCormick, "Multigrid Methods," Vol 3 of SIAM Frontiers Series, SIAM, Philadelphia, 1987.

[22] V. Pereyra, "Highly Accurate Numerical Solution of Quasilinear Elliptic Boundary-Value Problems in n Dimensions," Math. Comp., 24, 1970, pp.771-783.

[23] D. Sato, "PERFMON: The Cray Performance Monitor Utility," SCD UserDoc, Version 2.0, NCAR,March 1989.

[24] S. Schaffer, "Higher Order Multigrid Methods," Math. Comp., Vol 43, July 1984, pp. 89-115.

[25] P. Swarztrauber, "Fast Poisson Solvers," Studies in Numerical Analysis, Math. Assoc, America, 1985, pp. 319-369.

[26] R. Sweet, "A Parallel and Vector Variant of the Cyclic Reduction Algorithm," SIAM J. Sci. and Stat. Comp., Vol. 9, July 1988, pp. 761-766.

[27] C. Thole and U. Trottenberg, "Basic Smoothing Procedures for the Multigrid treatment of Elliptic 3D Operators," Applied Math. and Comp., 19, 1986, pp. 333-345.

## 8. Obtaining MUDPACK Software

MUDPACK solver routines and test programs can be downloaded from the NCAR web page for MUDPACK. You will see a download tab on that page. When you click on it, you will be asked to agree to the licensing terms for the package.

## 9. Acknowledgements

Steve McCormick introduced the author to the multigrid community and provided numerous helpful suggestions including the use of planar relaxation with the three-dimensional solvers. The importance of adjusting discretization coefficients at coarser grid levels for PDEs with nonzero first-order terms was pointed out by Klauss Steuben. Achi Brandt provided a complimentary foreword for the MUDPACK technical note [1]. A conversation with Achi Brandt affirmed that the default multigrid options in MUDPACK are a good choice and that the use of deferred corrections in obtaining fourth-order approximations with multigrid is a reasonable strategy. Dave Kennison, now retired, formerly of the NCAR graphics group provided the grid coarsening figure at the start of this document.


## Text Below Contains Internal Files Referenced by Above Links

c

c file cud2.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file cud2.d

c

c contains documentation for:

c subroutine cud2(iparm,fparm,work,coef,bndyc,rhs,phi,mgopt,ierror)

c A sample fortran driver is file "tcud2.f".

c

c ... required MUDPACK files

c

c cudcom.f

c

c ... purpose

c

c subroutine cud2 automatically discretizes and attempts to compute

c the second-order difference approximation to the complex 2-d

c linear nonseparable elliptic partial differential equation on a

c rectangle. the approximation is generated on a uniform grid covering

c the rectangle (see mesh description below). boundary conditions

c may be specified (dirichlet), periodic, or mixed derivative in any

c combination. the form of the pde solved is:

c

c

c cxx(x,y)*pxx + cyy(x,y)*pyy + cx(x,y)*px + cy(x,y)*py +

c

c ce(x,y)*p(x,y) = r(x,y).

c

c

c pxx,pyy,px,py are second and first partial derivatives of the

c unknown complex solution function p(x,y) with respect to the

c independent variables x,y. cxx,cyy,cx,cy,ce are the known

c complex coefficients of the elliptic pde and r(x,y) is the known

c complex right hand side of the equation. The real and imaginary

c parts of cxx and cyy should be positive for all x,y in the solution

c region. Nonseparability means some of the coefficients depend on

c both x and y. Otherwise the PDE is separable and subroutine

c cud2sp should be used instead of cud2

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formula. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second-order finite difference formulas.

c

c *** An approximation is NOT generated after an intl=0 call!

c cud2 should be called with intl=1 to approximate the elliptic

c PDE discretized by the intl=0 call. intl=1 should also

c be input if cud2 has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. This will bypass

c redundant pde discretization and argument checking

c and save computational time. Some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) cud2 is being recalled for additional accuracy. In

c this case iguess=iparm(12)=1 should also be used.

c

c (2) cud2 is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) cud2 is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new righthand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to cud2

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c If any of (a) through (e) are true then the elliptic PDE

c must be discretized or rediscretized. If none of (a)

c through (e) holds, calls can be made with intl=1.

c Incorrect calls with intl=1 will produce erroneous results.

c *** The values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.
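c
c the two parameter choices in this example can be checked with the
c sketch below (the program itself is illustrative only). both choices
c give nx=33 and ny=97, but the second leaves a smaller coarsest grid
c in y (jyq+1 = 4 points instead of 7).

```fortran
      program gridsz
c     sketch: check two (ixp,iex),(jyq,jey) choices for a 33 by 97 grid
      implicit none
      integer ixp, jyq, iex, jey, nx, ny
c     first choice from the example above
      ixp = 2
      jyq = 6
      iex = 5
      jey = 5
      nx = ixp*2**(iex-1) + 1
      ny = jyq*2**(jey-1) + 1
      print *, 'nx,ny =', nx, ny, ' coarsest grid', ixp+1, jyq+1
c     better choice: the factor 2 in jyq=6 is moved into jey
      jyq = 3
      jey = 6
      ny = jyq*2**(jey-1) + 1
      print *, 'nx,ny =', nx, ny, ' coarsest grid', ixp+1, jyq+1
      end
```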

c

c

c *** note

c

c let G be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1
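c
c a minimal sketch that lists the grid chain for ixp=2, jyq=3, iex=5,
c jey=6 (the 33 by 97 grid of the example above; these values are an
c assumption made for illustration). since iex < jey here, the x grid
c reaches its coarsest size of 3 points one level early, so levels 1
c and 2 share mx = 3.

```fortran
      program gchain
c     sketch: list the grids G(1),...,G(n) for ixp=2,jyq=3,iex=5,jey=6
      implicit none
      integer ixp, jyq, iex, jey, n, k, mx, my
      ixp = 2
      jyq = 3
      iex = 5
      jey = 6
      n = max0(iex,jey)
      do k = 1, n
         mx = ixp*2**(max0(iex+k-n,1)-1) + 1
         my = jyq*2**(max0(jey+k-n,1)-1) + 1
         print *, 'level', k, ' grid', mx, ' by', my
      end do
      end
```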

c

c

c

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps

c are performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few cycles are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y directions

c

c

c *** choice of method. . .

c

c let fx represent the quantity cabs(cxx(x,y))/dlx**2 over the solution region.

c

c let fy represent the quantity cabs(cyy(x,y))/dly**2 over the solution region

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2

c

c if neither fx nor fy dominates over the solution region and they

c both vary considerably choose method = 3.
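c
c the guidelines above can be turned into a rough selection rule, as
c sketched below. this is a simplification that looks only at the ratio
c fx/fy over the grid. the sample coefficients and the threshold "big"
c (what counts as "much greater") are assumptions for illustration and
c are not values used inside mudpack.

```fortran
      program mthsel
c     sketch: pick method=iparm(14) from fx,fy sampled on the grid.
c     the coefficients and the factor "big" are assumptions made for
c     illustration; they are not values used by mudpack.
      implicit none
      integer nx, ny, i, j, method
      parameter (nx = 33, ny = 33)
      real xa, xb, yc, yd, dlx, dly, x, y, fx, fy, big
      real rmin, rmax
      complex cxx, cyy
      xa = 0.0
      xb = 1.0
      yc = 0.0
      yd = 1.0
      big = 10.0
      dlx = (xb-xa)/float(nx-1)
      dly = (yd-yc)/float(ny-1)
      rmin = 1.0e30
      rmax = 0.0
      do j = 1, ny
         y = yc + float(j-1)*dly
         do i = 1, nx
            x = xa + float(i-1)*dlx
c           hypothetical anisotropic coefficients
            cxx = cmplx(100.0+x, 1.0)
            cyy = cmplx(1.0+y, 1.0)
            fx = cabs(cxx)/dlx**2
            fy = cabs(cyy)/dly**2
            rmin = amin1(rmin, fx/fy)
            rmax = amax1(rmax, fx/fy)
         end do
      end do
c     rmin,rmax bound the ratio fx/fy over the solution region
      if (rmin .gt. big) then
c        fx much greater than fy everywhere
         method = 1
      else if (rmax .lt. 1.0/big) then
c        fy much greater than fx everywhere
         method = 2
      else if (rmax .le. big .and. rmin .ge. 1.0/big) then
c        fx,fy roughly the same size
         method = 0
      else
c        neither dominates and the ratio varies considerably
         method = 3
      end if
      print *, 'fx/fy ranges over', rmin, rmax, ' method =', method
      end
```

c with the sample coefficients fx exceeds fy by more than the factor
c "big" at every grid point, so the rule selects line relaxation in
c the x direction.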

c

c

c ... length = iparm(15)

c

c the length of the complex work space provided in vector work (see below).

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if method = 1 or method = 3 and nxa.ne.0

c let isx = 5 if method = 1 or method = 3 and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if method = 2 or method = 3 and nyc.ne.0

c let jsy = 5 if method = 2 or method = 3 and nyc.eq.0

c then . . .

c

c length = 4*[nx*ny*(10+isx+jsy)+8*(nx+ny+2)]/3

c

c will suffice in most cases. the exact minimal work space

c length required for the current nx,ny and method is output

c in iparm(16) (even if iparm(15) is too small). this will be

c less than the value given by the simplified formula above

c in most cases.
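c
c a sketch of the simplified work space estimate. the grid size, method
c and boundary flags below are assumptions chosen for illustration. the
c conditions on isx,jsy are read as "(method = 1 or method = 3) and
c nxa.ne.0" and so on.

```fortran
      program wrklen
c     sketch: simplified work space estimate length=iparm(15)
      implicit none
      integer nx, ny, method, nxa, nyc, isx, jsy, length
c     illustrative values (assumptions)
      nx = 33
      ny = 97
      method = 3
      nxa = 1
      nyc = 0
      isx = 0
      jsy = 0
      if (method.eq.1 .or. method.eq.3) then
         isx = 3
         if (nxa.eq.0) isx = 5
      end if
      if (method.eq.2 .or. method.eq.3) then
         jsy = 3
         if (nyc.eq.0) jsy = 5
      end if
      length = 4*(nx*ny*(10+isx+jsy)+8*(nx+ny+2))/3
      print *, 'isx,jsy =', isx, jsy, ' length =', length
      end
```

c the value printed is only the estimate; the exact minimum for a given
c problem is the one returned in iparm(16).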

c

c

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in cud2 and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(cabs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(cabs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).

c

c ... work

c

c a one dimensional complex saved work space (see iparm(15) for

c length) which must be preserved from the previous call when

c calling with intl=iparm(1)=1.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,gbdy) which

c are used to input mixed boundary conditions to cud2. bndyc

c must be declared "external" in the program calling cud2.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tcud2.f" for an example of how bndyc can be set up).

c

c          * * * * * * * * * * * *  y=yd
c          *       kbdy=4        *
c          *                     *
c          *                     *
c          *                     *
c          * kbdy=1       kbdy=2 *
c          *                     *
c          *                     *
c          *                     *
c          *       kbdy=3        *
c          * * * * * * * * * * * *  y=yc
c
c          x=xa                  x=xb

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa(y)*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,gbdy corresponding to alfxa(y),gbdxa(y) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb(y)*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y, will be input to bndyc and

c alfa,gbdy corresponding to alfxb(y),gbdxb(y) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc(x)*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyc(x),gbdyc(x) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd(x)*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyd(x),gbdyd(x) must be returned.

c

c

c *** bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c cud2 will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling cud2. the actual name chosen may

c be different. alfxa,alfxb,alfyc,alfyd,gbdxa,gbdxb,gbdyc,

c gbdyd must all be declared type complex.
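c
c a minimal sketch of a user supplied bndyc for the case nxb=nyd=2
c (mixed conditions at x=xb and y=yd only). the alfa and gbdy values
c are hypothetical. the small test program calls the routine directly
c to show the values that would be handed back to cud2.

```fortran
      subroutine bndyc(kbdy,xory,alfa,gbdy)
c     sketch of a user routine for mixed conditions at x=xb and y=yd
c     (nxb=nyd=2). the alfa and gbdy values are hypothetical.
      implicit none
      integer kbdy
      real xory, x, y
      complex alfa, gbdy
      if (kbdy.eq.2) then
c        x=xb edge: dp/dx + alfxb(y)*p(xb,y) = gbdxb(y)
         y = xory
         alfa = (1.0,0.0)
         gbdy = cmplx(y, 0.0)
      else if (kbdy.eq.4) then
c        y=yd edge: dp/dy + alfyd(x)*p(x,yd) = gbdyd(x)
         x = xory
         alfa = (0.0,1.0)
         gbdy = cmplx(0.0, x)
      end if
      return
      end

      program tbndyc
c     call bndyc directly to show the values cud2 would receive
      implicit none
      complex alfa, gbdy
      external bndyc
      call bndyc(2, 0.5, alfa, gbdy)
      print *, 'kbdy=2: alfa,gbdy =', alfa, gbdy
      call bndyc(4, 0.25, alfa, gbdy)
      print *, 'kbdy=4: alfa,gbdy =', alfa, gbdy
      end
```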

c

c ... coef

c

c a subroutine with arguments (x,y,cxx,cyy,cx,cy,ce) which

c provides the known complex coefficients for the elliptic pde at

c any real grid point (x,y). the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c "external."
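c
c a minimal sketch of a user supplied coefficient routine. the
c coefficient values are hypothetical. they are chosen so that the real
c and imaginary parts of cxx and cyy stay positive and so that the pde
c is nonseparable (ce depends on both x and y).

```fortran
      subroutine coef(x,y,cxx,cyy,cx,cy,ce)
c     sketch of a user coefficient routine. the coefficient values
c     are hypothetical; real and imaginary parts of cxx,cyy are kept
c     positive as required in the solution region.
      implicit none
      real x, y
      complex cxx, cyy, cx, cy, ce
      cxx = cmplx(1.0+y*y, 1.0)
      cyy = cmplx(1.0+x*x, 1.0)
      cx = (0.0,0.0)
      cy = (0.0,0.0)
      ce = cmplx(-1.0, -x*y)
      return
      end

      program tcoef
c     evaluate the coefficients at one real grid point
      implicit none
      complex cxx, cyy, cx, cy, ce
      external coef
      call coef(0.5, 0.5, cxx, cyy, cx, cy, ce)
      print *, 'cxx,cyy =', cxx, cyy
      print *, 'cx,cy,ce =', cx, cy, ce
      end
```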

c

c ... rhs

c

c a complex array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c a complex array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...nx

c prior to calling cud2. these values are preserved by cud2.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all internal and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.
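c
c the rules stated for mgopt (fatal settings flagged by ierror = 12 and
c inefficient settings flagged by ierror = -5) can be restated as a
c small check. the function name mgchk is hypothetical and the code is
c not taken from the mudpack source.

```fortran
      integer function mgchk(mgopt)
c     sketch: mimic the documented checks on mgopt. returns 12 for a
c     fatal setting, -5 for an inefficient one, 0 otherwise. this is
c     not the mudpack source, only a restatement of the rules.
      implicit none
      integer mgopt(4)
      mgchk = 0
c     mgopt(1)=0 selects the defaults; other entries are ignored
      if (mgopt(1).eq.0) return
      if (mgopt(1).lt.0 .or. mgopt(2).lt.1 .or. mgopt(3).lt.1) then
         mgchk = 12
      else if (mgopt(4).ne.1 .and. mgopt(4).ne.3) then
         mgchk = 12
      else if (max0(mgopt(1),mgopt(2),mgopt(3)).gt.2) then
         mgchk = -5
      end if
      return
      end

      program tmgopt
      implicit none
      integer mgopt(4), mgchk
      external mgchk
      data mgopt /1, 2, 1, 1/
      print *, 'v(2,1), linear prolongation:', mgchk(mgopt)
      mgopt(1) = 3
      print *, 'kcycle = 3:', mgchk(mgopt)
      mgopt(4) = 2
      print *, 'intpol = 2:', mgchk(mgopt)
      end
```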

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------   level 4
c                              * .               .
c                             *   .             .
c ---------------2-----------1-----2-----------1---------   level 3
c               * .         .       .         .
c              *   .       .         .       .
c ------2-----1-----2-----1-----------2-----1------------   level 2
c      * .   .       .   .             .   .
c     *   . .         . .               . .
c ---3-----3-----------3-----------------3---------------   level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1--   level 4
c                        * .                         .
c ----------2-----------1---2-----------3-----------1----   level 3
c          * .         .     .         . .         .
c ----2---1---2---3---1-------2---3---1---2---3---1------   level 2
c    * . .     . . . .         . . . .     . . . .
c --6---6-------6---6-----------6---6-------6---6--------   level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
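c
c the recursion in algorithm cgc can be illustrated by counting the
c relaxation sweeps done at each level during one cycle started at the
c finest level. the sketch below is an assumption-laden illustration:
c it uses a fortran90 recursive subroutine (mudpack itself is coded
c non-recursively), it performs no relaxation or grid transfers, and
c the names cgcdem and cgccnt are hypothetical.

```fortran
      program cgcdem
      implicit none
      integer nlev
      parameter (nlev = 4)
      integer nsweep(nlev), k
      do k = 1, nlev
         nsweep(k) = 0
      end do
c     one w(2,1) cycle (kcycle=2,iprer=2,ipost=1) from the finest level
      call cgccnt(nlev, 2, 2, 1, nsweep)
      do k = nlev, 1, -1
         print *, 'level', k, ' sweeps', nsweep(k)
      end do
      end

      recursive subroutine cgccnt(k, kcycle, iprer, ipost, nsweep)
      implicit none
      integer k, kcycle, iprer, ipost, nsweep(*), kount
c     pre-relaxation sweeps at level k
      nsweep(k) = nsweep(k) + iprer
      if (k .gt. 1) then
c        solve for the correction at level k-1 "kcycle" times
         do kount = 1, kcycle
            call cgccnt(k-1, kcycle, iprer, ipost, nsweep)
         end do
      end if
c     post-relaxation sweeps at level k
      nsweep(k) = nsweep(k) + ipost
      end
```

c for w(2,1) cycling on four levels the counts are 3, 6, 12 and 24
c sweeps at levels 4, 3, 2 and 1. these agree with the sums of the
c sweep numbers shown for the final cycle of the w(2,1) schedule.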

c

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual complex work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see the input argument).

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0. only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0)

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(cabs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(cabs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.
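c
c the definition of fparm(6), including the degenerate case, can be
c sketched as follows. the function name reldif and the 3 by 3 test
c arrays are hypothetical; the code restates the definition and is not
c taken from the mudpack source.

```fortran
      real function reldif(phi1,phi2,nx,ny)
c     sketch: maximum relative difference between two iterates,
c     following the definition of fparm(6). not the mudpack source.
      implicit none
      integer nx, ny, i, j
      complex phi1(nx,ny), phi2(nx,ny)
      real phdif, phmax
      phdif = 0.0
      phmax = 0.0
      do j = 1, ny
         do i = 1, nx
            phdif = amax1(phdif, cabs(phi2(i,j)-phi1(i,j)))
            phmax = amax1(phmax, cabs(phi2(i,j)))
         end do
      end do
c     degenerate case phmax = 0.0 returns phdif itself
      reldif = phdif
      if (phmax.gt.0.0) reldif = phdif/phmax
      return
      end

      program trldif
      implicit none
      integer i, j
      complex a(3,3), b(3,3)
      real reldif, tolmax
      external reldif
      tolmax = 1.0e-3
c     two hypothetical iterates that differ by about 1.0e-4
      do j = 1, 3
         do i = 1, 3
            a(i,j) = cmplx(float(i), float(j))
            b(i,j) = a(i,j) + (1.0e-4,0.0)
         end do
      end do
      if (reldif(a,b,3,3) .lt. tolmax) print *, 'converged'
      print *, 'fparm(6) would be', reldif(a,b,3,3)
      end
```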

c

c

c ... work

c

c on output the complex space work contains intermediate values that

c must not be destroyed if cud2 is to be called again with intl=1

c

c

c ... phi *** for intl=1 calls only

c

c on output phi(i,j) contains the approximation to p(xi,yj)

c for all mesh points i = 1,...,nx and j=1,...,ny. the last

c computed iterate in phi is returned even if convergence is

c not obtained

c

c

c ... ierror

c

c For intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. Argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c cabs(cx)*dlx > 2.*cabs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c cabs(cy)*dly > 2.*cabs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = cmplx(0.5*cabs(cx)*dx,0.0)

c

c cyy = cmplx(0.5*cabs(cy)*dy,0.0)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made when necessary to preserve convergence. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) =(0.0,0.0) for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic

c

c real(cxx)*real(cyy).le.0.0 or aimag(cxx)*aimag(cyy).le.0.0

c

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(5)>0.

c is not obtained in maxcy=iparm(13) multigrid cycles between the

c coarsest (ixp+1,jyq+1) and finest (nx,ny) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags
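c
c the coefficient tests behind ierror = -4 and ierror = -2 can be
c restated for a single grid point as sketched below. the function
c name cfwarn and the sample coefficients are hypothetical; the code
c follows the inequalities given above and is not the mudpack source.

```fortran
      integer function cfwarn(cxx,cyy,cx,cy,dlx,dly)
c     sketch: restate the ierror=-2 and ierror=-4 tests for the
c     coefficients at one fine grid point. not the mudpack source.
      implicit none
      complex cxx, cyy, cx, cy
      real dlx, dly
      cfwarn = 0
c     dominant first order terms ("hyperbolic" at this mesh size)
      if (cabs(cx)*dlx .gt. 2.0*cabs(cxx)) cfwarn = -4
      if (cabs(cy)*dly .gt. 2.0*cabs(cyy)) cfwarn = -4
c     not elliptic; this flag overrides -4
      if (real(cxx)*real(cyy) .le. 0.0) cfwarn = -2
      if (aimag(cxx)*aimag(cyy) .le. 0.0) cfwarn = -2
      return
      end

      program tcfwrn
      implicit none
      integer cfwarn
      external cfwarn
      complex one, big, zero
      real h
      one = (1.0,1.0)
      big = (200.0,0.0)
      zero = (0.0,0.0)
c     fine mesh: neither test fires
      h = 1.0/128.0
      print *, cfwarn(one,one,big,zero,h,h)
c     coarse mesh: cabs(cx)*dlx = 50 exceeds 2*cabs(cxx)
      h = 1.0/4.0
      print *, cfwarn(one,one,big,zero,h,h)
c     adjusted cxx that would replace (1.,1.) on such a grid
      print *, cmplx(0.5*cabs(big)*h, 0.0)
c     real parts of cxx,cyy have opposite sign
      print *, cfwarn(one,(-1.0,1.0),zero,zero,h,h)
      end
```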

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd

c in iparm(2),iparm(3),iparm(4),iparm(5) are not 0,1 or 2

c or if nxa,nxb or nyc,nyd are not pairwise zero.

c

c = 3 if min0(ixp,jyq) < 2 (ixp = iparm(6), jyq = iparm(7))

c

c = 4 if min0(iex,jey) < 1 (iex = iparm(8), jey = iparm(9)) or

c if max0(iex,jey) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or ny.ne.jyq*2**(jey-1)+1

c (nx = iparm(10), ny = iparm(11))

c

c = 6 if iguess = iparm(12) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(13) < 1

c

c = 8 if method = iparm(14) is not 0,1,2, or 3

c

c = 9 if length = iparm(15) is too small (see iparm(16) on output

c for minimum required work space length)

c

c =10 if xa > xb or yc > yd

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4))

c

c =11 if tolmax = fparm(5) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3
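c
c the intl=0 / intl=1 calling sequence described in this file can be
c sketched as a short driver. several things here are assumptions: the
c argument order of cud2 (to be checked against the subroutine
c statement in cud2.f), the pde form
c cxx*pxx + cyy*pyy + cx*px + cy*py + ce*p = r, the hypothetical
c coefficients, and the test solution p = cmplx(x*x,y*y). the program
c must be linked with cud2.f and cudcom.f; the distributed driver
c "tcud2.f" remains the reference example.

```fortran
      program drvcud
c     sketch only: link with cud2.f and cudcom.f. the cud2 argument
c     order (iparm,fparm,work,coef,bndyc,rhs,phi,mgopt,ierror) is an
c     assumption to be checked against the subroutine statement.
      implicit none
      integer nx, ny, lwork
      parameter (nx = 33, ny = 33)
      parameter (lwork = 4*(nx*ny*10+8*(nx+ny+2))/3)
      integer iparm(17), mgopt(4), ierror, i, j
      real fparm(6), x, y
      complex work(lwork), rhs(nx,ny), phi(nx,ny)
      external coef, bndyc
c     specified boundary values on all four edges
      do i = 2, 5
         iparm(i) = 1
      end do
c     nx = ixp*2**(iex-1)+1 = 2*2**4+1 = 33, and the same for ny
      iparm(6) = 2
      iparm(7) = 2
      iparm(8) = 5
      iparm(9) = 5
      iparm(10) = nx
      iparm(11) = ny
c     iguess=0, maxcy=3, point relaxation, work space length
      iparm(12) = 0
      iparm(13) = 3
      iparm(14) = 0
      iparm(15) = lwork
c     region [0,1] x [0,1], no error control (tolmax=0.0)
      fparm(1) = 0.0
      fparm(2) = 1.0
      fparm(3) = 0.0
      fparm(4) = 1.0
      fparm(5) = 0.0
c     default multigrid options
      mgopt(1) = 0
      mgopt(2) = 2
      mgopt(3) = 1
      mgopt(4) = 3
c     p = cmplx(x*x,y*y) solves the pde defined by coef below
      do j = 1, ny
         y = float(j-1)/float(ny-1)
         do i = 1, nx
            x = float(i-1)/float(nx-1)
            rhs(i,j) = (0.0,4.0) - x*y*cmplx(x*x,y*y)
            phi(i,j) = (0.0,0.0)
            if (i.eq.1 .or. i.eq.nx .or. j.eq.1 .or. j.eq.ny)
     +         phi(i,j) = cmplx(x*x,y*y)
         end do
      end do
c     intl=0: argument checking and discretization only
      iparm(1) = 0
      call cud2(iparm,fparm,work,coef,bndyc,rhs,phi,mgopt,ierror)
      print *, 'intl=0 ierror =', ierror, ' min length =', iparm(16)
      if (ierror.gt.0) stop
c     intl=1: approximate the pde discretized above
      iparm(1) = 1
      call cud2(iparm,fparm,work,coef,bndyc,rhs,phi,mgopt,ierror)
      print *, 'intl=1 ierror =', ierror
c     the exact value at the center (x=y=0.5) is (0.25,0.25)
      print *, 'phi(17,17) =', phi(17,17)
      end

      subroutine coef(x,y,cxx,cyy,cx,cy,ce)
c     hypothetical coefficients; cxx,cyy have positive real and
c     imaginary parts and ce depends on both x and y
      implicit none
      real x, y
      complex cxx, cyy, cx, cy, ce
      cxx = (1.0,1.0)
      cyy = (1.0,1.0)
      cx = (0.0,0.0)
      cy = (0.0,0.0)
      ce = cmplx(-x*y, 0.0)
      return
      end

      subroutine bndyc(kbdy,xory,alfa,gbdy)
c     dummy routine: never called since all boundaries are specified
      implicit none
      integer kbdy
      real xory
      complex alfa, gbdy
      return
      end
```

c the work array is not touched between the two calls, as required for
c intl=1. a positive ierror after the first call indicates an invalid
c argument, in which case the second call is skipped.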

c

c *********************************************************

c *********************************************************

c

c end of cud2 documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file cud24.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file cud24.d

c

c contains documentation for subroutine cud24(work,phi,ierror)

c A sample fortran driver is file "tcud24.f".

c

c ... required MUDPACK files

c

c cud2.f, cudcom.f

c

c ... purpose

c

c cud24 attempts to improve the estimate in phi, obtained by calling

c cud2, from second to fourth order accuracy. see the file "cud2.d"

c for a detailed discussion of the complex elliptic pde approximated and

c arguments "work,phi" which are also part of the argument list for

c cud2.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier cud2 call

c

c * arguments "work,phi" are the same used in calling cud2

c

c * "work,phi" have not changed since the last call to cud2

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of cud24 documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file cud24cr.d

c


c

c ... file cud24cr.d

c

c contains documentation for:

c subroutine cud24cr(work,coef,bndyc,phi,ierror)

c A sample fortran driver is file "tcud24cr.f".

c

c ... required MUDPACK files

c

c cud2cr.f, cudcom.f

c

c ... purpose

c

c cud24cr attempts to improve the estimate in phi, obtained by calling

c cud2cr, from second to fourth order accuracy. see the file "cud2cr.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "work,coef,bndyc,phi" which are also part of the argument

c list for cud2cr

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier cud2cr call

c

c * arguments "work,coef,bndyc,phi" are the same used in calling cud2cr

c

c * "work,coef,bndyc,phi" have not changed since the last call to cud2cr

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of cud24cr documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file cud24sp.d

c


c

c ... file cud24sp.d

c

c contains documentation for subroutine cud24sp(work,phi,ierror)

c A sample fortran driver is file "tcud24sp.f".

c

c ... required MUDPACK files

c

c cud2sp.f, cudcom.f

c

c ... purpose

c

c cud24sp attempts to improve the estimate in phi, obtained by calling

c cud2sp, from second to fourth order accuracy. see the file "cud2sp.d"

c for a detailed discussion of the complex elliptic pde approximated and

c arguments "work,phi" which are also part of the argument list for

c cud2sp.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier cud2sp call

c

c * arguments "work,phi" are the same used in calling cud2sp

c

c * "work,phi" have not changed since the last call to cud2sp

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of cud24sp documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file cud2cr.d

c

c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

c . .

c . copyright (c) 2008 by UCAR .

c . .

c . UNIVERSITY CORPORATION for ATMOSPHERIC RESEARCH .

c . .

c . all rights reserved .

c . .

c . .

c . MUDPACK version 5.0.1 .

c . .

c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

c

c

c

c ... file cud2cr.d

c

c contains documentation for the complex mudpack solver:

c subroutine cud2cr(iparm,fparm,work,coef,bndyc,rhs,phi,mgopt,ierror)

c a sample fortran driver is file "tcud2cr.f".

c

c ... required mudpack files

c

c cudcom.f

c

c ... purpose

c

c subroutine cud2cr automatically discretizes and attempts to compute

c the second-order difference approximation to the complex 2-D

c linear nonseparable elliptic partial differential equation with cross

c derivative term on a rectangle. the approximation is generated on a

c uniform grid covering the rectangle (see mesh description below).

c boundary conditions may be specified (Dirichlet), periodic, or mixed

c oblique derivative (see bndyc) in any combination. the form of the pde

c approximated is:

c

c

c cxx(x,y)*pxx + cxy(x,y)*pxy + cyy(x,y)*pyy + cx(x,y)*px +

c

c cy(x,y)*py + ce(x,y)*p(x,y) = r(x,y).

c

c

c pxx,pxy,pyy,px,py are second and first partial derivatives of the

c unknown complex solution function p(x,y) with respect to the

c independent variables x,y. cxx,cxy,cyy,cx,cy,ce are the known

c complex coefficients of the elliptic pde and r(x,y) is the known

c complex right hand side of the equation. The real parts of cxx,cyy

c or the imaginary parts of cxx,cyy should be positive for all x,y

c in the solution region (see ierror=-2). Nonseparability means some

c of the coefficients depend on both x and y. if the PDE is separable

c and cxy = 0 then subroutine cud2sp should be used.

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second-order finite difference formulas.

c

c *** an approximation is not generated after an intl=0 call!

c cud2cr should be called with intl=1 to approximate the elliptic

c pde discretized by the intl=0 call. intl=1 should also

c be input if cud2cr has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. this will bypass

c redundant pde discretization and argument checking

c and save computational time. some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) cud2cr is being recalled for additional accuracy. in

c this case iguess=iparm(12)=1 should also be used.

c

c (2) cud2cr is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) cud2cr is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to cud2cr

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c if any of (a) through (e) are true then the elliptic pde

c must be discretized or rediscretized. if none of (a)

c through (e) holds, calls can be made with intl=1.

c incorrect calls with intl=1 will produce erroneous results.

c *** the values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value. when ixp > 2, a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value. when jyq > 2, a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.
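c the example above can be checked numerically. the following is a

c minimal sketch, not part of mudpack: the names "ngrid" and

c "gsplit" are hypothetical. ngrid evaluates ixp*2**(iex-1)+1 and

c gsplit removes powers of 2 from n-1 to obtain the smallest ixp

c (at least 2) and the largest iex for a wanted grid size n.

```fortran
c     hypothetical helper: number of grid points for given ixp,iex
      integer function ngrid(ip,ie)
      implicit none
      integer ip,ie
      ngrid = ip*2**(ie-1)+1
      return
      end

c     hypothetical helper: split n-1 = ip*2**(ie-1) with ip as
c     small as possible (ip .ge. 2) and ie as large as possible.
c     n .ge. 3 is assumed and is not checked.
      subroutine gsplit(n,ip,ie)
      implicit none
      integer n,ip,ie
      ip = n-1
      ie = 1
      do while (mod(ip,2).eq.0 .and. ip/2.ge.2)
        ip = ip/2
        ie = ie+1
      end do
      return
      end
```

c for n = 33 gsplit returns ip=2, ie=5 and for n = 97 it returns

c ip=3, ie=6, which is the better choice quoted in the example.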

c

c *** note

c

c let g be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c g(1) < ... < g(k) < ... < g(n) = g.

c

c each g(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1
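c the grid chain formulas above can be evaluated with a few lines

c of code. this is a minimal sketch; the function name "mglev" is

c hypothetical and is not a mudpack routine. it serves for both

c directions: pass ixp,iex for mx(k) and jyq,jey for my(k).

```fortran
c     hypothetical helper: points in one direction on grid g(k),
c     k = 1,...,n, where n = max0(iex,jey). ip is ixp or jyq and
c     ie is the matching exponent iex or jey.
      integer function mglev(ip,ie,k,n)
      implicit none
      integer ip,ie,k,n
      mglev = ip*2**(max0(ie+k-n,1)-1)+1
      return
      end
```

c with ixp=2, jyq=3, iex=5, jey=6 (so n=6) the chain of grids is

c 3 by 4, 3 by 7, 5 by 13, 9 by 25, 17 by 49 and 33 by 97.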

c

c

c

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-Dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-Dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps are

c performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few cycles are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y direction

c

c

c *** choice of method. . .

c

c let fx represent the quantity cxx(x,y)/dlx**2 over the solution region.

c

c let fy represent the quantity cyy(x,y)/dly**2 over the solution region

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2

c

c if neither fx nor fy dominates over the solution region and they

c both vary considerably choose method = 3.

c

c

c ... length = iparm(15)

c

c the length of the work space provided in the complex work

c space "work" (see below)

c

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if method = 1 or method = 3 and nxa.ne.0

c let isx = 5 if method = 1 or method = 3 and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if method = 2 or method = 3 and nyc.ne.0

c let jsy = 5 if method = 2 or method = 3 and nyc.eq.0

c then . . .

c

c length = [7*(nx+2)*(ny+2)+4*(11+isx+jsy)*nx*ny]/3

c

c will suffice in most cases. the exact minimal work space

c length required for the current nx,ny and method is output

c in iparm(16) (even if iparm(15) is too small). this will be

c less than the value given by the simplified formula above

c in most cases.
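c
c *** a minimal sketch (not part of mudpack) of the simplified
c estimate above. the function name lenwrk is hypothetical. the
c isx,jsy rules are read here as "(method = 1 or 3) and nxa..."
c and "(method = 2 or 3) and nyc...", which is an assumption.
c integer division truncates the bracketed quotient; iparm(16)
c remains the authoritative minimum.
c
```fortran
      integer function lenwrk(nx,ny,method,nxa,nyc)
c     simplified work space estimate quoted in the text:
c     length = [7*(nx+2)*(ny+2)+4*(11+isx+jsy)*nx*ny]/3
      implicit none
      integer nx,ny,method,nxa,nyc,isx,jsy
c     isx is nonzero only when lines in x are relaxed
      isx = 0
      if (method.eq.1 .or. method.eq.3) then
        isx = 3
        if (nxa.eq.0) isx = 5
      end if
c     jsy is nonzero only when lines in y are relaxed
      jsy = 0
      if (method.eq.2 .or. method.eq.3) then
        jsy = 3
        if (nyc.eq.0) jsy = 5
      end if
      lenwrk = (7*(nx+2)*(ny+2)+4*(11+isx+jsy)*nx*ny)/3
      return
      end
```
c
c for example, iparm(15) = lenwrk(nx,ny,method,nxa,nyc) could be
c set before the intl=0 call and compared with iparm(16) after.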

c

c

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in cud2cr and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible do not use error control!).
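c
c *** a minimal sketch of the relative difference defined above,
c for readers who want to monitor convergence outside the solver.
c the function name reldif is hypothetical and this is not the
c mudpack source. the phmax = 0.0 branch follows the description
c of fparm(6) in the output arguments.
c
```fortran
      real function reldif(nx,ny,phi1,phi2)
c     maximum relative difference between two complex iterates:
c     phdif/phmax, or phdif alone when phmax is zero
      implicit none
      integer nx,ny,i,j
      complex phi1(nx,ny),phi2(nx,ny)
      real phdif,phmax
      phdif = 0.0
      phmax = 0.0
      do j=1,ny
        do i=1,nx
          phdif = max(phdif,abs(phi2(i,j)-phi1(i,j)))
          phmax = max(phmax,abs(phi2(i,j)))
        end do
      end do
      reldif = phdif
      if (phmax.gt.0.0) reldif = phdif/phmax
      return
      end
```
c
c "convergence" in the sense above corresponds to
c reldif(nx,ny,phi1,phi2) < tolmax.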

c

c ... work

c

c a complex saved work space (see iparm(15) for size) which

c must be preserved from the previous call when calling with

c intl=iparm(1)=1.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,beta,gama,gbdy) which

c are used to input mixed boundary conditions to cud2cr. bndyc

c must be declared "external" in the program calling cud2cr. kbdy

c is type integer, xory real, and alfa,beta,gama,gbdy type complex.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tcud2cr.f" for an example of how bndyc

c can be set up).

c

c          * * * * * * * * * * * *  y=yd

c          *      kbdy=4         *

c          *                     *

c          *                     *

c          *                     *

c          * kbdy=1       kbdy=2 *

c          *                     *

c          *                     *

c          *                     *

c          *      kbdy=3         *

c          * * * * * * * * * * * *  y=yc

c

c          x=xa                  x=xb

c

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2) = 2 flags

c a mixed boundary condition of the form

c

c alfxa(y)*px + betxa(y)*py + gamxa(y)*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfxa(y),betxa(y),gamxa(y),

c gbdxa(y) must be returned. alfxa(y) = 0. is not allowed for any y.

c (see ierror = 13)

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3) = 2 flags

c a mixed boundary condition of the form

c

c alfxb(y)*px + betxb(y)*py + gamxb(y)*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfxb(y),betxb(y),gamxb(y),

c gbdxb(y) must be returned. alfxb(y) = 0.0 is not allowed for any y.

c (see ierror = 13)

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4) = 2 flags

c a mixed boundary condition of the form

c

c alfyc(x)*px + betyc(x)*py + gamyc(x)*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfyc(x),betyc(x),gamyc(x),

c gbdyc(x) must be returned. betyc(x) = 0.0 is not allowed for any x.

c (see ierror = 13)

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5) = 2 flags

c a mixed boundary condition of the form

c

c alfyd(x)*px + betyd(x)*py + gamyd(x)*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfyd(x),betyd(x),gamyd(x),

c gbdyd(x) must be returned. betyd(x) = 0.0 is not allowed for any x.

c (see ierror = 13)

c

c

c *** bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c cud2cr will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling cud2cr. the actual name chosen may

c be different.
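c
c *** a minimal sketch of a boundary routine with the argument
c list described above. the name bndex and the condition chosen
c (px + (1.,1.)*p = 0 on x=xb, so nxb=iparm(3)=2) are purely
c illustrative assumptions; see "tcud2cr.f" for the packaged
c example.
c
```fortran
      subroutine bndex(kbdy,xory,alfa,beta,gama,gbdy)
c     hypothetical mixed condition on the kbdy=2 edge (x=xb):
c     alfa*px + beta*py + gama*p = gbdy
      implicit none
      integer kbdy
      real xory
      complex alfa,beta,gama,gbdy
      if (kbdy.eq.2) then
c       xory holds y on this edge; alfa must be nonzero here
        alfa = (1.0,0.0)
        beta = (0.0,0.0)
        gama = (1.0,1.0)
        gbdy = (0.0,0.0)
      end if
      return
      end
```
c
c the calling program would declare "external bndex" and pass
c bndex in the bndyc position of the argument list.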

c

c

c ... coef

c

c a subroutine with arguments (x,y,cxx,cxy,cyy,cx,cy,ce) which

c provides the known complex coefficients for the elliptic pde at

c any grid point (x,y). the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c "external."
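c
c *** a minimal sketch of a coefficient routine with the argument
c list described above. the name coefex and the equation chosen
c (pxx + pyy + (0.,1.)*(1+x*y)*p = r) are illustrative
c assumptions only.
c
```fortran
      subroutine coefex(x,y,cxx,cxy,cyy,cx,cy,ce)
c     hypothetical complex coefficients at the point (x,y)
      implicit none
      real x,y
      complex cxx,cxy,cyy,cx,cy,ce
      cxx = (1.0,0.0)
      cxy = (0.0,0.0)
      cyy = (1.0,0.0)
      cx = (0.0,0.0)
      cy = (0.0,0.0)
c     purely imaginary zero-order term that varies with x,y
      ce = cmplx(0.0,1.0+x*y)
      return
      end
```
c
c the calling program would declare "external coefex" and pass
c coefex in the coef position of the argument list.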

c

c ... rhs

c

c a complex array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c a complex array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...nx

c prior to calling cud2cr. these values are preserved by cud2cr.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all internal and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction

c

c one fmg with v(2,1) cycles:

c

c

c     ------------------------------2-----------------1------      level 4

c                                  * .               .

c                                 *   .             .

c     ---------------2-----------1-----2-----------1---------      level 3

c                   * .         .       .         .

c                  *   .       .         .       .

c     ------2-----1-----2-----1-----------2-----1------------      level 2

c          * .   .       .   .             .   .

c         *   . .         . .               . .

c     ---3-----3-----------3-----------------3---------------      level 1

c

c

c one fmg with w(2,1) cycles:

c

c     ------------------------2---------------------------1--      level 4

c                            * .                         .

c     ----------2-----------1---2-----------3-----------1----      level 3

c              * .         .     .         . .         .

c     ----2---1---2---3---1-------2---3---1---2---3---1------      level 2

c        * . .     . . . .         . . . .     . . . .

c     --6---6-------6---6-----------6---6-------6---6--------      level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
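c
c *** a minimal sketch that mirrors only the control flow of
c algorithm cgc and counts relaxation sweeps per grid level. it
c solves no pde, and the name cgccnt is hypothetical. it is
c written recursively for clarity, whereas mudpack implements
c the cycling non-recursively.
c
```fortran
      recursive subroutine cgccnt(k,kcycle,iprer,ipost,nswp)
c     add the sweeps done by one cgc call at level k to nswp;
c     nswp(1..k) must be zeroed by the caller
      implicit none
      integer k,kcycle,iprer,ipost,nswp(*),kount
c     pre-relaxation sweeps at level k
      nswp(k) = nswp(k)+iprer
      if (k.gt.1) then
c       the recursion: kcycle visits to level k-1
        do kount=1,kcycle
          call cgccnt(k-1,kcycle,iprer,ipost,nswp)
        end do
      end if
c     post-relaxation sweeps at level k
      nswp(k) = nswp(k)+ipost
      return
      end
```
c
c calling cgccnt(4,2,2,1,nswp) with nswp zeroed gives 3,6,12,24
c sweeps at levels 4,3,2,1, which matches the final w(2,1)
c cycle from level 4 in the schedule above.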

c

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see the input argument).

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0. only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0).

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.

c

c

c ... work

c

c on output work contains intermediate values that must not

c be destroyed if cud2cr is to be called again with intl=1

c

c

c ... phi *** for intl=1 calls only

c

c on output phi(i,j) contains the approximation to p(xi,yj)

c for all mesh points i = 1,...,nx and j=1,...,ny. the last

c computed iterate in phi is returned even if convergence is

c not obtained

c

c

c ... ierror

c

c for intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj))

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(5)>0.

c is not obtained in maxcy=iparm(13) multigrid cycles between the

c coarsest (ixp+1,jyq+1) and finest (nx,ny) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags
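c
c *** a minimal sketch of the ierror = -4 test described above,
c for checking a coefficient routine before calling the solver.
c the names nhyper and coefhy are hypothetical, and the loop
c over every fine grid point is an assumption about where the
c test is applied.
c
```fortran
      integer function nhyper(coef,nx,ny,xa,xb,yc,yd)
c     count fine grid points where abs(cx)*dlx > 2*abs(cxx)
c     or abs(cy)*dly > 2*abs(cyy)
      implicit none
      integer nx,ny,i,j
      real xa,xb,yc,yd,dlx,dly,x,y
      complex cxx,cxy,cyy,cx,cy,ce
      external coef
      dlx = (xb-xa)/(nx-1)
      dly = (yd-yc)/(ny-1)
      nhyper = 0
      do j=1,ny
        y = yc+(j-1)*dly
        do i=1,nx
          x = xa+(i-1)*dlx
          call coef(x,y,cxx,cxy,cyy,cx,cy,ce)
          if (abs(cx)*dlx.gt.2.0*abs(cxx) .or.
     +        abs(cy)*dly.gt.2.0*abs(cyy)) nhyper = nhyper+1
        end do
      end do
      return
      end

      subroutine coefhy(x,y,cxx,cxy,cyy,cx,cy,ce)
c     hypothetical test coefficients: pxx + pyy + 100*px = r
      implicit none
      real x,y
      complex cxx,cxy,cyy,cx,cy,ce
      cxx = (1.0,0.0)
      cxy = (0.0,0.0)
      cyy = (1.0,0.0)
      cx = (100.0,0.0)
      cy = (0.0,0.0)
      ce = (0.0,0.0)
      return
      end
```
c
c with coefhy on the unit square the count is nonzero on a 5 by 5
c grid and zero on a 129 by 129 grid, which illustrates the
c remedy of increasing resolution.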

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd

c in iparm(2),iparm(3),iparm(4),iparm(5) are not 0,1 or 2

c or if nxa,nxb or nyc,nyd are not pairwise zero.

c

c = 3 if min0(ixp,jyq) < 2 (ixp = iparm(6), jyq = iparm(7))

c

c = 4 if min0(iex,jey) < 1 (iex = iparm(8), jey = iparm(9)) or

c if max0(iex,jey) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or ny.ne.jyq*2**(jey-1)+1

c (nx = iparm(10), ny = iparm(11))

c

c = 6 if iguess = iparm(12) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(13) < 1

c

c = 8 if method = iparm(14) is not 0,1,2, or 3

c

c = 9 if length = iparm(15) is too small (see iparm(16) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4))

c

c =11 if tolmax = fparm(5) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c

c =13 if there is a pure tangential derivative along a mixed derivative

c boundary (e.g., nyd = 2 and betyd(x) = 0.0 for some

c grid point x along y = yd)

c

c =14 if there is the "singular" condition described below at a

c corner which is the intersection of two derivative boundaries.

c

c (1) the corner (xa,yc) if nxa=nyc=2 and

c alfxa(yc)*betyc(xa)-alfyc(xa)*betxa(yc) = 0.0.

c

c (2) the corner (xa,yd) if nxa=nyd=2 and

c alfxa(yd)*betyd(xa)-alfyd(xa)*betxa(yd) = 0.0.

c

c (3) the corner (xb,yc) if nxb=nyc=2 and

c alfxb(yc)*betyc(xb)-alfyc(xb)*betxb(yc) = 0.0.

c

c (4) the corner (xb,yd) if nxb=nyd=2 and

c alfxb(yd)*betyd(xb)-alfyd(xb)*betxb(yd) = 0.0.

c

c *** the conditions described in ierror = 13 or 14 will lead to division

c by zero during discretization if undetected.
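c
c *** a minimal sketch of the mgopt checks listed under
c ierror = 12 and ierror = -5, useful for validating options
c before a call. the name mgochk is hypothetical and the code
c only restates the documented rules; it is not the mudpack
c source.
c
```fortran
      integer function mgochk(mgopt)
c     return 12, -5 or 0 following the documented mgopt rules
      implicit none
      integer mgopt(4)
      mgochk = 0
c     mgopt(1)=0 selects the defaults; other entries are ignored
      if (mgopt(1).eq.0) return
      if (mgopt(1).lt.0 .or. mgopt(2).lt.1 .or. mgopt(3).lt.1
     +    .or. (mgopt(4).ne.1 .and. mgopt(4).ne.3)) then
c       fatal: invalid multigrid option
        mgochk = 12
      else if (max(mgopt(1),mgopt(2),mgopt(3)).gt.2) then
c       nonfatal: probably inefficient cycling
        mgochk = -5
      end if
      return
      end
```
c
c the default vector (2,2,1,3) and the v(2,1) vector (1,2,1,1)
c both pass with a return value of 0.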

c

c

c *********************************************************

c *********************************************************

c

c end of cud2cr documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file cud2sp.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file cud2sp.d

c

c contains documentation for:

c subroutine cud2sp(iparm,fparm,work,cofx,cofy,bndyc,rhs,phi,mgopt,ierror)

c A sample fortran driver is file "tcud2sp.f".

c

c ... required MUDPACK files

c

c cudcom.f

c

c ... purpose

c

c subroutine cud2sp automatically discretizes and attempts to compute

c the second-order difference approximation to the complex two-dimensional

c linear separable elliptic partial differential equation on a

c rectangle. the approximation is generated on a uniform grid covering

c the rectangle (see mesh description below). boundary conditions

c may be specified (dirichlet), periodic, or mixed derivative in any

c combination. the form of the pde solved is:

c

c

c cxx(x)*pxx + cx(x)*px + cex(x)*p(x,y) +

c

c cyy(y)*pyy + cy(y)*py + cey(y)*p(x,y) = r(x,y)

c

c pxx,pyy,px,py are second and first partial derivatives of the

c unknown complex solution function p(x,y) with respect to the

c independent variables x,y. cxx,cx,cex,cyy,cy,cey are the known

c complex coefficients of the elliptic pde and r(x,y) is the known

c complex right hand side of the equation. cxx and cyy should be

c positive for all x,y in the solution region. If some of the

c coefficients depend on both x and y then the PDE is nonseparable.

c In this case subroutine cud2 must be used instead of cud2sp

c (see the file cud2.d)

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formulas.

c

c *** An approximation is NOT generated after an intl=0 call!

c cud2sp should be called with intl=1 to approximate the elliptic

c PDE discretized by the intl=0 call. intl=1 should also

c be input if cud2sp has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. This will bypass

c redundant pde discretization and argument checking

c and save computational time. Some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) cud2sp is being recalled for additional accuracy. In

c this case iguess=iparm(12)=1 should also be used.

c

c (2) cud2sp is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) cud2sp is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to cud2sp

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by cofx,cofy (see below) have

c changed since the previous call

c

c (e) any of the constant "alfa" coefficients input by bndyc

c (see below) have changed since the previous call.

c

c If any of (a) through (e) are true then the elliptic PDE

c must be discretized or rediscretized. If none of (a)

c through (e) holds, calls can be made with intl=1.

c Incorrect calls with intl=1 will produce erroneous results.

c *** The values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.
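c
c *** a minimal sketch of how the "better choice" above can be
c found for one direction: factors of 2 are moved from ixp into
c the exponent iex while ixp stays at least 2. the name nxfact is
c hypothetical and the routine is not part of mudpack.
c
```fortran
      subroutine nxfact(nx,ixp,iex)
c     split nx-1 = ixp*2**(iex-1) with ixp as small as possible.
c     the caller should reject results with ixp < 2.
      implicit none
      integer nx,ixp,iex
      ixp = nx-1
      iex = 1
c     halve ixp and raise iex while ixp is even and at least 4
      do while (mod(ixp,2).eq.0 .and. ixp.ge.4)
        ixp = ixp/2
        iex = iex+1
      end do
      return
      end
```
c
c for nx=33 this returns ixp=2, iex=5 and for ny=97 (calling
c with jyq,jey in place of ixp,iex) it returns 3 and 6, in
c agreement with the example above.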

c

c

c *** note

c

c let G be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c
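
c *** worked example of the grid chain (an illustrative sketch, not

c part of mudpack; the program name "grids" is hypothetical). it

c evaluates the two formulas above for ixp=2, jyq=3, iex=5, jey=6:

c
c       program grids
c       integer ixp,jyq,iex,jey,n,k,mx,my
c       ixp = 2
c       jyq = 3
c       iex = 5
c       jey = 6
c       n = max0(iex,jey)
c *** loop from the coarsest grid G(1) to the finest grid G(n)
c       do k=1,n
c         mx = ixp*2**(max0(iex+k-n,1)-1) + 1
c         my = jyq*2**(max0(jey+k-n,1)-1) + 1
c         write(*,*) k,mx,my
c       end do
c       end
c

c the six grids are 3 by 4, 3 by 7, 5 by 13, 9 by 25, 17 by 49 and

c 33 by 97. because iex < jey, the x grid already has its coarsest

c size (ixp+1 = 3) on G(2), so only y is coarsened between G(2)

c and G(1).

c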

c

c

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-Dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-Dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c
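
c *** sketch of the correction approach for a time independent "l"

c (illustration only, not part of mudpack). it assumes that an intl=0

c call has already been made, that all boundaries are specified with

c values that do not change in time (so e is zero there), and that

c the argument order of cud2sp is (iparm,fparm,work,bndyc,cofx,cofy,

c rhs,phi,mgopt,ierror). the arrays p,e,rold,rnew,rdif, the step

c count nt and the routine newrhs are hypothetical user names.

c
c       iparm(1) = 1
c       iparm(12) = 0
c       do it=1,nt
c *** hypothetical user routine that fills rnew with r(t+dt)
c         call newrhs(it,rnew)
c         do j=1,ny
c           do i=1,nx
c             rdif(i,j) = rnew(i,j)-rold(i,j)
c             e(i,j) = (0.0,0.0)
c           end do
c         end do
c *** solve l(e) = r(t+dt)-r(t) with full multigrid cycling
c         call cud2sp(iparm,fparm,work,bndyc,cofx,cofy,rdif,e,mgopt,
c      +              ierror)
c *** update the solution and save the right hand side
c         do j=1,ny
c           do i=1,nx
c             p(i,j) = p(i,j)+e(i,j)
c             rold(i,j) = rnew(i,j)
c           end do
c         end do
c       end do
c

c the extra arrays e, rold and rdif are the additional storage

c mentioned above.

c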

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps are

c performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few cycles are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y direction

c

c

c *** choice of method. . .

c

c let fx represent the quantity cxx(x,y)/dlx**2 over the solution region.

c

c let fy represent the quantity cyy(x,y)/dly**2 over the solution region

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2

c

c if neither fx nor fy dominates over the solution region and they

c both vary considerably choose method = 3.

c

c

c ... length = iparm(15)

c

c the length of the complex work space provided in vector work (see below).

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if (method = 1 or method = 3) and nxa.ne.0

c let isx = 5 if (method = 1 or method = 3) and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if (method = 2 or method = 3) and nyc.ne.0

c let jsy = 5 if (method = 2 or method = 3) and nyc.eq.0

c then . . .

c

c length = nx*ny*(5+3*(isx+jsy)/2)+ 10*(nx+ny)

c

c will suffice in all cases but very small nx and ny.

c the exact minimal work space length required for the

c current set of input arguments is output in iparm(16)

c (even if iparm(15) is too small). this will usually

c be less than the value given by the simplified formula

c above. * Notice that cud2sp requires considerably less

c work space than the nonseparable solver cud2 if

c and only if method=0 is chosen.

c
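
c *** worked example of the simplified formula (illustrative

c arithmetic only). for a 33 by 97 grid with method = 0, isx = jsy = 0

c and

c
c       length = 33*97*5 + 10*(33+97) = 17305
c

c with method = 3 and no periodic boundaries, isx = jsy = 3 and

c
c       length = 33*97*(5+3*(3+3)/2) + 10*(33+97) = 46114
c

c so line relaxation in both directions needs well over twice the

c work space of point relaxation on this grid.

c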

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in cud2sp and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(cabs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(cabs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).

c
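
c *** the quantity compared with tolmax can be sketched as the

c function below (illustration only; the name reldif and its

c arguments are hypothetical and this is not the code used inside

c cud2sp). the phmax = 0.0 branch follows the description of the

c output argument fparm(6).

c
c       real function reldif(nx,ny,phi1,phi2)
c       integer nx,ny,i,j
c       complex phi1(nx,ny),phi2(nx,ny)
c       real phdif,phmax
c       phdif = 0.0
c       phmax = 0.0
c *** maximum difference and maximum modulus over the finest grid
c       do j=1,ny
c         do i=1,nx
c           phdif = amax1(phdif,cabs(phi2(i,j)-phi1(i,j)))
c           phmax = amax1(phmax,cabs(phi2(i,j)))
c         end do
c       end do
c       if (phmax .gt. 0.0) then
c         reldif = phdif/phmax
c       else
c         reldif = phdif
c       end if
c       return
c       end
c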

c ... work

c

c a one dimensional complex saved work space (see iparm(15) for

c length) which must be preserved from the previous call when

c calling with intl=iparm(1)=1.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,gbdy) which

c are used to input mixed boundary conditions to cud2sp. bndyc

c must be declared "external" in the program calling cud2sp.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tcud2sp.f" for an example of how bndyc

c can be set up).

c

c          * * * * * * * * * * * *  y=yd
c          *       kbdy=4        *
c          *                     *
c          *                     *
c          *                     *
c          * kbdy=1       kbdy=2 *
c          *                     *
c          *                     *
c          *                     *
c          *       kbdy=3        *
c          * * * * * * * * * * * *  y=yc
c
c          x=xa                  x=xb

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,gbdy corresponding to alfxa,gbdxa(y) must be returned

c

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y, will be input to bndyc and

c alfa,gbdy corresponding to alfxb,gbdxb(y) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyc,gbdyc(x) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyd,gbdyd(x) must be returned.

c

c

c *** alfxa,alfxb,alfyc,alfyd must be complex constants and gbdy type

c complex for cud2sp. Use cud2 if any of these depend on x or y.

c bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c cud2sp will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling cud2sp. the actual name chosen may

c be different.

c
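
c *** a minimal bndyc sketch (illustration only). it assumes that

c nxb=iparm(3)=2 is the only mixed boundary, so bndyc is called only

c with kbdy=2, and it uses the made-up values alfxb = (1.0,0.0)

c and gbdxb(y) = (y,0.0).

c
c       subroutine bndyc(kbdy,xory,alfa,gbdy)
c       integer kbdy
c       real xory
c       complex alfa,gbdy
c       if (kbdy .eq. 2) then
c *** edge x=xb: dp/dx + alfxb*p(xb,y) = gbdxb(y), xory holds y
c         alfa = (1.0,0.0)
c         gbdy = cmplx(xory,0.0)
c       end if
c       return
c       end
c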

c

c ... cofx

c

c a subroutine with arguments (x,cxx,cx,cex) which provides

c the known x dependent complex coefficients for the separable

c elliptic pde at any x grid point. the name chosen in the calling

c routine may be different. the coefficient routine must be declared

c "external" in the calling routine.

c

c ... cofy

c

c a subroutine with arguments (y,cyy,cy,cey) which provides

c the known y dependent complex coefficients for the separable

c elliptic pde at any y grid point. the name chosen in the calling

c routine may be different. the coefficient routine must be declared

c "external" in the calling routine.

c
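
c *** a sketch of the two coefficient routines (illustration only;

c the coefficient values are made up and carry no special meaning).

c each routine receives one coordinate and returns the three complex

c coefficients that depend on it.

c
c       subroutine cofx(x,cxx,cx,cex)
c       real x
c       complex cxx,cx,cex
c       cxx = cmplx(1.0+x*x,1.0)
c       cx = (0.0,0.0)
c       cex = (0.0,0.0)
c       return
c       end
c
c       subroutine cofy(y,cyy,cy,cey)
c       real y
c       complex cyy,cy,cey
c       cyy = (1.0,1.0)
c       cy = cmplx(y,0.0)
c       cey = (-1.0,0.0)
c       return
c       end
c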

c ... rhs

c

c a complex array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c a complex array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...nx

c prior to calling cud2sp. these values are preserved by cud2sp.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all internal and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c
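
c *** the two choices described above can be set as follows (a short

c sketch using only the values given in this description):

c
c       integer mgopt(4)
c *** default options: w(2,1) cycles with cubic prolongation
c       mgopt(1) = 0
c *** alternative: v(2,1) cycles with linear prolongation
c       mgopt(1) = 1
c       mgopt(2) = 2
c       mgopt(3) = 1
c       mgopt(4) = 1
c

c only one of the two settings would be used in a given program.

c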

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction.

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4
c                              * .               .
c                             *   .             .
c ---------------2-----------1-----2-----------1--------- level 3
c               * .         .       .         .
c              *   .       .         .       .
c ------2-----1-----2-----1-----------2-----1------------ level 2
c      * .   .       .   .             .   .
c     *   . .         . .               . .
c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4
c                        * .                         .
c ----------2-----------1---2-----------3-----------1---- level 3
c          * .         .     .         . .         .
c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2
c    * . .     . . . .         . . . .     . . . .
c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc

c

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see input arguments).

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0. only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0).

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(cabs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(cabs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.

c

c

c ... work

c

c on output work contains intermediate values that must not

c be destroyed if cud2sp is to be called again with intl=1

c

c

c ... phi *** for intl=1 calls only

c

c on output phi(i,j) contains the approximation to p(xi,yj)

c for all mesh points i = 1,...,nx and j=1,...,ny. the last

c computed iterate in phi is returned even if convergence is

c not obtained

c

c

c ... ierror

c

c For intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. Argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c cabs(cx)*dlx > 2.*cabs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c cabs(cy)*dly > 2.*cabs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = cmplx(0.5*cabs(cx)*dx,0.0)

c

c cyy = cmplx(0.5*cabs(cy)*dy,0.0)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made when necessary to preserve convergence. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = (0.0,0.0) for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 unlike cud2 and cud2cr, there is no ellipticity test with cud2sp

c so this flag is not set

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(5)>0.

c is not obtained in maxcy=iparm(13) multigrid cycles between the

c coarsest (ixp+1,jyq+1) and finest (nx,ny) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on the first call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd

c in iparm(2),iparm(3),iparm(4),iparm(5) are not 0,1 or 2

c or if nxa,nxb or nyc,nyd are not pairwise zero.

c

c = 3 if min0(ixp,jyq) < 2 (ixp = iparm(6), jyq = iparm(7))

c

c = 4 if min0(iex,jey) < 1 (iex = iparm(8), jey = iparm(9)) or

c if max0(iex,jey) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or ny.ne.jyq*2**(jey-1)+1

c (nx = iparm(10), ny = iparm(11))

c

c = 6 if iguess = iparm(12) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(13) < 1

c

c = 8 if method = iparm(14) is not 0,1,2, or 3

c

c = 9 if length = iparm(15) is too small (see iparm(16) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4))

c

c =11 if tolmax = fparm(5) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c
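
c *** a sketch of how the error flags might be handled around the

c two calls (illustration only). it assumes that iparm, fparm, mgopt,

c work, rhs and phi have been set as described above, that bndyc,

c cofx and cofy are declared external, and that the argument order

c of cud2sp is (iparm,fparm,work,bndyc,cofx,cofy,rhs,phi,mgopt,ierror).

c
c *** argument checking and discretization call
c       iparm(1) = 0
c       call cud2sp(iparm,fparm,work,bndyc,cofx,cofy,rhs,phi,mgopt,
c      +            ierror)
c       if (ierror .eq. 9) write(*,*) 'work too short, need ',iparm(16)
c       if (ierror .gt. 0) stop
c *** approximation call (negative ierror values are only warnings)
c       iparm(1) = 1
c       call cud2sp(iparm,fparm,work,bndyc,cofx,cofy,rhs,phi,mgopt,
c      +            ierror)
c       if (ierror .eq. -1) write(*,*) 'no convergence, ',fparm(6)
c

c stopping only on positive values lets the nonfatal warnings

c (ierror = -5, -4 or -3) from the first call pass through.

c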

c *********************************************************

c *********************************************************

c

c end of cud2sp documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file cud3.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... required MUDPACK files

c

c cudcom.f, cud3ln.f, cud3pn.f

c

c ... purpose

c

c subroutine cud3 automatically discretizes and attempts to compute

c the second order finite difference approximation to a COMPLEX

c 3-d linear nonseparable elliptic partial differential equation

c on a box. the approximation is generated on a uniform grid

c covering the box (see mesh description below). boundary

c conditions may be any combination of mixed, specified (Dirichlet)

c or periodic. the form of the pde solved is . . .

c

c cxx(x,y,z)*pxx + cyy(x,y,z)*pyy + czz(x,y,z)*pzz +

c

c cx(x,y,z)*px + cy(x,y,z)*py + cz(x,y,z)*pz +

c

c ce(x,y,z)*p(x,y,z) = r(x,y,z)

c

c here cxx,cyy,czz,cx,cy,cz,ce are the known complex coefficients

c of the pde; pxx,pyy,pzz,px,py,pz are the second and first partial

c derivatives of the unknown complex solution function p(x,y,z)

c with respect to the independent variables x,y,z; r(x,y,z) is

c the known complex right hand side of the elliptic pde. cxx,cyy

c and czz should have real and imaginary parts positive for all (x,y,z)

c in the solution region.

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny by nz grid.

c the grid is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd] x [ze,zf].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1), dlz = (zf-ze)/(nz-1)

c

c be the uniform grid increments in the x,y,z directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly, zk = ze+(k-1)*dlz

c

c for i=1,...,nx; j=1,...,ny; k=1,...,nz denote the x,y,z uniform

c mesh points.

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10]J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 23 used to efficiently pass

c integer arguments. iparm is set internally in cud3

c and defined as follows . . .

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formulas.

c

c *** An approximation is NOT generated after an intl=0 call!

c cud3 should be called with intl=1 to approximate the elliptic

c PDE discretized by the intl=0 call. intl=1 should also

c be input if cud3 has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. This will bypass

c redundant pde discretization and argument checking

c and save computational time. Some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) cud3 is being recalled for additional accuracy. In

c this case iguess=iparm(17)=1 should also be used.

c

c (2) cud3 is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) cud3 is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to cud3

c (b) any of the integer arguments other than iguess=iparm(17)

c or maxcy=iparm(18) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(7) have changed since the previous call

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c If any of (a) through (e) are true then the elliptic PDE

c must be discretized or rediscretized. If none of (a)

c through (e) holds, calls can be made with intl=1.

c Incorrect calls with intl=1 will produce erroneous results.

c *** The values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the (y,z) plane x=xa

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xa,y,z) is specified (this must be input thru phi(1,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see "bndyc" description below where kbdy = 1)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the (y,z) plane x=xb

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xb,y,z) is specified (this must be input thru phi(nx,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see "bndyc" description below where kbdy = 2)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the (x,z) plane y=yc

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yc,z) is specified (this must be input thru phi(i,1,k))

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see "bndyc" description below where kbdy = 3)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the (x,z) plane y=yd

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yd,z) is specified (this must be input thru phi(i,ny,k))

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see "bndyc" description below where kbdy = 4)

c

c

c ... nze=iparm(6)

c

c flags boundary conditions on the (x,y) plane z=ze

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,ze) is specified (this must be input thru phi(i,j,1))

c = 2 if there are mixed derivative boundary conditions at z=ze

c (see "bndyc" description below where kbdy = 5)

c

c

c ... nzf=iparm(7)

c

c flags boundary conditions on the (x,y) plane z=zf

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,zf) is specified (this must be input thru phi(i,j,nz))

c = 2 if there are mixed derivative boundary conditions at z=zf

c (see "bndyc" description below where kbdy = 6)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(8)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(14)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(11)

c without changing nx = iparm(14)

c

c

c ... jyq = iparm(9)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(15)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(12)

c without changing ny = iparm(15)

c

c

c ... kzr = iparm(10)

c

c an integer greater than one which is used in defining the number

c of grid points in the z direction (see nz = iparm(16)). "kzr+1"

c is the number of points on the coarsest z grid visited during

c multigrid cycling. kzr should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the z direction is not used.

c kzr should be 2 or a small odd value since a power

c of 2 factor of kzr can be removed by increasing kez = iparm(13)

c without changing nz = iparm(16)

c

c

c ... iex = iparm(11)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(14)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(8)

c as small as possible within grid size constraints when

c defining nx = iparm(14).

c

c

c ... jey = iparm(12)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(15)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(9)

c as small as possible within grid size constraints when

c defining ny = iparm(15).

c

c

c ... kez = iparm(13)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the z direction (see nz = iparm(16)).

c kez .le. 50 is required. for efficient multigrid cycling,

c kez should be chosen as large as possible and kzr=iparm(10)

c as small as possible within grid size constraints when

c defining nz = iparm(16).

c

c

c ... nx = iparm(14)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(8), iex = iparm(11).

c

c

c ... ny = iparm(15)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(9), jey = iparm(12).

c

c

c ... nz = iparm(16)

c

c the number of equally spaced grid points in the interval [ze,zf]

c (including the boundaries). nz must have the form

c

c nz = kzr*(2**(kez-1)) + 1

c

c where kzr = iparm(10), kez = iparm(13)

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 65 by 97 grid. then

c ixp=2, jyq=4, kzr=6 and iex=jey=kez=5 could be used. a better

c choice would be ixp=jyq=2, kzr=3, and iex=5, jey=kez=6.
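c
c *** illustration (an editorial sketch, not part of mudpack)
c
c the decomposition in the example above can be sketched as follows.
c the names "gfact" and "split" are hypothetical. the sketch assumes
c n-1 is at least 2 and simply removes factors of 2 from n-1 until
c the remaining factor is 2 or odd, so the exponent is as large as
c possible. for n = 33, 65, 97 it arrives at the "better choice"
c quoted above.
c
```fortran
      program gfact
      implicit none
      integer n(3),ip,ie,i
      data n /33,65,97/
      do i=1,3
        call split(n(i),ip,ie)
        write(*,*) n(i),ip,ie
      end do
      end

      subroutine split(n,ip,ie)
c     find ip,ie with n = ip*2**(ie-1)+1 and ip equal to 2 or odd
      implicit none
      integer n,ip,ie
      ip = n-1
      ie = 1
c     remove factors of 2 while ip stays even and greater than 2
   10 if (mod(ip,2).eq.0 .and. ip.gt.2) then
        ip = ip/2
        ie = ie+1
        go to 10
      end if
      return
      end
```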

c

c *** note

c

c let G be the nx by ny by nz fine grid on which the approximation is

c generated and let n = max0(iex,jey,kez). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) by mz(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c

c mz(k) = kzr*[2**(max0(kez+k-n,1)-1)] + 1
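c
c *** illustration (an editorial sketch, not part of mudpack)
c
c a minimal sketch that tabulates mx(k),my(k),mz(k) from the
c formulas above. the program name "chain" is hypothetical and the
c grid size arguments are taken from the 33 by 65 by 97 example.
c
```fortran
      program chain
      implicit none
      integer ixp,jyq,kzr,iex,jey,kez,n,k,mx,my,mz
c     grid size arguments from the 33 by 65 by 97 example
      ixp = 2
      jyq = 2
      kzr = 3
      iex = 5
      jey = 6
      kez = 6
      n = max0(iex,jey,kez)
c     list the grids from coarsest (k=1) to finest (k=n)
      do k=1,n
        mx = ixp*2**(max0(iex+k-n,1)-1) + 1
        my = jyq*2**(max0(jey+k-n,1)-1) + 1
        mz = kzr*2**(max0(kez+k-n,1)-1) + 1
        write(*,*) k,mx,my,mz
      end do
      end
```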

c

c

c

c ... iguess=iparm(17)

c

c = 0 if no initial guess to the pde is provided

c and/or full multigrid cycling beginning at the

c coarsest grid level is desired.

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below). in this case

c cycling beginning or restarting at the finest grid

c is initiated.

c

c *** comments on iguess = 0 or 1 . . .

c

c

c setting iguess=0 forces full multigrid or "fmg" cycling. phi

c must be initialized at all grid points. it can be set to zero at

c non-Dirichlet grid points if nothing better is available. the

c values set in phi when iguess = 0 are passed down and serve

c as an initial guess to the pde at the coarsest grid level where

c multigrid cycling commences.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c *** time dependent problems . . .

c

c assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-Dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(18)

c

c the exact number of cycles executed between the finest

c (nx by ny by nz) and the coarsest ((ixp+1) by (jyq+1) by

c (kzr+1)) grid levels when tolmax=fparm(7)=0.0 (no error

c control). when tolmax=fparm(7).gt.0.0 is input (error control)

c then maxcy is a limit on the number of cycles between the

c finest and coarsest grid levels. in any case, at most

c maxcy*(iprer+ipost) relaxation sweeps are performed at the

c finest grid level (see iprer=mgopt(2),ipost=mgopt(3) below).

c when multigrid iteration is working "correctly" only a few

c cycles are required for convergence. large values for maxcy

c should not be required.

c

c

c ... method = iparm(19)

c

c this sets the method of relaxation (all relaxation

c schemes in mudpack use red/black type ordering)

c

c = 0 for gauss-seidel pointwise relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in the z direction

c

c = 4 for line relaxation in the x and y direction

c

c = 5 for line relaxation in the x and z direction

c

c = 6 for line relaxation in the y and z direction

c

c = 7 for line relaxation in the x,y and z direction

c

c = 8 for x,y planar relaxation

c

c = 9 for x,z planar relaxation

c

c =10 for y,z planar relaxation

c

c *** if nxa = 0 and nx = 3 at a grid level where line relaxation in the x

c direction is flagged then it will be replaced by gauss-seidel point

c relaxation at that grid level.

c

c *** if nyc = 0 and ny = 3 at a grid level where line relaxation in the y

c direction is flagged then it will be replaced by gauss-seidel point

c relaxation at that grid level.

c

c *** if nze = 0 and nz = 3 at a grid level where line relaxation in the z

c direction is flagged then it will be replaced by gauss-seidel point

c relaxation at that grid level.

c

c these adjustments are necessary since the simultaneous tri-diagonal

c solvers used with line periodic relaxation must have n > 2 where n

c is the number of unknowns (excluding the periodic point).

c *** choice of method

c

c this is very important for efficient convergence. in some cases

c experimentation may be required.

c

c let fx represent the quantity cxx(x,y,z)/dlx**2 over the solution box

c

c let fy represent the quantity cyy(x,y,z)/dly**2 over the solution box

c

c let fz represent the quantity czz(x,y,z)/dlz**2 over the solution box

c

c (0) if fx,fy,fz are roughly the same size and do not vary too

c much choose method = 0. if this fails try method = 7.

c

c (1) if fx is much greater than fy,fz and fy,fz are roughly the same

c size choose method = 1

c

c (2) if fy is much greater than fx,fz and fx,fz are roughly the same

c size choose method = 2

c

c (3) if fz is much greater than fx,fy and fx,fy are roughly the same

c size choose method = 3

c

c (4) if fx,fy are roughly the same and both are much greater than fz

c try method = 4. if this fails try method = 8

c

c (5) if fx,fz are roughly the same and both are much greater than fy

c try method = 5. if this fails try method = 9

c

c (6) if fy,fz are roughly the same and both are much greater than fx

c try method = 6. if this fails try method = 10

c

c (7) if fx,fy,fz vary considerably with none dominating try method = 7

c

c (8) if fx and fy are considerably greater than fz but not necessarily

c the same size (e.g., fx=1000.,fy=100.,fz=1.) try method = 8

c

c (9) if fx and fz are considerably greater than fy but not necessarily

c the same size (e.g., fx=10.,fy=1.,fz=1000.) try method = 9

c

c (10) if fy and fz are considerably greater than fx but not necessarily

c the same size (e.g., fx=1.,fy=100.,fz=10.) try method = 10

c

c

c ... meth2 = iparm(20) determines the method of relaxation used in the planes

c when method = 8 or 9 or 10.

c

c

c as above, let fx,fy,fz represent the quantities cxx/dlx**2,

c cyy/dly**2,czz/dlz**2 over the box.

c

c (if method = 8)

c

c = 0 for gauss-seidel pointwise relaxation

c in the x,y plane for each fixed z

c = 1 for line relaxation in the x direction

c in the x,y plane for each fixed z

c = 2 for line relaxation in the y direction

c in the x,y plane for each fixed z

c = 3 for line relaxation in the x and y direction

c in the x,y plane for each fixed z

c

c (1) if fx,fy are roughly the same and vary little choose meth2 = 0

c (2) if fx is much greater than fy choose meth2 = 1

c (3) if fy is much greater than fx choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c (if method = 9)

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

c in the x,z plane for each fixed y

c = 1 for simultaneous line relaxation in the x direction

c of the x,z plane for each fixed y

c = 2 for simultaneous line relaxation in the z direction

c of the x,z plane for each fixed y

c = 3 for simultaneous line relaxation in the x and z direction

c of the x,z plane for each fixed y

c

c (1) if fx,fz are roughly the same and vary little choose meth2 = 0

c (2) if fx is much greater than fz choose meth2 = 1

c (3) if fz is much greater than fx choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c (if method = 10)

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

c in the y,z plane for each fixed x

c = 1 for simultaneous line relaxation in the y direction

c of the y,z plane for each fixed x

c = 2 for simultaneous line relaxation in the z direction

c of the y,z plane for each fixed x

c = 3 for simultaneous line relaxation in the y and z direction

c of the y,z plane for each fixed x

c

c (1) if fy,fz are roughly the same and vary little choose meth2 = 0

c (2) if fy is much greater than fz choose meth2 = 1

c (3) if fz is much greater than fy choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c

c ... length = iparm(21)

c

c the length of the work space provided in vector work.

c

c let isx = 3 if method = 1,4,5 or 7 and nxa.ne.0

c let isx = 5 if method = 1,4,5 or 7 and nxa.eq.0

c let isx = 0 if method has any other value

c

c let jsy = 3 if method = 2,4,6 or 7 and nyc.ne.0

c let jsy = 5 if method = 2,4,6 or 7 and nyc.eq.0

c let jsy = 0 if method has any other value

c

c let ksz = 3 if method = 3,5,6 or 7 and nze.ne.0

c let ksz = 5 if method = 3,5,6 or 7 and nze.eq.0

c let ksz = 0 if method has any other value

c

c

c then (for method .le.7)

c

c (1) length = (nx+2)*(ny+2)*(nz+2)*(10+isx+jsy+ksz)

c

c or (for method.gt.7)

c

c (2) length = 14*(nx+2)*(ny+2)*(nz+2)

c

c will usually but not always suffice. The exact minimal length depends,

c in a complex way, on the grid size arguments and method chosen.

c *** It can be predetermined for the current input arguments by calling

c cud3 with length=iparm(21)=0 and printing iparm(22) or (in f90)

c dynamically allocating the work space using the value in iparm(22)

c in a subsequent cud3 call.
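c
c *** illustration (an editorial sketch, not part of mudpack)
c
c a minimal sketch of the estimates (1) and (2) above. the names
c "wklen" and "lenest" are hypothetical and default integers are
c assumed to be large enough to hold the result. as stated above,
c the estimate usually but not always suffices, so the value that
c cud3 returns in iparm(22) remains the reliable one.
c
```fortran
      program wklen
      implicit none
      integer lenest
c     33 by 65 by 97 grid, point relaxation, no periodic boundaries
      write(*,*) lenest(33,65,97,0,1,1,1)
      end

      integer function lenest(nx,ny,nz,method,nxa,nyc,nze)
c     work space estimate from formulas (1) and (2)
      implicit none
      integer nx,ny,nz,method,nxa,nyc,nze,isx,jsy,ksz
      isx = 0
      jsy = 0
      ksz = 0
c     line relaxation in x is used by method = 1,4,5 or 7
      if (method.eq.1 .or. method.eq.4 .or. method.eq.5 .or.
     +    method.eq.7) then
        isx = 3
        if (nxa.eq.0) isx = 5
      end if
c     line relaxation in y is used by method = 2,4,6 or 7
      if (method.eq.2 .or. method.eq.4 .or. method.eq.6 .or.
     +    method.eq.7) then
        jsy = 3
        if (nyc.eq.0) jsy = 5
      end if
c     line relaxation in z is used by method = 3,5,6 or 7
      if (method.eq.3 .or. method.eq.5 .or. method.eq.6 .or.
     +    method.eq.7) then
        ksz = 3
        if (nze.eq.0) ksz = 5
      end if
      if (method.le.7) then
        lenest = (nx+2)*(ny+2)*(nz+2)*(10+isx+jsy+ksz)
      else
        lenest = 14*(nx+2)*(ny+2)*(nz+2)
      end if
      return
      end
```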

c

c ... fparm

c

c a floating point vector of length 8 used to efficiently

c pass floating point arguments. fparm is set internally

c in cud3 and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... ze=fparm(5), zf=fparm(6)

c

c the range of the z independent variable. ze must

c be less than zf.

c

c

c ... tolmax = fparm(7)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j,k)

c and phi2(i,j,k) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(cabs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(cabs(phi2(i,j,k))) for all i,j,k

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(7)=0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).
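c
c *** illustration (an editorial sketch, not part of mudpack)
c
c a minimal sketch of the relative difference phdif/phmax defined
c above, evaluated for two made-up iterates on a 3 by 3 by 3 grid.
c the program name "reldif" and the data are hypothetical. this is
c not the code used inside cud3. the degenerate case phmax = 0.0
c is handled by leaving phdif undivided.
c
```fortran
      program reldif
      implicit none
      integer nx,ny,nz
      parameter (nx=3,ny=3,nz=3)
      complex phi1(nx,ny,nz),phi2(nx,ny,nz)
      real phdif,phmax
      integer i,j,k
c     two made-up iterates that differ by a small constant shift
      do k=1,nz
        do j=1,ny
          do i=1,nx
            phi1(i,j,k) = cmplx(float(i+j),float(k))
            phi2(i,j,k) = phi1(i,j,k) + cmplx(1.0e-4,0.0)
          end do
        end do
      end do
c     maximum moduli over all grid points
      phdif = 0.0
      phmax = 0.0
      do k=1,nz
        do j=1,ny
          do i=1,nx
            phdif = amax1(phdif,cabs(phi2(i,j,k)-phi1(i,j,k)))
            phmax = amax1(phmax,cabs(phi2(i,j,k)))
          end do
        end do
      end do
c     divide only when phmax is positive
      if (phmax.gt.0.0) phdif = phdif/phmax
      write(*,*) phdif
      end
```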

c

c

c ... work

c

c a complex one dimensional array that must be provided for work space.

c see length = iparm(21). the values in work must be preserved

c if cud3 is called again with intl=iparm(1).ne.0 or if cud34

c is called to improve accuracy.

c

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,yorz,alfa,gbdy)

c which is used to input mixed boundary conditions to cud3.

c the boundaries are numbered one thru six and the form of

c conditions are described below.

c

c

c (1) the kbdy=1 boundary

c

c this is the (y,z) plane x=xa where nxa=iparm(2) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa(y,z)*p(xa,y,z) = gbdxa(y,z)

c

c in this case kbdy=1,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxa(y,z),gbdxa(y,z) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the (y,z) plane x=xb where nxb=iparm(3) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb(y,z)*p(xb,y,z) = gbdxb(y,z)

c

c in this case kbdy=2,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxb(y,z),gbdxb(y,z) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the (x,z) plane y=yc where nyc=iparm(4) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc(x,z)*p(x,yc,z) = gbdyc(x,z)

c

c in this case kbdy=3,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyc(x,z),gbdyc(x,z) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the (x,z) plane y=yd where nyd=iparm(5) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd(x,z)*p(x,yd,z) = gbdyd(x,z)

c

c in this case kbdy=4,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyd(x,z),gbdyd(x,z) must be returned.

c

c

c (5) the kbdy=5 boundary

c

c this is the (x,y) plane z=ze where nze=iparm(6) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfze(x,y)*p(x,y,ze) = gbdze(x,y)

c

c in this case kbdy=5,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfze(x,y),gbdze(x,y) must be returned.

c

c

c (6) the kbdy=6 boundary

c

c this is the (x,y) plane z=zf where nzf=iparm(7) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfzf(x,y)*p(x,y,zf) = gbdzf(x,y)

c

c in this case kbdy=6,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfzf(x,y),gbdzf(x,y) must be returned.

c

c

c *** alfxa,alfyc,alfze nonpositive and alfxb,alfyd,alfzf nonnegative

c will help maintain matrix diagonal dominance during discretization

c aiding convergence.

c

c *** bndyc must provide the mixed boundary condition

c values in correspondence with those flagged in iparm(2)

c thru iparm(7). if all boundaries are specified then

c cud3 will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared

c external in the routine calling cud3. the actual

c name chosen may be different.
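c
c *** illustration (an editorial sketch, not part of mudpack)
c
c a minimal sketch of a boundary routine for the case where only
c the z=zf boundary is flagged as mixed (nzf=iparm(7)=2). the
c condition dp/dz + p = x + y is hypothetical. declaring xory,yorz
c as default real and alfa,gbdy as complex is an assumption here.
c the driver "tbndyc" only evaluates the routine at one point.
c
```fortran
      program tbndyc
      implicit none
      complex alfa,gbdy
      external bndyc
      call bndyc(6,0.5,0.25,alfa,gbdy)
      write(*,*) alfa,gbdy
      end

      subroutine bndyc(kbdy,xory,yorz,alfa,gbdy)
c     hypothetical mixed condition dp/dz + p = x + y at z=zf
      implicit none
      integer kbdy
      real xory,yorz
      complex alfa,gbdy
      if (kbdy.eq.6) then
c       on z=zf the inputs are xory=x and yorz=y
        alfa = cmplx(1.0,0.0)
        gbdy = cmplx(xory+yorz,0.0)
      else
c       placeholder for boundaries that are not flagged as mixed
        alfa = cmplx(0.0,0.0)
        gbdy = cmplx(0.0,0.0)
      end if
      return
      end
```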

c

c

c ... coef

c

c a subroutine with arguments (x,y,z,cxx,cyy,czz,cx,cy,cz,ce)

c which provides the known complex coefficients for the elliptic pde

c at any grid point (x,y,z). the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c external.
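c
c *** illustration (an editorial sketch, not part of mudpack)
c
c a minimal sketch of a coefficient routine with the argument list
c described above. the coefficient values are hypothetical, and the
c default real and complex declarations are assumptions. the driver
c "tcoef" only evaluates the routine at one point.
c
```fortran
      program tcoef
      implicit none
      complex cxx,cyy,czz,cx,cy,cz,ce
      external coef
      call coef(0.5,0.5,0.5,cxx,cyy,czz,cx,cy,cz,ce)
      write(*,*) cxx,cyy,czz
      write(*,*) cx,cy,cz,ce
      end

      subroutine coef(x,y,z,cxx,cyy,czz,cx,cy,cz,ce)
c     hypothetical complex coefficients at the point (x,y,z)
      implicit none
      real x,y,z
      complex cxx,cyy,czz,cx,cy,cz,ce
c     second order terms
      cxx = cmplx(1.0,0.0)
      cyy = cmplx(1.0,0.0)
      czz = cmplx(1.0+z*z,0.0)
c     no first order terms in this sketch
      cx = cmplx(0.0,0.0)
      cy = cmplx(0.0,0.0)
      cz = cmplx(0.0,0.0)
c     zero order term with a nonzero imaginary part
      ce = cmplx(-(x*x+y*y),-1.0)
      return
      end
```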

c

c ... rhs

c

c a complex array dimensioned nx by ny by nz which contains

c the given right hand side values on the uniform 3-d mesh.

c rhs(i,j,k) = r(xi,yj,zk) for i=1,...,nx and j=1,...,ny

c and k=1,...,nz.

c

c ... phi

c

c a complex array dimensioned nx by ny by nz . on input phi must

c contain specified boundary values and an initial guess

c to the solution if flagged (see iguess=iparm(17)=1). for

c example, if nyd=iparm(5)=1 then phi(i,ny,k) must be set

c equal to p(xi,yd,zk) for i=1,...,nx and k=1,...,nz prior to

c calling cud3. the specified values are preserved by cud3.

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at non-Dirichlet grid points (this is not

c checked). these values are projected down and serve as an initial

c guess to the pde at the coarsest grid level. set phi to 0.0 at

c non-Dirichlet grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below).

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4

c * . .

c * . .

c ---------------2-----------1-----2-----------1--------- level 3

c * . . . .

c * . . . .

c ------2-----1-----2-----1-----------2-----1------------ level 2

c * . . . . . .

c * . . . . . .

c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4

c * . .

c ----------2-----------1---2-----------3-----------1---- level 3

c * . . . . . .

c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2

c * . . . . . . . . . . . . . .

c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
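c
c *** illustration (an editorial sketch, not part of mudpack)
c
c the cost of the recursion above can be sketched by counting how
c often each level is entered during one call of cgc at the finest
c level n. since each call at level k invokes cgc kcycle times at
c level k-1, level k is entered kcycle**(n-k) times. the program
c name "cgccnt" is hypothetical and kcycle at least 1 is assumed.
c the third column is the number of relaxation sweeps at that level.
c
```fortran
      program cgccnt
      implicit none
      integer n,kcycle,iprer,ipost,k,nvisit
c     four levels with w(2,1) cycling
      n = 4
      kcycle = 2
      iprer = 2
      ipost = 1
      do k=n,1,-1
c       level k is entered kcycle**(n-k) times per call at level n
        nvisit = kcycle**(n-k)
        write(*,*) k,nvisit,nvisit*(iprer+ipost)
      end do
      end
```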

c

c ***********************************************************************

c ****output arguments**************************************************

c ***********************************************************************

c

c

c ... iparm(22)

c

c on output iparm(22) contains the actual complex work space length

c required for the current grid sizes and method. This value

c will be computed and returned even if iparm(21) is less than

c iparm(22) (see ierror=9).

c

c

c ... iparm(23)

c

c if error control is selected (tolmax = fparm(7) .gt. 0.0) then

c on output iparm(23) contains the actual number of cycles executed

c between the coarsest and finest grid levels in obtaining the

c approximation in phi. the quantity (iprer+ipost)*iparm(23) is

c the number of relaxation sweeps performed at the finest grid level.

c

c

c ... fparm(8)

c

c on output fparm(8) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(8) is computed only if there is error control (tolmax.gt.0.)

c assume phi1(i,j,k) and phi2(i,j,k) are the last two computed

c values for phi(i,j,k) at all points of the finest grid level.

c if we define

c

c phdif = max(cabs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(cabs(phi2(i,j,k))) for all i,j,k

c

c then

c

c fparm(8) = phdif/phmax

c

c is returned whenever phmax.gt.0.0. in the degenerate case

c phmax = 0.0, fparm(8) = phdif is returned.

c

c

c

c ... work

c

c on output work contains intermediate values that must not be

c destroyed if cud3 is to be called again with iparm(1)=1 or

c if cud34 is to be called to improve the estimate to fourth

c order.

c

c ... phi

c

c on output phi(i,j,k) contains the approximation to

c p(xi,yj,zk) for all mesh points i=1,...,nx; j=1,...,ny;

c k=1,...,nz. the last computed iterate in phi is returned

c even if convergence is not obtained (ierror=-1)

c

c ... ierror

c

c For intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. Argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c cabs(cx)*dlx > 2.*cabs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c cabs(cy)*dly > 2.*cabs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c if cabs(cxx) < 0.5*cabs(cx)*dx then

c cxx = cmplx(0.5*cabs(cx)*dx,0.0)

c

c (and)

c

c if cabs(cyy) < 0.5*cabs(cy)*dy then

c cyy = cmplx(0.5*cabs(cy)*dy,0.0)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.
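c
c *** illustration (an editorial sketch, not part of mudpack)
c
c a minimal sketch of the x direction test and adjustment described
c above, applied to one made-up pair of coefficients. the program
c name "hypchk" and the values are hypothetical. this is not the
c code used inside cud3.
c
```fortran
      program hypchk
      implicit none
      complex cxx,cx
      real dx
c     made-up coefficients and mesh size
      cxx = cmplx(1.0e-3,0.0)
      cx = cmplx(1.0,0.0)
      dx = 0.1
c     the first order term dominates when cabs(cx)*dx > 2.*cabs(cxx)
      if (cabs(cxx).lt.0.5*cabs(cx)*dx) then
        cxx = cmplx(0.5*cabs(cx)*dx,0.0)
      end if
      write(*,*) cxx
      end
```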

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y,z) = 0.0 for all x,y,z. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj))

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(7)>0.

c is not obtained in maxcy=iparm(18) multigrid cycles between the

c coarsest (ixp+1,jyq+1,kzr+1) and finest (nx,ny,nz) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags.
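The coefficient adjustment described under ierror = -4 above can be sketched in a few lines. The Python function below is an illustration only, not MUDPACK code: the name `adjust_second_order` and its return convention are assumptions, and Python's `abs` and `complex` stand in for the Fortran `cabs` and `cmplx` intrinsics.

```python
def adjust_second_order(cxx, cx, dx):
    """Apply the first-order adjustment to one second-order coefficient.

    Hypothetical helper, not part of MUDPACK. cxx and cx are the complex
    coefficients at one grid point and dx is the mesh size of the grid
    level. Returns (coefficient, adjusted), where adjusted tells whether
    the coefficient was replaced.
    """
    bound = 0.5 * abs(cx) * dx            # 0.5*cabs(cx)*dx in the text
    if abs(cxx) < bound:
        # corresponds to cxx = cmplx(0.5*cabs(cx)*dx, 0.0)
        return complex(bound, 0.0), True
    return cxx, False


# A point dominated by the first-order term is adjusted on a coarse mesh ...
coarse = adjust_second_order(1.0 + 0.0j, 100.0 + 0.0j, 0.1)
# ... while the same coefficients on a much finer mesh are left alone.
fine = adjust_second_order(1.0 + 0.0j, 100.0 + 0.0j, 0.001)
```

The two calls illustrate the remedy suggested in the text: with a smaller mesh size the test no longer triggers, so no adjustment is needed at that level.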

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd,nze,nzf

c in iparm(2) through iparm(7) is not 0,1 or 2 or if

c (nxa,nxb) or (nyc,nyd) or (nze,nzf) are not pairwise zero.

c

c = 3 if min0(ixp,jyq,kzr) < 2 (ixp=iparm(8),jyq=iparm(9),kzr=iparm(10))

c

c = 4 if min0(iex,jey,kez) < 1 (iex=iparm(11),jey=iparm(12),kez=iparm(13))

c or if max0(iex,jey,kez) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or if ny.ne.jyq*2**(jey-1)+1 or

c if nz.ne.kzr*2**(kez-1)+1 (nx=iparm(14),ny=iparm(15),nz=iparm(16))

c

c = 6 if iguess = iparm(17) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(18) < 1 (large values for maxcy should not be used)

c

c = 8 if method = iparm(19) is less than 0 or greater than 10 or

c if meth2 = iparm(20) is not 0 or 1 or 2 or 3 when method > 7.

c

c = 9 if length = iparm(21) is too small (see iparm(22) on output

c for minimum required work space length)

c

c =10 if any of: xa < xb or yc < yd or ze < zf is false

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4),ze=fparm(5),zf=fparm(6))

c

c =11 if tolmax = fparm(7) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3
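The grid-size conditions behind ierror = 3, 4 and 5 above can be checked before calling the solver. The sketch below is a hypothetical pre-check written in Python, not a MUDPACK routine; the function name is invented, and the order in which the three conditions are tested is an assumption.

```python
def check_grid_args(ixp, jyq, kzr, iex, jey, kez, nx, ny, nz):
    """Return 0, 3, 4 or 5 following the ierror descriptions above.

    Hypothetical helper: it only mirrors the documented conditions and
    does not call any MUDPACK code.
    """
    if min(ixp, jyq, kzr) < 2:
        return 3
    if min(iex, jey, kez) < 1 or max(iex, jey, kez) > 50:
        return 4
    if (nx != ixp * 2 ** (iex - 1) + 1
            or ny != jyq * 2 ** (jey - 1) + 1
            or nz != kzr * 2 ** (kez - 1) + 1):
        return 5
    return 0


# ixp=2, iex=5 gives nx = 2*2**4 + 1 = 33; kzr=3, kez=4 gives nz = 25.
ok = check_grid_args(2, 2, 3, 5, 5, 4, 33, 33, 25)
# nz = 24 does not have the form kzr*2**(kez-1)+1.
bad_nz = check_grid_args(2, 2, 3, 5, 5, 4, 33, 33, 24)
```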

c

c *********************************************************

c *********************************************************

c

c end of cud3 documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file cud34.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file cud34.d

c

c contains documentation for subroutine cud34(work,phi,ierror)

c A sample fortran driver is file "tcud34.f".

c

c ... required MUDPACK files

c

c cud3.f, cudcom.f, cud3ln.f, cud3pn.f

c

c ... purpose

c

c cud34 attempts to improve the estimate in phi, obtained by calling

c cud3, from second to fourth order accuracy. see the file "cud3.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "work,phi" which are also part of the argument list for

c cud3.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier cud3 call

c

c * arguments "work,phi" are the same used in calling cud3

c

c * "work,phi" have not changed since the last call to cud3

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections".

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.
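The deferred-correction idea summarized above can be tried on a one-dimensional model problem. The Python sketch below is an illustration under stated assumptions, not the cud34 implementation: it solves u'' = f with zero Dirichlet values, uses a direct tridiagonal solve in place of multigrid cycling, and has no first-order term, so only the fourth derivative enters the truncation error estimate. All function names and the one-sided boundary stencil are choices made for this example.

```python
import math


def solve_dirichlet(rhs, h):
    """Solve (u[i-1] - 2*u[i] + u[i+1]) / h**2 = rhs[i] with u = 0 at both
    end points, using the Thomas algorithm (a stand-in for multigrid)."""
    n = len(rhs)
    m = n - 2                               # number of interior unknowns
    cp = [0.0] * m
    dp = [0.0] * m
    cp[0] = 1.0 / -2.0
    dp[0] = h * h * rhs[1] / -2.0
    for i in range(1, m):
        denom = -2.0 - cp[i - 1]
        cp[i] = 1.0 / denom
        dp[i] = (h * h * rhs[i + 1] - dp[i - 1]) / denom
    u = [0.0] * n
    u[m] = dp[m - 1]
    for i in range(m - 2, -1, -1):
        u[i + 1] = dp[i] - cp[i] * u[i + 2]
    return u


def fourth_derivative(u, h):
    """Second-order finite difference estimate of u'''' at interior points."""
    n = len(u)
    d4 = [0.0] * n
    for i in range(2, n - 2):
        d4[i] = (u[i - 2] - 4 * u[i - 1] + 6 * u[i]
                 - 4 * u[i + 1] + u[i + 2]) / h ** 4
    # one-sided six-point stencil for the points next to each boundary
    w = (2.0, -9.0, 16.0, -14.0, 6.0, -1.0)
    d4[1] = sum(w[j] * u[j] for j in range(6)) / h ** 4
    d4[n - 2] = sum(w[j] * u[n - 1 - j] for j in range(6)) / h ** 4
    return d4


n = 33
h = 1.0 / (n - 1)
exact = [math.sin(math.pi * i * h) for i in range(n)]
f = [-math.pi ** 2 * e for e in exact]      # u'' = f for u = sin(pi*x)

u2 = solve_dirichlet(f, h)                  # second-order estimate
# truncation error estimate built from the second-order solution
tau = [h * h / 12.0 * d for d in fourth_derivative(u2, h)]
corr = solve_dirichlet(tau, h)              # correction, same operator
u4 = [a + b for a, b in zip(u2, corr)]      # improved estimate

err2 = max(abs(a - b) for a, b in zip(u2, exact))
err4 = max(abs(a - b) for a, b in zip(u4, exact))
```

On this model problem `err4` should come out much smaller than `err2`. The one-sided stencil used here needs six points, which is consistent with the minimum grid size assumption, although the text does not state which stencils MUDPACK itself uses.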

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10]J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of cud34 documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file cud34sp.d

c


c

c ... file cud34sp.d

c

c contains documentation for subroutine cud34sp(work,phi,ierror)

c A sample fortran driver is file "tcud34sp.f".

c

c ... required MUDPACK files

c

c cud3sp.f, cudcom.f

c

c ... purpose

c

c cud34sp attempts to improve the estimate in phi, obtained by calling

c cud3sp, from second to fourth order accuracy. see the file "cud3sp.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "work,phi" which are also part of the argument list for

c cud3sp.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier cud3sp call

c

c * arguments "work,phi" are the same used in calling cud3sp

c

c * "work,phi" have not changed since the last call to cud3sp

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections".

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10]J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of cud34sp documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file cud3cr.d

c


c

c ... file cud3cr.d

c

c contains documentation for:

c subroutine cud3cr(iparm,fparm,work,coef,bnd3cr,rhs,phi,mgopt,

c +icros,crsxy,crsxz,crsyz,tol,maxit,iouter,rmax,ierror)

c A sample fortran driver is file "tcud3cr.f".

c

c ... required MUDPACK files

c

c cudcom.f

c

c

c ... purpose

c

c subroutine cud3cr automatically discretizes and attempts to compute

c the second order finite difference approximation to a complex 3-d

c linear nonseparable elliptic partial differential equation with

c cross derivative terms on a box. the approximation is generated

c on a uniform grid covering the box (see mesh description below).

c boundary conditions may be any combination of oblique mixed

c derivative (see bnd3cr description below), specified (Dirichlet) or

c periodic. the form of the pde in operator notation is

c

c l(p) + lxyz(p) = r(x,y,z)

c

c where

c

c l(p) = cxx(x,y,z)*pxx + cyy(x,y,z)*pyy + czz(x,y,z)*pzz +

c

c cx(x,y,z)*px + cy(x,y,z)*py + cz(x,y,z)*pz +

c

c ce(x,y,z)*p(x,y,z)

c

c and

c

c lxyz(p) = cxy(x,y,z)*pxy + cxz(x,y,z)*pxz + cyz(x,y,z)*pyz

c

c here cxx,cyy,czz,cx,cy,cz,ce,cxy,cxz,cyz are the known complex

c coefficients of the pde; pxx,pyy,pzz,px,py,pz are the second and

c first partial derivatives of the unknown complex solution function p

c with respect to the independent variables x,y,z; pxy,pxz, and pyz

c are the second order mixed partial derivatives of p with respect

c to xy,xz, and yz. r(x,y,z) is the known complex right hand side.

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny by nz grid.

c the grid is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd] x [ze,zf].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1), dlz = (zf-ze)/(nz-1)

c

c be the uniform grid increments in the x,y,z directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly, zk = ze+(k-1)*dlz

c

c for i=1,...,nx; j=1,...,ny; k=1,...,nz denote the x,y,z uniform

c mesh points.
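The mesh formulas above translate directly into code. The short Python sketch below is only an illustration (the function name is invented); it builds the points for one coordinate direction using the same 1-based index i as the text.

```python
def mesh_points(a, b, n):
    """Return the n uniform mesh points a + (i-1)*d for i = 1,...,n,
    where d = (b-a)/(n-1) is the grid increment (dlx, dly or dlz)."""
    d = (b - a) / (n - 1)
    return [a + (i - 1) * d for i in range(1, n + 1)]


# x direction with xa = 0.0, xb = 1.0, nx = 5, so dlx = 0.25
x = mesh_points(0.0, 1.0, 5)
```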

c

c

c

c ... methods

c

c

c subroutine cud3cr is a recent addition to mudpack. details

c of the methods employed by the other solvers in mudpack are in

c [1,9,10]. [1,2,7,9,10] contain performance measurements on a variety

c of elliptic pdes (see "references" in the file "readme"). the multi-

c grid methods are described in documentation for the other solvers.

c

c *** cud3cr differs fundamentally from the other solvers in mudpack.

c the full pde including cross derivative terms is discretized on

c the INTERIOR of the solution region:

c

c xa < x < xb, yc < y < yd, ze < z < zf

c

c however, on nonspecified (non-Dirichlet) boundaries only l(p) is

c discretized and the cross derivative term lxyz(p) is moved to the

c right hand side of the pde and approximated by second order

c finite difference formula applied to a previous estimate in p(k-1).

c similarly, oblique mixed derivative boundary conditions (see bnd3cr)

c are converted to a "cud3" type mixed normal form using second-order

c finite difference formula applied to a previous estimate p(k-1) to

c approximate non-normal derivative components. for example if

c the mixed derivative condition

c

c py + a(x,z)*px + b(x,z)*pz + c(x,z)*p(x,yd,z) = gyd(x,z)

c

c is specified on the (x,z) plane of the upper y=yd boundary (see

c bnd3cr for kbdy=4 below) then cud3cr converts this to the mixed

c normal derivative form

c

c py + c(x,z)*p(x,yd,z) = h(k,x,z)

c

c where the modified right hand side h(k,x,z) is given by

c

c h(k,x,z) = gyd(x,z) - [a(x,z)*dx(p(k-1)) + b(x,z)*dz(p(k-1))].

c

c dx(p(k-1)) and dz(p(k-1)) are second order finite difference

c approximations to the nonnormal partial derivatives px,pz using the

c previous estimate in p(k-1).

c

c the result of full discretization on interior grid points and partial

c discretization with right hand side modifications on boundaries,

c is a linear system which we denote by

c

c D(p(k)) = r - Dxyz(p(k-1)).

c

c D is the coefficient matrix coming from the discretization and

c Dxyz(p(k-1)) stands for the right hand side modification obtained

c by approximating boundary cross derivative terms and/or nonnormal

c derivative components from mixed derivative boundary conditions

c with second order finite difference formula applied to p(k-1).

c with this notation, we formally describe the outer iteration employed

c by cud3cr:

c

c algorithm cud3cr

c .

c set k = 0

c .

c set p(0) = 0.0 for all nonspecified grid points

c .

c repeat

c

c .. k = k+1

c

c .. solve D(p(k)) = r - Dxyz(p(k-1)) using multigrid iteration

c

c .. set rmax(k) = ||p(k) - p(k-1)|| / ||p(k)||

c

c until (rmax(k) < tol or k = maxit)

c .

c end cud3cr

c

c tol is an error tolerance for convergence and maxit is a limit on

c the number of outer iterations. both are user prescribed input

c arguments to cud3cr. the maximum vector norm || || is used in

c computing the relative difference between successive estimates in

c rmax(k). large values for maxit should not be used.
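The outer iteration in "algorithm cud3cr" above can be sketched on a small linear system. The Python code below is an illustration, not MUDPACK code: `solve_d` and `apply_dxyz` are hypothetical callables standing in for the multigrid solve with D and for the lagged term Dxyz, the 2 by 2 example uses an exact solve, and the returned error flag follows the ierror = -10 convention only loosely. Unlike the documented tol = 0.0 behaviour, this sketch stores every rmax(k).

```python
def outer_iteration(solve_d, apply_dxyz, r, tol, maxit):
    """Lagged right-hand-side iteration following 'algorithm cud3cr'.

    solve_d(rhs) returns p with D(p) = rhs; apply_dxyz(p) returns Dxyz(p).
    Both are hypothetical stand-ins supplied by the caller.
    Returns (p, iouter, rmax, ierror).
    """
    if maxit < 1:
        raise ValueError("maxit must be at least 1")
    p_prev = [0.0] * len(r)                     # p(0) = 0.0
    rmax = []
    for k in range(1, maxit + 1):
        lag = apply_dxyz(p_prev)
        p = solve_d([ri - li for ri, li in zip(r, lag)])
        diff = max(abs(a - b) for a, b in zip(p, p_prev))
        # maximum norm; assumes p(k) is not identically zero
        rmax.append(diff / max(abs(a) for a in p))
        if tol > 0.0 and rmax[-1] < tol:
            return p, k, rmax, 0
        p_prev = p
    return p, maxit, rmax, (-10 if tol > 0.0 else 0)


def solve_d(rhs):
    # exact solve with D = [[4, 1], [1, 3]] (determinant 11)
    return [(3.0 * rhs[0] - rhs[1]) / 11.0,
            (-rhs[0] + 4.0 * rhs[1]) / 11.0]


def apply_dxyz(p):
    # lagged coupling X = [[0, 0.5], [0.5, 0]]
    return [0.5 * p[1], 0.5 * p[0]]


r = [7.0, 7.5]        # chosen so that (D + X) p = r has solution p = [1, 2]
p, iouter, rmax, ierror = outer_iteration(solve_d, apply_dxyz, r, 1.0e-10, 50)
```

Because the lagged term is small relative to D in this example, the relative differences shrink steadily and the loop stops well before `maxit`; with a very small `maxit` the same call returns the -10 flag together with the last iterate.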

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c

c ... references (partial list)

c

c for a complete list see "references" in the mudpack information and

c directory file "readme"

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., 1991, vol. 43, May 1991, pp. 79-94.

c

c [10]J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., 1993, vol. 53, February

c 1993, pp. 235-249

c .

c .

c .

c

c **********************************************************************

c *** arguments ********************************************************

c **********************************************************************

c

c arguments iparm,fparm,work,rhs,phi,coef,mgopt are the same as

c those input to cud3 (see cud3.d for a detailed description) with the

c following provisions:

c

c (1) the minimum required complex work space length for cud3cr is increased

c by approximately

c

c nx*ny*nz*(1+8*(icros(1)+icros(2)+icros(3))/7) +

c

c 2*(icros(1)+icros(2)+icros(3))*(nx*ny+nx*nz+ny*nz)

c

c words over the minimum work space required by cud3 (see icros

c description below). the exact minimal work space required

c by cud3cr for the current set of input arguments is output

c in iparm(22). the exact minimal work length required

c for the current method and grid size arguments can be

c predetermined by calling cud3cr with iparm(21)=0 and

c printing iparm(22), or (in fortran 90 codes) by dynamically

c allocating work using the value in iparm(22) in subsequent

c calls to cud3cr.

c

c (2) at least two calls to cud3cr are necessary to generate an

c approximation. intl=iparm(1)=0 is required on the first

c call. this call will do "once only" discretization, and

c set intermediate values in work which must be preserved

c for noninitial calls.

c

c (3) maxcy = iparm(18) must be 1 or 2 (see ierror = 7).

c

c (4) tolmax = fparm(7) = 0.0 is required. no "internal" error control

c is allowed within multigrid cycling (see cud3.d)

c

c (5) mgopt(1) = 0 is required. only the default multigrid

c options (W(2,1) cycles with cubic prolongation) can be used

c with cud3cr

c

c *** new arguments

c

c the arguments: bnd3cr,icros,crsxy,crsxz,crsyz,tol,maxit,iouter,rmax

c are all new to cud3cr. the error argument, ierror, has been expanded.

c these are all described below:

c

c

c ... bnd3cr(kbdy,xory,yorz,a,b,c,g)

c

c a subroutine with input arguments kbdy,xory,yorz and output

c arguments a,b,c,g. bnd3cr inputs OBLIQUE mixed derivative

c conditions at any of the six x,y,z boundaries to cud3cr.

c a,b,c,g are all type complex.

c the six boundary cases are described below:

c

c (1) the kbdy=1 boundary

c

c this is the (y,z) plane x=xa where nxa=iparm(2)=2 flags

c an oblique mixed boundary condition of the form

c

c px + axa(y,z)*py + bxa(y,z)*pz +cxa(y,z)*p(xa,y,z) = gxa(y,z)

c

c in this case kbdy=1,xory=y,yorz=z will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients axa(y,z),bxa(y,z),

c cxa(y,z),gxa(y,z) must be returned

c

c

c (2) the kbdy=2 boundary

c

c this is the (y,z) plane x=xb where nxb=iparm(3)=2 flags

c an oblique mixed boundary condition of the form

c

c px + axb(y,z)*py + bxb(y,z)*pz +cxb(y,z)*p(xb,y,z) = gxb(y,z)

c

c in this case kbdy=2,xory=y,yorz=z will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients axb(y,z),bxb(y,z),

c cxb(y,z),gxb(y,z) must be returned

c

c (3) the kbdy=3 boundary

c

c this is the (x,z) plane y=yc where nyc=iparm(4)=2 flags

c an oblique mixed boundary condition of the form

c

c py + ayc(x,z)*px + byc(x,z)*pz +cyc(x,z)*p(x,yc,z) = gyc(x,z)

c

c in this case kbdy=3,xory=x,yorz=z will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients ayc(x,z),byc(x,z),

c cyc(x,z),gyc(x,z) must be returned

c

c

c (4) the kbdy=4 boundary

c

c this is the (x,z) plane y=yd where nyd=iparm(5)=2 flags

c an oblique mixed boundary condition of the form

c

c py + ayd(x,z)*px + byd(x,z)*pz +cyd(x,z)*p(x,yd,z) = gyd(x,z)

c

c in this case kbdy=4,xory=x,yorz=z will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients ayd(x,z),byd(x,z),

c cyd(x,z),gyd(x,z) must be returned

c

c (5) the kbdy=5 boundary

c

c this is the (x,y) plane z=ze where nze=iparm(6)=2 flags

c an oblique mixed boundary condition of the form

c

c pz + aze(x,y)*px + bze(x,y)*py + cze(x,y)*p(x,y,ze) = gze(x,y)

c

c in this case kbdy=5,xory=x,yorz=y will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients aze(x,y),bze(x,y),

c cze(x,y),gze(x,y) must be returned

c

c (6) the kbdy=6 boundary

c

c this is the (x,y) plane z=zf where nzf=iparm(7)=2 flags

c an oblique mixed boundary condition of the form

c

c pz + azf(x,y)*px + bzf(x,y)*py + czf(x,y)*p(x,y,zf) = gzf(x,y)

c

c in this case kbdy=6,xory=x,yorz=y will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients azf(x,y),bzf(x,y),

c czf(x,y),gzf(x,y) must be returned

c

c

c bnd3cr must be declared "external" in the routine calling cud3cr

c where its name may be different. bnd3cr must be entered as a

c dummy subroutine even if there are no derivative boundary conditions.

c for an example of how to set up a subroutine to input derivative

c boundary conditions, see the test program tcud3cr.f
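The dispatch on kbdy described above can be pictured with a small stand-in. The Python function below is a hypothetical analogue of a user-written bnd3cr, not the Fortran interface itself: in Fortran the outputs a,b,c,g are returned through the argument list, while here they are returned as a tuple, and the coefficient values are made up for illustration. Only the kbdy=4 boundary (y=yd, with xory=x and yorz=z) carries an oblique condition in this example.

```python
def bnd3cr(kbdy, xory, yorz):
    """Hypothetical Python stand-in for a user-supplied bnd3cr.

    Returns the complex coefficients (a, b, c, g) of the oblique mixed
    derivative condition on boundary kbdy at the point (xory, yorz).
    """
    if kbdy == 4:
        # y = yd boundary: py + a*px + b*pz + c*p = g with made-up values
        x, z = xory, yorz
        a = complex(x, 0.0)
        b = complex(0.0, z)
        c = complex(1.0, 0.0)
        g = complex(x + z, 0.0)
        return a, b, c, g
    # no oblique condition is specified on the other boundaries here
    return 0j, 0j, 0j, 0j


a, b, c, g = bnd3cr(4, 0.5, 2.0)
```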

c

c ... icros

c

c an integer vector argument dimensioned 3 which flags the presence

c or absence of cross derivative terms in the pde as follows:

c

c icros(1) = 1 if cxy(x,y,z) is nonzero for any grid point (x,y,z)

c icros(1) = 0 if cxy(x,y,z) = (0.0,0.0) for all grid points (x,y,z)

c

c icros(2) = 1 if cxz(x,y,z) is nonzero for any grid point (x,y,z)

c icros(2) = 0 if cxz(x,y,z) = (0.0,0.0) for all grid points (x,y,z)

c

c icros(3) = 1 if cyz(x,y,z) is nonzero for any grid point (x,y,z)

c icros(3) = 0 if cyz(x,y,z) = (0.0,0.0) for all grid points (x,y,z)

c

c

c ... crsxy(x,y,z,cxy)

c

c if icros(1) = 1 then crsxy is a subroutine with arguments

c (x,y,z,cxy) which supplies the xy cross derivative coefficient

c cxy at the grid point (x,y,z). if icros(1) = 0 then crsxy

c is a dummy subroutine argument (i.e., it must be provided but

c will not be invoked).

c

c

c ... crsxz(x,y,z,cxz)

c

c if icros(2) = 1 then crsxz is a subroutine with arguments

c (x,y,z,cxz) which supplies the xz cross derivative coefficient

c cxz at the grid point (x,y,z). if icros(2) = 0 then crsxz

c is a dummy subroutine argument (i.e., it must be provided but

c will not be invoked).

c

c

c ... crsyz(x,y,z,cyz)

c

c if icros(3) = 1 then crsyz is a subroutine with arguments

c (x,y,z,cyz) which supplies the yz cross derivative coefficient

c cyz at the grid point (x,y,z). if icros(3) = 0 then crsyz

c is a dummy subroutine argument (i.e., it must be provided but

c will not be invoked).

c

c crsxy,crsxz,crsyz must be declared "external" in the routine

c calling cud3cr. the names chosen for these routines can be

c different (see tcud3cr.f for an example)

c

c ... tol

c

c tol is an error control argument for the outer iteration employed

c by cud3cr (see "methods" description above). if tol > 0.0 is input

c then tol is a relative error tolerance for convergence. the outer

c iteration terminates and convergence is deemed to have occurred at the

c k(th) iterate if the maximum relative difference, rmax(k), satisfies

c

c def

c rmax(k) = ||p(k) - p(k-1)||/ ||p(k)|| < tol.

c

c the last approximation p(maxit) is returned in phi even if

c convergence does not occur. the maximum norm || || is used.

c when tol = 0.0 is input, error control is not implemented and

c exactly maxit (see below) outer iterations are executed in cud3cr.

c the tol = 0.0 option eliminates unnecessary computation when

c the user is certain of the required value for maxit.

c

c

c ... maxit

c

c a limit on the outer iteration loop (see "method" description)

c used to approximate the 3-d pde with cross derivative terms when

c tol > 0.0. if tol = 0.0 is entered then exactly maxit outer

c iterations are performed and only rmax(maxit) is computed. the

c total number of relaxation sweeps performed at the finest grid

c level is bounded by 3*maxcy*maxit. large values for maxit should

c not be used.

c

c

c ***********************************************************************

c ****output arguments**************************************************

c ***********************************************************************

c

c

c ... iparm(22)

c

c on output iparm(22) contains the actual work space length

c required by cud3cr for the current grid sizes and method.

c this will be approximately

c nx*ny*nz*(1+8*(icros(1)+icros(2)+icros(3))/7) +

c

c 2*(icros(1)+icros(2)+icros(3))*(nx*ny+nx*nz+ny*nz)

c

c words longer than the space required by cud3 (see cud3.d)

c

c

c ... work

c

c on output work contains intermediate values that must not be

c destroyed if cud3cr is to be called again with iparm(1)=1

c and iparm(17)=1.

c

c

c ... phi

c

c on output phi(i,j,k) contains the approximation to

c p(xi,yj,zk) for all mesh points i=1,...,nx; j=1,...,ny;

c k=1,...,nz. the last computed iterate in phi is returned

c even if convergence is not obtained.

c

c

c ... iouter

c

c the number of outer iterations (see "method" description above)

c executed by cud3cr for the current call. maxit is an upper bound

c for iouter

c

c

c ... rmax (see tol,maxit descriptions above)

c

c a maxit dimensioned real vector. if tol > 0.0 is input then

c rmax(k) for k=1,...,iouter contain the maximum relative

c difference between successive estimates. rmax(k) is

c given by

c

c rmax(k) = ||p(k) - p(k-1)||/ ||p(k)||

c

c for k=1,...,iouter. the maximum norm || || is used. either

c iouter < maxit (convergence) or iouter = maxit is possible.

c if tol = 0.0 is input then exactly maxit outer iterations are

c executed and only rmax(maxit) is computed. in this case

c rmax(1),...,rmax(maxit-1) are set to 0.0. the tol = 0.0

c option eliminates unnecessary computation when the user is

c certain of the required value for maxit.

c

c

c ... ierror

c

c an integer error argument which indicates fatal errors when

c returned positive. the negative values -5,-4,-3,-2,-1 and

c ierror = 2,3,4,5,6,9,10 have the same meaning as described

c for cud3 (see cud3.d). in addition:

c

c *** new nonfatal error

c

c ierror = -10 if tol > 0.0 is input (error control) and convergence

c fails in maxit outer iterations. in this case the latest

c approximation p(maxit) is returned in phi (cud3cr can be recalled

c with iparm(1)=iparm(17)=1 to improve the approximation as long

c as all other arguments are unchanged)

c

c *** new fatal errors

c

c ... ierror

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls or if intl=0 and iguess=iparm(17)=1

c

c = 7 if maxcy = iparm(18) is not 1 or 2

c

c = 8 if method = iparm(19) is less than 0 or greater than 7.

c cud3cr does not allow planar relaxation. meth2=iparm(20)

c is not used or checked.

c

c =11 if tolmax = fparm(7) is not 0.0

c

c =12 if kcycle = mgopt(1) is not 0

c

c =13 if icros(1) or icros(2) or icros(3) is not 0 or 1

c

c =14 if tol < 0.0

c

c =15 if maxit < 1

c

c ***********************************************************************

c ***********************************************************************

c

c end of cud3cr documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file cud3sp.d

c


c

c ... file cud3sp.d

c

c contains documentation for:

c subroutine cud3sp(iparm,fparm,work,cofx,cofy,cofz,bndyc,rhs,phi,

c + mgopt,ierror)

c A sample fortran driver is file "tcud3sp.f".

c

c ... required MUDPACK files

c

c cudcom.f

c

c ... purpose

c

c subroutine cud3sp automatically discretizes and attempts to compute

c the second order finite difference approximation to a complex three-

c dimensional linear SEPARABLE elliptic partial differential

c equation on a box. the approximation is generated on a uniform

c grid covering the box (see mesh description below). boundary

c conditions may be any combination of mixed, specified (Dirichlet)

c or periodic. the form of the pde solved is . . .

c

c cxx(x)*pxx + cx(x)*px + cex(x)*p(x,y,z) +

c

c cyy(y)*pyy + cy(y)*py + cey(y)*p(x,y,z) +

c

c czz(z)*pzz + cz(z)*pz + cez(z)*p(x,y,z) = r(x,y,z)

c

c here cxx,cx,cex,cyy,cy,cey,czz,cz,cez are the known complex coefficients

c of the pde; pxx,pyy,pzz,px,py,pz are the second and first

c partial derivatives of the unknown solution function p(x,y,z)

c with respect to the independent variables x,y,z; r(x,y,z)

c is the known complex right hand side of the elliptic pde.

c

c SEPARABILITY means:

c

c cxx,cx,cex depend only on x

c cyy,cy,cey depend only on y

c czz,cz,cez depend only on z

c

c For example, Laplace's equation in Cartesian coordinates is separable.

c Nonseparable elliptic PDEs can be approximated with cud3.

c cud3sp requires considerably less work space then cud3.

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny by nz grid.

c the grid is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd] x [ze,zf].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1), dlz = (zf-ze)/(nz-1)

c

c be the uniform grid increments in the x,y,z directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly, zk = ze+(k-1)*dlz

c

c for i=1,...,nx; j=1,...,ny; k=1,...,nz denote the x,y,z uniform

c mesh points.
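c
c *** sketch (not part of mudpack): the grid increment and mesh point
c formulas above written as a small helper. the name "meshpt" and
c the use of default real arguments are assumptions made for this
c illustration only.

```fortran
!     hypothetical helper: uniform increment dlx and i-th mesh
!     point xi on [xa,xb] covered by nx equally spaced points
      subroutine meshpt(xa,xb,nx,i,dlx,xi)
      implicit none
      integer nx,i
      real xa,xb,dlx,xi
!     dlx = (xb-xa)/(nx-1) as in the mesh description
      dlx = (xb-xa)/float(nx-1)
!     xi = xa+(i-1)*dlx for i=1,...,nx
      xi = xa+float(i-1)*dlx
      return
      end
```

c the same routine serves the y and z directions when called with
c (yc,yd,ny,j) or (ze,zf,nz,k).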

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 22 used to efficiently pass

c integer arguments. iparm is set internally in cud3sp

c and defined as follows . . .

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formula.

c

c *** An approximation is NOT generated after an intl=0 call!

c cud3sp should be called with intl=1 to approximate the elliptic

c PDE discretized by the intl=0 call. intl=1 should also

c be input if cud3sp has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. This will bypass

c redundant pde discretization and argument checking

c and save computational time. Some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) cud3sp is being recalled for additional accuracy. In

c this case iguess=iparm(17)=1 should also be used.

c

c (2) cud3sp is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) cud3sp is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new righthand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to cud3sp

c (b) any of the integer arguments other than iguess=iparm(17)

c or maxcy=iparm(18) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(7) have changed since the previous call

c

c (d) any of the coefficients input by cofx,cofy,cofz (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c If any of (a) through (e) are true then the elliptic PDE

c must be discretized or rediscretized. If none of (a)

c through (e) holds, calls can be made with intl=1.

c Incorrect calls with intl=1 will produce erroneous results.

c *** The values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.
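c
c *** sketch (not part of mudpack): the intl=0 / intl=1 calling order
c described above. "discal" is a hypothetical wrapper and "solver" is
c a dummy procedure standing for cud3sp (argument list taken from
c the header of this file). "nosolv" is a made-up stand-in that lets
c the sketch run without mudpack; it only checks the call order.
c array types (real fparm, complex work,rhs,phi) are assumptions.

```fortran
!     hypothetical wrapper: discretize with intl=0, then approximate
!     with intl=1, preserving work between the two calls
      subroutine discal(solver,iparm,fparm,work,cofx,cofy,cofz,
     +                  bndyc,rhs,phi,mgopt,ierror)
      implicit none
      external solver,cofx,cofy,cofz,bndyc
      integer iparm(22),mgopt(4),ierror
      real fparm(8)
      complex work(*),rhs(*),phi(*)
!     intl=0: argument checking and discretization only
      iparm(1) = 0
      call solver(iparm,fparm,work,cofx,cofy,cofz,bndyc,rhs,phi,
     +            mgopt,ierror)
!     a positive ierror flags invalid input arguments
      if (ierror .gt. 0) return
!     intl=1: approximate the pde discretized by the first call
      iparm(1) = 1
      call solver(iparm,fparm,work,cofx,cofy,cofz,bndyc,rhs,phi,
     +            mgopt,ierror)
      return
      end

!     made-up stand-in for cud3sp used only to exercise discal
      subroutine nosolv(iparm,fparm,work,cofx,cofy,cofz,bndyc,rhs,
     +                  phi,mgopt,ierror)
      implicit none
      external cofx,cofy,cofz,bndyc
      integer iparm(22),mgopt(4),ierror
      real fparm(8)
      complex work(*),rhs(*),phi(*)
      logical ready
      save ready
      data ready /.false./
      ierror = 0
      if (iparm(1) .eq. 0) ready = .true.
!     99 is a made-up code (not a mudpack value) for intl=1
!     requested before any intl=0 call
      if (iparm(1) .ne. 0 .and. .not. ready) ierror = 99
      return
      end
```

c with mudpack linked in, cud3sp would be passed in place of nosolv
c and the remaining entries of iparm, fparm and mgopt would be set
c as described below before the call.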

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the (y,z) plane x=xa

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xa,y,z) is specified (this must be input thru phi(1,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see "bndyc" description below where kbdy = 1)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the (y,z) plane x=xb

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xb,y,z) is specified (this must be input thru phi(nx,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see "bndyc" description below where kbdy = 2)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the (x,z) plane y=yc

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yc,z) is specified (this must be input thru phi(i,1,k))

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see "bndyc" description below where kbdy = 3)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the (x,z) plane y=yd

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yd,z) is specified (this must be input thru phi(i,ny,k))

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see "bndyc" description below where kbdy = 4)

c

c

c ... nze=iparm(6)

c

c flags boundary conditions on the (x,y) plane z=ze

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,ze) is specified (this must be input thru phi(i,j,1))

c = 2 if there are mixed derivative boundary conditions at z=ze

c (see "bndyc" description below where kbdy = 5)

c

c

c ... nzf=iparm(7)

c

c flags boundary conditions on the (x,y) plane z=zf

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,zf) is specified (this must be input thru phi(i,j,nz))

c = 2 if there are mixed derivative boundary conditions at z=zf

c (see "bndyc" description below where kbdy = 6)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(8)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(14)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(11)

c without changing nx = iparm(14)

c

c

c ... jyq = iparm(9)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(15)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(12)

c without changing ny = iparm(15)

c

c

c ... kzr = iparm(10)

c

c an integer greater than one which is used in defining the number

c of grid points in the z direction (see nz = iparm(16)). "kzr+1"

c is the number of points on the coarsest z grid visited during

c multigrid cycling. kzr should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the z direction is not used.

c kzr should be 2 or a small odd value since a power

c of 2 factor of kzr can be removed by increasing kez = iparm(13)

c without changing nz = iparm(16)

c

c

c ... iex = iparm(11)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(14)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(8)

c as small as possible within grid size constraints when

c defining nx = iparm(14).

c

c

c ... jey = iparm(12)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(15)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(9)

c as small as possible within grid size constraints when

c defining ny = iparm(15).

c

c

c ... kez = iparm(13)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the z direction (see nz = iparm(16)).

c kez .le. 50 is required. for efficient multigrid cycling,

c kez should be chosen as large as possible and kzr=iparm(10)

c as small as possible within grid size constraints when

c defining nz = iparm(16).

c

c

c ... nx = iparm(14)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(8), iex = iparm(11).

c

c

c ... ny = iparm(15)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(9), jey = iparm(12).

c

c

c ... nz = iparm(16)

c

c the number of equally spaced grid points in the interval [ze,zf]

c (including the boundaries). nz must have the form

c

c nz = kzr*(2**(kez-1)) + 1

c

c where kzr = iparm(10), kez = iparm(13)

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 65 by 97 grid. then

c ixp=2, jyq=4, kzr=6 and iex=jey=kez=5 could be used. a better

c choice would be ixp=jyq=2, kzr=3, and iex=5, jey=kez=6.
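c
c *** sketch (not part of mudpack): the grid size formula above as a
c one-line function, useful for checking a candidate choice of
c ixp,iex (or jyq,jey or kzr,kez). the name "ngrid" is hypothetical.

```fortran
!     hypothetical helper: number of grid points from the coarse
!     grid factor ip (ixp,jyq or kzr) and exponent ie (iex,jey,kez)
      integer function ngrid(ip,ie)
      implicit none
      integer ip,ie
!     nx = ixp*(2**(iex-1)) + 1 and likewise for ny,nz
      ngrid = ip*(2**(ie-1))+1
      return
      end
```

c for the example above, ngrid(2,5), ngrid(4,5), ngrid(6,5) give
c 33, 65, 97 and the preferred choice ngrid(2,5), ngrid(2,6),
c ngrid(3,6) gives the same 33, 65, 97 with smaller coarse grids.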

c

c *** note

c

c let G be the nx by ny by nz fine grid on which the approximation is

c generated and let n = max0(iex,jey,kez). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) by mz(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c

c mz(k) = kzr*[2**(max0(kez+k-n,1)-1)] + 1
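c
c *** sketch (not part of mudpack): the level sizes mx(k),my(k),mz(k)
c above computed by one function. the name "mgrid" and its argument
c order are hypothetical.

```fortran
!     hypothetical helper: points in one direction on grid level k
!     ip = ixp,jyq or kzr; ie = iex,jey or kez; n = max0(iex,jey,kez)
      integer function mgrid(ip,ie,k,n)
      implicit none
      integer ip,ie,k,n
!     the exponent is held at 1 on levels where this direction
!     has already reached its coarsest grid
      mgrid = ip*(2**(max0(ie+k-n,1)-1))+1
      return
      end
```

c with ixp=2, iex=5 and n=6, levels k=1 and k=2 both have 3 points
c in x while level k=6 has the full 33 points.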

c

c

c

c ... iguess=iparm(17)

c

c = 0 if no initial guess to the pde is provided

c and/or full multigrid cycling beginning at the

c coarsest grid level is desired.

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below). in this case

c cycling beginning or restarting at the finest grid

c is initiated.

c

c *** comments on iguess = 0 or 1 . . .

c

c

c setting iguess=0 forces full multigrid or "fmg" cycling. phi

c must be initialized at all grid points. it can be set to zero at

c non-Dirichlet grid points if nothing better is available. the

c values set in phi when iguess = 0 are passed down and serve

c as an initial guess to the pde at the coarsest grid level where

c multigrid cycling commences.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c *** time dependent problems . . .

c

c assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-Dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.
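c
c *** sketch (not part of mudpack): the two array operations needed
c by the correction approach above, forming r(t+dt)-r(t) and then
c p(t+dt) = p(t)+e(t,dt). the names "rhsdif" and "addcor" are
c hypothetical; the nx by ny by nz complex arrays are treated as
c vectors of length n = nx*ny*nz. the solve for e(t,dt) between the
c two steps (iguess=0, intl=1) is not shown.

```fortran
!     hypothetical helper: right hand side of l(e) = r(t+dt)-r(t)
      subroutine rhsdif(n,rnew,rold,rdif)
      implicit none
      integer n,i
      complex rnew(n),rold(n),rdif(n)
      do i=1,n
        rdif(i) = rnew(i)-rold(i)
      end do
      return
      end

!     hypothetical helper: p(t+dt) = p(t)+e(t,dt), overwriting p
      subroutine addcor(n,p,e)
      implicit none
      integer n,i
      complex p(n),e(n)
      do i=1,n
        p(i) = p(i)+e(i)
      end do
      return
      end
```

c the extra arrays holding the old right hand side and e(t,dt) are
c the additional storage mentioned above.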

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(18)

c

c the exact number of cycles executed between the finest

c (nx by ny by nz) and the coarsest ((ixp+1) by (jyq+1) by

c (kzr+1)) grid levels when tolmax=fparm(7)=0.0 (no error

c control). when tolmax=fparm(7).gt.0.0 is input (error control)

c then maxcy is a limit on the number of cycles between the

c finest and coarsest grid levels. in any case, at most

c maxcy*(iprer+ipost) relaxation sweeps are performed at the

c finest grid level (see iprer=mgopt(2),ipost=mgopt(3) below).

c when multigrid iteration is working "correctly" only a few

c cycles are required for convergence. large values for maxcy

c should not be required.

c

c

c ... method = iparm(19)

c

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

c

c This is the only relaxation method offered with cud3sp. Line

c or planar relaxation would "lose" the significant savings in

c work space length defeating the purpose of cud3sp. If line

c or planar relaxation is required then use cud3. method is

c used as an argument only to focus attention on the purpose

c of cud3sp.

c

c ... length = iparm(20)

c

c the length of the work space provided in vector work.

c This is considerably less than the work space required by

c the nonseparable solver cud3.

c

c length = 7*(nx+2)*(ny+2)*(nz+2)/2

c

c will usually but not always suffice. The exact minimal length

c depends on the grid size arguments. It can be predetermined

c *** for the current input arguments by calling cud3sp with iparm(20)

c set equal to zero and printing iparm(21) or (in f90) dynamically

c allocating the work space using the value in iparm(21) in a

c subsequent cud3sp call.
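c
c *** sketch (not part of mudpack): the work space estimate above as
c a function. the name "lenwk" is hypothetical. as stated above the
c estimate usually but not always suffices, so iparm(21) remains
c the authoritative value.

```fortran
!     hypothetical helper: suggested work space length
!     length = 7*(nx+2)*(ny+2)*(nz+2)/2 (integer division)
      integer function lenwk(nx,ny,nz)
      implicit none
      integer nx,ny,nz
      lenwk = 7*(nx+2)*(ny+2)*(nz+2)/2
      return
      end
```

c for the 33 by 65 by 97 grid of the earlier example this gives
c 812542 complex words.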

c

c ... fparm

c

c a floating point vector of length 8 used to efficiently

c pass floating point arguments. fparm is set internally

c in cud3sp and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... ze=fparm(5), zf=fparm(6)

c

c the range of the z independent variable. ze must

c be less than zf.

c

c

c ... tolmax = fparm(7)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j,k)

c and phi2(i,j,k) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(7)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).
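c
c *** sketch (not part of mudpack): the relative difference test
c above for two complex iterates stored as vectors of length
c n = nx*ny*nz. the name "reldif" is hypothetical and abs is taken
c to be the complex modulus, which is an assumption about the
c package's internal norm. when phmax is zero the function returns
c phdif, following the fparm(8) description later in this file.

```fortran
!     hypothetical helper: phdif/phmax for iterates phi1 and phi2
      real function reldif(n,phi1,phi2)
      implicit none
      integer n,i
      complex phi1(n),phi2(n)
      real phdif,phmax
      phdif = 0.0
      phmax = 0.0
      do i=1,n
!       largest change between the last two iterates
        phdif = max(phdif,abs(phi2(i)-phi1(i)))
!       largest modulus of the latest iterate
        phmax = max(phmax,abs(phi2(i)))
      end do
      reldif = phdif
      if (phmax .gt. 0.0) reldif = phdif/phmax
      return
      end
```

c "convergence" in the sense above corresponds to
c reldif(n,phi1,phi2) .lt. tolmax.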

c

c

c ... work

c

c a complex array that must be provided for work space.

c see length = iparm(20). the values in work must be preserved

c if cud3sp is called again with intl=iparm(1).ne.0 or if cud34sp

c is called to improve accuracy.

c

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,yorz,alfa,gbdy)

c which is used to input complex mixed boundary conditions.

c the boundaries are numbered one thru six and the form of the

c conditions is described below.

c

c

c (1) the kbdy=1 boundary

c

c this is the (y,z) plane x=xa where nxa=iparm(2) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa*p(xa,y,z) = gbdxa(y,z)

c

c in this case kbdy=1,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxa,gbdxa(y,z) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the (y,z) plane x=xb where nxb=iparm(3) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb*p(xb,y,z) = gbdxb(y,z)

c

c in this case kbdy=2,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxb,gbdxb(y,z) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the (x,z) plane y=yc where nyc=iparm(4) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc*p(x,yc,z) = gbdyc(x,z)

c

c in this case kbdy=3,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyc,gbdyc(x,z) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the (x,z) plane y=yd where nyd=iparm(5) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd*p(x,yd,z) = gbdyd(x,z)

c

c in this case kbdy=4,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyd,gbdyd(x,z) must be returned.

c

c

c (5) the kbdy=5 boundary

c

c this is the (x,y) plane z=ze where nze=iparm(6) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfze*p(x,y,ze) = gbdze(x,y)

c

c in this case kbdy=5,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfze,gbdze(x,y) must be returned.

c

c

c (6) the kbdy=6 boundary

c

c this is the (x,y) plane z=zf where nzf=iparm(7) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfzf*p(x,y,zf) = gbdzf(x,y)

c

c in this case kbdy=6,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfzf,gbdzf(x,y) must be returned.

c

c

c *** alfa,gbdy must be declared complex. alfa is constant.

c

c *** bndyc must provide the mixed boundary condition

c values in correspondence with those flagged in iparm(2)

c thru iparm(7). if all boundaries are specified then

c cud3sp will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared

c external in the routine calling cud3sp. the actual

c name chosen may be different.
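c
c *** sketch (not part of mudpack): a possible bndyc for a problem
c with a mixed condition on x=xb only (nxb=iparm(3)=2). the values
c chosen for alfa and gbdy are made up, and the real type of xory
c and yorz is an assumption.

```fortran
!     example boundary routine: dp/dx + alfa*p = gbdy on x=xb
      subroutine bndyc(kbdy,xory,yorz,alfa,gbdy)
      implicit none
      integer kbdy
      real xory,yorz
      complex alfa,gbdy
      if (kbdy .eq. 2) then
!       x=xb face: xory holds y and yorz holds z
!       alfa is a constant; gbdy may vary over the face
        alfa = (1.0,0.0)
        gbdy = cmplx(xory+yorz,0.0)
      end if
      return
      end
```

c other faces flagged as mixed would add further kbdy branches of
c the same shape.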

c

c

c ... cofx

c

c a subroutine with arguments (x,cxx,cx,cex) which provides the

c known complex coefficients of the x derivative terms for the pde

c at any grid point x. the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c external.

c

c ... cofy

c

c a subroutine with arguments (y,cyy,cy,cey) which provides the

c known complex coefficients of the y derivative terms for the pde

c at any grid point y. the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c external.

c

c ... cofz

c

c a subroutine with arguments (z,czz,cz,cez) which provides the

c known complex coefficients of the z derivative terms for the pde

c at any grid point z. the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c external.
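c
c *** sketch (not part of mudpack): a possible cofx. the coefficient
c values are made up for illustration (unit second derivative, no
c first derivative, an imaginary zero order term growing with x)
c and the real type of x is an assumption. cofy and cofz would have
c the same shape with (y,cyy,cy,cey) and (z,czz,cz,cez).

```fortran
!     example coefficient routine for the x derivative terms
      subroutine cofx(x,cxx,cx,cex)
      implicit none
      real x
      complex cxx,cx,cex
!     coefficient of pxx
      cxx = (1.0,0.0)
!     coefficient of px
      cx = (0.0,0.0)
!     coefficient of p, depending on x only (separability)
      cex = cmplx(0.0,x)
      return
      end
```

c setting cex to zero here and in cofy,cofz as well gives the
c coefficients of Laplace's equation.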

c

c ... rhs

c

c an array dimensioned nx by ny by nz which contains

c the given right hand side values on the uniform 3-d mesh.

c rhs(i,j,k) = r(xi,yj,zk) for i=1,...,nx and j=1,...,ny

c and k=1,...,nz.

c

c ... phi

c

c an array dimensioned nx by ny by nz . on input phi must

c contain specified boundary values and an initial guess

c to the solution if flagged (see iguess=iparm(17)=1). for

c example, if nyd=iparm(5)=1 then phi(i,ny,k) must be set

c equal to p(xi,yd,zk) for i=1,...,nx and k=1,...,nz prior to

c calling cud3sp. the specified values are preserved by cud3sp.

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at non-Dirichlet grid points (this is not

c checked). these values are projected down and serve as an initial

c guess to the pde at the coarsest grid level. set phi to 0.0 at

c non-Dirichlet grid points if nothing better is available.

c
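c *** sketch (not part of mudpack): initializing phi for iguess=0
c when only the y=yd face has specified values (nyd=iparm(5)=1).
c the name "iniphi" and the array "pyd" holding p(xi,yd,zk) are
c hypothetical.

```fortran
!     hypothetical helper: zero guess plus specified values on y=yd
      subroutine iniphi(nx,ny,nz,pyd,phi)
      implicit none
      integer nx,ny,nz,i,j,k
      complex pyd(nx,nz),phi(nx,ny,nz)
!     zero initial guess at every grid point
      do k=1,nz
        do j=1,ny
          do i=1,nx
            phi(i,j,k) = (0.0,0.0)
          end do
        end do
      end do
!     specified values on the y=yd face: phi(i,ny,k) = p(xi,yd,zk)
      do k=1,nz
        do i=1,nx
          phi(i,ny,k) = pyd(i,k)
        end do
      end do
      return
      end
```

c other specified faces would be filled the same way after the
c zeroing loop.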

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below).

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4

c * . .

c * . .

c ---------------2-----------1-----2-----------1--------- level 3

c * . . . .

c * . . . .

c ------2-----1-----2-----1-----------2-----1------------ level 2

c * . . . . . .

c * . . . . . .

c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4

c * . .

c ----------2-----------1---2-----------3-----------1---- level 3

c * . . . . . .

c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2

c * . . . . . . . . . . . . . .

c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
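c
c *** sketch (not part of mudpack): a recursive counter that follows
c the control flow of algorithm cgc and tallies relaxation sweeps
c per grid level. it performs no relaxation or grid transfers and
c the name "cgccnt" is hypothetical. mudpack itself implements the
c cycling non-recursively, as noted above.

```fortran
!     hypothetical illustration: add to nswp(j) the number of
!     relaxation sweeps done at level j by one cgc call at level k
      recursive subroutine cgccnt(k,kcycle,iprer,ipost,nswp)
      implicit none
      integer k,kcycle,iprer,ipost,nswp(*),kount
!     pre-relax at level k
      nswp(k) = nswp(k)+iprer
      if (k .gt. 1) then
!       after restriction, cgc is invoked kcycle times at level k-1
        do kount=1,kcycle
          call cgccnt(k-1,kcycle,iprer,ipost,nswp)
        end do
      end if
!     post relax at level k
      nswp(k) = nswp(k)+ipost
      return
      end
```

c starting from a zeroed nswp at level 4, v(2,1) cycling gives 3
c sweeps on every level and w(2,1) cycling gives 3, 6, 12 and 24
c sweeps on levels 4, 3, 2 and 1. these totals agree with the last
c cycle in each schedule depicted above.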

c

c ***********************************************************************

c ****output arguments**************************************************

c ***********************************************************************

c

c

c ... iparm(21)

c

c on output iparm(21) contains the actual work space length

c required for the current grid sizes and method. This value

c will be computed and returned even if iparm(20) is less than

c iparm(21) (see ierror=9).

c

c

c ... iparm(22)

c

c if error control is selected (tolmax = fparm(7) .gt. 0.0) then

c on output iparm(22) contains the actual number of cycles executed

c between the coarsest and finest grid levels in obtaining the

c approximation in phi. the quantity (iprer+ipost)*iparm(22) is

c the number of relaxation sweeps performed at the finest grid level.

c

c

c ... fparm(8)

c

c on output fparm(8) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(8) is computed only if there is error control (tolmax.gt.0.)

c assume phi1(i,j,k) and phi2(i,j,k) are the last two computed

c values for phi(i,j,k) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then

c

c fparm(8) = phdif/phmax

c

c is returned whenever phmax.gt.0.0. in the degenerate case

c phmax = 0.0, fparm(8) = phdif is returned.

c

c

c

c ... work

c

c on output work contains intermediate values that must not be

c destroyed if cud3sp is to be called again with iparm(1)=1 or

c if cud34sp is to be called to improve the estimate to fourth

c order.

c

c ... phi

c

c on output phi(i,j,k) contains the approximation to

c p(xi,yj,zk) for all mesh points i=1,...,nx; j=1,...,ny;

c k=1,...,nz. the last computed iterate in phi is returned

c even if convergence is not obtained (ierror=-1).

c

c ... ierror

c

c For intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. Argument checking and discretization

c are bypassed for intl=1 calls, which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 nonellipticity is not checked with cud3sp so this flag is not set.

c (compare with cud3.f)

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(7)>0.

c is not obtained in maxcy=iparm(18) multigrid cycles between the

c coarsest (ixp+1,jyq+1,kzr+1) and finest (nx,ny,nz) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags.

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd,nze,nzf

c in iparm(2) through iparm(7) is not 0,1 or 2 or if

c (nxa,nxb) or (nyc,nyd) or (nze,nzf) are not pairwise zero.

c

c = 3 if min0(ixp,jyq,kzr) < 2 (ixp=iparm(8),jyq=iparm(9),kzr=iparm(10))

c

c = 4 if min0(iex,jey,kez) < 1 (iex=iparm(11),jey=iparm(12),kez=iparm(13))

c or if max0(iex,jey,kez) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or if ny.ne.jyq*2**(jey-1)+1 or

c if nz.ne.kzr*2**(kez-1)+1 (nx=iparm(14),ny=iparm(15),nz=iparm(16))

c

c = 6 if iguess = iparm(17) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(18) < 1 (large values for maxcy should not be used)

c

c = 8 if method = iparm(19) is not equal to zero

c

c = 9 if length = iparm(20) is too small (see iparm(21) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd or ze >= zf

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4),ze=fparm(5),zf=fparm(6))

c

c =11 if tolmax = fparm(7) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c

c *********************************************************

c *********************************************************

c

c end of cud3sp documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file cuh2.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file cuh2.d

c

c contains documentation for:

c subroutine cuh2(iparm,fparm,wk,iwk,coef,bndyc,rhs,phi,mgopt,ierror)

c a sample fortran driver is file "tcuh2.f".

c

c ... required mudpack files

c

c cudcom.f, cuhcom.f

c

c ... purpose

c

c the "hybrid" multigrid/direct method code cuh2 approximates the

c same 2-d nonseparable elliptic pde as the mudpack solver cud2.

c cuh2 combines the efficiency of multigrid iteration with the certainty

c of a direct method. the basic algorithm is modified by using banded

c gaussian elimination in place of relaxation whenever the coarsest

c subgrid is encountered within multigrid cycling. this provides

c additional grid size flexibility by eliminating the usual multigrid

c constraint that the coarsest grid consist of "few" points for effective

c error reduction with multigrid cycling. In many cases the hybrid method

c provides more robust convergence characteristics than multigrid cycling

c alone.

c

c The form of the pde solved is:

c

c

c cxx(x,y)*pxx + cyy(x,y)*pyy + cx(x,y)*px + cy(x,y)*py +

c

c ce(x,y)*p(x,y) = r(x,y).

c

c

c pxx,pyy,px,py are second and first partial derivatives of the

c unknown complex solution function p(x,y) with respect to the

c independent variables x,y. cxx,cyy,cx,cy,ce are the known

c complex coefficients of the elliptic pde and r(x,y) is the known

c complex right hand side of the equation. Nonseparability means

c some of the coefficients depend on both x and y. if the pde

c is separable, subroutine cud2sp should be used instead

c of cud2 or cuh2.

c

c *** cuh2 becomes a full direct method if grid size arguments are chosen

c so that the coarsest and finest grids coincide. choosing iex=jey=1

c and ixp=nx-1, jyq=ny-1 (ixp=iparm(6),jyq=iparm(7),iex=iparm(8),

c jey=iparm(9),nx=iparm(10),ny=iparm(11)) will set gaussian elimination

c on the nx by ny grid.

c

c

c ... argument differences with cud2.f

c

c the input and output arguments of cuh2 are almost identical to the

c arguments of cud2 (see cud2.d) with the following exceptions:

c

c (1) the complex work space vector "wk" requires

c

c (ixp+1)*(jyq+1)*(2*ixp+3)

c

c additional words of storage (ixp = iparm(6), jyq = iparm(7))

c if periodic boundary conditions are not flagged in the y direction

c (nyc .ne. 0 where nyc = iparm(4)) or

c

c (ixp+1)*[2*(ixp+1)*(2*jyq-1)+jyq+1]

c

c additional words of storage if periodic boundary conditions are

c flagged in the y direction (nyc = 0). the extra work space is

c used for a direct solution with gaussian elimination whenever the

c coarsest grid is encountered within multigrid cycling.

c

c (2) An integer work space iwk of length at least (ixp+1)*(jyq+1)

c must be provided.

c

c (3) jyq must be greater than 2 if periodic boundary conditions

c are flagged in the y direction and ixp must be greater than

c 2 if periodic boundary conditions are flagged in the x direction.

c inputting jyq = 2 when nyc = 0 or inputting ixp = 2 when nxa = 0

c will set the fatal error flag ierror=3

c

c *** (4) it is no longer necessary that ixp and jyq be "small" for

c effective error reduction with multigrid iteration. there

c is no reduction in convergence rates when larger values for

c ixp or jyq are used. this provides additional flexibility

c in choosing grid size. in many cases cuh2 provides more

c robust convergence than cud2. it can be used in place of

c cud2 for all nonsingular problems (see (6) below).

c

c (5) iguess = iparm(12) = 1 (flagging an initial guess) or

c maxcy = iparm(13) > 1 (setting more than one multigrid

c cycle) are not allowed if cuh2 becomes a full direct method

c by choosing iex = jey = 1 (iex = iparm(8),jey = iparm(9)).

c this conflicting combination of input arguments for multigrid

c iteration and a full direct method sets the fatal error flag

c

c ierror = 13

c

c iguess = 0 and maxcy = 1 are required when cuh2 becomes a

c full direct method.

c

c (6) if a "singular" pde is detected (see ierror=-3 description in cud2.d;

c ce(x,y) = 0.0 for all x,y and the boundary conditions are a combination

c of periodic and/or pure derivatives) then cuh2 sets the fatal error

c flag

c

c ierror = 14

c

c The direct method utilized by cuh2 would likely cause a division

c by zero in the singular case. cud2 can be tried for singular problems.

c

c

c ... grid size considerations

c

c (1) flexibility

c

c cuh2 should be used in place of cud2 whenever grid size

c requirements do not allow choosing ixp and jyq to be "small"

c positive integers (typically less than 4).

c

c example:

c

c suppose we wish to solve an elliptic pde on a one degree grid on

c the full surface of a sphere. choosing ixp = jyq = 45 and iex = 4

c and jey = 3 fits the required 361 by 181 grid exactly. multigrid

c cycling will be used on the sequence of subgrid sizes:

c

c 46 x 46 < 91 x 46 < 181 x 91 < 361 x 181

c

c the 46 x 46 coarsest subgrid has too much resolution for effective

c error reduction with relaxation only. cuh2 circumvents this

c difficulty by generating an exact direct solution (modulo roundoff

c error) whenever the coarsest grid is encountered.

c

c (2) additional work space (see (1) under "argument differences") is

c required by cuh2 to implement gaussian elimination at the coarsest

c grid level. this may limit the size of ixp and jyq.

c

c (3) operation counts

c

c for simplicity, assume p = ixp = jyq and n = nx = ny. banded

c gaussian elimination requires o(p**4) operations for solution

c on the coarsest subgrid while multigrid iteration is an o(n**2)

c algorithm. these are approximately balanced when

c

c p**4 =: (n/(2**k))**4 =: n**2

c

c or

c

c k =: log2(n)/2

c

c grid levels are chosen with the hybrid method. so if

c p is approximately equal to

c

c n/(2**(log2(n)/2))

c

c then the direct method and multigrid parts of the hybrid algorithm

c require roughly the same amount of computer time. larger values

c for p mean the direct method will dominate the computation. smaller

c values mean the hybrid method will cost only marginally more than

c multigrid iteration with coarse grid relaxation.

c
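
c *** illustrative sketch (not part of mudpack)

c

c the balance point above simplifies to p =: sqrt(n) because

c n/(2**(log2(n)/2)) = n/sqrt(n). the small program below only

c evaluates that arithmetic. the program name "balanc" and the

c value n = 1025 are assumptions chosen for illustration.

c

```fortran
      program balanc
c     illustrative only: estimates the coarse grid parameter p at
c     which the o(p**4) direct solve and the o(n**2) multigrid
c     work are roughly balanced
      implicit none
      integer n
      real rk,p
      n = 1025
c     number of coarsenings k = log2(n)/2
      rk = 0.5*alog(float(n))/alog(2.0)
c     p = n/(2**k), which is algebraically sqrt(n)
      p = float(n)/2.0**rk
      write(*,*) 'n = ',n,'  k = ',rk,'  p = ',p
      end
```

c

c for n = 1025 the estimate is p of about 32, so under this rough

c count the direct solve should start to dominate only when ixp

c and jyq are chosen noticeably larger than that.

c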

c

c *** the remaining documentation is almost identical to cud2.d

c except for the modifications already indicated.

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formulas.

c

c *** an approximation is not generated after an intl=0 call!

c cuh2 should be called with intl=1 to approximate the elliptic

c pde discretized by the intl=0 call. intl=1 should also

c be input if cuh2 has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. this will bypass

c redundant pde discretization and argument checking

c and save computational time. some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) cuh2 is being recalled for additional accuracy. in

c this case iguess=iparm(12)=1 should also be used.

c

c (2) cuh2 is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) cuh2 is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to cuh2

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c if any of (a) through (e) are true then the elliptic pde

c must be discretized or rediscretized. if none of (a)

c through (e) holds, calls can be made with intl=1.

c incorrect calls with intl=1 will produce erroneous results.

c *** the values set in the saved work space "wk" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c if ixp > 2 then it should be a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c if jyq > 2 then it should be a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.

c

c *** grid size flexibility considerations:

c

c the hybrid multigrid/direct method code cuh2 provides more grid size

c flexibility than cud2 by removing the constraint that ixp and jyq are

c 2 or 3. this is accomplished by using a direct method whenever the

c coarsest (ixp+1) x (jyq+1) grid is encountered in multigrid cycling.

c if nx = ixp+1 and ny = jyq+1 then cuh2 becomes a full direct method.

c cuh2 is roughly equivalent to cud2 in efficiency as long as ixp and

c jyq remain "small". if the problem to be approximated requires

c a grid neither cud2 nor cuh2 can exactly fit then another option

c is to generate an approximation on a "close grid" using cud2 or cuh2.

c then transfer the result to the required grid using cubic interpolation

c via the package "regridpack" (contact john adams about this software).

c

c *** note

c

c let G be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c

c If iex = jey = 1 then G(1) = G(n) and cuh2 solves the problem

c directly with block banded Gaussian elimination. Otherwise

c cuh2 replaces relaxation with a direct method on G(1).

c
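
c *** illustrative sketch (not part of mudpack)

c

c the grid size formulas and the subgrid chain in the note above can

c be checked with a few lines of fortran. the program name "grdsiz"

c is hypothetical, and the values ixp=2, jyq=3, iex=5, jey=6 are

c simply the better choice from the 33 by 97 example.

c

```fortran
      program grdsiz
c     illustrative only: evaluates nx, ny and the sizes mx(k) by
c     my(k) of the subgrids g(1) < ... < g(n)
      implicit none
      integer ixp,jyq,iex,jey,nx,ny,n,k,mx,my
      ixp = 2
      jyq = 3
      iex = 5
      jey = 6
c     finest grid: nx = ixp*2**(iex-1)+1, ny = jyq*2**(jey-1)+1
      nx = ixp*2**(iex-1) + 1
      ny = jyq*2**(jey-1) + 1
      write(*,*) 'nx, ny = ',nx,ny
c     number of grid levels n = max0(iex,jey)
      n = max0(iex,jey)
      do k=1,n
        mx = ixp*2**(max0(iex+k-n,1)-1) + 1
        my = jyq*2**(max0(jey+k-n,1)-1) + 1
        write(*,*) 'level ',k,': ',mx,' by ',my
      end do
      end
```

c

c the first line printed reports nx = 33 and ny = 97, and the last

c level listed is the fine grid itself. because iex and jey differ,

c the coarsest levels stop coarsening in x before they do in y.

c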

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps are

c performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y direction

c

c

c *** choice of method. . .

c

c let fx represent the quantity cabs(cxx(x,y))/dlx**2 over the solution region.

c

c let fy represent the quantity cabs(cyy(x,y))/dly**2 over the solution region

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2

c

c if neither fx nor fy dominates over the solution region and they

c both vary considerably choose method = 3.

c

c

c ... length = iparm(15)

c

c the length of the work space provided in vector work (see below).

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if method = 1 or method = 3 and nxa.ne.0

c let isx = 5 if method = 1 or method = 3 and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if method = 2 or method = 3 and nyc.ne.0

c let jsy = 5 if method = 2 or method = 3 and nyc.eq.0

c

c let ldir = (ixp+1)*(jyq+1)*(2*ixp+3) if nyc.ne.0 or

c let ldir = (ixp+1)*[2*(ixp+1)*(2*jyq-1)+jyq+1] if nyc=0

c

c then . . .

c

c length = 4*[nx*ny*(10+isx+jsy)+8*(nx+ny+2)]/3 + ldir

c

c will suffice in most cases. the exact minimal work space

c length required for the current nx,ny and method is output

c in iparm(16) (even if iparm(15) is too small). this will be

c less than the value given by the simplified formula above

c in most cases.

c
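
c *** illustrative sketch (not part of mudpack)

c

c the simplified work space estimate above can be evaluated before

c calling cuh2. the program name "wklen" and the input values are

c assumptions. the isx and jsy rules are read here as "method uses

c line relaxation in that direction, then 3 for nonperiodic or 5

c for periodic boundaries", which is one interpretation of the

c wording above.

c

```fortran
      program wklen
c     illustrative only: evaluates the simplified work space
c     estimate for cuh2 with made-up input values
      implicit none
      integer ixp,jyq,nx,ny,nxa,nyc,method
      integer isx,jsy,ldir,length
      ixp = 2
      jyq = 3
      nx = 33
      ny = 97
      nxa = 1
      nyc = 1
      method = 0
c     isx is nonzero only when x line relaxation is used
      isx = 0
      if (method.eq.1 .or. method.eq.3) then
        isx = 3
        if (nxa.eq.0) isx = 5
      end if
c     jsy is nonzero only when y line relaxation is used
      jsy = 0
      if (method.eq.2 .or. method.eq.3) then
        jsy = 3
        if (nyc.eq.0) jsy = 5
      end if
c     extra storage for the direct solve on the coarsest grid
      if (nyc.ne.0) then
        ldir = (ixp+1)*(jyq+1)*(2*ixp+3)
      else
        ldir = (ixp+1)*(2*(ixp+1)*(2*jyq-1)+jyq+1)
      end if
c     integer division truncates the quotient
      length = 4*(nx*ny*(10+isx+jsy)+8*(nx+ny+2))/3 + ldir
      write(*,*) 'estimated length = ',length
      end
```

c

c for these inputs the estimate is 44172 words, of which ldir = 84

c is the direct solver share. the exact requirement is still the

c value cuh2 returns in iparm(16).

c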

c

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in cuh2 and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(cabs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(cabs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).

c
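
c *** illustrative sketch (not part of mudpack)

c

c the convergence test described under tolmax can be written out

c as follows. this is not the code used inside cuh2. the program

c name "reldif", the 3 by 3 arrays and their values are made up,

c and the guard against phmax = 0.0 is an assumption.

c

```fortran
      program reldif
c     illustrative only: forms the maximum relative difference
c     between two made-up complex iterates phi1 and phi2
      implicit none
      integer nx,ny,i,j
      parameter (nx=3,ny=3)
      complex phi1(nx,ny),phi2(nx,ny)
      real phdif,phmax,reldf,tolmax
      tolmax = 1.0e-3
      do j=1,ny
        do i=1,nx
          phi1(i,j) = cmplx(float(i),float(j))
          phi2(i,j) = phi1(i,j) + cmplx(1.0e-4,0.0)
        end do
      end do
      phdif = 0.0
      phmax = 0.0
      do j=1,ny
        do i=1,nx
          phdif = amax1(phdif,cabs(phi2(i,j)-phi1(i,j)))
          phmax = amax1(phmax,cabs(phi2(i,j)))
        end do
      end do
c     guard the division in case every value of phi2 is zero
      reldf = phdif
      if (phmax.gt.0.0) reldf = phdif/phmax
      if (reldf.lt.tolmax) then
        write(*,*) 'converged, relative difference = ',reldf
      else
        write(*,*) 'not converged, relative difference = ',reldf
      end if
      end
```

c

c the same quantity can be monitored in a calling program between

c successive calls when tolmax = 0.0 is used and the convergence

c behavior is not yet known.

c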

c ... wk

c

c a one dimensional complex saved work space (see iparm(15) for

c length) which must be preserved from the previous call when

c calling with intl=iparm(1)=1.

c

c ... iwk

c

c an integer vector dimensioned with length at least (ixp+1)*(jyq+1)

c (ixp = iparm(6),jyq=iparm(7)) in the routine calling cuh2.

c The length of iwk is not checked! If iwk has length less than

c (ixp+1)*(jyq+1) then undetectable errors will result.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,gbdy) which

c are used to input mixed boundary conditions to cuh2. bndyc

c must be declared "external" in the program calling cuh2.

c kbdy is type integer, xory type real, alfa,gbdy type complex.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tcuh2.f" for an example of how bndyc is

c can beset up).

c

c * * * * * * * * * * * *  y=yd

c *      kbdy=4         *

c *                     *

c *                     *

c *                     *

c * kbdy=1       kbdy=2 *

c *                     *

c *                     *

c *                     *

c *      kbdy=3         *

c * * * * * * * * * * * *  y=yc

c

c x=xa                  x=xb

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa(y)*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,gbdy corresponding to alfxa(y),gbdxa(y) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb(y)*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y, will be input to bndyc and

c alfa,gbdy corresponding to alfxb(y),gbdxb(y) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc(x)*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyc(x),gbdyc(x) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd(x)*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyd(x),gbdyd(x) must be returned.

c

c

c *** bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c cuh2 will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling cuh2. the actual name chosen may

c be different.

c
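
c a minimal sketch of a bndyc routine, not taken from "tcuh2.f". it

c assumes mixed conditions are flagged only at x=xa and y=yd

c (nxa=nyd=2) and uses made-up values for alfa and gbdy.

```fortran
      subroutine bndyc(kbdy,xory,alfa,gbdy)
c     illustrative only: the coefficient values are assumptions
      implicit none
      integer kbdy
      real xory
      complex alfa,gbdy
      if (kbdy.eq.1) then
c     x=xa edge (xory=y): dp/dx + alfxa(y)*p = gbdxa(y)
        alfa = cmplx(1.0,0.0)
        gbdy = cmplx(xory,0.0)
      else if (kbdy.eq.4) then
c     y=yd edge (xory=x): dp/dy + alfyd(x)*p = gbdyd(x)
        alfa = cmplx(0.0,-1.0)
        gbdy = cmplx(0.0,xory)
      else
c     edges not flagged as mixed are not expected here
        alfa = cmplx(0.0,0.0)
        gbdy = cmplx(0.0,0.0)
      end if
      return
      end
```

c the calling program would declare the routine "external" and pass

c its name in the bndyc position of the cuh2 argument list.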

c

c ... coef

c

c a subroutine with arguments (x,y,cxx,cyy,cx,cy,ce) which

c provides the known complex coefficients for the elliptic pde at

c any grid point (x,y). the name chosen may be different, but the

c coefficient routine must be declared "external" in the calling

c routine.

c
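
c a minimal sketch of a coefficient routine with the argument list

c above. the coefficient values are illustrative assumptions, chosen

c so the real and imaginary parts of cxx and cyy stay positive.

```fortran
      subroutine coef(x,y,cxx,cyy,cx,cy,ce)
c     illustrative only: these coefficient choices are assumptions
      implicit none
      real x,y
      complex cxx,cyy,cx,cy,ce
c     real and imaginary parts of cxx,cyy are kept positive
      cxx = cmplx(1.0+y*y,1.0)
      cyy = cmplx(1.0+x*x,1.0)
c     no first order terms in this example
      cx = cmplx(0.0,0.0)
      cy = cmplx(0.0,0.0)
c     zero order term depends on both x and y
      ce = cmplx(-1.0,-x*y)
      return
      end
```

c the solver calls this routine at grid points (xi,yj), so it

c should be cheap to evaluate and free of side effects.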

c ... rhs

c

c an array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c an array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...nx

c prior to calling cuh2. these values are preserved by cuh2.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all interior and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid parameters (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the parameters

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction. The

c "D" at grid level 1 indicates a direct method is used.

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4

c                              * .               .

c                             *   .             .

c ---------------2-----------1-----2-----------1--------- level 3

c               * .         .       .         .

c              *   .       .         .       .

c ------2-----1-----2-----1-----------2-----1------------ level 2

c      * .   .       .   .             .   .

c     *   . .         . .               . .

c ---D-----D-----------D-----------------D--------------- level 1

c

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4

c                        * .                         .

c ----------2-----------1---2-----------3-----------1---- level 3

c          * .         .     .         . .         .

c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2

c    * . .     . . . .         . . . .     . . . .

c --D---D-------D---D-----------D---D-------D---D-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c *** this algorithm is modified in the hybrid solvers, which use

c a direct method whenever grid level 1 is encountered.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc

c

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see iparm(15) as an input argument).

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0. only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0).

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(cabs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(cabs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.

c
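
c the quantity returned in fparm(6) can be reproduced outside the

c solver. the sketch below assumes the caller has kept two successive

c iterates; "reldif" is a hypothetical helper, not a mudpack routine.

```fortran
      real function reldif(nx,ny,phi1,phi2)
c     illustrative helper; not part of mudpack
      implicit none
      integer nx,ny,i,j
      complex phi1(nx,ny),phi2(nx,ny)
      real phdif,phmax
      phdif = 0.0
      phmax = 0.0
      do j=1,ny
        do i=1,nx
c     largest change and largest magnitude over the fine grid
          phdif = max(phdif,cabs(phi2(i,j)-phi1(i,j)))
          phmax = max(phmax,cabs(phi2(i,j)))
        end do
      end do
c     degenerate case phmax = 0.0 returns phdif itself
      if (phmax.gt.0.0) then
        reldif = phdif/phmax
      else
        reldif = phdif
      end if
      return
      end
```

c comparing the result with tolmax mirrors the convergence test

c phdif/phmax < tolmax described under fparm(5).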

c

c ... wk

c

c on output wk contains intermediate values that must not

c be destroyed if cuh2 is to be called again with intl=1.

c

c

c ... phi *** for intl=1 calls only

c

c on output phi(i,j) contains the approximation to p(xi,yj)

c for all mesh points i = 1,...,nx and j=1,...,ny. the last

c computed iterate in phi is returned even if convergence is

c not obtained.

c

c

c ... ierror

c

c for intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. discretization is bypassed for intl=1 calls

c which can only return ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c cabs(cx)*dlx > 2.*cabs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c cabs(cy)*dly > 2.*cabs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = cmplx(0.5*cabs(cx)*dx,0.0)

c

c cyy = cmplx(0.5*cabs(cy)*dy,0.0)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made when necessary to preserve convergence. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c
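
c the test and adjustment above can be sketched for the x direction

c as follows. "adjcxx" is a hypothetical helper, not a mudpack

c routine; the y direction is handled the same way with cy,cyy,dy.

```fortran
      subroutine adjcxx(cx,dx,cxx,iadj)
c     illustrative helper; not part of mudpack
      implicit none
      complex cx,cxx
      real dx
      integer iadj
      iadj = 0
c     first order term dominates on a mesh of size dx
      if (cabs(cx)*dx .gt. 2.0*cabs(cxx)) then
c     replace cxx by the first order adjustment
        cxx = cmplx(0.5*cabs(cx)*dx,0.0)
        iadj = 1
      end if
      return
      end
```

c iadj = 1 on return means cxx was replaced. on the finest grid

c this is the situation reported by ierror = -4.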

c

c =-2 if the pde is not elliptic

c

c real(cxx)*real(cyy).le.0.0 or aimag(cxx)*aimag(cyy).le.0.0

c

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(5)>0.

c is not obtained in maxcy=iparm(13) multigrid cycles between the

c coarsest (ixp+1,jyq+1) and finest (nx,ny) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags.

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd

c in iparm(2),iparm(3),iparm(4),iparm(5) are not 0,1 or 2

c or if nxa,nxb or nyc,nyd are not pairwise zero.

c

c = 3 if min0(ixp,jyq) < 2 (ixp = iparm(6), jyq = iparm(7))

c or if ixp < 3 when nxa=0 or if jyq < 3 when nyc=0.

c

c = 4 if min0(iex,jey) < 1 (iex = iparm(8), jey = iparm(9)) or

c if max0(iex,jey) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or ny.ne.jyq*2**(jey-1)+1

c (nx = iparm(10), ny = iparm(11))

c
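
c the grid size relation tested for ierror=5 can be checked before

c calling the solver. "ngrid" is a hypothetical helper, not a mudpack

c routine.

```fortran
      integer function ngrid(ixp,iex)
c     illustrative helper; not part of mudpack
      implicit none
      integer ixp,iex
c     fine grid size implied by coarse size ixp and exponent iex
      ngrid = ixp*2**(iex-1)+1
      return
      end
```

c for example, ngrid(45,4) gives 361 and ngrid(45,3) gives 181.

c the same function applies to the y direction with jyq and jey.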

c = 6 if iguess = iparm(12) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(13) < 1

c

c = 8 if method = iparm(14) is not 0,1,2, or 3

c

c = 9 if length = iparm(15) is too small (see iparm(16) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4))

c

c =11 if tolmax = fparm(5) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c
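
c the mgopt rules can be collected in one pre-check. this sketch

c assumes fatal errors take precedence over the ierror=-5 warning

c and that mgopt(2) thru mgopt(4) are ignored when mgopt(1)=0, as

c stated under "mgopt". "mgchk" is a hypothetical helper.

```fortran
      integer function mgchk(mgopt)
c     illustrative helper; not part of mudpack
      implicit none
      integer mgopt(4)
      mgchk = 0
c     mgopt(1)=0 selects defaults; other entries are ignored
      if (mgopt(1).eq.0) return
      if (mgopt(1).lt.0 .or. mgopt(2).lt.1 .or.
     +    mgopt(3).lt.1 .or.
     +    (mgopt(4).ne.1 .and. mgopt(4).ne.3)) then
c     fatal: invalid multigrid option
        mgchk = 12
      else if (max(mgopt(1),mgopt(2),mgopt(3)).gt.2) then
c     nonfatal: probably inefficient cycling
        mgchk = -5
      end if
      return
      end
```

c the vectors (2,2,1,3) and (1,2,1,1) mentioned under "mgopt" both

c pass this check with a result of 0.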

c =13 if iex=jey=1 (full direct method) and iguess=1 or maxcy > 1

c

c =14 if the elliptic pde is singular (see ierror=-3 in cud2.d)

c

c *********************************************************

c *********************************************************

c

c end of cuh2 documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file cuh24.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file cuh24.d

c

c contains documentation for:

c subroutine cuh24(wk,iwk,phi,ierror)

c A sample fortran driver is file "tcuh24.f".

c

c ... required MUDPACK files

c

c cuh2.f, cudcom.f

c

c ... purpose

c

c cuh24 attempts to improve the estimate in phi, obtained by calling

c cuh2, from second to fourth order accuracy. see the file "cuh2.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "wk,iwk,phi" which are also part of the argument list for

c cuh2.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier cuh2 call

c

c * arguments "wk,iwk,phi" are the same used in calling cuh2

c

c * "wk,iwk,phi" have not changed since the last call to cuh2

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comp., vol. 34, Nov 1989, pp. 113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c Proceedings of the Fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp. 1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c Package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error parameter

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of cuh24 documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file cuh24cr.d

c

c

c ... file cuh24cr.d

c

c contains documentation for:

c subroutine cuh24cr(wk,iwk,coef,bndyc,phi,ierror)

c A sample fortran driver is file "tcuh24cr.f".

c

c ... required MUDPACK files

c

c cuh2cr.f, cudcom.f

c

c ... purpose

c

c cuh24cr attempts to improve the estimate in phi, obtained by calling

c cuh2cr, from second to fourth order accuracy. see the file "cuh2cr.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "wk,iwk,coef,bndyc,phi" which are also part of the argument

c list for cuh2cr.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier cuh2cr call

c

c * arguments "wk,iwk,coef,bndyc,phi" are the same used in calling cuh2cr

c

c * "wk,iwk,coef,bndyc,phi" have not changed since the last call to cuh2cr

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comp., vol. 34, Nov 1989, pp. 113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c Proceedings of the Fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp. 1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c Package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error parameter

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of cuh24cr documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file cuh2cr.d

c

c

c ... file cuh2cr.d

c

c contains documentation for the complex mudpack solver:

c subroutine cuh2cr(iparm,fparm,work,iw,coef,bndyc,rhs,phi,mgopt,ierror)

c a sample fortran driver is file "tcuh2cr.f".

c

c ... required mudpack files

c

c cudcom.f

c

c ... purpose

c

c the "hybrid" multigrid/direct method code cuh2cr approximates the

c same 2-d nonseparable elliptic pde as the mudpack solver cud2cr.

c cuh2cr combines the efficiency of multigrid iteration with the certainty

c of a direct method. the basic algorithm is modified by using banded

c gaussian elimination in place of relaxation whenever the coarsest

c subgrid is encountered within multigrid cycling. this provides

c additional grid size flexibility by eliminating the usual multigrid

c constraint that the coarsest grid consist of "few" points for effective

c error reduction with multigrid cycling. In many cases the hybrid method

c provides more robust convergence characteristics than multigrid cycling

c alone.

c The pde approximated is:

c

c

c cxx(x,y)*pxx + cxy(x,y)*pxy + cyy(x,y)*pyy + cx(x,y)*px +

c

c cy(x,y)*py + ce(x,y)*p(x,y) = r(x,y).

c

c

c pxx,pxy,pyy,px,py are second and first partial derivatives of the

c unknown complex solution function p(x,y) with respect to the

c independent variables x,y. cxx,cxy,cyy,cx,cy,ce are the known

c complex coefficients of the elliptic pde and r(x,y) is the known

c complex right hand side of the equation. The real parts of cxx,cyy

c or the imaginary parts of cxx,cyy should be positive for all x,y

c in the solution region (see ierror=-2). Nonseparability means some

c of the coefficients depend on both x and y. if the PDE is separable

c and cxy = 0 then subroutine cud2sp should be used.

c

c *** cuh2cr becomes a full direct method if grid size arguments are chosen

c so that the coarsest and finest grids coincide. choosing iex=jey=1

c and ixp=nx-1, jyq=ny-1 (ixp=iparm(6),jyq=iparm(7),iex=iparm(8),

c jey=iparm(9),nx=iparm(10),ny=iparm(11)) will set gaussian elimination

c on the nx by ny grid.

c

c ... argument differences with cud2cr.f

c

c the input and output arguments of cuh2cr are almost identical to the

c arguments of cud2cr (see cud2cr.d) with the following exceptions:

c

c (1) the complex work space vector "wk" requires

c

c (ixp+1)*(jyq+1)*(2*ixp+3)

c

c additional words of storage (ixp = iparm(6), jyq = iparm(7))

c if periodic boundary conditions are not flagged in the y direction

c (nyc .ne. 0 where nyc = iparm(4)) or

c

c (ixp+1)*[2*(ixp+1)*(2*jyq-1)+jyq+1]

c

c additional words of storage if periodic boundary conditions are

c flagged in the y direction (nyc = 0). the extra work space is

c used for a direct solution with gaussian elimination whenever the

c coarsest grid is encountered within multigrid cycling.

c

c (2) An integer work space iwk of length at least (ixp+1)*(jyq+1)

c must be provided.

c

c (3) jyq must be greater than 2 if periodic boundary conditions

c are flagged in the y direction and ixp must be greater than

c 2 if periodic boundary conditions are flagged in the x direction.

c inputting jyq = 2 when nyc = 0 or inputting ixp = 2 when nxa = 0

c will set the fatal error flag ierror=3

c

c *** (4) it is no longer necessary that ixp and jyq be "small" for

c effective error reduction with multigrid iteration. there

c is no reduction in convergence rates when larger values for

c ixp or jyq are used. this provides additional flexibility

c in choosing grid size. in many cases cuh2 provides more

c robust convergence than cud2. it can be used in place of

c cud2 for all nonsingular problems (see (5) below).

c

c (5) iguess = iparm(12) = 1 (flagging an initial guess) or

c maxcy = iparm(13) > 1 (setting more than one multigrid

c cycle) are not allowed if cuh2 becomes a full direct method

c by choosing iex = jey = 1 (iex = iparm(8),jey = iparm(9)).

c this conflicting combination of input arguments for multigrid

c iteration and a full direct method sets the fatal error flag

c

c ierror = 13

c

c iguess = 0 and maxcy = 1 are required when cuh2 becomes a

c full direct method.

c

c (6) if a "singular" pde is detected (see ierror=-3 description in cud2.d;

c ce(x,y) = 0.0 for all x,y and the boundary conditions are a combination

c of periodic and/or pure derivatives) then cuh2 sets the fatal error

c flag

c

c ierror = 14

c

c The direct method utilized by cuh2 would likely cause a division

c by zero in the singular case. cud2 can be tried for singular problems.

c
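
c the extra work space in item (1) can be computed as sketched

c below. "nxtra" is a hypothetical helper, not a mudpack routine;

c its result is added to the length required by cud2cr.

```fortran
      integer function nxtra(ixp,jyq,nyc)
c     illustrative helper; not part of mudpack
      implicit none
      integer ixp,jyq,nyc
      if (nyc.ne.0) then
c     y direction not periodic
        nxtra = (ixp+1)*(jyq+1)*(2*ixp+3)
      else
c     y direction periodic
        nxtra = (ixp+1)*(2*(ixp+1)*(2*jyq-1)+jyq+1)
      end if
      return
      end
```

c for ixp = jyq = 45 without periodicity in y this gives

c 46*46*93 = 196788 additional complex words.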

c

c ... grid size considerations

c

c (1) flexibility

c

c cuh2 should be used in place of cud2 whenever grid size

c requirements do not allow choosing ixp and jyq to be "small"

c positive integers (typically less than 4).

c

c example:

c

c suppose we wish to solve an elliptic pde on a one degree grid on

c the full surface of a sphere. choosing ixp = jyq = 45 and iex = 4

c and jey = 3 fits the required 361 by 181 grid exactly. multigrid

c cycling will be used on the sequence of subgrid sizes:

c

c 46 x 46 < 91 x 46 < 181 x 91 < 361 x 181

c

c the 46 x 46 coarsest subgrid has too much resolution for effective

c error reduction with relaxation only. cuh2 circumvents this

c difficulty by generating an exact direct solution (modulo roundoff

c error) whenever the coarsest grid is encountered.

c

c (2) additional work space (see (1) under "argument differences") is

c required by cuh2 to implement gaussian elimination at the coarsest

c grid level. this may limit the size of ixp and jyq.

c

c (3) operation counts

c

c for simplicity, assume p = ixp = jyq and n = nx = ny. banded

c gaussian elimination requires o(p**4) operations for solution

c on the coarsest subgrid while multigrid iteration is a o(n**2)

c algorithm. these are approximately balanced when

c

c p**4 =: (n/(2**k))**4 =: n**2

c

c or

c

c k =: log2(n)/2

c

c grid levels are chosen with the hybrid method. so if

c p is approximately equal to

c

c n/(2**(log2(n)/2))

c

c then the direct method and multigrid parts of the hybrid algorithm

c require roughly the same amount of computer time. larger values

c for p mean the direct method will dominate the computation. smaller

c values mean the hybrid method will cost only marginally more than

c multigrid iteration with coarse grid relaxation.

c
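
c the balance point in (3) can be evaluated numerically. "pbal" is

c a hypothetical helper, not a mudpack routine; it returns

c n/(2**(log2(n)/2)), which equals the square root of n.

```fortran
      real function pbal(n)
c     illustrative helper; not part of mudpack
      implicit none
      integer n
      real rk
c     k = log2(n)/2 grid levels balance the two costs
      rk = 0.5*log(real(n))/log(2.0)
c     coarse grid size p at which the costs are comparable
      pbal = real(n)/2.0**rk
      return
      end
```

c for n = 256 this gives k = 4 and p of about 16. the estimate

c ignores constants in the operation counts, so it is only a guide.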

c

c *** the remaining documentation is almost identical to cud2.d

c except for the modifications already indicated.

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) is provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comp., vol. 34, Nov 1989, pp. 113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp. 1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second-order finite difference formulas.

c

c *** an approximation is not generated after an intl=0 call!

c cuh2cr should be called with intl=1 to approximate the elliptic

c pde discretized by the intl=0 call. intl=1 should also

c be input if cuh2cr has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. this will bypass

c redundant pde discretization and argument checking

c and save computational time. some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) cuh2cr is being recalled for additional accuracy. in

c this case iguess=iparm(12)=1 should also be used.

c

c (2) cuh2cr is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) cuh2cr is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to cuh2cr

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c if any of (a) through (e) are true then the elliptic pde

c must be discretized or rediscretized. if none of (a)

c through (e) holds, calls can be made with intl=1.

c incorrect calls with intl=1 will produce erroneous results.

c *** the values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.
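c
c *** a minimal sketch of how the preferred factors in the example
c above could be found. gridfac is a hypothetical helper (it is not
c part of mudpack) that assumes n .ge. 3 and removes powers of 2
c from n-1 until the remaining factor is 2 or odd.

```fortran
      subroutine gridfac(n, ip, ie)
      ! hypothetical helper: write n = ip*2**(ie-1) + 1 with
      ! ip .ge. 2 as small as possible (assumes n .ge. 3)
      implicit none
      integer n, ip, ie
      ip = n - 1
      ie = 1
      ! move factors of 2 from ip into the exponent ie
      do while (mod(ip,2).eq.0 .and. ip.gt.2)
         ip = ip/2
         ie = ie + 1
      end do
      end

      program factor
      implicit none
      integer ixp, iex, jyq, jey
      call gridfac(33, ixp, iex)
      call gridfac(97, jyq, jey)
      print *, 'ixp =', ixp, ' iex =', iex
      print *, 'jyq =', jyq, ' jey =', jey
      end program factor
```

c for the 33 by 97 grid this yields ixp=2, iex=5 and jyq=3, jey=6.
c the sketch does not check the documented limit iex,jey .le. 50.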

c

c *** note

c

c let g be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c g(1) < ... < g(k) < ... < g(n) = g.

c

c each g(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1
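c
c *** a minimal sketch that tabulates mx(k) and my(k) from the
c formulas above, assuming the hypothetical choice ixp=2, jyq=3,
c iex=5, jey=6 (a 33 by 97 fine grid).

```fortran
      program chain
      implicit none
      integer ixp, jyq, iex, jey, n, k, mx, my
      ixp = 2
      jyq = 3
      iex = 5
      jey = 6
      ! number of grid levels in the chain g(1) < ... < g(n)
      n = max(iex, jey)
      do k = 1, n
         mx = ixp*2**(max(iex+k-n,1)-1) + 1
         my = jyq*2**(max(jey+k-n,1)-1) + 1
         print *, 'k =', k, ' mx =', mx, ' my =', my
      end do
      end program chain
```

c here the coarsest grid g(1) is 3 by 4 and the finest grid g(6)
c is 33 by 97. coarsening in x stops one level before coarsening
c in y because iex is smaller than jey.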

c

c

c

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps are

c performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few cycles are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y direction

c

c

c *** choice of method. . .

c

c let fx represent the quantity cxx(x,y)/dlx**2 over the solution region.

c

c let fy represent the quantity cyy(x,y)/dly**2 over the solution region

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2

c

c if neither fx or fy dominates over the solution region and they

c both vary considerably choose method = 3.

c

c

c ... length = iparm(15)

c

c the length of the work space provided in the complex work

c space "work" (see below).

c

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if (method = 1 or method = 3) and nxa.ne.0

c let isx = 5 if (method = 1 or method = 3) and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if (method = 2 or method = 3) and nyc.ne.0

c let jsy = 5 if (method = 2 or method = 3) and nyc.eq.0

c then . . .

c

c length = [7*(nx+2)*(ny+2)+4*(11+isx+jsy)*nx*ny]/3

c

c will suffice in most cases. the exact minimal work space

c length required for the current nx,ny and method is output

c in iparm(16) (even if iparm(15) is too small). this will be

c less than the value given by the simplified formula above

c in most cases.
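c
c *** a minimal sketch of the simplified work space formula above,
c assuming hypothetical inputs nx=33, ny=97, method=3, nxa=1 and
c nyc=0. the program name and layout are illustrative only.

```fortran
      program wklen
      implicit none
      integer nx, ny, method, nxa, nyc, isx, jsy, length
      ! hypothetical grid size, relaxation method, boundary flags
      nx = 33
      ny = 97
      method = 3
      nxa = 1
      nyc = 0
      isx = 0
      jsy = 0
      ! line relaxation in x needs extra storage
      if (method.eq.1 .or. method.eq.3) then
         isx = 3
         if (nxa.eq.0) isx = 5
      end if
      ! line relaxation in y needs extra storage
      if (method.eq.2 .or. method.eq.3) then
         jsy = 3
         if (nyc.eq.0) jsy = 5
      end if
      length = (7*(nx+2)*(ny+2) + 4*(11+isx+jsy)*nx*ny)/3
      print *, 'length =', length
      end program wklen
```

c for these inputs the formula gives length = 89177. the exact
c minimum returned in iparm(16) should be preferred once known.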

c

c

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in cuh2cr and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible do not use error control!).
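c
c *** a minimal sketch of the convergence test above, assuming two
c small hypothetical complex iterates phi1 and phi2 on a 3 by 2
c grid. it illustrates the formula only and is not the mudpack
c source. the guard for phmax = 0.0 avoids a division by zero.

```fortran
      program reldif
      implicit none
      integer nx, ny, i, j
      parameter (nx = 3, ny = 2)
      complex phi1(nx,ny), phi2(nx,ny)
      real phdif, phmax, tolmax
      tolmax = 1.0e-3
      ! hypothetical iterates that differ by a small amount
      do j = 1, ny
         do i = 1, nx
            phi1(i,j) = cmplx(real(i), real(j))
            phi2(i,j) = phi1(i,j) + cmplx(1.0e-4, 0.0)
         end do
      end do
      ! maximum difference and maximum modulus over all i,j
      phdif = 0.0
      phmax = 0.0
      do j = 1, ny
         do i = 1, nx
            phdif = max(phdif, abs(phi2(i,j)-phi1(i,j)))
            phmax = max(phmax, abs(phi2(i,j)))
         end do
      end do
      ! relative difference (left unscaled if phmax is zero)
      if (phmax.gt.0.0) phdif = phdif/phmax
      print *, 'relative difference =', phdif
      print *, 'converged =', phdif.lt.tolmax
      end program reldif
```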

c

c ... work

c

c a complex saved work space (see iparm(15) for size) which

c must be preserved from the previous call when calling with

c intl=iparm(1)=1.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,beta,gama,gbdy) which

c are used to input mixed boundary conditions to cuh2cr. bndyc

c must be declared "external" in the program calling cuh2cr. kbdy

c is type integer, xory real, and alfa,beta,gama,gbdy type complex.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tcuh2cr.f" for an example of how bndyc

c can be set up).

c

c * * * * * * * * * * * * y=yd

c * kbdy=4 *

c * *

c * *

c * *

c * kbdy=1 kbdy=2 *

c * *

c * *

c * *

c * kbdy=3 *

c * * * * * * * * * * * * y=yc

c

c x=xa x=xb

c

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2) = 2 flags

c a mixed boundary condition of the form

c

c alfxa(y)*px + betxa(y)*py + gamxa(y)*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfxa(y),betxa(y),gamxa(y),

c gbdxa(y) must be returned. alfxa(y) = 0. is not allowed for any y.

c (see ierror = 13)

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3) = 2 flags

c a mixed boundary condition of the form

c

c alfxb(y)*px + betxb(y)*py + gamxb(y)*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfxb(y),betxb(y),gamxb(y),

c gbdxb(y) must be returned. alfxb(y) = 0.0 is not allowed for any y.

c (see ierror = 13)

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4) = 2 flags

c a mixed boundary condition of the form

c

c alfyc(x)*px + betyc(x)*py + gamyc(x)*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfyc(x),betyc(x),gamyc(x),

c gbdyc(x) must be returned. betyc(x) = 0.0 is not allowed for any x.

c (see ierror = 13)

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5) = 2 flags

c a mixed boundary condition of the form

c

c alfyd(x)*px + betyd(x)*py + gamyd(x)*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfyd(x),betyd(x),gamyd(x),

c gbdyd(x) must be returned. betyd(x) = 0.0 is not allowed for any x.

c (see ierror = 13)

c

c

c *** bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c cuh2cr will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling cuh2cr. the actual name chosen may

c be different.
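c
c *** a minimal sketch of a bndyc routine, assuming a hypothetical
c mixed condition px + (0,1)*p = 0 on the edge x=xb (kbdy=2) and
c no other mixed boundaries. the coefficient values and the small
c driver tbndyc are illustrative and are not taken from tcuh2cr.f.

```fortran
      subroutine bndyc(kbdy, xory, alfa, beta, gama, gbdy)
      implicit none
      integer kbdy
      real xory
      complex alfa, beta, gama, gbdy
      ! hypothetical condition px + (0,1)*p = 0 on x = xb;
      ! xory (= y here) is unused since the values are constant
      if (kbdy.eq.2) then
         alfa = (1.0, 0.0)
         beta = (0.0, 0.0)
         gama = (0.0, 1.0)
         gbdy = (0.0, 0.0)
      end if
      return
      end

      program tbndyc
      implicit none
      external bndyc
      complex alfa, beta, gama, gbdy
      ! evaluate the kbdy=2 condition at the point y = 0.5
      call bndyc(2, 0.5, alfa, beta, gama, gbdy)
      print *, 'alfa =', alfa, ' beta =', beta
      print *, 'gama =', gama, ' gbdy =', gbdy
      end program tbndyc
```

c alfa is nonzero on this x boundary, as the kbdy=2 description
c requires. a routine serving several mixed boundaries would add
c one branch per flagged value of kbdy.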

c

c

c ... coef

c

c a subroutine with arguments (x,y,cxx,cxy,cyy,cx,cy,ce) which

c provides the known complex coefficients for the elliptic pde at

c any grid point (x,y). the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c "external."
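c
c *** a minimal sketch of a coef routine. it assumes that x and y
c are type real and that cxx, cxy, cyy multiply pxx, pxy, pyy,
c that cx, cy multiply px, py, and that ce multiplies p. these
c are assumptions, as is the hypothetical pde
c pxx + pyy + (0,1)*(x+y)*p = r used for illustration.

```fortran
      subroutine coef(x, y, cxx, cxy, cyy, cx, cy, ce)
      implicit none
      real x, y
      complex cxx, cxy, cyy, cx, cy, ce
      ! hypothetical pde: pxx + pyy + (0,1)*(x+y)*p = r
      cxx = (1.0, 0.0)
      cxy = (0.0, 0.0)
      cyy = (1.0, 0.0)
      cx = (0.0, 0.0)
      cy = (0.0, 0.0)
      ce = cmplx(0.0, x + y)
      return
      end

      program tcoef
      implicit none
      complex cxx, cxy, cyy, cx, cy, ce
      ! evaluate the coefficients at the point (0.25,0.5)
      call coef(0.25, 0.5, cxx, cxy, cyy, cx, cy, ce)
      print *, 'cxx =', cxx, ' cyy =', cyy, ' ce =', ce
      end program tcoef
```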

c

c ... rhs

c

c a complex array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c a complex array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...nx

c prior to calling cuh2cr. these values are preserved by cuh2cr.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all interior and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.
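c
c *** a minimal sketch of setting and screening mgopt. chkopt is a
c hypothetical helper (it is not part of mudpack) that mirrors the
c documented rules: ierror = 12 for invalid entries and the
c nonfatal ierror = -5 when kcycle, iprer or ipost exceeds 2.

```fortran
      subroutine chkopt(mgopt, ierror)
      ! hypothetical helper mirroring the documented mgopt rules
      implicit none
      integer mgopt(4), ierror
      ierror = 0
      ! mgopt(1) = 0 selects defaults; other entries are ignored
      if (mgopt(1).eq.0) return
      if (mgopt(1).lt.0 .or. mgopt(2).lt.1 .or. mgopt(3).lt.1) then
         ierror = 12
      else if (mgopt(4).ne.1 .and. mgopt(4).ne.3) then
         ierror = 12
      else if (max(mgopt(1),mgopt(2),mgopt(3)).gt.2) then
         ierror = -5
      end if
      end

      program topt
      implicit none
      integer mgopt(4), ierror
      ! v(2,1) cycles with linear prolongation
      data mgopt /1, 2, 1, 1/
      call chkopt(mgopt, ierror)
      print *, 'v(2,1), linear: ierror =', ierror
      ! kcycle = 3 is legal but flagged as inefficient
      mgopt(1) = 3
      call chkopt(mgopt, ierror)
      print *, 'kcycle = 3: ierror =', ierror
      end program topt
```

c the first call returns ierror = 0 and the second ierror = -5.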

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4

c * . .

c * . .

c ---------------2-----------1-----2-----------1--------- level 3

c * . . . .

c * . . . .

c ------2-----1-----2-----1-----------2-----1------------ level 2

c * . . . . . .

c * . . . . . .

c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4

c * . .

c ----------2-----------1---2-----------3-----------1---- level 3

c * . . . . . .

c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2

c * . . . . . . . . . . . . . .

c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
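c
c *** a minimal sketch of the recursion structure only, assuming
c kcycle .ge. 1. this hypothetical routine does no relaxation or
c grid transfer; it counts how often algorithm cgc is entered at
c each level. it uses fortran90 recursion, whereas mudpack itself
c implements the cycling non-recursively.

```fortran
      recursive subroutine cgc(k, kcycle, nvisit)
      ! hypothetical counter: nvisit(k) is the number of times
      ! the coarse grid correction is entered at grid level k
      implicit none
      integer k, kcycle, nvisit(*), kount
      nvisit(k) = nvisit(k) + 1
      if (k.gt.1) then
         ! solve at level k-1 "kcycle" times (the recursion)
         do kount = 1, kcycle
            call cgc(k-1, kcycle, nvisit)
         end do
      end if
      end subroutine cgc

      program tcgc
      implicit none
      integer nvisit(4), k
      do k = 1, 4
         nvisit(k) = 0
      end do
      ! one w cycle (kcycle = 2) started at level 4
      call cgc(4, 2, nvisit)
      print *, 'w cycle visits, levels 1..4:', nvisit
      end program tcgc
```

c one w cycle from level 4 enters levels 1 through 4 a total of
c 8, 4, 2 and 1 times respectively. with kcycle = 1 (v cycle)
c each level is entered once, which is why v cycles are the least
c expensive per cycle.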

c

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see length as an input argument).

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0. only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0).

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.

c

c

c ... work

c

c on output work contains intermediate values that must not

c be destroyed if cuh2cr is to be called again with intl=1

c

c

c ... phi *** for intl=1 calls only

c

c on output phi(i,j) contains the approximation to p(xi,yj)

c for all mesh points i = 1,...,nx and j=1,...,ny. the last

c computed iterate in phi is returned even if convergence is

c not obtained

c

c

c ... ierror

c

c for intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj))

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(5)>0.

c is not obtained in maxcy=iparm(13) multigrid cycles between the

c coarsest (ixp+1,jyq+1) and finest (nx,ny) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags
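c
c *** a minimal sketch of the ierror = -4 test on first order
c terms in the x direction, assuming hypothetical coefficients
c cxx = 0.01 and cx = 1.0 at one fine grid point with nx = 33 on
c [0,1]. it illustrates the inequality only.

```fortran
      program hyperb
      implicit none
      complex cxx, cx
      real xa, xb, dlx
      integer nx
      xa = 0.0
      xb = 1.0
      nx = 33
      dlx = (xb-xa)/(nx-1)
      ! hypothetical coefficients at one fine grid point
      cxx = (0.01, 0.0)
      cx = (1.0, 0.0)
      if (abs(cx)*dlx .gt. 2.0*abs(cxx)) then
         print *, 'first order term dominates: expect ierror = -4'
      else
         print *, 'no adjustment needed at this point'
      end if
      end program hyperb
```

c with nx = 33 the test is triggered since 1.0/32 exceeds 0.02.
c with nx = 65 the product 1.0/64 falls below 0.02, which
c illustrates the remedy of increasing resolution.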

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd

c in iparm(2),iparm(3),iparm(4),iparm(5) are not 0,1 or 2

c or if nxa,nxb or nyc,nyd are not pairwise zero.

c

c = 3 if min0(ixp,jyq) < 2 (ixp = iparm(6), jyq = iparm(7))

c

c = 4 if min0(iex,jey) < 1 (iex = iparm(8), jey = iparm(9)) or

c if max0(iex,jey) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or ny.ne.jyq*2**(jey-1)+1

c (nx = iparm(10), ny = iparm(11))

c

c = 6 if iguess = iparm(12) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(13) < 1

c

c = 8 if method = iparm(14) is not 0,1,2, or 3

c

c = 9 if length = iparm(15) is too small (see iparm(16) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4))

c

c =11 if tolmax = fparm(5) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c

c =13 if there is a pure tangential derivative along a mixed derivative

c boundary (e.g., nyd = 2 and betyd(x) = 0.0 for some

c grid point x along y = yd)

c

c =14 if there is the "singular" condition described below at a

c corner which is the intersection of two derivative boundaries.

c

c (1) the corner (xa,yc) if nxa=nyc=2 and

c alfxa(yc)*betyc(xa)-alfyc(xa)*betxa(yc) = 0.0.

c

c (2) the corner (xa,yd) if nxa=nyd=2 and

c alfxa(yd)*betyd(xa)-alfyd(xa)*betxa(yd) = 0.0.

c

c (3) the corner (xb,yc) if nxb=nyc=2 and

c alfxb(yc)*betyc(xb)-alfyc(xb)*betxb(yc) = 0.0.

c

c (4) the corner (xb,yd) if nxb=nyd=2 and

c alfxb(yd)*betyd(xb)-alfyd(xb)*betxb(yd) = 0.0.

c

c *** the conditions described in ierror = 13 or 14 will lead to division

c by zero during discretization if undetected.

c

c

c *********************************************************

c *********************************************************

c

c end of cuh2cr documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file cuh3.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file cuh3.d

c

c contains documentation for:

c subroutine cuh3(iparm,fparm,wk,iw,coef,bndyc,rhs,phi,mgopt,ierror)

c a sample fortran driver is file "tcuh3.f".

c

c ... required mudpack files

c

c cudcom.f, cud3ln.f, cud3pn.f

c

c ... purpose

c

c the complex "hybrid" multigrid/direct method code cuh3 approximates

c the same 3-d nonseparable elliptic pde as the mudpack solver cud3.

c the basic algorithm is modified by using block banded gaussian

c elimination in place of relaxation whenever the coarsest subgrid is

c encountered within multigrid cycling. use of the direct method at

c the coarsest grid level gives cuh3 at least two advantages over cud3:

c

c (1) improved convergence rates

c

c the use of a direct method at the coarsest grid level can

c improve convergence rates at a small additional computational

c cost if the coarse grid parameters ixp=iparm(8),jyq=iparm(9),

c kzr=iparm(10) are small relative to the fine grid parameters

c nx=iparm(14),ny=iparm(15),nz=iparm(16). this is especially true

c in the presence of certain boundary conditions. for example,

c if all boundary conditions are Neumann (pure derivative) and/or

c periodic then cud3 may fail to converge. cuh3 should handle

c these boundary conditions with the expected multigrid efficiency

c (see tcuh3.f). in all cases, cuh3 should give convergence

c rates which equal or exceed cud3.

c

c

c (2) more resolution choices

c

c cuh3 allows more grid size flexibility by "relaxing" the

c constraint on the coarse grid parameters that ixp,jyq,kzr

c be "very" small (2 or 3 for cud3) for effective error

c reduction within multigrid cycling. convergence

c rates will not deteriorate with larger values for ixp,

c jyq,kzr.

c

c *** caution

c

c because of the very large computational and storage

c requirements, the three-dimensional direct

c method costs can overwhelm the multigrid cycling costs

c if the coarsest grid is not small relative to the finest

c solution grid. this is a user decision set by the choice

c of coarse and fine grid parameters (see iparm(8) through

c iparm(16) and iparm(21) descriptions).

c

c subroutine cuh3 automatically discretizes and attempts to compute

c the second order finite difference approximation to a three-

c dimensional linear nonseparable elliptic partial differential

c equation on a box. the approximation is generated on a uniform

c grid covering the box (see mesh description below). boundary

c conditions may be any combination of mixed, specified (Dirichlet)

c or periodic. the form of the pde solved is . . .

c

c cxx(x,y,z)*pxx + cyy(x,y,z)*pyy + czz(x,y,z)*pzz +

c

c cx(x,y,z)*px + cy(x,y,z)*py + cz(x,y,z)*pz +

c

c ce(x,y,z)*p(x,y,z) = r(x,y,z)

c

c here cxx,cyy,czz,cx,cy,cz,ce are the known complex coefficients

c of the pde; pxx,pyy,pzz,px,py,pz are the second and first partial

c derivatives of the unknown complex solution function p(x,y,z)

c with respect to the independent variables x,y,z; r(x,y,z) is

c the known complex right hand side of the elliptic pde. cxx,cyy

c and czz should have real or imaginary parts positive for all (x,y,z)

c

c

c ... argument differences with cud3.f

c

c the input and output arguments of cuh3 are almost identical to the

c arguments of cud3 (see cud3.d) with the following exceptions:

c

c (1) let mx=ixp+1, my=jyq+1, mz=kzr+1 (the coarsest grid

c resolutions, ixp=iparm(8), jyq=iparm(9), kzr=iparm(10))

c then the work space vector "wk" requires

c

c mx*my*mz*(2*mx*my+1) (nze.ne.0)

c

c additional words of storage if periodic boundary conditions

c are not flagged in the z direction or

c

c mx*my*(mz*(4*mx*my+1)) (nze=0)

c

c additional words of storage if periodic boundary conditions are

c flagged in the z direction (nze = 0). the extra work space is

c used for a direct solution with gaussian elimination whenever the

c coarsest grid is encountered within multigrid cycling.

c

c (2) an integer work space iwk of length at least mx*my*mz words

c must be provided. the length of iwk is not checked!

c

c (3) kzr > 2 if nze=0, jyq > 2 if nyc=0, ixp > 2 if nxa = 0 are

c required (i.e., the coarsest grid must contain at least four

c points in any direction with periodic boundary conditions,

c see the expanded meaning of ierror=3).

c

c *** (4) it is no longer necessary that ixp,jyq,kzr be 2 or 3 for

c effective error reduction with multigrid iteration. there

c is no reduction in convergence rates when larger values for

c ixp,jyq,kzr are used. this provides additional flexibility

c in choosing grid size. in many cases cuh3 provides more

c robust convergence than cud3. it can be used in place of

c cud3 for all nonsingular problems (see (6) below).

c

c (5) iguess = iparm(17) = 1 (flagging an initial guess) or

c maxcy = iparm(18) > 1 (setting more than one multigrid

c cycle) are not allowed if cuh3 becomes a full direct method

c by choosing iex=jey=kez=1. this conflicting combination

c of input arguments for multigrid iteration and a full

c direct method sets the fatal error flag

c

c ierror = 13

c

c iguess = 0 and maxcy = 1 are required when cuh3 becomes a

c full direct method. ordinarily (see *** caution above) this

c should not happen except when testing with very coarse resolution.

c

c

c (6) if a "singular" pde is detected (see ierror=-3 description in

c cud3.d, ce(x,y,z) = 0.0 for all x,y,z and the boundary conditions

c are a combination of periodic and/or pure derivatives) then cuh3

c sets the fatal error flag

c

c ierror = 14

c

c the direct method utilized by cuh3 would likely cause a near

c division by zero in the singular case. cud3 can be tried for

c singular problems.

c

c ... grid size considerations

c

c (1) flexibility

c

c cuh3 should be used in place of cud3 whenever grid size

c requirements do not allow choosing ixp,jyq,kzr to be 2 or 3.

c

c (2) additional work space (see (1) under "argument differences") is

c required by cuh3 to implement gaussian elimination at the coarsest

c grid level. this may limit the size of ixp,jyq,kzr.

c

c (3) operation counts

c for simplicity, assume p=ixp=jyq=kzr and n=nx=ny=nz=(2**k)*p.

c gaussian elimination requires o(p**7) operations for solution

c on the coarsest subgrid while multigrid iteration is a o(n**3)

c algorithm. consequently the storage and computational

c requirements for the 3-d direct method will dominate the

c calculation if p is "large." note that o(p**7)=:o(n**3)

c whenever k =: (4/3)*log2(p) grid levels are used in cycling.

c smaller values for k mean the direct method will dominate the

c calculation. larger values for k mean the direct method

c will only marginally add to the cost of multigrid iteration

c alone.

c
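
c the operation count comparison above can be explored numerically.
c this python sketch ignores all constants and uses the simplified
c model n = (2**k)*p from the text, so the ratio of the two estimates
c is p**7/n**3 = p**4/8**k. the function names are hypothetical.

```python
import math


def break_even_levels(p):
    """Grid levels k at which p**7 and n**3 = (2**k * p)**3 balance."""
    return 4.0 * math.log2(p) / 3.0


def direct_to_multigrid_ratio(p, k):
    """Ratio p**7 / n**3 with n = 2**k * p (constants ignored)."""
    n = (2 ** k) * p
    return p ** 7 / n ** 3


print(break_even_levels(8))             # 4.0
print(direct_to_multigrid_ratio(8, 4))  # 1.0
print(direct_to_multigrid_ratio(8, 6))  # 0.015625
```

c these are order-of-magnitude estimates only; actual timings depend
c on the relaxation method and on the machine.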

c *** the remaining documentation is almost identical to cud3.d

c except for the modifications already indicated.

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny by nz grid.

c the grid is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd] x [ze,zf].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1), dlz = (zf-ze)/(nz-1)

c

c be the uniform grid increments in the x,y,z directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly, zk = ze+(k-1)*dlz

c

c for i=1,...,nx; j=1,...,ny; k=1,...,nz denote the x,y,z uniform

c mesh points.

c
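
c a minimal python sketch of the mesh definition above for one
c coordinate direction. the function name is hypothetical; the same
c formula applies to x, y and z.

```python
def mesh_points(a, b, n):
    """Return the n uniform points a + (i-1)*dl, i = 1..n, on [a, b]."""
    if n < 2 or a >= b:
        raise ValueError("need n >= 2 and a < b")
    dl = (b - a) / (n - 1)
    return [a + (i - 1) * dl for i in range(1, n + 1)]


# x mesh for xa = 0, xb = 1, nx = 5 (dlx = 0.25)
print(mesh_points(0.0, 1.0, 5))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```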

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 23 used to efficiently pass

c integer arguments. iparm is set internally in cuh3

c and defined as follows . . .

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formulas.

c

c *** an approximation is not generated after an intl=0 call!

c cuh3 should be called with intl=1 to approximate the elliptic

c pde discretized by the intl=0 call. intl=1 should also

c be input if cuh3 has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. this will bypass

c redundant pde discretization and argument checking

c and save computational time. some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) cuh3 is being recalled for additional accuracy. in

c this case iguess=iparm(17)=1 should also be used.

c

c (2) cuh3 is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) cuh3 is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to cuh3

c (b) any of the integer arguments other than iguess=iparm(17)

c or maxcy=iparm(18) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(7) have changed since the previous call.

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c if any of (a) through (e) are true then the elliptic pde

c must be discretized or rediscretized. if none of (a)

c through (e) holds, calls can be made with intl=1.

c incorrect calls with intl=1 will produce erroneous results.

c *** the values set in the saved work space "wk" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the (y,z) plane x=xa

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xa,y,z) is specified (this must be input thru phi(1,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see "bndyc" description below where kbdy = 1)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the (y,z) plane x=xb

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xb,y,z) is specified (this must be input thru phi(nx,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see "bndyc" description below where kbdy = 2)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the (x,z) plane y=yc

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yc,z) is specified (this must be input thru phi(i,1,k))

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see "bndyc" description below where kbdy = 3)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the (x,z) plane y=yd

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yd,z) is specified (this must be input thru phi(i,ny,k))

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see "bndyc" description below where kbdy = 4)

c

c

c ... nze=iparm(6)

c

c flags boundary conditions on the (x,y) plane z=ze

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,ze) is specified (this must be input thru phi(i,j,1))

c = 2 if there are mixed derivative boundary conditions at z=ze

c (see "bndyc" description below where kbdy = 5)

c

c

c ... nzf=iparm(7)

c

c flags boundary conditions on the (x,y) plane z=zf

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,zf) is specified (this must be input thru phi(i,j,nz))

c = 2 if there are mixed derivative boundary conditions at z=zf

c (see "bndyc" description below where kbdy = 6)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(8)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(14)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible

c within grid size requirements.

c

c

c ... jyq = iparm(9)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(15)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible

c within grid size requirements.

c

c

c ... kzr = iparm(10)

c

c an integer greater than one which is used in defining the number

c of grid points in the z direction (see nz = iparm(16)). "kzr+1"

c is the number of points on the coarsest z grid visited during

c multigrid cycling. kzr should be chosen as small as possible

c within grid size requirements.

c

c

c ... iex = iparm(11)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(14)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(8)

c as small as possible within grid size constraints when

c defining nx = iparm(14).

c

c

c ... jey = iparm(12)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(15)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(9)

c as small as possible within grid size constraints when

c defining ny = iparm(15).

c

c

c ... kez = iparm(13)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the z direction (see nz = iparm(16)).

c kez .le. 50 is required. for efficient multigrid cycling,

c kez should be chosen as large as possible and kzr=iparm(10)

c as small as possible within grid size constraints when

c defining nz = iparm(16).

c

c

c ... nx = iparm(14)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(8), iex = iparm(11).

c

c

c ... ny = iparm(15)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(9), jey = iparm(12).

c

c

c ... nz = iparm(16)

c

c the number of equally spaced grid points in the interval [ze,zf]

c (including the boundaries). nz must have the form

c

c nz = kzr*(2**(kez-1)) + 1

c

c where kzr = iparm(10), kez = iparm(13)

c
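
c the grid size constraint above can be checked before calling the
c solver. this python sketch uses hypothetical helper names; it only
c restates nx = ixp*(2**(iex-1)) + 1 and lists the admissible
c (ixp,iex) pairs for a desired nx, smallest ixp first. the same
c helpers apply to (jyq,jey,ny) and (kzr,kez,nz).

```python
def grid_points(p, e):
    """Number of fine grid points p*2**(e-1) + 1 in one direction."""
    if p < 2 or not 1 <= e <= 50:
        raise ValueError("need p > 1 and 1 <= e <= 50")
    return p * 2 ** (e - 1) + 1


def decompositions(n):
    """All (p, e) with n == p*2**(e-1) + 1 and p > 1, smallest p first."""
    pairs = []
    m, e = n - 1, 1
    while m >= 2:
        pairs.append((m, e))
        if m % 2:
            break
        m //= 2
        e += 1
    return sorted(pairs)


print(grid_points(3, 5))        # 49
print(decompositions(49)[0])    # (3, 5)
```

c the first pair returned has the smallest coarse grid parameter. note
c that cuh3 additionally requires a coarse grid parameter greater than
c 2 in any direction with periodic boundary conditions.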

c

c *** note

c

c let g be the nx by ny by nz fine grid on which the approximation is

c generated and let n = max0(iex,jey,kez). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c g(1) < ... < g(k) < ... < g(n) = g.

c

c each g(k) (k=1,...,n) has mx(k) by my(k) by mz(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c

c mz(k) = kzr*[2**(max0(kez+k-n,1)-1)] + 1

c

c additionally cuh3 implements a direct method whenever g(1) is

c encountered.

c
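
c a short python sketch of the grid chain g(1) < ... < g(n) defined in
c the note above. the function name is hypothetical; the formulas for
c mx(k), my(k), mz(k) are taken directly from the text.

```python
def grid_chain(ixp, jyq, kzr, iex, jey, kez):
    """Return [(mx(k), my(k), mz(k)) for k = 1..n], n = max(iex, jey, kez)."""
    n = max(iex, jey, kez)

    def points(p, e, k):
        return p * 2 ** (max(e + k - n, 1) - 1) + 1

    return [
        (points(ixp, iex, k), points(jyq, jey, k), points(kzr, kez, k))
        for k in range(1, n + 1)
    ]


# ixp=2, jyq=3, kzr=2 with iex=4, jey=3, kez=4
for level in grid_chain(2, 3, 2, 4, 3, 4):
    print(level)
# (3, 4, 3)
# (5, 4, 5)
# (9, 7, 9)
# (17, 13, 17)
```

c the first entry is the coarsest grid, where cuh3 applies the direct
c method, and the last entry is the fine nx by ny by nz grid.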

c ... iguess=iparm(17)

c

c = 0 if no initial guess to the pde is provided

c and/or full multigrid cycling beginning at the

c coarsest grid level is desired.

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below). in this case

c cycling beginning or restarting at the finest grid

c is initiated.

c

c *** comments on iguess = 0 or 1 . . .

c

c

c setting iguess=0 forces full multigrid or "fmg" cycling. phi

c must be initialized at all grid points. it can be set to zero at

c non-Dirichlet grid points if nothing better is available.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c *** time dependent problems . . .

c

c assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-Dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c
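
c the correction-term identity above relies only on the linearity of
c "l". the python sketch below checks it on a small stand-in problem:
c a 1-d tridiagonal operator solved with the thomas algorithm takes
c the place of cuh3. every name and number in it is hypothetical and
c serves only to illustrate p(t+dt) = p(t) + e(t,dt).

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


n = 5
lower, diag, upper = [1.0] * n, [-2.0] * n, [1.0] * n  # time independent "l"


def solve(rhs):
    return solve_tridiagonal(lower, diag, upper, rhs)


r_old = [1.0, 2.0, 3.0, 2.0, 1.0]   # r(t)
r_new = [1.5, 2.5, 3.5, 2.5, 1.5]   # r(t+dt)

p_old = solve(r_old)                                  # p(t)
e = solve([a - b for a, b in zip(r_new, r_old)])      # l(e) = r(t+dt) - r(t)
p_new = [p + de for p, de in zip(p_old, e)]           # p(t+dt) = p(t) + e

direct = solve(r_new)                                 # reference solution
max_diff = max(abs(a - b) for a, b in zip(p_new, direct))
print(max_diff)  # at round-off level
```

c with an iterative solver the benefit described in the text comes
c from e(t,dt) being small relative to p(t); this sketch uses a direct
c solve and therefore only confirms the identity itself.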

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(18)

c

c the exact number of cycles executed between the finest

c (nx by ny by nz) and the coarsest ((ixp+1) by (jyq+1) by

c (kzr+1)) grid levels when tolmax=fparm(7)=0.0 (no error

c control). when tolmax=fparm(7).gt.0.0 is input (error control)

c then maxcy is a limit on the number of cycles between the

c finest and coarsest grid levels. in any case, at most

c maxcy*(iprer+ipost) relaxation sweeps are performed at the

c finest grid level (see iprer=mgopt(2),ipost=mgopt(3) below).

c when multigrid iteration is working "correctly" only a few

c cycles are required for convergence. large values for maxcy

c should not be required.

c

c

c ... method = iparm(19)

c

c this sets the method of relaxation (all relaxation

c schemes in mudpack use red/black type ordering)

c

c = 0 for gauss-seidel pointwise relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in the z direction

c

c = 4 for line relaxation in the x and y direction

c

c = 5 for line relaxation in the x and z direction

c

c = 6 for line relaxation in the y and z direction

c

c = 7 for line relaxation in the x,y and z direction

c

c = 8 for x,y planar relaxation

c

c = 9 for x,z planar relaxation

c

c =10 for y,z planar relaxation

c

c *** choice of method

c

c this is very important for efficient convergence. in some cases

c experimentation may be required.

c

c let fx represent the quantity cxx(x,y,z)/dlx**2 over the solution box

c

c let fy represent the quantity cyy(x,y,z)/dly**2 over the solution box

c

c let fz represent the quantity czz(x,y,z)/dlz**2 over the solution box

c

c (0) if fx,fy,fz are roughly the same size and do not vary too

c much choose method = 0. if this fails try method = 7.

c

c (1) if fx is much greater than fy,fz and fy,fz are roughly the same

c size choose method = 1

c

c (2) if fy is much greater than fx,fz and fx,fz are roughly the same

c size choose method = 2

c

c (3) if fz is much greater than fx,fy and fx,fy are roughly the same

c size choose method = 3

c

c (4) if fx,fy are roughly the same and both are much greater than fz

c try method = 4. if this fails try method = 8

c

c (5) if fx,fz are roughly the same and both are much greater than fy

c try method = 5. if this fails try method = 9

c

c (6) if fy,fz are roughly the same and both are much greater than fx

c try method = 6. if this fails try method = 10

c

c (7) if fx,fy,fz vary considerably with none dominating try method = 7

c

c (8) if fx and fy are considerably greater than fz but not necessarily

c the same size and method=4 fails try method = 8

c

c (9) if fx and fz are considerably greater than fy but not necessarily

c the same size and method=5 fails try method = 9

c

c (10) if fy and fz are considerably greater than fx but not necessarily

c the same size and method=6 fails try method = 10

c
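
c guidelines (0) through (7) above can be turned into a first guess
c for method=iparm(19). the python sketch below is only a heuristic:
c it assumes the caller has summarized fx,fy,fz as single positive
c numbers, and the thresholds "much" and "same" are hypothetical
c stand-ins for "much greater" and "roughly the same" (the
c documentation gives no numerical values). planar relaxation
c (methods 8, 9, 10) is left as a manual fallback.

```python
def suggest_method(fx, fy, fz, much=10.0, same=2.0):
    """Suggest iparm(19) from representative positive sizes fx, fy, fz."""
    def similar(a, b):
        return max(a, b) <= same * min(a, b)

    def dominates(a, b):
        return a >= much * b

    if similar(fx, fy) and similar(fy, fz) and similar(fx, fz):
        return 0                      # guideline (0): point relaxation
    if dominates(fx, fy) and dominates(fx, fz) and similar(fy, fz):
        return 1                      # guideline (1): x lines
    if dominates(fy, fx) and dominates(fy, fz) and similar(fx, fz):
        return 2                      # guideline (2): y lines
    if dominates(fz, fx) and dominates(fz, fy) and similar(fx, fy):
        return 3                      # guideline (3): z lines
    if similar(fx, fy) and dominates(fx, fz) and dominates(fy, fz):
        return 4                      # guideline (4): x and y lines
    if similar(fx, fz) and dominates(fx, fy) and dominates(fz, fy):
        return 5                      # guideline (5): x and z lines
    if similar(fy, fz) and dominates(fy, fx) and dominates(fz, fx):
        return 6                      # guideline (6): y and z lines
    return 7                          # guideline (7): x, y and z lines


print(suggest_method(1.0, 1.0, 1.0))     # 0
print(suggest_method(100.0, 90.0, 1.0))  # 4
print(suggest_method(1.0, 5.0, 30.0))    # 7
```

c as the text notes, experimentation may still be required; if the
c suggested line method fails, the corresponding planar method can be
c tried next.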

c

c ... meth2 = iparm(20) determines the method of relaxation used in the planes

c when method = 8 or 9 or 10.

c

c

c as above, let fx,fy,fz represent the quantities cxx/dlx**2,

c cyy/dly**2,czz/dlz**2 over the box.

c

c (if method = 8)

c

c = 0 for gauss-seidel pointwise relaxation

c in the x,y plane for each fixed z

c = 1 for line relaxation in the x direction

c in the x,y plane for each fixed z

c = 2 for line relaxation in the y direction

c in the x,y plane for each fixed z

c = 3 for line relaxation in the x and y direction

c in the x,y plane for each fixed z

c

c (1) if fx,fy are roughly the same and vary little choose meth2 = 0

c (2) if fx is much greater than fy choose meth2 = 1

c (3) if fy is much greater than fx choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c (if method = 9)

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

c in the x,z plane for each fixed y

c = 1 for simultaneous line relaxation in the x direction

c of the x,z plane for each fixed y

c = 2 for simultaneous line relaxation in the z direction

c of the x,z plane for each fixed y

c = 3 for simultaneous line relaxation in the x and z direction

c of the x,z plane for each fixed y

c

c (1) if fx,fz are roughly the same and vary little choose meth2 = 0

c (2) if fx is much greater than fz choose meth2 = 1

c (3) if fz is much greater than fx choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c (if method = 10)

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

c in the y,z plane for each fixed x

c = 1 for simultaneous line relaxation in the y direction

c of the y,z plane for each fixed x

c = 2 for simultaneous line relaxation in the z direction

c of the y,z plane for each fixed x

c = 3 for simultaneous line relaxation in the y and z direction

c of the y,z plane for each fixed x

c

c (1) if fy,fz are roughly the same and vary little choose meth2 = 0

c (2) if fy is much greater than fz choose meth2 = 1

c (3) if fz is much greater than fy choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c

c *** note that planar relaxation implements full two-dimensional multigrid

c cycling for each plane visited during three dimensional multigrid

c cycling. Consequently it can be computationally expensive.

c

c ... length = iparm(21)

c

c the length of the work space provided in vector wk.

c

c let isx = 3 if method = 1,4,5 or 7 and nxa.ne.0

c let isx = 5 if method = 1,4,5 or 7 and nxa.eq.0

c let isx = 0 if method has any other value

c

c let jsy = 3 if method = 2,4,6 or 7 and nyc.ne.0

c let jsy = 5 if method = 2,4,6 or 7 and nyc.eq.0

c let jsy = 0 if method has any other value

c

c let ksz = 3 if method = 3,5,6 or 7 and nze.ne.0

c let ksz = 5 if method = 3,5,6 or 7 and nze.eq.0

c let ksz = 0 if method has any other value

c

c

c

c let ls = (nx+2)*(ny+2)*(nz+2)*(10+isx+jsy+ksz)

c

c let mx = ixp+1; my = jyq+1; mz = kzr+1. the block gaussian

c elimination at the coarsest mx by my by mz grid level requires

c

c ld = mx*my*mz*(2*mx*my+1) (nze.ne.0)

c

c words of storage if z boundary conditions are not periodic or

c

c ld = mx*my*(mz*(4*mx*my+1)) (nze=0)

c

c words of storage if z boundary conditions are periodic.

c if ixp,jyq,kzr are not the same, this quantity is

c minimized if they are chosen so that kzr > max0(ixp,jyq).

c

c finally

c

c length = ls + ld

c will usually but not always suffice. the exact minimal length depends,

c in a complex way, on the grid size arguments and method chosen.

c *** it can be predetermined for the current input arguments by calling

c cuh3 with length=iparm(21)=0 and printing iparm(22) or (in f90)

c dynamically allocating the work space using the value in iparm(22)

c in a subsequent cuh3 call.
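c *** the estimate length = ls + ld above can be evaluated with a
c short program. the following is a minimal sketch, not part of
c mudpack. the grid sizes, method and boundary flags are assumed
c example values and the program name wklen is hypothetical.

```fortran
! sketch: estimate the cuh3 work space length = ls + ld
! all input values below are illustrative assumptions
      program wklen
      implicit none
      integer nx,ny,nz,ixp,jyq,kzr,method,nxa,nyc,nze
      integer isx,jsy,ksz,mx,my,mz,ls,ld,length
      nx = 33
      ny = 33
      nz = 33
      ixp = 2
      jyq = 2
      kzr = 2
      method = 3
      nxa = 1
      nyc = 1
      nze = 1
! extra storage per grid point for line relaxation in x,y,z
      isx = 0
      if (any(method.eq.(/1,4,5,7/))) then
         isx = 3
         if (nxa.eq.0) isx = 5
      end if
      jsy = 0
      if (any(method.eq.(/2,4,6,7/))) then
         jsy = 3
         if (nyc.eq.0) jsy = 5
      end if
      ksz = 0
      if (any(method.eq.(/3,5,6,7/))) then
         ksz = 3
         if (nze.eq.0) ksz = 5
      end if
      ls = (nx+2)*(ny+2)*(nz+2)*(10+isx+jsy+ksz)
! storage for the direct solve on the coarsest grid
      mx = ixp+1
      my = jyq+1
      mz = kzr+1
      if (nze.ne.0) then
         ld = mx*my*mz*(2*mx*my+1)
      else
         ld = mx*my*mz*(4*mx*my+1)
      end if
      length = ls+ld
      print *, 'ls =', ls, ' ld =', ld, ' length =', length
      end
```

c for these assumed values ls = 557375 and ld = 513. this is only
c the estimate described above. the exact requirement is the value
c returned in iparm(22).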

c

c ... fparm

c

c a floating point vector of length 8 used to efficiently

c pass floating point arguments. fparm is set internally

c in cuh3 and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... ze=fparm(5), zf=fparm(6)

c

c the range of the z independent variable. ze must

c be less than zf.

c

c

c ... tolmax = fparm(7)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j,k)

c and phi2(i,j,k) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(7)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).
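c *** the convergence test above can be illustrated as follows. this
c is a minimal sketch, not the code used inside cuh3. the arrays
c phi1,phi2 hold two made-up successive iterates and the names
c reldif, relerr and conv are hypothetical.

```fortran
! sketch: maximum relative difference test phdif/phmax < tolmax
      program reldif
      implicit none
      integer nx,ny,nz,i,j,k
      parameter (nx=3,ny=3,nz=3)
      complex phi1(nx,ny,nz),phi2(nx,ny,nz)
      real phdif,phmax,relerr,tolmax
      logical conv
      tolmax = 1.0e-3
! fill two hypothetical successive iterates
      do k=1,nz
         do j=1,ny
            do i=1,nx
               phi1(i,j,k) = cmplx(real(i+j+k),1.0)
               phi2(i,j,k) = phi1(i,j,k)*(1.0+1.0e-4)
            end do
         end do
      end do
      phdif = maxval(abs(phi2-phi1))
      phmax = maxval(abs(phi2))
! guard the degenerate case phmax = 0.0 (see fparm(8) on output)
      if (phmax.gt.0.0) then
         relerr = phdif/phmax
      else
         relerr = phdif
      end if
      conv = relerr.lt.tolmax
      print *, 'relative difference =', relerr
      print *, 'converged =', conv
      end
```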

c

c

c ... wk

c

c a one dimensional array that must be provided for work space.

c see length = iparm(21). the values in wk must be preserved

c if cuh3 is called again with intl=iparm(1).ne.0 or if cuh34

c is called to improve accuracy.

c

c

c ... iwk

c

c an integer vector dimensioned with length at least

c

c (ixp+1)*(jyq+1)*(kzr+1)

c

c in the routine calling cuh3. the length of iwk is not

c checked! if iwk is too short then

c undetectable errors will result.

c

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,yorz,alfa,gbdy)

c which is used to input mixed boundary conditions to cuh3.

c the boundaries are numbered one thru six and the forms of the

c conditions are described below.

c

c

c (1) the kbdy=1 boundary

c

c this is the (y,z) plane x=xa where nxa=iparm(2) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa(y,z)*p(xa,y,z) = gbdxa(y,z)

c

c in this case kbdy=1,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxa(y,z),gbdxa(y,z) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the (y,z) plane x=xb where nxb=iparm(3) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb(y,z)*p(xb,y,z) = gbdxb(y,z)

c

c in this case kbdy=2,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxb(y,z),gbdxb(y,z) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the (x,z) plane y=yc where nyc=iparm(4) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc(x,z)*p(x,yc,z) = gbdyc(x,z)

c

c in this case kbdy=3,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyc(x,z),gbdyc(x,z) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the (x,z) plane y=yd where nyd=iparm(5) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd(x,z)*p(x,yd,z) = gbdyd(x,z)

c

c in this case kbdy=4,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyd(x,z),gbdyd(x,z) must be returned.

c

c

c (5) the kbdy=5 boundary

c

c this is the (x,y) plane z=ze where nze=iparm(6) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfze(x,y)*p(x,y,ze) = gbdze(x,y)

c

c in this case kbdy=5,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfze(x,y),gbdze(x,y) must be returned.

c

c

c (6) the kbdy=6 boundary

c

c this is the (x,y) plane z=zf where nzf=iparm(7) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfzf(x,y)*p(x,y,zf) = gbdzf(x,y)

c

c in this case kbdy=6,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfzf(x,y),gbdzf(x,y) must be returned.

c

c

c *** alfxa,alfyc,alfze nonpositive and alfxb,alfyd,alfzf nonnegative

c will help maintain matrix diagonal dominance during discretization

c aiding convergence.

c

c *** bndyc must provide the mixed boundary condition

c values in correspondence with those flagged in iparm(2)

c thru iparm(7). if all boundaries are specified then

c cuh3 will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared

c external in the routine calling cuh3. the actual

c name chosen may be different.
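c *** a boundary condition routine of the form described above might
c look like the following. this is a minimal sketch under stated
c assumptions: alfa and gbdy are taken to be complex (as in the
c complex solver), only the x=xb and z=ze faces are assumed to be
c flagged with nxb=2 and nze=2, and the coefficient values are
c made up. the names bndc and tbndc are hypothetical.

```fortran
! hypothetical mixed boundary condition routine for cuh3
      subroutine bndc(kbdy,xory,yorz,alfa,gbdy)
      implicit none
      integer kbdy
      real xory,yorz
      complex alfa,gbdy
      if (kbdy.eq.2) then
! x=xb face: xory=y, yorz=z, alfxb chosen nonnegative
         alfa = (1.0,0.0)
         gbdy = cmplx(xory+yorz,0.0)
      else if (kbdy.eq.5) then
! z=ze face: xory=x, yorz=y, alfze chosen nonpositive
         alfa = (-1.0,0.0)
         gbdy = cmplx(xory*yorz,0.0)
      end if
      return
      end

! small driver that evaluates the routine at one boundary point
      program tbndc
      implicit none
      external bndc
      complex alfa,gbdy
      call bndc(2,0.5,0.25,alfa,gbdy)
      print *, 'kbdy=2: alfa =', alfa, ' gbdy =', gbdy
      call bndc(5,0.5,0.25,alfa,gbdy)
      print *, 'kbdy=5: alfa =', alfa, ' gbdy =', gbdy
      end
```

c in an actual driver the routine would be declared external and
c passed to cuh3 in place of bndyc.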

c

c

c ... coef

c

c a subroutine with arguments (x,y,z,cxx,cyy,czz,cx,cy,cz,ce)

c which provides the known complex coefficients for the elliptic pde

c at any grid point (x,y,z). the name chosen may be different but

c the coefficient routine must be declared external in the

c calling routine.

c

c ... rhs

c

c a complex array dimensioned nx by ny by nz which contains

c the given right hand side values on the uniform 3-d mesh.

c rhs(i,j,k) = r(xi,yj,zk) for i=1,...,nx and j=1,...,ny

c and k=1,...,nz.

c

c ... phi

c

c a complex array dimensioned nx by ny by nz . on input phi must

c contain specified boundary values and an initial guess

c to the solution if flagged (see iguess=iparm(17)=1). for

c example, if nyd=iparm(5)=1 then phi(i,ny,k) must be set

c equal to p(xi,yd,zk) for i=1,...,nx and k=1,...,nz prior to

c calling cuh3. the specified values are preserved by cuh3.

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at non-Dirichlet grid points (this is not

c checked). these values are projected down and serve as an initial

c guess to the pde at the coarsest grid level. set phi to 0.0 at

c non-Dirichlet grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). the "d" at level 1

c indicates a direct method is used

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4
c                              * .               .
c                             *   .             .
c ---------------2-----------1-----2-----------1--------- level 3
c               * .         .       .         .
c              *   .       .         .       .
c ------2-----1-----2-----1-----------2-----1------------ level 2
c      * .   .       .   .             .   .
c     *   . .         . .               . .
c ---d-----d-----------d-----------------d--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4
c                        * .                         .
c ----------2-----------1---2-----------3-----------1---- level 3
c          * .         .     .         . .         .
c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2
c    * . .     . . . .         . . . .     . . . .
c --d---d-------d---d-----------d---d-------d---d-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c *** this algorithm is modified with the hybrid solvers which use

c a direct method whenever grid level 1 is encountered.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
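c *** the pseudo-algorithm above can be made concrete on a small
c model problem. the following is a minimal recursive sketch, not
c the non-recursive mudpack implementation. it assumes the 1-d
c problem -u'' = f on [0,1] with u(0)=u(1)=0, gauss-seidel point
c relaxation, full weighting restriction and linear prolongation
c (so iresw and intpol are fixed rather than passed). level k has
c n = 2**k+1 points. the names relax and tcgc are hypothetical.

```fortran
      recursive subroutine cgc(k,n,u,r,kcycle,iprer,ipost)
      implicit none
      integer k,n,kcycle,iprer,ipost,i,nc,kount
      real u(n),r(n),res(n),h2
      real uc((n+1)/2),rc((n+1)/2)
      h2 = (1.0/real(n-1))**2
! pre-relax at level k
      do i=1,iprer
         call relax(n,u,r,h2)
      end do
      if (k.gt.1) then
! restrict the residual from level k to level k-1
         nc = (n+1)/2
         res = 0.0
         do i=2,n-1
            res(i) = r(i)-(2.0*u(i)-u(i-1)-u(i+1))/h2
         end do
         rc = 0.0
         do i=2,nc-1
            rc(i) = 0.25*(res(2*i-2)+2.0*res(2*i-1)+res(2*i))
         end do
! solve for the correction with kcycle recursive calls
         uc = 0.0
         do kount=1,kcycle
            call cgc(k-1,nc,uc,rc,kcycle,iprer,ipost)
         end do
! prolong the correction and add it to u(k)
         do i=1,nc-1
            u(2*i-1) = u(2*i-1)+uc(i)
            u(2*i) = u(2*i)+0.5*(uc(i)+uc(i+1))
         end do
      end if
! post relax at level k
      do i=1,ipost
         call relax(n,u,r,h2)
      end do
      end

! one gauss-seidel sweep for (2u(i)-u(i-1)-u(i+1))/h2 = r(i)
      subroutine relax(n,u,r,h2)
      implicit none
      integer n,i
      real u(n),r(n),h2
      do i=2,n-1
         u(i) = 0.5*(u(i-1)+u(i+1)+h2*r(i))
      end do
      end

! driver: ten v(2,1) cycles for the exact solution sin(pi*x)
      program tcgc
      implicit none
      integer k,n,i,icyc
      parameter (k=6,n=2**k+1)
      real u(n),r(n),x,pi,err
      pi = 4.0*atan(1.0)
      do i=1,n
         x = real(i-1)/real(n-1)
         r(i) = pi*pi*sin(pi*x)
         u(i) = 0.0
      end do
      do icyc=1,10
         call cgc(k,n,u,r,1,2,1)
      end do
      err = 0.0
      do i=1,n
         x = real(i-1)/real(n-1)
         err = max(err,abs(u(i)-sin(pi*x)))
      end do
      print *, 'max error after 10 v(2,1) cycles =', err
      end
```

c calling cgc with kcycle=2 instead of 1 in the driver gives
c w(2,1) cycles.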

c

c ***********************************************************************

c ****output arguments**************************************************

c ***********************************************************************

c

c

c ... iparm(22)

c

c on output iparm(22) contains the actual work space length

c required for the current grid sizes and method. this value

c will be computed and returned even if iparm(21) is less than

c iparm(22) (see ierror=9).

c

c

c ... iparm(23)

c

c if error control is selected (tolmax = fparm(7) .gt. 0.0) then

c on output iparm(23) contains the actual number of cycles executed

c between the coarsest and finest grid levels in obtaining the

c approximation in phi. the quantity (iprer+ipost)*iparm(23) is

c the number of relaxation sweeps performed at the finest grid level.

c

c

c ... fparm(8)

c

c on output fparm(8) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(8) is computed only if there is error control (tolmax.gt.0.)

c assume phi1(i,j,k) and phi2(i,j,k) are the last two computed

c values for phi(i,j,k) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then

c

c fparm(8) = phdif/phmax

c

c is returned whenever phmax.gt.0.0. in the degenerate case

c phmax = 0.0, fparm(8) = phdif is returned.

c

c

c

c ... wk

c

c on output wk contains intermediate values that must not be

c destroyed if cuh3 is to be called again with iparm(1)=1 or

c if cuh34 is to be called to improve the estimate to fourth

c order.

c

c ... phi

c

c on output phi(i,j,k) contains the approximation to

c p(xi,yj,zk) for all mesh points i=1,...,nx; j=1,...,ny;

c k=1,...,nz. the last computed iterate in phi is returned

c even if convergence is not obtained (ierror=-1)

c

c ... ierror

c

c for intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj))

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(7)>0.

c is not obtained in maxcy=iparm(18) multigrid cycles between the

c coarsest (ixp+1,jyq+1,kzr+1) and finest (nx,ny,nz) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd,nze,nzf

c in iparm(2) through iparm(7) is not 0,1 or 2 or if

c (nxa,nxb) or (nyc,nyd) or (nze,nzf) are not pairwise zero.

c

c = 3 if min0(ixp,jyq,kzr) < 2 (ixp=iparm(8),jyq=iparm(9),kzr=iparm(10))

c or if ixp<3 when nxa=0 or jyq<3 when nyc=0 or kzr<3 when nze=0.

c

c = 4 if min0(iex,jey,kez) < 1 (iex=iparm(11),jey=iparm(12),kez=iparm(13))

c or if max0(iex,jey,kez) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or if ny.ne.jyq*2**(jey-1)+1 or

c if nz.ne.kzr*2**(kez-1)+1 (nx=iparm(14),ny=iparm(15),nz=iparm(16))

c

c = 6 if iguess = iparm(17) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(18) < 1 (large values for maxcy should not be used)

c

c = 8 if method = iparm(19) is less than 0 or greater than 10

c

c = 9 if length = iparm(21) is too small (see iparm(22) on output

c for minimum required work space length)

c

c =10 if xa > xb or yc > yd or ze > zf

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4),ze=fparm(5),zf=fparm(6))

c

c =11 if tolmax = fparm(7) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c

c =13 if iex=jey=kez=1 (full direct method) and iguess=1 or maxcy > 1

c

c =14 if the elliptic pde is singular (see ierror=-3 in cud3.d)

c

c

c *********************************************************

c *********************************************************

c

c end of cuh3 documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file cuh34.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file cuh34.d

c

c contains documentation for:

c subroutine cuh34(wk,iwk,phi,ierror)

c A sample fortran driver is file "tcuh34.f".

c

c ... required MUDPACK files

c

c cuh3.f, cudcom.f, cud3ln.f, cud3pn.f

c

c ... purpose

c

c cuh34 attempts to improve the estimate in phi, obtained by calling

c cuh3, from second to fourth order accuracy. see the file "cuh3.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "wk,iwk,phi" which are also part of the argument list for

c cuh3.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier cuh3 call

c

c * arguments "wk,iwk,phi" are the same used in calling cuh3

c

c * "wk,iwk,phi" have not changed since the last call to cuh3

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result

c in an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comput., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comput., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error parameter

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny,nz) < 6 where nx,ny,nz are the fine grid sizes

c in the x,y,z directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of cuh34 documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file mud2.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud2.d

c

c contains documentation for:

c subroutine mud2(iparm,fparm,work,coef,bndyc,rhs,phi,mgopt,ierror)

c A sample fortran driver is file "tmud2.f".

c

c ... required MUDPACK files

c

c mudcom.f

c

c ... purpose

c

c subroutine mud2 automatically discretizes and attempts to compute

c the second-order difference approximation to the two-dimensional

c linear nonseparable elliptic partial differential equation on a

c rectangle. the approximation is generated on a uniform grid covering

c the rectangle (see mesh description below). boundary conditions

c may be specified (dirichlet), periodic, or mixed derivative in any

c combination. the form of the pde solved is:

c

c

c cxx(x,y)*pxx + cyy(x,y)*pyy + cx(x,y)*px + cy(x,y)*py +

c

c ce(x,y)*p(x,y) = r(x,y).

c

c

c pxx,pyy,px,py are second and first partial derivatives of the

c unknown real solution function p(x,y) with respect to the

c independent variables x,y. cxx,cyy,cx,cy,ce are the known

c real coefficients of the elliptic pde and r(x,y) is the known

c real right hand side of the equation. cxx and cyy should be

c positive for all x,y in the solution region. Nonseparability

c means some of the coefficients depend on both x and y. If

c the PDE is separable subroutine mud2sp should be used instead

c of mud2 or muh2.

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points
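c *** the mesh definition above can be checked with a few lines of
c fortran. this is a minimal sketch. the region [0,1] x [0,0.5] and
c the sizes nx=5, ny=3 are assumed example values and the program
c name mesh is hypothetical.

```fortran
! sketch: uniform mesh points xi,yj on [xa,xb] x [yc,yd]
      program mesh
      implicit none
      integer nx,ny,i,j
      parameter (nx=5,ny=3)
      real xa,xb,yc,yd,dlx,dly,x(nx),y(ny)
      xa = 0.0
      xb = 1.0
      yc = 0.0
      yd = 0.5
! uniform grid increments
      dlx = (xb-xa)/real(nx-1)
      dly = (yd-yc)/real(ny-1)
      do i=1,nx
         x(i) = xa+real(i-1)*dlx
      end do
      do j=1,ny
         y(j) = yc+real(j-1)*dly
      end do
      print *, 'dlx =', dlx, ' dly =', dly
      print *, 'x =', x
      print *, 'y =', y
      end
```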

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) is provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comput., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comput., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formulas.

c

c *** An approximation is NOT generated after an intl=0 call!

c mud2 should be called with intl=1 to approximate the elliptic

c PDE discretized by the intl=0 call. intl=1 should also

c be input if mud2 has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. This will bypass

c redundant pde discretization and argument checking

c and save computational time. Some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) mud2 is being recalled for additional accuracy. In

c this case iguess=iparm(12)=1 should also be used.

c

c (2) mud2 is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) mud2 is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to mud2

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c If any of (a) through (e) are true then the elliptic PDE

c must be discretized or rediscretized. If none of (a)

c through (e) holds, calls can be made with intl=1.

c Incorrect calls with intl=1 will produce erroneous results.

c *** The values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be either 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be either 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.
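
c

c *** sketch (illustration only, not part of mudpack)

c

c the grid size formula and the example above can be checked with

c the short routines below. ngrid and gfact are hypothetical helper

c names introduced here. gfact assumes n is at least 3 and does not

c check the limits on ixp and iex.

```fortran
! evaluate the grid size formula n = ip*2**(ie-1) + 1
      integer function ngrid(ip,ie)
      implicit none
      integer ip,ie
      ngrid = ip*2**(ie-1) + 1
      return
      end

! given n grid points, move powers of 2 out of ip = n-1 and
! into the exponent ie until ip is 2 or odd
      subroutine gfact(n,ip,ie)
      implicit none
      integer n,ip,ie
      ip = n-1
      ie = 1
      do while (mod(ip,2).eq.0 .and. ip.gt.2)
        ip = ip/2
        ie = ie+1
      end do
      return
      end
```

c for n=33 gfact returns ip=2, ie=5 and for n=97 it returns ip=3,

c ie=6, which is the "better choice" quoted in the example.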

c

c *** grid size flexibility considerations:

c

c the hybrid multigrid/direct method code muh2 provides more grid size

c flexibility than mud2 by removing the constraint that ixp and jyq are

c 2 or 3. This is accomplished by using a direct method whenever the

c coarsest (ixp+1) X (jyq+1) grid is encountered in multigrid cycling.

c If nx = ixp+1 and ny = jyq+1 then muh2 becomes a full direct method.

c muh2 is roughly equivalent to mud2 in efficiency as long as ixp and

c jyq remain "small" (see muh2.d). If the problem to be approximated

c requires a grid neither mud2 nor muh2 can exactly fit then another option

c is to generate an approximation on a "close grid" using mud2 or muh2.

c Then transfer the result to the required grid using cubic interpolation

c via the package "regridpack" (contact John Adams about this software).

c

c *** note

c

c let G be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1
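
c

c *** sketch (illustration only, not part of mudpack)

c

c the chain of grid sizes defined by the two formulas above can be

c tabulated as follows. gchain is a hypothetical helper name and the

c caller is assumed to dimension mx and my to at least max0(iex,jey).

```fortran
! sizes mx(k) by my(k) of the grids g(1) < ... < g(n)
      subroutine gchain(ixp,jyq,iex,jey,mx,my,n)
      implicit none
      integer ixp,jyq,iex,jey,n,k
      integer mx(*),my(*)
      n = max0(iex,jey)
      do k=1,n
        mx(k) = ixp*2**(max0(iex+k-n,1)-1) + 1
        my(k) = jyq*2**(max0(jey+k-n,1)-1) + 1
      end do
      return
      end
```

c with ixp=2, jyq=3, iex=5, jey=6 this gives n=6, a coarsest grid

c of 3 by 4 points and a finest grid of 33 by 97 points.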

c

c

c

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps are

c performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few cycles are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y direction

c

c

c *** choice of method. . .

c

c let fx represent the quantity cxx(x,y)/dlx**2 over the solution region.

c

c let fy represent the quantity cyy(x,y)/dly**2 over the solution region

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2

c

c if neither fx nor fy dominates over the solution region and they

c both vary considerably choose method = 3.

c

c

c ... length = iparm(15)

c

c the length of the work space provided in vector work (see below).

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if (method = 1 or method = 3) and nxa.ne.0

c let isx = 5 if (method = 1 or method = 3) and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if (method = 2 or method = 3) and nyc.ne.0

c let jsy = 5 if (method = 2 or method = 3) and nyc.eq.0

c then . . .

c

c length = 4*[nx*ny*(10+isx+jsy)+8*(nx+ny+2)]/3

c

c will suffice in most cases. the exact minimal work space

c length required for the current nx,ny and method is output

c in iparm(16) (even if iparm(15) is too small). this will be

c less than the value given by the simplified formula above

c in most cases.
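
c

c *** sketch (illustration only, not part of mudpack)

c

c the simplified work space formula can be coded as below. lwork is

c a hypothetical helper name. the sketch reads "method = 1 or

c method = 3 and nxa.ne.0" as (method = 1 or 3) and nxa.ne.0, which

c is an assumption about the intended grouping.

```fortran
! simplified estimate of the work space length for mud2
      integer function lwork(nx,ny,method,nxa,nyc)
      implicit none
      integer nx,ny,method,nxa,nyc,isx,jsy
      isx = 0
      if (method.eq.1 .or. method.eq.3) then
        isx = 3
        if (nxa.eq.0) isx = 5
      end if
      jsy = 0
      if (method.eq.2 .or. method.eq.3) then
        jsy = 3
        if (nyc.eq.0) jsy = 5
      end if
! integer arithmetic: the division by 3 truncates
      lwork = 4*(nx*ny*(10+isx+jsy)+8*(nx+ny+2))/3
      return
      end
```

c for a 33 by 97 grid with method=0 the estimate is 44088 words.

c the exact requirement is still the value returned in iparm(16).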

c

c

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in mud2 and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).

c

c ... work

c

c a one dimensional real saved work space (see iparm(15) for

c length) which must be preserved from the previous call when

c calling with intl=iparm(1)=1.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,gbdy) which

c are used to input mixed boundary conditions to mud2. bndyc

c must be declared "external" in the program calling mud2.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tmud2.f" for an example of how bndyc

c can be set up).

c

c     * * * * * * * * * * * *  y=yd

c     *       kbdy=4        *

c     *                     *

c     *                     *

c     *                     *

c     * kbdy=1       kbdy=2 *

c     *                     *

c     *                     *

c     *                     *

c     *       kbdy=3        *

c     * * * * * * * * * * * *  y=yc

c

c     x=xa               x=xb

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa(y)*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,gbdy corresponding to alfxa(y),gbdxa(y) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb(y)*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y, will be input to bndyc and

c alfa,gbdy corresponding to alfxb(y),gbdxb(y) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc(x)*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyc(x),gbdyc(x) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd(x)*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyd(x),gbdyd(x) must be returned.

c

c

c *** bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c mud2 will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling mud2. the actual name chosen may

c be different.
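
c

c *** sketch (illustration only, not part of mudpack)

c

c a boundary routine with the argument list described above might

c look as follows. the name bndex and the two boundary conditions

c are invented for illustration: dp/dx - p = 0 at x=xb and

c dp/dy + 2*p = x at y=yd, with the other two edges assumed to be

c specified or periodic so that they are never requested.

```fortran
! hypothetical user routine passed to mud2 in place of bndyc
      subroutine bndex(kbdy,xory,alfa,gbdy)
      implicit none
      integer kbdy
      real xory,alfa,gbdy
      if (kbdy.eq.2) then
! edge x=xb, xory holds y: dp/dx - p = 0
        alfa = -1.0
        gbdy = 0.0
      else if (kbdy.eq.4) then
! edge y=yd, xory holds x: dp/dy + 2*p = x
        alfa = 2.0
        gbdy = xory
      end if
      return
      end
```

c such a routine would be declared "external" in the calling program

c and used with nxb=iparm(3)=2 and nyd=iparm(5)=2.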

c

c

c ... coef

c

c a subroutine with arguments (x,y,cxx,cyy,cx,cy,ce) which

c provides the known real coefficients for the elliptic pde at

c any grid point (x,y). the name chosen in the calling routine

c may be different, but the coefficient routine must be declared

c "external."
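
c

c *** sketch (illustration only, not part of mudpack)

c

c a coefficient routine with the argument list above might look as

c follows. the name cofex and the coefficient values are invented.

c the sketch assumes the pde has the form

c cxx*pxx + cyy*pyy + cx*px + cy*py + ce*p = r, which should be

c checked against the pde description in mud2.d.

```fortran
! hypothetical user routine passed to mud2 in place of coef
      subroutine cofex(x,y,cxx,cyy,cx,cy,ce)
      implicit none
      real x,y,cxx,cyy,cx,cy,ce
! cxx and cyy are both positive so cxx*cyy > 0 everywhere
      cxx = 1.0 + y*y
      cyy = 1.0 + x*x
! no first order terms
      cx = 0.0
      cy = 0.0
      ce = -1.0
      return
      end
```

c these values keep cxx*cyy positive (see ierror = -2) and have no

c first order terms (see ierror = -4).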

c

c ... rhs

c

c an array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c an array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...nx

c prior to calling mud2. these values are preserved by mud2.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all interior and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4

c                              * .               .

c                             *   .             .

c ---------------2-----------1-----2-----------1--------- level 3

c               * .         .       .         .

c              *   .       .         .       .

c ------2-----1-----2-----1-----------2-----1------------ level 2

c      * .   .       .   .             .   .

c     *   . .         . .               . .

c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4

c                        * .                         .

c ----------2-----------1---2-----------3-----------1---- level 3

c          * .         .     .         . .         .

c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2

c    * . .     . . . .         . . . .     . . . .

c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
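
c

c *** sketch (illustration only, not part of mudpack)

c

c the recursion in algorithm cgc can be followed with the small

c counter below. it performs no relaxation or grid transfer. it only

c accumulates the number of relaxation sweeps requested at each level

c for one cycle started at level k. cgccnt is a hypothetical name,

c nsw must be zeroed by the caller, and kcycle is assumed to be at

c least 1. mudpack itself implements the cycling non-recursively.

```fortran
! count relaxation sweeps per level for one cgc cycle from level k
      recursive subroutine cgccnt(k,kcycle,iprer,ipost,nsw)
      implicit none
      integer k,kcycle,iprer,ipost,kount
      integer nsw(*)
! pre-relax at level k
      nsw(k) = nsw(k) + iprer
      if (k.gt.1) then
! invoke the cycle "kcycle" times at level k-1
        do kount=1,kcycle
          call cgccnt(k-1,kcycle,iprer,ipost,nsw)
        end do
      end if
! post relax at level k
      nsw(k) = nsw(k) + ipost
      return
      end
```

c for four levels with iprer=2 and ipost=1, a v cycle (kcycle=1)

c requests 3 sweeps at every level while a w cycle (kcycle=2)

c requests 3, 6, 12 and 24 sweeps at levels 4, 3, 2 and 1.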

c

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see the input arguments).

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0. only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0).

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.
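
c

c *** sketch (illustration only, not part of mudpack)

c

c the quantity returned in fparm(6) can be reproduced from two

c successive iterates as shown below. reldif is a hypothetical helper

c name. phi1 and phi2 play the roles given in the text above.

```fortran
! maximum relative difference between two iterates
      real function reldif(nx,ny,phi1,phi2)
      implicit none
      integer nx,ny,i,j
      real phi1(nx,ny),phi2(nx,ny),phdif,phmax
      phdif = 0.0
      phmax = 0.0
      do j=1,ny
        do i=1,nx
          phdif = amax1(phdif,abs(phi2(i,j)-phi1(i,j)))
          phmax = amax1(phmax,abs(phi2(i,j)))
        end do
      end do
! degenerate case phmax = 0.0: return phdif itself
      reldif = phdif
      if (phmax.gt.0.0) reldif = phdif/phmax
      return
      end
```

c convergence in the sense of tolmax=fparm(5) corresponds to

c reldif(nx,ny,phi1,phi2) .lt. tolmax.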

c

c

c ... work

c

c on output work contains intermediate values that must not

c be destroyed if mud2 is to be called again with intl=1

c

c

c ... phi *** for intl=1 calls only

c

c on output phi(i,j) contains the approximation to p(xi,yj)

c for all mesh points i = 1,...,nx and j=1,...,ny. the last

c computed iterate in phi is returned even if convergence is

c not obtained

c

c

c ... ierror

c

c For intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. Argument checking and discretization

c are bypassed for intl=1 calls, which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.
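
c

c *** sketch (illustration only, not part of mudpack)

c

c the test and the adjustment described above are shown below for

c the x direction at a single grid point (the y direction is

c analogous). adjcf and iflag are hypothetical names. this is not

c the code used inside mud2.

```fortran
! flag a dominant first order term and adjust cxx in place
      subroutine adjcf(cxx,cx,dx,iflag)
      implicit none
      real cxx,cx,dx
      integer iflag
! iflag = 1 if abs(cx)*dx > 2.*abs(cxx), else 0
      iflag = 0
      if (abs(cx)*dx .gt. 2.0*abs(cxx)) iflag = 1
! adjustment cxx = amax1(cxx,0.5*abs(cx)*dx)
      cxx = amax1(cxx,0.5*abs(cx)*dx)
      return
      end
```

c with cxx=1.0, cx=10.0 and dx=0.5 the test is triggered and cxx is

c raised to 2.5. halving dx twice (dx=0.125) removes the condition,

c which is the "increase resolution" remedy mentioned above.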

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj))

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(5)>0.

c is not obtained in maxcy=iparm(13) multigrid cycles between the

c coarsest (ixp+1,jyq+1) and finest (nx,ny) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd

c in iparm(2),iparm(3),iparm(4),iparm(5) are not 0,1 or 2

c or if nxa,nxb or nyc,nyd are not pairwise zero.

c

c = 3 if min0(ixp,jyq) < 2 (ixp = iparm(6), jyq = iparm(7))

c

c = 4 if min0(iex,jey) < 1 (iex = iparm(8), jey = iparm(9)) or

c if max0(iex,jey) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or ny.ne.jyq*2**(jey-1)+1

c (nx = iparm(10), ny = iparm(11))

c

c = 6 if iguess = iparm(12) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(13) < 1

c

c = 8 if method = iparm(14) is not 0,1,2, or 3

c

c = 9 if length = iparm(15) is too small (see iparm(16) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4))

c

c =11 if tolmax = fparm(5) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c

c *********************************************************

c *********************************************************

c

c end of mud2 documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file mud24.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud24.d

c

c contains documentation for subroutine mud24(work,phi,ierror)

c A sample fortran driver is file "tmud24.f".

c

c ... required MUDPACK files

c

c mud2.f, mudcom.f

c

c ... purpose

c

c mud24 attempts to improve the estimate in phi, obtained by calling

c mud2, from second to fourth order accuracy. see the file "mud2.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "work,phi" which are also part of the argument list for

c mud2.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier mud2 call

c

c * arguments "work,phi" are the same used in calling mud2

c

c * "work,phi" have not changed since the last call to mud2

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput., vol. 34, Nov 1989, pp. 113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comput., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comput., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of mud24 documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file mud24cr.d

c

c

c ... file mud24cr.d

c

c contains documentation for:

c subroutine mud24cr(work,coef,bndyc,phi,ierror)

c A sample fortran driver is file "tmud24cr.f".

c

c ... required MUDPACK files

c

c mud2cr.f, mudcom.f

c

c ... purpose

c

c mud24cr attempts to improve the estimate in phi, obtained by calling

c mud2cr, from second to fourth order accuracy. see the file "mud2cr.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "work,coef,bndyc,phi" which are also part of the argument

c list for mud2cr.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier mud2cr call

c

c * arguments "work,coef,bndyc,phi" are the same used in calling mud2cr

c

c * "work,coef,bndyc,phi" have not changed since the last call to mud2cr

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comput., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comput., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of mud24cr documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file mud24sp.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud24sp.d

c

c contains documentation for subroutine mud24sp(work,phi,ierror)

c A sample fortran driver is file "tmud24sp.f".

c

c ... required MUDPACK files

c

c mud2sp.f, mudcom.f

c

c ... purpose

c

c mud24sp attempts to improve the estimate in phi, obtained by calling

c mud2sp, from second to fourth order accuracy. see the file "mud2sp.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "work,phi" which are also part of the argument list for

c mud2sp.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier mud2sp call

c

c * arguments "work,phi" are the same used in calling mud2sp

c

c * "work,phi" have not changed since the last call to mud2sp

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comput., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comput., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny) < 6 where nx,ny are the fine grid sizes

c in the x,y directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of mud24sp documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file mud2cr.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud2cr.d

c

c contains documentation for:

c subroutine mud2cr(iparm,fparm,work,coef,bndyc,rhs,phi,mgopt,ierror)

c a sample fortran driver is file "tmud2cr.f".

c

c ... required mudpack files

c

c mudcom.f

c

c ... purpose

c

c subroutine mud2cr automatically discretizes and attempts to compute

c the second-order difference approximation to the two-dimensional

c linear nonseparable elliptic partial differential equation with cross

c derivative term on a rectangle. the approximation is generated on a

c uniform grid covering the rectangle (see mesh description below).

c boundary conditions may be specified (dirichlet), periodic, or mixed

c oblique derivative (see bndyc) in any combination. the form of the pde

c approximated is:

c

c

c cxx(x,y)*pxx + cxy(x,y)*pxy + cyy(x,y)*pyy + cx(x,y)*px +

c

c cy(x,y)*py + ce(x,y)*p(x,y) = r(x,y).

c

c

c pxx,pxy,pyy,px,py are second and first partial derivatives of the

c unknown real solution function p(x,y) with respect to the

c independent variables x,y. cxx,cxy,cyy,cx,cy,ce are the known

c real coefficients of the elliptic pde and r(x,y) is the known

c real right hand side of the equation. cxx and cyy should be

c positive for all x,y in the solution region and

c

c 4*cxx(x,y)*cyy(x,y) .gt. cxy(x,y)**2

c

c for ellipticity (see ierror=-2). nonseparability means some

c of the coefficients depend on both x and y and cxy.ne.0. if

c the pde is separable and cxy = 0 then subroutine mud2sp should

c be used.
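c
c *** illustration (not part of the original mudpack documentation)
c
c a minimal sketch of a pointwise ellipticity check that a user
c might run on the coefficients before calling mud2cr. it applies
c the standard discriminant test cxy**2 - 4*cxx*cyy < 0 together
c with cxx > 0 and cyy > 0. the names "chkell" and "ellip" and the
c sample coefficient values are assumptions, not mudpack code.
c
      program chkell
      implicit none
      logical ellip
c     elliptic sample: cxy**2 = 0.25 is less than 4*cxx*cyy = 4
      write(*,*) ellip(1.0,0.5,1.0)
c     non-elliptic sample: cxy**2 = 9 exceeds 4*cxx*cyy = 4
      write(*,*) ellip(1.0,3.0,1.0)
      end

      logical function ellip(cxx,cxy,cyy)
      implicit none
      real cxx,cxy,cyy
c     true only if cxx,cyy are positive and the discriminant is
c     negative
      ellip = cxx.gt.0.0 .and. cyy.gt.0.0 .and.
     +        cxy*cxy.lt.4.0*cxx*cyy
      return
      end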

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points
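c
c *** illustration (not part of the original mudpack documentation)
c
c a minimal sketch of the mesh definitions above. the program name
c "meshpt" and the values xa=0, xb=1, yc=0, yd=2, nx=5, ny=9 are
c assumptions chosen only for illustration.
c
      program meshpt
      implicit none
      integer nx,ny,i,j
      parameter (nx=5, ny=9)
      real xa,xb,yc,yd,dlx,dly,x(nx),y(ny)
      xa = 0.0
      xb = 1.0
      yc = 0.0
      yd = 2.0
c     uniform grid increments in the x,y directions
      dlx = (xb-xa)/float(nx-1)
      dly = (yd-yc)/float(ny-1)
c     mesh points xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly
      do i=1,nx
         x(i) = xa+float(i-1)*dlx
      end do
      do j=1,ny
         y(j) = yc+float(j-1)*dly
      end do
      write(*,*) 'dlx,dly = ',dlx,dly
      write(*,*) 'x(nx),y(ny) = ',x(nx),y(ny)
      end
c
c the last mesh points x(nx),y(ny) should coincide with xb,yd up to
c roundoff.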

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) is provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev., vol. 120, no. 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comput., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comput., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second-order finite difference formulas.

c

c *** an approximation is not generated after an intl=0 call!

c mud2cr should be called with intl=1 to approximate the elliptic

c pde discretized by the intl=0 call. intl=1 should also

c be input if mud2cr has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. this will bypass

c redundant pde discretization and argument checking

c and save computational time. some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) mud2cr is being recalled for additional accuracy. in

c this case iguess=iparm(12)=1 should also be used.

c

c (2) mud2cr is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) mud2cr is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to mud2cr

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c if any of (a) through (e) are true then the elliptic pde

c must be discretized or rediscretized. if none of (a)

c through (e) holds, calls can be made with intl=1.

c incorrect calls with intl=1 will produce erroneous results.

c *** the values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c if ixp > 2 then it should be a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c if jyq > 2 then it should be a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.
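c
c *** illustration (not part of the original mudpack documentation)
c
c a minimal sketch that evaluates the grid size formulas for the two
c parameter choices in the example above. both choices give the
c same 33 by 97 fine grid. the second choice has the smaller
c coarsest y grid (jyq+1 = 4 points instead of 7). the program name
c "gridsz" is an assumption.
c
      program gridsz
      implicit none
      integer ixp,jyq,iex,jey,nx,ny
c     first choice: ixp=2, jyq=6, iex=jey=5
      ixp = 2
      jyq = 6
      iex = 5
      jey = 5
      nx = ixp*(2**(iex-1))+1
      ny = jyq*(2**(jey-1))+1
      write(*,*) 'nx,ny,jyq+1 = ',nx,ny,jyq+1
c     better choice: ixp=2, jyq=3, iex=5, jey=6
      jyq = 3
      jey = 6
      nx = ixp*(2**(iex-1))+1
      ny = jyq*(2**(jey-1))+1
      write(*,*) 'nx,ny,jyq+1 = ',nx,ny,jyq+1
      end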

c

c *** grid size flexibility considerations:

c

c the hybrid multigrid/direct method code muh2cr provides more grid size

c flexibility than mud2cr by removing the constraint that ixp and jyq are

c 2 or 3. this is accomplished by using a direct method whenever the

c coarsest (ixp+1) x (jyq+1) grid is encountered in multigrid cycling.

c if nx = ixp+1 and ny = jyq+1 then muh2cr becomes a full direct method.

c muh2cr is roughly equivalent to mud2cr in efficiency as long as ixp and

c jyq remain "small" (see muh2cr.d). if the problem to be approximated

c requires a grid neither mud2cr nor muh2cr can exactly fit then another option

c is to generate an approximation on a "close grid" using mud2cr or muh2cr.

c then transfer the result to the required grid using cubic interpolation

c via the package "regridpack" (contact john adams about this software).

c

c *** note

c

c let g be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c g(1) < ... < g(k) < ... < g(n) = g.

c

c each g(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1
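c
c *** illustration (not part of the original mudpack documentation)
c
c a minimal sketch that lists the grid chain g(1),...,g(n) from the
c formulas above for the assumed choice ixp=2, jyq=3, iex=5, jey=6
c (the 33 by 97 example). the program name "chain" is an assumption.
c
      program chain
      implicit none
      integer ixp,jyq,iex,jey,n,k,mx,my
      ixp = 2
      jyq = 3
      iex = 5
      jey = 6
      n = max0(iex,jey)
      do k=1,n
c        grid sizes on level k. the max0(...,1) keeps the direction
c        with the smaller exponent at its coarsest size
         mx = ixp*(2**(max0(iex+k-n,1)-1))+1
         my = jyq*(2**(max0(jey+k-n,1)-1))+1
         write(*,*) 'k,mx(k),my(k) = ',k,mx,my
      end do
      end
c
c for these values the chain runs from the 3 by 4 coarsest grid to
c the 33 by 97 finest grid. since iex < jey, the x grid size stays
c at 3 on levels k=1 and k=2 while the y grid is refined.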

c

c

c

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps are

c performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y directions

c

c

c *** choice of method. . .

c

c let fx represent the quantity cxx(x,y)/dlx**2 over the solution region.

c

c let fy represent the quantity cyy(x,y)/dly**2 over the solution region.

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2.

c

c if neither fx nor fy dominates over the solution region and they

c both vary considerably choose method = 3.
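c
c *** illustration (not part of the original mudpack documentation)
c
c a minimal sketch of the selection rule above for the simplest
c case of constant cxx,cyy, so that fx,fy do not vary over the
c region and method = 3 is never chosen. the factor "big" = 10 used
c to decide "much greater" is an assumption, as are the program
c name "pickm" and the sample values.
c
      program pickm
      implicit none
      real cxx,cyy,dlx,dly,fx,fy,big
      integer method
c     assumed constant coefficients and grid increments
      cxx = 1.0
      cyy = 1.0
      dlx = 0.01
      dly = 0.1
c     assumed threshold for "much greater"
      big = 10.0
      fx = cxx/dlx**2
      fy = cyy/dly**2
      if (fx.gt.big*fy) then
c        x direction dominates: line relaxation in x
         method = 1
      else if (fy.gt.big*fx) then
c        y direction dominates: line relaxation in y
         method = 2
      else
c        comparable sizes: point relaxation
         method = 0
      end if
      write(*,*) 'fx,fy,method = ',fx,fy,method
      end
c
c for variable coefficients fx,fy would have to be sampled over the
c grid before applying a rule of this kind.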

c

c

c ... length = iparm(15)

c

c the length of the work space provided in vector work (see below).

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if method = 1 or method = 3 and nxa.ne.0

c let isx = 5 if method = 1 or method = 3 and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if method = 2 or method = 3 and nyc.ne.0

c let jsy = 5 if method = 2 or method = 3 and nyc.eq.0

c then . . .

c

c length = [7*(nx+2)*(ny+2)+4*(11+isx+jsy)*nx*ny]/3

c

c will suffice in most cases. the exact minimal work space

c length required for the current nx,ny and method is output

c in iparm(16) (even if iparm(15) is too small). this will be

c less than the value given by the simplified formula above

c in most cases.
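c
c *** illustration (not part of the original mudpack documentation)
c
c a minimal sketch that evaluates the simplified work space formula
c above. it assumes the isx,jsy rules are read as "(method = 1 or
c method = 3) and nxa.ne.0" and so on, and that the division by 3
c is integer division. the program name "wrklen" and the sample
c values nx=33, ny=97, method=3, nxa=1, nyc=0 are assumptions.
c
      program wrklen
      implicit none
      integer nx,ny,method,nxa,nyc,isx,jsy,length
      nx = 33
      ny = 97
      method = 3
      nxa = 1
      nyc = 0
c     isx is nonzero only when line relaxation in x is used
      isx = 0
      if (method.eq.1 .or. method.eq.3) then
         isx = 3
         if (nxa.eq.0) isx = 5
      end if
c     jsy is nonzero only when line relaxation in y is used
      jsy = 0
      if (method.eq.2 .or. method.eq.3) then
         jsy = 3
         if (nyc.eq.0) jsy = 5
      end if
      length = (7*(nx+2)*(ny+2)+4*(11+isx+jsy)*nx*ny)/3
      write(*,*) 'isx,jsy,length = ',isx,jsy,length
      end
c
c the value printed is only the simplified estimate. the exact
c minimal length is the one mud2cr returns in iparm(16).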

c

c

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in mud2cr and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible do not use error control!).
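c
c *** illustration (not part of the original mudpack documentation)
c
c a minimal sketch of the relative difference test defined above,
c applied to two small made-up arrays. it is not the code used
c inside mud2cr. the program name "relchg", the 3 by 3 arrays and
c the value tolmax = 1.0e-3 are assumptions.
c
      program relchg
      implicit none
      integer nx,ny,i,j
      parameter (nx=3, ny=3)
      real phi1(nx,ny),phi2(nx,ny),phdif,phmax,tolmax
      tolmax = 1.0e-3
c     two successive "approximations" differing by 1.0e-4
      do j=1,ny
         do i=1,nx
            phi1(i,j) = float(i+j)
            phi2(i,j) = phi1(i,j)+1.0e-4
         end do
      end do
c     maximum difference and maximum magnitude over all grid points
      phdif = 0.0
      phmax = 0.0
      do j=1,ny
         do i=1,nx
            phdif = amax1(phdif,abs(phi2(i,j)-phi1(i,j)))
            phmax = amax1(phmax,abs(phi2(i,j)))
         end do
      end do
      if (phdif/phmax .lt. tolmax) then
         write(*,*) 'converged'
      else
         write(*,*) 'not converged'
      end if
      end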

c

c ... work

c

c a one dimensional real saved work space (see iparm(15) for

c length) which must be preserved from the previous call when

c calling with intl=iparm(1)=1.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,beta,gama,gbdy) which

c are used to input mixed boundary conditions to mud2cr. bndyc

c must be declared "external" in the program calling mud2cr.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tmud2cr.f" for an example of how bndyc

c can be set up).

c

c * * * * * * * * * * * * y=yd

c * kbdy=4 *

c * *

c * *

c * *

c * kbdy=1 kbdy=2 *

c * *

c * *

c * *

c * kbdy=3 *

c * * * * * * * * * * * * y=yc

c

c x=xa x=xb

c

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2) = 2 flags

c a mixed boundary condition of the form

c

c alfxa(y)*px + betxa(y)*py + gamxa(y)*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfxa(y),betxa(y),gamxa(y),

c gbdxa(y) must be returned. alfxa(y) = 0.0 is not allowed for any y.

c (see ierror = 13)

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3) = 2 flags

c a mixed boundary condition of the form

c

c alfxb(y)*px + betxb(y)*py + gamxb(y)*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfxb(y),betxb(y),gamxb(y),

c gbdxb(y) must be returned. alfxb(y) = 0.0 is not allowed for any y.

c (see ierror = 13)

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4) = 2 flags

c a mixed boundary condition of the form

c

c alfyc(x)*px + betyc(x)*py + gamyc(x)*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfyc(x),betyc(x),gamyc(x),

c gbdyc(x) must be returned. betyc(x) = 0.0 is not allowed for any x.

c (see ierror = 13)

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5) = 2 flags

c a mixed boundary condition of the form

c

c alfyd(x)*px + betyd(x)*py + gamyd(x)*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,beta,gama,gbdy corresponding to alfyd(x),betyd(x),gamyd(x),

c gbdyd(x) must be returned. betyd(x) = 0.0 is not allowed for any x.

c (see ierror = 13)

c

c

c *** bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c mud2cr will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling mud2cr. the actual name chosen may

c be different.

c

c

c ... coef

c

c a subroutine with arguments (x,y,cxx,cxy,cyy,cx,cy,ce) which

c provides the known real coefficients for the elliptic pde at

c any grid point (x,y). the name chosen in the calling routine

c may be different, but the coefficient routine must be declared

c "external" there.

c

c ... rhs

c

c an array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c an array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...,nx

c prior to calling mud2cr. these values are preserved by mud2cr.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all interior and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------   level 4
c                              * .               .
c                             *   .             .
c ---------------2-----------1-----2-----------1---------   level 3
c               * .         .       .         .
c              *   .       .         .       .
c ------2-----1-----2-----1-----------2-----1------------   level 2
c      * .   .       .   .             .   .
c     *   . .         . .               . .
c ---3-----3-----------3-----------------3---------------   level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1--   level 4
c                        * .                         .
c ----------2-----------1---2-----------3-----------1----   level 3
c          * .         .     .         . .         .
c ----2---1---2---3---1-------2---3---1---2---3---1------   level 2
c    * . .     . . . .         . . . .     . . . .
c --6---6-------6---6-----------6---6-------6---6--------   level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
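c
c *** illustration (not part of mudpack). the short python sketch below
c mirrors the recursion in algorithm cgc but solves nothing: it only
c records which grid levels are visited and how many relaxation sweeps
c are done at each visit. the names cgc_visits and schedule are
c hypothetical, and kcycle >= 1 is assumed. back-to-back sweeps on the
c same level are merged, which appears to be how the counts in the
c schedules above are tallied.

```python
# Illustrative sketch only; no pde is solved and nothing here is MUDPACK code.

def cgc_visits(k, kcycle, iprer, ipost, log):
    """Append (level, sweeps) events for one call of algorithm cgc at level k."""
    log.append((k, iprer))              # pre-relax at level k
    if k > 1:
        for _ in range(kcycle):         # the recursion, done "kcycle" times
            cgc_visits(k - 1, kcycle, iprer, ipost, log)
    log.append((k, ipost))              # post-relax at level k


def schedule(nlev, kcycle, iprer=2, ipost=1):
    """Return the visit schedule for one cycle started at level nlev."""
    log = []
    cgc_visits(nlev, kcycle, iprer, ipost, log)
    merged = []
    for level, sweeps in log:
        if merged and merged[-1][0] == level:
            # consecutive sweeps on the same level count as one visit
            merged[-1] = (level, merged[-1][1] + sweeps)
        else:
            merged.append((level, sweeps))
    return merged


if __name__ == "__main__":
    print(schedule(4, kcycle=1))   # one v(2,1) cycle from level 4
    print(schedule(4, kcycle=2))   # one w(2,1) cycle from level 4
```

c the two printed lists can be compared with the final descent from
c level 4 in the v(2,1) and w(2,1) schedules shown earlier.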

c

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see as input argument)

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0. only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0).

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.
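c
c *** illustration (not part of mudpack). a minimal python sketch of
c the maximum relative difference defined above, including the
c degenerate case phmax = 0.0. the function names are hypothetical and
c the iterates are assumed to be stored as lists of rows.

```python
# Illustrative sketch only; phi1 and phi2 stand for the last two iterates.

def relative_difference(phi1, phi2):
    """Return phdif/phmax, or phdif alone when phmax is zero."""
    phdif = max(abs(b - a)
                for row1, row2 in zip(phi1, phi2)
                for a, b in zip(row1, row2))
    phmax = max(abs(b) for row in phi2 for b in row)
    return phdif / phmax if phmax > 0.0 else phdif


def converged(phi1, phi2, tolmax):
    """Convergence test phdif/phmax < tolmax used when tolmax > 0.0."""
    return relative_difference(phi1, phi2) < tolmax


if __name__ == "__main__":
    old = [[1.0, 2.0], [3.0, 4.0]]
    new = [[1.0, 2.0], [3.0, 4.5]]
    print(relative_difference(old, new), converged(old, new, 0.2))
```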

c

c

c ... work

c

c on output work contains intermediate values that must not

c be destroyed if mud2cr is to be called again with intl=1

c

c

c ... phi *** for intl=1 calls only

c

c on output phi(i,j) contains the approximation to p(xi,yj)

c for all mesh points i = 1,...,nx and j=1,...,ny. the last

c computed iterate in phi is returned even if convergence is

c not obtained

c

c

c ... ierror

c

c for intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj))

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(5)>0.

c is not obtained in maxcy=iparm(13) multigrid cycles between the

c coarsest (ixp+1,jyq+1) and finest (nx,ny) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd

c in iparm(2),iparm(3),iparm(4),iparm(5) are not 0,1 or 2

c or if nxa,nxb or nyc,nyd are not pairwise zero.

c

c = 3 if min0(ixp,jyq) < 2 (ixp = iparm(6), jyq = iparm(7))

c

c = 4 if min0(iex,jey) < 1 (iex = iparm(8), jey = iparm(9)) or

c if max0(iex,jey) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or ny.ne.jyq*2**(jey-1)+1

c (nx = iparm(10), ny = iparm(11))

c

c = 6 if iguess = iparm(12) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(13) < 1

c

c = 8 if method = iparm(14) is not 0,1,2, or 3

c

c = 9 if length = iparm(15) is too small (see iparm(16) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4))

c

c =11 if tolmax = fparm(5) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c

c =13 if there is a pure tangential derivative along a mixed derivative

c boundary (e.g., nyd = 2 and betyd(x) = 0.0 for some

c grid point x along y = yd)

c

c =14 if there is the "singular" condition described below at a

c corner which is the intersection of two derivative boundaries.

c

c (1) the corner (xa,yc) if nxa=nyc=2 and

c alfxa(yc)*betyc(xa)-alfyc(xa)*betxa(yc) = 0.0.

c

c (2) the corner (xa,yd) if nxa=nyd=2 and

c alfxa(yd)*betyd(xa)-alfyd(xa)*betxa(yd) = 0.0.

c

c (3) the corner (xb,yc) if nxb=nyc=2 and

c alfxb(yc)*betyc(xb)-alfyc(xb)*betxb(yc) = 0.0.

c

c (4) the corner (xb,yd) if nxb=nyd=2 and

c alfxb(yd)*betyd(xb)-alfyd(xb)*betxb(yd) = 0.0.

c

c *** the conditions described in ierror = 13 or 14 will lead to division

c by zero during discretization if undetected.
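c
c *** illustration (not part of mudpack). the ierror = -4 test and the
c coefficient adjustment described above can be sketched in python as
c follows. the function names are hypothetical. the coefficients are
c assumed to be already evaluated at one grid point, and the elliptic
c case cxx,cyy > 0.0 is assumed for the adjustment.

```python
# Illustrative sketch only; mirrors the inequalities quoted for ierror = -4.

def is_hyperbolic(cxx, cyy, cx, cy, dlx, dly):
    """True if first order terms dominate at a fine grid point."""
    return (abs(cx) * dlx > 2.0 * abs(cxx)
            or abs(cy) * dly > 2.0 * abs(cyy))


def adjusted_coefficients(cxx, cyy, cx, cy, dx, dy):
    """Second order coefficients after the adjustment on a grid with mesh dx,dy."""
    return (max(cxx, 0.5 * abs(cx) * dx),
            max(cyy, 0.5 * abs(cy) * dy))


if __name__ == "__main__":
    # strong convection in x on a mesh with dlx = dly = 0.125
    print(is_hyperbolic(1.0, 1.0, 100.0, 0.0, 0.125, 0.125))
    print(adjusted_coefficients(1.0, 1.0, 100.0, 0.0, 0.125, 0.125))
```

c halving the mesh size halves abs(cx)*dlx, which is why increasing
c resolution is suggested as a remedy.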

c

c

c *********************************************************

c *********************************************************

c

c end of mud2cr documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file mud2sa.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud2sa.d

c

c contains documentation for:

c subroutine mud2sa(iparm,fparm,work,sigx,sigy,bndyc,rhs,phi,mgopt,ierror)

c a sample fortran driver is file "tmud2sa.f".

c

c ... required mudpack files

c

c mudcom.f

c

c

c ... purpose

c

c subroutine mud2sa automatically discretizes and attempts to

c compute the second order conservative finite difference approximation

c to a two dimensional linear nonseparable "self adjoint" elliptic

c partial differential equation on a rectangle. the approximation

c is generated on a uniform grid covering the rectangle. boundary

c conditions may be specified (Dirichlet), periodic, or mixed.

c the form of the pde solved is:

c

c d(sigx(x,y)*dp/dx)/dx + d(sigy(x,y)*dp/dy)/dy -

c

c xlmbda(x,y)*p(x,y) = r(x,y)

c

c where sigx(x,y),sigy(x,y) (both positive), xlmbda(x,y) (non-negative),

c r(x,y) (the given right hand side) and p(x,y) (the unknown solution

c function) are all real valued functions of the real independent

c variables x,y. the use of the variable names "x,y" is arbitrary and

c does not imply the cartesian coordinate system underlies the pde.

c for example, any pde in divergence form in cartesian coordinates can

c be put in a self-adjoint form suitable for mud2sa after a curvilinear

c coordinate transform (see tmud2sa.f)

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points
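c
c *** illustration (not part of mudpack). a minimal python sketch of the
c mesh point formulas above. the function name mesh_points is
c hypothetical. python lists are 0-based, so list index i holds the
c point the text calls x(i+1).

```python
# Illustrative sketch only; reproduces xi = xa+(i-1)*dlx, yj = yc+(j-1)*dly.

def mesh_points(xa, xb, yc, yd, nx, ny):
    """Return the x and y coordinates of the uniform nx by ny grid."""
    dlx = (xb - xa) / (nx - 1)
    dly = (yd - yc) / (ny - 1)
    x = [xa + i * dlx for i in range(nx)]
    y = [yc + j * dly for j in range(ny)]
    return x, y


if __name__ == "__main__":
    x, y = mesh_points(0.0, 1.0, 0.0, 2.0, 5, 3)
    print(x)
    print(y)
```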

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formulas.

c

c *** an approximation is not generated after an intl=0 call!

c mud2sa should be called with intl=1 to approximate the elliptic

c pde discretized by the intl=0 call. intl=1 should also

c be input if mud2sa has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. this will bypass

c redundant pde discretization and argument checking

c and save computational time. some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) mud2sa is being recalled for additional accuracy. in

c this case iguess=iparm(12)=1 should also be used.

c

c (2) mud2sa is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) mud2sa is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to mud2sa

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the pde coefficients sigx, sigy, xlmbda (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c if any of (a) through (e) are true then the elliptic pde

c must be discretized or rediscretized. if none of (a)

c through (e) holds, calls can be made with intl=1.

c incorrect calls with intl=1 will produce erroneous results.

c *** the values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.
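c
c *** illustration (not part of mudpack). a python sketch of how the
c "better choice" in the example can be found: divide n-1 by 2 as often
c as possible while keeping the remaining factor at least 2. the
c function name grid_parameters is hypothetical. the limit of 50 on the
c exponent is not checked here.

```python
# Illustrative sketch only; finds (p, e) with n = p*2**(e-1) + 1, p minimal.

def grid_parameters(n):
    """Return (ixp, iex) (or (jyq, jey)) for a grid with n points."""
    p = n - 1
    if p < 2:
        raise ValueError("need n >= 3 so that ixp (or jyq) is at least 2")
    e = 1
    # strip factors of 2 from p, moving each one into the exponent
    while p % 2 == 0 and p // 2 >= 2:
        p //= 2
        e += 1
    return p, e


if __name__ == "__main__":
    print(grid_parameters(33))   # x direction of the 33 by 97 example
    print(grid_parameters(97))   # y direction of the 33 by 97 example
```

c a large odd value returned for p signals a grid size that is poorly
c suited to multigrid cycling with this solver.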

c

c *** grid size flexibility considerations:

c

c the hybrid multigrid/direct method code muh2 provides more grid size

c flexibility than mud2sa by removing the constraint that ixp and jyq are

c 2 or 3. this is accomplished by using a direct method whenever the

c coarsest (ixp+1) x (jyq+1) grid is encountered in multigrid cycling.

c if nx = ixp+1 and ny = jyq+1 then muh2 becomes a full direct method.

c muh2 is roughly equivalent to mud2sa in efficiency as long as ixp and

c jyq remain "small" (see muh2.d). if the problem to be approximated

c requires a grid neither mud2sa nor muh2 can exactly fit then another option

c is to generate an approximation on a "close grid" using mud2sa or muh2.

c then transfer the result to the required grid using cubic interpolation

c via the package "regridpack" (contact john adams about this software).

c

c *** note

c

c let g be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c g(1) < ... < g(k) < ... < g(n) = g.

c

c each g(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1
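c
c *** illustration (not part of mudpack). a python sketch that
c evaluates the mx(k), my(k) formulas above for every level of the
c grid chain. the function name grid_chain is hypothetical.

```python
# Illustrative sketch only; lists the grid sizes g(1) < ... < g(n).

def grid_chain(ixp, jyq, iex, jey):
    """Return [(mx(1), my(1)), ..., (mx(n), my(n))] with n = max(iex, jey)."""
    n = max(iex, jey)
    chain = []
    for k in range(1, n + 1):
        mx = ixp * 2 ** (max(iex + k - n, 1) - 1) + 1
        my = jyq * 2 ** (max(jey + k - n, 1) - 1) + 1
        chain.append((mx, my))
    return chain


if __name__ == "__main__":
    # the 33 by 97 example with ixp=2, jyq=3, iex=5, jey=6
    for level, size in enumerate(grid_chain(2, 3, 5, 6), start=1):
        print(level, size)
```

c when iex and jey differ, the direction with the smaller exponent
c stays at its coarsest size over the first few levels.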

c

c

c

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c
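
c *** sketch (added for illustration, not part of the original text)

c of the correction approach above for a time independent operator.

c the array names p, e, dr and the step count nstep are hypothetical.

c the mud2sa argument order is assumed to follow the order in which

c the arguments are documented in this file. p is assumed to hold

c p(t) from the earlier time steps, statements in parentheses are

c left to the user, and bndyc must return gbdy = f(t+dt) - f(t).

c

c      iparm(1) = 0

c      iparm(12) = 0

c      call mud2sa(iparm,fparm,work,sigx,sigy,xlmbda,bndyc,dr,e,

c     +            mgopt,ierror)

c      if (ierror.gt.0) stop

c      iparm(1) = 1

c      do n=1,nstep

c        (set dr(i,j) = r(t+dt) - r(t) at all grid points)

c        (preset e(i,j) as described above)

c        call mud2sa(iparm,fparm,work,sigx,sigy,xlmbda,bndyc,dr,e,

c     +              mgopt,ierror)

c        if (ierror.gt.0) stop

c        (update p(i,j) = p(i,j) + e(i,j) at all grid points)

c      end do

c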

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps are

c performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y direction

c

c

c *** choice of method. . .

c

c let fx represent the quantity sigx(x,y)/dlx**2 over the solution region.

c

c let fy represent the quantity sigy(x,y)/dly**2 over the solution region.

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2.

c

c if neither fx nor fy dominates over the solution region and they

c both vary considerably choose method = 3.

c

c

c ... length = iparm(15)

c

c the length of the work space provided in vector work (see below).

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if method = 1 or method = 3 and nxa.ne.0

c let isx = 5 if method = 1 or method = 3 and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if method = 2 or method = 3 and nyc.ne.0

c let jsy = 5 if method = 2 or method = 3 and nyc.eq.0

c then . . .

c

c length = 4*[nx*ny*(10+isx+jsy)+8*(nx+ny+2)]/3

c

c will suffice in most cases. the exact minimal work space

c length required for the current nx,ny and method is output

c in iparm(16) (even if iparm(15) is too small). this will be

c less than the value given by the simplified formula above

c in most cases.

c
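
c *** worked example (added for illustration): for nx = 33, ny = 97

c and method = 0 (isx = jsy = 0) the simplified formula gives

c

c length = 4*[33*97*10 + 8*(33+97+2)]/3 = 4*33066/3 = 44088

c

c for method = 3 with nxa.ne.0 and nyc.ne.0 (isx = jsy = 3) it gives

c

c length = 4*[33*97*16 + 8*(33+97+2)]/3 = 4*52272/3 = 69696

c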

c

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in mud2sa and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c
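
c *** sketch (added for illustration, not the internal mudpack code)

c of the test above. phi1 and phi2 are hypothetical nx by ny arrays

c holding the last two iterates, relerr is a hypothetical real and

c done a hypothetical logical variable. the phmax = 0.0 branch follows

c the convention documented for fparm(6) on output.

c

c      phdif = 0.0

c      phmax = 0.0

c      do j=1,ny

c        do i=1,nx

c          phdif = max(phdif,abs(phi2(i,j)-phi1(i,j)))

c          phmax = max(phmax,abs(phi2(i,j)))

c        end do

c      end do

c      relerr = phdif

c      if (phmax.gt.0.0) relerr = phdif/phmax

c      done = relerr.lt.tolmax

c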

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible do not use error control!).

c

c ... work

c

c a one dimensional real saved work space (see iparm(15) for

c length) which must be preserved from the previous call when

c calling with intl=iparm(1)=1.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,gbdy) which

c are used to input mixed boundary conditions to mud2sa. bndyc

c must be declared "external" in the program calling mud2sa.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tmud2sa.f" for an example of how bndyc can

c be set up).

c

c * * * * * * * * * * * *  y=yd

c *       kbdy=4        *

c *                     *

c *                     *

c *                     *

c * kbdy=1       kbdy=2 *

c *                     *

c *                     *

c *                     *

c *       kbdy=3        *

c * * * * * * * * * * * *  y=yc

c

c x=xa                  x=xb

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa(y)*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,gbdy corresponding to alfxa(y),gbdxa(y) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb(y)*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y, will be input to bndyc and

c alfa,gbdy corresponding to alfxb(y),gbdxb(y) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc(x)*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyc(x),gbdyc(x) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd(x)*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyd(x),gbdyd(x) must be returned.

c

c

c *** bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c mud2sa will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling mud2sa. the actual name chosen may

c be different.

c
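
c *** sketch (added for illustration) of a bndyc routine, assuming a

c hypothetical problem with nxb=iparm(3)=2 and all other boundaries

c specified or periodic, so that only kbdy=2 is requested (xory = y).

c the boundary functions alfxb(y) = 1.0 and gbdxb(y) = 2.0*y are made

c up for the example.

c

c      subroutine bndyc(kbdy,xory,alfa,gbdy)

c      implicit none

c      integer kbdy

c      real xory,alfa,gbdy

c      if (kbdy.eq.2) then

c        alfa = 1.0

c        gbdy = 2.0*xory

c      end if

c      return

c      end

c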

c

c ... sigx,sigy

c

c function subroutines which return the real values of the

c coefficients at any point (x,y). they must be constructed

c to return values outside the solution region for non-dirichlet

c boundaries. let dx = (xb-xa)/ixp and dy = (yd-yc)/jyq.

c then sigx,sigy will be invoked for x,y in the intervals

c [xa-0.5*dx,xa], [xb,xb+0.5*dx], [yc-0.5*dy,yc], [yd,yd+0.5*dy]

c whenever boundary conditions at x=xa,x=xb,y=yc,y=yd are unspecified.

c this is necessitated by conservative finite differencing. sigx,

c sigy will not be invoked outside specified boundaries. sigx,

c sigy should be positive for all (x,y) (see ierror = -2). they

c must be declared "external" in the user constructed program calling

c mud2sa where their names may be different.

c

c ... xlmbda

c

c a real valued function subroutine which returns the value

c of "xlmbda" in the pde at any grid point (xi,yj). xlmbda should

c be nonnegative for any (xi,yj) (see ierror = -4). xlmbda must be

c declared "external" in the user constructed program calling

c mud2sa where its name may be different.

c

c

c ... rhs

c

c an array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c an array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...nx

c prior to calling mud2sa. these values are preserved by mud2sa.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all internal and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid parameters (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the parameters

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c
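
c *** sketch (added for illustration) of the two settings discussed

c above, assuming mgopt is declared "integer mgopt(4)" in the calling

c program.

c

c default options (equivalent to (2,2,1,3)):

c

c      mgopt(1) = 0

c

c v(2,1) cycles with linear prolongation:

c

c      mgopt(1) = 1

c      mgopt(2) = 2

c      mgopt(3) = 1

c      mgopt(4) = 1

c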

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction.

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4

c                              * .               .

c                             *   .             .

c ---------------2-----------1-----2-----------1--------- level 3

c               * .         .       .         .

c              *   .       .         .       .

c ------2-----1-----2-----1-----------2-----1------------ level 2

c      * .   .       .   .             .   .

c     *   . .         . .               . .

c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4

c                        * .                         .

c ----------2-----------1---2-----------3-----------1---- level 3

c          * .         .     .         . .         .

c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2

c    * . .     . . . .         . . . .     . . . .

c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc

c

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see as input argument)

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0. only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0).

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.

c

c

c ... work

c

c on output work contains intermediate values that must not

c be destroyed if mud2sa is to be called again with intl=1

c

c

c ... phi *** for intl=1 calls only

c

c on output phi(i,j) contains the approximation to p(xi,yj)

c for all mesh points i = 1,...,nx and j=1,...,ny. the last

c computed iterate in phi is returned even if convergence is

c not obtained

c

c

c ... ierror

c

c for intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. argument checking and discretization

c are bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if xlmbda < 0 for some grid point (xi,yj)

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and xlmbda(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4,-5 nonfatal flags.

c

c

c =-2 if the pde is not elliptic (sigx(x,y) or sigy(x,y) .le. 0.0).

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(5)>0.

c is not obtained in maxcy=iparm(13) multigrid cycles between the

c coarsest (ixp+1,jyq+1) and finest (nx,ny) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd

c in iparm(2),iparm(3),iparm(4),iparm(5) are not 0,1 or 2

c or if nxa,nxb or nyc,nyd are not pairwise zero.

c

c = 3 if min0(ixp,jyq) < 2 (ixp = iparm(6), jyq = iparm(7))

c

c = 4 if min0(iex,jey) < 1 (iex = iparm(8), jey = iparm(9)) or

c if max0(iex,jey) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or ny.ne.jyq*2**(jey-1)+1

c (nx = iparm(10), ny = iparm(11))

c

c = 6 if iguess = iparm(12) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(13) < 1

c

c = 8 if method = iparm(14) is not 0,1,2, or 3

c

c = 9 if length = iparm(15) is too small (see iparm(16) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4))

c

c =11 if tolmax = fparm(5) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c
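
c *** sketch (added for illustration) of one way to act on ierror

c after a call to mud2sa. the messages are arbitrary. for ierror = 9

c the required work space length is available in iparm(16).

c

c      if (ierror.gt.0) then

c        write(*,*) 'mud2sa fatal argument error, ierror = ',ierror

c        if (ierror.eq.9) write(*,*) 'iparm(16) = ',iparm(16)

c        stop

c      else if (ierror.lt.0) then

c        write(*,*) 'mud2sa nonfatal warning, ierror = ',ierror

c      end if

c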

c *********************************************************

c *********************************************************

c

c end of mud2sa documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file mud2sp.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud2sp.d

c

c contains documentation for:

c subroutine mud2sp(iparm,fparm,work,cofx,cofy,bndyc,rhs,phi,mgopt,ierror)

c A sample fortran driver is file "tmud2sp.f".

c

c ... required MUDPACK files

c

c mudcom.f

c

c ... purpose

c

c subroutine mud2sp automatically discretizes and attempts to compute

c the second-order difference approximation to the two-dimensional

c linear separable elliptic partial differential equation on a

c rectangle. the approximation is generated on a uniform grid covering

c the rectangle (see mesh description below). boundary conditions

c may be specified (dirichlet), periodic, or mixed derivative in any

c combination. the form of the pde solved is:

c

c

c cxx(x)*pxx + cx(x)*px + cex(x)*p(x,y) +

c

c cyy(y)*pyy + cy(y)*py + cey(y)*p(x,y) = r(x,y)

c

c pxx,pyy,px,py are second and first partial derivatives of the

c unknown real solution function p(x,y) with respect to the

c independent variables x,y. cxx,cx,cex,cyy,cy,cey are the known

c real coefficients of the elliptic pde and r(x,y) is the known

c real right hand side of the equation. cxx and cyy should be

c positive for all x,y in the solution region. If some of the

c coefficients depend on both x and y then the PDE is nonseparable.

c In this case subroutine muh2 or mud2 must be used instead of mud2sp

c (see the files muh2.d or mud2.d)

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) is provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formulas.

c

c *** An approximation is NOT generated after an intl=0 call!

c mud2sp should be called with intl=1 to approximate the elliptic

c PDE discretized by the intl=0 call. intl=1 should also

c be input if mud2sp has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. This will bypass

c redundant pde discretization and argument checking

c and save computational time. Some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) mud2sp is being recalled for additional accuracy. In

c this case iguess=iparm(12)=1 should also be used.

c

c (2) mud2sp is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) mud2sp is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to mud2sp

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by cofx,cofy (see below) have

c changed since the previous call

c

c (e) any of the constant "alfa" coefficients input by bndyc

c (see below) have changed since the previous call.

c

c If any of (a) through (e) are true then the elliptic PDE

c must be discretized or rediscretized. If none of (a)

c through (e) holds, calls can be made with intl=1.

c Incorrect calls with intl=1 will produce erroneous results.

c *** The values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c
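
c *** sketch (added for illustration) of the two call sequence. it

c assumes iparm, fparm, mgopt, rhs and phi have been set as described

c in this file and that cofx, cofy, bndyc are declared external.

c

c      iparm(1) = 0

c      call mud2sp(iparm,fparm,work,cofx,cofy,bndyc,rhs,phi,mgopt,

c     +            ierror)

c      if (ierror.gt.0) stop

c      iparm(1) = 1

c      call mud2sp(iparm,fparm,work,cofx,cofy,bndyc,rhs,phi,mgopt,

c     +            ierror)

c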

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c if ixp > 2 then it should be a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c if jyq > 2 then it should be a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.

c
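
c the choice above can be reproduced with a short sketch. the routine

c "gridfac" is hypothetical (it is not part of mudpack): it removes

c powers of 2 from n-1 so that n = ip*2**(ie-1)+1 with ip equal to 2

c or odd. n.ge.3 is assumed and the limit ie.le.50 is not checked.

```fortran
      program tfac
c     recover (ixp,iex) and (jyq,jey) for the 33 by 97 grid
      implicit none
      integer ixp,iex,jyq,jey
      call gridfac(33,ixp,iex)
      call gridfac(97,jyq,jey)
      print *, 'ixp,iex = ',ixp,iex
      print *, 'jyq,jey = ',jyq,jey
      end

      subroutine gridfac(n,ip,ie)
c     hypothetical helper, not part of mudpack: factor n-1 as
c     ip*2**(ie-1) with ip.ge.2 as small as possible (n.ge.3)
      implicit none
      integer n,ip,ie
      ip = n-1
      ie = 1
c     remove factors of 2 while the remaining factor stays .ge. 2
      do while (mod(ip,2).eq.0 .and. ip/2.ge.2)
        ip = ip/2
        ie = ie+1
      end do
      return
      end
```

c for the 33 by 97 grid this gives ixp=2, iex=5 and jyq=3, jey=6.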

c

c *** note

c

c let G be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c
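
c as an illustration, the chain of grids for the 33 by 97 example

c (ixp=2, jyq=3, iex=5, jey=6) can be listed with the sketch below.

c the function "mgrid" is hypothetical (not part of mudpack); it only

c evaluates the formula given above for mx(k) and my(k).

```fortran
      program tchain
c     grids G(1),...,G(n) for ixp=2, jyq=3, iex=5, jey=6
      implicit none
      integer mgrid,n,k
      n = max0(5,6)
      do k=1,n
        print *, 'k,mx,my = ',k,mgrid(2,5,k,n),mgrid(3,6,k,n)
      end do
      end

      integer function mgrid(ip,ie,k,n)
c     hypothetical helper, not part of mudpack: number of points
c     in one direction on G(k), from the formula for mx(k),my(k)
      implicit none
      integer ip,ie,k,n
      mgrid = ip*2**(max0(ie+k-n,1)-1) + 1
      return
      end
```

c here the x grid stays at ixp+1 = 3 points on G(1) and G(2) because

c iex < jey, while the y grid is coarsened down to jyq+1 = 4 points.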

c

c

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-Dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-Dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c
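
c the accuracy argument above can be checked with made-up numbers.

c in the sketch below p = 1.0 and e = 1.0e-3 are assumptions chosen

c only so that e is much smaller than p; no mudpack routine is called.

```fortran
      program tcorr
c     made-up values: the correction e is 1000 times smaller than p
      implicit none
      double precision p,e,eapp,rele,relp
      p = 1.0d0
      e = 1.0d-3
c     an approximation to e that is good to about 3 digits
      eapp = e*(1.0d0+1.0d-3)
      rele = dabs(eapp-e)/dabs(e)
      relp = dabs((p+eapp)-(p+e))/dabs(p+e)
      print *, 'relative error in e     = ',rele
      print *, 'relative error in p + e = ',relp
      end
```

c the relative error in e is about 1.0e-3 while the relative error in

c p + e is about 1.0e-6, i.e., roughly three more digits.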

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps are

c performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few cycles are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y direction

c

c

c *** choice of method. . .

c

c let fx represent the quantity cxx(x,y)/dlx**2 over the solution region.

c

c let fy represent the quantity cyy(x,y)/dly**2 over the solution region.

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2.

c

c if neither fx nor fy dominates over the solution region and they

c both vary considerably choose method = 3.

c
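
c the guidelines above might be coded as in the sketch below. the

c function "mchoos" is hypothetical (not part of mudpack) and the

c factor big = 10.0 used to decide "much greater" and "vary too much"

c is an assumption; the guidelines do not give a numerical threshold.

```fortran
      program tmeth
c     made-up case: cxx = cyy = 1.0 with dlx = 0.01, dly = 0.1
      implicit none
      integer mchoos
      real fx,fy
      fx = 1.0/0.01**2
      fy = 1.0/0.1**2
      print *, 'method = ',mchoos(fx,fx,fy,fy)
      end

      integer function mchoos(fxmin,fxmax,fymin,fymax)
c     hypothetical helper, not part of mudpack.  fxmin,fxmax and
c     fymin,fymax are the extremes of fx and fy (assumed positive)
c     over the solution region.  big = 10.0 is an assumed threshold.
      implicit none
      real fxmin,fxmax,fymin,fymax,big
      parameter (big = 10.0)
      if (fxmin .gt. big*fymax) then
        mchoos = 1
      else if (fymin .gt. big*fxmax) then
        mchoos = 2
      else if (fxmax.le.big*fxmin .and. fymax.le.big*fymin) then
        mchoos = 0
      else
        mchoos = 3
      end if
      return
      end
```

c in the made-up case fx is 100 times fy, so method = 1 is selected.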

c

c ... length = iparm(15)

c

c the length of the work space provided in vector work (see below).

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if (method = 1 or method = 3) and nxa.ne.0

c let isx = 5 if (method = 1 or method = 3) and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if (method = 2 or method = 3) and nyc.ne.0

c let jsy = 5 if (method = 2 or method = 3) and nyc.eq.0

c then . . .

c

c length = nx*ny*(5+3*(isx+jsy)/2)+ 10*(nx+ny)

c

c will suffice in all cases but very small nx and ny.

c the exact minimal work space length required for the

c current set of input arguments is output in iparm(16)

c (even if iparm(15) is too small). this will usually

c be less than the value given by the simplified formula

c above. * Notice that mud2sp requires considerably less

c work space than the nonseparable solvers muh2,mud2 if

c and only if method=0 is chosen.

c
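
c a sketch of the simplified work space estimate follows. the function

c "lenest" is hypothetical (not part of mudpack). it assumes that the

c conditions on isx and jsy group as "(method = 1 or 3) and nxa..."

c and that the formula is evaluated in integer arithmetic.

```fortran
      program tlen
c     33 by 97 grid, method = 3, nonperiodic boundaries in x and y
c     (nxa = nyc = 1); these input values are illustrative
      implicit none
      integer lenest
      print *, 'length = ',lenest(33,97,3,1,1)
      end

      integer function lenest(nx,ny,method,nxa,nyc)
c     hypothetical helper, not part of mudpack
      implicit none
      integer nx,ny,method,nxa,nyc,isx,jsy
      isx = 0
      if (method.eq.1 .or. method.eq.3) then
        isx = 3
        if (nxa.eq.0) isx = 5
      end if
      jsy = 0
      if (method.eq.2 .or. method.eq.3) then
        jsy = 3
        if (nyc.eq.0) jsy = 5
      end if
      lenest = nx*ny*(5+3*(isx+jsy)/2) + 10*(nx+ny)
      return
      end
```

c for these inputs the estimate is 46114 words. with method = 0 on the

c same grid it drops to 17305.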

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in mud2sp and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c
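
c the test above can be sketched as follows. the function "reldif" is

c hypothetical (not part of mudpack) and the 2 by 2 iterates are made

c up. the degenerate case phmax = 0.0 is handled as described for

c fparm(6) on output.

```fortran
      program ttol
c     made-up 2 by 2 iterates, for illustration only
      implicit none
      real phi1(2,2),phi2(2,2),reldif,tolmax
      data phi1 /1.0,2.0,3.0,4.0/
      data phi2 /1.0,2.0,3.0,4.004/
      tolmax = 1.0e-2
      print *, 'relative difference = ',reldif(2,2,phi1,phi2)
      if (reldif(2,2,phi1,phi2) .lt. tolmax) print *,'converged'
      end

      real function reldif(nx,ny,phi1,phi2)
c     hypothetical helper, not part of mudpack: returns
c     phdif/phmax (or phdif when phmax = 0.0)
      implicit none
      integer nx,ny,i,j
      real phi1(nx,ny),phi2(nx,ny),phdif,phmax
      phdif = 0.0
      phmax = 0.0
      do j=1,ny
        do i=1,nx
          phdif = amax1(phdif,abs(phi2(i,j)-phi1(i,j)))
          phmax = amax1(phmax,abs(phi2(i,j)))
        end do
      end do
      reldif = phdif
      if (phmax .gt. 0.0) reldif = phdif/phmax
      return
      end
```

c here phdif is about 0.004 and phmax is 4.004, so the relative

c difference is about 1.0e-3 and the test with tolmax = 1.0e-2 passes.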

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).

c

c ... work

c

c a one dimensional real saved work space (see iparm(15) for

c length) which must be preserved from the previous call when

c calling with intl=iparm(1)=1.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,gbdy) which

c are used to input mixed boundary conditions to mud2sp. bndyc

c must be declared "external" in the program calling mud2sp.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tmud2sp.f" for an example of how bndyc

c can be set up).

c

c * * * * * * * * * * * *  y=yd
c *       kbdy=4        *
c *                     *
c *                     *
c *                     *
c * kbdy=1       kbdy=2 *
c *                     *
c *                     *
c *                     *
c *       kbdy=3        *
c * * * * * * * * * * * *  y=yc
c
c x=xa               x=xb

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,gbdy corresponding to alfxa,gbdxa(y) must be returned

c

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y, will be input to bndyc and

c alfa,gbdy corresponding to alfxb,gbdxb(y) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyc,gbdyc(x) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyd,gbdyd(x) must be returned.

c

c

c *** alfxa,alfxb,alfyc,alfyd must be constants for mud2sp.

c Use muh2 or mud2 if any of these depend on x or y.

c bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c mud2sp will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling mud2sp. the actual name chosen may

c be different.

c
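
c a minimal sketch of a bndyc routine follows. it assumes a single

c mixed boundary at x=xb (nxb=iparm(3)=2) with the made-up condition

c dp/dx + 1.0*p(xb,y) = y**2. the driver only calls the routine

c directly to show the returned values; it does not call mud2sp.

```fortran
      program tbndyc
c     call the sample routine directly to show what it returns;
c     in practice it is passed to mud2sp as an external argument
      implicit none
      external bndyc
      real alfa,gbdy
      call bndyc(2,0.5,alfa,gbdy)
      print *, 'alfa,gbdy = ',alfa,gbdy
      end

      subroutine bndyc(kbdy,xory,alfa,gbdy)
c     sample mixed boundary condition at x=xb only (kbdy=2):
c         dp/dx + 1.0*p(xb,y) = y**2
c     the coefficient 1.0 and right hand side y**2 are made up
      implicit none
      integer kbdy
      real xory,alfa,gbdy,y
      if (kbdy.eq.2) then
        y = xory
        alfa = 1.0
        gbdy = y*y
      end if
      return
      end
```

c at y = 0.5 the routine returns alfa = 1.0 and gbdy = 0.25. alfa is

c a constant here, as mud2sp requires.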

c

c ... cofx

c

c a subroutine with arguments (x,cxx,cx,cex) which provides

c the known real x dependent coefficients for the separable

c elliptic pde at any x grid point. the name chosen in the calling

c routine may be different, but the coefficient routine must be

c declared "external" there.

c

c ... cofy

c

c a subroutine with arguments (y,cyy,cy,cey) which provides

c the known real y dependent coefficients for the separable

c elliptic pde at any y grid point. the name chosen in the calling

c routine may be different, but the coefficient routine must be

c declared "external" there.

c
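
c a minimal sketch of the two coefficient routines follows. the

c coefficient values are made up. it is assumed that cxx,cx,cex (and

c cyy,cy,cey) multiply the second derivative, the first derivative and

c the solution, following the naming used by the nonseparable solvers.

```fortran
      program tcof
c     call the sample routines directly at x = y = 0.5
      implicit none
      external cofx,cofy
      real cxx,cx,cex,cyy,cy,cey
      call cofx(0.5,cxx,cx,cex)
      call cofy(0.5,cyy,cy,cey)
      print *, 'cxx,cx,cex = ',cxx,cx,cex
      print *, 'cyy,cy,cey = ',cyy,cy,cey
      end

      subroutine cofx(x,cxx,cx,cex)
c     made-up x dependent coefficients (cxx is positive for all x)
      implicit none
      real x,cxx,cx,cex
      cxx = 1.0+x*x
      cx = 2.0*x
      cex = -1.0
      return
      end

      subroutine cofy(y,cyy,cy,cey)
c     made-up y dependent coefficients (cyy is positive for all y)
      implicit none
      real y,cyy,cy,cey
      cyy = 2.0
      cy = 0.0
      cey = -y
      return
      end
```

c both routines depend on one variable only, which is what makes the

c pde separable.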

c ... rhs

c

c an array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c an array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...,nx

c prior to calling mud2sp. these values are preserved by mud2sp.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all internal and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4
c                              * .               .
c                             *   .             .
c ---------------2-----------1-----2-----------1--------- level 3
c               * .         .       .         .
c              *   .       .         .       .
c ------2-----1-----2-----1-----------2-----1------------ level 2
c      * .   .       .   .             .   .
c     *   . .         . .               . .
c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4
c                        * .                         .
c ----------2-----------1---2-----------3-----------1---- level 3
c          * .         .     .         . .         .
c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2
c    * . .     . . . .         . . . .     . . . .
c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc

c
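
c a runnable sketch of algorithm cgc for a 1-d model problem is given

c below. it is not mudpack source code. assumptions: the fortran90

c "recursive" keyword is used (mudpack itself is non-recursive),

c relaxation is lexicographic gauss-seidel rather than the red/black

c ordering used in mudpack, restriction is full weighting, prolongation

c is linear, and the correction on each coarser grid starts from zero.

```fortran
      program tcgc
c     1-d model problem u'' = r on [0,1], u(0)=u(1)=0, with exact
c     solution u = sin(pi*x); 5 levels, finest grid 33 points
      implicit none
      integer nlev,n,i,icyc
      parameter (nlev=5, n=2**nlev+1)
      real u(n),r(n),pi,h,err
      pi = 4.0*atan(1.0)
      h = 1.0/(n-1)
      do i=1,n
        u(i) = 0.0
        r(i) = -pi*pi*sin(pi*(i-1)*h)
      end do
      do icyc=1,5
c       one w(2,1) cycle from the finest level
        call cgc(nlev,n,u,r,2,2,1)
        err = 0.0
        do i=1,n
          err = amax1(err,abs(u(i)-sin(pi*(i-1)*h)))
        end do
        print *, 'cycle, max error = ',icyc,err
      end do
      end

      recursive subroutine cgc(k,n,u,r,kcycle,iprer,ipost)
c     coarse grid correction at level k with n = 2**k+1 points
      implicit none
      integer k,n,kcycle,iprer,ipost,i,nc,kount
      real u(n),r(n),h
      real uc((n+1)/2),rc((n+1)/2),res(n)
      h = 1.0/(n-1)
c     pre-relax at level k
      do i=1,iprer
        call relax(n,u,r,h)
      end do
      if (k .gt. 1) then
c       residual r - l*u at interior points
        res(1) = 0.0
        res(n) = 0.0
        do i=2,n-1
          res(i) = r(i)-(u(i-1)-2.0*u(i)+u(i+1))/(h*h)
        end do
c       fully weighted restriction to level k-1, zero initial guess
        nc = (n+1)/2
        do i=1,nc
          uc(i) = 0.0
          rc(i) = 0.0
        end do
        do i=2,nc-1
          rc(i) = 0.25*(res(2*i-2)+2.0*res(2*i-1)+res(2*i))
        end do
c       the recursion: kcycle passes at level k-1
        do kount=1,kcycle
          call cgc(k-1,nc,uc,rc,kcycle,iprer,ipost)
        end do
c       linear prolongation of the correction, added to u
        do i=2,nc-1
          u(2*i-1) = u(2*i-1)+uc(i)
        end do
        do i=1,nc-1
          u(2*i) = u(2*i)+0.5*(uc(i)+uc(i+1))
        end do
      end if
c     post relax at level k
      do i=1,ipost
        call relax(n,u,r,h)
      end do
      return
      end

      subroutine relax(n,u,r,h)
c     one gauss-seidel sweep for (u(i-1)-2u(i)+u(i+1))/h**2 = r(i)
      implicit none
      integer n,i
      real u(n),r(n),h
      do i=2,n-1
        u(i) = 0.5*(u(i-1)+u(i+1)-h*h*r(i))
      end do
      return
      end
```

c the printed error is expected to settle near the discretization

c error of the 33 point grid after a few cycles. calling cgc with

c kcycle = 1 instead of 2 gives v(2,1) cycles.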

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see the input argument).

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0. only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0).

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.

c

c

c ... work

c

c on output work contains intermediate values that must not

c be destroyed if mud2sp is to be called again with intl=1

c

c

c ... phi *** for intl=1 calls only

c

c on output phi(i,j) contains the approximation to p(xi,yj)

c for all mesh points i = 1,...,nx and j=1,...,ny. the last

c computed iterate in phi is returned even if convergence is

c not obtained

c

c

c ... ierror

c

c For intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. Argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c
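
c the test and adjustment above can be sketched for a single grid

c point as follows. the function "cxxadj" is hypothetical (not part

c of mudpack) and the coefficient values are made up.

```fortran
      program thyp
c     made-up values: cxx = 1.0, cx = 100.0, mesh size 1/32
      implicit none
      real cxxadj,cxx,cx,dx
      cxx = 1.0
      cx = 100.0
      dx = 1.0/32.0
      if (abs(cx)*dx .gt. 2.0*abs(cxx)) then
        print *, 'first order term dominates at this mesh size'
      end if
      print *, 'adjusted cxx = ',cxxadj(cxx,cx,dx)
      end

      real function cxxadj(cxx,cx,dx)
c     hypothetical helper, not part of mudpack: the adjustment
c     cxx = amax1(cxx,0.5*abs(cx)*dx) described above
      implicit none
      real cxx,cx,dx
      cxxadj = amax1(cxx,0.5*abs(cx)*dx)
      return
      end
```

c here abs(cx)*dx = 3.125 exceeds 2.0*abs(cxx) = 2.0, and the adjusted

c value is 1.5625. with dx = 1/128 the product is about 0.78, so no

c adjustment would be made, which is the effect of the remedy above.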

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj))

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(5)>0.

c is not obtained in maxcy=iparm(13) multigrid cycles between the

c coarsest (ixp+1,jyq+1) and finest (nx,ny) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on the first call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd

c in iparm(2),iparm(3),iparm(4),iparm(5) are not 0,1 or 2

c or if nxa,nxb or nyc,nyd are not pairwise zero.

c

c = 3 if min0(ixp,jyq) < 2 (ixp = iparm(6), jyq = iparm(7))

c

c = 4 if min0(iex,jey) < 1 (iex = iparm(8), jey = iparm(9)) or

c if max0(iex,jey) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or ny.ne.jyq*2**(jey-1)+1

c (nx = iparm(10), ny = iparm(11))

c

c = 6 if iguess = iparm(12) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(13) < 1

c

c = 8 if method = iparm(14) is not 0,1,2, or 3

c

c = 9 if length = iparm(15) is too small (see iparm(16) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4))

c

c =11 if tolmax = fparm(5) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c

c *********************************************************

c *********************************************************

c

c end of mud2sp documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file mud3.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud3.d

c

c contains documentation for:

c subroutine mud3(iparm,fparm,work,coef,bndyc,rhs,phi,mgopt,ierror)

c A sample fortran driver is file "tmud3.f".

c

c ... required MUDPACK files

c

c mudcom.f, mud3ln.f, mud3pn.f

c

c ... purpose

c

c subroutine mud3 automatically discretizes and attempts to compute

c the second order finite difference approximation to a three-

c dimensional linear nonseparable elliptic partial differential

c equation on a box. the approximation is generated on a uniform

c grid covering the box (see mesh description below). boundary

c conditions may be any combination of mixed, specified (Dirichlet)

c or periodic. the form of the pde solved is . . .

c

c cxx(x,y,z)*pxx + cyy(x,y,z)*pyy + czz(x,y,z)*pzz +

c

c cx(x,y,z)*px + cy(x,y,z)*py + cz(x,y,z)*pz +

c

c ce(x,y,z)*p(x,y,z) = r(x,y,z)

c

c here cxx,cyy,czz,cx,cy,cz,ce are the known real coefficients

c of the pde; pxx,pyy,pzz,px,py,pz are the second and first

c partial derivatives of the unknown solution function p(x,y,z)

c with respect to the independent variables x,y,z; r(x,y,z) is

c the known real right hand side of the elliptic pde. cxx,cyy

c and czz should be positive for all (x,y,z) in the solution region.

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny by nz grid.

c the grid is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd] x [ze,zf].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1), dlz = (zf-ze)/(nz-1)

c

c be the uniform grid increments in the x,y,z directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly, zk = ze+(k-1)*dlz

c

c for i=1,...,nx; j=1,...,ny; k=1,...,nz denote the x,y,z uniform

c mesh points.

c
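
c the mesh definition can be sketched as follows. the function

c "gridpt" is hypothetical (not part of mudpack) and the box

c [0,1] x [0,2] x [0,4] with a 5 by 9 by 17 grid is a made-up example.

```fortran
      program tmesh
c     made-up box [0,1] x [0,2] x [0,4] on a 5 by 9 by 17 grid
      implicit none
      real gridpt
      integer i
      do i=1,5
        print *, 'i,xi = ',i,gridpt(0.0,1.0,5,i)
      end do
      print *, 'dly,dlz = ',2.0/(9-1),4.0/(17-1)
      end

      real function gridpt(a,b,n,i)
c     hypothetical helper, not part of mudpack: i-th of n equally
c     spaced points on [a,b], e.g. xi = xa+(i-1)*dlx
      implicit none
      real a,b,dl
      integer n,i
      dl = (b-a)/(n-1)
      gridpt = a+(i-1)*dl
      return
      end
```

c in this example dlx = dly = dlz = 0.25, so the grid is uniform with

c the same increment in all three directions.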

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formula. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) is provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formula are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 23 used to efficiently pass

c integer arguments. iparm is set internally in mud3

c and defined as follows . . .

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formula.

c

c *** An approximation is NOT generated after an intl=0 call!

c mud3 should be called with intl=1 to approximate the elliptic

c PDE discretized by the intl=0 call. intl=1 should also

c be input if mud3 has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. This will bypass

c redundant pde discretization and argument checking

c and save computational time. Some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) mud3 is being recalled for additional accuracy. In

c this case iguess=iparm(17)=1 should also be used.

c

c (2) mud3 is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) mud3 is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to mud3

c (b) any of the integer arguments other than iguess=iparm(17)

c or maxcy=iparm(18) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c If any of (a) through (e) are true then the elliptic PDE

c must be discretized or rediscretized. If none of (a)

c through (e) holds, calls can be made with intl=1.

c Incorrect calls with intl=1 will produce erroneous results.

c *** The values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the (y,z) plane x=xa

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xa,y,z) is specified (this must be input thru phi(1,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see "bndyc" description below where kbdy = 1)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the (y,z) plane x=xb

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xb,y,z) is specified (this must be input thru phi(nx,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see "bndyc" description below where kbdy = 2)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the (x,z) plane y=yc

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yc,z) is specified (this must be input thru phi(i,1,k))

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see "bndyc" description below where kbdy = 3)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the (x,z) plane y=yd

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yd,z) is specified (this must be input thru phi(i,ny,k))

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see "bndyc" description below where kbdy = 4)

c

c

c ... nze=iparm(6)

c

c flags boundary conditions on the (x,y) plane z=ze

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,ze) is specified (this must be input thru phi(i,j,1))

c = 2 if there are mixed derivative boundary conditions at z=ze

c (see "bndyc" description below where kbdy = 5)

c

c

c ... nzf=iparm(7)

c

c flags boundary conditions on the (x,y) plane z=zf

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,zf) is specified (this must be input thru phi(i,j,nz))

c = 2 if there are mixed derivative boundary conditions at z=zf

c (see "bndyc" description below where kbdy = 6)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(8)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(14)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(11)

c without changing nx = iparm(14)

c

c

c ... jyq = iparm(9)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(15)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(12)

c without changing ny = iparm(15)

c

c

c ... kzr = iparm(10)

c

c an integer greater than one which is used in defining the number

c of grid points in the z direction (see nz = iparm(16)). "kzr+1"

c is the number of points on the coarsest z grid visited during

c multigrid cycling. kzr should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the z direction is not used.

c kzr should be 2 or a small odd value since a power

c of 2 factor of kzr can be removed by increasing kez = iparm(13)

c without changing nz = iparm(16)

c

c

c ... iex = iparm(11)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(14)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(8)

c as small as possible within grid size constraints when

c defining nx = iparm(14).

c

c

c ... jey = iparm(12)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(15)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(9)

c as small as possible within grid size constraints when

c defining ny = iparm(15).

c

c

c ... kez = iparm(13)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the z direction (see nz = iparm(16)).

c kez .le. 50 is required. for efficient multigrid cycling,

c kez should be chosen as large as possible and kzr=iparm(10)

c as small as possible within grid size constraints when

c defining nz = iparm(16).

c

c

c ... nx = iparm(14)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(8), iex = iparm(11).

c

c

c ... ny = iparm(15)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(9), jey = iparm(12).

c

c

c ... nz = iparm(16)

c

c the number of equally spaced grid points in the interval [ze,zf]

c (including the boundaries). nz must have the form

c

c nz = kzr*(2**(kez-1)) + 1

c

c where kzr = iparm(10), kez = iparm(13)

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 65 by 97 grid. then

c ixp=2, jyq=4, kzr=6 and iex=jey=kez=5 could be used. a better

c choice would be ixp=jyq=2, kzr=3, and iex=5, jey=kez=6.
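c
c *** sketch (not part of mudpack)
c
c a minimal sketch, assuming a hypothetical helper function mudnpt,
c of how the form nx = ixp*(2**(iex-1)) + 1 can be evaluated to check
c a choice of ixp,iex (or jyq,jey or kzr,kez) before calling mud3:

```fortran
      integer function mudnpt(ip, ie)
! number of grid points ip*2**(ie-1)+1 along one coordinate
! (mudnpt is a hypothetical helper, not a mudpack routine)
      implicit none
      integer ip, ie
      mudnpt = ip*2**(ie-1) + 1
      return
      end
```

c for the 33 by 65 by 97 example, mudnpt(2,5)=33, mudnpt(4,5)=65 and
c mudnpt(6,5)=97. the better choice gives mudnpt(2,6)=65 and
c mudnpt(3,6)=97 with smaller coarse grids in y and z.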

c

c *** note

c

c let G be the nx by ny by nz fine grid on which the approximation is

c generated and let n = max0(iex,jey,kez). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) by mz(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c

c mz(k) = kzr*[2**(max0(kez+k-n,1)-1)] + 1
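c
c *** sketch (not part of mudpack)
c
c the formulas for mx(k),my(k),mz(k) can be evaluated as in the
c following sketch. the subroutine name mudlev is an assumption made
c for illustration only.

```fortran
      subroutine mudlev(ixp,jyq,kzr,iex,jey,kez,k,mx,my,mz)
! grid points mx by my by mz on level k of the chain G(1)<...<G(n)
! (mudlev is a hypothetical helper, not a mudpack routine)
      implicit none
      integer ixp,jyq,kzr,iex,jey,kez,k,mx,my,mz,n
      n = max0(iex,jey,kez)
      mx = ixp*2**(max0(iex+k-n,1)-1) + 1
      my = jyq*2**(max0(jey+k-n,1)-1) + 1
      mz = kzr*2**(max0(kez+k-n,1)-1) + 1
      return
      end
```

c with ixp=jyq=2, kzr=3, iex=5, jey=kez=6 (so n=6) level k=1 is
c 3 by 3 by 4, level k=2 is 3 by 5 by 7 and level k=6 is the
c 33 by 65 by 97 fine grid.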

c

c

c

c ... iguess=iparm(17)

c

c = 0 if no initial guess to the pde is provided

c and/or full multigrid cycling beginning at the

c coarsest grid level is desired.

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below). in this case

c cycling beginning or restarting at the finest grid

c is initiated.

c

c *** comments on iguess = 0 or 1 . . .

c

c

c setting iguess=0 forces full multigrid or "fmg" cycling. phi

c must be initialized at all grid points. it can be set to zero at

c non-Dirichlet grid points if nothing better is available. the

c values set in phi when iguess = 0 are passed down and serve

c as an initial guess to the pde at the coarsest grid level where

c multigrid cycling commences.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c *** time dependent problems . . .

c

c assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-Dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(18)

c

c the exact number of cycles executed between the finest

c (nx by ny by nz) and the coarsest ((ixp+1) by (jyq+1) by

c (kzr+1)) grid levels when tolmax=fparm(7)=0.0 (no error

c control). when tolmax=fparm(7).gt.0.0 is input (error control)

c then maxcy is a limit on the number of cycles between the

c finest and coarsest grid levels. in any case, at most

c maxcy*(iprer+ipost) relaxation sweeps are performed at the

c finest grid level (see iprer=mgopt(2),ipost=mgopt(3) below)

c when multigrid iteration is working "correctly" only a few

c cycles are required for convergence. large values for maxcy

c should not be required.

c

c

c ... method = iparm(19)

c

c this sets the method of relaxation (all relaxation

c schemes in mudpack use red/black type ordering)

c

c = 0 for gauss-seidel pointwise relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in the z direction

c

c = 4 for line relaxation in the x and y direction

c

c = 5 for line relaxation in the x and z direction

c

c = 6 for line relaxation in the y and z direction

c

c = 7 for line relaxation in the x,y and z direction

c

c = 8 for x,y planar relaxation

c

c = 9 for x,z planar relaxation

c

c =10 for y,z planar relaxation

c

c *** if nxa = 0 and nx = 3 at a grid level where line relaxation in the x

c direction is flagged then it will be replaced by gauss-seidel point

c relaxation at that grid level.

c

c *** if nyc = 0 and ny = 3 at a grid level where line relaxation in the y

c direction is flagged then it will be replaced by gauss-seidel point

c relaxation at that grid level.

c

c *** if nze = 0 and nz = 3 at a grid level where line relaxation in the z

c direction is flagged then it will be replaced by gauss-seidel point

c relaxation at that grid level.

c

c these adjustments are necessary since the simultaneous tri-diagonal

c solvers used with periodic line relaxation must have n > 2 where n

c is the number of unknowns (excluding the periodic point).

c *** choice of method

c

c this is very important for efficient convergence. in some cases

c experimentation may be required.

c

c let fx represent the quantity cxx(x,y,z)/dlx**2 over the solution box

c

c let fy represent the quantity cyy(x,y,z)/dly**2 over the solution box

c

c let fz represent the quantity czz(x,y,z)/dlz**2 over the solution box

c

c (0) if fx,fy,fz are roughly the same size and do not vary too

c much choose method = 0. if this fails try method = 7.

c

c (1) if fx is much greater than fy,fz and fy,fz are roughly the same

c size choose method = 1

c

c (2) if fy is much greater than fx,fz and fx,fz are roughly the same

c size choose method = 2

c

c (3) if fz is much greater than fx,fy and fx,fy are roughly the same

c size choose method = 3

c

c (4) if fx,fy are roughly the same and both are much greater than fz

c try method = 4. if this fails try method = 8

c

c (5) if fx,fz are roughly the same and both are much greater than fy

c try method = 5. if this fails try method = 9

c

c (6) if fy,fz are roughly the same and both are much greater than fx

c try method = 6. if this fails try method = 10

c

c (7) if fx,fy,fz vary considerably with none dominating try method = 7

c

c (8) if fx and fy are considerably greater than fz but not necessarily

c the same size (e.g., fx=1000.,fy=100.,fz=1.) try method = 8

c

c (9) if fx and fz are considerably greater than fy but not necessarily

c the same size (e.g., fx=10.,fy=1.,fz=1000.) try method = 9

c

c (10) if fy and fz are considerably greater than fx but not necessarily

c the same size (e.g., fx=1.,fy=100.,fz=10.) try method = 10
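c
c *** sketch (not part of mudpack)
c
c the guidelines (0) through (10) can be turned into a first guess
c for method as sketched below. this is only an illustration: the
c function name mudmth, the use of single representative values for
c fx,fy,fz (assumed positive) and the threshold big, which stands in
c for "much greater", are all assumptions. experimentation may still
c be required.

```fortran
      integer function mudmth(fx, fy, fz, big)
! suggest method=iparm(19) from representative values of fx,fy,fz
! (mudmth and the threshold big are assumptions of this sketch)
      implicit none
      real fx, fy, fz, big
      logical sxy, sxz, syz
! "roughly the same size" is read as a ratio below big
      sxy = amax1(fx,fy) .lt. big*amin1(fx,fy)
      sxz = amax1(fx,fz) .lt. big*amin1(fx,fz)
      syz = amax1(fy,fz) .lt. big*amin1(fy,fz)
      if (sxy .and. sxz .and. syz) then
        mudmth = 0
      else if (syz .and. fx .ge. big*amax1(fy,fz)) then
        mudmth = 1
      else if (sxz .and. fy .ge. big*amax1(fx,fz)) then
        mudmth = 2
      else if (sxy .and. fz .ge. big*amax1(fx,fy)) then
        mudmth = 3
      else if (sxy .and. amin1(fx,fy) .ge. big*fz) then
        mudmth = 4
      else if (sxz .and. amin1(fx,fz) .ge. big*fy) then
        mudmth = 5
      else if (syz .and. amin1(fy,fz) .ge. big*fx) then
        mudmth = 6
      else if (amin1(fx,fy) .ge. big*fz) then
        mudmth = 8
      else if (amin1(fx,fz) .ge. big*fy) then
        mudmth = 9
      else if (amin1(fy,fz) .ge. big*fx) then
        mudmth = 10
      else
        mudmth = 7
      end if
      return
      end
```

c with big=10. the three examples quoted in (8), (9) and (10) give
c method = 8, 9 and 10 respectively.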

c

c

c ... meth2 = iparm(20) determines the method of relaxation used in the planes

c when method = 8 or 9 or 10.

c

c

c as above, let fx,fy,fz represent the quantities cxx/dlx**2,

c cyy/dly**2,czz/dlz**2 over the box.

c

c (if method = 8)

c

c = 0 for gauss-seidel pointwise relaxation

c in the x,y plane for each fixed z

c = 1 for line relaxation in the x direction

c in the x,y plane for each fixed z

c = 2 for line relaxation in the y direction

c in the x,y plane for each fixed z

c = 3 for line relaxation in the x and y direction

c in the x,y plane for each fixed z

c

c (1) if fx,fy are roughly the same and vary little choose meth2 = 0

c (2) if fx is much greater than fy choose meth2 = 1

c (3) if fy is much greater than fx choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c (if method = 9)

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

c in the x,z plane for each fixed y

c = 1 for simultaneous line relaxation in the x direction

c of the x,z plane for each fixed y

c = 2 for simultaneous line relaxation in the z direction

c of the x,z plane for each fixed y

c = 3 for simultaneous line relaxation in the x and z direction

c of the x,z plane for each fixed y

c

c (1) if fx,fz are roughly the same and vary little choose meth2 = 0

c (2) if fx is much greater than fz choose meth2 = 1

c (3) if fz is much greater than fx choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c (if method = 10)

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

c in the y,z plane for each fixed x

c = 1 for simultaneous line relaxation in the y direction

c of the y,z plane for each fixed x

c = 2 for simultaneous line relaxation in the z direction

c of the y,z plane for each fixed x

c = 3 for simultaneous line relaxation in the y and z direction

c of the y,z plane for each fixed x

c

c (1) if fy,fz are roughly the same and vary little choose meth2 = 0

c (2) if fy is much greater than fz choose meth2 = 1

c (3) if fz is much greater than fy choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c

c ... length = iparm(21)

c

c the length of the work space provided in vector work.

c

c let isx = 3 if method = 1,4,5 or 7 and nxa.ne.0

c let isx = 5 if method = 1,4,5 or 7 and nxa.eq.0

c let isx = 0 if method has any other value

c

c let jsy = 3 if method = 2,4,6 or 7 and nyc.ne.0

c let jsy = 5 if method = 2,4,6 or 7 and nyc.eq.0

c let jsy = 0 if method has any other value

c

c let ksz = 3 if method = 3,5,6 or 7 and nze.ne.0

c let ksz = 5 if method = 3,5,6 or 7 and nze.eq.0

c let ksz = 0 if method has any other value

c

c

c then (for method .le.7)

c

c (1) length = (nx+2)*(ny+2)*(nz+2)*(10+isx+jsy+ksz)

c

c or (for method.gt.7)

c

c (2) length = 14*(nx+2)*(ny+2)*(nz+2)

c

c will usually but not always suffice. The exact minimal length depends,

c in a complex way, on the grid size arguments and method chosen.

c *** It can be predetermined for the current input arguments by calling

c mud3 with length=iparm(21)=0 and printing iparm(22) or (in f90)

c dynamically allocating the work space using the value in iparm(22)

c in a subsequent mud3 call.
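c
c *** sketch (not part of mudpack)
c
c the estimates (1) and (2) can be coded as in the following sketch.
c the function name mudlen is an assumption. the value returned is
c only the estimate described above, not the exact minimal length
c that mud3 reports in iparm(22).

```fortran
      integer function mudlen(nx,ny,nz,method,nxa,nyc,nze)
! work space estimate from formulas (1) and (2) for length
! (mudlen is a hypothetical helper, not a mudpack routine)
      implicit none
      integer nx,ny,nz,method,nxa,nyc,nze,isx,jsy,ksz,m
      m = method
      isx = 0
      jsy = 0
      ksz = 0
      if (m.eq.1 .or. m.eq.4 .or. m.eq.5 .or. m.eq.7) then
        isx = 3
        if (nxa .eq. 0) isx = 5
      end if
      if (m.eq.2 .or. m.eq.4 .or. m.eq.6 .or. m.eq.7) then
        jsy = 3
        if (nyc .eq. 0) jsy = 5
      end if
      if (m.eq.3 .or. m.eq.5 .or. m.eq.6 .or. m.eq.7) then
        ksz = 3
        if (nze .eq. 0) ksz = 5
      end if
      if (m .le. 7) then
        mudlen = (nx+2)*(ny+2)*(nz+2)*(10+isx+jsy+ksz)
      else
        mudlen = 14*(nx+2)*(ny+2)*(nz+2)
      end if
      return
      end
```

c for a 33 by 65 by 97 grid with method = 0 the estimate is
c 35*67*99*10 = 2321550 words. very large grids may need a wider
c integer kind than the default assumed here.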

c

c ... fparm

c

c a floating point vector of length 8 used to efficiently

c pass floating point arguments. fparm is set internally

c in mud3 and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... ze=fparm(5), zf=fparm(6)

c

c the range of the z independent variable. ze must

c be less than zf.

c

c

c ... tolmax = fparm(7)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j,k)

c and phi2(i,j,k) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(7)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).

c

c

c ... work

c

c a one dimensional array that must be provided for work space.

c see length = iparm(21). the values in work must be preserved

c if mud3 is called again with intl=iparm(1).ne.0 or if mud34

c is called to improve accuracy.

c

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,yorz,alfa,gbdy)

c which is used to input mixed boundary conditions to mud3.

c the boundaries are numbered one thru six and the form of

c the conditions is described below.

c

c

c (1) the kbdy=1 boundary

c

c this is the (y,z) plane x=xa where nxa=iparm(2) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa(y,z)*p(xa,y,z) = gbdxa(y,z)

c

c in this case kbdy=1,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxa(y,z),gbdxa(y,z) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the (y,z) plane x=xb where nxb=iparm(3) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb(y,z)*p(xb,y,z) = gbdxb(y,z)

c

c in this case kbdy=2,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxb(y,z),gbdxb(y,z) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the (x,z) plane y=yc where nyc=iparm(4) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc(x,z)*p(x,yc,z) = gbdyc(x,z)

c

c in this case kbdy=3,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyc(x,z),gbdyc(x,z) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the (x,z) plane y=yd where nyd=iparm(5) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd(x,z)*p(x,yd,z) = gbdyd(x,z)

c

c in this case kbdy=4,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyd(x,z),gbdyd(x,z) must be returned.

c

c

c (5) the kbdy=5 boundary

c

c this is the (x,y) plane z=ze where nze=iparm(6) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfze(x,y)*p(x,y,ze) = gbdze(x,y)

c

c in this case kbdy=5,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfze(x,y),gbdze(x,y) must be returned.

c

c

c (6) the kbdy=6 boundary

c

c this is the (x,y) plane z=zf where nzf=iparm(7) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfzf(x,y)*p(x,y,zf) = gbdzf(x,y)

c

c in this case kbdy=6,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfzf(x,y),gbdzf(x,y) must be returned.

c

c

c *** alfxa,alfyc,alfze nonpositive and alfxb,alfyd,alfzf nonnegative

c will help maintain matrix diagonal dominance during discretization

c aiding convergence.

c

c *** bndyc must provide the mixed boundary condition

c values in correspondence with those flagged in iparm(2)

c thru iparm(7). if all boundaries are specified then

c mud3 will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared

c external in the routine calling mud3. the actual

c name chosen may be different.
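c
c *** sketch (not part of mudpack)
c
c a minimal example of a user supplied bndyc routine, assuming a
c single mixed condition on the kbdy=2 boundary (nxb=iparm(3)=2).
c the name bndex and the boundary data alfxb=1, gbdxb=y+z are
c assumptions chosen only for illustration.

```fortran
      subroutine bndex(kbdy,xory,yorz,alfa,gbdy)
! example bndyc routine (the name bndex and the boundary data are
! assumptions): dp/dx + p = y + z on the kbdy=2 boundary x=xb
      implicit none
      integer kbdy
      real xory,yorz,alfa,gbdy
      if (kbdy .eq. 2) then
! here xory=y and yorz=z; alfa is nonnegative at x=xb
        alfa = 1.0
        gbdy = xory + yorz
      end if
      return
      end
```

c the routine would be declared external in the calling program and
c passed to mud3 in the bndyc argument position.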

c

c

c ... coef

c

c a subroutine with arguments (x,y,z,cxx,cyy,czz,cx,cy,cz,ce)

c which provides the known real coefficients for the elliptic pde

c at any grid point (x,y,z). the actual name chosen may be different.

c in the routine calling mud3 the coefficient routine must be declared

c external.
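c
c *** sketch (not part of mudpack)
c
c a minimal example of a user supplied coef routine with constant
c coefficients. the name cofex and the coefficient values are
c assumptions chosen only for illustration.

```fortran
      subroutine cofex(x,y,z,cxx,cyy,czz,cx,cy,cz,ce)
! example coef routine (the name cofex and the coefficients are
! assumptions): constant second order terms, no first order terms
      implicit none
      real x,y,z,cxx,cyy,czz,cx,cy,cz,ce
      cxx = 1.0
      cyy = 1.0
      czz = 1.0
      cx = 0.0
      cy = 0.0
      cz = 0.0
! a nonzero ce avoids the singular case flagged by ierror = -3
      ce = -1.0
      return
      end
```

c variable coefficients would be computed from the x,y,z values
c passed in, which are ignored in this constant coefficient case.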

c

c ... rhs

c

c an array dimensioned nx by ny by nz which contains

c the given right hand side values on the uniform 3-d mesh.

c rhs(i,j,k) = r(xi,yj,zk) for i=1,...,nx and j=1,...,ny

c and k=1,...,nz.

c

c ... phi

c

c an array dimensioned nx by ny by nz . on input phi must

c contain specified boundary values and an initial guess

c to the solution if flagged (see iguess=iparm(17)=1). for

c example, if nyd=iparm(5)=1 then phi(i,ny,k) must be set

c equal to p(xi,yd,zk) for i=1,...,nx and k=1,...,nz prior to

c calling mud3. the specified values are preserved by mud3.

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at non-Dirichlet grid points (this is not

c checked). these values are projected down and serve as an initial

c guess to the pde at the coarsest grid level. set phi to 0.0 at

c non-Dirichlet grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below).

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4
c                              * .               .
c                             *   .             .
c ---------------2-----------1-----2-----------1--------- level 3
c               * .         .       .         .
c              *   .       .         .       .
c ------2-----1-----2-----1-----------2-----1------------ level 2
c      * .   .       .   .             .   .
c     *   . .         . .               . .
c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4
c                        * .                         .
c ----------2-----------1---2-----------3-----------1---- level 3
c          * .         .     .         . .         .
c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2
c    * . .     . . . .         . . . .     . . . .
c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
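c
c *** sketch (not part of mudpack)
c
c the control flow of algorithm cgc can be mirrored by a recursive
c routine that only tallies relaxation sweeps at each level instead
c of relaxing. the name cgccnt and the tally array nswp are
c assumptions of this sketch. mudpack itself implements the cycling
c non-recursively.

```fortran
      recursive subroutine cgccnt(k, kcycle, iprer, ipost, nswp)
! tally the relaxation sweeps done at each level by one call of
! algorithm cgc at level k (cgccnt and nswp are hypothetical)
      implicit none
      integer k, kcycle, iprer, ipost, nswp(*)
      integer kount
! pre-relaxation sweeps at level k
      nswp(k) = nswp(k) + iprer
      if (k .gt. 1) then
! the residual would be restricted to level k-1 here
        do kount = 1, kcycle
          call cgccnt(k-1, kcycle, iprer, ipost, nswp)
        end do
! the correction would be prolonged back to level k here
      end if
! post-relaxation sweeps at level k
      nswp(k) = nswp(k) + ipost
      return
      end
```

c with nswp set to zero, four levels, kcycle=2, iprer=2 and ipost=1
c the tallies at levels 1 through 4 are 24, 12, 6 and 3. these agree
c with the final w(2,1) cycle in the schedule depicted above.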

c

c ***********************************************************************

c ****output arguments**************************************************

c ***********************************************************************

c

c

c ... iparm(22)

c

c on output iparm(22) contains the actual work space length

c required for the current grid sizes and method. This value

c will be computed and returned even if iparm(21) is less than

c iparm(22) (see ierror=9).

c

c

c ... iparm(23)

c

c if error control is selected (tolmax = fparm(7) .gt. 0.0) then

c on output iparm(23) contains the actual number of cycles executed

c between the coarsest and finest grid levels in obtaining the

c approximation in phi. the quantity (iprer+ipost)*iparm(23) is

c the number of relaxation sweeps performed at the finest grid level.

c

c

c ... fparm(8)

c

c on output fparm(8) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(8) is computed only if there is error control (tolmax.gt.0.)

c assume phi1(i,j,k) and phi2(i,j,k) are the last two computed

c values for phi(i,j,k) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then

c

c fparm(8) = phdif/phmax

c

c is returned whenever phmax.gt.0.0. in the degenerate case

c phmax = 0.0, fparm(8) = phdif is returned.
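c
c *** sketch (not part of mudpack)
c
c the quantity returned in fparm(8) can be reproduced from two
c iterates as sketched below. the function name reldif is an
c assumption, and the iterates are treated as vectors of length
c n = nx*ny*nz for simplicity.

```fortran
      real function reldif(phi1, phi2, n)
! maximum relative difference phdif/phmax between two iterates
! stored as vectors of length n=nx*ny*nz (reldif is hypothetical)
      implicit none
      integer n, i
      real phi1(n), phi2(n), phdif, phmax
      phdif = 0.0
      phmax = 0.0
      do i = 1, n
        phdif = amax1(phdif, abs(phi2(i)-phi1(i)))
        phmax = amax1(phmax, abs(phi2(i)))
      end do
! degenerate case phmax = 0.0: return phdif itself
      reldif = phdif
      if (phmax .gt. 0.0) reldif = phdif/phmax
      return
      end
```

c convergence with error control corresponds to reldif < tolmax.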

c

c

c

c ... work

c

c on output work contains intermediate values that must not be

c destroyed if mud3 is to be called again with iparm(1)=1 or

c if mud34 is to be called to improve the estimate to fourth

c order.

c

c ... phi

c

c on output phi(i,j,k) contains the approximation to

c p(xi,yj,zk) for all mesh points i=1,...,nx; j=1,...,ny;

c k=1,...,nz. the last computed iterate in phi is returned

c even if convergence is not obtained (ierror=-1)

c

c ... ierror

c

c For intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. Argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.
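c

c ... example (illustrative sketch only, not part of mudpack)

c the free-form fortran 90 program below is a minimal sketch of the

c ierror = -4 test and adjustment in the x direction. the values of

c xa,xb,nx,cxx,cx are hypothetical and chosen only so that the test

c fires (50.*0.1 = 5. > 2.). max is used as the generic form of amax1.

```fortran
program hyperbolic_check
  implicit none
  ! hypothetical region, grid size and pde coefficients at one grid point
  real :: xa, xb, cxx, cx, dlx
  integer :: nx

  xa = 0.0
  xb = 1.0
  nx = 11
  cxx = 1.0
  cx = 50.0

  ! fine grid increment, as defined in the documentation
  dlx = (xb - xa)/real(nx - 1)

  ! condition flagged by ierror = -4 (x direction only)
  if (abs(cx)*dlx > 2.0*abs(cxx)) then
    print *, 'first order term dominates: ierror = -4 would be set'
    ! first order adjustment described in the documentation
    cxx = max(cxx, 0.5*abs(cx)*dlx)
    print *, 'adjusted cxx =', cxx
  else
    print *, 'no adjustment needed'
  end if
end program hyperbolic_check
```

c with these values the adjustment raises cxx from 1.0 to about 2.5.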

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj)).

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(7)>0.

c is not obtained in maxcy=iparm(18) multigrid cycles between the

c coarsest (ixp+1,jyq+1,kzr+1) and finest (nx,ny,nz) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd,nze,nzf

c in iparm(2) through iparm(7) is not 0, 1 or 2 or if

c (nxa,nxb) or (nyc,nyd) or (nze,nzf) are not pairwise zero.

c

c = 3 if min0(ixp,jyq,kzr) < 2 (ixp=iparm(8),jyq=iparm(9),kzr=iparm(10))

c

c = 4 if min0(iex,jey,kez) < 1 (iex=iparm(11),jey=iparm(12),kez=iparm(13))

c or if max0(iex,jey,kez) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or if ny.ne.jyq*2**(jey-1)+1 or

c if nz.ne.kzr*2**(kez-1)+1 (nx=iparm(14),ny=iparm(15),nz=iparm(16))

c

c = 6 if iguess = iparm(17) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(18) < 1 (large values for maxcy should not be used)

c

c = 8 if method = iparm(19) is less than 0 or greater than 10 or

c if meth2 = iparm(20) is not 0 or 1 or 2 or 3 when method > 7.

c

c = 9 if length = iparm(21) is too small (see iparm(22) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd or ze >= zf

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4),ze=fparm(5),zf=fparm(6))

c

c =11 if tolmax = fparm(7) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3
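c

c ... example (illustrative sketch only, not part of mudpack)

c the grid size constraints behind ierror = 3,4,5 can be sketched as

c follows in free-form fortran 90. the values of ixp,jyq,kzr and

c iex,jey,kez are hypothetical. they give nx = 33, ny = 25, nz = 33.

```fortran
program grid_size_check
  implicit none
  integer :: ixp, jyq, kzr, iex, jey, kez, nx, ny, nz

  ! hypothetical coarse grid parameters and exponents
  ixp = 2
  jyq = 3
  kzr = 2
  iex = 5
  jey = 4
  kez = 5

  ! conditions that would produce ierror = 3 and ierror = 4
  if (min(ixp, jyq, kzr) < 2) print *, 'ierror = 3 would be returned'
  if (min(iex, jey, kez) < 1 .or. max(iex, jey, kez) > 50) &
    print *, 'ierror = 4 would be returned'

  ! fine grid sizes that satisfy the ierror = 5 constraint
  nx = ixp*2**(iex - 1) + 1
  ny = jyq*2**(jey - 1) + 1
  nz = kzr*2**(kez - 1) + 1
  print *, 'nx, ny, nz =', nx, ny, nz
end program grid_size_check
```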

c

c *********************************************************

c *********************************************************

c

c end of mud3 documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file mud34.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud34.d

c

c contains documentation for subroutine mud34(work,phi,ierror)

c A sample fortran driver is file "tmud34.f".

c

c ... required MUDPACK files

c

c mud3.f, mudcom.f, mud3ln.f, mud3pn.f

c

c ... purpose

c

c mud34 attempts to improve the estimate in phi, obtained by calling

c mud3, from second to fourth order accuracy. see the file "mud3.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "work,phi" which are also part of the argument list for

c mud3.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier mud3 call

c

c * arguments "work,phi" are the same used in calling mud3

c

c * "work,phi" have not changed since the last call to mud3

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections".

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.
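c

c ... example (illustrative sketch only, not part of mudpack)

c the idea can be illustrated in one dimension (this is not the mud34

c code). the central second difference d2 approximates u'' with leading

c truncation error (h**2/12)*u''''. estimating u'''' with a second-order

c five point formula and subtracting the error term improves d2. the

c test function u = x**4 and the values of x,h are hypothetical.

```fortran
program deferred_correction_1d
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x, h, d2, d4, corrected

  x = 1.0_dp
  h = 0.1_dp

  ! second-order central difference for u''
  d2 = (u(x - h) - 2.0_dp*u(x) + u(x + h))/h**2

  ! second-order five point difference for u''''
  d4 = (u(x - 2.0_dp*h) - 4.0_dp*u(x - h) + 6.0_dp*u(x) &
        - 4.0_dp*u(x + h) + u(x + 2.0_dp*h))/h**4

  ! remove the estimated leading truncation error from d2
  corrected = d2 - (h**2/12.0_dp)*d4

  print *, 'second-order estimate of u''''(x) :', d2
  print *, 'corrected estimate            :', corrected
  print *, 'exact value 12*x**2           :', 12.0_dp*x**2

contains

  ! hypothetical test function
  function u(t) result(v)
    real(dp), intent(in) :: t
    real(dp) :: v
    v = t**4
  end function u

end program deferred_correction_1d
```

c for these values d2 is approximately 12.02 and the corrected estimate

c is approximately 12.0, the exact second derivative at x = 1.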

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comput., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comput., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny,nz) < 6 where nx,ny,nz are the fine grid sizes

c in the x,y,z directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of mud34 documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file mud34sp.d

c

c

c ... file mud34sp.d

c

c contains documentation for subroutine mud34sp(work,phi,ierror)

c A sample fortran driver is file "tmud34sp.f".

c

c ... required MUDPACK files

c

c mud3sp.f, mudcom.f

c

c ... purpose

c

c mud34sp attempts to improve the estimate in phi, obtained by calling

c mud3sp, from second to fourth order accuracy. see the file "mud3sp.d"

c for a detailed discussion of the elliptic pde approximated and

c arguments "work,phi" which are also part of the argument list for

c mud3sp.

c

c ... assumptions

c

c * phi contains a second-order approximation from an earlier mud3sp call

c

c * arguments "work,phi" are the same used in calling mud3sp

c

c * "work,phi" have not changed since the last call to mud3sp

c

c * the finest grid level contains at least 6 points in each direction

c

c

c *** warning

c

c if the first assumption is not true then a fourth order approximation

c cannot be produced in phi. the last assumption (adequate grid size)

c is the only one checked. failure in any of the others can result in

c an undetectable error.

c

c ... language

c

c fortran90/fortran77

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections".

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comput., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comput., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... error argument

c

c = 0 if no error is detected

c

c = 30 if min0(nx,ny,nz) < 6 where nx,ny,nz are the fine grid sizes

c in the x,y,z directions.

c

c

c ***********************************************************************

c ***********************************************************************

c

c end of mud34sp documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file mud3cr.d

c

c

c ... file mud3cr.d

c

c contains documentation for:

c subroutine mud3cr(iparm,fparm,work,coef,bnd3cr,rhs,phi,mgopt,

c +icros,crsxy,crsxz,crsyz,tol,maxit,iouter,rmax,ierror)

c A sample fortran driver is file "tmud3cr.f".

c

c ... required MUDPACK files

c

c mudcom.f

c

c

c ... purpose

c

c subroutine mud3cr automatically discretizes and attempts to compute

c the second order finite difference approximation to a three-

c dimensional linear nonseparable elliptic partial differential

c equation with cross derivative terms on a box. the approximation

c is generated on a uniform grid covering the box (see mesh description

c below). boundary conditions may be any combination of oblique mixed

c derivative (see bnd3cr description below), specified (Dirichlet) or

c periodic. the form of the pde in operator notation is

c

c l(p) + lxyz(p) = r(x,y,z)

c

c where

c

c l(p) = cxx(x,y,z)*pxx + cyy(x,y,z)*pyy + czz(x,y,z)*pzz +

c

c cx(x,y,z)*px + cy(x,y,z)*py + cz(x,y,z)*pz +

c

c ce(x,y,z)*p(x,y,z)

c

c and

c

c lxyz(p) = cxy(x,y,z)*pxy + cxz(x,y,z)*pxz + cyz(x,y,z)*pyz

c

c here cxx,cyy,czz,cx,cy,cz,ce,cxy,cxz,cyz are the known real

c coefficients of the pde; pxx,pyy,pzz,px,py,pz are the second and

c first partial derivatives of the unknown solution function p(x,y,z)

c with respect to the independent variables x,y,z; pxy,pxz, and pyz

c are the second order mixed partial derivatives of p with respect

c to xy,xz, and yz. r(x,y,z) is the known right hand side of the pde.

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny by nz grid.

c the grid is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd] x [ze,zf].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1), dlz = (zf-ze)/(nz-1)

c

c be the uniform grid increments in the x,y,z directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly, zk = ze+(k-1)*dlz

c

c for i=1,...,nx; j=1,...,ny; k=1,...,nz denote the x,y,z uniform

c mesh points.
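c

c ... example (illustrative sketch only, not part of mudpack)

c the mesh definition above can be sketched in free-form fortran 90 as

c follows. the region limits and grid sizes are hypothetical.

```fortran
program mesh_points
  implicit none
  ! hypothetical solution region [xa,xb] x [yc,yd] x [ze,zf]
  real, parameter :: xa = 0.0, xb = 1.0
  real, parameter :: yc = 0.0, yd = 2.0
  real, parameter :: ze = -1.0, zf = 1.0
  ! hypothetical grid sizes
  integer, parameter :: nx = 5, ny = 9, nz = 5
  real :: dlx, dly, dlz
  integer :: i

  ! uniform grid increments
  dlx = (xb - xa)/real(nx - 1)
  dly = (yd - yc)/real(ny - 1)
  dlz = (zf - ze)/real(nz - 1)
  print *, 'dlx, dly, dlz =', dlx, dly, dlz

  ! x mesh points xi = xa + (i-1)*dlx (y and z are analogous)
  do i = 1, nx
    print *, 'i =', i, ' xi =', xa + real(i - 1)*dlx
  end do
end program mesh_points
```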

c

c

c

c ... methods

c

c

c subroutine mud3cr is a recent addition to mudpack. details

c of the methods employed by the other solvers in mudpack are in

c [1,9,10]. [1,2,7,9,10] contain performance measurements on a variety

c of elliptic pdes (see "references" in the file "readme"). the multi-

c grid methods are described in documentation for the other solvers.

c *** mud3cr differs fundamentally from the other solvers in mudpack.

c the full pde including cross derivative terms is discretized on

c the INTERIOR of the solution region:

c

c xa < x < xb, yc < y < yd, ze < z < zf

c

c however, on nonspecified (non-Dirichlet) boundaries only l(p) is

c discretized and the cross derivative term lxyz(p) is moved to the

c right hand side of the pde and approximated by second order

c finite difference formulas applied to a previous estimate in p(k-1).

c similarly, oblique mixed derivative boundary conditions (see bnd3cr)

c are converted to a "mud3" type mixed normal form using second-order

c finite difference formula applied to a previous estimate p(k-1) to

c approximate non-normal derivative components. for example if

c the mixed derivative condition

c

c py + a(x,z)*px + b(x,z)*pz + c(x,z)*p(x,yd,z) = gyd(x,z)

c

c is specified on the (x,z) plane of the upper y=yd boundary (see

c bnd3cr for kbdy=4 below) then mud3cr converts this to the mixed

c normal derivative form

c

c py + c(x,z)*p(x,yd,z) = h(k,x,z)

c

c where the modified right hand side h(k,x,z) is given by

c

c h(k,x,z) = gyd(x,z) - [a(x,z)*dx(p(k-1)) + b(x,z)*dz(p(k-1))].

c

c dx(p(k-1)) and dz(p(k-1)) are second order finite difference

c approximations to the nonnormal partial derivatives px,pz using the

c previous estimate in p(k-1).

c

c the result of full discretization on interior grid points and partial

c discretization with right hand side modifications on boundaries,

c is a linear system which we denote by

c

c D(p(k)) = r - Dxyz(p(k-1)).

c

c D is the coefficient matrix coming from the discretization and

c Dxyz(p(k-1)) stands for the right hand side modification obtained

c by approximating boundary cross derivative terms and/or nonnormal

c derivative components from mixed derivative boundary conditions

c with second order finite difference formulas applied to p(k-1).

c with this notation, we formally describe the outer iteration employed

c by mud3cr:

c

c algorithm mud3cr

c .

c set k = 0

c .

c set p(0) = 0.0 for all nonspecified grid points

c .

c repeat

c

c .. k = k+1

c

c .. solve D(p(k)) = r - Dxyz(p(k-1)) using multigrid iteration

c

c .. set rmax(k) = ||p(k) - p(k-1)|| / ||p(k)||

c

c until (rmax(k) < tol or k = maxit)

c .

c end mud3cr

c

c tol is an error tolerance for convergence and maxit is a limit on

c the number of outer iterations. both are user prescribed input

c arguments to mud3cr. the maximum vector norm || || is used in

c computing the relative difference between successive estimates in

c rmax(k). large values for maxit should not be used.
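c

c ... example (illustrative sketch only, not part of mudpack)

c the structure of the outer iteration can be sketched on a small toy

c system in free-form fortran 90. here a diagonal matrix d stands in for

c the discretized operator D, the matrix c for the lagged terms Dxyz,

c and a direct division replaces the multigrid solve. all values and

c names are hypothetical. only the loop structure and the rmax formula

c follow the algorithm above.

```fortran
program outer_iteration_sketch
  implicit none
  integer, parameter :: n = 3, maxit = 20
  real, parameter :: tol = 1.0e-5
  real :: d(n), c(n,n), r(n), p(n), pold(n), rmax(maxit)
  integer :: k, iouter

  ! hypothetical toy data: d is the part kept on the left hand side,
  ! c is the part moved to the right hand side and lagged
  d = (/ 4.0, 4.0, 4.0 /)
  c = 0.0
  c(1,2) = 1.0
  c(2,1) = 1.0
  c(2,3) = 1.0
  c(3,2) = 1.0
  r = (/ 1.0, 2.0, 3.0 /)

  ! set p(0) = 0.0
  p = 0.0
  rmax = 0.0
  iouter = 0

  do k = 1, maxit
    pold = p
    ! stand-in for "solve D(p(k)) = r - Dxyz(p(k-1)) using multigrid"
    p = (r - matmul(c, pold))/d
    ! relative difference between successive estimates (maximum norm)
    rmax(k) = maxval(abs(p - pold))/maxval(abs(p))
    iouter = k
    if (rmax(k) < tol) exit
  end do

  print *, 'outer iterations executed (iouter) =', iouter
  print *, 'last relative difference rmax      =', rmax(iouter)
  print *, 'approximation p                    =', p
end program outer_iteration_sketch
```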

c

c *** note

c

c originally a code, mud3cr0, was designed by moving all cross terms

c to the right hand side and solving

c

c l(p(k)) = r - lxyz(p(k-1)) for k=1,...

c

c over the entire solution region including the interior. in this

c relatively straightforward approach, the standard mudpack solver

c mud3 is used iteratively at each step. however, convergence with

c mud3cr0 is slow and unreliable. an attempt was then made to

c discretize the complete pde including cross terms and boundary

c conditions over the entire region like the other solvers in mudpack.

c undoubtedly, this would have the most efficient and robust convergence

c properties. the main difficulty with this approach is the

c unmanageable code complexity required to discretize all possible

c combinations of nonzero cross derivative terms and oblique derivative

c conditions. for example, in the presence of nonzero cross terms,

c corners which are at the intersections of oblique derivative boundary

c conditions become too complex (for this person) to discretize.

c detection of combinations of oblique derivative conditions at corners

c which are singular (i.e., for which discretization leads to division

c by 0.0) is a logical nightmare.

c

c the present mud3cr is a middle ground between these two approaches.

c it is an attempt to include as much of the pde cross terms as possible

c in the discretization while bypassing the code complexity required to

c include all possible boundary situations in the discretization.

c

c by including the (manageable) discretization of cross terms on the

c interior, the unfavorable convergence properties of the first approach

c are (hopefully) avoided and the favorable convergence properties

c of the second approach are (hopefully) obtained. by moving nonzero

c boundary cross terms and nonnormal components of mixed derivative

c boundary conditions to the right hand side, the problems of too much

c code complexity with discretization in the second approach are bypassed.

c

c extensive numerical testing indicates that for most problems mud3cr is

c more robust than mud3cr0. however, convergence is less reliable and more

c computationally expensive than with the other solvers in mudpack.

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c

c ... references (partial list)

c

c for a complete list see "references" in the mudpack information and

c directory file "readme"

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., 1991, vol. 43, May 1991, pp. 79-94.

c

c [10]J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., 1993, vol. 53, February

c 1993, pp. 235-249

c .

c .

c .

c

c **********************************************************************

c *** arguments ********************************************************

c **********************************************************************

c

c arguments iparm,fparm,work,rhs,phi,coef,mgopt are the same as

c those input to mud3 (see mud3.d for a detailed description) with the

c following provisions:

c

c (1) the minimum required work space length for mud3cr is increased

c by approximately

c

c nx*ny*nz*(1+8*(icros(1)+icros(2)+icros(3))/7 +

c

c 2*(icros(1)+icros(2)+icros(3))*(nx*ny+nx*nz+ny*nz)

c

c words over the minimum work space required by mud3 (see icros

c description below). the exact minimal work space required

c by mud3cr for the current set of input arguments is output

c in iparm(22). the exact minimal work length required

c for the current method and grid size arguments can be

c predetermined by calling mud3cr with iparm(21)=0 and

c printing iparm(22) or (in fortran 90 codes) by dynamically

c allocating work using the value in iparm(22) in subsequent

c calls to mud3cr.

c

c (2) at least two calls to mud3cr are necessary to generate an

c approximation. intl=iparm(1)=0 is required on the first

c call. this call will do "once only" discretization, and

c set intermediate values in work which must be preserved

c for noninitial calls.

c

c (3) maxcy = iparm(18) must be 1 or 2 (see ierror = 7).

c

c (4) tolmax = fparm(7) = 0.0 is required. no "internal" error control

c is allowed within multigrid cycling (see mud3.d)

c

c (5) mgopt(1) = 0 is required. only the default multigrid

c options (W(2,1) cycles with cubic prolongation) can be used

c with mud3cr

c

c *** new arguments

c

c the arguments: bnd3cr,icros,crsxy,crsxz,crsyz,tol,maxit,iouter,rmax

c are all new to mud3cr. the error argument, ierror, has been expanded.

c these are all described below:

c

c

c ... bnd3cr(kbdy,xory,yorz,a,b,c,g)

c

c a subroutine with input arguments kbdy,xory,yorz and output

c arguments a,b,c,g. bnd3cr inputs OBLIQUE mixed derivative

c conditions at any of the six x,y,z boundaries to mud3cr as

c described below:

c

c (1) the kbdy=1 boundary

c

c this is the (y,z) plane x=xa where nxa=iparm(2)=2 flags

c an oblique mixed boundary condition of the form

c

c px + axa(y,z)*py + bxa(y,z)*pz +cxa(y,z)*p(xa,y,z) = gxa(y,z)

c

c in this case kbdy=1,xory=y,yorz=z will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients axa(y,z),bxa(y,z),

c cxa(y,z),gxa(y,z) must be returned

c

c

c (2) the kbdy=2 boundary

c

c this is the (y,z) plane x=xb where nxb=iparm(3)=2 flags

c an oblique mixed boundary condition of the form

c

c px + axb(y,z)*py + bxb(y,z)*pz +cxb(y,z)*p(xb,y,z) = gxb(y,z)

c

c in this case kbdy=2,xory=y,yorz=z will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients axb(y,z),bxb(y,z),

c cxb(y,z),gxb(y,z) must be returned

c

c (3) the kbdy=3 boundary

c

c this is the (x,z) plane y=yc where nyc=iparm(4)=2 flags

c an oblique mixed boundary condition of the form

c

c py + ayc(x,z)*px + byc(x,z)*pz +cyc(x,z)*p(x,yc,z) = gyc(x,z)

c

c in this case kbdy=3,xory=x,yorz=z will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients ayc(x,z),byc(x,z),

c cyc(x,z),gyc(x,z) must be returned

c

c

c (4) the kbdy=4 boundary

c

c this is the (x,z) plane y=yd where nyd=iparm(5)=2 flags

c an oblique mixed boundary condition of the form

c

c py + ayd(x,z)*px + byd(x,z)*pz +cyd(x,z)*p(x,yd,z) = gyd(x,z)

c

c in this case kbdy=4,xory=x,yorz=z will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients ayd(x,z),byd(x,z),

c cyd(x,z),gyd(x,z) must be returned

c

c (5) the kbdy=5 boundary

c

c this is the (x,y) plane z=ze where nze=iparm(6)=2 flags

c an oblique mixed boundary condition of the form

c

c pz + aze(x,y)*px + bze(x,y)*py + cze(x,y)*p(x,y,ze) = gze(x,y)

c

c in this case kbdy=5,xory=x,yorz=y will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients aze(x,y),bze(x,y),

c cze(x,y),gze(x,y) must be returned

c

c (6) the kbdy=6 boundary

c

c this is the (x,y) plane z=zf where nzf=iparm(7)=2 flags

c an oblique mixed boundary condition of the form

c

c pz + azf(x,y)*px + bzf(x,y)*py + czf(x,y)*p(x,y,zf) = gzf(x,y)

c

c in this case kbdy=6,xory=x,yorz=y will be input to bnd3cr and

c a,b,c,g corresponding to the known coefficients azf(x,y),bzf(x,y),

c czf(x,y),gzf(x,y) must be returned

c

c

c bnd3cr must be declared "external" in the routine calling mud3cr

c where its name may be different. bnd3cr must be entered as a

c dummy subroutine even if there are no derivative boundary conditions.

c for an example of how to set up a subroutine to input derivative

c boundary conditions, see the test program tmud3cr.f
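c

c ... example (illustrative sketch only, not part of mudpack)

c a boundary routine with the bnd3cr argument list might look like the

c free-form fortran 90 sketch below. the name bndcr, the coefficient

c formulas and the small driver are hypothetical. the driver only calls

c the routine directly to show the calling sequence. in real use the

c routine name would be passed to mud3cr (see tmud3cr.f).

```fortran
program bnd3cr_demo
  implicit none
  external bndcr
  real :: a, b, c, g

  ! request the coefficients on the kbdy=4 boundary (y=yd) at a
  ! hypothetical point x=0.5, z=0.25
  call bndcr(4, 0.5, 0.25, a, b, c, g)
  print *, 'a, b, c, g =', a, b, c, g
end program bnd3cr_demo

subroutine bndcr(kbdy, xory, yorz, a, b, c, g)
  implicit none
  integer, intent(in) :: kbdy
  real, intent(in) :: xory, yorz
  real, intent(out) :: a, b, c, g
  real :: x, z

  ! default values for boundaries without oblique conditions
  a = 0.0
  b = 0.0
  c = 0.0
  g = 0.0

  if (kbdy == 4) then
    ! on y=yd the inputs are xory=x and yorz=z
    x = xory
    z = yorz
    ! hypothetical coefficients of
    ! py + a*px + b*pz + c*p(x,yd,z) = g
    a = x
    b = z
    c = 1.0
    g = x + z
  end if
end subroutine bndcr
```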

c

c ... icros

c

c an integer vector argument dimensioned 3 which flags the presence

c or absence of cross derivative terms in the pde as follows:

c

c icros(1) = 1 if cxy(x,y,z) is nonzero for any grid point (x,y,z)

c icros(1) = 0 if cxy(x,y,z) = 0.0 for all grid points (x,y,z)

c

c icros(2) = 1 if cxz(x,y,z) is nonzero for any grid point (x,y,z)

c icros(2) = 0 if cxz(x,y,z) = 0.0 for all grid points (x,y,z)

c

c icros(3) = 1 if cyz(x,y,z) is nonzero for any grid point (x,y,z)

c icros(3) = 0 if cyz(x,y,z) = 0.0 for all grid points (x,y,z)

c

c

c ... crsxy(x,y,z,cxy)

c

c if icros(1) = 1 then crsxy is a subroutine with arguments

c (x,y,z,cxy) which supplies the xy cross derivative coefficient

c cxy at the grid point (x,y,z). if icros(1) = 0 then crsxy

c is a dummy subroutine argument (i.e., it must be provided but

c will not be invoked).

c

c

c ... crsxz(x,y,z,cxz)

c

c if icros(2) = 1 then crsxz is a subroutine with arguments

c (x,y,z,cxz) which supplies the xz cross derivative coefficient

c cxz at the grid point (x,y,z). if icros(2) = 0 then crsxz

c is a dummy subroutine argument (i.e., it must be provided but

c will not be invoked).

c

c

c ... crsyz(x,y,z,cyz)

c

c if icros(3) = 1 then crsyz is a subroutine with arguments

c (x,y,z,cyz) which supplies the yz cross derivative coefficient

c cyz at the grid point (x,y,z). if icros(3) = 0 then crsyz

c is a dummy subroutine argument (i.e., it must be provided but

c will not be invoked).

c

c crsxy,crsxz,crsyz must be declared "external" in the routine

c calling mud3cr. the names chosen for these routines can be

c different (see tmud3cr.f for an example)
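c

c ... example (illustrative sketch only, not part of mudpack)

c the free-form fortran 90 sketch below shows one possible pairing of

c icros with coefficient routines when only the xy cross term is

c present. the names cofxy and cofdum and the coefficient formula are

c hypothetical. the driver calls cofxy directly only to show the

c argument list. in real use cofxy,cofdum,cofdum would be passed to

c mud3cr in place of crsxy,crsxz,crsyz.

```fortran
program cross_terms_demo
  implicit none
  external cofxy, cofdum
  integer :: icros(3)
  real :: cxy

  ! only cxy is nonzero, so crsxz and crsyz can be dummy routines
  icros = (/ 1, 0, 0 /)

  ! evaluate the xy cross coefficient at a hypothetical grid point
  call cofxy(0.5, 0.5, 0.5, cxy)
  print *, 'icros =', icros
  print *, 'cxy   =', cxy
end program cross_terms_demo

subroutine cofxy(x, y, z, cxy)
  implicit none
  real, intent(in) :: x, y, z
  real, intent(out) :: cxy
  ! hypothetical xy cross derivative coefficient
  cxy = x*y*z
end subroutine cofxy

subroutine cofdum(x, y, z, cdum)
  implicit none
  real, intent(in) :: x, y, z
  real, intent(out) :: cdum
  ! dummy routine: must exist but is not invoked when icros is 0
  cdum = 0.0
end subroutine cofdum
```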

c

c ... tol

c

c tol is an error control argument for the outer iteration employed

c by mud3cr (see "methods" description above). if tol > 0.0 is input

c then tol is a relative error tolerance for convergence. the outer

c iteration terminates and convergence is deemed to have occurred at the

c k(th) iterate if the maximum relative difference, rmax(k), satisfies

c

c rmax(k) = ||p(k) - p(k-1)|| / ||p(k)|| < tol.

c

c the last approximation p(maxit) is returned in phi even if

c convergence does not occur. the maximum norm || || is used.

c when tol = 0.0 is input, error control is not implemented and

c exactly maxit (see below) outer iterations are executed in mud3cr.

c the tol = 0.0 option eliminates unnecessary computation when

c the user is certain of the required value for maxit.

c

c

c ... maxit

c

c a limit on the outer iteration loop (see "method" description)

c used to approximate the 3-d pde with cross derivative terms when

c tol > 0.0. if tol = 0.0 is entered then exactly maxit outer

c iterations are performed and only rmax(maxit) is computed. the

c total number of relaxation sweeps performed at the finest grid

c level is bounded by 3*maxcy*maxit. large values for maxit should

c not be used.

c

c

c ***********************************************************************

c ****output arguments**************************************************

c ***********************************************************************

c

c

c ... iparm(22)

c

c on output iparm(22) contains the actual work space length

c required by mud3cr for the current grid sizes and method.

c this will be approximately

c nx*ny*nz*(1+8*(icros(1)+icros(2)+icros(3))/7 +

c

c 2*(icros(1)+icros(2)+icros(3))*(nx*ny+nx*nz+ny*nz)

c

c words longer than the space required by mud3 (see mud3.d)

c

c

c ... work

c

c on output work contains intermediate values that must not be

c destroyed if mud3cr is to be called again with iparm(1)=1

c and iparm(17)=1.

c

c

c ... phi

c

c on output phi(i,j,k) contains the approximation to

c p(xi,yj,zk) for all mesh points i=1,...,nx; j=1,...,ny;

c k=1,...,nz. the last computed iterate in phi is returned

c even if convergence is not obtained.

c

c

c ... iouter

c

c the number of outer iterations (see "method" description above)

c executed by mud3cr for the current call. maxit is an upper bound

c for iouter

c

c

c ... rmax (see tol,maxit descriptions above)

c

c a maxit dimensioned real vector. if tol > 0.0 is input then

c rmax(k) for k=1,...,iouter contain the maximum relative

c difference between successive estimates. rmax(k) is

c given by

c

c rmax(k) = ||p(k) - p(k-1)||/ ||p(k)||

c

c for k=1,...,iouter. the maximum norm || || is used. either

c iouter < maxit (convergence) or iouter = maxit is possible.

c if tol = 0.0 is input then exactly maxit outer iterations are

c executed and only rmax(maxit) is computed. in this case

c rmax(1),...,rmax(maxit-1) are set to 0.0. the tol = 0.0

c option eliminates unnecessary computation when the user is

c certain of the required value for maxit.

c

c

c ... ierror

c

c an integer error argument which indicates fatal errors when

c returned positive. the negative values -5,-4,-3,-2,-1 and

c ierror = 2,3,4,5,6,9,10 have the same meaning as described

c for mud3 (see mud3.d). in addition:

c

c *** new nonfatal error

c

c ierror = -10 if tol > 0.0 is input (error control) and convergence

c fails in maxit outer iterations. in this case the latest

c approximation p(maxit) is returned in phi (mud3cr can be recalled

c with iparm(1)=iparm(17)=1 to improve the approximation as long

c as all other arguments are unchanged)

c

c *** new fatal errors

c

c ... ierror

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls or if intl=0 and iguess=iparm(17)=1

c

c = 7 if maxcy = iparm(18) is not 1 or 2

c

c = 8 if method = iparm(19) is less than 0 or greater than 7.

c mud3cr does not allow planar relaxation. meth2=iparm(20)

c is not used or checked.

c

c =11 if tolmax = fparm(7) is not 0.0

c

c =12 if kcycle = mgopt(1) is not 0

c

c =13 if icros(1) or icros(2) or icros(3) is not 0 or 1

c

c =14 if tol < 0.0

c

c =15 if maxit < 1

c

c ***********************************************************************

c ***********************************************************************

c

c end of mud3cr documentation

c

c ***********************************************************************

c ***********************************************************************

c

c

c file mud3sa.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud3sa.d

c

c contains documentation for:

c subroutine mud3sa(iparm,fparm,work,sgx,sgy,sgz,xlmbda,bndyc,rhs,phi,

c + mgopt,ierror)

c A sample fortran driver is file "tmud3sa.f".

c

c ... required MUDPACK files

c

c mudcom.f, mud3ln.f, mud3pn.f

c

c ... purpose

c

c subroutine mud3sa automatically discretizes and attempts to compute

c the second order conservative finite difference approximation

c to a three dimensional linear nonseparable "self adjoint" elliptic

c partial differential equation on a rectangle. the approximation

c is generated on a uniform grid covering the rectangle. boundary

c conditions may be specified (Dirichlet), periodic, or mixed.

c the form of the pde solved is:

c

c d(sgx(x,y,z)*dp/dx)/dx + d(sgy(x,y,z)*dp/dy)/dy +

c

c d(sgz(x,y,z)*dp/dz)/dz - xlmbda(x,y,z)*p(x,y,z) = r(x,y,z)

c

c where sgx(x,y,z),sgy(x,y,z),sgz(x,y,z) (all positive), xlmbda(x,y,z)

c (nonnegative), r(x,y,z) (the given right hand side) and p(x,y,z) (the

c unknown solution function) are all real valued functions of the real

c independent variables x,y,z. the use of the variable names "x,y,z"

c does not imply the cartesian coordinate system underlies the pde.

c for example, any pde in divergence form in cartesian coordinates can

c be put in a self-adjoint form suitable for mud3sa after a curvilinear

c coordinate transform (see tmud3sa.f)

c

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny by nz grid.

c the grid is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd] x [ze,zf].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1), dlz = (zf-ze)/(nz-1)

c

c be the uniform grid increments in the x,y,z directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly, zk = ze+(k-1)*dlz

c

c for i=1,...,nx; j=1,...,ny; k=1,...,nz denote the x,y,z uniform

c mesh points.
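c
c *** illustration (a minimal sketch, not part of mudpack)
c
c the stand-alone program below evaluates the grid increments and the
c coordinates of one mesh point from the formulas above. the program
c name meshpt, the solution region and the grid size are assumptions
c chosen only for illustration.

```fortran
      program meshpt
! illustrative sketch: grid increments and one mesh point
! the box [0,1] x [0,2] x [0,4] and the grid size are assumed
      implicit none
      integer nx, ny, nz, i, j, k
      parameter (nx = 33, ny = 65, nz = 97)
      real xa, xb, yc, yd, ze, zf, dlx, dly, dlz, xi, yj, zk
      xa = 0.0
      xb = 1.0
      yc = 0.0
      yd = 2.0
      ze = 0.0
      zf = 4.0
! uniform grid increments in the x,y,z directions
      dlx = (xb-xa)/real(nx-1)
      dly = (yd-yc)/real(ny-1)
      dlz = (zf-ze)/real(nz-1)
! the last mesh point (i,j,k) = (nx,ny,nz) lies at (xb,yd,zf)
      i = nx
      j = ny
      k = nz
      xi = xa + real(i-1)*dlx
      yj = yc + real(j-1)*dly
      zk = ze + real(k-1)*dlz
      print *, dlx, dly, dlz
      print *, xi, yj, zk
      end
```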

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10]J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 23 used to efficiently pass

c integer parameters. iparm is set internally in mud3sa

c and defined as follows . . .

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formula.

c

c *** An approximation is NOT generated after an intl=0 call!

c mud3sa should be called with intl=1 to approximate the elliptic

c PDE discretized by the intl=0 call. intl=1 should also

c be input if mud3sa has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. This will bypass

c redundant pde discretization and argument checking

c and save computational time. Some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) mud3sa is being recalled for additional accuracy. In

c this case iguess=iparm(17)=1 should also be used.

c

c (2) mud3sa is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) mud3sa is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new righthand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to mud3sa

c (b) any of the integer arguments other than iguess=iparm(17)

c or maxcy=iparm(18) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(7) have changed since the previous call

c

c (d) any of the coefficients input by sgx,sgy,sgz,xlmbda (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c If any of (a) through (e) are true then the elliptic PDE

c must be discretized or rediscretized. If none of (a)

c through (e) holds, calls can be made with intl=1.

c Incorrect calls with intl=1 will produce erroneous results.

c *** The values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the (y,z) plane x=xa

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xa,y,z) is specified (this must be input thru phi(1,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see "bndyc" description below where kbdy = 1)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the (y,z) plane x=xb

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xb,y,z) is specified (this must be input thru phi(nx,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see "bndyc" description below where kbdy = 2)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the (x,z) plane y=yc

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yc,z) is specified (this must be input thru phi(i,1,k))

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see "bndyc" description below where kbdy = 3)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the (x,z) plane y=yd

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yd,z) is specified (this must be input thru phi(i,ny,k))

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see "bndyc" description below where kbdy = 4)

c

c

c ... nze=iparm(6)

c

c flags boundary conditions on the (x,y) plane z=ze

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,ze) is specified (this must be input thru phi(i,j,1))

c = 2 if there are mixed derivative boundary conditions at z=ze

c (see "bndyc" description below where kbdy = 5)

c

c

c ... nzf=iparm(7)

c

c flags boundary conditions on the (x,y) plane z=zf

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,zf) is specified (this must be input thru phi(i,j,nz))

c = 2 if there are mixed derivative boundary conditions at z=zf

c (see "bndyc" description below where kbdy = 6)

c

c

c *** grid size parameters

c

c

c ... ixp = iparm(8)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(14)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(11)

c without changing nx = iparm(14)

c

c

c ... jyq = iparm(9)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(15)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(12)

c without changing ny = iparm(15)

c

c

c ... kzr = iparm(10)

c

c an integer greater than one which is used in defining the number

c of grid points in the z direction (see nz = iparm(16)). "kzr+1"

c is the number of points on the coarsest z grid visited during

c multigrid cycling. kzr should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the z direction is not used.

c kzr should be 2 or a small odd value since a power

c of 2 factor of kzr can be removed by increasing kez = iparm(13)

c without changing nz = iparm(16)

c

c

c ... iex = iparm(11)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(14)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(8)

c as small as possible within grid size constraints when

c defining nx = iparm(14).

c

c

c ... jey = iparm(12)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(15)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(9)

c as small as possible within grid size constraints when

c defining ny = iparm(15).

c

c

c ... kez = iparm(13)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the z direction (see nz = iparm(16)).

c kez .le. 50 is required. for efficient multigrid cycling,

c kez should be chosen as large as possible and kzr=iparm(10)

c as small as possible within grid size constraints when

c defining nz = iparm(16).

c

c

c ... nx = iparm(14)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(8), iex = iparm(11).

c

c

c ... ny = iparm(15)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(9), jey = iparm(12).

c

c

c ... nz = iparm(16)

c

c the number of equally spaced grid points in the interval [ze,zf]

c (including the boundaries). nz must have the form

c

c nz = kzr*(2**(kez-1)) + 1

c

c where kzr = iparm(10), kez = iparm(13)

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 65 by 97 grid. then

c ixp=2, jyq=4, kzr=6 and iex=jey=kez=5 could be used. a better

c choice would be ixp=jyq=2, kzr=3, and iex=5, jey=kez=6.
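c
c *** illustration (a minimal sketch, not part of mudpack)
c
c the stand-alone program below evaluates nx,ny,nz from the formulas
c above for both parameter choices in the example. both choices give
c the same 33 by 65 by 97 fine grid, but the coarsest grid, (ixp+1) by
c (jyq+1) by (kzr+1), is 3 by 5 by 7 for the first choice and 3 by 3
c by 4 for the second. the program name gridsz is hypothetical.

```fortran
      program gridsz
! illustrative sketch: fine grid size from the grid parameters
      implicit none
      integer ixp, jyq, kzr, iex, jey, kez, nx, ny, nz
! first choice: ixp=2, jyq=4, kzr=6 and iex=jey=kez=5
      ixp = 2
      jyq = 4
      kzr = 6
      iex = 5
      jey = 5
      kez = 5
      nx = ixp*(2**(iex-1)) + 1
      ny = jyq*(2**(jey-1)) + 1
      nz = kzr*(2**(kez-1)) + 1
      print *, nx, ny, nz, ixp+1, jyq+1, kzr+1
! better choice: ixp=jyq=2, kzr=3 and iex=5, jey=kez=6
      ixp = 2
      jyq = 2
      kzr = 3
      iex = 5
      jey = 6
      kez = 6
      nx = ixp*(2**(iex-1)) + 1
      ny = jyq*(2**(jey-1)) + 1
      nz = kzr*(2**(kez-1)) + 1
      print *, nx, ny, nz, ixp+1, jyq+1, kzr+1
      end
```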

c

c *** note

c

c let G be the nx by ny by nz fine grid on which the approximation is

c generated and let n = max0(iex,jey,kez). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) by mz(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c

c mz(k) = kzr*[2**(max0(kez+k-n,1)-1)] + 1
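c
c *** illustration (a minimal sketch, not part of mudpack)
c
c the stand-alone program below lists the grid chain G(1),...,G(n)
c given by the three formulas above. the parameter values are taken
c from the second choice in the 33 by 65 by 97 example (ixp=jyq=2,
c kzr=3, iex=5, jey=kez=6). the program name gchain is hypothetical.

```fortran
      program gchain
! illustrative sketch: points per direction at each grid level
      implicit none
      integer ixp, jyq, kzr, iex, jey, kez, n, k, mx, my, mz
      parameter (ixp = 2, jyq = 2, kzr = 3)
      parameter (iex = 5, jey = 6, kez = 6)
      n = max0(iex, jey, kez)
      do k = 1, n
         mx = ixp*(2**(max0(iex+k-n, 1) - 1)) + 1
         my = jyq*(2**(max0(jey+k-n, 1) - 1)) + 1
         mz = kzr*(2**(max0(kez+k-n, 1) - 1)) + 1
         print *, k, mx, my, mz
      end do
      end
```

c with these values the x direction is not coarsened between levels
c 2 and 1 because iex is smaller than n.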

c

c

c

c ... iguess=iparm(17)

c

c = 0 if no initial guess to the pde is provided

c and/or full multigrid cycling beginning at the

c coarsest grid level is desired.

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below). in this case

c cycling beginning or restarting at the finest grid

c is initiated.

c

c *** comments on iguess = 0 or 1 . . .

c

c

c setting iguess=0 forces full multigrid or "fmg" cycling. phi

c must be initialized at all grid points. it can be set to zero at

c non-Dirichlet grid points if nothing better is available. the

c values set in phi when iguess = 0 are passed down and serve

c as an initial guess to the pde at the coarsest grid level where

c multigrid cycling commences.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c *** time dependent problems . . .

c

c assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-Dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(18)

c

c the exact number of cycles executed between the finest

c (nx by ny by nz) and the coarsest ((ixp+1) by (jyq+1) by

c (kzr+1)) grid levels when tolmax=fparm(7)=0.0 (no error

c control). when tolmax=fparm(7).gt.0.0 is input (error control)

c then maxcy is a limit on the number of cycles between the

c finest and coarsest grid levels. in any case, at most

c maxcy*(iprer+ipost) relaxation sweeps are performed at the

c finest grid level (see iprer=mgopt(2),ipost=mgopt(3) below).

c when multigrid iteration is working "correctly" only a few

c cycles are required for convergence. large values for maxcy

c should not be required.

c

c

c ... method = iparm(19)

c

c this sets the method of relaxation (all relaxation

c schemes in mudpack use red/black type ordering)

c

c = 0 for gauss-seidel pointwise relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in the z direction

c

c = 4 for line relaxation in the x and y direction

c

c = 5 for line relaxation in the x and z direction

c

c = 6 for line relaxation in the y and z direction

c

c = 7 for line relaxation in the x,y and z direction

c

c = 8 for x,y planar relaxation

c

c = 9 for x,z planar relaxation

c

c =10 for y,z planar relaxation

c

c *** if nxa = 0 and nx = 3 at a grid level where line relaxation in the x

c direction is flagged then it will be replaced by gauss-seidel point

c relaxation at that grid level.

c

c *** if nyc = 0 and ny = 3 at a grid level where line relaxation in the y

c direction is flagged then it will be replaced by gauss-seidel point

c relaxation at that grid level.

c

c *** if nze = 0 and nz = 3 at a grid level where line relaxation in the z

c direction is flagged then it will be replaced by gauss-seidel point

c relaxation at that grid level.

c

c these adjustments are necessary since the simultaneous tri-diagonal

c solvers used with line periodic relaxation must have n > 2 where n

c is number of unknowns (excluding the periodic point).

c *** choice of method

c

c this is very important for efficient convergence. in some cases

c experimentation may be required.

c

c let fx represent the quantity sgx(x,y,z)/dlx**2 over the solution box

c

c let fy represent the quantity sgy(x,y,z)/dly**2 over the solution box

c

c let fz represent the quantity sgz(x,y,z)/dlz**2 over the solution box

c

c (0) if fx,fy,fz are roughly the same size and do not vary too

c much choose method = 0. if this fails try method = 7.

c

c (1) if fx is much greater than fy,fz and fy,fz are roughly the same

c size choose method = 1

c

c (2) if fy is much greater than fx,fz and fx,fz are roughly the same

c size choose method = 2

c

c (3) if fz is much greater than fx,fy and fx,fy are roughly the same

c size choose method = 3

c

c (4) if fx,fy are roughly the same and both are much greater than fz

c try method = 4. if this fails try method = 8

c

c (5) if fx,fz are roughly the same and both are much greater than fy

c try method = 5. if this fails try method = 9

c

c (6) if fy,fz are roughly the same and both are much greater than fx

c try method = 6. if this fails try method = 10

c

c (7) if fx,fy,fz vary considerably with none dominating try method = 7

c

c (8) if fx and fy are considerably greater than fz but not necessarily

c the same size (e.g., fx=1000.,fy=100.,fz=1.) try method = 8

c

c (9) if fx and fz are considerably greater than fy but not necessarily

c the same size (e.g., fx=10.,fy=1.,fz=1000.) try method = 9

c

c (10) if fy and fz are considerably greater than fx but not necessarily

c the same size (e.g., fx=1.,fy=100.,fz=10.) try method = 10

c

c

c ... meth2 = iparm(20) determines the method of relaxation used in the planes

c when method = 8 or 9 or 10.

c

c

c (if method = 8)

c

c = 0 for gauss-seidel pointwise relaxation

c in the x,y plane for each fixed z

c = 1 for line relaxation in the x direction

c in the x,y plane for each fixed z

c = 2 for line relaxation in the y direction

c in the x,y plane for each fixed z

c = 3 for line relaxation in the x and y direction

c in the x,y plane for each fixed z

c

c (1) if fx,fy are roughly the same and vary little choose meth2 = 0

c (2) if fx is much greater than fy choose meth2 = 1

c (3) if fy is much greater than fx choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c (if method = 9)

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

c in the x,z plane for each fixed y

c = 1 for simultaneous line relaxation in the x direction

c of the x,z plane for each fixed y

c = 2 for simultaneous line relaxation in the z direction

c of the x,z plane for each fixed y

c = 3 for simultaneous line relaxation in the x and z direction

c of the x,z plane for each fixed y

c

c (1) if fx,fz are roughly the same and vary little choose meth2 = 0

c (2) if fx is much greater than fz choose meth2 = 1

c (3) if fz is much greater than fx choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c (if method = 10)

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

c in the y,z plane for each fixed x

c = 1 for simultaneous line relaxation in the y direction

c of the y,z plane for each fixed x

c = 2 for simultaneous line relaxation in the z direction

c of the y,z plane for each fixed x

c = 3 for simultaneous line relaxation in the y and z direction

c of the y,z plane for each fixed x

c

c (1) if fy,fz are roughly the same and vary little choose meth2 = 0

c (2) if fy is much greater than fz choose meth2 = 1

c (3) if fz is much greater than fy choose meth2 = 2

c (4) if none of the above or meth2 = 0 fails choose meth2 = 3

c

c

c ... length = iparm(21)

c

c the length of the work space provided in vector work.

c

c let isx = 3 if method = 1,4,5 or 7 and nxa.ne.0

c let isx = 5 if method = 1,4,5 or 7 and nxa.eq.0

c let isx = 0 if method has any other value

c

c let jsy = 3 if method = 2,4,6 or 7 and nyc.ne.0

c let jsy = 5 if method = 2,4,6 or 7 and nyc.eq.0

c let jsy = 0 if method has any other value

c

c let ksz = 3 if method = 3,5,6 or 7 and nze.ne.0

c let ksz = 5 if method = 3,5,6 or 7 and nze.eq.0

c let ksz = 0 if method has any other value

c

c

c then (for method .le.7)

c

c (1) length = (nx+2)*(ny+2)*(nz+2)*(10+isx+jsy+ksz)

c

c or (for method.gt.7)

c

c (2) length = 14*(nx+2)*(ny+2)*(nz+2)

c

c will usually but not always suffice. The exact minimal length depends,

c in a complex way, on the grid size arguments and method chosen.

c *** It can be predetermined for the current input arguments by calling

c mud3sa with length=iparm(21)=0 and printing iparm(22) or (in f90)

c dynamically allocating the work space using the value in iparm(22)

c in a subsequent mud3sa call.
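c
c *** illustration (a minimal sketch, not part of mudpack)
c
c the stand-alone program below evaluates the work space estimates (1)
c and (2) above. the function name lenest and the sample arguments are
c hypothetical. the result is only the estimate described above, which
c usually but not always suffices. the exact minimal length is the
c value returned in iparm(22).

```fortran
      program wrklen
! illustrative sketch: work space estimate for a 33 by 65 by 97 grid
      implicit none
      integer lenest
      external lenest
! point relaxation (method=0), no periodic boundaries
      print *, lenest(0, 1, 1, 1, 33, 65, 97)
! line relaxation in x,y,z (method=7), periodic in x (nxa=0)
      print *, lenest(7, 0, 1, 1, 33, 65, 97)
! planar relaxation (method=8)
      print *, lenest(8, 1, 1, 1, 33, 65, 97)
      end

      integer function lenest(method, nxa, nyc, nze, nx, ny, nz)
! hypothetical helper implementing estimates (1) and (2)
      implicit none
      integer method, nxa, nyc, nze, nx, ny, nz
      integer isx, jsy, ksz, m
      m = method
      isx = 0
      jsy = 0
      ksz = 0
      if (m.eq.1 .or. m.eq.4 .or. m.eq.5 .or. m.eq.7) then
         isx = 3
         if (nxa.eq.0) isx = 5
      end if
      if (m.eq.2 .or. m.eq.4 .or. m.eq.6 .or. m.eq.7) then
         jsy = 3
         if (nyc.eq.0) jsy = 5
      end if
      if (m.eq.3 .or. m.eq.5 .or. m.eq.6 .or. m.eq.7) then
         ksz = 3
         if (nze.eq.0) ksz = 5
      end if
      if (m.le.7) then
         lenest = (nx+2)*(ny+2)*(nz+2)*(10+isx+jsy+ksz)
      else
         lenest = 14*(nx+2)*(ny+2)*(nz+2)
      end if
      return
      end
```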

c

c

c ... fparm

c

c a floating point vector of length 8 used to efficiently

c pass floating point parameters. fparm is set internally

c in mud3sa and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... ze=fparm(5), zf=fparm(6)

c

c the range of the z independent variable. ze must

c be less than zf.

c

c

c ... tolmax = fparm(7)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j,k)

c and phi2(i,j,k) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.
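c
c *** illustration (a minimal sketch, not part of mudpack)
c
c the stand-alone program below applies the convergence test above to
c two small arrays. the array contents, the grid size and the value of
c tolmax are made up for illustration. the code only mirrors the
c definitions of phdif and phmax given here and is not taken from the
c mud3sa source.

```fortran
      program cnvtst
! illustrative sketch: maximum relative difference test
      implicit none
      integer nx, ny, nz, i, j, k
      parameter (nx = 3, ny = 3, nz = 3)
      real phi1(nx,ny,nz), phi2(nx,ny,nz)
      real phdif, phmax, tolmax
      tolmax = 1.0e-3
! two made-up successive approximations differing by 1.0e-4
      do k = 1, nz
         do j = 1, ny
            do i = 1, nx
               phi1(i,j,k) = 1.0 + 0.1*real(i+j+k)
               phi2(i,j,k) = phi1(i,j,k) + 1.0e-4
            end do
         end do
      end do
! maximum norms of the difference and of the latest approximation
      phdif = 0.0
      phmax = 0.0
      do k = 1, nz
         do j = 1, ny
            do i = 1, nx
               phdif = max(phdif, abs(phi2(i,j,k)-phi1(i,j,k)))
               phmax = max(phmax, abs(phi2(i,j,k)))
            end do
         end do
      end do
      if (phdif/phmax .lt. tolmax) then
         print *, 'converged'
      else
         print *, 'not converged'
      end if
      end
```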

c

c

c if tolmax=fparm(7)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).

c

c

c ... work

c

c a one dimensional array that must be provided for work space.

c see length = iparm(21). the values in work must be preserved

c if mud3sa is called again with intl=iparm(1).ne.0

c

c

c ... bndyc

c

c a subroutine with parameters (kbdy,xory,yorz,alfa,gbdy)

c which are used to input mixed boundary conditions to mud3sa.

c the boundaries are numbered one thru six and the form of

c conditions are described below.

c

c

c (1) the kbdy=1 boundary

c

c this is the (y,z) plane x=xa where nxa=iparm(2) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa(y,z)*p(xa,y,z) = gbdxa(y,z)

c

c in this case kbdy=1,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxa(y,z),gbdxa(y,z) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the (y,z) plane x=xb where nxb=iparm(3) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb(y,z)*p(xb,y,z) = gbdxb(y,z)

c

c in this case kbdy=2,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxb(y,z),gbdxb(y,z) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the (x,z) plane y=yc where nyc=iparm(4) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc(x,z)*p(x,yc,z) = gbdyc(x,z)

c

c in this case kbdy=3,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyc(x,z),gbdyc(x,z) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the (x,z) plane y=yd where nyd=iparm(5) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd(x,z)*p(x,yd,z) = gbdyd(x,z)

c

c in this case kbdy=4,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyd(x,z),gbdyd(x,z) must be returned.

c

c

c (5) the kbdy=5 boundary

c

c this is the (x,y) plane z=ze where nze=iparm(6) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfze(x,y)*p(x,y,ze) = gbdze(x,y)

c

c in this case kbdy=5,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfze(x,y),gbdze(x,y) must be returned.

c

c

c (6) the kbdy=6 boundary

c

c this is the (x,y) plane z=zf where nzf=iparm(7) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfzf(x,y)*p(x,y,zf) = gbdzf(x,y)

c

c in this case kbdy=6,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfzf(x,y),gbdzf(x,y) must be returned.

c

c

c *** alfxa,alfyc,alfze nonpositive and alfxb,alfyd,alfzf nonnegative

c will help maintain matrix diagonal dominance during discretization

c aiding convergence.

c

c *** bndyc must provide the mixed boundary condition

c values in correspondence with those flagged in iparm(2)

c thru iparm(7). if all boundaries are specified then

c mud3sa will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared

c external in the routine calling mud3sa. the actual

c name chosen may be different.
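c
c *** illustration (a minimal sketch, not part of mudpack)
c
c the stand-alone program below shows one possible bndyc routine with
c the argument list described above. the boundary data (alfa = 1.0 and
c gbdy = y + z on the kbdy=2 boundary) are hypothetical. the small
c driver tbndyc only calls the routine once by hand, whereas in actual
c use bndyc is passed to mud3sa as an external argument.

```fortran
      program tbndyc
! illustrative driver: evaluate the boundary data at y=0.5, z=0.25
      implicit none
      real alfa, gbdy
      external bndyc
      call bndyc(2, 0.5, 0.25, alfa, gbdy)
      print *, alfa, gbdy
      end

      subroutine bndyc(kbdy, xory, yorz, alfa, gbdy)
! hypothetical mixed condition on the kbdy=2 boundary (x=xb):
!    dp/dx + 1.0*p(xb,y,z) = y + z
! here xory=y and yorz=z. the other boundaries are assumed not to be
! flagged as mixed, so alfa,gbdy are left unset for them
      implicit none
      integer kbdy
      real xory, yorz, alfa, gbdy
      if (kbdy.eq.2) then
         alfa = 1.0
         gbdy = xory + yorz
      end if
      return
      end
```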

c

c

c ... sgx,sgy,sgz

c

c function subroutines which return the real values of the

c coefficients at any point (x,y,z). they must be constructed

c to return values outside the solution region. let dx=(xb-xa)/ixp,

c dy = (yd-yc)/jyq, dz=(zf-ze)/kzr (coarse grid increments).

c then sgx,sgy,sgz will be invoked for x,y,z in the intervals

c

c [xa-0.5*dx,xa], [xb,xb+0.5*dx]

c [yc-0.5*dy,yc], [yd,yd+0.5*dy]

c [ze-0.5*dz,ze], [zf,zf+0.5*dz].

c

c this is necessitated by conservative finite differencing. values

c outside specified (Dirichlet) boundaries will not be used in

c the final discretization (e.g., if nxa = 1 then sgx will be

c invoked for x.lt.xa but these values will not be used). on the other

c hand, such values are required outside mixed derivative boundaries.

c sgx,sgy,sgz should be positive for all (x,y,z) (see ierror = -5). they

c must be declared "external" in the user constructed program calling

c mud3sa where their names may be different.
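
c

c a minimal sketch of a coefficient function, assuming the name sgx

c and the argument list (x,y,z) described above. the formula is

c hypothetical. what matters is that it is positive and defined for

c every real argument, so evaluation half a coarse cell outside the

c box is safe.

```fortran
      function sgx(x,y,z)
c     hypothetical coefficient: positive for every real (x,y,z),
c     so it is also valid half a coarse cell outside the box
      implicit none
      real sgx,x,y,z
      sgx = 1.0 + x*x + y*y + z*z
      return
      end

      program tsgx
      implicit none
      real sgx,xa,xb,dx
      integer ixp
      external sgx
      xa = 0.0
      xb = 1.0
      ixp = 2
c     coarse grid increment as defined in the text
      dx = (xb-xa)/ixp
c     sample at the outer ends of [xa-0.5*dx,xa], [xb,xb+0.5*dx]
      print *, sgx(xa-0.5*dx,0.0,0.0), sgx(xb+0.5*dx,0.0,0.0)
      end
```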

c

c ... xlmbda

c

c a real valued function subroutine which returns the value of

c "xlmbda" in the pde at any grid point (xi,yj,zk). xlmbda should

c be nonnegative for any (xi,yj,zk) (see ierror = -4). xlmbda must be

c declared "external" in the user constructed program calling

c mud3sa where its name may be different.

c

c

c ... rhs

c

c an array dimensioned nx by ny by nz which contains

c the given right hand side values on the uniform 3-d mesh.

c rhs(i,j,k) = r(xi,yj,zk) for i=1,...,nx and j=1,...,ny

c and k=1,...,nz.

c

c ... phi

c

c an array dimensioned nx by ny by nz . on input phi must

c contain specified boundary values and an initial guess

c to the solution if flagged (see iguess=iparm(17)=1). for

c example, if nyd=iparm(5)=1 then phi(i,ny,k) must be set

c equal to p(xi,yd,zk) for i=1,...,nx and k=1,...,nz prior to

c calling mud3sa. the specified values are preserved by mud3sa.

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at non-Dirichlet grid points (this is not

c checked). these values are projected down and serve as an initial

c guess to the pde at the coarsest grid level. set phi to 0.0 at

c non-Dirichlet grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid parameters (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the parameters

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.
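
c

c the rules for mgopt can be sketched as a small check. the function

c name mgchk is hypothetical and is not part of mudpack. it only

c mirrors the documented conditions for ierror = 12 and ierror = -5.

```fortran
      integer function mgchk(mgopt)
c     hypothetical helper mirroring the documented rules:
c     12 = invalid option, -5 = inefficient cycling, 0 = ok
      implicit none
      integer mgopt(4)
      mgchk = 0
c     mgopt(1)=0 selects the defaults; the rest is ignored
      if (mgopt(1).eq.0) return
      if (mgopt(1).lt.0 .or. mgopt(2).lt.1 .or.
     +    mgopt(3).lt.1 .or.
     +    (mgopt(4).ne.1 .and. mgopt(4).ne.3)) then
         mgchk = 12
      else if (max0(mgopt(1),mgopt(2),mgopt(3)).gt.2) then
         mgchk = -5
      end if
      return
      end

      program tmgopt
      implicit none
      integer mgopt(4),mgchk
      external mgchk
c     v(2,1) cycles with multilinear prolongation
      data mgopt /1,2,1,1/
      print *, mgchk(mgopt)
c     kcycle = 3 is accepted but flagged as inefficient
      mgopt(1) = 3
      print *, mgchk(mgopt)
      end
```

c the driver prints 0 for the v(2,1) setting and -5 once kcycle is

c raised to 3.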

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below).

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4

c                              * .               .

c                             *   .             .

c ---------------2-----------1-----2-----------1--------- level 3

c               * .         .       .         .

c              *   .       .         .       .

c ------2-----1-----2-----1-----------2-----1------------ level 2

c      * .   .       .   .             .   .

c     *   . .         . .               . .

c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4

c                        * .                         .

c ----------2-----------1---2-----------3-----------1---- level 3

c          * .         .     .         . .         .

c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2

c    * . .     . . . .         . . . .     . . . .

c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
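
c

c the control flow of algorithm cgc can be sketched with a recursive

c routine that counts relaxation sweeps instead of relaxing. the name

c cgccnt and the array nswp are hypothetical. mudpack itself

c implements the cycling non-recursively.

```fortran
      recursive subroutine cgccnt(k,kcycle,iprer,ipost,nswp)
c     count relaxation sweeps per level for one cgc call;
c     only the control flow of the algorithm is mirrored
      implicit none
      integer k,kcycle,iprer,ipost,nswp(*),kount
c     pre-relaxation sweeps at level k
      nswp(k) = nswp(k) + iprer
      if (k.gt.1) then
c        recurse kcycle times on the next coarser level
         do kount=1,kcycle
            call cgccnt(k-1,kcycle,iprer,ipost,nswp)
         end do
      end if
c     post-relaxation sweeps at level k
      nswp(k) = nswp(k) + ipost
      return
      end

      program tcgc
      implicit none
      integer nswp(4),k
      do k=1,4
         nswp(k) = 0
      end do
c     one w(2,1) cycle started at level 4
      call cgccnt(4,2,2,1,nswp)
      print *, nswp
      end
```

c for one w(2,1) cycle started at level 4 the counts for levels 1

c through 4 are 24, 12, 6 and 3, in agreement with the last w cycle

c of the schedule depicted above.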

c

c ***********************************************************************

c ****output parameters**************************************************

c ***********************************************************************

c

c

c ... iparm(22)

c

c on output iparm(22) contains the actual work space length

c required for the current grid sizes and method.

c

c

c ... iparm(23)

c

c if error control is selected (tolmax = fparm(7) .gt. 0.0) then

c on output iparm(23) contains the actual number of cycles executed

c between the coarsest and finest grid levels in obtaining the

c approximation in phi. the quantity (iprer+ipost)*iparm(23) is

c the number of relaxation sweeps performed at the finest grid level.

c

c

c ... fparm(8)

c

c on output fparm(8) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(8) is computed only if there is error control (tolmax.gt.0.).

c assume phi1(i,j,k) and phi2(i,j,k) are the last two computed

c values for phi(i,j,k) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then

c

c fparm(8) = phdif/phmax

c

c is returned whenever phmax.gt.0.0. in the degenerate case

c phmax = 0.0, fparm(8) = phdif is returned.
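
c

c the definition of fparm(8) can be sketched as follows. the function

c name reldif is hypothetical, and the two iterates are treated as

c vectors of length n = nx*ny*nz. mudpack computes this quantity

c internally.

```fortran
      function reldif(n,phi1,phi2)
c     hypothetical helper: maximum relative difference between
c     two iterates stored as vectors of length n
      implicit none
      integer n,i
      real reldif,phi1(n),phi2(n),phdif,phmax
      phdif = 0.0
      phmax = 0.0
      do i=1,n
         phdif = amax1(phdif,abs(phi2(i)-phi1(i)))
         phmax = amax1(phmax,abs(phi2(i)))
      end do
c     degenerate case phmax = 0.0 returns phdif itself
      reldif = phdif
      if (phmax.gt.0.0) reldif = phdif/phmax
      return
      end

      program trdif
      implicit none
      real reldif,p1(3),p2(3)
      external reldif
      data p1 /1.0,2.0,4.0/
      data p2 /1.0,2.5,4.0/
c     phdif = 0.5 and phmax = 4.0 for this made-up data
      print *, reldif(3,p1,p2)
      end
```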

c

c

c

c ... work

c

c on output work contains intermediate values that must not be

c destroyed if mud3sa is to be called again with intl = iparm(1)=1

c

c ... phi

c

c on output phi(i,j,k) contains the approximation to

c p(xi,yj,zk) for all mesh points i=1,...,nx; j=1,...,ny;

c k=1,...,nz. the last computed iterate in phi is returned

c even if convergence is not obtained (ierror=-1)

c

c ... ierror

c

c For intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. Argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj))

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(7)>0.

c is not obtained in maxcy=iparm(18) multigrid cycles between the

c coarsest (ixp+1,jyq+1,kzr+1) and finest (nx,ny,nz) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd,nze,nzf

c in iparm(2) through iparm(7) is not 0,1 or 2 or if

c (nxa,nxb) or (nyc,nyd) or (nze,nzf) are not pairwise zero.

c

c = 3 if min0(ixp,jyq,kzr) < 2 (ixp=iparm(8),jyq=iparm(9),kzr=iparm(10))

c

c = 4 if min0(iex,jey,kez) < 1 (iex=iparm(11),jey=iparm(12),kez=iparm(13))

c or if max0(iex,jey,kez) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or if ny.ne.jyq*2**(jey-1)+1 or

c if nz.ne.kzr*2**(kez-1)+1 (nx=iparm(14),ny=iparm(15),nz=iparm(16))

c

c = 6 if iguess = iparm(17) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(18) < 1 (large values for maxcy should not be used)

c

c = 8 if method = iparm(19) is less than 0 or greater than 10 or

c if meth2 = iparm(20) is not 0 or 1 or 2 or 3 when method > 7.

c

c = 9 if length = iparm(21) is too small (see iparm(22) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd or ze >= zf

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4),ze=fparm(5),zf=fparm(6))

c

c =11 if tolmax = fparm(7) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c

c *********************************************************

c *********************************************************

c

c end of mud3sa documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file mud3sp.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file mud3sp.d

c

c contains documentation for:

c subroutine mud3sp(iparm,fparm,work,cofx,cofy,cofz,bndyc,rhs,phi,

c + mgopt,ierror)

c A sample fortran driver is file "tmud3sp.f".

c

c ... required MUDPACK files

c

c mudcom.f

c

c ... purpose

c

c subroutine mud3sp automatically discretizes and attempts to compute

c the second order finite difference approximation to a three-

c dimensional linear SEPARABLE elliptic partial differential

c equation on a box. the approximation is generated on a uniform

c grid covering the box (see mesh description below). boundary

c conditions may be any combination of mixed, specified (Dirichlet)

c or periodic. the form of the pde solved is . . .

c

c cxx(x)*pxx + cx(x)*px + cex(x)*p(x,y,z) +

c

c cyy(y)*pyy + cy(y)*py + cey(y)*p(x,y,z) +

c

c czz(z)*pzz + cz(z)*pz + cez(z)*p(x,y,z) = r(x,y,z)

c

c here cxx,cx,cex,cyy,cy,cey,czz,cz,cez are the known real coefficients

c of the pde; pxx,pyy,pzz,px,py,pz are the second and first

c partial derivatives of the unknown solution function p(x,y,z)

c with respect to the independent variables x,y,z; r(x,y,z) is

c the known real right hand side of the elliptic pde. cxx,cyy

c and czz should be positive for all (x,y,z) in the solution region.

c

c SEPARABILITY means:

c

c cxx,cx,cex depend only on x

c cyy,cy,cey depend only on y

c czz,cz,cez depend only on z

c

c For example, Laplace's equation in Cartesian coordinates is separable.

c Nonseparable elliptic PDEs can be approximated with muh3 or mud3.

c mud3sp requires considerably less work space than muh3 or mud3.

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny by nz grid.

c the grid is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd] x [ze,zf].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1), dlz = (zf-ze)/(nz-1)

c

c be the uniform grid increments in the x,y,z directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly, zk = ze+(k-1)*dlz

c

c for i=1,...,nx; j=1,...,ny; k=1,...,nz denote the x,y,z uniform

c mesh points.

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with Fortran77

c and Fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formula. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formula are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams, "FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 22 used to efficiently pass

c integer arguments. iparm is set internally in mud3sp

c and defined as follows . . .

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formula.

c

c *** An approximation is NOT generated after an intl=0 call!

c mud3sp should be called with intl=1 to approximate the elliptic

c PDE discretized by the intl=0 call. intl=1 should also

c be input if mud3sp has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. This will bypass

c redundant pde discretization and argument checking

c and save computational time. Some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) mud3sp is being recalled for additional accuracy. In

c this case iguess=iparm(17)=1 should also be used.

c

c (2) mud3sp is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) mud3sp is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to mud3sp

c (b) any of the integer arguments other than iguess=iparm(17)

c or maxcy=iparm(18) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(7) have changed since the previous call

c

c (d) any of the coefficients input by cofx,cofy,cofz (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c If any of (a) through (e) are true then the elliptic PDE

c must be discretized or rediscretized. If none of (a)

c through (e) holds, calls can be made with intl=1.

c Incorrect calls with intl=1 will produce erroneous results.

c *** The values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the (y,z) plane x=xa

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xa,y,z) is specified (this must be input thru phi(1,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see "bndyc" description below where kbdy = 1)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the (y,z) plane x=xb

c

c = 0 if p(x,y,z) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y,z) = p(x,y,z) for all x,y,z)

c

c = 1 if p(xb,y,z) is specified (this must be input thru phi(nx,j,k))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see "bndyc" description below where kbdy = 2)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the (x,z) plane y=yc

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yc,z) is specified (this must be input thru phi(i,1,k))

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see "bndyc" description below where kbdy = 3)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the (x,z) plane y=yd

c

c = 0 if p(x,y,z) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc,z) = p(x,y,z) for all x,y,z)

c = 1 if p(x,yd,z) is specified (this must be input thru phi(i,ny,k))

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see "bndyc" description below where kbdy = 4)

c

c

c ... nze=iparm(6)

c

c flags boundary conditions on the (x,y) plane z=ze

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,ze) is specified (this must be input thru phi(i,j,1))

c = 2 if there are mixed derivative boundary conditions at z=ze

c (see "bndyc" description below where kbdy = 5)

c

c

c ... nzf=iparm(7)

c

c flags boundary conditions on the (x,y) plane z=zf

c

c = 0 if p(x,y,z) is periodic in z on [ze,zf]

c (i.e., p(x,y,z+zf-ze) = p(x,y,z) for all x,y,z)

c = 1 if p(x,y,zf) is specified (this must be input thru phi(i,j,nz))

c = 2 if there are mixed derivative boundary conditions at z=zf

c (see "bndyc" description below where kbdy = 6)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(8)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(14)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(11)

c without changing nx = iparm(14)

c

c

c ... jyq = iparm(9)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(15)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(12)

c without changing ny = iparm(15)

c

c

c ... kzr = iparm(10)

c

c an integer greater than one which is used in defining the number

c of grid points in the z direction (see nz = iparm(16)). "kzr+1"

c is the number of points on the coarsest z grid visited during

c multigrid cycling. kzr should be chosen as small as possible.

c recommended values are the small primes 2 or 3 or (possibly) 5.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the z direction is not used.

c kzr should be 2 or a small odd value since a power

c of 2 factor of kzr can be removed by increasing kez = iparm(13)

c without changing nz = iparm(16)

c

c

c ... iex = iparm(11)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(14)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(8)

c as small as possible within grid size constraints when

c defining nx = iparm(14).

c

c

c ... jey = iparm(12)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(15)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(9)

c as small as possible within grid size constraints when

c defining ny = iparm(15).

c

c

c ... kez = iparm(13)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the z direction (see nz = iparm(16)).

c kez .le. 50 is required. for efficient multigrid cycling,

c kez should be chosen as large as possible and kzr=iparm(10)

c as small as possible within grid size constraints when

c defining nz = iparm(16).

c

c

c ... nx = iparm(14)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(8), iex = iparm(11).

c

c

c ... ny = iparm(15)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(9), jey = iparm(12).

c

c

c ... nz = iparm(16)

c

c the number of equally spaced grid points in the interval [ze,zf]

c (including the boundaries). nz must have the form

c

c nz = kzr*(2**(kez-1)) + 1

c

c where kzr = iparm(10), kez = iparm(13)

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 65 by 97 grid. then

c ixp=2, jyq=4, kzr=6 and iex=jey=kez=5 could be used. a better

c choice would be ixp=jyq=2, kzr=3, and iex=5, jey=kez=6.

c

c *** note

c

c let G be the nx by ny by nz fine grid on which the approximation is

c generated and let n = max0(iex,jey,kez). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) by mz(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c

c mz(k) = kzr*[2**(max0(kez+k-n,1)-1)] + 1
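
c

c the grid chain can be tabulated with a few lines of fortran. this

c sketch uses the preferred factors from the 33 by 65 by 97 example

c above. the program name is arbitrary.

```fortran
      program tgrid
      implicit none
      integer ixp,jyq,kzr,iex,jey,kez,n,k,mx,my,mz
c     the 33 by 65 by 97 example with the preferred factors
      ixp = 2
      jyq = 2
      kzr = 3
      iex = 5
      jey = 6
      kez = 6
c     number of grid levels in the chain
      n = max0(iex,jey,kez)
      do k=1,n
         mx = ixp*2**(max0(iex+k-n,1)-1) + 1
         my = jyq*2**(max0(jey+k-n,1)-1) + 1
         mz = kzr*2**(max0(kez+k-n,1)-1) + 1
         print *, k, mx, my, mz
      end do
      end
```

c the first line printed is the coarsest grid, 3 by 3 by 4, and the

c last line is the fine grid, 33 by 65 by 97.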

c

c

c

c ... iguess=iparm(17)

c

c = 0 if no initial guess to the pde is provided

c and/or full multigrid cycling beginning at the

c coarsest grid level is desired.

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below). in this case

c cycling beginning or restarting at the finest grid

c is initiated.

c

c *** comments on iguess = 0 or 1 . . .

c

c

c setting iguess=0 forces full multigrid or "fmg" cycling. phi

c must be initialized at all grid points. it can be set to zero at

c non-Dirichlet grid points if nothing better is available. the

c values set in phi when iguess = 0 are passed down and serve

c as an initial guess to the pde at the coarsest grid level where

c multigrid cycling commences.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c *** time dependent problems . . .

c

c assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-Dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.
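
c

c the bookkeeping around one such time step can be sketched as

c follows. all data are hypothetical and a short vector stands in

c for the grid. the solver call is only indicated by a comment, so

c the correction e keeps its preset value in this sketch.

```fortran
      program ttime
c     hypothetical data on a tiny vector standing in for the grid
      implicit none
      integer n,i
      parameter (n=4)
      real p(n),e(n),rold(n),rnew(n),rdif(n)
      do i=1,n
         p(i) = 1.0
         rold(i) = 2.0
         rnew(i) = 2.5
c        preset the correction to zero (no p(t-dt) saved)
         e(i) = 0.0
c        right hand side for l(e(t,dt)) = r(t+dt) - r(t)
         rdif(i) = rnew(i) - rold(i)
      end do
c     a solver call with intl=1, iguess=0, rhs=rdif and phi=e
c     would go here; e is left at its preset value in this sketch
      do i=1,n
c        final approximation p(t+dt) = p(t) + e(t,dt)
         p(i) = p(i) + e(i)
      end do
      print *, p
      print *, rdif
      end
```

c the arrays rold, rdif and e are the additional storage mentioned

c above.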

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.

c

c

c ... maxcy = iparm(18)

c

c the exact number of cycles executed between the finest

c (nx by ny by nz) and the coarsest ((ixp+1) by (jyq+1) by

c (kzr+1)) grid levels when tolmax=fparm(7)=0.0 (no error

c control). when tolmax=fparm(7).gt.0.0 is input (error control)

c then maxcy is a limit on the number of cycles between the

c finest and coarsest grid levels. in any case, at most

c maxcy*(iprer+ipost) relaxation sweeps are performed at the

c finest grid level (see iprer=mgopt(2),ipost=mgopt(3) below).

c when multigrid iteration is working "correctly" only a few

c cycles are required for convergence. large values for maxcy

c should not be required.

c

c

c ... method = iparm(19)

c

c

c = 0 for gauss-seidel pointwise relaxation with red/black ordering

C

C This is the only relaxation method offered with mud3sp. Line

C or planar relaxation would "lose" the significant savings in

C work space length defeating the purpose of mud3sp. If line

C or planar relaxation is required then use muh3 or mud3.

C method is used as an argument only to focus attention on the

C purpose of mud3sp.

C

c ... length = iparm(20)

c

c the length of the work space provided in vector work.

c This is considerably less than the work space required by

c the nonseparable solvers muh3 or mud3.

c

c length = 7*(nx+2)*(ny+2)*(nz+2)/2

c

c will usually but not always suffice. The exact minimal length

c depends on the grid size arguments. It can be predetermined

c *** for the current input arguments by calling mud3sp with iparm(20)

c set equal to zero and printing iparm(21) or (in f90) dynamically

c allocating the work space using the value in iparm(21) in a

c subsequent mud3sp call.
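
c a minimal sketch of the length estimate quoted above. the function
c name lenest is hypothetical. because the estimate "will usually
c but not always suffice", it is only a first guess. the exact
c value is the one returned in iparm(21).

```fortran
      integer function lenest(nx,ny,nz)
c     hypothetical helper: work space estimate
c     7*(nx+2)*(ny+2)*(nz+2)/2 (integer arithmetic, so very large
c     grids may overflow a default integer)
      implicit none
      integer nx,ny,nz
      lenest = 7*(nx+2)*(ny+2)*(nz+2)/2
      return
      end
```

c for example, a 65 by 65 by 65 grid gives an estimate of 1052670
c words.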

c

c ... fparm

c

c a floating point vector of length 8 used to efficiently

c pass floating point arguments. fparm is set internally

c in mud3sp and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb.

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... ze=fparm(5), zf=fparm(6)

c

c the range of the z independent variable. ze must

c be less than zf.

c

c

c ... tolmax = fparm(7)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j,k)

c and phi2(i,j,k) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(7)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT use error control!).
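
c the quantity tested against tolmax can be sketched as below. the
c function name reldif is hypothetical, the iterates are stored as
c vectors of length n = nx*ny*nz, and the phmax = 0.0 branch follows
c the fparm(8) output description. this is not the mudpack source.

```fortran
      real function reldif(n,phi1,phi2)
c     hypothetical helper: maximum relative difference phdif/phmax
c     between the last two iterates phi1 and phi2
      implicit none
      integer n,i
      real phi1(n),phi2(n),phdif,phmax
      phdif = 0.0
      phmax = 0.0
      do i=1,n
        phdif = max(phdif,abs(phi2(i)-phi1(i)))
        phmax = max(phmax,abs(phi2(i)))
      end do
c     degenerate case phmax = 0.0: return phdif unscaled
      if (phmax.gt.0.0) then
        reldif = phdif/phmax
      else
        reldif = phdif
      end if
      return
      end
```

c convergence would then be declared when reldif(n,phi1,phi2) is
c less than tolmax.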

c

c

c ... work

c

c a one dimensional array that must be provided for work space.

c see length = iparm(20). the values in work must be preserved

c if mud3sp is called again with intl=iparm(1).ne.0 or if mud34sp

c is called to improve accuracy.

c

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,yorz,alfa,gbdy)

c which are used to input mixed boundary conditions to mud3sp.

c the boundaries are numbered one thru six and the form of

c conditions are described below.

c

c

c (1) the kbdy=1 boundary

c

c this is the (y,z) plane x=xa where nxa=iparm(2) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa*p(xa,y,z) = gbdxa(y,z)

c

c in this case kbdy=1,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxa,gbdxa(y,z) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the (y,z) plane x=xb where nxb=iparm(3) = 2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb*p(xb,y,z) = gbdxb(y,z)

c

c in this case kbdy=2,xory=y,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfxb,gbdxb(y,z) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the (x,z) plane y=yc where nyc=iparm(4) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc*p(x,yc,z) = gbdyc(x,z)

c

c in this case kbdy=3,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyc,gbdyc(x,z) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the (x,z) plane y=yd where nyd=iparm(5) = 2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd*p(x,yd,z) = gbdyd(x,z)

c

c in this case kbdy=4,xory=x,yorz=z will be input to bndyc and

c alfa,gbdy corresponding to alfyd,gbdyd(x,z) must be returned.

c

c

c (5) the kbdy=5 boundary

c

c this is the (x,y) plane z=ze where nze=iparm(6) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfze*p(x,y,ze) = gbdze(x,y)

c

c in this case kbdy=5,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfze,gbdze(x,y) must be returned.

c

c

c (6) the kbdy=6 boundary

c

c this is the (x,y) plane z=zf where nzf=iparm(7) = 2 flags

c a mixed boundary condition of the form

c

c dp/dz + alfzf*p(x,y,zf) = gbdzf(x,y)

c

c in this case kbdy=6,xory=x,yorz=y will be input to bndyc and

c alfa,gbdy corresponding to alfzf,gbdzf(x,y) must be returned.

c

c

c *** The constants alfxa,alfyc,alfze nonpositive and alfxb,alfyd,alfzf

c nonnegative will help maintain matrix diagonal dominance during

c discretization aiding convergence.

c

c *** bndyc must provide the mixed boundary condition

c values in correspondence with those flagged in iparm(2)

c thru iparm(7). if all boundaries are specified then

c mud3sp will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared

c external in the routine calling mud3sp. the actual

c name chosen may be different.
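
c a sample bndyc might look like the sketch below. it assumes that
c mixed conditions were flagged only at x=xa (nxa=2) and z=zf
c (nzf=2). the alfa values and the gbdy expressions are hypothetical
c and serve only to show which arguments arrive and which must be
c returned.

```fortran
      subroutine bndyc(kbdy,xory,yorz,alfa,gbdy)
c     sample only: the coefficients below are assumed, not taken
c     from any mudpack driver
      implicit none
      integer kbdy
      real xory,yorz,alfa,gbdy
      if (kbdy.eq.1) then
c       x=xa plane: xory=y, yorz=z. alfxa chosen nonpositive
        alfa = -1.0
        gbdy = xory+yorz
      else if (kbdy.eq.6) then
c       z=zf plane: xory=x, yorz=y. alfzf chosen nonnegative
        alfa = 1.0
        gbdy = xory*yorz
      else
c       not reached when only kbdy=1 and kbdy=6 are flagged
        alfa = 0.0
        gbdy = 0.0
      end if
      return
      end
```

c the routine calling mud3sp would declare this name external.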

c

c

c ... cofx

c

c a subroutine with arguments (x,cxx,cx,cex) which provides the

c known real coefficients of the x derivative terms for the pde

c at any grid point x. the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c external.

c

c ... cofy

c

c a subroutine with arguments (y,cyy,cy,cey) which provides the

c known real coefficients of the y derivative terms for the pde

c at any grid point y. the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c external.

c

c ... cofz

c

c a subroutine with arguments (z,czz,cz,cez) which provides the

c known real coefficients of the z derivative terms for the pde

c at any grid point z. the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c external.
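
c as an illustration, a coefficient routine for the x direction of
c an assumed model problem pxx + pyy + pzz = r could be written as
c below. the model problem is an assumption made for this sketch.
c cofy and cofz would follow the same pattern with (y,cyy,cy,cey)
c and (z,czz,cz,cez).

```fortran
      subroutine cofx(x,cxx,cx,cex)
c     sample only: x coefficients of the assumed model problem
c     pxx + pyy + pzz = r (constant coefficients, x is unused)
      implicit none
      real x,cxx,cx,cex
      cxx = 1.0
      cx = 0.0
      cex = 0.0
      return
      end
```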

c

c ... rhs

c

c an array dimensioned nx by ny by nz which contains

c the given right hand side values on the uniform 3-d mesh.

c rhs(i,j,k) = r(xi,yj,zk) for i=1,...,nx and j=1,...,ny

c and k=1,...,nz.

c

c ... phi

c

c an array dimensioned nx by ny by nz . on input phi must

c contain specified boundary values and an initial guess

c to the solution if flagged (see iguess=iparm(17)=1). for

c example, if nyd=iparm(5)=1 then phi(i,ny,k) must be set

c equal to p(xi,yd,zk) for i=1,...,nx and k=1,...,nz prior to

c calling mud3sp. the specified values are preserved by mud3sp.

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at non-Dirichlet grid points (this is not

c checked). these values are projected down and serve as an initial

c guess to the pde at the coarsest grid level. set phi to 0.0 at

c non-Dirichlet grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid arguments (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the arguments

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.
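
c the two option vectors named above can be set with a small helper
c such as the one sketched here. the name setmg and the switch
c ivcyc are hypothetical. setting mgopt(1)=0 would also select the
c defaults, as described under mgopt.

```fortran
      subroutine setmg(ivcyc,mgopt)
c     hypothetical helper: ivcyc=0 stores the documented defaults
c     (2,2,1,3), any other value stores v(2,1) cycles with linear
c     prolongation (1,2,1,1)
      implicit none
      integer ivcyc,mgopt(4)
      if (ivcyc.ne.0) then
        mgopt(1) = 1
        mgopt(2) = 2
        mgopt(3) = 1
        mgopt(4) = 1
      else
        mgopt(1) = 2
        mgopt(2) = 2
        mgopt(3) = 1
        mgopt(4) = 3
      end if
      return
      end
```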

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited are indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below).

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4

c                              * .               .

c                             *   .             .

c ---------------2-----------1-----2-----------1--------- level 3

c               * .         .       .         .

c              *   .       .         .       .

c ------2-----1-----2-----1-----------2-----1------------ level 2

c      * .   .       .   .             .   .

c     *   . .         . .               . .

c ---3-----3-----------3-----------------3--------------- level 1

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4

c                        * .                         .

c ----------2-----------1---2-----------3-----------1---- level 3

c          * .         .     .         . .         .

c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2

c    * . .     . . . .         . . . .     . . . .

c --6---6-------6---6-----------6---6-------6---6-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc
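
c the cost implied by the recursion can be counted directly. in the
c pseudo-algorithm, cgc at level k-1 is invoked kcycle times for
c each entry at level k, and every entry performs iprer+ipost
c sweeps. the function below is a sketch derived from that
c pseudo-algorithm only (mudpack itself is non-recursive) and its
c name nsweep is hypothetical.

```fortran
      integer function nsweep(k,ngrid,kcycle,iprer,ipost)
c     hypothetical helper: relaxation sweeps at level k during one
c     call of cgc started at the finest level ngrid. level k is
c     entered kcycle**(ngrid-k) times.
      implicit none
      integer k,ngrid,kcycle,iprer,ipost
      nsweep = (iprer+ipost)*kcycle**(ngrid-k)
      return
      end
```

c for four levels and w(2,1) cycles this gives 3, 6, 12 and 24
c sweeps at levels 4, 3, 2 and 1, which agrees with the last w
c cycle in the schedule shown earlier.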

c

c ***********************************************************************

c ****output arguments**************************************************

c ***********************************************************************

c

c

c ... iparm(21)

c

c on output iparm(21) contains the actual work space length

c required for the current grid sizes and method. This value

c will be computed and returned even if iparm(20) is less than

c iparm(21) (see ierror=9).

c

c

c ... iparm(22)

c

c if error control is selected (tolmax = fparm(7) .gt. 0.0) then

c on output iparm(22) contains the actual number of cycles executed

c between the coarsest and finest grid levels in obtaining the

c approximation in phi. the quantity (iprer+ipost)*iparm(22) is

c the number of relaxation sweeps performed at the finest grid level.

c

c

c ... fparm(8)

c

c on output fparm(8) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(8) is computed only if there is error control (tolmax.gt.0.)

c assume phi1(i,j,k) and phi2(i,j,k) are the last two computed

c values for phi(i,j,k) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j,k)-phi1(i,j,k))) for all i,j,k

c

c and

c

c phmax = max(abs(phi2(i,j,k))) for all i,j,k

c

c then

c

c fparm(8) = phdif/phmax

c

c is returned whenever phmax.gt.0.0. in the degenerate case

c phmax = 0.0, fparm(8) = phdif is returned.

c

c

c

c ... work

c

c on output work contains intermediate values that must not be

c destroyed if mud3sp is to be called again with iparm(1)=1 or

c if mud34sp is to be called to improve the estimate to fourth

c order.

c

c ... phi

c

c on output phi(i,j,k) contains the approximation to

c p(xi,yj,zk) for all mesh points i=1,...,nx; j=1,...,ny;

c k=1,...,nz. the last computed iterate in phi is returned

c even if convergence is not obtained (ierror=-1).

c

c ... ierror

c

c For intl=iparm(1)=0 initialization calls, ierror is an

c error flag that indicates invalid input arguments when

c returned positive and nonfatal warnings when returned

c negative. Argument checking and discretization

c is bypassed for intl=1 calls which can only return

c ierror = -1 or 0 or 1.

c

c

c non-fatal warnings * * *

c

c

c =-5 if kcycle=mgopt(1) is greater than 2. values larger than 2 result

c in an algorithm which probably does far more computation than

c necessary. kcycle = 1 (v cycles) or kcycle=2 (w cycles) should

c suffice for most problems. ierror = -5 is also set if either

c iprer = mgopt(2) or ipost=mgopt(3) is greater than 2. the

c ierror=-5 flag is overridden by any other fatal or non-fatal

c error.

c

c =-4 if there are dominant nonzero first order terms in the pde which

c make it "hyperbolic" at the finest grid level. numerically, this

c happens if:

c

c abs(cx)*dlx > 2.*abs(cxx) (dlx = (xb-xa)/(nx-1))

c

c (or)

c

c abs(cy)*dly > 2.*abs(cyy) (dly = (yd-yc)/(ny-1))

c

c

c at some fine grid point (xi,yj). if an adjustment is not made the

c condition can lead to a matrix coming from the discretization

c which is not diagonally dominant and divergence is possible. since

c the condition is "likely" at coarser grid levels for pde's with

c nonzero first order terms, the adjustments (actually first order

c approximations to the pde)

c

c

c cxx = amax1(cxx,0.5*abs(cx)*dx)

c

c (and)

c

c cyy = amax1(cyy,0.5*abs(cy)*dy)

c

c

c (here dx,dy are the x,y mesh sizes of the subgrid)

c

c are made to preserve convergence of multigrid iteration. if made

c at the finest grid level, it can lead to convergence to an

c erroneous solution (flagged by ierror = -4). a possible remedy

c is to increase resolution. the ierror = -4 flag overrides the

c nonfatal ierror = -5 flag.

c

c

c =-3 if the continuous elliptic pde is singular. this means the

c boundary conditions are periodic or pure derivative at all

c boundaries and ce(x,y) = 0.0 for all x,y. a solution is still

c attempted but convergence may not occur due to ill-conditioning

c of the linear system coming from the discretization. the

c ierror = -3 flag overrides the ierror=-4 and ierror=-5 nonfatal

c flags.

c

c

c =-2 if the pde is not elliptic (i.e., cxx*cyy.le.0.0 for some (xi,yj))

c in this case a solution is still attempted although convergence

c may not occur due to ill-conditioning of the linear system.

c the ierror = -2 flag overrides the ierror=-5,-4,-3 nonfatal

c flags.

c

c

c =-1 if convergence to the tolerance specified in tolmax=fparm(7)>0.

c is not obtained in maxcy=iparm(18) multigrid cycles between the

c coarsest (ixp+1,jyq+1,kzr+1) and finest (nx,ny,nz) grid levels.

c in this case the last computed iterate is still returned.

c the ierror = -1 flag overrides all other nonfatal flags.

c

c

c no errors * * *

c

c = 0

c

c fatal argument errors * * *

c

c = 1 if intl=iparm(1) is not 0 on initial call or not 0 or 1

c on subsequent calls

c

c = 2 if any of the boundary condition flags nxa,nxb,nyc,nyd,nze,nzf

c in iparm(2) through iparm(7) is not 0,1 or 2 or if

c (nxa,nxb) or (nyc,nyd) or (nze,nzf) are not pairwise zero.

c

c = 3 if min0(ixp,jyq,kzr) < 2 (ixp=iparm(8),jyq=iparm(9),kzr=iparm(10))

c

c = 4 if min0(iex,jey,kez) < 1 (iex=iparm(11),jey=iparm(12),kez=iparm(13))

c or if max0(iex,jey,kez) > 50

c

c = 5 if nx.ne.ixp*2**(iex-1)+1 or if ny.ne.jyq*2**(jey-1)+1 or

c if nz.ne.kzr*2**(kez-1)+1 (nx=iparm(14),ny=iparm(15),nz=iparm(16))

c

c = 6 if iguess = iparm(17) is not equal to 0 or 1

c

c = 7 if maxcy = iparm(18) < 1 (large values for maxcy should not be used)

c

c = 8 if method = iparm(19) is not equal to zero

c

c = 9 if length = iparm(20) is too small (see iparm(21) on output

c for minimum required work space length)

c

c =10 if xa >= xb or yc >= yd or ze >= zf

c (xa=fparm(1),xb=fparm(2),yc=fparm(3),yd=fparm(4),ze=fparm(5),zf=fparm(6))

c

c =11 if tolmax = fparm(7) < 0.0

c

c errors in setting multigrid options * * * (see also ierror=-5)

c

c =12 if kcycle = mgopt(1) < 0 or

c if iprer = mgopt(2) < 1 or

c if ipost = mgopt(3) < 1 or

c if intpol = mgopt(4) is not 1 or 3

c

c *********************************************************

c *********************************************************

c

c end of mud3sp documentation

c

c **********************************************************

c **********************************************************

c

c

c

c file muh2.d

c

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c * *

c * copyright (c) 2008 by UCAR *

c * *

c * University Corporation for Atmospheric Research *

c * *

c * all rights reserved *

c * *

c * MUDPACK version 5.0.1 *

c * *

c * A Fortran Package of Multigrid *

c * *

c * Subroutines and Example Programs *

c * *

c * for Solving Elliptic Partial Differential Equations *

c * *

c * by *

c * *

c * John Adams *

c * *

c * of *

c * *

c * the National Center for Atmospheric Research *

c * *

c * Boulder, Colorado (80307) U.S.A. *

c * *

c * which is sponsored by *

c * *

c * the National Science Foundation *

c * *

c * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

c

c ... file muh2.d

c

c contains documentation for:

c subroutine muh2(iparm,fparm,wk,iwk,coef,bndyc,rhs,phi,mgopt,ierror)

c a sample fortran driver is file "tmuh2.f".

c

c ... required mudpack files

c

c mudcom.f, muhcom.f

c

c ... purpose

c

c the "hybrid" multigrid/direct method code muh2 approximates the

c same 2-d nonseparable elliptic pde as the mudpack solver mud2.

c muh2 combines the efficiency of multigrid iteration with the certainty

c of a direct method. the basic algorithm is modified by using banded

c gaussian elimination in place of relaxation whenever the coarsest

c subgrid is encountered within multigrid cycling. this provides

c additional grid size flexibility by eliminating the usual multigrid

c constraint that the coarsest grid consist of "few" points for effective

c error reduction with multigrid cycling. In many cases the hybrid method

c provides more robust convergence characteristics than multigrid cycling

c alone.

c

c The form of the pde solved is:

c

c

c cxx(x,y)*pxx + cyy(x,y)*pyy + cx(x,y)*px + cy(x,y)*py +

c

c ce(x,y)*p(x,y) = r(x,y).

c

c

c pxx,pyy,px,py are second and first partial derivatives of the

c unknown real solution function p(x,y) with respect to the

c independent variables x,y. cxx,cyy,cx,cy,ce are the known

c real coefficients of the elliptic pde and r(x,y) is the known

c real right hand side of the equation. cxx and cyy should be

c positive for all x,y in the solution region. nonseparability

c means some of the coefficients depend on both x and y. if

c the pde is separable subroutine mud2sp should be used instead

c of mud2 or muh2.

c

c *** muh2 becomes a full direct method if grid size arguments are chosen

c so that the coarsest and finest grids coincide. choosing iex=jey=1

c and ixp=nx-1, jyq=ny-1 (ixp=iparm(6),jyq=iparm(7),iex=iparm(8),

c jey=iparm(9),nx=iparm(10),ny=iparm(11)) will set gaussian elimination

c on the nx by ny grid. in this case, muh2 produces a direct solution

c to the same nonseparable elliptic pde as the direct solver liptic [5,6].

c muh2 is more general than liptic since it allows periodic boundary

c conditions in the y direction.

c

c

c ... argument differences with mud2.f

c

c the input and output arguments of muh2 are almost identical to the

c arguments of mud2 (see mud2.d) with the following exceptions:

c

c (1) the work space vector "wk" requires

c

c (ixp+1)*(jyq+1)*(2*ixp+3)

c

c additional words of storage (ixp = iparm(6), jyq = iparm(7))

c if periodic boundary conditions are not flagged in the y direction

c (nyc .ne. 0 where nyc = iparm(4)) or

c

c (ixp+1)*[2*(ixp+1)*(2*jyq-1)+jyq+1]

c

c additional words of storage if periodic boundary conditions are

c flagged in the y direction (nyc = 0). the extra work space is

c used for a direct solution with gaussian elimination whenever the

c coarsest grid is encountered within multigrid cycling.

c

c (2) An integer work space iwk of length at least (ixp+1)*(jyq+1)

c must be provided.

c

c (3) jyq must be greater than 2 if periodic boundary conditions

c are flagged in the y direction and ixp must be greater than

c 2 if periodic boundary conditions are flagged in the x direction.

c inputting jyq = 2 when nyc = 0 or inputting ixp = 2 when nxa = 0

c will set the fatal error flag ierror=3.

c

c *** (4) it is no longer necessary that ixp and jyq be "small" for

c effective error reduction with multigrid iteration. there

c is no reduction in convergence rates when larger values for

c ixp or jyq are used. this provides additional flexibility

c in choosing grid size. in many cases muh2 provides more

c robust convergence than mud2. it can be used in place of

c mud2 for all nonsingular problems (see (5) below).

c

c (5) iguess = iparm(12) = 1 (flagging an initial guess) or

c maxcy = iparm(13) > 1 (setting more than one multigrid

c cycle) are not allowed if muh2 becomes a full direct method

c by choosing iex = jey = 1 (iex = iparm(8),jey = iparm(9)).

c this conflicting combination of input arguments for multigrid

c iteration and a full direct method sets the fatal error flag

c

c ierror = 13

c

c iguess = 0 and maxcy = 1 are required when muh2 becomes a

c full direct method.

c

c (6) if a "singular" pde is detected (see ierror=-3 description in mud2.d;

c ce(x,y) = 0.0 for all x,y and the boundary conditions are a combination

c of periodic and/or pure derivatives) then muh2 sets the fatal error

c flag

c

c ierror = 14

c

c The direct method utilized by muh2 would likely cause a division

c by zero in the singular case. mud2 can be tried for singular problems.
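
c the additional real work space in item (1) can be evaluated with
c a helper like the one sketched below. the name muhext is
c hypothetical. the integer work space of item (2) is simply
c (ixp+1)*(jyq+1).

```fortran
      integer function muhext(ixp,jyq,nyc)
c     hypothetical helper: additional words needed in wk by muh2
c     for gaussian elimination on the coarsest grid
      implicit none
      integer ixp,jyq,nyc
      if (nyc.ne.0) then
c       y direction is not periodic
        muhext = (ixp+1)*(jyq+1)*(2*ixp+3)
      else
c       periodic boundary conditions in the y direction
        muhext = (ixp+1)*(2*(ixp+1)*(2*jyq-1)+jyq+1)
      end if
      return
      end
```

c with ixp = jyq = 45 this evaluates to 196788 words without y
c periodicity and 378764 words with it.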

c

c

c ... grid size considerations

c

c (1) flexibility

c

c muh2 should be used in place of mud2 whenever grid size

c requirements do not allow choosing ixp and jyq to be "small"

c positive integers (typically less than 4).

c

c example:

c

c suppose we wish to solve an elliptic pde on a one degree grid on

c the full surface of a sphere. choosing ixp = jyq = 45 and iex = 4

c and jey = 3 fits the required 361 by 181 grid exactly. multigrid

c cycling will be used on the sequence of subgrid sizes:

c

c 46 x 46 < 91 x 46 < 181 x 91 < 361 x 181

c

c the 46 x 46 coarsest subgrid has too much resolution for effective

c error reduction with relaxation only. muh2 circumvents this

c difficulty by generating an exact direct solution (modulo roundoff

c error) whenever the coarsest grid is encountered.

c

c (2) additional work space (see (1) under "argument differences") is

c required by muh2 to implement gaussian elimination at the coarsest

c grid level. this may limit the size of ixp and jyq.

c

c (3) operation counts

c

c for simplicity, assume p = ixp = jyq and n = nx = ny. banded

c gaussian elimination requires o(p**4) operations for solution

c on the coarsest subgrid while multigrid iteration is a o(n**2)

c algorithm. these are approximately balanced when

c

c p**4 =: (n/(2**k))**4 =: n**2

c

c or

c

c k =: log2(n)/2

c

c grid levels are chosen with the hybrid method. so if

c p is approximately equal to

c

c n/(2**(log2(n)/2))

c

c then the direct method and multigrid parts of the hybrid algorithm

c require roughly the same amount of computer time. larger values

c for p mean the direct method will dominate the computation. smaller

c values mean the hybrid method will cost only marginally more than

c multigrid iteration with coarse grid relaxation.
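
c two of the calculations in this section can be sketched in code.
c subgrd reproduces the subgrid sequence of the sphere example
c (ixp = jyq = 45, iex = 4, jey = 3). its coarsening rule is an
c assumption inferred from that sequence, not taken from the
c mudpack source. pbal evaluates the balance point
c n/(2**(log2(n)/2)), which simplifies to sqrt(n). both names are
c hypothetical.

```fortran
      subroutine subgrd(ixp,jyq,iex,jey,k,nxk,nyk)
c     hypothetical helper: size nxk by nyk of the subgrid at level
c     k, where k=1 is the coarsest level and k=max0(iex,jey) is
c     the finest. the coarsening rule is an assumption.
      implicit none
      integer ixp,jyq,iex,jey,k,nxk,nyk,ngrid
      ngrid = max0(iex,jey)
      nxk = ixp*2**(max0(iex-ngrid+k,1)-1)+1
      nyk = jyq*2**(max0(jey-ngrid+k,1)-1)+1
      return
      end

      real function pbal(n)
c     hypothetical helper: p = n/(2**(log2(n)/2)), the value at
c     which the direct and multigrid parts cost roughly the same
      implicit none
      integer n
      real rk
      rk = 0.5*log(real(n))/log(2.0)
      pbal = real(n)/2.0**rk
      return
      end
```

c for n = 256 the balance point is p = 16 (up to roundoff).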

c

c

c *** the remaining documentation is almost identical to mud2.d

c except for the modifications already indicated.

c

c ... mesh description . . .

c

c the approximation is generated on a uniform nx by ny grid. the grid

c is superimposed on the rectangular solution region

c

c [xa,xb] x [yc,yd].

c

c let

c

c dlx = (xb-xa)/(nx-1), dly = (yd-yc)/(ny-1)

c

c be the uniform grid increments in the x,y directions. then

c

c xi=xa+(i-1)*dlx, yj=yc+(j-1)*dly

c

c for i=1,...,nx and j=1,...,ny denote the x,y uniform mesh points.
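
c the mesh point formulas translate directly into code. the routine
c name meshpt is hypothetical and the sketch assumes nx and ny are
c both greater than 1.

```fortran
      subroutine meshpt(xa,xb,yc,yd,nx,ny,i,j,xi,yj)
c     hypothetical helper: coordinates (xi,yj) of mesh point (i,j)
c     on the uniform nx by ny grid over [xa,xb] x [yc,yd]
      implicit none
      integer nx,ny,i,j
      real xa,xb,yc,yd,xi,yj,dlx,dly
      dlx = (xb-xa)/real(nx-1)
      dly = (yd-yc)/real(ny-1)
      xi = xa+real(i-1)*dlx
      yj = yc+real(j-1)*dly
      return
      end
```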

c

c

c ... language

c

c fortran90/fortran77

c

c

c ... portability

c

c mudpack5.0.1 software has been compiled and tested with fortran77

c and fortran90 on a variety of platforms.

c

c ... methods

c

c details of the methods employed by the solvers in mudpack are given

c in [1,9]. [1,2,9] contain performance measurements on a variety of

c elliptic pdes (see "references" in the file "readme"). in summary:

c

c *** discretization and solution (second-order solvers) (see [1])

c

c the pde and boundary conditions are automatically discretized at all

c grid levels using second-order finite difference formulas. diagonal

c dominance at coarser grid levels is maintained in the presence of

c nonzero first-order terms by adjusting the second-order coefficient

c when necessary. the resulting block tri-diagonal linear system is

c approximated using multigrid iteration [10,11,13,15,16,18]. version

c 5.0.1 of mudpack uses only fully weighted residual restriction. defaults

c include cubic prolongation and w(2,1) cycles. these can be overridden

c with selected multigrid options (see "mgopt"). error control based on

c maximum relative differences is available. full multigrid cycling (fmg)

c or cycling beginning or restarting at the finest grid level can be

c selected. a menu of relaxation methods including gauss-seidel point,

c line relaxation(s) (in any combination of directions) and planar

c relaxation (for three-dimensional anisotropic problems) are provided.

c all methods use ordering based on alternating points (red/black),

c lines, or planes for cray vectorization and improved convergence

c rates [14].

c

c *** higher order solution (fourth-order solvers) (see [9,19,21])

c

c if the multigrid cycling results in a second-order estimate (i.e.,

c discretization level error is reached) then this can be improved to a

c fourth-order estimate using the technique of "deferred corrections."

c the values in the solution array are used to generate a fourth-order

c approximation to the truncation error. second-order finite difference

c formulas are used to approximate third and fourth partial derivatives

c of the solution function [3]. the truncation error estimate is

c transferred down to all grid levels using weighted averaging where

c it serves as a new right hand side. the default multigrid options

c are used to compute the fourth-order correction term which is added

c to the original solution array.

c

c

c ... references (partial)

c

c

c [1] J. Adams, "MUDPACK: Multigrid Fortran Software for the Efficient

c Solution of Linear Elliptic Partial Differential Equations,"

c Applied Math. and Comput. vol.34, Nov 1989, pp.113-146.

c

c [2] J. Adams,"FMG Results with the Multigrid Software Package MUDPACK,"

c proceedings of the fourth Copper Mountain Conference on Multigrid, SIAM,

c 1989, pp.1-12.

c .

c .

c .

c [7] J. Adams, R. Garcia, B. Gross, J. Hack, D. Haidvogel, and V. Pizzo,

c "Applications of Multigrid Software in the Atmospheric Sciences,"

c Mon. Wea. Rev.,vol. 120 # 7, July 1992, pp. 1447-1458.

c .

c .

c .

c [9] J. Adams, "Recent Enhancements in MUDPACK, a Multigrid Software

c package for Elliptic Partial Differential Equations," Applied Math.

c and Comp., vol. 43, May 1991, pp. 79-94.

c

c [10] J. Adams, "MUDPACK-2: Multigrid Software for Approximating

c Elliptic Partial Differential Equations on Uniform Grids with

c any Resolution," Applied Math. and Comp., vol. 53, February

c 1993, pp. 235-249.

c .

c .

c .

c

c ... argument description

c

c

c **********************************************************************

c *** input arguments *************************************************

c **********************************************************************

c

c

c ... iparm

c

c an integer vector of length 17 used to pass integer

c arguments. iparm is set internally and defined as

c follows:

c

c

c ... intl=iparm(1)

c

c an initialization argument. intl=0 must be input

c on an initial call. in this case input arguments will

c be checked for errors and the elliptic partial differential

c equation and boundary conditions will be discretized using

c second order finite difference formulas.

c

c *** an approximation is not generated after an intl=0 call!

c muh2 should be called with intl=1 to approximate the elliptic

c pde discretized by the intl=0 call. intl=1 should also

c be input if muh2 has been called earlier and only the

c values in rhs (see below) or gbdy (see bndyc below)

c or phi (see below) have changed. this will bypass

c redundant pde discretization and argument checking

c and save computational time. some examples of when

c intl=1 calls should be used are:

c

c (0) after an intl=0 argument checking and discretization call

c

c (1) muh2 is being recalled for additional accuracy. in

c this case iguess=iparm(12)=1 should also be used.

c

c (2) muh2 is being called every time step in a time dependent

c problem (see discussion below) where the elliptic operator

c does not depend on time.

c

c (3) muh2 is being used to solve the same elliptic equation

c for several different right hand sides (iguess=0 should

c probably be used for each new right hand side).

c

c intl = 0 must be input before calling with intl = 1 when any

c of the following conditions hold:

c

c (a) the initial call to muh2

c (b) any of the integer arguments other than iguess=iparm(12)

c or maxcy=iparm(13) or mgopt have changed since the previous

c call.

c

c (c) any of the floating point arguments other than tolmax=

c fparm(5) have changed since the previous call

c

c (d) any of the coefficients input by coef (see below) have

c changed since the previous call

c

c (e) any of the "alfa" coefficients input by bndyc (see below)

c have changed since the previous call.

c

c if any of (a) through (e) are true then the elliptic pde

c must be discretized or rediscretized. if none of (a)

c through (e) holds, calls can be made with intl=1.

c incorrect calls with intl=1 will produce erroneous results.

c *** the values set in the saved work space "work" (see below) with

c an intl=0 call must be preserved with subsequent intl=1 calls.

c

c MUDPACK software performance should be monitored for intl=1

c calls. The intl=0 discretization call performance depends

c primarily on the efficiency or lack of efficiency of the

c user provided subroutines for pde coefficients and

c boundary conditions.

c

c

c ... nxa=iparm(2)

c

c flags boundary conditions on the edge x=xa

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxa=0 then nxb=0 is required, see ierror = 2)

c

c = 1 if p(xa,y) is specified (this must be input thru phi(1,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xa

c (see bndyc)

c

c

c ... nxb=iparm(3)

c

c flags boundary conditions on the edge x=xb

c

c = 0 if p(x,y) is periodic in x on [xa,xb]

c (i.e., p(x+xb-xa,y) = p(x,y) for all x,y)

c (if nxb=0 then nxa=0 is required, see ierror = 2)

c

c = 1 if p(xb,y) is specified (this must be input thru phi(nx,j))

c

c = 2 if there are mixed derivative boundary conditions at x=xb

c (see bndyc)

c

c

c ... nyc=iparm(4)

c

c flags boundary conditions on the edge y=yc

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyc=0 then nyd=0 is required, see ierror = 2)

c

c = 1 if p(x,yc) is specified (this must be input thru phi(i,1))

c

c = 2 if there are mixed derivative boundary conditions at y=yc

c (see bndyc)

c

c

c ... nyd=iparm(5)

c

c flags boundary conditions on the edge y=yd

c

c = 0 if p(x,y) is periodic in y on [yc,yd]

c (i.e., p(x,y+yd-yc) = p(x,y) for all x,y)

c (if nyd=0 then nyc=0 is required, see ierror = 2)

c

c = 1 if p(x,yd) is specified (this must be input thru phi(i,ny))

c

c = 2 if there are mixed derivative boundary conditions at y=yd

c (see bndyc)

c

c

c *** grid size arguments

c

c

c ... ixp = iparm(6)

c

c an integer greater than one which is used in defining the number

c of grid points in the x direction (see nx = iparm(10)). "ixp+1"

c is the number of points on the coarsest x grid visited during

c multigrid cycling. ixp should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the x direction is not used.

c ixp should be 2 or a small odd value since a power

c of 2 factor of ixp can be removed by increasing iex = iparm(8)

c without changing nx = iparm(10).

c

c

c ... jyq = iparm(7)

c

c an integer greater than one which is used in defining the number

c of grid points in the y direction (see ny = iparm(11)). "jyq+1"

c is the number of points on the coarsest y grid visited during

c multigrid cycling. jyq should be chosen as small as possible.

c recommended values are the small primes 2 or 3.

c larger values can reduce multigrid convergence rates considerably,

c especially if line relaxation in the y direction is not used.

c jyq should be 2 or a small odd value since a power

c of 2 factor of jyq can be removed by increasing jey = iparm(9)

c without changing ny = iparm(11).

c

c

c ... iex = iparm(8)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the x direction (see nx = iparm(10)).

c iex .le. 50 is required. for efficient multigrid cycling,

c iex should be chosen as large as possible and ixp=iparm(6)

c as small as possible within grid size constraints when

c defining nx.

c

c

c ... jey = iparm(9)

c

c a positive integer exponent of 2 used in defining the number

c of grid points in the y direction (see ny = iparm(11)).

c jey .le. 50 is required. for efficient multigrid cycling,

c jey should be chosen as large as possible and jyq=iparm(7)

c as small as possible within grid size constraints when

c defining ny.

c

c

c

c ... nx = iparm(10)

c

c the number of equally spaced grid points in the interval [xa,xb]

c (including the boundaries). nx must have the form

c

c nx = ixp*(2**(iex-1)) + 1

c

c where ixp = iparm(6), iex = iparm(8).

c

c

c ... ny = iparm(11)

c

c the number of equally spaced grid points in the interval [yc,yd]

c (including the boundaries). ny must have the form:

c

c ny = jyq*(2**(jey-1)) + 1

c

c where jyq = iparm(7), jey = iparm(9).

c

c

c *** example

c

c suppose a solution is wanted on a 33 by 97 grid. then

c ixp=2, jyq=6 and iex=jey=5 could be used. a better

c choice would be ixp=2, jyq=3, and iex=5, jey=6.
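
c

c *** a minimal sketch (an illustration, not part of mudpack) of one

c way to find the "better choice" above: strip factors of 2 from

c nx-1 until the remainder is 2 or odd. the same loop with ny, jyq

c and jey handles the y direction.

c

c       ixp = nx-1

c       iex = 1

c       do while (ixp.gt.2 .and. mod(ixp,2).eq.0)

c         ixp = ixp/2

c         iex = iex+1

c       end do

c

c for nx = 33 this gives ixp=2, iex=5 and for ny = 97 it gives

c jyq=3, jey=6. if the remaining ixp or jyq is large, a nearby

c grid size with a smaller odd factor may be preferable.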

c

c *** grid size flexibility considerations:

c

c the hybrid multigrid/direct method code muh2 provides more grid size

c flexibility than mud2 by removing the constraint that ixp and jyq are

c 2 or 3. this is accomplished by using a direct method whenever the

c coarsest (ixp+1) x (jyq+1) grid is encountered in multigrid cycling.

c if nx = ixp+1 and ny = jyq+1 then muh2 becomes a full direct method.

c muh2 is roughly equivalent to mud2 in efficiency as long as ixp and

c jyq remain "small". if the problem to be approximated requires

c a grid neither mud2 nor muh2 can exactly fit then another option

c is to generate an approximation on a "close grid" using mud2 or muh2.

c then transfer the result to the required grid using cubic interpolation

c via the package "regridpack" (contact john adams about this software).

c

c *** note

c

c let G be the nx by ny fine grid on which the approximation is

c generated and let n = max0(iex,jey). in mudpack, multigrid

c cycling is implemented on the ascending chain of grids

c

c G(1) < ... < G(k) < ... < G(n) = G.

c

c each G(k) (k=1,...,n) has mx(k) by my(k) grid points

c given by:

c

c mx(k) = ixp*[2**(max0(iex+k-n,1)-1)] + 1

c

c my(k) = jyq*[2**(max0(jey+k-n,1)-1)] + 1

c

c If iex = jey = 1 then G(1) = G(n) and muh2 solves the problem

c directly with block banded Gaussian elimination. Otherwise

c muh2 replaces relaxation with a direct method on G(1).
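
c

c *** worked example (illustrative): with ixp=2, jyq=3, iex=5, jey=6

c (the 33 by 97 grid above), n = max0(5,6) = 6 and the formulas give

c

c   k        1    2    3    4    5    6

c   mx(k)    3    3    5    9   17   33

c   my(k)    4    7   13   25   49   97

c

c the x grid is the same on G(1) and G(2) because iex < jey.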

c

c ... iguess=iparm(12)

c

c = 0 if no initial guess to the pde is provided

c

c = 1 if an initial guess to the pde at the finest grid

c level is provided in phi (see below)

c

c comments on iguess = 0 or 1 . . .

c

c even if iguess = 0, phi must be initialized at all grid points (this

c is not checked). phi can be set to 0.0 at non-dirichlet grid points

c if nothing better is available. the values set in phi when iguess = 0

c are passed down and serve as an initial guess to the pde at the coarsest

c grid level where cycling commences. in this sense, values input in

c phi always serve as an initial guess. setting iguess = 0 forces full

c multigrid cycling beginning at the coarsest and finishing at the finest

c grid level.

c

c if iguess = 1 then the values input in phi are an initial guess to the

c pde at the finest grid level where cycling begins. this option should

c be used only if a "very good" initial guess is available (as, for

c example, when restarting from a previous iguess=0 call).

c

c time dependent problems . . .

c

c *** assume we are solving an elliptic pde every time step in a

c marching problem of the form:

c

c l(p(t)) = r(t)

c

c where the differential operator "l" has no time dependence,

c "p(t)" is the solution and "r(t)" is the right hand side at

c current time "t". let "dt" be the increment between time steps.

c then p(t) can be used as an initial guess to p(t+dt) with

c intl = 1 when solving

c

c l(p(t+dt)) = r(t+dt).

c

c after the first two time steps, rather than continue, it would

c be better to define the "correction" term:

c

c e(t,dt) = p(t+dt) - p(t)

c

c this clearly satisfies the equation

c

c l(e(t,dt)) = r(t+dt) - r(t).

c

c this should be solved with iguess = 0 and intl = 1. boundary

c conditions for e(t,dt) are obtained from the boundary conditions

c for p(t) by subtracting given values at t from given values at

c t+dt. for example if

c

c d(p(t))/dx = f(t), d(p(t+dt))/dx = f(t+dt)

c

c at some x boundary then e(t,dt) satisfies the derivative

c boundary condition

c

c d(e(t,dt))/dx = f(t+dt) - f(t).

c

c e(t,dt) can be preset to 0.0 (at non-dirichlet points) or (if p(t-dt)

c is saved) to p(t)-p(t-dt). with iguess = 0, these values will serve

c as an initial guess to e(t,dt) at the coarsest grid level. this

c approach has the advantage that a full sequence of multigrid cycles,

c beginning at the coarsest grid level, is invoked every time step in

c solving for e(t,dt). a few digits of accuracy in e(t,dt), which is

c ordinarily much smaller than p(t), will yield several more digits of

c accuracy in the final approximation:

c

c p(t+dt) = p(t) + e(t,dt).

c

c using this approach to integrate in time will give more accuracy

c than using p(t) as an initial guess to p(t+dt) for all time steps.

c it does require additional storage.

c

c if the differential operator "l" has time dependence (either thru

c the coefficients in the pde or the coefficients in the derivative

c boundary conditions) then use p(t) as an initial guess to p(t+dt)

c when solving

c

c l(t+dt)(p(t+dt)) = r(t+dt)

c

c with intl = 0 for all time steps (the discretization must be repeated

c for each new "t"). either iguess = 0 (p(t) will then be an initial

c guess at the coarsest grid level where cycles will commence) or

c iguess = 1 (p(t) will then be an initial guess at the finest grid

c level where cycles will remain fixed) can be tried.
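
c

c *** a minimal sketch of the correction approach for an operator

c with no time dependence. the names nt, rold, rnew, rdif, p and e

c are hypothetical, and the argument order in the muh2 call is an

c assumption that should be checked against the subroutine

c statement. p holds p(t) from an earlier solve and the intl=0

c discretization call has already been made.

c

c       iparm(1) = 1

c       iparm(12) = 0

c       do it=1,nt

c c       user code: load rold = r(t) and rnew = r(t+dt), preset e

c c       to 0.0 and set its specified boundary values to the

c c       difference of the given values at t+dt and t

c         do j=1,ny

c           do i=1,nx

c             rdif(i,j) = rnew(i,j)-rold(i,j)

c           end do

c         end do

c         call muh2(iparm,fparm,wk,iwk,coef,bndyc,rdif,e,mgopt,ierror)

c         do j=1,ny

c           do i=1,nx

c             p(i,j) = p(i,j)+e(i,j)

c           end do

c         end do

c       end do

c

c at mixed derivative boundaries bndyc would have to return

c gbdy = f(t+dt)-f(t) for the current step, as described above.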

c

c

c ... maxcy = iparm(13)

c

c the exact number of cycles executed between the finest (nx by

c ny) and the coarsest ((ixp+1) by (jyq+1)) grid levels when

c tolmax=fparm(5)=0.0 (no error control). when tolmax > 0.0

c is input (error control) then maxcy is a limit on the number

c of cycles between the finest and coarsest grid levels. in

c any case, at most maxcy*(iprer+ipost) relaxation sweeps are

c performed at the finest grid level (see iprer=mgopt(2),

c ipost=mgopt(3) below). when multigrid iteration is working

c "correctly" only a few cycles are required for convergence. large

c values for maxcy should not be necessary.

c

c

c ... method = iparm(14) determines the method of relaxation

c (gauss-seidel based on alternating points or lines)

c

c = 0 for point relaxation

c

c = 1 for line relaxation in the x direction

c

c = 2 for line relaxation in the y direction

c

c = 3 for line relaxation in both the x and y direction

c

c

c *** choice of method. . .

c

c let fx represent the quantity cxx(x,y)/dlx**2 over the solution region.

c

c let fy represent the quantity cyy(x,y)/dly**2 over the solution region

c

c if fx,fy are roughly the same size and do not vary too much over

c the solution region choose method = 0. if this fails try method=3.

c

c if fx is much greater than fy choose method = 1.

c

c if fy is much greater than fx choose method = 2

c

c if neither fx nor fy dominates over the solution region and they

c both vary considerably choose method = 3.

c

c

c ... length = iparm(15)

c

c the length of the work space provided in vector work (see below).

c let isx = 0 if method = 0 or method = 2

c let isx = 3 if method = 1 or method = 3 and nxa.ne.0

c let isx = 5 if method = 1 or method = 3 and nxa.eq.0

c let jsy = 0 if method = 0 or method = 1

c let jsy = 3 if method = 2 or method = 3 and nyc.ne.0

c let jsy = 5 if method = 2 or method = 3 and nyc.eq.0

c

c let ldir = (ixp+1)*(jyq+1)*(2*ixp+3) if nyc.ne.0 or

c let ldir = (ixp+1)*[2*(ixp+1)*(2*jyq-1)+jyq+1] if nyc=0

c

c then . . .

c

c length = 4*[nx*ny*(10+isx+jsy)+8*(nx+ny+2)]/3 + ldir

c

c will suffice in most cases. the exact minimal work space

c length required for the current nx,ny and method is output

c in iparm(16) (even if iparm(15) is too small). this will be

c less than the value given by the simplified formula above

c in most cases.
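
c

c *** worked example (illustrative): nx=33, ny=97 (ixp=2, jyq=3)

c with method=0 and nyc.ne.0 gives isx = jsy = 0 and

c

c   ldir   = (2+1)*(3+1)*(2*2+3) = 84

c   length = 4*[33*97*10 + 8*(33+97+2)]/3 + 84

c          = 4*33066/3 + 84 = 44172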

c

c

c ... fparm

c

c a floating point vector of length 6 used to efficiently

c pass floating point arguments. fparm is set internally

c in muh2 and defined as follows . . .

c

c

c ... xa=fparm(1), xb=fparm(2)

c

c the range of the x independent variable. xa must

c be less than xb

c

c

c ... yc=fparm(3), yd=fparm(4)

c

c the range of the y independent variable. yc must

c be less than yd.

c

c

c ... tolmax = fparm(5)

c

c when input positive, tolmax is a maximum relative error tolerance

c used to terminate the relaxation iterations. assume phi1(i,j)

c and phi2(i,j) are the last two computed approximations at all

c grid points of the finest grid level. if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) for all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) for all i,j

c

c then "convergence" is considered to have occurred if and only if

c

c phdif/phmax < tolmax.

c

c

c if tolmax=fparm(5)=0.0 is input then there is no error control

c and maxcy cycles from the finest grid level are executed. maxcy

c is a limit which cannot be exceeded even with error control.

c *** calls with tolmax=0.0, when appropriate because of known

c convergence behavior, are more efficient than calls with tolmax

c positive (i.e., if possible DO NOT error control!).

c

c ... wk

c

c a one dimensional real saved work space (see iparm(15) for

c length) which must be preserved from the previous call when

c calling with intl=iparm(1)=1.

c

c ... iwk

c

c an integer vector dimensioned with length at least (ixp+1)*(jyq+1)

c (ixp = iparm(6),jyq=iparm(7)) in the routine calling muh2.

c The length of iwk is not checked! If iwk has length less than

c (ixp+1)*(jyq+1) then undetectable errors will result.

c

c ... bndyc

c

c a subroutine with arguments (kbdy,xory,alfa,gbdy) which

c are used to input mixed boundary conditions to muh2. bndyc

c must be declared "external" in the program calling muh2.

c the boundaries are numbered one thru four and the mixed

c derivative boundary conditions are described below (see the

c sample driver code "tmuh2.f" for an example of how bndyc

c can be set up).

c

c          * * * * * * * * * * * *  y=yd

c          *        kbdy=4       *

c          *                     *

c          *                     *

c          *                     *

c          * kbdy=1       kbdy=2 *

c          *                     *

c          *                     *

c          *                     *

c          *        kbdy=3       *

c          * * * * * * * * * * * *  y=yc

c

c          x=xa               x=xb

c

c

c (1) the kbdy=1 boundary

c

c this is the edge x=xa where nxa=iparm(2)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxa(y)*p(xa,y) = gbdxa(y)

c

c in this case kbdy=1,xory=y will be input to bndyc and

c alfa,gbdy corresponding to alfxa(y),gbdxa(y) must be returned.

c

c

c (2) the kbdy=2 boundary

c

c this is the edge x=xb where nxb=iparm(3)=2 flags

c a mixed boundary condition of the form

c

c dp/dx + alfxb(y)*p(xb,y) = gbdxb(y)

c

c in this case kbdy=2,xory=y, will be input to bndyc and

c alfa,gbdy corresponding to alfxb(y),gbdxb(y) must be returned.

c

c

c (3) the kbdy=3 boundary

c

c this is the edge y=yc where nyc=iparm(4)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyc(x)*p(x,yc) = gbdyc(x)

c

c in this case kbdy=3,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyc(x),gbdyc(x) must be returned.

c

c

c (4) the kbdy=4 boundary

c

c this is the edge y=yd where nyd=iparm(5)=2 flags

c a mixed boundary condition of the form

c

c dp/dy + alfyd(x)*p(x,yd) = gbdyd(x)

c

c in this case kbdy=4,xory=x will be input to bndyc and

c alfa,gbdy corresponding to alfyd(x),gbdyd(x) must be returned.

c

c

c *** bndyc must provide the mixed boundary condition values

c in correspondence with those flagged in iparm(2) thru

c iparm(5). if all boundaries are specified or periodic

c muh2 will never call bndyc. even then it must be entered

c as a dummy subroutine. bndyc must be declared "external"

c in the routine calling muh2. the actual name chosen may

c be different.
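
c

c *** a minimal sketch of a bndyc routine, assuming x=xb is the only

c mixed derivative boundary (nxb=iparm(3)=2). the values alfa = 1.0

c and gbdy = 0.0 are hypothetical placeholders for the user's

c alfxb(y) and gbdxb(y), and the real declarations assume default

c precision.

c

c       subroutine bndyc(kbdy,xory,alfa,gbdy)

c       implicit none

c       integer kbdy

c       real xory,alfa,gbdy

c       if (kbdy.eq.2) then

c c       x=xb edge: xory holds y

c         alfa = 1.0

c         gbdy = 0.0

c       end if

c       return

c       end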

c

c

c ... coef

c

c a subroutine with arguments (x,y,cxx,cyy,cx,cy,ce) which

c provides the known real coefficients for the elliptic pde at

c any grid point (x,y). the name chosen in the calling routine

c may be different where the coefficient routine must be declared

c "external."

c
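
c *** a minimal sketch of a coef routine. it assumes the pde has the

c form cxx*pxx + cyy*pyy + cx*px + cy*py + ce*p = r(x,y) (an

c assumption here; check the equation stated in the muh2

c documentation) and returns the coefficients for pxx + pyy = r.

c

c       subroutine coef(x,y,cxx,cyy,cx,cy,ce)

c       implicit none

c       real x,y,cxx,cyy,cx,cy,ce

c       cxx = 1.0

c       cyy = 1.0

c       cx = 0.0

c       cy = 0.0

c       ce = 0.0

c       return

c       end

c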

c ... rhs

c

c an array dimensioned nx by ny which contains the given

c right hand side values on the uniform 2-d mesh.

c

c rhs(i,j) = r(xi,yj) for i=1,...,nx and j=1,...,ny

c

c ... phi

c

c an array dimensioned nx by ny. on input phi must contain

c specified boundary values. for example, if nyd=iparm(5)=1

c then phi(i,ny) must be set equal to p(xi,yd) for i=1,...,nx

c prior to calling muh2. these values are preserved by muh2.

c if an initial guess is provided (iguess=iparm(12)=1) it must

c be input thru phi.

c

c

c *** if no initial guess is given (iguess=0) then phi must still

c be initialized at all grid points (this is not checked). these

c values will serve as an initial guess to the pde at the coarsest

c grid level after a transfer from the fine solution grid. set phi

c equal to 0.0 at all interior and non-specified boundary

c grid points if nothing better is available.

c

c

c ... mgopt

c

c an integer vector of length 4 which allows the user to select

c among various multigrid options. if mgopt(1)=0 is input then

c a default set of multigrid parameters (chosen for robustness)

c will be internally selected and the remaining values in mgopt

c will be ignored. if mgopt(1) is nonzero then the parameters

c in mgopt are set internally and defined as follows: (see the

c basic coarse grid correction algorithm below)

c

c

c kcycle = mgopt(1)

c

c = 0 if default multigrid options are to be used

c

c = 1 if v cycling is to be used (the least expensive per cycle)

c

c = 2 if w cycling is to be used (the default)

c

c > 2 if more general k cycling is to be used

c *** warning--values larger than 2 increase

c the execution time per cycle considerably and

c result in the nonfatal error ierror = -5

c which indicates inefficient multigrid cycling.

c

c iprer = mgopt(2)

c

c the number of "pre-relaxation" sweeps executed before the

c residual is restricted and cycling is invoked at the next

c coarser grid level (default value is 2 whenever mgopt(1)=0)

c

c ipost = mgopt(3)

c

c the number of "post relaxation" sweeps executed after cycling

c has been invoked at the next coarser grid level and the residual

c correction has been transferred back (default value is 1

c whenever mgopt(1)=0).

c

c *** if iprer, ipost, or (especially) kcycle is greater than 2

c then inefficient multigrid cycling has probably been chosen and

c the nonfatal error (see below) ierror = -5 will be set. note

c this warning may be overridden by any other nonzero value

c for ierror.

c

c intpol = mgopt(4)

c

c = 1 if multilinear prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c

c = 3 if multicubic prolongation (interpolation) is used to

c transfer residual corrections and the pde approximation

c from coarse to fine grids within full multigrid cycling.

c (this is the default value whenever mgopt(1)=0).

c

c *** the default values (2,2,1,3) in the vector mgopt were chosen for

c robustness. in some cases v(2,1) cycles with linear prolongation will

c give good results with less computation (especially in two-dimensions).

c this was the default and only choice in an earlier version of mudpack

c (see [1]) and can be set with the integer vector (1,2,1,1) in mgopt.
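
c

c *** for example (a sketch; only one of the two data statements

c would appear in the calling routine):

c

c       integer mgopt(4)

c c     default options, equivalent to (2,2,1,3)

c       data mgopt /0,0,0,0/

c c     v(2,1) cycles with linear prolongation

c       data mgopt /1,2,1,1/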

c

c *** the schedules for one full multigrid cycle (iguess=0) using v(2,1)

c cycles and w(2,1) cycles are depicted for a four level grid below.

c the number of relaxation sweeps when each grid is visited is indicated.

c the "*" stands for prolongation of the full approximation and the "."

c stands for transfer of residuals and residual corrections within the

c coarse grid correction algorithm (see below). all version 5.0.1

c mudpack solvers use only fully weighted residual restriction. The

c "D" at grid level 1 indicates a direct method is used.

c

c one fmg with v(2,1) cycles:

c

c

c ------------------------------2-----------------1------ level 4

c                              * .               .

c                             *   .             .

c ---------------2-----------1-----2-----------1--------- level 3

c               * .         .       .         .

c              *   .       .         .       .

c ------2-----1-----2-----1-----------2-----1------------ level 2

c      * .   .       .   .             .   .

c     *   . .         . .               . .

c ---D-----D-----------D-----------------D--------------- level 1

c

c

c

c one fmg with w(2,1) cycles:

c

c ------------------------2---------------------------1-- level 4

c                        * .                         .

c ----------2-----------1---2-----------3-----------1---- level 3

c          * .         .     .         . .         .

c ----2---1---2---3---1-------2---3---1---2---3---1------ level 2

c    * . .     . . . .         . . . .     . . . .

c --D---D-------D---D-----------D---D-------D---D-------- level 1

c

c

c the form of the "recursive" coarse grid correction cycling used

c when kcycle.ge.0 is input is described below in pseudo-algorithmic

c language. it is implemented non-recursively in fortran in mudpack.

c *** this algorithm is modified with the hybrid solvers which use

c a direct method whenever grid level 1 is encountered.

c

c algorithm cgc(k,l(k),u(k),r(k),kcycle,iprer,ipost,iresw,intpol)

c

c *** approximately solve l(k)*u(k) = r(k) using multigrid iteration

c *** k is the current grid level

c *** l(k) is the discretized pde operator at level k

c *** u(k) is the initial guess at level k

c *** r(k) is the right hand side at level k

c *** i(k,k-1) is the restriction operator from level k to level k-1

c *** (the form of i(k,k-1) depends on iresw)

c *** i(k-1,k) is the prolongation operator from level k-1 to level k

c *** (the form of i(k-1,k) depends on intpol)

c

c begin algorithm cgc

c

c *** pre-relax at level k

c

c . do (i=1,iprer)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . if (k > 1) then

c

c *** restrict the residual from level k to level k-1

c

c . . r(k-1) = i(k,k-1)(r(k)-l(k)*u(k))

c

c . . kount = 0

c

c . . repeat

c

c *** solve for the residual correction at level k-1 in u(k-1)

c *** using algorithm cgc "kcycle" times (this is the recursion)

c

c . . . kount = kount+1

c

c . . . invoke cgc(k-1,l(k-1),u(k-1),r(k-1),kcycle,iprer,ipost,iresw,intpol)

c

c

c . . until (kount.eq.kcycle)

c

c *** transfer residual correction in u(k-1) to level k

c *** with the prolongation operator and add to u(k)

c

c . . u(k) = u(k) + i(k-1,k)(u(k-1))

c

c . end if

c

c *** post relax at level k

c

c . do (i=1,ipost)

c

c . . relax(l(k),u(k),r(k))

c

c . end do

c

c . return

c

c end algorithm cgc

c

c

c **********************************************************************

c *** output arguments ************************************************

c **********************************************************************

c

c

c ... iparm(16) *** set for intl=0 calls only

c

c on output iparm(16) contains the actual work space length

c required. this will usually be less than that given by the

c simplified formula for length=iparm(15) (see as input argument)

c

c

c ... iparm(17) *** set for intl=1 calls only

c

c on output iparm(17) contains the actual number of multigrid cycles

c between the finest and coarsest grid levels used to obtain the

c approximation when error control (tolmax > 0.0) is set.

c

c

c ... fparm(6) *** set for intl=1 calls with fparm(5) > 0.0 only

c

c on output fparm(6) contains the final computed maximum relative

c difference between the last two iterates at the finest grid level.

c fparm(6) is computed only if there is error control (tolmax > 0.0).

c assume phi1(i,j) and phi2(i,j) are the last two computed

c values for phi(i,j) at all points of the finest grid level.

c if we define

c

c phdif = max(abs(phi2(i,j)-phi1(i,j))) over all i,j

c

c and

c

c phmax = max(abs(phi2(i,j))) over all i,j

c

c then

c

c fparm(6) = phdif/phmax

c

c is returned whenever phmax > 0.0. in the degenerate case

c phmax = 0.0, fparm(6) = phdif is returned.
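
c

c *** a minimal sketch of the quantity described above, e.g. for

c checking convergence in user code. phi1 and phi2 are the two

c iterates named in the text; reldif is a hypothetical name. this

c illustrates the definition and is not the mudpack source.

c

c       phdif = 0.0

c       phmax = 0.0

c       do j=1,ny

c         do i=1,nx

c           phdif = max(phdif,abs(phi2(i,j)-phi1(i,j)))

c           phmax = max(phmax,abs(phi2(i,j)))

c         end do

c       end do

c       if (phmax.gt.0.0) then

c         reldif = phdif/phmax

c       else

c         reldif = phdif

c       end if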

c

c

c ... work

c

c on output work contains intermediate values that must not

c be destroyed if muh2 is to be called again with intl=1

c

c

c ... phi *** for intl=1 calls only

c

c on ou