General Remarks
Use of the ScaLAPACK-enhanced routines allows a user to solve large systems of linear algebraic equations, in parallel across multiple computers, at a performance level that might not be achievable on a single computer. One might also use these routines on linear systems that prove to be too large for the address space of the target computer. Visual Numerics has tried to facilitate the use of parallel computing in these situations by providing interfaces to ScaLAPACK routines that accomplish the task. The IMSL Library solver interface has the same look and feel whether one is using a routine on a single computer or across multiple computers.
The basic steps required to utilize the IMSL routines which interface with ScaLAPACK routines are:
1. Initialize MPI
2. Initialize the processor grid
3. Define any necessary array descriptors
4. Allocate space for the local arrays
5. Set up local matrices across the processor grid
6. Call the IMSL routine which interfaces with ScaLAPACK
7. Gather the results from across the processor grid
8. Release the processor grid
9. Exit MPI
Utilities are provided in the IMSL Library that facilitate these steps for the user. Each of these utilities is documented in Chapter 11, “Utilities”. We visit the steps briefly here:
The user should call MP_SETUP() at this step. This function is described in detail in “Getting Started with Modules MPI_setup_int and MPI_node_int” in Chapter 10, Linear Algebra Operators and Generic Functions, of this manual. For ScaLAPACK usage, suffice it to say that following a call to the function MP_SETUP(), the module MPI_node_int will contain information about the number of processors, the rank of a processor, and the communicator for the application. A call to this function will return the number of processors available to the program. Since the module MPI_node_int is used by MPI_setup_int, it is not necessary to explicitly use the module MPI_node_int. If MP_SETUP() is not called, then the program will compute entirely on one node. No routine from MPI will be called.
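As an illustrative sketch of this initialization step (the assignment of the function result to MP_NPROCS follows the calling pattern described in Chapter 10):

      USE MPI_SETUP_INT
      ! Initialize MPI.  MP_SETUP() returns the number of processors
      ! available to the program; the rank of this processor, MP_RANK,
      ! and the communicator come from the module MPI_node_int.
      MP_NPROCS = MP_SETUP()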
SCALAPACK_SETUP (see Chapter 11, “Utilities”) is called at this step. This call will set up the processor grid for the user, define the context ID variable, MP_ICTXT, for the processor grid, and place MP_ICTXT into the module GRIDINFO_INT. Use of SCALAPACK_SUPPORT will make the information in MPI_NODE_INT and GRIDINFO_INT available to the user's program.
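A sketch of this step for an N by N problem is shown below. The argument list is an assumption patterned after typical usage; see Chapter 11, “Utilities,” for the definitive calling sequence of SCALAPACK_SETUP.

      USE SCALAPACK_SUPPORT
      ! Set up the processor grid for an N by N problem and define the
      ! context ID, MP_ICTXT, which is placed in the module GRIDINFO_INT.
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)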
Consider the generic matrix A which is to be carved up and distributed across the processors in the processor grid. In ScaLAPACK parlance, we refer to A as being the “global” array A, which is to be distributed across the processor grid in 2D block cyclic fashion (see Chapter 11, “Utilities”). Each processor in the grid will then have access to a subset of the global array A. We refer to the subset array to which the individual processor has access as the “local” array A0. Just as it is sometimes necessary for a program to be aware of the leading dimension of the global array A, it is also necessary for the program to be aware of other critical information about the local array A0. This information can be obtained by calling the IMSL utility SCALAPACK_GETDIM (Chapter 11, “Utilities”). The ScaLAPACK Library utility DESCINIT (see the Usage Notes section of Chapter 11, “Utilities”) is then used to store this information in a vector known as the array descriptor.
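A sketch of this step for an N by N global array A and an N-element right-hand side, assuming the block sizes MP_MB and MP_NB and the context MP_ICTXT have been made available by the previous step:

      INTEGER :: DESCA(9), DESCX(9), MXLDA, MXCOL, INFO
      ! Obtain the dimensions MXLDA and MXCOL of the local array A0
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
      ! Store the descriptive information for the distributed matrix
      ! and right-hand side in the array descriptors DESCA and DESCX
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)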
The array dimensions, obtained in the previous step, are used at this point to allocate space for any local arrays that will be used in the call to the IMSL routine.
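For example, with the dimensions MXLDA and MXCOL obtained above (the single-precision declarations are illustrative):

      REAL, ALLOCATABLE :: A0(:,:), B0(:), X0(:)
      ! Every processor in the grid allocates its own local arrays
      ALLOCATE (A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))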
If the matrices to be used by the solvers have not been distributed across the processor grid, IMSL provides utility routines SCALAPACK_READ and SCALAPACK_MAP to help in the distribution of global arrays across processors. SCALAPACK_READ will read data from a file while SCALAPACK_MAP will map a global array to the processor grid. Users may choose to distribute the arrays themselves as long as they distribute the arrays in 2D block cyclic fashion consistent with the array descriptors that have been defined.
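A sketch of the two approaches follows; the global arrays A and B are assumed to reside on the rank-0 node, and the file name is purely illustrative:

      ! Map the global arrays from the rank-0 node onto the grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
      CALL SCALAPACK_MAP(B, DESCX, B0)
      ! Alternatively, read the matrix from a data file directly into
      ! the local arrays
      CALL SCALAPACK_READ('a_matrix.dat', DESCA, A0)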
The IMSL routines which interface with ScaLAPACK are listed in Table D.
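Once the local arrays are in place, the solver is called with them in place of the global arrays. As a hedged example, assuming the high accuracy real general solver LSARG is among the routines in Table D:

      ! Solve A x = b using the distributed (local) arrays
      CALL LSARG (A0, B0, X0)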
IMSL provides utility routines SCALAPACK_WRITE and SCALAPACK_UNMAP to help in the gathering of results from across processors to a global array or file. SCALAPACK_WRITE will write data to a file while SCALAPACK_UNMAP will map local arrays from the processor grid to a global array.
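A sketch of this step, mirroring the distribution step above; the file name is again only illustrative:

      ! Gather the distributed solution back into the global array X;
      ! after the unmap, only the rank-0 node holds the full array
      CALL SCALAPACK_UNMAP(X0, DESCX, X)
      ! Alternatively, write the distributed result to a data file
      CALL SCALAPACK_WRITE('x_solution.dat', DESCX, X0)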
The processor grid is released by a call to SCALAPACK_EXIT.
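For example:

      ! Release the processor grid associated with the context MP_ICTXT
      CALL SCALAPACK_EXIT(MP_ICTXT)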
A call to MP_SETUP with the argument 'FINAL' will shut down MPI and set the value of MP_NPROCS = 0. This flags that MPI has been initialized and terminated. It cannot be initialized again in the same program unit execution. No MPI routine is defined when MP_NPROCS has this value.
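The final step, as a sketch:

      ! Shut down MPI; MP_NPROCS is set to 0 on return
      MP_NPROCS = MP_SETUP('FINAL')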