Usage Notes for ScaLAPACK Utilities

Note: For a detailed description of MPI Requirements, see Dense Matrix Parallelism Using MPI in Linear Algebra Operators and Generic Functions.

This section describes the use of ScaLAPACK, a suite of dense linear algebra solvers that is applicable when a single problem is large. We have integrated use of the IMSL Fortran Library with ScaLAPACK. However, the ScaLAPACK library, including the supporting BLACS and PBLAS libraries, is not part of this Library. To use ScaLAPACK software, the required libraries must be installed on the user’s computer system. We adhered to the specification of Blackford et al. (1997), but use only MPI for communication. The ScaLAPACK library includes certain LAPACK routines (Anderson et al., 1995) redesigned for distributed-memory parallel computers. It is written in a Single Program, Multiple Data (SPMD) style, using explicit message passing for communication. Matrices are laid out in a two-dimensional block-cyclic decomposition. Using High Performance Fortran (HPF) directives (Koelbel et al., 1994) and a static processor array, and following declaration of the array A(*,*), this layout is illustrated by:

INTEGER, PARAMETER :: N=500, P=2, Q=3, MB=32, NB=32
!HPF$ PROCESSORS PROC(P,Q)
!HPF$ DISTRIBUTE A(cyclic(MB), cyclic(NB)) ONTO PROC

Our integration work provides modules that describe the interface to the ScaLAPACK library. We recommend that users include these modules when using ScaLAPACK or the ancillary packages, including BLACS and PBLAS. To distribute data within a user’s application into the block-cyclic decomposition required by the ScaLAPACK solvers, we provide a utility that reads data from an external file and arranges it within the distributed machines for a computational step. Another utility writes the results to an external file. We also provide similar utilities that map and unmap global arrays to and from local arrays. These utilities are used in our ScaLAPACK examples for brevity.
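
The typical flow is to map global data to the local block-cyclic arrays, perform the computational step, and then unmap the result. The following is a minimal sketch of that pattern; the three-argument forms shown for ScaLAPACK_MAP and ScaLAPACK_UNMAP (global array, descriptor, local array) are assumptions here, so consult the individual routine documents in this chapter for the exact interfaces.

! Hedged sketch of the map/compute/unmap pattern.  The argument lists
! for ScaLAPACK_MAP and ScaLAPACK_UNMAP are assumed (global array,
! descriptor, local array); check the routine documents before use.
USE ScaLAPACK_Support
IMPLICIT NONE
INTEGER, PARAMETER :: N=500
INTEGER :: DESCA(9), DESCB(9)
DOUBLE PRECISION :: A(N,N), B(N)                 ! Global arrays
DOUBLE PRECISION, ALLOCATABLE :: A0(:,:), B0(:)  ! Local block-cyclic pieces

! ... establish the process grid, define DESCA and DESCB with DESCINIT,
!     and allocate A0 and B0 to the local sizes ...

CALL ScaLAPACK_MAP(A, DESCA, A0)    ! Distribute the global matrix
CALL ScaLAPACK_MAP(B, DESCB, B0)    ! Distribute the right-hand side
! (A ScaLAPACK computational step using A0 and B0 goes here.)
CALL ScaLAPACK_UNMAP(B0, DESCB, B)  ! Gather the result back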

The data types supported for these utilities are integer; single precision, real; double precision, real; single precision, complex; and double precision, complex.

A ScaLAPACK library normally includes routines for the following (a representative solver call is sketched after this list):

  • the solution of full-rank linear systems of equations,

  • general and symmetric, positive-definite, banded linear systems of equations,

  • general and symmetric, positive-definite, tridiagonal linear systems of equations,

  • condition number estimation and iterative refinement for LU and Cholesky factorization,

  • matrix inversion,

  • full-rank linear least-squares problems,

  • orthogonal and generalized orthogonal factorizations,

  • orthogonal transformation routines,

  • reductions to upper Hessenberg, bidiagonal and tridiagonal form,

  • reduction of a symmetric-definite, generalized eigenproblem to standard form,

  • the self-adjoint or Hermitian eigenproblem,

  • the generalized self-adjoint or Hermitian eigenproblem, and

  • the non-symmetric eigenproblem.
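
As a representative example of the first item, the standard ScaLAPACK driver PDGESV solves a full-rank system of equations using the distributed local arrays and their descriptors. The fragment below is a hedged sketch; it assumes the local arrays A0 and B0, their descriptors, and the pivot vector IPIV have been prepared as described later in this section.

! Hedged sketch: solve A*X = B with the ScaLAPACK LU driver PDGESV.
USE ScaLAPACK_Int
IMPLICIT NONE
INTEGER, PARAMETER :: N=500, NRHS=1
INTEGER :: DESCA(9), DESCB(9), INFO
INTEGER, ALLOCATABLE :: IPIV(:)                    ! Local pivot vector
DOUBLE PRECISION, ALLOCATABLE :: A0(:,:), B0(:,:)  ! Local pieces of A, B

! ... grid setup, DESCINIT calls, allocation, and data distribution ...

CALL PDGESV(N, NRHS, A0, 1, 1, DESCA, IPIV, B0, 1, 1, DESCB, INFO)
! INFO < 0 flags an illegal argument; INFO > 0 flags a singular factor.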

ScaLAPACK routines are available in four data types: single precision, real; double precision, real; single precision, complex; and double precision, complex. At present, the non-symmetric eigenproblem is available only in the real data types, single precision and double precision. More background information and user documentation is available on the World Wide Web at www.netlib.org/scalapack/slug/scalapack_slug.html.

For users whose linear systems or least-squares problems involve rank deficiency or simple constraints, we provide routines for:

  • full or deficient rank least-squares problems with non-negativity constraints

  • full or deficient rank least-squares problems with simple upper and lower bound constraints

These are available in two data types, single precision, real, and double precision, real; they are not part of ScaLAPACK. The matrices are distributed in a general block-column layout.

We also provide generic interfaces to a number of ScaLAPACK routines through the standard IMSL Library routines. These are listed in the Introduction of this manual.

Each global array that is to be distributed across the processor grid for use by the ScaLAPACK routines requires an array descriptor. We use the ScaLAPACK TOOLS routine DESCINIT to set up array descriptors in our examples. A typical call to DESCINIT is:

   CALL DESCINIT(DESCA, M, N, MB, NB, IRSRC, ICSRC, ICTXT, LLD, INFO)

The arguments in the above call are defined as follows for the matrix being described:

DESCA — An output integer array of length 9 which contains the array descriptor information on return.

M — An input integer which indicates the row size of the global array which is being described.

N — An input integer which indicates the column size of the global array which is being described.

MB — An input integer which indicates the blocking factor used to distribute the rows of the matrix being described.

NB — An input integer which indicates the blocking factor used to distribute the columns of the matrix being described.

IRSRC — An input integer which indicates the processor grid row over which the first row of the array being described is distributed.

ICSRC — An input integer which indicates the processor grid column over which the first column of the array being described is distributed.

ICTXT — An input integer which indicates the BLACS context handle.

LLD — An input integer indicating the leading dimension of the local array which is to be used for storing the local blocks of the array being described.

INFO — An output integer indicating whether or not the call was successful. INFO = 0 indicates a successful exit. INFO = -i indicates the i-th argument had an illegal value.

This call is equivalent to the following assignment statements:

DESCA(1) = 1               ! This is the descriptor
                           ! type. In this case, 1.
DESCA(2) = ICTXT
DESCA(3) = M
DESCA(4) = N
DESCA(5) = MB
DESCA(6) = NB
DESCA(7) = IRSRC
DESCA(8) = ICSRC
DESCA(9) = LLD

The IMSL Library routines which interface with ScaLAPACK routines use IRSRC = 0 and ICSRC = 0 for the internal calls to DESCINIT.
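
A valid LLD is at least the number of rows of the global array stored on the local process. This local extent is commonly computed with the standard ScaLAPACK TOOLS function NUMROC, as in the sketch below; BLACS_GRIDINFO is the standard BLACS grid-inquiry routine, and the context ICTXT is assumed to have been created earlier.

! Sketch: computing a valid LLD for DESCINIT using NUMROC.
! ICTXT is assumed to be a previously created BLACS context.
INTEGER, PARAMETER :: M=500, N=500, MB=32, NB=32, IRSRC=0, ICSRC=0
INTEGER :: ICTXT, NPROW, NPCOL, MYROW, MYCOL, LLD, INFO
INTEGER :: DESCA(9)
INTEGER, EXTERNAL :: NUMROC

! Coordinates of this process in the NPROW-by-NPCOL grid.
CALL BLACS_GRIDINFO(ICTXT, NPROW, NPCOL, MYROW, MYCOL)
! Number of rows of the M-by-N global array held on this process.
LLD = MAX(1, NUMROC(M, MB, MYROW, IRSRC, NPROW))
CALL DESCINIT(DESCA, M, N, MB, NB, IRSRC, ICSRC, ICTXT, LLD, INFO)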

Supporting Modules

We recommend that users needing routines from ScaLAPACK, PBLAS or BLACS, Version 1.4, use the modules that describe the interfaces to the individual codes. This practice, together with the declaration directive IMPLICIT NONE, is a reliable way to write ScaLAPACK application code, since the routines may have lengthy argument lists. Using the modules helps avoid mistakes such as missing arguments or mismatches involving Type, Kind or Rank (TKR). The modules are part of the Fortran Library product. There is a comprehensive module, ScaLAPACK_Support, that includes all of the modules in the table below. Using this comprehensive module reduces the amount of code required to obtain interface checking, but at the cost of increased source compilation time compared with using the individual modules.
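
For example, a program unit calling these packages might begin as follows; this is a minimal sketch using only module names from the table below.

! Minimal sketch: bring checked interfaces into scope before any calls.
USE ScaLAPACK_Support   ! Comprehensive; alternatively USE the individual
                        ! modules, e.g. ScaLAPACK_Int, BLACS_Int, PBLAS_Int
IMPLICIT NONE           ! Require explicit declarations throughout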

Module Name           Contents of the Module

ScaLAPACK_Support     All of the following modules
ScaLAPACK_Int         All interfaces to ScaLAPACK routines
PBLAS_Int             All interfaces to the parallel BLAS, or PBLAS
BLACS_Int             All interfaces to the basic linear algebra communication routines, or BLACS
TOOLS_Int             Interfaces to ancillary routines used by ScaLAPACK, but not in other packages
LAPACK_Int            All interfaces to LAPACK routines required by ScaLAPACK
ScaLAPACK_IO_Int      All interfaces to the ScaLAPACK_READ and ScaLAPACK_WRITE utility routines. See this chapter.
MPI_Node_Int          The module holding data describing the MPI communicator, MP_LIBRARY_WORLD. See Dense Matrix Parallelism Using MPI.
GRIDINFO_Int          The module holding data describing the processor grid and the information required to map the target array to the processors. See the Description section of ScaLAPACK_SETUP below.
ScaLAPACK_MAP_Int     The interface to the ScaLAPACK_MAP utility routines.
ScaLAPACK_UNMAP_Int   The interface to the ScaLAPACK_UNMAP utility routines.