ScaLAPACK

ScaLAPACK_UNMAP

For a detailed description of MPI requirements, see “Using ScaLAPACK Enhanced Routines” in the Introduction of this manual.

This routine unmaps array data from local distributed arrays to a global array. The data in the local arrays must have been stored in the two-dimensional block-cyclic form required by ScaLAPACK routines. All processors in the BLACS context call the routine.
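The block-cyclic layout that the routine reverses can be illustrated in one dimension. The module below is a minimal sketch, not part of the IMSL library: the names owner_of, global_to_local, and local_to_global are hypothetical, the source process is assumed to be 0, and the index arithmetic follows the usual ScaLAPACK convention (1-based indices, 0-based process numbers).

```fortran
module block_cyclic_sketch
  implicit none
contains

  ! Process (0-based) that owns global index ig (1-based)
  ! when blocks of nb entries are dealt out cyclically.
  pure integer function owner_of(ig, nb, nprocs)
    integer, intent(in) :: ig, nb, nprocs
    owner_of = mod((ig - 1) / nb, nprocs)
  end function owner_of

  ! Local index (1-based) of global index ig on its owning process.
  pure integer function global_to_local(ig, nb, nprocs)
    integer, intent(in) :: ig, nb, nprocs
    global_to_local = ((ig - 1) / (nb * nprocs)) * nb + mod(ig - 1, nb) + 1
  end function global_to_local

  ! Global index (1-based) of local index il on process p.
  ! This is the direction an unmap operation needs.
  pure integer function local_to_global(il, p, nb, nprocs)
    integer, intent(in) :: il, p, nb, nprocs
    local_to_global = ((il - 1) / nb) * nb * nprocs + p * nb &
                      + mod(il - 1, nb) + 1
  end function local_to_global

end module block_cyclic_sketch
```

For a vector of length 9 with block size 2 on two processes, these formulas place global entries 1, 2, 5, 6, 9 on process 0 and entries 3, 4, 7, 8 on process 1. The block size of 2 is chosen only for illustration; the block sizes actually used are those stored in the array descriptor.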

Required Arguments

A0 — This is a local rank-1 or rank-2 array that contains this processor’s piece of the block-cyclic array. The data type for A0 is any of five Fortran intrinsic types: integer; single precision, real; double precision, real; single precision, complex; double precision, complex. (Input)

DESC_A — An integer vector containing the nine parameters associated with the ScaLAPACK matrix descriptor for array A. See “Usage Notes for ScaLAPACK Utilities” for a description of the nine parameters. (Input)

A — Global rank-1 or rank-2 array that receives the array that had been mapped to the processor grid. The data type for A is any of five Fortran intrinsic types: integer; single precision, real; double precision, real; single precision, complex; double precision, complex. After ScaLAPACK_UNMAP has been called, A is valid only on MP_RANK = 0. (Output)
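As a reading aid for DESC_A, the sketch below names the nine entries using the standard ScaLAPACK dense-matrix descriptor layout, which is the layout DESCINIT fills in. The helper make_desc is hypothetical and only mimics that layout for illustration, with the source process row and column fixed at 0; real programs should call DESCINIT, and “Usage Notes for ScaLAPACK Utilities” remains the authoritative description.

```fortran
module descriptor_sketch
  implicit none
  ! Positions of the nine descriptor entries (standard ScaLAPACK names).
  integer, parameter :: DTYPE_ = 1  ! descriptor type (1 = dense matrix)
  integer, parameter :: CTXT_  = 2  ! BLACS context handle
  integer, parameter :: M_     = 3  ! global number of rows
  integer, parameter :: N_     = 4  ! global number of columns
  integer, parameter :: MB_    = 5  ! row block size
  integer, parameter :: NB_    = 6  ! column block size
  integer, parameter :: RSRC_  = 7  ! process row holding the first row
  integer, parameter :: CSRC_  = 8  ! process column holding the first column
  integer, parameter :: LLD_   = 9  ! leading dimension of the local array
contains

  ! Hypothetical stand-in for DESCINIT, for illustration only.
  pure function make_desc(m, n, mb, nb, ictxt, lld) result(desc)
    integer, intent(in) :: m, n, mb, nb, ictxt, lld
    integer :: desc(9)
    desc(DTYPE_) = 1
    desc(CTXT_)  = ictxt
    desc(M_)     = m
    desc(N_)     = n
    desc(MB_)    = mb
    desc(NB_)    = nb
    desc(RSRC_)  = 0
    desc(CSRC_)  = 0
    desc(LLD_)   = lld
  end function make_desc

end module descriptor_sketch
```

With these names, an expression such as DESC_A(MB_) reads the row block size directly instead of relying on the bare index 5.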

Optional Arguments

LDA — Leading dimension of A as specified in the calling program. If this argument is not present, size(A,1) is used. (Input)

COLMAP — Logical flag indicating whether the global array is mapped in column-major or row-major form. Setting COLMAP to .TRUE. maps the array in column-major form; setting it to .FALSE. maps it in row-major form. (Input) Default: COLMAP = .TRUE.

FORTRAN 90 Interface

Generic: CALL ScaLAPACK_UNMAP (A0, DESC_A, A [, …])

Description

Subroutine ScaLAPACK_UNMAP unmaps columns or rows of local distributed arrays to a global array on MP_RANK = 0. It uses the two-dimensional block-cyclic array descriptor for the matrix to retrieve the data from the assumed-size arrays on the processors. The block sizes, contained in the array descriptor, determine the data set size for each blocking send and receive pair. The number of these synchronization points is proportional to ⌈M × N / (MB × NB)⌉. A temporary local buffer is allocated for staging the array data. It is of size M by NB, when mapping by columns, or N by MB, when mapping by rows.
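The two sizes mentioned above can be estimated before calling the routine. The following is a rough sketch with hypothetical function names. Here M and N are the global dimensions and MB and NB the block sizes from the descriptor. The staging buffer sizes are taken from the description; the block count is an assumption that the number of synchronization points scales with the number of MB by NB blocks needed to cover M × N elements.

```fortran
module unmap_cost_sketch
  implicit none
contains

  ! Number of elements in the staging buffer:
  ! M by NB when mapping by columns, N by MB when mapping by rows.
  pure integer function staging_buffer_size(m, n, mb, nb, colmap)
    integer, intent(in) :: m, n, mb, nb
    logical, intent(in) :: colmap
    if (colmap) then
      staging_buffer_size = m * nb
    else
      staging_buffer_size = n * mb
    end if
  end function staging_buffer_size

  ! Assumed scale of the synchronization count:
  ! ceiling(M*N / (MB*NB)), computed in integer arithmetic.
  pure integer function block_count(m, n, mb, nb)
    integer, intent(in) :: m, n, mb, nb
    block_count = (m * n + mb * nb - 1) / (mb * nb)
  end function block_count

end module unmap_cost_sketch
```

For a 9 by 9 array with 2 by 2 blocks, the column-wise staging buffer holds 18 elements and the block count is 21. Larger block sizes mean fewer send and receive pairs but a larger staging buffer.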

Example: Distributed Linear Solver with IMSL ScaLAPACK Interface

The program SCPKMP_EX1 illustrates solving a system of linear algebraic equations, Ax = b, by calling routine LSLRG, an IMSL routine which interfaces with a ScaLAPACK routine. The right-hand side is produced by defining A and y to have random values and then computing the matrix-vector product b = Ay. IMSL routine ScaLAPACK_SETUP is called to define the process grid and provide further information identifying each process. IMSL routine ScaLAPACK_MAP is called to map the global arrays to local distributed arrays. Then LSLRG is called to compute the approximate solution, x, and ScaLAPACK_UNMAP is called to gather x onto MP_RANK = 0. The problem is small enough that the residuals, x - y ≈ 0, are checked on MP_RANK = 0.
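The example sizes its local arrays with MXLDA and MXCOL, obtained from SCALAPACK_GETDIM. How that routine computes them is not described here. As a rough guide to the sizes involved, the sketch below reproduces the counting logic of ScaLAPACK's NUMROC for a source process of 0: it returns how many of n rows (or columns) land on process p under a block-cyclic distribution. The name local_extent is hypothetical, and any relationship to the values SCALAPACK_GETDIM returns is an assumption.

```fortran
module local_extent_sketch
  implicit none
contains

  ! Number of the n global rows (or columns) stored on process p
  ! (0-based) when blocks of size nb are dealt cyclically to nprocs
  ! processes, starting with process 0.
  pure integer function local_extent(n, nb, p, nprocs)
    integer, intent(in) :: n, nb, p, nprocs
    integer :: nblocks, extra
    nblocks = n / nb                 ! number of full blocks
    extra   = mod(nblocks, nprocs)   ! full blocks left after even rounds
    local_extent = (nblocks / nprocs) * nb
    if (p < extra) then
      ! This process receives one more full block.
      local_extent = local_extent + nb
    else if (p == extra) then
      ! This process receives the trailing partial block, if any.
      local_extent = local_extent + mod(n, nb)
    end if
  end function local_extent

end module local_extent_sketch
```

For N = 9 with a block size of 2 on two processes, process 0 holds 5 rows and process 1 holds 4. A local array allocated with the largest such count over all processes is large enough on every process.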

program scpkmp_ex1
! This is Example 1 for ScaLAPACK_MAP and ScaLAPACK_UNMAP.
! A linear system is solved with an IMSL routine which
! interfaces with ScaLAPACK and is checked.
USE ScaLAPACK_SUPPORT
USE ERROR_OPTION_PACKET
USE MPI_SETUP_INT
USE LSLRG_INT
IMPLICIT NONE
INCLUDE "mpif.h"

INTEGER, PARAMETER :: N=9
INTEGER MXLDA, MXCOL, INFO, DESC_A(9), DESC_X(9)
LOGICAL :: GRID1D = .TRUE., NSQUARE = .TRUE.
real(kind(1d0)) :: ERROR=0d0, SIZE_Y
real(kind(1d0)), allocatable, dimension(:,:) :: A, B(:), &
    X(:), Y(:), A0, B0(:), X0(:)

MP_NPROCS=MP_SETUP()

! Set up a 1D processor grid and define its context ID, MP_ICTXT
CALL SCALAPACK_SETUP(N, N, NSQUARE, GRID1D)

! Get the array descriptor entities MXLDA, and MXCOL
CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)

! Set up the array descriptors
CALL DESCINIT(DESC_A, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
    MXLDA, INFO)
CALL DESCINIT(DESC_X, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, &
    MXLDA, INFO)

! Allocate space for local arrays
ALLOCATE(A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))

! A root process is used to create the matrix data for the test.
IF(MP_RANK == 0) THEN
    ALLOCATE(A(N,N), B(N), X(N), Y(N))
    CALL RANDOM_NUMBER(A); CALL RANDOM_NUMBER(Y)
! Compute the correct result.
    B=MATMUL(A,Y); SIZE_Y=SUM(ABS(Y))
END IF

! Map the input arrays to the processor grid
CALL SCALAPACK_MAP(A, DESC_A, A0)
CALL SCALAPACK_MAP(B, DESC_X, B0)

! Compute the distributed product solution to A x = b.
CALL LSLRG(A0, B0, X0)

! Put the result on the root node.
Call ScaLAPACK_UNMAP(X0, DESC_X, X)

IF(MP_RANK == 0) THEN
! Check the residuals for size.
    B=X-Y
    ERROR=SUM(ABS(B))/SIZE_Y
END IF

! See to any error messages.
call e1pop("Mp_Setup")

IF(ERROR <= SQRT(EPSILON(ERROR)) .and. MP_RANK == 0) THEN
    write(*,*) &
    " Example 1 for ScaLAPACK_MAP and ScaLAPACK_UNMAP is correct."
END IF

! Deallocate storage arrays.
IF (MP_RANK == 0) DEALLOCATE(A, B, X, Y)
DEALLOCATE(A0, B0, X0)

! Exit from using this process grid.
CALL SCALAPACK_EXIT( MP_ICTXT )

! Shut down MPI
MP_NPROCS = MP_SETUP('FINAL')

END

Output

Example 1 for ScaLAPACK_MAP and ScaLAPACK_UNMAP is correct.