ScaLAPACK_UNMAP
Note: For a detailed description of MPI Requirements see “Using ScaLAPACK Enhanced Routines” in the Introduction of this manual.
This routine unmaps array data from local distributed arrays to a global array. The data in the local arrays must have been stored in the two-dimensional block-cyclic form required by ScaLAPACK routines. All processors in the BLACS context call the routine.
Required Arguments
A0 — This is a local rank-1 or rank-2 array that contains this processor’s piece of the block-cyclic array. The data type for A0 is any of five Fortran intrinsic types: integer; single precision, real; double precision, real; single precision, complex; double precision, complex. (Input)
DESC_A — An integer vector containing the nine parameters associated with the ScaLAPACK matrix descriptor for array A. See “Usage Notes for ScaLAPACK Utilities” for a description of the nine parameters. (Input)
A — Global rank-1 or rank-2 array which is to receive the array which had been mapped to the processor grid. The data type for A is any of five Fortran intrinsic types: integer; single precision, real; double precision, real; single precision, complex; double precision, complex. A is only valid on MP_RANK = 0 after ScaLAPACK_UNMAP has been called. (Output)
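For orientation, the nine descriptor entries that DESC_A carries can be sketched as follows. This is a minimal illustration assuming the conventional ScaLAPACK layout for a dense matrix descriptor. The module `descriptor_sketch`, the helper `make_desc`, and the placeholder context value 0 are hypothetical and are not part of the IMSL interface. In a real program the descriptor is filled by DESCINIT, as in the example later in this section.

```fortran
module descriptor_sketch
  implicit none
  ! Conventional ScaLAPACK names for the nine descriptor slots.
  integer, parameter :: DTYPE_ = 1, CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, &
                        NB_ = 6, RSRC_ = 7, CSRC_ = 8, LLD_ = 9
contains

  ! Hypothetical helper: fill a dense-matrix descriptor by hand.
  pure function make_desc(m, n, mb, nb, ictxt, lld) result(desc)
    integer, intent(in) :: m, n, mb, nb, ictxt, lld
    integer :: desc(9)
    desc(DTYPE_) = 1      ! descriptor type: 1 marks a dense matrix
    desc(CTXT_)  = ictxt  ! BLACS context handle
    desc(M_)     = m      ! rows of the global array
    desc(N_)     = n      ! columns of the global array
    desc(MB_)    = mb     ! row block size
    desc(NB_)    = nb     ! column block size
    desc(RSRC_)  = 0      ! process row holding the first row
    desc(CSRC_)  = 0      ! process column holding the first column
    desc(LLD_)   = lld    ! leading dimension of the local array
  end function make_desc

end module descriptor_sketch

program show_descriptor
  use descriptor_sketch
  implicit none
  integer :: desc_a(9)

  ! A 9 x 9 global array in 3 x 3 blocks; context 0 is a placeholder.
  desc_a = make_desc(9, 9, 3, 3, 0, 9)
  print '(a, 9(1x, i0))', 'DESC_A =', desc_a
end program show_descriptor
```

ScaLAPACK_UNMAP takes the global dimensions and block sizes it needs from these entries, which is why DESC_A must describe the same distribution that was used when the data was mapped.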
Optional Arguments
LDA — Leading dimension of A as specified in the calling program. If this argument is not present, size(A,1) is used. (Input)
COLMAP — Logical flag that indicates whether the global array is unmapped in column major form or row major form. Setting COLMAP to .TRUE. unmaps the array in column major form; setting COLMAP to .FALSE. unmaps it in row major form. The default value of COLMAP is .TRUE. (Input)
FORTRAN 90 Interface
Generic: CALL ScaLAPACK_UNMAP (A0, DESC_A, A [, …])
Description
Subroutine ScaLAPACK_UNMAP unmaps columns or rows of local distributed arrays to a global array on MP_RANK = 0. It uses the two-dimensional block-cyclic array descriptor for the matrix to retrieve the data from the assumed-size arrays on the processors. The block sizes, contained in the array descriptor, determine the data set size for each blocking send and receive pair. The number of these synchronization points is proportional to ⌈M × N / (MB × NB)⌉, where M and N are the global array dimensions and MB and NB are the block sizes held in the descriptor. A temporary local buffer is allocated for staging the array data. It is of size M by NB when unmapping by columns, or N by MB when unmapping by rows.
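The index arithmetic behind a column-wise unmap can be sketched without MPI. The program below is an illustration under stated assumptions, not the IMSL implementation: it assumes a one-dimensional process grid with the first block on process 0, and it simulates the processes with an ordinary loop. The names `block_cyclic_sketch`, `local_count`, `global_col`, and `unmap_columns` are hypothetical.

```fortran
module block_cyclic_sketch
  implicit none
contains

  ! Number of global columns owned by process p (0-based) when n columns
  ! are dealt out in blocks of nb to nprocs processes, starting at process 0.
  pure integer function local_count(n, nb, p, nprocs) result(cnt)
    integer, intent(in) :: n, nb, p, nprocs
    integer :: nblocks, extra
    nblocks = n / nb                  ! number of full blocks
    cnt = (nblocks / nprocs) * nb     ! full rounds that every process receives
    extra = mod(nblocks, nprocs)      ! full blocks left after those rounds
    if (p < extra) then
      cnt = cnt + nb                  ! one more full block
    else if (p == extra) then
      cnt = cnt + mod(n, nb)          ! the trailing partial block, if any
    end if
  end function local_count

  ! Global column (1-based) stored in local column l (1-based) of process p.
  pure integer function global_col(l, nb, p, nprocs) result(g)
    integer, intent(in) :: l, nb, p, nprocs
    g = ((l - 1) / nb * nprocs + p) * nb + mod(l - 1, nb) + 1
  end function global_col

  ! Copy the local columns of process p back into the global array a.
  subroutine unmap_columns(a0, nb, p, nprocs, a)
    real, intent(in) :: a0(:, :)
    integer, intent(in) :: nb, p, nprocs
    real, intent(inout) :: a(:, :)
    integer :: l
    do l = 1, local_count(size(a, 2), nb, p, nprocs)
      a(:, global_col(l, nb, p, nprocs)) = a0(:, l)
    end do
  end subroutine unmap_columns

end module block_cyclic_sketch

program unmap_demo
  use block_cyclic_sketch
  implicit none
  integer, parameter :: n = 9, nb = 2, nprocs = 3
  real :: a(n, n), b(n, n)
  real, allocatable :: a0(:, :)
  integer :: i, j, p, l, nloc

  ! Reference global array with recognizable entries.
  do j = 1, n
    do i = 1, n
      a(i, j) = real(10 * i + j)
    end do
  end do

  b = 0.0
  do p = 0, nprocs - 1
    ! Build the piece that process p would hold after mapping ...
    nloc = local_count(n, nb, p, nprocs)
    allocate(a0(n, nloc))
    do l = 1, nloc
      a0(:, l) = a(:, global_col(l, nb, p, nprocs))
    end do
    ! ... then place it back into the global array.
    call unmap_columns(a0, nb, p, nprocs, b)
    deallocate(a0)
  end do

  print *, 'round trip exact: ', all(a == b)
end program unmap_demo
```

In this sketch the loop over p stands in for the send and receive pairs between each process and MP_RANK = 0. With N = 9, NB = 2, and three processes, process 0 holds global columns 1, 2, 7, and 8, process 1 holds columns 3, 4, and 9, and process 2 holds columns 5 and 6.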
Example: Distributed Linear Solver with IMSL ScaLAPACK Interface
The program SCPKMP_EX1 illustrates solving a system of linear algebraic equations, Ax = b, by calling routine LSLRG, an IMSL routine which interfaces with a ScaLAPACK routine. The right-hand side is produced by defining A and y to have random values and then computing the matrix-vector product b = Ay. IMSL routine ScaLAPACK_SETUP is called to define the process grid and provide further information identifying each process. IMSL routine ScaLAPACK_MAP is called to map the global arrays to local distributed arrays. Then LSLRG is called to compute the approximate solution, x, and ScaLAPACK_UNMAP is called to place the distributed solution in a global array on MP_RANK = 0. The problem is small enough that the residuals, x − y ≈ 0, can be checked on MP_RANK = 0.
      program scpkmp_ex1
! This is Example 1 for ScaLAPACK_MAP and ScaLAPACK_UNMAP.
! A linear system is solved with an IMSL routine which
! interfaces with ScaLAPACK and is checked.
      USE ScaLAPACK_SUPPORT
      USE ERROR_OPTION_PACKET
      USE MPI_SETUP_INT
      USE LSLRG_INT
      IMPLICIT NONE
      INCLUDE "mpif.h"
      INTEGER, PARAMETER :: N=9
      INTEGER MXLDA, MXCOL, INFO, DESC_A(9), DESC_X(9)
      LOGICAL :: GRID1D = .TRUE., NSQUARE = .TRUE.
      real(kind(1d0)) :: ERROR=0d0, SIZE_Y
      real(kind(1d0)), allocatable, dimension(:,:) :: A, B(:), &
         X(:), Y(:), A0, B0(:), X0(:)
      MP_NPROCS=MP_SETUP()
! Set up a 1D processor grid and define its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, NSQUARE, GRID1D)
! Get the array descriptor entities MXLDA, and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
! Set up the array descriptors
      CALL DESCINIT(DESC_A, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
         MXLDA, INFO)
      CALL DESCINIT(DESC_X, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, &
         MXLDA, INFO)
! Allocate space for local arrays
      ALLOCATE(A0(MXLDA,MXCOL), B0(MXLDA), X0(MXLDA))
! A root process is used to create the matrix data for the test.
      IF(MP_RANK == 0) THEN
         ALLOCATE(A(N,N), B(N), X(N), Y(N))
         CALL RANDOM_NUMBER(A); CALL RANDOM_NUMBER(Y)
! Compute the correct result.
         B=MATMUL(A,Y); SIZE_Y=SUM(ABS(Y))
      END IF
! Map the input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESC_A, A0)
      CALL SCALAPACK_MAP(B, DESC_X, B0)
! Compute the distributed solution to A x = b.
      CALL LSLRG(A0, B0, X0)
! Put the result on the root node.
      Call ScaLAPACK_UNMAP(X0, DESC_X, X)
      IF(MP_RANK == 0) THEN
! Check the residuals for size.
         B=X-Y
         ERROR=SUM(ABS(B))/SIZE_Y
      END IF
! See to any error messages.
      call e1pop("Mp_Setup")
      IF(ERROR <= SQRT(EPSILON(ERROR)) .and. MP_RANK == 0) THEN
         write(*,*) &
         " Example 1 for ScaLAPACK_MAP and ScaLAPACK_UNMAP is correct."
      END IF
! Deallocate storage arrays.
      IF (MP_RANK == 0) DEALLOCATE(A, B, X, Y)
      DEALLOCATE(A0, B0, X0)
! Exit from using this process grid.
      CALL SCALAPACK_EXIT( MP_ICTXT )
! Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Output
Example 1 for ScaLAPACK_MAP and ScaLAPACK_UNMAP is correct.