This routine writes the matrix data to a file. The data is transmitted from the two-dimensional block-cyclic form used by ScaLAPACK routines. This routine contains a call to a barrier routine so that if one process is writing the file and an alternate process is to read it, the results will be synchronized. All processors in the BLACS context call the routine.
Required Arguments
File_Name— A character variable naming the file to receive the matrix data. (Input) This file is opened with “STATUS=”UNKNOWN.” If any access violation happens, a type = terminal error message will occur. If the file already exists it will be overwritten. After the contents are written, the file is closed. This file is written with a loop logically equivalent to groups of writes:
WRITE() ((BUFFER(I,J), I=1,M), J=1, NB)
or (optionally):
WRITE() ((BUFFER(I,J), J=1,N), I=1, MB)
DESC_A(*) — The nine integer parameters associated with the ScaLAPACK matrix descriptor. Values for NB, MB, LDA are contained in this array. (Input)
A(LDA,*)— This is an assumed-size array, with leading dimension LDA, containing this processor’s piece of the block-cyclic matrix. The data type for A(*,*) is any of five Fortran intrinsic types: integer; single precision, real; double precision, real; single precision, complex; or double precision, complex. (Input)
Optional Arguments
Format—A character variable containing a format to be used for writing the file that receives matrix data. If this argument is not present, an unformatted or list-directed write is used. (Input)
iopt— Derived type array with the same precision as the array A(*,*), used for passing optional data to ScaLAPACK_WRITE. Use single precision when A(*,*) is type INTEGER. (Input) The options are as follows:
Packaged Options forScaLAPACK_WRITE
Option Prefix = ?
Option Name
Option Value
S_, d_
ScaLAPACK_WRITE_UNIT
1
S_, d_
ScaLAPACK_WRITE_FROM_PROCESS
2
S_, d_
ScaLAPACK_WRITE_BY_ROWS
3
iopt(IO)=ScaLAPACK_WRITE_UNIT Sets the unit number to the integer component of iopt(IO + 1)%idummy. The default unit number is the value 11.
iopt(IO)= ScaLAPACK_WRITE_FROM_PROCESS Sets the process number that writes the named file to the integer component of iopt(IO + 1)%idummy. The default process number is the value 0.
iopt(IO)= ScaLAPACK_WRITE_BY_ROWS Write the matrix by rows to the named file. By default the matrix is written by columns.
Specific: The specific interface names are S_ScaLAPACK_WRITE and D_ScaLAPACK_WRITE.
Description
Subroutine ScaLAPACK_WRITE writes columns or rows of a problem matrix output by a ScaLAPACK routine. It uses the two-dimensional block-cyclic array descriptor for the matrix to extract the data from the assumed-size arrays on the processors. The blocks of data are transmitted and received, then written. The block sizes, contained in the array descriptor, determines the data set size for each blocking send and receive pair. The number of these synchronization points is proportional to . A temporary local buffer is allocated for staging the matrix data. It is of size M by NB, when writing by columns, or N by MB, when writing by rows.
Examples
Example 1: Distributed Transpose of a Matrix, In Place
The program SCPK_EX1 illustrates an in-situ transposition of a matrix. An m×n matrix, A, is written to a file, by rows. The n×m matrix, B = AT, overwrites storage for A. Two temporary files are created and deleted. This algorithm for transposing a matrix is not efficient. It is used to illustrate the read and write routines and optional arguments for writing of data by matrix rows.
program scpk_ex1
! This is Example 1 for ScaLAPACK_READ and ScaLAPACK_WRITE.
! It shows in-situ or in-place transposition of a
! block-cyclic matrix.
USE ScaLAPACK_SUPPORT
USE ERROR_OPTION_PACKET
USE MPI_SETUP_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: M=6, N=6, NIN=10
INTEGER DESC_A(9), IERROR, INFO, I, J, K, L, MXLDA, MXCOL
LOGICAL :: GRID1D = .TRUE., NSQUARE = .TRUE.
real(kind(1d0)), allocatable :: A(:,:), A0(:,:)
real(kind(1d0)) ERROR
TYPE(d_OPTIONS) IOPT(1)
MP_NPROCS=MP_SETUP()
! Set up a 1D processor grid and define its context ID, MP_ICTXT
CALL SCALAPACK_SETUP(M, N, NSQUARE, GRID1D)
! Get the array descriptor entities MXLDA, and MXCOL
CALL SCALAPACK_GETDIM(M, N, MP_MB, MP_NB, MXLDA, MXCOL)
! Set up the array descriptor
CALL DESCINIT(DESC_A, M, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
MXLDA, INFO)
! Allocate space for local arrays
ALLOCATE(A0(MXLDA,MXCOL))
! A root process is used to create the matrix data for the test.
IF(MP_RANK == 0) THEN
ALLOCATE(A(M,N))
! Fill array with a pattern that is easy to recognize.
K=0
DO
K=K+1; IF(10**K > N) EXIT
END DO
DO J=1,N
DO I=1,M
! The values will appear, as decimals I.J, where I is
! The values will appear, as decimals I.J, where I is the row
! and J is the column.
A(I,J)=REAL(J)+REAL(I)*10d0**(-K) - A(I,J)
END DO
END DO
ERROR=SUM(ABS(A))
END IF
! See to any error messages.
call e1pop("Mp_setup")
! Check results on just one process.
IF(ERROR <= SQRT(EPSILON(ERROR)) .and. &
MP_RANK == 0) THEN
write(*,*) " Example 1 for BLACS is correct."
END IF
! Deallocate storage arrays and exit from BLACS.
IF(ALLOCATED(A)) DEALLOCATE(A)
IF(ALLOCATED(A0)) DEALLOCATE(A0)
! Exit from using this process grid.
CALL SCALAPACK_EXIT( MP_ICTXT )
! Shut down MPI
MP_NPROCS = MP_SETUP(‘FINAL’)
END
Output
Example 1 for BLACS is correct.
Example 2: Distributed Matrix Product with PBLAS
The program SCPK_EX2 illustrates computation of the matrix product . The matrices on the right-hand side are random. Three temporary files are created and deleted. BLACS and PBLAS are used. The problem size is such that the results are checked on one process.
program scpk_ex2
! This is Example 2 for ScaLAPACK_READ and ScaLAPACK_WRITE.
! The product of two matrices is computed with PBLAS
write(*,*) " Example 2 for BLACS and PBLAS is correct."
END IF
! Exit from using this process grid.
CALL SCALAPACK_EXIT( MP_ICTXT )
! Shut down MPI
MP_NPROCS = MP_SETUP(‘FINAL’)
END
Output
Example 2 for BLACS and PBLAS is correct.
Example 3: Distributed Linear Solver with ScaLAPACK
The program SCPK_EX3 illustrates solving a system of linear-algebraic equations, Ax = B by calling a ScaLAPACK routine directly. The right-hand side is produced by defining A and y to have random values. Then the matrix-vector product b = Ay is computed. The problem size is such that the residuals, x–y≈0 are checked on one process. Three temporary files are created and deleted. BLACS are used to define the process grid and provide further information identifying each process. Then a ScaLAPACK routine is called directly to compute the approximate solution, x.
program scpk_ex3
! This is Example 3 for ScaLAPACK_READ and ScaLAPACK_WRITE.
! A linear system is solved with ScaLAPACK and checked.