Initializes or finalizes MPI.
Function Return Value
MP_NPROCS - Number of nodes in the communicator MP_LIBRARY_WORLD. (Output)
Returned when MP_SETUP is called with no arguments: MP_NPROCS = MP_SETUP().

Required Arguments
None.

Optional Arguments
NOTE - Character string 'Final'. (Input)
With 'Final', all pending error messages are sent from the nodes to the root and printed. If any node should STOP after printing messages, then MPI_Finalize() and a STOP are executed. Otherwise, only MPI_Finalize() is called. The character string 'Final' is the only valid string for this argument.
N - Size of the array to be allocated for timing. (Input)
When this argument is supplied, the array MPI_NODE_PRIORITY is allocated with MP_NPROCS components. The matrix products A .x. B are timed individually at each node of the machine. The elapsed times are noted and sorted to determine the node priority order. A and B are allocated with size N by N and initialized with random data. The priority order is finally broadcast to the other nodes.

FORTRAN 90 Interface
MP_SETUP ([...])
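For illustration, the three references described above take the following forms. The value 100 for N is only an example size, and only one of the first two forms should be used to initialize:

MP_NPROCS = MP_SETUP()          ! initialize MPI, MP_LIBRARY_WORLD, MP_RANK, MP_NPROCS
MP_NPROCS = MP_SETUP(100)       ! as above, and also time 100 by 100 matrix products
                                ! to define MPI_NODE_PRIORITY(:)
MP_NPROCS = MP_SETUP('Final')   ! print pending error messages and finalize MPI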
Description
Following a call to the function MP_SETUP(), the module MPI_node_int will contain information about the number of processors, the rank of a processor, the communicator for the IMSL Fortran Numerical Library, and the usage priority order of the node machines:
MODULE MPI_NODE_INT
INTEGER, ALLOCATABLE :: MPI_NODE_PRIORITY(:)
INTEGER, SAVE :: MP_LIBRARY_WORLD = huge(1)
LOGICAL, SAVE :: MPI_ROOT_WORKS = .TRUE.
INTEGER, SAVE :: MP_RANK = 0, MP_NPROCS = 1
END MODULE
When the function MP_SETUP() is called with no arguments, the following events occur:
If MPI has not been initialized, it is first initialized. This step uses the routines MPI_Initialized() and possibly MPI_Init(). Users who choose not to call MP_SETUP() must make the required initialization call before using any IMSL Fortran Numerical Library code that relies on MPI for its execution. If the user's code calls an IMSL Fortran Numerical Library function utilizing the box data type and MPI has not been initialized, then the computations are performed on the root node. The only MPI routine always called in this context is MPI_Initialized(). The name MP_SETUP is pushed onto the subprogram or call stack.
If MP_LIBRARY_WORLD equals its initial value (= huge(1)), then MPI_COMM_WORLD, the default MPI communicator, is duplicated, and the duplicate becomes the handle stored in MP_LIBRARY_WORLD. This uses the routine MPI_Comm_dup(). Users can change the handle MP_LIBRARY_WORLD as required by their application code. Often this issue can be ignored.
The integers MP_RANK and MP_NPROCS are, respectively, the node's rank and the number of nodes in the communicator MP_LIBRARY_WORLD. Their values are obtained with the routines MPI_Comm_rank() and MPI_Comm_size(). The default values are important when MPI is not initialized and a box data type is computed. In that case the root node is the only node, and it does all the work. When MP_NPROCS = 1, no calls to MPI communication routines are made while computing the box data type functions. A program can temporarily assign this value to force box data type computation entirely at the root node. This is desirable for problems where using many nodes would be less efficient than using the root node exclusively.
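For users who choose not to call MP_SETUP(), the following program is a rough sketch, not library source, of the equivalent setup steps just described. Error-return codes are not checked, and the final MPI_Finalize() call stands in for whatever shutdown the application requires.

program manual_setup
use MPI_node_int
implicit none
include 'mpif.h'
logical :: already
integer :: ierr
! Initialize MPI only if that has not already been done.
call MPI_Initialized(already, ierr)
if (.not. already) call MPI_Init(ierr)
! Duplicate MPI_COMM_WORLD as the library communicator, as MP_SETUP() would.
if (MP_LIBRARY_WORLD == huge(1)) &
   call MPI_Comm_dup(MPI_COMM_WORLD, MP_LIBRARY_WORLD, ierr)
! Record the number of nodes and this node's rank.
call MPI_Comm_size(MP_LIBRARY_WORLD, MP_NPROCS, ierr)
call MPI_Comm_rank(MP_LIBRARY_WORLD, MP_RANK, ierr)
! ... calls to box data type functions go here ...
call MPI_Finalize(ierr)
end program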
The array MPI_NODE_PRIORITY(:) is not allocated unless the user allocates it. The IMSL Fortran Numerical Library codes use this array for assigning tasks to processors, if it is allocated. If it is not allocated, the default priority of the nodes is (0,1,...,MP_NPROCS-1). Use of the function call MP_SETUP(N) allocates the array, as explained below. Once the array is allocated, its size is MP_NPROCS. The contents of the array are a permutation of the integers 0,...,MP_NPROCS-1. Nodes appearing at the start of the list are used first for parallel computing. A node other than the root can avoid any computing, except receiving the task schedule, by setting MPI_NODE_PRIORITY(I) < 0. This means that node |MPI_NODE_PRIORITY(I)| will be sent the task schedule but will not perform any significant work as part of box data type function evaluations.
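The following fragment is a sketch only, assuming MP_SETUP() has already been called so that MP_NPROCS is defined and that I is a local integer. It allocates the priority array, keeps the default order, and idles node 1 with a negative entry:

if (.not. allocated(MPI_NODE_PRIORITY)) allocate(MPI_NODE_PRIORITY(MP_NPROCS))
! Default order 0,1,...,MP_NPROCS-1.
MPI_NODE_PRIORITY = (/ (I, I=0, MP_NPROCS-1) /)
! A negative entry means node |MPI_NODE_PRIORITY(I)| is sent the task schedule
! but does no significant work; here node 1 is idled.
if (MP_NPROCS > 1) MPI_NODE_PRIORITY(2) = -1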
The LOGICAL flag MPI_ROOT_WORKS designates whether the root node participates in the major computation of the tasks. The root node communicates with the other nodes to complete the tasks but can be designated to do no other work. Since there may be only one processor, this flag has the default value .TRUE., assuring that at least one node exists to do work. When more than one processor is available, users can consider assigning MPI_ROOT_WORKS = .FALSE. This is desirable when the alternate nodes have equal or greater computational resources compared with the root node. Parallel Example 4 illustrates this usage: a single problem is given a box data type, with one rack, and the computing is done at the node, other than the root, with the highest priority. That example requires more than one processor, since the root does no work.
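A minimal sketch of this suggestion, assuming MP_SETUP() is called as usual: leave the default on a single processor and idle the root only when alternate nodes exist.

MP_NPROCS = MP_SETUP()
! Remove the root from the heavy computation only if other nodes are available.
if (MP_NPROCS > 1) MPI_ROOT_WORKS = .false.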
When the generic function MP_SETUP(N) is called, where N is a positive integer, a call to MP_SETUP() is first made, using no argument. Use just one of these calls to MP_SETUP(). This initializes the MPI system and the other parameters described above. The array MPI_NODE_PRIORITY(:) is allocated with size MP_NPROCS. Then DOUBLE PRECISION matrix products C = AB, where A and B are N by N matrices, are computed at each node and the elapsed time is recorded. These elapsed times are sorted and the contents of MPI_NODE_PRIORITY(:) are permuted in accordance with the shortest times yielding the highest priority. All the nodes in the communicator MP_LIBRARY_WORLD are timed. The array MPI_NODE_PRIORITY(:) is then broadcast from the root to the remaining nodes of MP_LIBRARY_WORLD using the routine MPI_Bcast(). Timing matrix products to define the node priority is relevant because the effort to compute C is comparable to that of many linear algebra computations of similar size. Users are free to define their own node priority and broadcast the array MPI_NODE_PRIORITY(:) to the alternate nodes in the communicator.
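The following subroutine is a sketch, not library source; the name user_node_priority is arbitrary and error-return checking is omitted. It shows a user-defined priority order being set at the root and broadcast with MPI_Bcast(), as described above.

subroutine user_node_priority
use MPI_node_int
implicit none
include 'mpif.h'
integer :: i, ierr
! Every node needs the array allocated with MP_NPROCS components.
if (.not. allocated(MPI_NODE_PRIORITY)) allocate(MPI_NODE_PRIORITY(MP_NPROCS))
! The root chooses any permutation of 0,...,MP_NPROCS-1; here the default
! order is reversed so the highest-ranked node is used first.
if (MP_RANK == 0) MPI_NODE_PRIORITY = (/ (MP_NPROCS-1-i, i=0, MP_NPROCS-1) /)
! Send the priority order from the root to the other nodes.
call MPI_Bcast(MPI_NODE_PRIORITY, MP_NPROCS, MPI_INTEGER, 0, &
               MP_LIBRARY_WORLD, ierr)
end subroutine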
To print any IMSL Fortran Numerical Library error messages that have occurred at any node, and to finalize MPI, use the function call MP_SETUP('Final'). The case of the string 'Final' is not important. Any pending error messages are printed on the root node and then discarded. This is triggered by popping the name 'MP_SETUP' from the subprogram stack or returning to Level 1 in the stack. Users can obtain error messages by popping the stack to Level 1 and still continue with MPI calls. This requires executing call e1pop ('MP_SETUP'). To continue on after summarizing errors, execute call e1psh ('MP_SETUP'). More details about the error processor are found in the Reference Material chapter of this manual.
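For example, to summarize pending messages at an intermediate point and then continue with MPI-based library calls:

call e1pop ('MP_SETUP')   ! pop to Level 1; pending messages print at the root
call e1psh ('MP_SETUP')   ! push the name back and continue using MPI
! ... further IMSL Fortran Numerical Library calls that use MPI ...
MP_NPROCS = MP_SETUP('Final')   ! finally print any new messages and finalize MPI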
Messages are printed by nodes from largest rank to smallest, ending with the root node. The routine MPI_Finalize() is called within MP_SETUP('Final'), which shuts down MPI. After MPI_Finalize() is called, the value of MP_NPROCS is 0. This flags that MPI has been initialized and terminated; it cannot be initialized again in the same program execution. No MPI routine is defined when MP_NPROCS has this value.
Parallel Example (parallel_ex01.f90)
use linear_operators
use mpi_setup_int
implicit none
! This is the equivalent of Parallel Example 1 for .ix., with box data types
! and functions.
integer, parameter :: n=32, nr=4
real(kind(1e0)) :: one=1e0, err(nr)
real(kind(1e0)), dimension(n,n,nr) :: A, b, x
! Setup for MPI.
MP_NPROCS=MP_SETUP()
! Generate random matrices for A and b:
A = rand(A); b=rand(b)
! Compute the box solution matrix of Ax = b.
x = A .ix. b
! Check the results.
err = norm(b - (A .x. x))/(norm(A)*norm(x)+norm(b))
if (ALL(err <= sqrt(epsilon(one))) .and. MP_RANK == 0) &
write (*,*) 'Parallel Example 1 is correct.'
! See to any error messages and quit MPI.
MP_NPROCS=MP_SETUP('Final')
end
Parallel Example (parallel_ex04.f90)
Here an alternate node is used to compute the majority of a single application, and the user does not need to make any explicit calls to MPI routines. The time-consuming parts are the evaluation of the eigenvalue-eigenvector expansion, the solving step, and the residuals. To carry out these steps with parallel computing, the rank-2 arrays are changed to a box data type with a unit third dimension. The node priority order is established by the initial function call MP_SETUP(n). The root is restricted from working on the box data type by assigning MPI_ROOT_WORKS=.false. This example anticipates that the most efficient node, other than the root, will perform the heavy computing. Two nodes are required to execute this example.
use linear_operators
use mpi_setup_int
implicit none
! This is the equivalent of Parallel Example 4 for matrix exponential.
! The box dimension has a single rack.
integer, parameter :: n=32, k=128, nr=1
integer i
real(kind(1e0)), parameter :: one=1e0, t_max=one, delta_t=t_max/(k-1)
real(kind(1e0)) err(nr), sizes(nr), A(n,n,nr)
real(kind(1e0)) t(k), y(n,k,nr), y_prime(n,k,nr)
complex(kind(1e0)), dimension(n,nr) :: x(n,n,nr), z_0, &
Z_1(n,nr,nr), y_0, d
! Setup for MPI. Establish a node priority order.
! Restrict the root from significant computing.
! Illustrates using the 'best' performing node that
! is not the root for a single task.
MP_NPROCS=MP_SETUP(n)
MPI_ROOT_WORKS=.false.
! Generate a random coefficient matrix.
A = rand(A)
! Compute the eigenvalue-eigenvector decomposition
! of the system coefficient matrix on an alternate node.
D = EIG(A, W=X)
! Generate a random initial value for the ODE system.
y_0 = rand(y_0)
! Solve complex data system that transforms the initial
! values, X z_0=y_0.
z_1= X .ix. y_0 ; z_0(:,nr) = z_1(:,nr,nr)
! The grid of points where a solution is computed:
t = (/(i*delta_t,i=0,k-1)/)
! Compute y and y' at the values t(1:k).
! With the eigenvalue-eigenvector decomposition AX = XD, this
! is an evaluation of EXP(A t)y_0 = y(t).
y = X .x. exp(spread(d(:,nr),2,k)*spread(t,1,n))*spread(z_0(:,nr),2,k)
! This is y', derived by differentiating y(t).
y_prime = X .x. &
spread(d(:,nr),2,k)*exp(spread(d(:,nr),2,k)*spread(t,1,n))* &
spread(z_0(:,nr),2,k)
! Check results. Is y' - Ay = 0?
err = norm(y_prime-(A .x. y))
sizes=norm(y_prime)+norm(A)*norm(y)
if (ALL(err <= sqrt(epsilon(one))*sizes) .and. MP_RANK == 0) &
write (*,*) 'Parallel Example 4 is correct.'
! See to any error messages and quit MPI.
MP_NPROCS=MP_SETUP('Final')
end