Parallel Program Descriptions

A set of parallel benchmark programs is listed in Table D. These main programs call Fortran 90 box data type functions in both single and double precision. They compare our parallel allocation algorithm to a scalar sequential method. Each main program reads single lines of input of the form:

 

NSIZE NTRIES NRACKS PREC ROOT_WORKS Description

QUIT to Stop
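The input convention above can be illustrated with a small parser. This is a minimal sketch in Python, not the Fortran code of the benchmark programs. The names `BenchmarkRequest` and `parse_input_line` are hypothetical. It assumes that the fields are separated by white space, that the second field is the repeat count the text calls NTRIES, that ROOT_WORKS is given as a Fortran-style logical (T or F), and that everything after the fifth field is the description. None of these details is stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkRequest:
    nsize: int
    ntries: int
    nracks: int
    prec: int          # 1, 2 or 3
    root_works: bool
    description: str


def parse_input_line(line: str) -> Optional[BenchmarkRequest]:
    """Parse one input line; return None when the line is the QUIT sentinel."""
    text = line.strip()
    if text.upper().startswith("QUIT"):
        return None
    # Five white-space separated fields, then the free-form description.
    fields = text.split(None, 5)
    if len(fields) < 5:
        raise ValueError("expected NSIZE NTRIES NRACKS PREC ROOT_WORKS [Description]")
    nsize, ntries, nracks, prec = (int(f) for f in fields[:4])
    if prec not in (1, 2, 3):
        raise ValueError("PREC must be 1, 2 or 3")
    # Assumption: ROOT_WORKS uses Fortran logical spellings such as T, F, .TRUE., .FALSE.
    flag = fields[4].strip(".").upper()
    if flag not in ("T", "TRUE", "F", "FALSE"):
        raise ValueError("ROOT_WORKS must be a logical value")
    description = fields[5].strip().strip("\"'") if len(fields) > 5 else ""
    return BenchmarkRequest(nsize, ntries, nracks, prec, flag in ("T", "TRUE"), description)


request = parse_input_line("600 5 4 3 T Inverse timing")
```

A driver loop would call `parse_input_line` on each line and stop at the first `None`.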

Two initial lines of output echo the “Description” field, whether or not the root is working, and the number of processors in the MPI communicator. The parameters NSIZE, NTRIES and NRACKS appear in the summary tables. The parameter PREC takes the value 1, 2 or 3, according to whether the user wants the single precision, double precision or both versions timed. The array functions return a 7 × 2 summary table of values. The (1:6,1) and (1:6,2) elements of this array hold the results and parameters of the benchmark for the parallel and non-parallel versions, respectively. The (7,1) and (7,2) elements hold the ratio of the scalar to the parallel average times, printed as “Non-parallel/parallel averages”, and a first-order approximation to the variation in that ratio.

Row  Parallel Box Version     Scalar Box Equivalent
1    Average time             Average time
2    Standard deviation       Standard deviation
3    Total Seconds            Total Seconds
4    nsize                    nsize
5    nracks                   nracks
6    ntries                   ntries
7    Scalar/Parallel Ratio    Variation in Ratio
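The layout of the summary table can be sketched as follows. This is a Python illustration under stated assumptions, not the library's implementation. The function name `summary_table` is hypothetical. The sketch assumes the sample (n-1) standard deviation, and it uses the usual first-order error propagation formula for a quotient. The text does not give the formula the benchmark functions use, so the last entry need not reproduce their printed values. The ratio is taken as scalar over parallel, following the “Non-parallel/parallel” label in the printed output.

```python
import math
import statistics


def summary_table(parallel_times, scalar_times, nsize, nracks):
    """Build a 7 x 2 table; column 0 is the parallel run, column 1 the scalar run."""
    if not parallel_times or len(parallel_times) != len(scalar_times):
        raise ValueError("need the same nonzero number of samples for both versions")
    ntries = len(parallel_times)
    columns = []
    for samples in (parallel_times, scalar_times):
        mean = statistics.fmean(samples)
        # Assumption: sample standard deviation; reported as 0.0 for one sample.
        st_dev = statistics.stdev(samples) if ntries > 1 else 0.0
        columns.append([mean, st_dev, math.fsum(samples),
                        float(nsize), float(nracks), float(ntries)])
    (p_mean, p_sd), (s_mean, s_sd) = columns[0][:2], columns[1][:2]
    ratio = s_mean / p_mean
    # Assumption: first-order propagation of the two relative deviations.
    variation = ratio * math.hypot(p_sd / p_mean, s_sd / s_mean)
    rows = [[columns[0][i], columns[1][i]] for i in range(6)]
    rows.append([ratio, variation])
    return rows


table = summary_table([1.0, 3.0], [4.0, 6.0], nsize=50, nracks=5)
```

Rows 1 to 6 of the text correspond to `table[0]` to `table[5]`, and row 7 to `table[6]`.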

As an example, the program time_parallel_i is compiled and linked with the single and double precision timing functions s_parallel_i_bench and d_parallel_i_bench.

This routine evaluates the time to compute 4 inverse matrices of size 600 by 600 using the defined operator .i. The “Average” is the mean of the individual elapsed times for 5 calls to the routines, with 4 inverses obtained in each call. The “St. Dev.” is the standard deviation for that “Average” and indicates its variability. For this value to be informative, |NTRIES| > 1 is required. The value |NTRIES| = 1 is acceptable, but it yields only one time sample and no standard deviation. If NTRIES > 0, the results are printed as shown in Table C. The numbers in the table will vary with the machine and with other factors that affect the performance of Fortran codes. If NTRIES < 0, the functions return the 7 × 2 table of values shown, based on |NTRIES| samples, and nothing is printed.
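The sign convention for NTRIES can be sketched with a small timing harness. This is a hedged Python illustration, not the benchmark code. The name `time_samples` and the `work` callable are hypothetical. The use of the sample standard deviation is an assumption, as is the rejection of NTRIES = 0, since the text does not say how that value is treated.

```python
import statistics
import time


def time_samples(work, ntries):
    """Time work() |ntries| times; print a short report only when ntries > 0."""
    if ntries == 0:
        raise ValueError("ntries must be nonzero")  # assumption, see lead-in
    samples = []
    for _ in range(abs(ntries)):
        start = time.perf_counter()
        work()
        samples.append(time.perf_counter() - start)
    average = statistics.fmean(samples)
    # With a single sample no standard deviation is available.
    st_dev = statistics.stdev(samples) if len(samples) > 1 else None
    if ntries > 0:
        print(f"Average       {average:.4E}")
        print(f"St. Dev.      {'n/a' if st_dev is None else format(st_dev, '.4E')}")
        print(f"Total Seconds {sum(samples):.4E}")
    return average, st_dev, samples
```

Calling `time_samples(work, -5)` collects five samples silently, while `time_samples(work, 5)` collects the same number and also prints the report.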

 

Single precision benchmark of parallel .i. and non-parallel .i.:
Date of benchmark, (Y, Mo, D, H, M, S): 2006 5 11    8 58 58

1    1.5815E+00    4.0241E+00    Average
2    2.5031E-01    1.8035E-02    St. Dev.
3    7.9077E+00    2.0121E+01    Total Seconds
4    5.0000E+01    5.0000E+01    Size
5    5.0000E+00    5.0000E+00    Racks per box
6    5.0000E+00    5.0000E+00    Repeats

Non-parallel/parallel averages and variation:
     2.5444E+00    3.9129E-01

Double precision benchmark of parallel .i. and non-parallel .i.:
Date of benchmark, (Y, Mo, D, H, M, S): 2006 5 11    8 58 59

1    1.6985D+00    4.0372D+00    Average
2    9.8576D-01    2.3836D-02    St. Dev.
3    8.4923D+00    2.0186D+01    Total Seconds
4    5.0000D+01    5.0000D+01    Size
5    5.0000D+00    5.0000D+00    Racks per box
6    5.0000D+00    5.0000D+00    Repeats

Non-parallel/parallel averages and variation:
     2.3770D+00    1.2392D-01

 

Table C: Performance Summary: Box operator .i.
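The printed rows of Table C can be cross-checked with a little arithmetic: “Total Seconds” should be close to “Average” times “Repeats”, and the printed ratio should be close to the non-parallel average divided by the parallel average. The Python sketch below does this with the values copied from Table C. The name `check_block` and the tolerance are illustrative choices, and agreement is only expected to about the precision of the printed, rounded values.

```python
# Values copied from Table C; each pair is (parallel, non-parallel).
table_c = {
    "single": {"average": (1.5815, 4.0241), "total": (7.9077, 20.121),
               "repeats": 5, "ratio": 2.5444},
    "double": {"average": (1.6985, 4.0372), "total": (8.4923, 20.186),
               "repeats": 5, "ratio": 2.3770},
}


def check_block(block, tol=1e-3):
    """Return (totals_ok, ratio_ok) for one precision block of Table C."""
    par_avg, seq_avg = block["average"]
    par_tot, seq_tot = block["total"]
    n = block["repeats"]
    totals_ok = (abs(par_avg * n - par_tot) <= tol
                 and abs(seq_avg * n - seq_tot) <= tol)
    ratio_ok = abs(seq_avg / par_avg - block["ratio"]) <= tol
    return totals_ok, ratio_ok


results = {name: check_block(block) for name, block in table_c.items()}
```

Both checks pass for both precisions at this tolerance, which supports reading the ratio as non-parallel time divided by parallel time.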

Below is a list of the performance evaluation programs that time the box data computations using parallel and non-parallel resources.

 

Number  Program Units                                                                 Function Timed
1       time_parallel_i.f90, s_parallel_i_bench.f90, d_parallel_i_bench.f90           .i. A
2       time_parallel_ix.f90, s_parallel_ix_bench.f90, d_parallel_ix_bench.f90        A .ix. B
3       time_parallel_xi.f90, s_parallel_xi_bench.f90, d_parallel_xi_bench.f90        B .xi. A
4       time_parallel_x.f90, s_parallel_x_bench.f90, d_parallel_x_bench.f90           A .x. B
5       time_parallel_tx.f90, s_parallel_tx_bench.f90, d_parallel_tx_bench.f90        A .tx. B
6       time_parallel_xt.f90, s_parallel_xt_bench.f90, d_parallel_xt_bench.f90        A .xt. B
7       time_parallel_hx.f90, s_parallel_hx_bench.f90, d_parallel_hx_bench.f90        A .hx. B
8       time_parallel_xh.f90, s_parallel_xh_bench.f90, d_parallel_xh_bench.f90        A .xh. B
9       time_parallel_chol.f90, s_parallel_chol_bench.f90, d_parallel_chol_bench.f90  CHOL(A)
10      time_parallel_cond.f90, s_parallel_cond_bench.f90, d_parallel_cond_bench.f90  COND(A)
11      time_parallel_rank.f90, s_parallel_rank_bench.f90, d_parallel_rank_bench.f90  RANK(A)

Table D: Parallel and non-Parallel Box Comparisons

 

 

 

 


Number  Program Units                                                                 Function Timed
12      time_parallel_det.f90, s_parallel_det_bench.f90, d_parallel_det_bench.f90     DET(A)
13      time_parallel_orth.f90, s_parallel_orth_bench.f90, d_parallel_orth_bench.f90  ORTH(A,R=R)
14      time_parallel_svd.f90, s_parallel_svd_bench.f90, d_parallel_svd_bench.f90     SVD(A,U=U,V=V)
15      time_parallel_norm.f90, s_parallel_norm_bench.f90, d_parallel_norm_bench.f90  NORM(A,TYPE=I)
16      time_parallel_eig.f90, s_parallel_eig_bench.f90, d_parallel_eig_bench.f90     EIG(A,W=W)
17      time_parallel_fft.f90, s_parallel_fft_bench.f90, d_parallel_fft_bench.f90     FFT_BOX(A), IFFT_BOX(A)

Table D continued: Parallel and non-Parallel Box Comparisons
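The program units in Table D appear to follow one naming pattern: a main program time_parallel_NAME.f90 together with the timing functions s_parallel_NAME_bench.f90 and d_parallel_NAME_bench.f90. The Python sketch below generates the three names from the short operator or function name. The helper `program_units` is hypothetical and is shown only to make the pattern explicit.

```python
def program_units(name):
    """Return the main program and the single and double precision timing units."""
    return (
        f"time_parallel_{name}.f90",
        f"s_parallel_{name}_bench.f90",
        f"d_parallel_{name}_bench.f90",
    )


# Short names in the order of Table D, rows 1 to 17.
names = ["i", "ix", "xi", "x", "tx", "xt", "hx", "xh", "chol",
         "cond", "rank", "det", "orth", "svd", "norm", "eig", "fft"]

units = {name: program_units(name) for name in names}
```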


Visual Numerics, Inc.
Visual Numerics - Developers of IMSL and PV-WAVE
http://www.vni.com/
PHONE: 713.784.3131
FAX: 713.781.9260