LQRRV

Computes the least-squares solution using Householder transformations applied in blocked form.

Required Arguments

A — Real LDA by (NCA + NUMEXC) array containing the matrix and right-hand sides. (Input)
The right-hand sides are input in A(1 : NRA, NCA + j), j = 1, …, NUMEXC. The array A is preserved upon output. The Householder factorization of the matrix is computed and used to solve the systems.

X — Real LDX by NUMEXC array containing the solution. (Output)

Optional Arguments

NRA — Number of rows in the matrix. (Input)
Default: NRA = size (A,1).

NCA — Number of columns in the matrix. (Input)
Default: NCA = size (A,2) - NUMEXC.

NUMEXC — Number of right-hand sides. (Input)
Default: NUMEXC = size (X,2).

LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program. (Input)
Default: LDA = size (A,1).

LDX — Leading dimension of the solution array X exactly as specified in the dimension statement of the calling program. (Input)
Default: LDX = size (X,1).

FORTRAN 90 Interface

Generic: CALL LQRRV (A, X [, …])

Specific: The specific interface names are S_LQRRV and D_LQRRV.

FORTRAN 77 Interface

Single: CALL LQRRV (NRA, NCA, NUMEXC, A, LDA, X, LDX)

Double: The double precision name is DLQRRV.

ScaLAPACK Interface

Generic: CALL LQRRV (A0, X0 [, …])

Specific: The specific interface names are S_LQRRV and D_LQRRV.

See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.

Description

The routine LQRRV computes the QR decomposition of a matrix A using blocked Householder transformations. The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. The standard algorithm is based on the storage-efficient WY representation for products of Householder transformations. See Schreiber and Van Loan (1989).

The routine LQRRV determines an orthogonal matrix Q and an upper triangular matrix R such that A = QR. The QR factorization of a matrix A having NRA rows and NCA columns is as follows:

Initialize A1 ← A
For k = 1, min(NRA - 1, NCA)

Determine a Householder transformation for column k of Ak having the form

where uk has zeros in the first k - 1 positions and τk is a scalar.

Update

End k

Thus,

where p = min(NRA − 1, NCA). The matrix Q is not produced directly by LQRRV. The information needed to construct the Householder transformations is saved instead. If the matrix Q is needed explicitly, QT can be determined while the matrix is factored. No pivoting among the columns is done. The primary purpose of LQRRV is to give the user a high-performance QR least-squares solver. It is intended for least-squares problems that are well-posed. For background, see Golub and Van Loan (1989, page 225). During the QR factorization, the most time-consuming step is computing the matrix-vector update Ak ← HkAk−1. The routine LQRRV constructs “block” of NB Householder transformations in which the update is “rich” in matrix multiplication. The product of NB Householder transformations are written in the form

where YNRA×NB is a lower trapezoidal matrix and TNB×NB is upper triangular. The optimal choice of the block size parameter NB varies among computer systems. Users may want to change it from its default value of 1.

Comments

1. Workspace may be explicitly provided, if desired, by use of L2RRV/DL2RRV. The reference is:

CALL L2RRV (NRA, NCA, NUMEXC, A, LDA, X, LDX, FACT, LDFACT, WK)

The additional arguments are as follows:

FACT — LDFACT × (NCA + NUMEXC) work array containing the Householder factorization of the matrix on output. If the input data is not needed, A and FACT can share the same storage locations.

LDFACT — Leading dimension of the array FACT exactly as specified in the dimension statement of the calling program. (Input)
If A and FACT are sharing the same storage, then LDA = LDFACT is required.

WK — Work vector of length (NCA + NUMEXC + 1) * (NB + 1) . The default value is NB = 1. This value can be reset. See item 3 below.

2. Informational errors

Type	Code	Description
4	1	The input matrix is singular.

3. Integer Options with Chapter 11 Options Manager

5 This option allows the user to reset the blocking factor used in computing the factorization. On some computers, changing IVAL(*) to a value larger than 1 will result in greater efficiency. The value IVAL(*) is the maximum value to use. (The software is specialized so that IVAL(*) is reset to an “optimal” used value within routine L2RRV.) The user can control the blocking by resetting IVAL(*) to a smaller value than the default. Default values are IVAL(*) = 1, IMACH(5).

6 This option is the vector dimension where a shift is made from in-line level-2 loops to the use of level-2 BLAS in forming the partial product of Householder transformations. Default value is IVAL(*) = IMACH(5).

10 This option allows the user to control the factorization step. If the value is 1 the Householder factorization will be computed. If the value is 2, the factorization will not be computed. In this latter case the decomposition has already been computed. Default value is IVAL(*) = 1.

11 This option allows the user to control the solving steps. The rules for IVAL(*) are:

1. Compute b ← QTb, and x ← R+b.

2. Compute b ← QTb.

3. Compute b ← Qb.

4. Compute x ← R+b.

Default value is IVAL (*) = 1. Note that IVAL (*) = 2 or 3 may only be set when calling L2RRV/DL2RRV.

ScaLAPACK Usage Notes

The arguments which differ from the standard version of this routine are:

A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains the matrix and right-hand sides. (Input)
The right-hand sides are input in A(1 : NRA, NCA + j), j = 1, …, NUMEXC. The array A is preserved upon output. The Householder factorization of the matrix is computed and used to solve the systems.. (Input)

X0 — MXLDX by MXCOLX local matrix containing the local portions of the distributed matrix X. X contains the solution. (Output)

All other arguments are global and are the same as described for the standard version of the routine. In the argument descriptions above, MXLDA, MXLDX, MXCOL, and MXCOLX can be obtained through a call to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example below.

Examples

Example

Given a real m × k matrix B it is often necessary to compute the k least-squares solutions of the linear system AX = B, where A is an m × n real matrix. When m > n the system is considered overdetermined. A solution with a zero residual normally does not exist. Instead the minimization problem

USE LQRRV_INT

USE WRRRN_INT

USE SGEMM_INT

! Declare variables

INTEGER LDA, LDX, NCA, NRA, NUMEXC

PARAMETER (NCA=3, NRA=5, NUMEXC=2, LDA=NRA, LDX=NCA)

! SPECIFICATIONS FOR LOCAL VARIABLES

REAL X(LDX,NUMEXC)

! SPECIFICATIONS FOR SAVE VARIABLES

REAL A(LDA,NCA+NUMEXC)

SAVE A

! SPECIFICATIONS FOR SUBROUTINES

! Set values for A and the

! righthand sides.

! A = ( 1 2 4 | 7 10)

! ( 1 4 16 | 21 10)

! ( 1 6 36 | 43 9 )

! ( 1 8 64 | 73 10)

! ( 1 10 100 | 111 10)

DATA A/5*1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 4.0, 16.0, 36.0, 64.0, &

100.0, 7.0, 21.0, 43.0, 73.0, 111.0, 2*10., 9., 2*10./

! QR factorization and solution

CALL LQRRV (A, X)

CALL WRRRN (’SOLUTIONS 1-2’, X)

! Compute residuals and print

CALL SGEMM (’N’, ’N’, NRA, NUMEXC, NCA, 1.E0, A, LDA, X, LDX, &

-1.E0, A(1:,(NCA+1):),LDA)

CALL WRRRN (’RESIDUALS 1-2’, A(1:,(NCA+1):))

END

Output

SOLUTIONS 1-2

1 2

1 1.00 10.80

2 1.00 -0.43

3 1.00 0.04

RESIDUALS 1-2

1 2

1 0.0000 0.0857

2 0.0000 -0.3429

3 0.0000 0.5143

4 0.0000 -0.3429

5 0.0000 0.0857

ScaLAPACK Example

The previous example is repeated here as a distributed computing example. Given a real m × k matrix B it is often necessary to compute the k least-squares solutions of the linear system AX = B, where A is an m × n real matrix. When m > n the system is considered overdetermined. A solution with a zero residual normally does not exist. Instead the minimization problem

is solved k times where xj, bj are the j-th columns of the matrices X, B respectively. When A is of full column rank there exits a unique solution XLS that solves the above minimization problem. By using the routine LQRRV, XLS is computed. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.

USE MPI_SETUP_INT

USE LQRRV_INT

USE SGEMM_INT

USE WRRRN_INT

USE SCALAPACK_SUPPORT

IMPLICIT NONE

INCLUDE ‘mpif.h’

! Declare variables

INTEGER LDA, LDX, NCA, NRA, NUMEXC, DESCA(9), DESCX(9)

INTEGER INFO, MXCOL, MXLDA, MXLDX, MXCOLX

INTEGER K REAL, ALLOCATABLE :: A(:,:), X(:)

REAL, ALLOCATABLE :: A0(:,:), X0(:)

PARAMETER (NRA=5, NCA=3, NUMEXC=2, LDA=NRA, LDX=NCA)

! Set up for MPI

MP_NPROCS = MP_SETUP()

IF(MP_RANK .EQ. 0) THEN

ALLOCATE (A(LDA,NCA+NUMEXC), X(LDX, NUMEXC))

! Set values for A and the righthand sides

A(1,:) = (/ 1.0, 2.0, 4.0, 7.0, 10.0/)

A(2,:) = (/ 1.0, 4.0, 16.0, 21.0, 10.0/)

A(3,:) = (/ 1.0, 6.0, 36.0, 43.0, 9.0/)

A(4,:) = (/ 1.0, 8.0, 64.0, 73.0, 10.0/)

A(5,:) = (/ 1.0, 10.0, 100.0, 111.0, 10.0/)

ENDIF

! Set up a 1D processor grid and define

! its context ID, MP_ICTXT

CALL SCALAPACK_SETUP(NRA, NCA+NUMEXC, .TRUE., .TRUE.)

! Get the array descriptor entities MXLDA,

! and MXCOL

CALL SCALAPACK_GETDIM(NRA, NCA+NUMEXC, MP_MB, MP_NB, MXLDA, MXCOL)

! Set up the array descriptors

CALL DESCINIT(DESCA, NRA, NCA+NUMEXC, MP_MB, MP_NB, 0, 0, MP_ICTXT, &

MXLDA, INFO)

K = MIN0(NRA, NCA)

! Need to get dimensions of local x

! separate since x's leading

! dimension differs from A's

! Get the array descriptor entities

! MXLDX, AND MXCOLX

CALL SCALAPACK_GETDIM(K, NUMEXC, MP_MB, MP_NB, MXLDX, MXCOLX)

CALL DESCINIT (DESCX, K, NUMEXC, MP_NB, MP_NB, 0, 0, MP_ICTXT, &

MXLDX, INFO)

! Allocate space for the local arrays

ALLOCATE (A0(MXLDA,MXCOL), X0(MXLDX,MXCOLX))

! Map input array to the processor grid

CALL SCALAPACK_MAP(A, DESCA, A0)

! Solve the least squares problem

CALL LQRRV (A0, X0)

! Unmap the results from the distributed

! arrays back to a non-distributed array.

! After the unmap, only Rank=0 has the full

! array.

CALL SCALAPACK_UNMAP(X0, DESCX, X)

! Print results.

! Only Rank=0 has the solution, X.

IF(MP_RANK .EQ. 0)THEN

CALL WRRRN (’SOLUTIONS 1-2’, X)

! Compute residuals and print

CALL SGEMM (’N’, ’N’, NRA, NUMEXC, NCA, 1.E0, A, LDA, X, LDX, &

-1.E0, A(1:,(NCA+1):),LDA)

CALL WRRRN (’RESIDUALS 1-2’, A(1:,(NCA+1):))

ENDIF

! Exit ScaLAPACK usage

CALL SCALAPACK_EXIT(MP_ICTXT)

! Shut down MPI

MP_NPROCS = MP_SETUP(‘FINAL’)

END