LQRRV
Computes the least-squares solution using Householder transformations applied in blocked form.
Required Arguments
A — Real LDA by (NCA + NUMEXC) array containing the matrix and right-hand sides. (Input)
The right-hand sides are input in A(1 : NRA, NCA + j), j = 1, …, NUMEXC. The array A is preserved upon output. The Householder factorization of the matrix is computed and used to solve the systems.
X — Real LDX by NUMEXC array containing the solution. (Output)
Optional Arguments
NRA — Number of rows in the matrix. (Input)
Default: NRA = size (A,1).
NCA — Number of columns in the matrix. (Input)
Default: NCA = size (A,2) - NUMEXC.
NUMEXC — Number of right-hand sides. (Input)
Default: NUMEXC = size (X,2).
LDA — Leading dimension of A exactly as specified in the dimension statement of the calling program. (Input)
Default: LDA = size (A,1).
LDX — Leading dimension of the solution array X exactly as specified in the dimension statement of the calling program. (Input)
Default: LDX = size (X,1).
FORTRAN 90 Interface
Generic: CALL LQRRV (A, X [, …])
Specific: The specific interface names are S_LQRRV and D_LQRRV.
FORTRAN 77 Interface
Single: CALL LQRRV (NRA, NCA, NUMEXC, A, LDA, X, LDX)
Double: The double precision name is DLQRRV.
ScaLAPACK Interface
Generic: CALL LQRRV (A0, X0 [, …])
Specific: The specific interface names are S_LQRRV and D_LQRRV.
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed computing.
Description
The routine LQRRV computes the QR decomposition of a matrix A using blocked Householder transformations. The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. The standard algorithm is based on the storage-efficient WY representation for products of Householder transformations. See Schreiber and Van Loan (1989).
The routine LQRRV determines an orthogonal matrix Q and an upper triangular matrix R such that A = QR. The QR factorization of a matrix A having NRA rows and NCA columns is as follows:
Initialize A1 ← A
For k = 1, min(NRA - 1, NCA)
Determine a Householder transformation for column k of Ak having the form
where uk has zeros in the first k - 1 positions and τk is a scalar.
Update
End k
Thus,
where p = min(NRA − 1, NCA). The matrix Q is not produced directly by LQRRV. The information needed to construct the Householder transformations is saved instead. If the matrix Q is needed explicitly, QT can be determined while the matrix is factored. No pivoting among the columns is done. The primary purpose of LQRRV is to give the user a high-performance QR least-squares solver. It is intended for least-squares problems that are well-posed. For background, see Golub and Van Loan (1989, page 225). During the QR factorization, the most time-consuming step is computing the matrix-vector update Ak ← HkAk−1. The routine LQRRV constructs “block” of NB Householder transformations in which the update is “rich” in matrix multiplication. The product of NB Householder transformations are written in the form
where YNRA×NB is a lower trapezoidal matrix and TNB×NB is upper triangular. The optimal choice of the block size parameter NB varies among computer systems. Users may want to change it from its default value of 1.
Comments
1. Workspace may be explicitly provided, if desired, by use of L2RRV/DL2RRV. The reference is:
CALL L2RRV (NRA, NCA, NUMEXC, A, LDA, X, LDX, FACT, LDFACT, WK)
The additional arguments are as follows:
FACT — LDFACT × (NCA + NUMEXC) work array containing the Householder factorization of the matrix on output. If the input data is not needed, A and FACT can share the same storage locations.
LDFACT — Leading dimension of the array FACT exactly as specified in the dimension statement of the calling program. (Input)
If A and FACT are sharing the same storage, then LDA = LDFACT is required.
WK — Work vector of length (NCA + NUMEXC + 1) * (NB + 1) . The default value is NB = 1. This value can be reset. See item 3 below.
2. Informational errors
Type |
Code |
Description |
4 |
1 |
The input matrix is singular. |
3. Integer Options with Chapter 11 Options Manager
5 This option allows the user to reset the blocking factor used in computing the factorization. On some computers, changing IVAL(*) to a value larger than 1 will result in greater efficiency. The value IVAL(*) is the maximum value to use. (The software is specialized so that IVAL(*) is reset to an “optimal” used value within routine L2RRV.) The user can control the blocking by resetting IVAL(*) to a smaller value than the default. Default values are IVAL(*) = 1, IMACH(5).
6 This option is the vector dimension where a shift is made from in-line level-2 loops to the use of level-2 BLAS in forming the partial product of Householder transformations. Default value is IVAL(*) = IMACH(5).
10 This option allows the user to control the factorization step. If the value is 1 the Householder factorization will be computed. If the value is 2, the factorization will not be computed. In this latter case the decomposition has already been computed. Default value is IVAL(*) = 1.
11 This option allows the user to control the solving steps. The rules for IVAL(*) are:
1. Compute b ← QTb, and x ← R+b.
2. Compute b ← QTb.
3. Compute b ← Qb.
4. Compute x ← R+b.
Default value is IVAL (*) = 1. Note that IVAL (*) = 2 or 3 may only be set when calling L2RRV/DL2RRV.
ScaLAPACK Usage Notes
The arguments which differ from the standard version of this routine are:
A0 — MXLDA by MXCOL local matrix containing the local portions of the distributed matrix A. A contains the matrix and right-hand sides. (Input)
The right-hand sides are input in A(1 : NRA, NCA + j), j = 1, …, NUMEXC. The array A is preserved upon output. The Householder factorization of the matrix is computed and used to solve the systems.. (Input)
X0 — MXLDX by MXCOLX local matrix containing the local portions of the distributed matrix X. X contains the solution. (Output)
All other arguments are global and are the same as described for the standard version of the routine. In the argument descriptions above, MXLDA, MXLDX, MXCOL, and MXCOLX can be obtained through a call to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example below.
Examples
Given a real m × k matrix B it is often necessary to compute the k least-squares solutions of the linear system AX = B, where A is an m × n real matrix. When m > n the system is considered overdetermined. A solution with a zero residual normally does not exist. Instead the minimization problem
is solved k times where xj, bj are the j-th columns of the matrices X, B respectively. When A is of full column rank there exits a unique solution XLS that solves the above minimization problem. By using the routine LQRRV, XLS is computed.
USE LQRRV_INT
USE WRRRN_INT
USE SGEMM_INT
! Declare variables
INTEGER LDA, LDX, NCA, NRA, NUMEXC
PARAMETER (NCA=3, NRA=5, NUMEXC=2, LDA=NRA, LDX=NCA)
! SPECIFICATIONS FOR LOCAL VARIABLES
REAL X(LDX,NUMEXC)
! SPECIFICATIONS FOR SAVE VARIABLES
REAL A(LDA,NCA+NUMEXC)
SAVE A
! SPECIFICATIONS FOR SUBROUTINES
!
! Set values for A and the
! righthand sides.
!
! A = ( 1 2 4 | 7 10)
! ( 1 4 16 | 21 10)
! ( 1 6 36 | 43 9 )
! ( 1 8 64 | 73 10)
! ( 1 10 100 | 111 10)
!
DATA A/5*1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 4.0, 16.0, 36.0, 64.0, &
100.0, 7.0, 21.0, 43.0, 73.0, 111.0, 2*10., 9., 2*10./
!
!
! QR factorization and solution
CALL LQRRV (A, X)
CALL WRRRN (’SOLUTIONS 1-2’, X)
! Compute residuals and print
CALL SGEMM (’N’, ’N’, NRA, NUMEXC, NCA, 1.E0, A, LDA, X, LDX, &
-1.E0, A(1:,(NCA+1):),LDA)
CALL WRRRN (’RESIDUALS 1-2’, A(1:,(NCA+1):))
!
END
SOLUTIONS 1-2
1 2
1 1.00 10.80
2 1.00 -0.43
3 1.00 0.04
RESIDUALS 1-2
1 2
1 0.0000 0.0857
2 0.0000 -0.3429
3 0.0000 0.5143
4 0.0000 -0.3429
5 0.0000 0.0857
The previous example is repeated here as a distributed computing example. Given a real m × k matrix B it is often necessary to compute the k least-squares solutions of the linear system AX = B, where A is an m × n real matrix. When m > n the system is considered overdetermined. A solution with a zero residual normally does not exist. Instead the minimization problem
is solved k times where xj, bj are the j-th columns of the matrices X, B respectively. When A is of full column rank there exits a unique solution XLS that solves the above minimization problem. By using the routine LQRRV, XLS is computed. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Utilities) used to map and unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LQRRV_INT
USE SGEMM_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE ‘mpif.h’
! Declare variables
INTEGER LDA, LDX, NCA, NRA, NUMEXC, DESCA(9), DESCX(9)
INTEGER INFO, MXCOL, MXLDA, MXLDX, MXCOLX
INTEGER K REAL, ALLOCATABLE :: A(:,:), X(:)
REAL, ALLOCATABLE :: A0(:,:), X0(:)
PARAMETER (NRA=5, NCA=3, NUMEXC=2, LDA=NRA, LDX=NCA)
! Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
ALLOCATE (A(LDA,NCA+NUMEXC), X(LDX, NUMEXC))
! Set values for A and the righthand sides
A(1,:) = (/ 1.0, 2.0, 4.0, 7.0, 10.0/)
A(2,:) = (/ 1.0, 4.0, 16.0, 21.0, 10.0/)
A(3,:) = (/ 1.0, 6.0, 36.0, 43.0, 9.0/)
A(4,:) = (/ 1.0, 8.0, 64.0, 73.0, 10.0/)
A(5,:) = (/ 1.0, 10.0, 100.0, 111.0, 10.0/)
ENDIF
! Set up a 1D processor grid and define
! its context ID, MP_ICTXT
CALL SCALAPACK_SETUP(NRA, NCA+NUMEXC, .TRUE., .TRUE.)
! Get the array descriptor entities MXLDA,
! and MXCOL
CALL SCALAPACK_GETDIM(NRA, NCA+NUMEXC, MP_MB, MP_NB, MXLDA, MXCOL)
! Set up the array descriptors
CALL DESCINIT(DESCA, NRA, NCA+NUMEXC, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
MXLDA, INFO)
K = MIN0(NRA, NCA)
! Need to get dimensions of local x
! separate since x's leading
! dimension differs from A's
! Get the array descriptor entities
! MXLDX, AND MXCOLX
CALL SCALAPACK_GETDIM(K, NUMEXC, MP_MB, MP_NB, MXLDX, MXCOLX)
CALL DESCINIT (DESCX, K, NUMEXC, MP_NB, MP_NB, 0, 0, MP_ICTXT, &
MXLDX, INFO)
! Allocate space for the local arrays
ALLOCATE (A0(MXLDA,MXCOL), X0(MXLDX,MXCOLX))
! Map input array to the processor grid
CALL SCALAPACK_MAP(A, DESCA, A0)
! Solve the least squares problem
CALL LQRRV (A0, X0)
! Unmap the results from the distributed
! arrays back to a non-distributed array.
! After the unmap, only Rank=0 has the full
! array.
CALL SCALAPACK_UNMAP(X0, DESCX, X)
! Print results.
! Only Rank=0 has the solution, X.
IF(MP_RANK .EQ. 0)THEN
CALL WRRRN (’SOLUTIONS 1-2’, X)
! Compute residuals and print
CALL SGEMM (’N’, ’N’, NRA, NUMEXC, NCA, 1.E0, A, LDA, X, LDX, &
-1.E0, A(1:,(NCA+1):),LDA)
CALL WRRRN (’RESIDUALS 1-2’, A(1:,(NCA+1):))
ENDIF
! Exit ScaLAPACK usage
CALL SCALAPACK_EXIT(MP_ICTXT)
! Shut down MPI
MP_NPROCS = MP_SETUP(‘FINAL’)
END