Maximums likelihood or least‑squares estimates for principal components from one or more matrices.
Required Arguments
COV — NVAR by NVAR by NMAT array containing the NMAT covariance or correlation matrices. (Input) Only the upper triangular elements of each matrix are referenced.
ANI — Vector of length NMAT containing the number of observations in each of the covariance matrices. (Input) For least‑squares estimation, the square root of ANI(I) is the weight to be used for the I‑th covariance matrix. Since the elements of ANI are used as weights, they need not be integers.
EVEC — NVAR by NVAR matrix containing the estimated principal components. (Output) Each column of EVEC contains a principal component vector (an “eigenvector”). The ordering of the eigenvectors is such that the sum of the corresponding eigenvalues are ordered from largest to smallest. Each vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
NVAR — Number of variables in each matrix. (Input) NVAR must be 2 or greater. Default: NVAR = size (COV,2).
NMAT — Number of matrices. (Input) Default: NMAT = size (COV,3).
LDCOV — Leading and second dimensions of COV exactly as specified in the dimension statement of the calling program. (Input) The first two dimensions of COV must be equal. Default: LDCOV = size (COV,1).
IMETH — Method to be used for extracting the estimated principal components. (Input) For IMETH = 0, maximum likelihood estimation is used. For IMETH = 1, least‑squares estimation is used. Default: IMETH = 0.
LDEVEC — Leading dimension of EVEC exactly as specified in the dimension statement in the calling program. (Input) Default: LDEVEC = size (EVEC,1).
FORTRAN 90 Interface
Generic: CALLKPRIN (COV, ANI, EVEC[, …])
Specific: The specific interface names are S_KPRIN and D_KPRIN.
FORTRAN 77 Interface
Single: CALLKPRIN (NVAR, NMAT, COV, LDCOV, ANI, IMETH, EVEC, LDEVEC)
Double: The double precision name is DKPRIN.
Description
Routine KPRIN is the IMSL version of the F‑G diagonalization routine of Flury and Constantine (1985) with modifications as discussed by Clarkson (1988a, 1988b). Let k = NMAT. Routine KPRIN computes the common principal components of k≥ 1 covariance (or correlation) matrices using either a least‑squares or a maximum likelihood criterion. Computing common principal components is equivalent to finding the “eigenvectors” that best simultaneously diagonalize k symmetric matrices. (Note that when k = 1, both least‑squares and maximum likelihood estimation yield the eigenvectors of the input matrix.) See Flury (1988) for applications of common principal components.
The algorithm proceeds by accumulating simple rotations as follows: Initial estimates of the diagonalizing principal components are found as the eigenvectors of the summed covariance matrices (unless K3RIN is used, see Comment 3). The covariance matrices are then pre- and post‑multiplied by the initial estimates to obtain approximately diagonal matrices. Let
denote the l‑th 2 × 2 matrix obtained from the (i, j), (i, i), and (j, j) elements of Sl, where Sl is the l‑th covariance matrix in COV. Then, for each i and j, a Jacobi rotation is found and applied such that the least‑squares or maximum likelihood criterion is optimized over all k matrices in
An iteration consists of computing and applying a Jacobi rotation for all p(p‑ 1)/2 possible off‑diagonal elements (i, j) where p = NVAR. A maximum of 50 iterations are allowed before convergence. Convergence is assumed when the maximum change in the any element in the eigenvectors from one iteration to the next is less than 0.0001.
Let Γ denote the current estimates of the optimizing principal components. Then, maximizing the multivariate normal likelihood is equivalent to minimizing the criterion
where Si is the i‑th covariance matrix, ni is its degrees of freedom, and
is the estimate of the covariance matrix under the common principal components model.
During each Jacobi iteration, an optimal orthogonal matrix Tij is found that rotates the two vectors in columns i and j of Γ. When restricted to Tij, the criterion above becomes
Γ is updated as ΓTij. When convergence has been reached (the maximum change in Γ is less than 0.0001), Γ contains the optimizing principal components. Initially, Γ is taken as the eigenvectors of the matrix ΣiSi.
In least‑squares estimation, the matrices Tij are found such that the sum of the squared off‑diagonal elements in the resulting “diagonalized” matrices are minimized. That is, Tij is found to minimize
where vij is the vector of length k containing the off‑diagonal elements in the matrices
See Flury and Gautschi (1986) for further details on the general algorithm, especially in maximum likelihood estimation. See Clarkson (1988b) for details of the least‑squares algorithm.
If the “residual” matrices ΓTSiΓ are desired, they may be obtained in the work vector H returned from K2RIN or from the matrix COV returned from K3RIN. If the least‑squares criterion is needed, it is easily computed as the sum of the squared off‑diagonal elements in H (or COV). To compute the likelihood ratio criterion, the eigenvalues of each matrix in COV first need to be computed. Denote the eigenvalues from the l‑th matrix by λlj, and let
be the eigenvalues obtained under the common principal component model (and returned as the diagonal elements of H or, from K3RIN, COV). Then, the log‑likelihood‑ratio statistic for testing,
is diagonal, l= 1, …, k, is computed as:
The distribution of under H0 is asymptotically X2 with (k‑ 1)p(p‑ 1)/2 degrees of freedom (see Flury 1984).
Comments
1. Workspace may be explicitly provided, if desired, by use of K2RIN/DK2RIN. The reference is:
H — Work vector of length NVAR2*NMAT. On return from K2RIN, H may be treated as an array dimensioned as H(NVAR, NVAR, NMAT). Each NVAR by NVAR matrix in H is computed as (EVEC)T*COV(I) *EVEC, i.e., H contains the “eigenvalues” and the “residuals” for each covariance matrix. Here, COV(I) is the I‑th covariance matrix.
AUX — Work vector of length max(3 *NMAT, NVAR2).
BOLD — Work vector of length NVAR2.
G — Work vector of length 2 *NVAR.
T — Work vector of length 4 *NMAT.
2. Informational errors
Type
Code
Description
3
1
Convergence did not occur within 50 iterations. Convergence is assumed.
4
2
An input matrix is singular. Singular input matrices are not allowed in maximum likelihood estimation.
3. If user specified initial estimates for EVEC are desired (and argument error checking is not needed), then the routine K3RIN (DK3RIN) may be used. The calling sequence is
On input, EVEC contains the initial estimates of the common principal components (EVEC must be an orthogonal matrix). On output, COV contains the NMAT matrices (EVEC )T*COV (I) *EVEC. The user should be wary of stationary points in the likelihood if K3RIN is used.
Example
The following example is taken from Flury and Constantine (1985). It involves two 4 by 4 covariance matrices. The two covariance matrices are given as:
USE KPRIN_INT
USE WRRRN_INT
IMPLICIT NONE
! SPECIFICATIONS FOR PARAMETERS
INTEGER LDCOV, LDEVEC, NGROUP, NVAR
PARAMETER (LDCOV=4, LDEVEC=4, NGROUP=2, NVAR=4)
!
REAL ANI(NGROUP), COV(LDCOV,LDCOV,NGROUP), EVEC(LDEVEC,NVAR)