CANVC
Performs canonical correlation analysis from a variance‑covariance matrix or a correlation matrix.
Required Arguments
NDF — Number of degrees of freedom in the covariance or correlation matrix. (Input)
If NDF is unknown, an estimate of NDF = 100 is suggested, in which case the last four columns of CORR are meaningless.
COV — NVAR1 + NVAR2 by NVAR1 + NVAR2 matrix containing the covariance or correlation matrix. (Input)
Routines COVPL, RBCOV, or CORVC (see Chapter 3, “Correlation”) may be used to calculate COV from a data matrix. COV must be nonnegative definite within a tolerance of 100.0 * AMACH(4). Only the upper triangle of COV is referenced.
IND1 — Vector of length NVAR1 containing the column and row numbers in COV for the group 1 variables. (Input)
IND2 — Vector of length NVAR2 containing the column and row numbers in COV for the group 2 variables. (Input)
CORR — NV by 6 matrix containing the output statistics. (Output)
NV is the minimum of NVAR1 and NVAR2.
Col | Statistic
1 | Canonical correlations sorted from the largest to the smallest.
2 | Wilks’ lambda for testing that the current and all smaller canonical correlations are zero.
3 | Rao’s F corresponding to Wilks’ lambda. If the canonical correlation is greater than 0.99999, F is set to 9999.99.
4 | Numerator degrees of freedom for the F.
5 | Denominator degrees of freedom for the F.
6 | Probability of a larger F statistic.
If an F statistic is negative, then CORR(i, 6) is set to one. If either CORR(i, 4) or CORR(i, 5) is not positive, then CORR(i, 6) is set to the missing value code (NaN).
COEF1 — NVAR1 by NVAR1 matrix containing the group 1 canonical coefficients. (Output)
The columns of COEF1 contain the vectors of canonical coefficients for group 1.
COEF2 — NVAR2 by NVAR2 matrix containing the group 2 canonical coefficients. (Output)
The columns of COEF2 contain the vectors of canonical coefficients for group 2.
COEFR1 — NVAR1 by NV matrix containing the correlations between the group 1 variables and the group 1 canonical scores. (Output)
NV is the minimum of NVAR1 and NVAR2.
COEFR2 — NVAR2 by NV matrix containing the correlations between the group 2 variables and the group 2 canonical scores. (Output)
NV is the minimum of NVAR1 and NVAR2.
Optional Arguments
NVAR1 — Number of variables in group 1. (Input)
Default: NVAR1 = size (IND1,1).
NVAR2 — Number of variables in group 2. (Input)
Default: NVAR2 = size (IND2,1).
LDCOV — Leading dimension of COV exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOV = size (COV,1).
IPRINT — Printing option. (Input)
Default: IPRINT = 0.
IPRINT | Action
0 | No printing.
1 | Printing of CORR, COEF1, COEF2, COEFR1, and COEFR2 is performed.
LDCORR — Leading dimension of CORR exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCORR = size (CORR,1).
LDCOF1 — Leading dimension of COEF1 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOF1 = size (COEF1,1).
LDCOF2 — Leading dimension of COEF2 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOF2 = size (COEF2,1).
LDCFR1 — Leading dimension of COEFR1 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCFR1 = size (COEFR1,1).
LDCFR2 — Leading dimension of COEFR2 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCFR2 = size (COEFR2,1).
FORTRAN 90 Interface
Generic: CALL CANVC (NDF, COV, IND1, IND2, CORR, COEF1, COEF2, COEFR1, COEFR2 [, …])
Specific: The specific interface names are S_CANVC and D_CANVC.
FORTRAN 77 Interface
Single: CALL CANVC (NDF, NVAR1, NVAR2, COV, LDCOV, IND1, IND2, IPRINT, CORR, LDCORR, COEF1, LDCOF1, COEF2, LDCOF2, COEFR1, LDCFR1, COEFR2, LDCFR2)
Double: The double precision name is DCANVC.
Description
Routine CANVC computes the canonical correlations, the canonical coefficients, Wilks’ lambda (for testing the independence of two sets of variates), and a series of tests due to Bartlett for testing that the k-th largest canonical correlation and all smaller canonical correlations are simultaneously zero. The covariance matrix is used in these computations.
The group 1 variables covariance matrix is first extracted from COV and placed in the matrix S11. Similarly, the group 2 variables covariance matrix is placed in S22. The “standardized” cross covariance matrix is then computed as:
$$C = S_{11}^{-T/2} \, S_{12} \, S_{22}^{-1/2}$$
where S12 is the NVAR1 × NVAR2 matrix of covariances between the group 1 and group 2 variables, and S^{1/2} denotes the upper triangular Cholesky factor R in the factorization S = R^T R, so that S^{-T/2} is the inverse of R^T. In the computation of C and in the following, it is assumed that NVAR1 is greater than NVAR2. The group 1 and group 2 variables should be interchanged in the following if this is not the case.
The canonical correlations are computed as the singular values of the matrix C. The canonical coefficients are obtained from the left and right orthogonal matrices resulting from the singular value decomposition of C. In particular, for Γ1 = COEF1,
$$\Gamma_1 = S_{11}^{-1/2} L$$
where L is the left orthogonal matrix from the singular value decomposition.
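The whitening step can be sketched for the simplest case, NVAR2 = 1, where C is a single column and its only singular value is its Euclidean norm. The sketch below assumes the standardized cross covariance has the form C = R11^{-T} S12 R22^{-1}, with R11 and R22 the upper triangular Cholesky factors of S11 and S22. The matrix values and all variable names are hypothetical, and the hand-written Cholesky loop is only an illustration, not the routine's internal code.

```fortran
program single_canonical_correlation
  implicit none
  integer, parameter :: n1 = 2          ! number of group 1 variables
  real :: s11(n1,n1), s12(n1), s22      ! blocks of the covariance matrix
  real :: r(n1,n1), c(n1), rho
  integer :: i, j

  ! Hypothetical correlation matrix blocks (illustrative values only).
  s11 = reshape([1.0, 0.5, 0.5, 1.0], [n1, n1])
  s12 = [0.6, 0.4]
  s22 = 1.0

  ! Upper triangular Cholesky factor R with S11 = R^T R.
  r = 0.0
  do j = 1, n1
     do i = 1, j - 1
        r(i,j) = (s11(i,j) - dot_product(r(1:i-1,i), r(1:i-1,j))) / r(i,i)
     end do
     r(j,j) = sqrt(s11(j,j) - dot_product(r(1:j-1,j), r(1:j-1,j)))
  end do

  ! Forward substitution: solve R^T c = S12, then scale by 1/sqrt(S22).
  do i = 1, n1
     c(i) = (s12(i) - dot_product(r(1:i-1,i), c(1:i-1))) / r(i,i)
  end do
  c = c / sqrt(s22)

  ! With a single group 2 variable, the canonical correlation is ||c||.
  rho = sqrt(dot_product(c, c))
  print '(a, f8.4)', 'Canonical correlation: ', rho
end program single_canonical_correlation
```

For these illustrative values the squared norm of c is 0.28/0.75, so the printed correlation should be about 0.611. With more than one group 2 variable, a singular value decomposition of C would be needed in place of the norm.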
Similarly, the correlations between the original variables and the canonical variables, R1 = COEFR1, are obtained for the group 1 variables as:
$$R_1 = \Delta_{11}^{-1/2} \, S_{11} \, \Gamma_1$$
where Δ11 is a diagonal matrix containing the diagonal of S11 along its diagonal.
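As a rough numerical check of this relation, the sketch below assumes R1 = Δ11^{-1/2} S11 Γ1 and applies it to the group 1 block of the correlation matrix from the Example section, together with the first column of the printed group 1 canonical coefficients. The program and variable names are hypothetical, and the inputs are the rounded values printed in this document.

```fortran
program check_coefr1
  implicit none
  integer, parameter :: n = 6
  real :: s11(n,n), gamma1(n), r1(n)
  integer :: i

  ! Group 1 block of COV from the Example (symmetric, so storage
  ! order does not matter).
  s11 = reshape([ &
       1.0000, 0.1839, 0.0489,  0.0186, 0.0782,  0.1147, &
       0.1839, 1.0000, 0.2220,  0.1861, 0.3355,  0.1021, &
       0.0489, 0.2220, 1.0000,  0.2707, 0.2302,  0.0931, &
       0.0186, 0.1861, 0.2707,  1.0000, 0.2950, -0.0438, &
       0.0782, 0.3355, 0.2302,  0.2950, 1.0000,  0.2087, &
       0.1147, 0.1021, 0.0931, -0.0438, 0.2087,  1.0000], [n, n])

  ! First column of the printed group 1 canonical coefficients.
  gamma1 = [0.326, 0.481, 0.456, 0.202, 0.184, -0.027]

  ! Covariances between the variables and the first canonical score,
  ! scaled by each variable's standard deviation.
  r1 = matmul(s11, gamma1)
  do i = 1, n
     r1(i) = r1(i) / sqrt(s11(i,i))
  end do

  print '(6f8.4)', r1
end program check_coefr1
```

Because the coefficients are rounded to three decimals, the result should match the first column of the printed COEFR1 table only to about three decimal places.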
Wilks’ lambda, Bartlett’s tests, Rao’s F corresponding to these tests, the numerator and denominator degrees of freedom of F, and the significance level of F are computed as in Rao (1973, page 556). Bartlett’s tests are computed as
$$\Lambda_i = \prod_{j=i}^{q} \left(1 - \rho_j^2\right)$$
where q = NVAR2 is the number of canonical correlations, the canonical correlations are ordered from largest to smallest, and ρ_j denotes the j-th largest canonical correlation. Wilks’ lambda is given as Λ_1. The degrees of freedom in the numerator of the corresponding Rao’s F statistic is given as
d1 = pu
where p = v1 ‑ i + 1, u = v2 ‑ i + 1, v1 = NVAR2, and v2 = NVAR1. Let
where t is the degrees of freedom in COV, and let
$$s = \sqrt{\frac{p^2 u^2 - 4}{p^2 + u^2 - 5}}$$
if p^2 + u^2 - 5 ≠ 0, and let s = 2 otherwise. Then, Rao’s F corresponding to Bartlett’s test is computed as
$$F = \frac{1 - \Lambda_i^{1/s}}{\Lambda_i^{1/s}} \cdot \frac{ms - pu/2 + 1}{pu}$$
Rao’s F has denominator degrees of freedom d2 = ms - pu/2 + 1. The significance level of F is obtained from the standard F distribution.
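The arithmetic in this section can be traced with a short sketch that starts from the two canonical correlations printed in the Example section. It assumes the product form Λ_i = ∏ (1 - ρ_j²) for j = i, …, q and the usual Rao form F = ((1 - Λ^{1/s}) / Λ^{1/s}) (d2 / d1). The value of m is an assumption: it is taken here as t - (v1 + v2 + 1)/2 because that choice reproduces the denominator degrees of freedom 186 and 94 shown in the example output. The program and variable names are hypothetical.

```fortran
program rao_f_sketch
  implicit none
  integer, parameter :: v1 = 2, v2 = 6   ! v1 = NVAR2, v2 = NVAR1
  real, parameter :: t = 100.0           ! degrees of freedom (NDF)
  real :: rho(v1), lambda, m, s, d1, d2, f, root
  integer :: i, p, u

  ! Canonical correlations as printed in the Example output.
  rho = [0.6093, 0.1431]

  ! ASSUMPTION: m chosen to match the printed denominator df values.
  m = t - 0.5 * real(v1 + v2 + 1)

  do i = 1, v1
     ! Bartlett/Wilks statistic for correlations i, ..., q.
     lambda = product(1.0 - rho(i:v1)**2)

     p = v1 - i + 1
     u = v2 - i + 1
     d1 = real(p * u)                    ! numerator degrees of freedom

     if (p*p + u*u - 5 /= 0) then
        s = sqrt(real(p*p*u*u - 4) / real(p*p + u*u - 5))
     else
        s = 2.0                          ! fallback value as stated in the text
     end if

     d2 = m * s - 0.5 * d1 + 1.0         ! denominator degrees of freedom
     root = lambda**(1.0 / s)
     f = (1.0 - root) / root * d2 / d1

     print '(i2, f9.4, f9.3, 2f8.1)', i, lambda, f, d1, d2
  end do
end program rao_f_sketch
```

Under these assumptions the two printed rows should be close to the Wilks Lambda, Raos F, Num. df, and Denom. df columns of the example output, up to rounding of the input correlations. The tail probability in the last column would additionally require an F distribution function, which is left out of this sketch.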
Comments
1. Workspace may be explicitly provided, if desired, by use of C2NVC/DC2NVC. The reference is:
CALL C2NVC (NDF, NVAR1, NVAR2, COV, LDCOV, IND1, IND2, IPRINT, CORR, LDCORR, COEF1, LDCOF1, COEF2, LDCOF2, COEFR1, LDCFR1, COEFR2, LDCFR2, R, S, STD1, STD2, WKA, WK)
The additional arguments are as follows:
R — Work vector of length NVAR1**2.
S — Work vector of length NVAR2**2.
STD1 — Work vector of length NVAR1.
STD2 — Work vector of length NVAR2.
WKA — Work vector of length (NVAR1 + NVAR2)**2.
WK — Work vector of length 3 * max(NVAR1, NVAR2).
2. Informational errors
Type | Code | Description
3 | 1 | The standardized cross covariance matrix is not of full rank or is very ill‑conditioned. Small canonical correlations may not be accurate.
4 | 2 | COV is not nonnegative definite.
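As a sketch of Comment 1, the call below supplies the workspace explicitly through C2NVC, with each work vector dimensioned to the length listed above. The argument order is copied from the reference in Comment 1. The REAL type of the work vectors, the use of an implicit (FORTRAN 77 style) interface, and the 3 by 3 matrix values are assumptions made for illustration, and the program needs the library to link.

```fortran
program c2nvc_workspace_sketch
  implicit none
  integer, parameter :: nvar1 = 2, nvar2 = 1, nv = 1
  integer, parameter :: ndf = 100, iprint = 1
  integer, parameter :: ldcov = nvar1 + nvar2
  integer :: ind1(nvar1), ind2(nvar2)
  real :: cov(ldcov, nvar1+nvar2)
  real :: corr(nv,6), coef1(nvar1,nvar1), coef2(nvar2,nvar2)
  real :: coefr1(nvar1,nv), coefr2(nvar2,nv)
  ! Work vectors sized as listed in Comment 1 (REAL type is assumed).
  real :: r(nvar1**2), s(nvar2**2), std1(nvar1), std2(nvar2)
  real :: wka((nvar1+nvar2)**2), wk(3*max(nvar1,nvar2))
  external c2nvc

  ! Hypothetical 3 x 3 correlation matrix (illustrative values only).
  cov = reshape([1.0, 0.5, 0.6, &
                 0.5, 1.0, 0.4, &
                 0.6, 0.4, 1.0], [ldcov, ldcov])
  ind1 = [1, 2]      ! group 1: variables 1 and 2
  ind2 = [3]         ! group 2: variable 3

  ! Leading dimensions are passed explicitly in this interface.
  call c2nvc(ndf, nvar1, nvar2, cov, ldcov, ind1, ind2, iprint, &
             corr, nv, coef1, nvar1, coef2, nvar2, &
             coefr1, nvar1, coefr2, nvar2, &
             r, s, std1, std2, wka, wk)
end program c2nvc_workspace_sketch
```

Supplying the work vectors this way is mainly useful when the caller wants control over where the workspace is allocated; the results are expected to be the same as those from CANVC.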
Example
The following example is taken from Van de Geer (1971). There are six group 1 variables and two group 2 variables. The maximum correlation turns out to be 0.609.
USE CANVC_INT
IMPLICIT NONE
INTEGER IPRINT, LDCFR1, LDCFR2, LDCOF1, LDCOF2, LDCORR, &
LDCOV, NDF, NV, NVAR1, NVAR2
PARAMETER (IPRINT=1, LDCFR1=6, LDCFR2=2, LDCOF1=6, LDCOF2=2, &
LDCORR=2, LDCOV=8, NDF=100, NV=2, NVAR1=6, NVAR2=2)
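! Index vectors and output arrays. Shapes follow the argument descriptions:
! CORR is NV by 6, COEFR1 is NVAR1 by NV, and COEFR2 is NVAR2 by NV.
INTEGER IND1(NVAR1), IND2(NVAR2)
REAL COEF1(NVAR1,NVAR1), COEF2(NVAR2,NVAR2), &
COEFR1(NVAR1,NV), COEFR2(NVAR2,NV), &
CORR(NV,6), COV(LDCOV,NVAR1+NVAR2)
! Correlation matrix for the eight variables, stored by columns.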
DATA COV/1.0000, 0.1839, 0.0489, 0.0186, 0.0782, 0.1147, 0.2137, &
0.2742, 0.1839, 1.0000, 0.2220, 0.1861, 0.3355, 0.1021, &
0.4105, 0.4043, 0.0489, 0.2220, 1.0000, 0.2707, 0.2302, &
0.0931, 0.3240, 0.4047, 0.0186, 0.1861, 0.2707, 1.0000, &
0.2950, -0.0438, 0.2930, 0.2407, 0.0782, 0.3355, 0.2302, &
0.2950, 1.0000, 0.2087, 0.2995, 0.2863, 0.1147, 0.1021, &
0.0931, -0.0438, 0.2087, 1.0000, 0.0760, 0.0702, 0.2137, &
0.4105, 0.3240, 0.2930, 0.2995, 0.0760, 1.0000, 0.6247, &
0.2742, 0.4043, 0.4047, 0.2407, 0.2863, 0.0702, 0.6247, &
1.0000/
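! Variables 1 through 6 form group 1; variables 7 and 8 form group 2.
DATA IND1/1, 2, 3, 4, 5, 6/, IND2/7, 8/
! IPRINT=1 prints CORR, COEF1, COEF2, COEFR1, and COEFR2.
CALL CANVC (NDF, COV, IND1, IND2, CORR, &
COEF1, COEF2, COEFR1, COEFR2, IPRINT=IPRINT)
!
END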
Output
*** Canonical Correlations Statistics ***
   Canonical                                                 Prob. of
   Correlations  Wilks Lambda   Raos F   Num. df  Denom. df  Larger F
1     0.6093        0.6159       4.250      12       186      0.0000
2     0.1431        0.9795       0.393       5        94      0.8524
Group One Canonical Coefficients
1 2 3 4 5 6
1 0.326 0.411 -0.799 0.358 -0.032 0.053
2 0.481 -0.340 -0.083 -0.766 -0.484 -0.139
3 0.456 0.718 0.625 0.134 -0.056 0.038
4 0.202 -0.689 0.060 0.732 -0.335 0.080
5 0.184 -0.125 -0.064 -0.045 1.079 -0.225
6 -0.027 -0.174 0.054 -0.086 -0.021 1.017
Group Two Canonical Coefficients
1 2
1 0.464 1.194
2 0.642 -1.108
Correlations Between the Group One Variables
and the Group One Canonical Scores
1 2
1 0.4517 0.3408
2 0.7388 -0.2932
3 0.6733 0.4313
4 0.4769 -0.5799
5 0.5299 -0.2811
6 0.1319 -0.0903
Correlations Between the Group Two Variables
and the Group Two Canonical Scores
1 2
1 0.8653 0.5013
2 0.9320 -0.3625