CANVC



Performs canonical correlation analysis from a variance‑covariance matrix or a correlation matrix.

Required Arguments

NDF — Number of degrees of freedom in the covariance or correlation matrix. (Input)
If NDF is unknown, NDF = 100 is suggested as an estimate; in that case the last four columns of CORR are meaningless.

COV — (NVAR1 + NVAR2) by (NVAR1 + NVAR2) matrix containing the covariance or correlation matrix. (Input)
Routines COVPL, RBCOV, or CORVC (see Chapter 3, “Correlation”) may be used to calculate COV from a data matrix. COV must be nonnegative definite within a tolerance of 100.0 * AMACH(4). Only the upper triangle of COV is referenced.

IND1 — Vector of length NVAR1 containing the column and row numbers in COV for the group 1 variables. (Input)

IND2 — Vector of length NVAR2 containing the column and row numbers in COV for the group 2 variables. (Input)

CORR — NV by 6 matrix containing the output statistics. (Output)
NV is the minimum of NVAR1 and NVAR2.

Col   Statistic
1     Canonical correlations sorted from the largest to the smallest.
2     Wilks’ lambda for testing that the current and all smaller canonical correlations are zero.
3     Rao’s F corresponding to Wilks’ lambda. If the canonical correlation is greater than 0.99999, F is set to 9999.99.
4     Numerator degrees of freedom for the F.
5     Denominator degrees of freedom for the F.
6     Probability of a larger F statistic.

If an F statistic is negative, then CORR(i, 6) is set to one. If either CORR(i, 4) or CORR(i, 5) is not positive, then CORR(i, 6) is set to the missing value code (NaN).

COEF1 — NVAR1 by NVAR1 matrix containing the group 1 canonical coefficients. (Output)
The columns of COEF1 contain the vectors of canonical coefficients for group 1.

COEF2 — NVAR2 by NVAR2 matrix containing the group 2 canonical coefficients. (Output)
The columns of COEF2 contain the vectors of canonical coefficients for group 2.

COEFR1 — NVAR1 by NV matrix containing the correlations between the group 1 variables and the group 1 canonical scores. (Output)
NV is the minimum of NVAR1 and NVAR2.

COEFR2 — NVAR2 by NV matrix containing the correlations between the group 2 variables and the group 2 canonical scores. (Output)
NV is the minimum of NVAR1 and NVAR2.

Optional Arguments

NVAR1 — Number of variables in group 1. (Input)
Default: NVAR1 = size (IND1,1).

NVAR2 — Number of variables in group 2. (Input)
Default: NVAR2 = size (IND2,1).

LDCOV — Leading dimension of COV exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOV = size (COV,1).

IPRINT — Printing option. (Input)
Default: IPRINT = 0.

 

IPRINT   Action
0        No printing.
1        Printing of CORR, COEF1, COEF2, COEFR1, and COEFR2 is performed.

LDCORR — Leading dimension of CORR exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCORR = size (CORR,1).

LDCOF1 — Leading dimension of COEF1 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOF1 = size (COEF1,1).

LDCOF2 — Leading dimension of COEF2 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOF2 = size (COEF2,1).

LDCFR1 — Leading dimension of COEFR1 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCFR1 = size (COEFR1,1).

LDCFR2 — Leading dimension of COEFR2 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCFR2 = size (COEFR2,1).

FORTRAN 90 Interface

Generic: CALL CANVC (NDF, COV, IND1, IND2, CORR, COEF1, COEF2, COEFR1, COEFR2 [, …])

Specific: The specific interface names are S_CANVC and D_CANVC.

FORTRAN 77 Interface

Single: CALL CANVC (NDF, NVAR1, NVAR2, COV, LDCOV, IND1, IND2, IPRINT, CORR, LDCORR, COEF1, LDCOF1, COEF2, LDCOF2, COEFR1, LDCFR1, COEFR2, LDCFR2)

Double: The double precision name is DCANVC.

Description

Routine CANVC computes the canonical correlations, the canonical coefficients, Wilks’ lambda (for testing the independence of two sets of variates), and a series of tests due to Bartlett for testing that the k‑th largest canonical correlation and all smaller ones are simultaneously zero. The covariance matrix is used in these computations.

The group 1 variables covariance matrix is first extracted from COV and placed in the matrix S11. Similarly, the group 2 variables covariance matrix is placed in S22. The “standardized” cross covariance matrix is then computed as:

$$C = S_{11}^{-T/2} \, S_{12} \, S_{22}^{-1/2}$$

where S12 is the NVAR1 × NVAR2 matrix of covariances between the group 1 and group 2 variables, and $S^{1/2}$ denotes the upper triangular Cholesky factor $R$ in the factorization $S = R^T R$. In the computation of C and in the following, it is assumed that NVAR1 is greater than NVAR2. The group 1 and group 2 variables should be interchanged in the following if this is not the case.

The canonical correlations are computed as the singular values of the matrix C. The canonical coefficients are obtained from the left and right orthogonal matrices resulting from the singular value decomposition of C. In particular, for Γ1 = COEF1,

$$\Gamma_1 = S_{11}^{-1/2} L$$

where L is the left orthogonal matrix from the singular value decomposition.
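
As an illustration only (this is not the IMSL implementation), the extraction, standardization, and singular value steps can be sketched in pure Python for the special case of exactly two group 2 variables. The helper names `cholesky_upper`, `solve_rt`, and `canonical_correlations` are hypothetical. The sketch assumes C = S11^(-T/2) S12 S22^(-1/2) with upper triangular Cholesky factors, and it obtains the singular values of C as square roots of the eigenvalues of the 2 × 2 matrix C^T C rather than from a general SVD. The matrix is the correlation matrix from the Example section.

```python
import math

def cholesky_upper(s):
    """Return the upper triangular R with S = R^T R (S positive definite)."""
    n = len(s)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            acc = s[i][j] - sum(r[k][i] * r[k][j] for k in range(i))
            r[i][j] = math.sqrt(acc) if i == j else acc / r[i][i]
    return r

def solve_rt(r, b):
    """Solve R^T y = b by forward substitution (R upper triangular)."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(r[k][i] * y[k] for k in range(i))) / r[i][i])
    return y

def canonical_correlations(cov, ind1, ind2):
    """Canonical correlations, largest first, for two group 2 variables."""
    assert len(ind2) == 2 and len(ind1) >= 2
    i1 = [i - 1 for i in ind1]  # IND1 and IND2 hold 1-based indices
    i2 = [i - 1 for i in ind2]
    s11 = [[cov[a][b] for b in i1] for a in i1]
    s22 = [[cov[a][b] for b in i2] for a in i2]
    s12 = [[cov[a][b] for b in i2] for a in i1]
    r11, r22 = cholesky_upper(s11), cholesky_upper(s22)
    # X = R11^{-T} S12, solved one column at a time.
    cols = [solve_rt(r11, [row[c] for row in s12]) for c in range(2)]
    x = [[cols[c][i] for c in range(2)] for i in range(len(i1))]
    # C = X R22^{-1}: each row c of C satisfies R22^T c = (row of X).
    c = [solve_rt(r22, row) for row in x]
    # Eigenvalues of the 2 x 2 matrix C^T C are the squared singular values.
    m00 = sum(row[0] * row[0] for row in c)
    m01 = sum(row[0] * row[1] for row in c)
    m11 = sum(row[1] * row[1] for row in c)
    half_tr = 0.5 * (m00 + m11)
    disc = math.sqrt(max(half_tr * half_tr - (m00 * m11 - m01 * m01), 0.0))
    return [math.sqrt(half_tr + disc), math.sqrt(max(half_tr - disc, 0.0))]

COV = [
    [1.0000, 0.1839, 0.0489, 0.0186, 0.0782, 0.1147, 0.2137, 0.2742],
    [0.1839, 1.0000, 0.2220, 0.1861, 0.3355, 0.1021, 0.4105, 0.4043],
    [0.0489, 0.2220, 1.0000, 0.2707, 0.2302, 0.0931, 0.3240, 0.4047],
    [0.0186, 0.1861, 0.2707, 1.0000, 0.2950, -0.0438, 0.2930, 0.2407],
    [0.0782, 0.3355, 0.2302, 0.2950, 1.0000, 0.2087, 0.2995, 0.2863],
    [0.1147, 0.1021, 0.0931, -0.0438, 0.2087, 1.0000, 0.0760, 0.0702],
    [0.2137, 0.4105, 0.3240, 0.2930, 0.2995, 0.0760, 1.0000, 0.6247],
    [0.2742, 0.4043, 0.4047, 0.2407, 0.2863, 0.0702, 0.6247, 1.0000],
]

rho = canonical_correlations(COV, [1, 2, 3, 4, 5, 6], [7, 8])
print([round(v, 4) for v in rho])
```

The printed values should agree, up to rounding, with the first column of CORR in the Example output. A production routine would use a proper SVD, which also yields the orthogonal matrices needed for the canonical coefficients.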

Similarly, the correlations between the original variables and the canonical variables, R1 = COEFR1, are obtained for the group 1 variables as:

$$R_1 = \Delta_{11}^{-1/2} \, S_{11}^{T/2} L$$

where Δ11 is a diagonal matrix containing the diagonal of S11 along its diagonal.
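
The form of R1 can be checked with a short derivation. It is a sketch under two assumptions: that the canonical coefficients are $\Gamma_1 = S_{11}^{-1/2} L$ with $L^T L = I$, and that $S_{11} = S_{11}^{T/2} S_{11}^{1/2}$ is the Cholesky factorization. Here $x$ denotes the vector of group 1 variables, a symbol introduced only for this illustration.

```latex
\begin{aligned}
\operatorname{Cov}(\Gamma_1^T x)
  &= \Gamma_1^T S_{11} \Gamma_1
   = L^T S_{11}^{-T/2} \, S_{11}^{T/2} S_{11}^{1/2} \, S_{11}^{-1/2} L
   = L^T L = I, \\
\operatorname{Cov}(x, \Gamma_1^T x)
  &= S_{11} \Gamma_1
   = S_{11}^{T/2} S_{11}^{1/2} \, S_{11}^{-1/2} L
   = S_{11}^{T/2} L, \\
\operatorname{Corr}(x, \Gamma_1^T x)
  &= \Delta_{11}^{-1/2} \, S_{11}^{T/2} L .
\end{aligned}
```

Because the canonical scores have unit variance, only the variances of the original variables (the diagonal of S11) are needed to turn the covariances into correlations.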

Wilks’ lambda, Bartlett’s tests, Rao’s F corresponding to these tests, the numerator and denominator degrees of freedom of F, and the significance level of F are computed as in Rao (1973, page 556). Bartlett’s tests are computed as

$$\Lambda_i = \prod_{j=i}^{q} \left(1 - \rho_j^2\right)$$

where q = NVAR2 is the number of canonical correlations, the canonical correlations are ordered from largest to smallest, and ρj denotes the j‑th largest canonical correlation. Wilks’ lambda is given as Λ1. The degrees of freedom in the numerator of the corresponding Rao’s F statistic is given as

$$d_1 = pu$$

where p = v1 - i + 1, u = v2 - i + 1, v1 = NVAR2, and v2 = NVAR1. Let

$$m = t - \frac{v_1 + v_2 + 1}{2}$$

where t is the degrees of freedom in COV, and let

$$s = \sqrt{\frac{p^2 u^2 - 4}{p^2 + u^2 - 5}}$$

if p² + u² - 5 ≠ 0, and let s = 2 otherwise. Then, Rao’s F corresponding to Bartlett’s test is computed as

$$F = \frac{1 - \Lambda_i^{1/s}}{\Lambda_i^{1/s}} \cdot \frac{d_2}{d_1}$$

Rao’s F has denominator degrees of freedom d2 = ms - pu/2 + 1. The significance level of F is obtained from the standard F distribution.
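
The test statistics can be reproduced from the canonical correlations with a few lines of Python. This is a hedged sketch, not the library's code: the function name `rao_f_table` is hypothetical, and the form m = t - (v1 + v2 + 1)/2 is an assumption chosen because it reproduces the degrees of freedom printed in the Example output. The fallback s = 2 follows the description above. The significance level is omitted because it requires an F distribution function.

```python
import math

def rao_f_table(rho, v1, v2, t):
    """Return one tuple (Wilks' lambda, Rao's F, d1, d2) per canonical correlation.

    rho    : canonical correlations, largest first (len(rho) = v1)
    v1, v2 : NVAR2 and NVAR1 in the notation of the description
    t      : degrees of freedom in COV (NDF)
    """
    rows = []
    for i in range(1, len(rho) + 1):
        # Product of (1 - rho_j^2) over the i-th and all smaller correlations.
        lam = math.prod(1.0 - r * r for r in rho[i - 1:])
        p, u = v1 - i + 1, v2 - i + 1
        d1 = p * u
        m = t - (v1 + v2 + 1) / 2.0          # assumed form of m
        denom = p * p + u * u - 5
        if denom != 0:
            s = math.sqrt((p * p * u * u - 4) / denom)
        else:
            s = 2.0                           # fallback stated in the description
        d2 = m * s - p * u / 2.0 + 1.0
        root = lam ** (1.0 / s)
        f = (1.0 - root) / root * d2 / d1
        rows.append((lam, f, d1, d2))
    return rows

# Canonical correlations taken from the Example output (NVAR1 = 6, NVAR2 = 2).
rows = rao_f_table([0.6093, 0.1431], v1=2, v2=6, t=100)
for lam, f, d1, d2 in rows:
    print(f"{lam:.4f} {f:.3f} {d1} {d2:g}")
```

With the rounded correlations used here, the results should match columns 2 through 5 of CORR in the Example output to about three decimal places.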

Comments

1. Workspace may be explicitly provided, if desired, by use of C2NVC/DC2NVC. The reference is:

CALL C2NVC (NDF, NVAR1, NVAR2, COV, LDCOV, IND1, IND2, IPRINT, CORR, LDCORR, COEF1, LDCOF1, COEF2, LDCOF2, COEFR1, LDCFR1, COEFR2, LDCFR2, R, S, STD1, STD2, WKA, WK)

The additional arguments are as follows:

R — Work vector of length NVAR1**2.

S — Work vector of length NVAR2**2.

STD1 — Work vector of length NVAR1.

STD2 — Work vector of length NVAR2.

WKA — Work vector of length (NVAR1 + NVAR2)**2.

WK — Work vector of length 3 * max(NVAR1, NVAR2).
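
For planning storage, the work vector lengths can be tabulated with a small helper. The function name `c2nvc_workspace_lengths` is hypothetical, and the sketch assumes that the lengths of R, S, and WKA are the squares of NVAR1, NVAR2, and NVAR1 + NVAR2 respectively.

```python
def c2nvc_workspace_lengths(nvar1, nvar2):
    """Lengths of the work vectors passed to C2NVC, keyed by argument name."""
    return {
        "R": nvar1 ** 2,                 # assumed to be NVAR1 squared
        "S": nvar2 ** 2,                 # assumed to be NVAR2 squared
        "STD1": nvar1,
        "STD2": nvar2,
        "WKA": (nvar1 + nvar2) ** 2,     # assumed to be (NVAR1 + NVAR2) squared
        "WK": 3 * max(nvar1, nvar2),
    }

# Sizes for the example in this section (NVAR1 = 6, NVAR2 = 2).
sizes = c2nvc_workspace_lengths(6, 2)
print(sizes)
```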

2. Informational errors

 

Type   Code   Description
3      1      The standardized cross covariance matrix is not of full rank or is very ill‑conditioned. Small canonical correlations may not be accurate.
4      2      COV is not nonnegative definite.

Example

The following example is taken from Van de Geer (1971). There are six group 1 variables and two group 2 variables. The maximum correlation turns out to be 0.609.

 

      USE CANVC_INT

      IMPLICIT   NONE
      INTEGER    IPRINT, LDCFR1, LDCFR2, LDCOF1, LDCOF2, LDCORR, &
                 LDCOV, NDF, NV, NVAR1, NVAR2
      PARAMETER  (IPRINT=1, LDCFR1=6, LDCFR2=2, LDCOF1=6, LDCOF2=2, &
                 LDCORR=2, LDCOV=8, NDF=100, NV=2, NVAR1=6, NVAR2=2)
!
!                                 CORR is NV by 6; COEFR1 and COEFR2
!                                 have NV = NVAR2 columns here
      INTEGER    IND1(NVAR1), IND2(NVAR2)
      REAL       COEF1(NVAR1,NVAR1), COEF2(NVAR2,NVAR2), &
                 COEFR1(NVAR1,NVAR2), COEFR2(NVAR2,NVAR2), &
                 CORR(NV,6), COV(LDCOV,NVAR1+NVAR2)
!
      DATA COV/1.0000, 0.1839, 0.0489, 0.0186, 0.0782, 0.1147, 0.2137, &
          0.2742, 0.1839, 1.0000, 0.2220, 0.1861, 0.3355, 0.1021, &
          0.4105, 0.4043, 0.0489, 0.2220, 1.0000, 0.2707, 0.2302, &
          0.0931, 0.3240, 0.4047, 0.0186, 0.1861, 0.2707, 1.0000, &
          0.2950, -0.0438, 0.2930, 0.2407, 0.0782, 0.3355, 0.2302, &
          0.2950, 1.0000, 0.2087, 0.2995, 0.2863, 0.1147, 0.1021, &
          0.0931, -0.0438, 0.2087, 1.0000, 0.0760, 0.0702, 0.2137, &
          0.4105, 0.3240, 0.2930, 0.2995, 0.0760, 1.0000, 0.6247, &
          0.2742, 0.4043, 0.4047, 0.2407, 0.2863, 0.0702, 0.6247, &
          1.0000/
!
      DATA IND1/1, 2, 3, 4, 5, 6/, IND2/7, 8/
!
      CALL CANVC (NDF, COV, IND1, IND2, CORR, &
                  COEF1, COEF2, COEFR1, COEFR2, IPRINT=IPRINT)
!
      END

Output

 

             *** Canonical Correlations Statistics ***
      Canonical                                                 Prob. of
   Correlations  Wilks Lambda   Raos F   Num. df   Denom. df    Larger F
1        0.6093        0.6159    4.250        12         186      0.0000
2        0.1431        0.9795    0.393         5          94      0.8524

            Group One Canonical Coefficients
        1        2        3        4        5        6
1    0.326    0.411   -0.799    0.358   -0.032    0.053
2    0.481   -0.340   -0.083   -0.766   -0.484   -0.139
3    0.456    0.718    0.625    0.134   -0.056    0.038
4    0.202   -0.689    0.060    0.732   -0.335    0.080
5    0.184   -0.125   -0.064   -0.045    1.079   -0.225
6   -0.027   -0.174    0.054   -0.086   -0.021    1.017

Group Two Canonical Coefficients
        1        2
1    0.464    1.194
2    0.642   -1.108

Correlations Between the Group One Variables
     and the Group One Canonical Scores
         1         2
1   0.4517    0.3408
2   0.7388   -0.2932
3   0.6733    0.4313
4   0.4769   -0.5799
5   0.5299   -0.2811
6   0.1319   -0.0903

Correlations Between the Group Two Variables
     and the Group Two Canonical Scores
         1         2
1   0.8653    0.5013
2   0.9320   -0.3625
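
The printed group 2 results can be cross-checked by hand. The sketch below is an illustration rather than part of the library. It takes the first column of COEF2 and the correlation 0.6247 between variables 7 and 8 from COV, then recomputes the variance of the first group 2 canonical score and its correlations with the two group 2 variables. The variable names are hypothetical.

```python
import math

r78 = 0.6247            # correlation between variables 7 and 8 in COV
a = (0.464, 0.642)      # first column of COEF2 from the output above

# Variance of the first canonical score a[0]*y1 + a[1]*y2 (unit variances).
score_var = a[0] ** 2 + a[1] ** 2 + 2.0 * a[0] * a[1] * r78

# Correlations of each group 2 variable with that score.
load1 = (a[0] + a[1] * r78) / math.sqrt(score_var)
load2 = (a[0] * r78 + a[1]) / math.sqrt(score_var)

print(round(score_var, 3), round(load1, 3), round(load2, 3))
```

The variance should be close to one, and the two correlations should agree with the first column of the last table to roughly three decimal places, the precision of the printed coefficients.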