CANCR
Performs canonical correlation analysis from a data matrix.
Required Arguments
X — NOBS by NVAR1 + NVAR2 + m data matrix where m is 0, 1, or 2 depending on whether any columns of X correspond to frequencies or weights. (Input)
Each row of X contains an observation of the NVAR1 + NVAR2 variables for which canonical correlations are desired (plus a weight and/or a frequency variable if IFRQ and/or IWT (see below) are not zero). The value of m is 0 if both IWT and IFRQ are zero, 1 if exactly one of IFRQ or IWT is positive, and 2 if both are positive. X may not have any missing values (NaN, not a number).
IND1 — Vector of length NVAR1 containing the column numbers in X of the group 1 variables. (Input)
IND2 — Vector of length NVAR2 containing the column numbers in X of the group 2 variables. (Input)
XX — NOBS by NVAR1 + NVAR2 + m matrix containing the canonical scores. (Output)
m is defined in the description for X. X and XX may occupy the same storage locations. Canonical scores are returned in the first NVAR1 + NVAR2 columns of XX. Scores for the NVAR1 variables come first. If exactly one of IFRQ or IWT is not zero, then the last column of XX contains the frequency or the weight, whichever was supplied. If both IFRQ and IWT are not zero, then the frequencies and weights are in the second-to-last and last columns of XX, respectively.
CORR — NV by 6 matrix of output statistics. (Output)
NV is the minimum of NVAR1 and NVAR2. CORR has the following statistics.
Col. | Statistic
1 | Canonical correlations sorted from the largest to the smallest.
2 | Wilks’ lambda for testing that the current and all smaller canonical correlations are zero.
3 | Rao’s F corresponding to Wilks’ lambda. If the canonical correlation is greater than 0.99999, then F is set to 9999.99.
4 | Numerator degrees of freedom for F.
5 | Denominator degrees of freedom for F.
6 | Probability of a larger F statistic.
If an F statistic is negative, then CORR(i, 6) is set to one. If either CORR(i, 4) or CORR(i, 5) is not positive, then CORR(i, 6) is set to the missing value code (NaN).
COEF1 — NVAR1 by NVAR1 matrix containing the group 1 canonical coefficients. (Output)
The columns of COEF1 contain the vectors of canonical coefficients for group 1.
COEF2 — NVAR2 by NVAR2 matrix containing the group 2 canonical coefficients. (Output)
The columns of COEF2 contain the vectors of canonical coefficients for group 2.
COEFR1 — NVAR1 by NV matrix containing the correlations between the group 1 variables and the group 1 canonical scores. (Output)
NV is the minimum of NVAR1 and NVAR2.
COEFR2 — NVAR2 by NV matrix containing the correlations between the group 2 variables and the group 2 canonical scores. (Output)
NV is the minimum of NVAR1 and NVAR2.
STAT — 15 by NVAR1 + NVAR2 matrix containing statistics on all of the variables. (Output)
The first NVAR1 columns of STAT correspond to the group one variables with the last NVAR2 columns corresponding to the group two variables.
Row | Statistic
1 | Means
2 | Variances
3 | Standard deviations
4 | Coefficients of skewness
5 | Coefficients of excess (kurtosis)
6 | Minima
7 | Maxima
8 | Ranges
9 | Coefficients of variation, when defined, 0.0 otherwise
10 | Numbers of nonmissing observations
11 | Lower endpoints of 95% confidence interval for the means
12 | Upper endpoints of 95% confidence interval for the means
13 | Lower endpoints of 95% confidence interval for the variances
14 | Upper endpoints of 95% confidence interval for the variances
15 | Sums of the weights if IWT greater than zero, 0.0 otherwise
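The column layout described for X and XX can be sketched as follows. This is only an illustration of the bookkeeping, using the dimensions of Example 2; the program name and the variables M, NCOLXX, JFRQ, and JWT are hypothetical and are not part of the CANCR interface.

```fortran
PROGRAM XXCOLS
   IMPLICIT NONE
   ! Dimensions and options as in Example 2 (frequencies in column 9, no weights)
   INTEGER, PARAMETER :: NVAR1 = 4, NVAR2 = 4
   INTEGER, PARAMETER :: IFRQ = 9, IWT = 0
   INTEGER :: M, NCOLXX, JFRQ, JWT
   ! m counts how many of IFRQ and IWT are positive
   M = 0
   IF (IFRQ > 0) M = M + 1
   IF (IWT > 0) M = M + 1
   ! XX has NVAR1 + NVAR2 score columns followed by m extra columns
   NCOLXX = NVAR1 + NVAR2 + M
   ! Column of XX holding the weights (0 means not present)
   JWT = 0
   IF (IWT > 0) JWT = NCOLXX
   ! Column of XX holding the frequencies (0 means not present)
   JFRQ = 0
   IF (IFRQ > 0) THEN
      IF (IWT > 0) THEN
         JFRQ = NCOLXX - 1
      ELSE
         JFRQ = NCOLXX
      END IF
   END IF
   WRITE (*, '(4I5)') M, NCOLXX, JFRQ, JWT
END PROGRAM XXCOLS
```

With the Example 2 settings this gives m = 1 and nine columns in XX, which is why that example declares XX(LDXX,NCOL) with NCOL = 9.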
Optional Arguments
NOBS — Number of observations. (Input)
Default: NOBS = size (X,1).
NVAR1 — Number of variables in group 1. (Input)
Default: NVAR1 = size (IND1,1).
NVAR2 — Number of variables in group 2. (Input)
Default: NVAR2 = size (IND2,1).
NCOL — Number of columns in X. (Input)
Default: NCOL = size(X,2).
LDX — Leading dimension of X exactly as specified in the dimension statement in the calling program. (Input)
Default: LDX = size (X,1).
IFRQ — Frequency option. (Input)
If IFRQ = 0, then all frequencies are 1. If IFRQ is positive, then column number IFRQ of X contains the nonnegative frequencies.
Default: IFRQ = 0.
IWT — Weighting option. (Input)
If IWT = 0, then there is no weighting, i.e., all weights are 1. If IWT is positive, then column number IWT of X contains the nonnegative weights.
Default: IWT = 0.
TOL — Constant used for determining linear dependence. (Input)
If the squared multiple correlation coefficient of a variable with its predecessors in IND1 (or IND2) is greater than 1 ‑ TOL, then the variable is considered to be linearly dependent upon the previous variables; it is excluded from the analysis. TOL = .001 is a typical value. TOL must be in the exclusive range of 0.0 to 1.0.
Default: TOL = .001.
IPRINT — Printing option. (Input)
Default: IPRINT = 0.
IPRINT | Action
0 | No printing.
1 | Print CORR, COEF1, COEF2, COEFR1, COEFR2, and STAT.
2 | Print all output.
LDXX — Leading dimension of XX exactly as specified in the dimension statement in the calling program. (Input)
Default: LDXX = size (XX,1).
LDCORR — Leading dimension of CORR exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCORR = size (CORR,1).
LDCOF1 — Leading dimension of COEF1 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOF1 = size (COEF1,1).
LDCOF2 — Leading dimension of COEF2 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOF2 = size (COEF2,1).
LDCFR1 — Leading dimension of COEFR1 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCFR1 = size (COEFR1,1).
LDCFR2 — Leading dimension of COEFR2 exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCFR2 = size (COEFR2,1).
LDSTAT — Leading dimension of STAT exactly as specified in the dimension statement in the calling program. (Input)
Default: LDSTAT = size(STAT,1).
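The TOL criterion described above can be illustrated in the simplest case, where a variable has a single predecessor in its group. In that case the squared multiple correlation reduces to the squared ordinary correlation. The sketch below is not the test CANCR performs internally (which the Description ties to the orthogonalization step); it only shows the comparison against 1 - TOL. The program name, the function RSQ, and the data are made up for illustration.

```fortran
PROGRAM TOLCHK
   IMPLICIT NONE
   INTEGER, PARAMETER :: N = 5
   REAL, PARAMETER :: TOL = 0.001
   REAL :: X1(N), X2(N), X3(N)
   DATA X1/1.0, 2.0, 3.0, 4.0, 5.0/
   DATA X3/2.0, 1.0, 4.0, 3.0, 6.0/
   ! X2 is an exact linear function of X1
   X2 = 2.0*X1 + 1.0
   ! A variable is flagged when its squared correlation exceeds 1 - TOL
   WRITE (*, *) 'X2 dependent on X1: ', RSQ(X1, X2) > 1.0 - TOL
   WRITE (*, *) 'X3 dependent on X1: ', RSQ(X1, X3) > 1.0 - TOL
CONTAINS
   ! Squared sample correlation between A and B
   REAL FUNCTION RSQ(A, B)
      REAL, INTENT(IN) :: A(:), B(:)
      REAL :: DA(SIZE(A)), DB(SIZE(B))
      DA = A - SUM(A)/REAL(SIZE(A))
      DB = B - SUM(B)/REAL(SIZE(B))
      RSQ = SUM(DA*DB)**2/(SUM(DA*DA)*SUM(DB*DB))
   END FUNCTION RSQ
END PROGRAM TOLCHK
```

Here X2 would be treated as linearly dependent on X1 and excluded, while X3 (squared correlation of about 0.68) would be retained.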
FORTRAN 90 Interface
Generic: CALL CANCR (X, IND1, IND2, XX, CORR, COEF1, COEF2, COEFR1, COEFR2, STAT [, …])
Specific: The specific interface names are S_CANCR and D_CANCR.
FORTRAN 77 Interface
Single: CALL CANCR (NOBS, NVAR1, NVAR2, NCOL, X, LDX, IFRQ, IWT, IND1, IND2, TOL, IPRINT, XX, LDXX, CORR, LDCORR, COEF1, LDCOF1, COEF2, LDCOF2, COEFR1, LDCFR1, COEFR2, LDCFR2, STAT, LDSTAT)
Double: The double precision name is DCANCR.
Description
Routine CANCR computes the canonical correlations, the canonical coefficients, the canonical scores, Wilks’ lambda for testing the independence of two sets of variates, and a series of Bartlett’s tests of the hypothesis that the k‑th largest and all smaller canonical correlations are simultaneously zero. A matrix of observations is used in these computations.
Let $x_{ij}$ denote the j‑th variable on the i‑th observation, $w_i$ denote the observation weight, $f_i$ denote the observation frequency, $\Gamma_{11}$ denote the upper triangular Cholesky ($R^T R$) factor of the sample covariance matrix of the group 1 variables, $\Gamma_{22}$ denote the upper triangular Cholesky ($R^T R$) factor of the sample covariance matrix of the group 2 variables, and
$$\Gamma_{12} = \Gamma_{11}^{-T} S_{12} \Gamma_{22}^{-1}$$
where $S_{12}$ is the sample estimate of the matrix of covariances between the group 1 and the group 2 variables. Then, the computational procedure in obtaining the canonical correlations is as follows:
1. The weighted mean of each variable is computed via the standard formula (see UVSTA, Chapter 1, “Basic Statistics”). The means are then subtracted from the observations.
2. Each element in the i‑th row of X is multiplied by $\sqrt{w_i f_i}$.
3. Gram‑Schmidt orthogonalization is used on X to obtain Y1 and Y2, where Y1 and Y2 are the results of the Gram‑Schmidt orthogonalization of the group 1 and the group 2 variables, respectively. The matrices Γ11 and Γ22 are obtained as a by‑product of the orthogonalization. Compute $\Gamma_{12} = Y_1^T Y_2$.
4. The canonical correlations are obtained as the singular values of the matrix Γ12. Denote the left and right orthogonal matrices obtained as a by‑product of this decomposition by L and R, respectively.
5. The canonical coefficients are obtained from L and R by multiplying L and R by the inverses of Γ11 and Γ22, respectively (see Golub 1969).
6. The correlations of the original variables with the canonical variables are obtained by multiplying L and R by Γ11 and Γ22, respectively.
7. The canonical scores are obtained by multiplying the matrices Y1 and Y2 by the matrices L and R, respectively, and then dividing each row of Y1 and Y2 by $\sqrt{w_i f_i}$.
8. Wilks’ lambda, the Bartlett’s tests, Rao’s F corresponding to these tests, the numerator and denominator degrees of freedom of F, and the significance level of F are computed as in Rao (1973, page 556). Bartlett’s tests are computed as
$$\Lambda_i = \prod_{j=i}^{q} \left(1 - \rho_j^2\right)$$
where q = min(NVAR1, NVAR2) is the number of canonical correlations, the canonical correlations are ordered from largest to smallest, and $\rho_j$ denotes the j‑th largest canonical correlation. Wilks’ lambda is given as $\Lambda_1$. The degrees of freedom in the numerator of the corresponding Rao’s F statistic is given as
$$d_1 = pu$$
where $p = v_1 - i + 1$, $u = v_2 - i + 1$, $v_1$ = NVAR2, and $v_2$ = NVAR1. Let
$$m = t - \frac{v_1 + v_2 + 1}{2}$$
where t is the degrees of freedom in COV ($t = \sum_i f_i - 1$), and let
$$s = \sqrt{\frac{p^2 u^2 - 4}{p^2 + u^2 - 5}}$$
if $p^2 + u^2 - 5 \neq 0$, and let s = 2 otherwise. Then, Rao’s F corresponding to Bartlett’s test is computed as
$$F = \frac{1 - \Lambda_i^{1/s}}{\Lambda_i^{1/s}} \cdot \frac{ms - pu/2 + 1}{pu}$$
Rao’s F has denominator degrees of freedom d2 = ms ‑ pu/2 + 1. The significance level of F is obtained from the standard F distribution.
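The step 8 quantities can be checked by hand from the canonical correlations alone. The sketch below assumes the usual form of Rao's approximation: Wilks' lambda for row i is the product of (1 - ρj²) over j = i, ..., q; m = t - (v1 + v2 + 1)/2; s = sqrt((p²u² - 4)/(p² + u² - 5)); and F = ((1 - Λ^(1/s))/Λ^(1/s)) (d2/d1). These expressions are an assumption of the sketch, not a statement of CANCR's internal code, and the special case p² + u² - 5 = 0 is not handled. The program and variable names are illustrative.

```fortran
PROGRAM RAOCHK
   IMPLICIT NONE
   INTEGER, PARAMETER :: NOBS = 40, NVAR1 = 6, NVAR2 = 4, NV = 4
   INTEGER :: I, J
   REAL :: RHO(NV), WILKS(NV), F(NV), D1(NV), D2(NV)
   REAL :: P, U, S, T, XM, ROOT
   ! Canonical correlations as printed in Example 1 (column 1 of CORR)
   DATA RHO/0.9242, 0.7184, 0.2893, 0.2290/
   ! All frequencies are 1 in Example 1, so t = NOBS - 1
   T = REAL(NOBS - 1)
   ! Assumed form of m (see the text above this block)
   XM = T - 0.5*REAL(NVAR1 + NVAR2 + 1)
   DO I = 1, NV
      ! Lambda_i: product of (1 - rho_j**2) over j = i, ..., NV
      WILKS(I) = 1.0
      DO J = I, NV
         WILKS(I) = WILKS(I)*(1.0 - RHO(J)**2)
      END DO
      P = REAL(NVAR2 - I + 1)
      U = REAL(NVAR1 - I + 1)
      ! Numerator degrees of freedom
      D1(I) = P*U
      ! p**2 + u**2 - 5 is nonzero for every row of this example
      S = SQRT((P**2*U**2 - 4.0)/(P**2 + U**2 - 5.0))
      ! Denominator degrees of freedom
      D2(I) = XM*S - 0.5*P*U + 1.0
      ROOT = WILKS(I)**(1.0/S)
      F(I) = (1.0 - ROOT)/ROOT*D2(I)/D1(I)
      WRITE (*, '(I3, F9.4, F10.3, F6.0, F8.1)') I, WILKS(I), F(I), D1(I), D2(I)
   END DO
END PROGRAM RAOCHK
```

Because the input correlations are rounded to four digits, the printed values should agree with columns 2 through 5 of CORR in Example 1 only up to rounding in the last printed digit.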
Comments
1. Workspace may be explicitly provided, if desired, by use of C2NCR/DC2NCR. The reference is:
CALL C2NCR (NOBS, NVAR1, NVAR2, NCOL, X, LDX, IFRQ, IWT, IND1, IND2, TOL, IPRINT, XX, LDXX, CORR, LDCORR, COEF1, LDCOF1, COEF2, LDCOF2, COEFR1, LDCFR1, COEFR2, LDCFR2, STAT, LDSTAT, R, S, IND, WORK, WKA, WK)
The additional arguments are as follows:
R — Work vector of length NVAR1**2.
S — Work vector of length NVAR2**2.
IND — Work vector of length NVAR1 + NVAR2 + 2.
WORK — Work vector of length max(NOBS, 2 * (NVAR1 + NVAR2)).
WKA — Work vector of length (max(NVAR1, NVAR2))**2.
WK — Work vector of length 3 * max(NVAR1, NVAR2) ‑ 1.
2. Informational errors
Type | Code | Description
3 | 1 | The standardized cross covariance matrix is not of full rank or is very ill‑conditioned. Small canonical correlations may not be accurate.
3 | 2 | One or more variables is linearly dependent upon the preceding variables in its group.
4 | 3 | The sum of the frequencies is equal to zero. The sum of the frequencies must be positive.
4 | 4 | The sum of the weights is equal to zero. The sum of the weights must be positive.
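The workspace lengths listed in Comment 1 can be worked out ahead of a call to C2NCR. The sketch below evaluates them for the dimensions of Example 1, reading the lengths of R, S, and WKA as squares (NVAR1 squared, NVAR2 squared, and the square of the larger of the two). The program name and the variables LR, LS, LIND, LWORK, LWKA, and LWK are hypothetical names chosen for illustration.

```fortran
PROGRAM WKSIZE
   IMPLICIT NONE
   ! Dimensions of Example 1
   INTEGER, PARAMETER :: NOBS = 40, NVAR1 = 6, NVAR2 = 4
   INTEGER :: LR, LS, LIND, LWORK, LWKA, LWK
   LR = NVAR1**2                          ! length of R
   LS = NVAR2**2                          ! length of S
   LIND = NVAR1 + NVAR2 + 2               ! length of IND
   LWORK = MAX(NOBS, 2*(NVAR1 + NVAR2))   ! length of WORK
   LWKA = MAX(NVAR1, NVAR2)**2            ! length of WKA
   LWK = 3*MAX(NVAR1, NVAR2) - 1          ! length of WK
   WRITE (*, '(6I6)') LR, LS, LIND, LWORK, LWKA, LWK
END PROGRAM WKSIZE
```

For Example 1 these lengths come to 36, 16, 12, 40, 36, and 17.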
Examples
Example 1
The following example is taken from Levin and Marascuilo (1983), pages 191–197. It examines the relationship between the performance of individuals in a sociology course and a set of predictor variables. The measures of performance in the sociology course are two midterm examinations, a final examination, and a course evaluation. The predictor variables are social class, sex, grade point average, college board test score, whether the student has previously taken a course in sociology, and the student’s score on a pretest.
USE WRRRL_INT
USE CANCR_INT
IMPLICIT NONE
INTEGER IPRINT, LDCFR1, LDCFR2, LDCOF1, LDCOF2, LDCORR, LDSTAT,&
LDX, LDXX, NCOL, NOBS, NV, NVAR1, NVAR2, I
REAL TOL
PARAMETER (IPRINT=1, LDSTAT=15, NCOL=10, NOBS=40, NVAR1=6, &
NVAR2=4, TOL=0.0001, LDCFR1=NVAR1, LDCFR2=NVAR2, &
LDCOF1=NVAR1, LDCOF2=NVAR2, LDX=NOBS, LDXX=NOBS, &
NV=NVAR2, LDCORR=NV)
!
INTEGER IND1(NVAR1), IND2(NVAR2)
REAL COEF1(LDCOF1,NVAR1), COEF2(LDCOF2,NVAR2), &
COEFR1(LDCFR1,NV), COEFR2(LDCFR2,NV), &
CORR(LDCORR,6),STAT(LDSTAT,NVAR1+NVAR2), &
X(LDX,NCOL), XX(LDXX,NCOL)
CHARACTER FMT*35, NUMBER(1)*6, XLAB(11)*25
!
DATA IND1/1, 2, 3, 4, 5, 6/, IND2/7, 8, 9, 10/
DATA (X(I,1),I=1,NOBS)/3*2.0, 3.0, 2.0, 3.0, 1.0, 2.0, 3.0, &
2*2.0, 3.0, 1.0, 4*2.0, 3.0, 3*2.0, 1.0, 3*2.0, 1.0, 2.0, &
1.0, 2.0, 3.0, 2*2.0, 2*1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0/
DATA (X(I,2),I=1,NOBS)/6*1.0, 0.0, 2*1.0, 3*0.0, 3*1.0, 3*0.0, &
1.0, 0.0, 3*1.0, 3*0.0, 4*1.0, 0.0, 8*1.0, 0.0/
DATA (X(I,3),I=1,NOBS)/3.55, 2.70, 3.50, 2.91, 3.10, 3.49, 3.17, &
3.57, 3.76, 3.81, 3.60, 3.10, 3.08, 3.50, 3.43, 3.39, 3.76, &
3.71, 3.00, 3.47, 3.69, 3.24, 3.46, 3.39, 3.90, 2.76, 2.70, &
3.77, 4.00, 3.40, 3.09, 3.80, 3.28, 3.70, 3.42, 3.09, 3.70, &
2.69, 3.40, 2.95/
DATA (X(I,4),I=1,NOBS)/410.0, 390.0, 510.0, 430.0, 600.0, &
2*610.0, 560.0, 700.0, 460.0, 590.0, 500.0, 410.0, 470.0, &
210.0, 610.0, 510.0, 600.0, 470.0, 460.0, 800.0, 610.0, &
490.0, 470.0, 610.0, 580.0, 410.0, 630.0, 790.0, 490.0, &
400.0, 2*610.0, 500.0, 430.0, 540.0, 610.0, 400.0, 390.0, &
490.0/
DATA (X(I,5),I=1,NOBS)/8*0.0, 4*1.0, 0.0, 2*1.0, 0.0, 1.0, 0.0, &
1.0, 0.0, 1.0, 3*0.0, 1.0, 2*0.0, 2*1.0, 2*0.0, 4*1.0, &
5*0.0/
DATA (X(I,6),I=1,NOBS)/17.0, 20.0, 22.0, 13.0, 16.0, 28.0, 14.0, &
10.0, 28.0, 30.0, 28.0, 15.0, 24.0, 15.0, 26.0, 16.0, 25.0, &
3.0, 5.0, 16.0, 28.0, 13.0, 9.0, 13.0, 30.0, 10.0, 13.0, &
8.0, 29.0, 17.0, 15.0, 16.0, 13.0, 30.0, 2*17.0, 25.0, &
10.0, 23.0, 18.0/
DATA (X(I,7),I=1,NOBS)/43.0, 50.0, 47.0, 24.0, 47.0, 57.0, &
2*42.0, 69.0, 48.0, 59.0, 21.0, 52.0, 2*35.0, 59.0, 68.0, &
38.0, 45.0, 37.0, 54.0, 45.0, 31.0, 39.0, 67.0, 30.0, 19.0, &
71.0, 80.0, 47.0, 46.0, 59.0, 48.0, 68.0, 43.0, 31.0, 64.0, &
19.0, 43.0, 20.0/
DATA (X(I,8),I=1,NOBS)/61.0, 47.0, 79.0, 40.0, 60.0, 59.0, 61.0, &
79.0, 83.0, 67.0, 74.0, 40.0, 71.0, 40.0, 57.0, 58.0, 66.0, &
58.0, 24.0, 48.0, 100.0, 83.0, 70.0, 48.0, 85.0, 14.0, &
55.0, 100.0, 94.0, 45.0, 58.0, 90.0, 84.0, 81.0, 49.0, &
54.0, 87.0, 36.0, 51.0, 59.0/
DATA (X(I,9),I=1,NOBS)/129.0, 60.0, 119.0, 100.0, 79.0, 99.0, &
92.0, 107.0, 156.0, 110.0, 116.0, 49.0, 107.0, 125.0, 64.0, &
100.0, 138.0, 63.0, 82.0, 73.0, 132.0, 87.0, 89.0, 99.0, &
119.0, 100.0, 84.0, 166.0, 111.0, 110.0, 93.0, 141.0, 99.0, &
114.0, 96.0, 39.0, 149.0, 53.0, 39.0, 91.0/
DATA (X(I,10),I=1,NOBS)/3.0, 3*1.0, 2.0, 1.0, 3.0, 2.0, 4*1.0, &
5.0, 1.0, 5.0, 1.0, 2.0, 1.0, 2*3.0, 3*2.0, 1.0, 2.0, 1.0, &
2.0, 3.0, 2.0, 2*1.0, 2*2.0, 5.0, 2*1.0, 4.0, 3.0, 2*1.0/
!
DATA XLAB/' ','Social%/Class', '%/Sex', '%/GPA', &
'College%/Boards', 'H.S.%/Soc.', 'Pretest%/Score', &
'%/Exam 1', '%/Exam 2', 'Final%/Exam', 'Course%/Eval.'/
DATA NUMBER/'NUMBER'/, FMT/'(2W3.1,W5.3,W4.1,W3.1,4W5.1,W3.1)'/
!
CALL WRRRL ('First 10 Observations', X, NUMBER, XLAB, &
10, NCOL, LDX, FMT=FMT)
!
CALL CANCR (X, IND1, IND2, XX, CORR, COEF1, &
COEF2, COEFR1, COEFR2, STAT, TOL=TOL, IPRINT=IPRINT)
!
END
Output
First 10 Observations
Social College H.S. Pretest Final Course
Class Sex GPA Boards Soc. Score Exam 1 Exam 2 Exam Eval
1 2 1 3.55 410 0 17 43 61 129 3
2 2 1 2.70 390 0 20 50 47 60 1
3 2 1 3.50 510 0 22 47 79 119 1
4 3 1 2.91 430 0 13 24 40 100 1
5 2 1 3.10 600 0 16 47 60 79 2
6 3 1 3.49 610 0 28 57 59 99 1
7 1 0 3.17 610 0 14 42 61 92 3
8 2 1 3.57 560 0 10 42 79 107 2
9 3 1 3.76 700 1 28 69 83 156 1
10 2 0 3.81 460 1 30 48 67 110 1
*** Canonical Correlations Statistics ***
Canonical Prob. of
Correlations Wilks Lambda Raos F Num. df Denom. df Larger F
1 0.9242 0.0612 5.412 24 105.9 0.0000
2 0.7184 0.4201 2.116 15 86.0 0.0162
3 0.2893 0.8683 0.586 8 64.0 0.7861
4 0.2290 0.9476 0.609 3 33.0 0.6142
Group One Canonical Coefficients
1 2 3 4 5 6
1 -0.622 1.158 -0.285 -0.179 0.601 -0.423
2 0.558 -0.739 0.231 -1.278 1.391 -0.024
3 1.796 -0.432 0.765 0.185 -0.643 -3.314
4 0.002 0.006 0.004 -0.002 0.000 0.006
5 -0.059 -0.043 -0.456 1.671 1.463 0.774
6 0.031 0.018 -0.121 -0.058 -0.042 0.056
Group Two Canonical Coefficients
1 2 3 4
1 0.0233 -0.0365 0.0845 -0.0176
2 0.0257 -0.0057 -0.0352 0.0555
3 0.0073 0.0110 -0.0259 -0.0341
4 0.1034 0.8089 0.2828 0.0260
Correlations Between the Group One Variables
and the Group One Canonical Scores
1 2 3 4
1 -0.3685 0.6795 -0.2291 -0.1854
2 0.2157 -0.3252 0.0521 -0.5985
3 0.8153 0.2770 -0.0692 0.2123
4 0.6144 0.5681 0.4151 -0.0050
5 0.4661 0.0603 -0.3034 0.6530
6 0.5461 0.1768 -0.7915 -0.1375
Correlations Between the Group Two Variables
and the Group Two Canonical Scores
1 2 3 4
1 0.8713 -0.2406 0.3864 -0.1835
2 0.9174 -0.0557 -0.2068 0.3355
3 0.7707 0.0293 -0.3146 -0.5533
4 0.3490 0.8765 0.3077 0.1240
*** Statistics for Group One Variables ***
Univariate Statistics from UVSTA
Variable Mean Variance Std. Dev. Skewness Kurtosis
1 1.9750 0.4353 0.6597 0.02476 -0.6452
2 0.6750 0.2250 0.4743 -0.74726 -1.4416
3 3.3758 0.1247 0.3532 -0.37911 -0.7521
4 524.2499 13148.1377 114.6653 0.09897 0.6494
5 0.4000 0.2462 0.4961 0.40825 -1.8333
6 18.1250 55.1378 7.4255 0.10633 -0.9358
Variable Minimum Maximum Range Coef. Var. Count
1 1.0000 3.0000 2.0000 0.3340 40.0000
2 0.0000 1.0000 1.0000 0.7027 40.0000
3 2.6900 4.0000 1.3100 0.1046 40.0000
4 210.0000 800.0000 590.0000 0.2187 40.0000
5 0.0000 1.0000 1.0000 1.2403 40.0000
6 3.0000 30.0000 27.0000 0.4097 40.0000
Variable Lower CLM Upper CLM Lower CLV Upper CLV
1 1.7640 2.1860 0.29207 0.7176
2 0.5233 0.8267 0.15098 0.3710
3 3.2628 3.4887 0.08369 0.2056
4 487.5782 560.9217 8822.72168 21677.9590
5 0.2413 0.5587 0.16518 0.4058
6 15.7502 20.4998 36.99883 90.9083
*** Statistics for Group Two Variables ***
Univariate Statistics from UVSTA
Variable Mean Variance Std. Dev. Skewness Kurtosis
1 46.0500 237.0231 15.3956 0.08762 -0.5505
2 62.8750 403.4967 20.0872 -0.10762 -0.3642
3 99.4750 919.4864 30.3230 -0.03483 -0.2533
4 1.9500 1.4333 1.1972 1.27704 0.8407
Variable Minimum Maximum Range Coef. Var. Count
1 19.0000 80.0000 61.0000 0.3343 40.0000
2 14.0000 100.0000 86.0000 0.3195 40.0000
3 39.0000 166.0000 127.0000 0.3048 40.0000
4 1.0000 5.0000 4.0000 0.6140 40.0000
Variable Lower CLM Upper CLM Lower CLV Upper CLV
1 41.1263 50.9737 159.0483 390.7912
2 56.4508 69.2992 270.7562 665.2642
3 89.7772 109.1728 616.9979 1516.0009
4 1.5671 2.3329 0.9618 2.3632
Example 2
Correspondence analysis is an interesting application of canonical correlation in the analysis of contingency tables. The example is taken from Kendall and Stuart (1979, pages 595–599) and involves finding the optimal scores for the values of two categorical variables to maximize the correlation between the two variables. The contingency table is given below, along with the more traditional matrix X of “observations” for which canonical correlations are desired.
The data matrix X is given as:
Group 1 Var. | Group 2 Var. | Frequencies
1 0 0 0 | 1 0 0 0 | 821
1 0 0 0 | 0 1 0 0 | 112
1 0 0 0 | 0 0 1 0 | 85
1 0 0 0 | 0 0 0 1 | 35
0 1 0 0 | 1 0 0 0 | 116
0 1 0 0 | 0 1 0 0 | 494
0 1 0 0 | 0 0 1 0 | 145
0 1 0 0 | 0 0 0 1 | 27
0 0 1 0 | 1 0 0 0 | 72
0 0 1 0 | 0 1 0 0 | 151
0 0 1 0 | 0 0 1 0 | 583
0 0 1 0 | 0 0 0 1 | 87
0 0 0 1 | 1 0 0 0 | 43
0 0 0 1 | 0 1 0 0 | 34
0 0 0 1 | 0 0 1 0 | 106
0 0 0 1 | 0 0 0 1 | 331
For this table, the optimal correlation turns out to be 0.70 when scores of 2.67, 1.34, 0.62, and 0.00 (see Column 1 of COEF1) are assigned to the variable 1 categories, and scores of 2.72, 1.37, 0.68, and 0.00 (see Column 1 of COEF2) are assigned to the variable 2 categories. These scores are obtained as the canonical scores when canonical correlations are computed between the row and column indicator variables (variables 1‑4 and variables 5‑8 in X, respectively). The warning error appears in the output because the covariance matrix is not of full rank (indeed, neither the group 1 nor the group 2 covariance matrix is of full rank).
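The indicator-variable matrix X shown above can be generated from the table of cell counts instead of being typed in. The following is a minimal sketch under the assumption that the 4 by 4 counts are the frequencies in column 9 of X, taken four at a time as table rows. The program name and the array TAB are hypothetical; the resulting X has the same layout as the DATA statement in the example program.

```fortran
PROGRAM MKINDX
   IMPLICIT NONE
   INTEGER, PARAMETER :: NR = 4, NC = 4
   INTEGER :: I, J, K
   REAL :: TAB(NR,NC), X(NR*NC,NR+NC+1)
   ! Cell counts listed row by row. RESHAPE fills by columns, so the
   ! result is transposed to give TAB(I,J) = count in table cell (I,J).
   TAB = TRANSPOSE(RESHAPE((/821.0, 112.0, 85.0, 35.0, &
                             116.0, 494.0, 145.0, 27.0, &
                             72.0, 151.0, 583.0, 87.0, &
                             43.0, 34.0, 106.0, 331.0/), (/NC, NR/)))
   X = 0.0
   DO I = 1, NR
      DO J = 1, NC
         K = (I - 1)*NC + J         ! one row of X per table cell
         X(K,I) = 1.0               ! indicator for row category I
         X(K,NR+J) = 1.0            ! indicator for column category J
         X(K,NR+NC+1) = TAB(I,J)    ! cell count becomes the frequency
      END DO
   END DO
   DO K = 1, NR*NC
      WRITE (*, '(8F4.0, F7.0)') (X(K,J), J = 1, NR+NC+1)
   END DO
END PROGRAM MKINDX
```

A matrix built this way would be passed to CANCR with IND1 = 1, ..., 4, IND2 = 5, ..., 8, and IFRQ = 9, as in the program that follows.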
USE CANCR_INT
IMPLICIT NONE
INTEGER IFRQ, IPRINT, LDCFR1, LDCFR2, LDCOF1, LDCOF2, &
LDCORR, LDSTAT, LDX, LDXX, NCOL, NOBS, NV, NVAR1, &
NVAR2
REAL TOL
PARAMETER (IFRQ=9, IPRINT=2, LDCFR1=4, LDCFR2=4, &
LDCOF1=4, LDCOF2=4, LDCORR=4, LDSTAT=15, LDX=16, &
LDXX=16, NCOL=9, NOBS=16, NV=4, NVAR1=4, NVAR2=4, &
TOL=0.0001)
!
INTEGER IND1(NVAR1), IND2(NVAR2)
REAL COEF1(LDCOF1,NVAR1), COEF2(LDCOF2,NVAR2), &
COEFR1(LDCFR1,NV), COEFR2(LDCFR2,NV), CORR(LDCORR,6), &
STAT(LDSTAT,8), X(LDX,NCOL), XX(LDXX,NCOL)
!
DATA IND1/1, 2, 3, 4/, IND2/5, 6, 7, 8/
DATA X/4*1.0, 16*0.0, 4*1.0, 16*0.0, 4*1.0, 16*0.0, 5*1.0, &
3*0.0, 1.0, 3*0.0, 1.0, 3*0.0, 1.0, 4*0.0, 1.0, 3*0.0, 1.0, &
3*0.0, 1.0, 3*0.0, 1.0, 4*0.0, 1.0, 3*0.0, 1.0, 3*0.0, 1.0, &
3*0.0, 1.0, 4*0.0, 1.0, 3*0.0, 1.0, 3*0.0, 1.0, 3*0.0, 1.0, &
821.0, 112.0, 85.0, 35.0, 116.0, 494.0, 145.0, 27.0, 72.0, &
151.0, 583.0, 87.0, 43.0, 34.0, 106.0, 331.0/
!
CALL CANCR (X, IND1, IND2, XX, CORR, COEF1, &
COEF2, COEFR1, COEFR2, STAT, IFRQ=IFRQ, &
TOL=TOL, IPRINT=IPRINT)
!
END
Output
*** WARNING ERROR 2 from C2NCR. One or more Group 1 variables is linearly
*** dependent on the proceeding variables in Group 1.
Here is a traceback of subprogram calls in reverse order:
Routine name Error type Error code
------------ ---------- ----------
C2NCR 6 2 (Called internally)
CANCR 0 0
USER 0 0
*** WARNING ERROR 3 from C2NCR. One or more Group 2 variables is linearly
*** dependent on the proceeding variables in Group 2.
Here is a traceback of subprogram calls in reverse order:
Routine name Error type Error code
------------ ---------- ----------
C2NCR 6 3 (Called internally)
CANCR 0 0
USER 0 0
*** Canonical Correlations Statistics ***
Canonical Prob. of
Correlations Wilks Lambda Raos F Num. df Denom. df Larger F
1 0.6965 0.2734 615.925 9 7875.7 0.0000
2 0.5883 0.5310 602.598 4 6474.0 0.0000
3 0.4336 0.8120 749.823 1 3238.0 0.0000
4 0.0000 0.0000 0.000 0 0.0 0.0000
Group One Canonical Coefficients
1 2 3 4
1 2.670 1.100 1.023 0.000
2 1.341 2.905 -0.460 0.000
3 0.624 2.222 2.147 0.000
4 0.000 0.000 0.000 0.000
Group Two Canonical Coefficients
1 2 3 4
1 2.715 1.164 1.053 0.000
2 1.366 2.972 -0.393 0.000
3 0.676 2.250 2.182 0.000
4 0.000 0.000 0.000 0.000
Correlations Between the Group One Variables
and the Group One Canonical Scores
1 2 3 4
1 0.9068 -0.3954 0.1459 0.0000
2 -0.0121 0.6965 -0.7175 0.0000
3 -0.4555 0.3404 0.8226 0.0000
4 0.0000 0.0000 0.0000 0.0000
Correlations Between the Group Two Variables
and the Group Two Canonical Scores
1 2 3 4
1 0.9072 -0.3997 0.1310 0.0000
2 -0.0227 0.6995 -0.7143 0.0000
3 -0.4590 0.3205 0.8287 0.0000
4 0.0000 0.0000 0.0000 0.0000
*** Statistics for Group One Variables ***
Univariate Statistics from UVSTA
Variable Mean Variance Std. Dev. Skewness Kurtosis
1 0.3248 0.2194 0.4684 0.7482 -1.4401
2 0.2412 0.1831 0.4279 1.2098 -0.5363
3 0.2754 0.1996 0.4468 1.0053 -0.9894
4 0.1585 0.0000 0.0000 1.8697 1.4958
Variable Minimum Maximum Range Coef. Var. Count
1 0.0000 1.0000 1.0000 1.4420 3242.0000
2 0.0000 1.0000 1.0000 1.7739 3242.0000
3 0.0000 1.0000 1.0000 1.6221 3242.0000
4 0.0000 1.0000 1.0000 2.3041 3242.0000
Variable Lower CLM Upper CLM Lower CLV Upper CLV
1 0.3087 0.3409 0.2091 0.2305
2 0.2265 0.2559 0.1745 0.1923
3 0.2601 0.2908 0.1903 0.2097
4 0.1460 0.1711 0.1272 0.1402
Canonical Scores for Group One
1 2 3 4
1 1.307 -0.570 0.210 0.000
2 1.307 -0.570 0.210 0.000
3 1.307 -0.570 0.210 0.000
4 1.307 -0.570 0.210 0.000
5 -0.021 1.235 -1.272 0.000
6 -0.021 1.235 -1.272 0.000
7 -0.021 1.235 -1.272 0.000
8 -0.021 1.235 -1.272 0.000
9 -0.739 0.552 1.334 0.000
10 -0.739 0.552 1.334 0.000
11 -0.739 0.552 1.334 0.000
12 -0.739 0.552 1.334 0.000
13 -1.362 -1.670 -0.813 0.000
14 -1.362 -1.670 -0.813 0.000
15 -1.362 -1.670 -0.813 0.000
16 -1.362 -1.670 -0.813 0.000
*** Statistics for Group Two Variables ***
Univariate Statistics from UVSTA
Variable Mean Variance Std. Dev. Skewness Kurtosis
1 0.3245 0.2193 0.4683 0.7497 -1.4379
2 0.2440 0.1845 0.4296 1.1922 -0.5787
3 0.2835 0.2032 0.4508 0.9609 -1.0766
4 0.1481 0.0000 0.0000 1.9819 1.9280
Variable Minimum Maximum Range Coef. Var. Count
1 0.0000 1.0000 1.0000 1.4430 3242.0000
2 0.0000 1.0000 1.0000 1.7606 3242.0000
3 0.0000 1.0000 1.0000 1.5901 3242.0000
4 0.0000 1.0000 1.0000 2.3992 3242.0000
Variable Lower CLM Upper CLM Lower CLV Upper CLV
1 0.3084 0.3406 0.2090 0.2303
2 0.2292 0.2588 0.1758 0.1938
3 0.2679 0.2990 0.1936 0.2134
4 0.1358 0.1603 0.1203 0.1326
Canonical Scores for Group Two
1 2 3 4
1 1.309 -0.577 0.189 0.000
2 -0.040 1.231 -1.257 0.000
3 -0.730 0.509 1.317 0.000
4 -1.406 -1.740 -0.864 0.000
5 1.309 -0.577 0.189 0.000
6 -0.040 1.231 -1.257 0.000
7 -0.730 0.509 1.317 0.000
8 -1.406 -1.740 -0.864 0.000
9 1.309 -0.577 0.189 0.000
10 -0.040 1.231 -1.257 0.000
11 -0.730 0.509 1.317 0.000
12 -1.406 -1.740 -0.864 0.000
13 1.309 -0.577 0.189 0.000
14 -0.040 1.231 -1.257 0.000
15 -0.730 0.509 1.317 0.000
16 -1.406 -1.740 -0.864 0.000
*** WARNING ERROR 1 from CANCR. The standardized cross covariance matrix
*** is not of full rank or is very ill-conditioned. Small
*** canonical correlations may not be accurate.