principalComponents
Computes principal components.
Synopsis
principalComponents (covariances)
Required Arguments
- float covariances[[]] (Input) - Array of size nVariables by nVariables containing the covariance or correlation matrix.
Return Value
An array of length nVariables containing the eigenvalues of the matrix
covariances ordered from largest to smallest.
Optional Arguments
- covarianceMatrix (Input) - Treat the input array covariances as a covariance matrix. Default = covarianceMatrix.
or
- correlationMatrix (Input) - Treat the input array covariances as a correlation matrix. Default = covarianceMatrix.
- cumPercent (Output) - An array of length nVariables containing the cumulative percent of the total variance explained by each principal component.
- eigenvectors (Output) - An array of size nVariables by nVariables containing the eigenvectors of covariances, stored columnwise. Each vector is normalized to have a Euclidean length of one. The sign of each vector is set so that its largest component in magnitude (the first such component if there are ties) is positive.
- correlations (Output) - An array of size nVariables by nVariables containing the correlations of the principal components (the columns) with the observed/standardized variables (the rows). If covarianceMatrix is specified, then the correlations are with the observed variables. Otherwise, the correlations are with the standardized (to a variance of 1.0) variables. In the principal component model for factor analysis, matrix correlations is the matrix of unrotated factor loadings.
- stdDev, int nDegreesFreedom, float stdDev (Input/Output) - Argument nDegreesFreedom contains the number of degrees of freedom in covariances. Argument stdDev is an array of length nVariables containing the estimated asymptotic standard errors of the eigenvalues.
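The cumPercent values and the eigenvector sign convention described above can be sketched in plain NumPy. This is an illustrative sketch, not the library's implementation: the helper names `normalize_sign` and `cumulative_fraction` are hypothetical, and the eigenvalues are copied from the example output on this page. Note that the example output reports cumPercent as fractions of 1 rather than as percentages.

```python
import numpy as np


def normalize_sign(vectors):
    """Flip each column so its largest-magnitude entry is positive.

    Hypothetical helper mirroring the documented sign convention;
    np.argmax returns the first index when there are ties.
    """
    vectors = np.array(vectors, dtype=float)
    for j in range(vectors.shape[1]):
        k = np.argmax(np.abs(vectors[:, j]))
        if vectors[k, j] < 0:
            vectors[:, j] = -vectors[:, j]
    return vectors


def cumulative_fraction(eigenvalues):
    """Cumulative share of the total variance, one entry per component."""
    values = np.asarray(eigenvalues, dtype=float)
    return np.cumsum(values) / values.sum()


# Eigenvalues as printed in the example output on this page
eigenvalues = [4.677, 1.264, 0.844, 0.555, 0.447, 0.429, 0.310, 0.277, 0.196]
cum = cumulative_fraction(eigenvalues)
print(np.round(cum, 3))
```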
Description
Function principalComponents finds the principal components of a set of
variables from a sample covariance or correlation matrix. The characteristic
roots, characteristic vectors, standard errors for the characteristic roots,
and the correlations of the principal component scores with the original
variables are computed. Principal components obtained from correlation
matrices are the same as principal components obtained from standardized (to
unit variance) variables.
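The equivalence between correlation-matrix components and components of standardized variables can be checked numerically. The sketch below uses NumPy only and made-up data; the variable names are illustrative and nothing here calls principalComponents itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations of 3 variables on very different scales
data = rng.normal(size=(200, 3)) * np.array([1.0, 10.0, 100.0])
data[:, 1] += 5.0 * data[:, 0]  # introduce some correlation

# Standardize each variable to zero mean and unit sample variance
standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# The correlation matrix of the raw data equals the covariance matrix
# of the standardized data, so both yield the same principal components.
corr = np.corrcoef(data, rowvar=False)
cov_of_standardized = np.cov(standardized, rowvar=False)
same_matrix = np.allclose(corr, cov_of_standardized)

same_eigenvalues = np.allclose(np.linalg.eigvalsh(corr),
                               np.linalg.eigvalsh(cov_of_standardized))
print(same_matrix, same_eigenvalues)
```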
The principal component scores are the elements of the vector \(y = \Gamma^T x\), where \(\Gamma\) is the matrix whose columns are the characteristic vectors (eigenvectors) of the sample covariance (or correlation) matrix and \(x\) is the vector of observed (or standardized) random variables. The variances of the principal component scores are the characteristic roots (eigenvalues) of the covariance (correlation) matrix.
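As an illustration of the score definition, the following NumPy sketch computes \(y = \Gamma^T x\) for made-up data and confirms that the score variances equal the eigenvalues. It uses `numpy.linalg.eigh` as a stand-in for the eigen-decomposition; the data and variable names are assumptions for the example only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical observations: 500 rows, 4 correlated variables
x = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))
x_centered = x - x.mean(axis=0)
cov = np.cov(x_centered, rowvar=False)

# eigh returns eigenvalues in ascending order; reorder largest first
eigenvalues, gamma = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
gamma = gamma[:, order]

# Each row of `scores` is Gamma^T x for one observation
scores = x_centered @ gamma

# The scores are uncorrelated and their variances are the eigenvalues
score_cov = np.cov(scores, rowvar=False)
print(np.allclose(score_cov, np.diag(eigenvalues)))
```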
Asymptotic variances for the characteristic roots were first obtained by Girschick (1939) and are given more recently by Kendall et al. (1983, p. 331). These variances are computed either for covariance matrices or for correlation matrices.
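For orientation, the classical large-sample result for a covariance matrix of multivariate normal data with distinct roots gives an asymptotic variance of \(2\lambda_i^2/\nu\) for the i-th eigenvalue, where \(\nu\) is the number of degrees of freedom. The sketch below implements only that textbook expression. Whether principalComponents uses exactly this form is an assumption, and the correlation-matrix case follows a more involved formula that is not reproduced here. The function name is hypothetical.

```python
import math


def asymptotic_std_errors(eigenvalues, n_degrees_freedom):
    """Textbook standard errors sqrt(2 / df) * lambda_i.

    Covariance-matrix case only; assumes multivariate normal data
    and distinct eigenvalues. Not taken from the library source.
    """
    if n_degrees_freedom <= 0:
        raise ValueError("n_degrees_freedom must be positive")
    scale = math.sqrt(2.0 / n_degrees_freedom)
    return [scale * value for value in eigenvalues]


errors = asymptotic_std_errors([4.0, 1.0], 200)
print(errors)
```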
The correlations of the principal components with the observed (or
standardized) variables are given in the matrix correlations. When the
principal components are obtained from a correlation matrix,
correlations is the same as the matrix of unrotated factor loadings
obtained for the principal components model for factor analysis.
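The correlation between variable i and component j follows from the definitions above as \(\gamma_{ij}\sqrt{\lambda_j}/\sigma_i\), where \(\sigma_i\) is the standard deviation of variable i (equal to 1 for a correlation matrix). A minimal NumPy sketch of that relation, with a hypothetical function name and `numpy.linalg.eigh` standing in for the library's eigen-solver:

```python
import numpy as np


def component_correlations(matrix):
    """Correlations of variables (rows) with principal components (columns).

    Illustrative only: computes gamma_ij * sqrt(lambda_j) / sigma_i.
    Column signs follow whatever eigh returns, not the library convention.
    """
    matrix = np.asarray(matrix, dtype=float)
    eigenvalues, vectors = np.linalg.eigh(matrix)
    order = np.argsort(eigenvalues)[::-1]           # largest eigenvalue first
    eigenvalues = np.clip(eigenvalues[order], 0.0, None)
    vectors = vectors[:, order]
    sigma = np.sqrt(np.diag(matrix))                # all ones for a correlation matrix
    return vectors * np.sqrt(eigenvalues) / sigma[:, None]


loadings = component_correlations([[1.0, 0.5], [0.5, 1.0]])
print(np.round(np.abs(loadings), 3))
```

For the first variable and first component in Example 2, the same relation gives 0.3462 × √4.677 ≈ 0.7487, which agrees with the printed matrix A.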
Examples
Example 1
In this example, eigenvalues of the covariance matrix are computed.
from pyimsl.stat.principalComponents import principalComponents
from pyimsl.stat.writeMatrix import writeMatrix
covariances = [
[1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
[0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
[0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
[0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
[0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
[0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
[0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
[0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
[0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]
# Perform analysis
values = principalComponents(covariances)
# Print results
writeMatrix("Eigenvalues", values)
Output
Eigenvalues
1 2 3 4 5 6
4.677 1.264 0.844 0.555 0.447 0.429
7 8 9
0.310 0.277 0.196
Example 2
In this example, principal components are computed for a nine-variable correlation matrix.
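from pyimsl.stat.principalComponents import principalComponents
from pyimsl.stat.writeMatrix import writeMatrix
# Empty lists passed as output arguments are filled in by principalComponents
eigenvectors = []
cum_percent = []
a = []
# stdDev takes a dict: nDegreesFreedom is the input, and the standard
# errors are returned under the 'stdDev' key
std_dev = {'nDegreesFreedom': 100}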
covariances = [
[1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
[0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
[0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
[0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
[0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
[0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
[0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
[0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
[0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]
# Perform analysis
values = principalComponents(covariances,
                             correlationMatrix=True,
                             eigenvectors=eigenvectors,
                             stdDev=std_dev,
                             cumPercent=cum_percent,
                             correlations=a)
# Print results
writeMatrix('Eigenvalues', values, writeFormat="%6.3f")
writeMatrix('Eigenvectors', eigenvectors)
writeMatrix('STD', std_dev['stdDev'], writeFormat="%6.4f")
writeMatrix('PCT', cum_percent, writeFormat="%6.3f")
writeMatrix('A', a)
Output
Eigenvalues
1 2 3 4 5 6 7 8 9
4.677 1.264 0.844 0.555 0.447 0.429 0.310 0.277 0.196
Eigenvectors
1 2 3 4 5
1 0.3462 -0.2354 0.1386 -0.3317 -0.1088
2 0.3526 -0.1108 -0.2795 -0.2161 0.7664
3 0.2754 -0.2697 -0.5585 0.6939 -0.1531
4 0.3664 0.4031 0.0406 0.1196 0.0017
5 0.3144 0.5022 -0.0733 -0.0207 -0.2804
6 0.3455 0.4553 0.1825 0.1114 0.1202
7 0.3487 -0.2714 -0.0725 -0.3545 -0.5242
8 0.2407 -0.3159 0.7383 0.4329 0.0861
9 0.3847 -0.2533 -0.0078 -0.1468 0.0459
6 7 8 9
1 0.7974 0.1735 -0.1240 -0.0488
2 -0.2002 0.1386 -0.3032 -0.0079
3 0.1511 0.0099 -0.0406 -0.0997
4 0.1152 -0.4022 -0.1178 0.7060
5 -0.1796 0.7295 0.0075 0.0046
6 0.0696 -0.3742 0.0925 -0.6780
7 -0.4355 -0.2854 -0.3408 -0.1089
8 -0.1969 0.1862 -0.1623 0.0505
9 -0.1498 -0.0251 0.8521 0.1225
STD
1 2 3 4 5 6 7 8 9
0.6498 0.1771 0.0986 0.0879 0.0882 0.0890 0.0944 0.0994 0.1113
PCT
1 2 3 4 5 6 7 8 9
0.520 0.660 0.754 0.816 0.865 0.913 0.947 0.978 1.000
A
1 2 3 4 5
1 0.7487 -0.2646 0.1274 -0.2471 -0.0728
2 0.7625 -0.1245 -0.2568 -0.1610 0.5124
3 0.5956 -0.3032 -0.5133 0.5170 -0.1024
4 0.7923 0.4532 0.0373 0.0891 0.0012
5 0.6799 0.5646 -0.0674 -0.0154 -0.1875
6 0.7472 0.5119 0.1677 0.0830 0.0804
7 0.7542 -0.3051 -0.0666 -0.2641 -0.3505
8 0.5206 -0.3552 0.6784 0.3225 0.0576
9 0.8319 -0.2848 -0.0071 -0.1094 0.0307
6 7 8 9
1 0.5224 0.0966 -0.0652 -0.0216
2 -0.1312 0.0772 -0.1596 -0.0035
3 0.0990 0.0055 -0.0214 -0.0442
4 0.0755 -0.2240 -0.0620 0.3127
5 -0.1177 0.4063 0.0039 0.0021
6 0.0456 -0.2084 0.0487 -0.3003
7 -0.2853 -0.1589 -0.1794 -0.0482
8 -0.1290 0.1037 -0.0854 0.0224
9 -0.0981 -0.0140 0.4485 0.0543
Warning Errors
IMSLS_100_DF | Because the number of degrees of freedom in “covariances” and “nDegreesFreedom” is less than or equal to 0, 100 degrees of freedom will be used.
IMSLS_COV_NOT_NONNEG_DEF | “eigenvalues[#]” = #. One or more eigenvalues much less than zero are computed. The matrix “covariances” is not nonnegative definite. In order to continue computations of “eigenvalues” and “correlations,” these eigenvalues are treated as 0.
IMSLS_FAILED_TO_CONVERGE | The iteration for the eigenvalue failed to converge in 100 iterations before deflating.
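To anticipate IMSLS_COV_NOT_NONNEG_DEF before calling the function, the input matrix can be screened with NumPy. The sketch below is an assumption-laden illustration: the function name and the `tolerance` threshold are hypothetical, since the page does not state how far below zero an eigenvalue must be to trigger the warning.

```python
import numpy as np


def clipped_eigenvalues(matrix, tolerance=1e-10):
    """Return eigenvalues (largest first, negatives set to 0) and a flag.

    `tolerance` is an arbitrary illustrative threshold, not the
    library's criterion for "much less than zero".
    """
    values = np.linalg.eigvalsh(np.asarray(matrix, dtype=float))[::-1]
    has_negative = bool(np.any(values < -tolerance))
    return np.where(values < 0.0, 0.0, values), has_negative


# This symmetric matrix has eigenvalues 3 and -1, so it is not
# nonnegative definite and the flag is raised.
values, flagged = clipped_eigenvalues([[1.0, 2.0], [2.0, 1.0]])
print(values, flagged)
```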