principalComponents¶
Computes principal components.
Synopsis¶
principalComponents (covariances)
Required Arguments¶
- float covariances[[]] (Input) - Array of size nVariables by nVariables containing the covariance or correlation matrix.
Return Value¶
An array of length nVariables containing the eigenvalues of the matrix covariances, ordered from largest to smallest.
Optional Arguments¶
- covarianceMatrix (Input) - Treat the input array covariances as a covariance matrix. Default = covarianceMatrix.
or
- correlationMatrix (Input) - Treat the input array covariances as a correlation matrix. Default = covarianceMatrix.
- cumPercent (Output) - An array of length nVariables containing the cumulative percent of the total variance explained by each principal component.
- eigenvectors (Output) - An array of size nVariables by nVariables containing the eigenvectors of covariances, stored columnwise. Each vector is normalized to have Euclidean length one. The sign of each vector is set so that its largest component in magnitude (the first such component if there are ties) is positive.
- correlations (Output) - An array of size nVariables by nVariables containing the correlations of the principal components (the columns) with the observed or standardized variables (the rows). If covarianceMatrix is specified, the correlations are with the observed variables. Otherwise, the correlations are with the standardized (to a variance of 1.0) variables. In the principal component model for factor analysis, the matrix correlations is the matrix of unrotated factor loadings.
- stdDev, int nDegreesFreedom, float stdDev (Input/Output) - Argument nDegreesFreedom contains the number of degrees of freedom in covariances. Argument stdDev is an array of length nVariables containing the estimated asymptotic standard errors of the eigenvalues.
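The normalization and sign rule described for eigenvectors can be sketched in plain Python. This is only an illustration of the stated convention, not the library's code; the helper name normalize_eigenvector is hypothetical, and the sketch assumes a nonzero input vector.

```python
import math

def normalize_eigenvector(v):
    """Scale v to unit Euclidean length and make its largest-magnitude entry positive.

    Hypothetical helper, for illustration only. Assumes v is not the zero vector.
    """
    norm = math.sqrt(sum(x * x for x in v))
    unit = [x / norm for x in v]
    # Index of the first entry with the largest absolute value
    # (max returns the first maximal element, so ties go to the first one)
    pivot = max(range(len(unit)), key=lambda i: abs(unit[i]))
    if unit[pivot] < 0:
        unit = [-x for x in unit]
    return unit

print(normalize_eigenvector([3.0, -4.0]))  # largest magnitude is -4, so the sign flips
print(normalize_eigenvector([-2.0, 2.0]))  # tie in magnitude: the first entry decides
```

Because an eigenvector is only defined up to sign, a rule of this kind makes the output reproducible across runs and platforms.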
Description¶
Function principalComponents
finds the principal components of a set of
variables from a sample covariance or correlation matrix. The characteristic
roots, characteristic vectors, standard errors for the characteristic roots,
and the correlations of the principal component scores with the original
variables are computed. Principal components obtained from correlation
matrices are the same as principal components obtained from standardized (to
unit variance) variables.
The principal component scores are the elements of the vector \(y = \Gamma^T x\), where \(\Gamma\) is the matrix whose columns are the characteristic vectors (eigenvectors) of the sample covariance (or correlation) matrix and \(x\) is the vector of observed (or standardized) random variables. The variances of the principal component scores are the characteristic roots (eigenvalues) of the covariance (correlation) matrix.
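The statement that the score variances equal the eigenvalues can be checked numerically. The sketch below uses NumPy rather than pyimsl, and the 3-by-3 matrix sigma is a made-up example; it shows that the covariance of the scores, \(\Gamma^T \Sigma \Gamma\), is diagonal with the eigenvalues on the diagonal.

```python
import numpy as np

# Hypothetical 3-variable covariance matrix, chosen only for illustration
sigma = np.array([[4.0, 2.0, 0.6],
                  [2.0, 3.0, 0.4],
                  [0.6, 0.4, 1.0]])

# eigh returns eigenvalues in ascending order; reorder to largest first
eigenvalues, gamma = np.linalg.eigh(sigma)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
gamma = gamma[:, order]

# Covariance of the scores y = Gamma^T x is Gamma^T Sigma Gamma
score_cov = gamma.T @ sigma @ gamma

print(np.round(eigenvalues, 4))
print(np.round(score_cov, 4))  # diagonal holds the eigenvalues, off-diagonals are ~0
```

The eigenvalues also sum to the trace of sigma, so the total variance is preserved by the transformation.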
Asymptotic variances for the characteristic roots were first obtained by Girshick (1939) and are given more recently by Kendall et al. (1983, p. 331). These variances are computed either for covariance matrices or for correlation matrices.
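For orientation, the standard large-sample result for a covariance matrix estimated from normal data with distinct roots is \(\mathrm{Var}(\hat\lambda_i) \approx 2\lambda_i^2/\nu\), where \(\nu\) is the number of degrees of freedom. The sketch below implements only that textbook formula. It is an assumption that it corresponds to what the function computes in the covariance case, the helper name is hypothetical, and the correlation-matrix case uses a different expression that is not reproduced here.

```python
import math

def asymptotic_std_errors(eigenvalues, n_degrees_freedom):
    """Textbook approximation: se(lambda_i) = lambda_i * sqrt(2 / nu).

    Hypothetical helper; covers the covariance-matrix case only.
    """
    scale = math.sqrt(2.0 / n_degrees_freedom)
    return [scale * lam for lam in eigenvalues]

# Made-up eigenvalues with 50 degrees of freedom
print(asymptotic_std_errors([5.0, 2.0, 0.5], 50))
```

Under this approximation the standard error is proportional to the eigenvalue itself, so the relative precision of every root is the same.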
The correlations of the principal components with the observed (or standardized) variables are given in the matrix correlations. When the principal components are obtained from a correlation matrix, correlations is the same as the matrix of unrotated factor loadings obtained for the principal components model for factor analysis.
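The relationship between eigenvectors and these correlations can be sketched with the usual identity \(\mathrm{corr}(x_i, y_j) = \gamma_{ij}\sqrt{\lambda_j}/\sqrt{\sigma_{ii}}\), where \(\sigma_{ii} = 1\) for a correlation matrix. The numbers below are copied from the printed output of Example 2 (first two components, first three variables); the helper name loadings is hypothetical.

```python
import math

# First two eigenvalues and the matching eigenvector entries (variables 1-3),
# copied from the printed output of Example 2
eigenvalues = [4.677, 1.264]
eigenvectors = [[0.3462, -0.2354],
                [0.3526, -0.1108],
                [0.2754, -0.2697]]

def loadings(eigenvectors, eigenvalues, variances=None):
    """corr(x_i, y_j) = gamma_ij * sqrt(lambda_j) / sqrt(sigma_ii)."""
    if variances is None:
        # Correlation matrix: every variable has unit variance
        variances = [1.0] * len(eigenvectors)
    return [[g * math.sqrt(lam) / math.sqrt(variances[i])
             for g, lam in zip(row, eigenvalues)]
            for i, row in enumerate(eigenvectors)]

result = loadings(eigenvectors, eigenvalues)
for row in result:
    print(["%.4f" % v for v in row])
```

The printed values agree with the corresponding entries of the matrix A in Example 2 to within the rounding of the four-digit inputs.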
Examples¶
Example 1¶
In this example, eigenvalues of the covariance matrix are computed.
from pyimsl.stat.principalComponents import principalComponents
from pyimsl.stat.writeMatrix import writeMatrix
covariances = [
[1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
[0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
[0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
[0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
[0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
[0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
[0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
[0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
[0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]
# Perform analysis
values = principalComponents(covariances)
# Print results
writeMatrix("Eigenvalues", values)
Output¶
Eigenvalues
1 2 3 4 5 6
4.677 1.264 0.844 0.555 0.447 0.429
7 8 9
0.310 0.277 0.196
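As a cross-check that does not depend on pyimsl, the same eigenvalues can be obtained with NumPy, assuming NumPy is installed. This is an independent sketch, not how principalComponents is implemented; numpy.linalg.eigvalsh returns the eigenvalues of a symmetric matrix in ascending order, so they are reversed here.

```python
import numpy as np

covariances = np.array([
    [1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
    [0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
    [0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
    [0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
    [0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
    [0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
    [0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
    [0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
    [0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]])

# eigvalsh returns ascending eigenvalues; reverse to match the documented ordering
values = np.linalg.eigvalsh(covariances)[::-1]

print(np.round(values, 3))
print(round(values.sum(), 6))  # equals the trace of the matrix
```

Since every diagonal entry is 1.0, the eigenvalues sum to the number of variables, 9.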
Example 2¶
In this example, principal components are computed for a nine-variable correlation matrix.
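from pyimsl.stat.principalComponents import principalComponents
from pyimsl.stat.writeMatrix import writeMatrix
# Empty lists that the call fills in as output arguments
eigenvectors = []
cum_percent = []
a = []
# stdDev takes a dict: nDegreesFreedom is the input, and the
# standard errors are read back from the 'stdDev' key after the call
std_dev = {'nDegreesFreedom': 100}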
covariances = [
[1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
[0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
[0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
[0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
[0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
[0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
[0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
[0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
[0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]
# Perform analysis
values = principalComponents(covariances,
                             correlationMatrix=True,
                             eigenvectors=eigenvectors,
                             stdDev=std_dev,
                             cumPercent=cum_percent,
                             correlations=a)
# Print results
writeMatrix('Eigenvalues', values, writeFormat="%6.3f")
writeMatrix('Eigenvectors', eigenvectors)
writeMatrix('STD', std_dev['stdDev'], writeFormat="%6.4f")
writeMatrix('PCT', cum_percent, writeFormat="%6.3f")
writeMatrix('A', a)
Output¶
Eigenvalues
1 2 3 4 5 6 7 8 9
4.677 1.264 0.844 0.555 0.447 0.429 0.310 0.277 0.196
Eigenvectors
1 2 3 4 5
1 0.3462 -0.2354 0.1386 -0.3317 -0.1088
2 0.3526 -0.1108 -0.2795 -0.2161 0.7664
3 0.2754 -0.2697 -0.5585 0.6939 -0.1531
4 0.3664 0.4031 0.0406 0.1196 0.0017
5 0.3144 0.5022 -0.0733 -0.0207 -0.2804
6 0.3455 0.4553 0.1825 0.1114 0.1202
7 0.3487 -0.2714 -0.0725 -0.3545 -0.5242
8 0.2407 -0.3159 0.7383 0.4329 0.0861
9 0.3847 -0.2533 -0.0078 -0.1468 0.0459
6 7 8 9
1 0.7974 0.1735 -0.1240 -0.0488
2 -0.2002 0.1386 -0.3032 -0.0079
3 0.1511 0.0099 -0.0406 -0.0997
4 0.1152 -0.4022 -0.1178 0.7060
5 -0.1796 0.7295 0.0075 0.0046
6 0.0696 -0.3742 0.0925 -0.6780
7 -0.4355 -0.2854 -0.3408 -0.1089
8 -0.1969 0.1862 -0.1623 0.0505
9 -0.1498 -0.0251 0.8521 0.1225
STD
1 2 3 4 5 6 7 8 9
0.6498 0.1771 0.0986 0.0879 0.0882 0.0890 0.0944 0.0994 0.1113
PCT
1 2 3 4 5 6 7 8 9
0.520 0.660 0.754 0.816 0.865 0.913 0.947 0.978 1.000
A
1 2 3 4 5
1 0.7487 -0.2646 0.1274 -0.2471 -0.0728
2 0.7625 -0.1245 -0.2568 -0.1610 0.5124
3 0.5956 -0.3032 -0.5133 0.5170 -0.1024
4 0.7923 0.4532 0.0373 0.0891 0.0012
5 0.6799 0.5646 -0.0674 -0.0154 -0.1875
6 0.7472 0.5119 0.1677 0.0830 0.0804
7 0.7542 -0.3051 -0.0666 -0.2641 -0.3505
8 0.5206 -0.3552 0.6784 0.3225 0.0576
9 0.8319 -0.2848 -0.0071 -0.1094 0.0307
6 7 8 9
1 0.5224 0.0966 -0.0652 -0.0216
2 -0.1312 0.0772 -0.1596 -0.0035
3 0.0990 0.0055 -0.0214 -0.0442
4 0.0755 -0.2240 -0.0620 0.3127
5 -0.1177 0.4063 0.0039 0.0021
6 0.0456 -0.2084 0.0487 -0.3003
7 -0.2853 -0.1589 -0.1794 -0.0482
8 -0.1290 0.1037 -0.0854 0.0224
9 -0.0981 -0.0140 0.4485 0.0543
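The PCT row above can be reproduced from the printed eigenvalues as a running sum divided by the total. The sketch below is plain Python and uses the rounded eigenvalues from the output, so it is a consistency check rather than the library's computation. Note that the values are fractions of 1 rather than percentages, as the PCT row shows.

```python
# Eigenvalues copied from the printed output of Example 2
eigenvalues = [4.677, 1.264, 0.844, 0.555, 0.447, 0.429, 0.310, 0.277, 0.196]

total = sum(eigenvalues)
cum_percent = []
running = 0.0
for lam in eigenvalues:
    running += lam
    # Share of the total variance explained by the components so far
    cum_percent.append(running / total)

print(" ".join("%.3f" % p for p in cum_percent))
# prints: 0.520 0.660 0.754 0.816 0.865 0.913 0.947 0.978 1.000
```

A common use of this output is to pick the smallest number of components whose cumulative share exceeds a chosen threshold; here three components explain about 75 percent of the total variance.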
Warning Errors¶
IMSLS_100_DF | Because the number of degrees of freedom in “covariances” and “nDegreesFreedom” is less than or equal to 0, 100 degrees of freedom will be used. |
IMSLS_COV_NOT_NONNEG_DEF | “eigenvalues[#]” = #. One or more eigenvalues much less than zero are computed. The matrix “covariances” is not nonnegative definite. In order to continue computations of “eigenvalues” and “correlations,” these eigenvalues are treated as 0. |
IMSLS_FAILED_TO_CONVERGE | The iteration for the eigenvalue failed to converge in 100 iterations before deflating. |