factorAnalysis¶
Extracts initial factor-loading estimates in factor analysis with rotation options.
Synopsis¶
factorAnalysis (covariances, nFactors)
Required Arguments¶
- float covariances[[]] (Input)
  Array of length nVariables × nVariables containing the variance-covariance or correlation matrix.
- int nFactors (Input)
  Number of factors in the model.
Return Value¶
An array of length nVariables × nFactors containing the matrix of factor loadings.
Optional Arguments¶
maximumLikelihood, int (Input)
  Maximum likelihood (common factor model) method used to obtain the estimates. Argument maximumLikelihood is the number of degrees of freedom in covariances.
  Default: unweightedLeastSquares

or

principalComponent
  Principal component (principal component model) method used to obtain the estimates.
  Default: unweightedLeastSquares

or

principalFactor
  Principal factor (common factor model) method used to obtain the estimates.
  Default: unweightedLeastSquares

or

unweightedLeastSquares
  Unweighted least-squares (common factor model) method used to obtain the estimates. This option is the default.

or

generalizedLeastSquares
  Generalized least-squares (common factor model) method used to obtain the estimates.
  Default: unweightedLeastSquares

or

image
  Image-factor analysis (common factor model) method used to obtain the estimates.
  Default: unweightedLeastSquares

or

alpha, int (Input)
  Alpha-factor analysis (common factor model) method used to obtain the estimates. Argument alpha is the number of degrees of freedom in covariances.
  Default: unweightedLeastSquares
uniqueVariancesInput, float[] (Input)
  Array of length nVariables containing the initial estimates of the unique variances.
  Default: Initial estimates are taken as the constant 1 − nFactors/(2 × nVariables) divided by the diagonal elements of the inverse of covariances.

uniqueVariancesOutput, float[] (Output)
  User-allocated array of length nVariables containing the estimated unique variances.

maxIterations, int (Input)
  Maximum number of iterations in the iterative procedure.
  Default: maxIterations = 60

maxStepsLineSearch, int (Input)
  Maximum number of step halvings allowed during any one iteration.
  Default: maxStepsLineSearch = 10

convergenceEps, float (Input)
  Convergence criterion used to terminate the iterations. For the unweighted least-squares, generalized least-squares, or maximum likelihood methods, convergence is assumed when the relative change in the criterion is less than convergenceEps. For alpha-factor analysis, convergence is assumed when the maximum change (relative to the variance) of a uniqueness is less than convergenceEps.
  Default: convergenceEps = 0.0001

switchExactHessian, float (Input)
  Convergence criterion used to switch to exact second derivatives. When the largest relative change in the unique standard deviation vector is less than switchExactHessian, exact second derivative vectors are used. Argument switchExactHessian is not used with the principal component, principal factor, image-factor analysis, or alpha-factor analysis methods.
  Default: switchExactHessian = 0.1

eigenvalues (Output)
  An array of length nVariables containing the eigenvalues of the matrix from which the factors were extracted.

chiSquaredTest, int df, float chiSquared, float pValue (Output)
  Number of degrees of freedom in chi-squared is df; chiSquared is the chi-squared test statistic for testing that nFactors common factors are adequate for the data; pValue is the probability of a greater chi-squared statistic.

tuckerReliabilityCoefficient (Output)
  Tucker reliability coefficient.

nIterations (Output)
  Number of iterations.

functionMin (Output)
  Value of the function minimum.

lastStep (Output)
  An array of length nVariables containing the updates of the unique variance estimates when convergence was reached (or the iterations terminated).
orthomaxRotation, float w, int norm, float b, float t (Input/Output)
  Nonnegative constant w defines the rotation. If norm = 1, row normalization is performed; otherwise, row normalization is not performed. b contains an array of length nVariables × nFactors containing the rotated factor loading matrix. t contains an array of length nFactors × nFactors containing the rotation transformation matrix. w = 0.0 results in quartimax rotations, w = 1.0 results in varimax rotations, and w = nFactors/2.0 results in equamax rotations. Other nonnegative values of w may also be used, but the best values for w are in the range (0.0, 5 × nFactors).
orthogonalProcrustesRotation, float target, float b, float t (Input/Output)
  If specified, the nVariables × nFactors target matrix target will be used to compute an orthogonal Procrustes rotation of the factor-loading matrix. b contains an array of length nVariables × nFactors containing the rotated factor loading matrix. t contains an array of length nFactors × nFactors containing the rotation transformation matrix.
directObliminRotation, float w, int norm, float b, float t, float factorCorrelations (Input/Output)
  Computes a direct oblimin rotation. Nonpositive constant w defines the rotation. If norm = 1, row normalization is performed; otherwise, row normalization is not performed. b contains an array of length nVariables × nFactors containing the rotated factor loading matrix. t contains an array of length nFactors × nFactors containing the rotation transformation matrix. factorCorrelations contains an array of length nFactors × nFactors containing the factor correlations. The parameter w determines the type of direct oblimin rotation to be performed; in general, w must be negative. w = 0.0 results in direct quartimin rotations. As w approaches negative infinity, the orthogonality among the factors will increase.
obliquePromaxRotation, float w, float power, int norm, float target, float b, float t, float factorCorrelations (Input/Output)
  Computes an oblique promax rotation of the factor loading matrix using a power vector. Nonnegative constant w defines the rotation. power, a vector of length nFactors, contains the power vector. If norm = 1, row (Kaiser) normalization is performed; otherwise, row normalization is not performed. b contains an array of length nVariables × nFactors containing the rotated factor loading matrix. t contains an array of length nFactors × nFactors containing the rotation transformation matrix. factorCorrelations contains an array of length nFactors × nFactors containing the factor correlations. target contains an array of length nVariables × nFactors containing the target matrix for rotation, derived from the orthomax rotation. w is used in the orthomax rotation; see the optional argument orthomaxRotation for common values of w. All power[j] should be greater than 1.0, typically 4.0. Generally, the larger the values of power[j], the more oblique the solution will be.
obliquePivotalPromaxRotation, float w, float pivot, int norm, float target, float b, float t, float factorCorrelations (Input/Output)
  Computes an oblique pivotal promax rotation of the factor loading matrix using pivot constants. Nonnegative constant w defines the rotation. pivot, a vector of length nFactors, contains the pivot constants; pivot[j] should be in the interval (0.0, 1.0). If norm = 1, row (Kaiser) normalization is performed; otherwise, row normalization is not performed. b contains an array of length nVariables × nFactors containing the rotated factor loading matrix. t contains an array of length nFactors × nFactors containing the rotation transformation matrix. factorCorrelations contains an array of length nFactors × nFactors containing the factor correlations. target contains an array of length nVariables × nFactors containing the target matrix for rotation, derived from the orthomax rotation. w is used in the orthomax rotation; see the optional argument orthomaxRotation for common values of w.
obliqueProcrustesRotation, float target, float b, float t, float factorCorrelations (Input/Output)
  Computes an oblique Procrustes rotation of the factor loading matrix using a target matrix. target is a hypothesized rotated factor loading matrix based upon prior knowledge, with loadings chosen to enhance interpretability. A simple structure solution will have most of the weights target[i][j] either zero or large in magnitude. b contains an array of length nVariables × nFactors containing the rotated factor loading matrix. t contains an array of length nFactors × nFactors containing the rotation transformation matrix. factorCorrelations contains an array of length nFactors × nFactors containing the factor correlations.
factorStructure, float s, float fvar (Output)
  Computes the factor structure and the variance explained by each factor. s contains an array of length nVariables × nFactors containing the factor structure matrix. fvar contains an array of length nFactors containing the variance accounted for by each of the nFactors rotated factors. A factor rotation matrix is used to compute the factor structure and the variance. One and only one rotation option argument can be specified.
Description¶
Function factorAnalysis
computes factor loadings in exploratory factor
analysis models. Models available in factorAnalysis
are the principal
component model for factor analysis and the common factor model with
additions to the common factor model in alpha-factor analysis and image
analysis. Methods of estimation include principal components, principal
factor, image analysis, unweighted least squares, generalized least squares,
and maximum likelihood.
In the factor analysis model used for factor extraction, the basic model is
given as \(\Sigma=\Lambda\Lambda ^T+\Psi\), where Σ is the p × p
population covariance matrix, Λ is the p × k matrix of factor loadings
relating the factors f to the observed variables x, and Ψ is the p ×
p matrix of covariances of the unique errors e. Here, p =
nVariables
and k = nFactors
. The relationship between the factors,
the unique errors, and the observed variables is given as \(x=\Lambda
f+e\), where in addition, the expected values of e, f, and x are assumed
to be 0. (The sample means can be subtracted from x if the expected value
of x is not 0.) It also is assumed that each factor has unit variance, the
factors are independent of each other, and that the factors and the unique
errors are mutually independent. In the common factor model, the elements of
unique errors e also are assumed to be independent of one another so that
the matrix Ψ is diagonal. This is not the case in the principal component
model in which the errors may be correlated.
Further differences between the various methods concern the criterion that is optimized and the amount of computer effort required to obtain estimates. Generally speaking, the least-squares and maximum likelihood methods, which use iterative algorithms, require the most computer time with the principal factor, principal component and the image methods requiring much less time since the algorithms in these methods are not iterative. The algorithm in alpha-factor analysis is also iterative, but the estimates in this method generally require somewhat less computer effort than the least-squares and maximum likelihood estimates. In all methods, one eigensystem analysis is required on each iteration.
Principal Component and Principal Factor Methods¶
Both the principal component and principal factor methods compute the factor-loading estimates as
\(\hat{\mathit{\Lambda}}=\Gamma\Delta^{1/2}\)
where Γ and the diagonal matrix Δ are the eigenvectors and eigenvalues of a
matrix. In the principal component model, the eigensystem analysis is
performed on the sample covariance (correlation) matrix S, while in the
principal factor model, the matrix \((S-\hat{\mathit{\Psi}})\) is used. If the unique error
variances Ψ are not known in the principal factor model, then
factorAnalysis
obtains estimates for them.
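The eigensystem step can be sketched in a few lines of NumPy. This is an illustrative sketch, not the library's implementation; `principal_component_loadings` is a hypothetical name:

```python
# Illustrative sketch (not the library routine): principal component
# factor extraction. The loadings are the leading eigenvectors of S,
# each scaled by the square root of its eigenvalue.
import numpy as np

def principal_component_loadings(S, n_factors):
    S = np.asarray(S, dtype=float)
    evals, evecs = np.linalg.eigh(S)        # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]         # reorder: largest first
    evals, evecs = evals[order], evecs[:, order]
    return evecs[:, :n_factors] * np.sqrt(evals[:n_factors])

# 3-variable correlation matrix, one factor
S = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
L = principal_component_loadings(S, 1)
```

For the principal factor method, the same decomposition would be applied to the reduced matrix rather than to S itself.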
The basic idea in the principal component method is to find factors that maximize the variance in the original data that is explained by the factors. Because this method allows the unique errors to be correlated, some factor analysts insist that the principal component method is not a factor analytic method. Usually, however, the estimates obtained by the principal component model and factor analysis model will be quite similar.
It should be noted that both the principal component and principal factor
methods give different results when the correlation matrix is used in place
of the covariance matrix. Indeed, any rescaling of the sample covariance
matrix can lead to different estimates with either of these methods. A
further difficulty with the principal factor method is the problem of
estimating the unique error variances. Theoretically, these must be known in
advance and be passed to factorAnalysis
using optional argument
uniqueVariancesInput
. In practice, the estimates of these parameters are
produced by factorAnalysis
when uniqueVariancesInput
is not
specified. In either case, the resulting adjusted covariance (correlation)
matrix
may not yield the nFactors
positive eigenvalues required for
nFactors
factors to be obtained. If this occurs, the user must either
lower the number of factors to be estimated or give new unique error
variance values.
Least-squares and Maximum Likelihood Methods¶
Unlike the previous two methods, the algorithm used to compute estimates in
this section is iterative (see Jöreskog 1977). As with the principal factor
model, the user may either initialize the unique error variances or allow
factorAnalysis
to compute initial estimates. Unlike the principal factor
method, factorAnalysis
optimizes the criterion function with respect to
both Ψ and Γ. (In the principal factor method, Ψ is assumed to be known.
Given Ψ, estimates for Λ may be obtained.)
The major difference between the methods discussed in this section is in the criterion function that is optimized. Let S denote the sample covariance (correlation) matrix, and let Σ denote the covariance matrix that is to be estimated by the factor model. In the unweighted least-squares method, also called the iterated principal factor method or the minres method (see Harman 1976, p. 177), the function minimized is the sum-of-squared differences between S and Σ. This is written as \(\Phi_{u1}=0.5\) (trace \((S-\Sigma)^2\)).
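The unweighted least-squares criterion is easy to state concretely. The following NumPy sketch (illustrative names, not part of pyimsl) evaluates \(\Phi_{u1}\) for a candidate loading matrix and unique variances:

```python
# Illustrative sketch: evaluate the unweighted least-squares criterion
# phi = 0.5 * trace((S - Sigma)^2), where Sigma = Lambda Lambda^T + Psi.
import numpy as np

def uls_criterion(S, loadings, unique_variances):
    S = np.asarray(S, dtype=float)
    Lam = np.asarray(loadings, dtype=float)
    Sigma = Lam @ Lam.T + np.diag(unique_variances)
    R = S - Sigma
    return 0.5 * float(np.trace(R @ R))

S = np.array([[1.0, 0.6],
              [0.6, 1.0]])
# A one-factor model that reproduces S exactly, so the criterion is zero
Lam = np.array([[np.sqrt(0.6)], [np.sqrt(0.6)]])
phi = uls_criterion(S, Lam, [0.4, 0.4])
```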
Generalized least-squares and maximum likelihood estimates are asymptotically equivalent methods. Maximum likelihood estimates maximize the (normal theory) likelihood {\(\Phi_{m1}\) = trace \((\Sigma^{-1}S)-\log(|\Sigma^{-1}S|)\)}, while generalized least squares optimizes the function \(\Phi_{gs}\) = trace \((\Sigma S^{-1}-I)^2\).
In all three methods, a two-stage optimization procedure is used. This proceeds by first solving the likelihood equations for Λ in terms of Ψ and substituting the solution into the likelihood. This gives a criterion \(\phi(\Psi,\Lambda(\Psi))\), which is optimized with respect to Ψ. In the second stage, the estimates \(\hat{\mathit{\Lambda}}\) are obtained from the estimates for Ψ.
The generalized least-squares and maximum likelihood methods allow for the
computation of a statistic (chiSquaredTest
) for testing that
nFactors
common factors are adequate to fit the model. This is a
chi-squared test that all remaining parameters associated with additional
factors are 0. If the probability of a larger chi-squared is so small that
the null hypothesis is rejected, then additional factors are needed
(although these factors may not be of any practical importance). Failure to
reject does not legitimize the model. The statistic chiSquaredTest
is a
likelihood ratio statistic in maximum likelihood estimation. As such, it
asymptotically follows a chi-squared distribution with degrees of freedom
given by df
.
The Tucker and Lewis reliability coefficient, ρ, is returned by
tuckerReliabilityCoefficient
when the maximum likelihood or generalized
least-squares methods are used. This coefficient is an estimate of the ratio
of explained variation to the total variation in the data. It is computed as
follows:
\(\rho = \frac{mM_o - mM_k}{mM_o - 1}, \quad m = d - \frac{2p+5}{6} - \frac{2k}{3}, \quad M_o = \frac{-\ln|S|}{p(p-1)/2}, \quad M_k = \frac{\phi}{\left((p-k)^2-p-k\right)/2}\)
where |S| is the determinant of covariances, p = nVariables
,
k = nFactors, ɸ is the optimized criterion, and d is the number of
degrees of freedom in covariances.
Image Analysis Method¶
The term image analysis is used here to denote the noniterative image method of Kaiser (1963). It is not the image analysis discussed by Harman (1976, p. 226). The image method (as well as the alpha-factor analysis method) begins with the notion that only a finite number from an infinite number of possible variables have been measured. The image factor pattern is calculated under the assumption that the ratio of the number of factors to the number of observed variables is near 0, so that a very good estimate for the unique error variances (for standardized variables) is given as 1 minus the squared multiple correlation of the variable under consideration with all variables in the covariance matrix.
First, the matrix \(D^2=(\mathrm{diag} (S^{-1}))^{-1}\) is computed where the operator “diag” results in a matrix consisting of the diagonal elements of its argument and S is the sample covariance (correlation) matrix. Then, the eigenvalues Λ and eigenvectors Γ of the matrix \(D^{-1} SD^{-1}\) are computed. Finally, the unrotated image-factor pattern is computed as \(D \Gamma[(\Lambda-I)^2 \Lambda^{-1} ]^{1/2}\).
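The three steps just described translate directly into NumPy. This is a sketch under the assumption that S is positive definite; `image_factor_pattern` is an illustrative name:

```python
# Illustrative sketch of the image method: D^2 = (diag(S^-1))^-1,
# eigen-decompose D^-1 S D^-1, then form D Gamma [(Lambda - I)^2 Lambda^-1]^(1/2).
import numpy as np

def image_factor_pattern(S, n_factors):
    S = np.asarray(S, dtype=float)
    d = np.sqrt(1.0 / np.diag(np.linalg.inv(S)))   # diagonal of D
    M = S / np.outer(d, d)                         # D^-1 S D^-1
    evals, evecs = np.linalg.eigh(M)
    order = np.argsort(evals)[::-1]                # largest eigenvalues first
    lam = evals[order][:n_factors]
    gam = evecs[:, order][:, :n_factors]
    scale = np.sqrt((lam - 1.0) ** 2 / lam)        # [(Lambda - I)^2 Lambda^-1]^(1/2)
    return d[:, None] * gam * scale

S = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
P = image_factor_pattern(S, 1)
```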
Alpha-factor Analysis Method¶
The alpha-factor analysis method of Kaiser and Caffrey (1965) finds factor-loading estimates to maximize the correlation between the factors and the complete universe of variables of interest. The basic idea in this method is that only a finite number of variables out of a much larger set of possible variables is observed. The population factors are linearly related to this larger set, while the observed factors are linearly related to the observed variables. Let f denote the factors obtainable from a finite set of observed random variables, and let ξ denote the factors obtainable from the universe of observable variables. Then, the alpha method attempts to find factor-loading estimates so as to maximize the correlation between f and ξ. In order to obtain these estimates, the iterative algorithm of Kaiser and Caffrey (1965) is used.
Rotation Methods¶
The orthomaxRotation
optional argument performs an orthogonal rotation
according to an orthomax criterion. In this analytic method of rotation, the
criterion function
\(Q=\sum_{j=1}^{k}\left[\sum_{i=1}^{p}\lambda_{ij}^4-\frac{\gamma}{p}\left(\sum_{i=1}^{p}\lambda_{ij}^2\right)^2\right]\)
is optimized by finding an orthogonal rotation matrix T such that
\((\lambda_{ij})=\Lambda=AT\) where A is the matrix of unrotated factor
loadings. Here, \(\gamma\geq 0\) is a user-specified constant (w
)
yielding a family of rotations, and p is the number of variables.
Kaiser (row) normalization can be performed on the factor loadings prior to
rotation by specifying the parameter norm = 1. In Kaiser normalization,
the rows of A
are first “normalized” by dividing each row by the square
root of the sum of its squared elements (Harman 1976). After the rotation is
complete, each row of b
is “denormalized” by multiplication by its
initial normalizing constant.
The method for optimizing Q proceeds by accumulating simple rotations
where a simple rotation is defined to be one in which Q is optimized for
two columns in Λ and for which the requirement that T be orthogonal is
satisfied. A
single iteration is defined to be such that each of the
nFactors
(nFactors
- 1)/2 possible simple rotations is performed
where nFactors
is the number of factors. When the relative change in Q
from one iteration to the next is less than convergenceEps
(the
user-specified convergence criterion), the algorithm stops.
convergenceEps
= 0.0001 is usually sufficient. Alternatively, the
algorithm stops when the user-specified maximum number of iterations,
maxIterations
, is reached. maxIterations
= 30 is usually sufficient.
The parameter in the rotation, \(\gamma\), is used to provide a family of rotations. When \(\gamma=0.0\), a quartimax rotation results. Other values of \(\gamma\) yield other rotations.
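For reference, the standard orthomax criterion weighs fourth powers of the loadings, \(Q=\sum_j\left[\sum_i \lambda_{ij}^4-\frac{\gamma}{p}\left(\sum_i \lambda_{ij}^2\right)^2\right]\). The sketch below (illustrative, not the library's rotation routine) evaluates it and shows that perfect simple structure scores higher than evenly spread loadings under quartimax (\(\gamma=0\)):

```python
# Illustrative sketch: evaluate the orthomax criterion for a loading matrix.
# gamma = 0 gives quartimax, gamma = 1 varimax, gamma = nFactors/2 equamax.
import numpy as np

def orthomax_criterion(loadings, gamma):
    L2 = np.asarray(loadings, dtype=float) ** 2
    p = L2.shape[0]
    return float(np.sum(L2 ** 2) - (gamma / p) * np.sum(L2.sum(axis=0) ** 2))

simple = np.array([[1.0, 0.0],
                   [0.0, 1.0]])          # perfect simple structure
mixed = np.full((2, 2), np.sqrt(0.5))    # loadings spread evenly
q_simple = orthomax_criterion(simple, 0.0)
q_mixed = orthomax_criterion(mixed, 0.0)
```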
The orthogonalProcrustesRotation
optional argument performs orthogonal
Procrustes rotation according to a method proposed by Schönemann (1966). Let
k = nFactors
denote the number of factors, p = nVariables
denote
the number of variables, A denote the p × k matrix of unrotated factor
loadings, T denote the k × k orthogonal rotation matrix (orthogonality
requires that \(T^TT\) be a k × k identity matrix), and let X
denote the target matrix. The basic idea in orthogonal Procrustes rotation is
to find an orthogonal rotation matrix T such that \(B=AT\) and T
provides a least-squares fit between the target matrix X and the rotated
loading matrix B. Schönemann's algorithm proceeds by finding the singular
value decomposition of the matrix \(A^TX=U\Sigma V^T\). The rotation
matrix is computed as \(T=UV^T\).
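The whole rotation step fits in two lines of NumPy. In this sketch (illustrative names), the target is built as an exact rotation of A, so the computed T recovers it:

```python
# Illustrative sketch of the orthogonal Procrustes step:
# T = U V^T, where U Sigma V^T is the SVD of A^T X.
import numpy as np

def procrustes_rotation(A, X):
    U, _, Vt = np.linalg.svd(A.T @ X)
    return U @ Vt

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2))      # unrotated loadings
theta = 0.3                          # build a target that is an exact rotation of A
T_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
X = A @ T_true
T = procrustes_rotation(A, X)        # recovers T_true (up to roundoff)
```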
The directObliminRotation
optional argument performs direct oblimin
rotation. In this analytic method of rotation, the criterion function
\(Q=\sum_{r<s}\left(\sum_{i=1}^{p}\lambda_{ir}^2\lambda_{is}^2-\frac{\gamma}{p}\sum_{i=1}^{p}\lambda_{ir}^2\sum_{i=1}^{p}\lambda_{is}^2\right)\)
is minimized by finding a rotation matrix T such that
\((\lambda_{ir})=\Lambda= AT\) and \((T^TT)^{-1}\) is a correlation
matrix. Here, \(\gamma\leq 0\) is a user-specified constant (w
)
yielding a family of rotations, and p is the number of variables. The
rotation is said to be direct because it minimizes Q with respect to the
factor loadings directly, ignoring the reference structure.
Kaiser normalization can be performed on the factor loadings prior to
rotation via the parameter norm
. In Kaiser normalization (see Harman
1976), the rows of the factor loading matrix are first “normalized” by
dividing each row by the square root of the sum of its squared elements.
After the rotation is complete, each row of b
is “denormalized” by
multiplication by its initial normalizing constant.
The method for optimizing Q is essentially the method first proposed by
Jennrich and Sampson (1966). It proceeds by accumulating simple rotations
where a simple rotation is defined to be one in which Q is optimized for a
given factor in the plane of a second factor, and for which the requirement
that \((T^T T)^{-1}\) be a correlation matrix is satisfied. An iteration
is defined to be such that each of the nFactors(nFactors − 1)
possible simple rotations is performed, where nFactors is the number of
is the number of
factors. When the relative change in Q from one iteration to the next is less
than convergenceEps
(the user-specified convergence criterion), the
algorithm stops. convergenceEps
= 0.0001 is usually sufficient.
Alternatively, the algorithm stops when the user-specified maximum number of
iterations, maxIterations
, is reached. maxIterations
= 30 is usually
sufficient.
The parameter in the rotation, \(\gamma\), is used to provide a family of rotations. Harman (1976) recommends that \(\gamma\) be strictly less than or equal to zero. When \(\gamma=0.0\), a direct quartimin rotation results. Other values of \(\gamma\) yield other rotations. Harman (1976) suggests that the direct quartimin rotations yield the most highly correlated factors while more orthogonal factors result as \(\gamma\) approaches -∞.
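Using the standard direct oblimin form \(Q=\sum_{r<s}\left(\sum_i \lambda_{ir}^2\lambda_{is}^2-\frac{\gamma}{p}\sum_i \lambda_{ir}^2\sum_i \lambda_{is}^2\right)\), the criterion can be evaluated directly. This is an illustrative sketch, not the library's iterative rotation:

```python
# Illustrative sketch: evaluate the direct oblimin criterion for a loading
# matrix. gamma = 0 gives the direct quartimin criterion.
import numpy as np
from itertools import combinations

def oblimin_criterion(loadings, gamma):
    L2 = np.asarray(loadings, dtype=float) ** 2
    p = L2.shape[0]
    q = 0.0
    for r, s in combinations(range(L2.shape[1]), 2):
        q += float(L2[:, r] @ L2[:, s]) - (gamma / p) * L2[:, r].sum() * L2[:, s].sum()
    return q

simple = np.array([[1.0, 0.0],
                   [0.0, 1.0]])          # simple structure: criterion is zero
mixed = np.full((2, 2), np.sqrt(0.5))    # overlapping columns score higher
q_simple = oblimin_criterion(simple, 0.0)
q_mixed = oblimin_criterion(mixed, 0.0)
```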
obliquePromaxRotation
, obliquePivotalPromaxRotation
,
obliqueProcrustesRotation
, optional arguments perform oblique rotations
using the Promax, pivotal Promax, or oblique Procrustes methods. In all of
these methods, a target matrix X is first either computed or specified by
the user. The differences in the methods relate to how the target matrix is
first obtained.
Given a p × k target matrix, X, and a p × k orthogonal matrix of unrotated factor loadings, A, compute the rotation matrix T as follows: First regress each column of A on X yielding a k × k matrix β. Then, let \(\gamma=\mathrm{diag}(\beta^T \beta)\) where diag denotes the diagonal matrix obtained from the diagonal of the square matrix. Standardize β to obtain \(T=\gamma^{-1/2} \beta\). The rotated loadings are computed as \(B= AT\) while the factor correlations can be computed as the inverse of the \(T^TT\) matrix.
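That computation can be sketched directly in NumPy (illustrative names; this reads \(\gamma^{-1/2}\beta\) as a column standardization of β):

```python
# Illustrative sketch of the target-rotation step: regress the columns of A
# on X, column-standardize the k x k coefficient matrix beta, and rotate.
import numpy as np

def oblique_target_rotation(A, X):
    beta, *_ = np.linalg.lstsq(X, A, rcond=None)  # k x k regression coefficients
    gamma = np.diag(beta.T @ beta)                # diag(beta^T beta)
    T = beta / np.sqrt(gamma)                     # standardize the columns of beta
    B = A @ T                                     # rotated loadings
    phi = np.linalg.inv(T.T @ T)                  # factor correlations
    return B, T, phi

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 2))   # unrotated loadings
X = A ** 3                        # promax-style target: cubing keeps the signs
B, T, phi = oblique_target_rotation(A, X)
```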
In the Promax method, the unrotated factor loadings are first rotated
according to an orthomax criterion via optional argument
orthomaxRotation. The target matrix X is taken as the elements of
B raised to a power greater than one but retaining the same sign as the
original loadings. Column i of the rotated matrix B is raised to the
power power[i]. A power of four is commonly used.
Generally, the larger the power, the more oblique the solution.
In the pivotal Promax method, the unrotated matrix is first rotated to an
orthomax orthogonal solution as in the Promax case. Then, rather than
raising the i-th column in B to the power power[i],
the elements \(x_{ij}\) of X are obtained from the elements
\(b_{ij}\) of B by raising the ij element of B to the power
pivot[i]/\(b_{ij}\). This has the effects of greatly
increasing in X those elements in B that are greater in magnitude than
the pivot elements pivot[i], and of greatly decreasing
those elements that are less than pivot[i].
In the oblique Procrustes method, the elements of X are specified by the
user as input to the routine via the target
argument. No orthogonal
rotation is performed in the oblique Procrustes method.
Factor Structure and Variance¶
The factorStructure
optional argument computes the factor structure
matrix (the matrix of correlations between the observed variables and the
hypothesized factors) and the variance explained by each of the factors (for
orthogonal rotations). For oblique rotations, factorStructure
computes a
measure of the importance of the factors, the sum of the squared elements in
each column.
Let Δ denote the diagonal matrix containing the elements of the variance of the original data along its diagonal. The estimated factor structure matrix S is computed as
\(S=\Delta^{-1/2}\hat{\mathit{\Lambda}}\left(T^TT\right)^{-1}\)
while the elements of fvar are computed as the diagonal elements of
\(\hat{\mathit{\Lambda}}^T\Delta^{-1}\hat{\mathit{\Lambda}}\)
If the factors were obtained from a correlation matrix (or the factor variances for standardized variables are desired), then the variances should all be 1.0.
Comments¶
- Function factorAnalysis makes no attempt to solve for nFactors. In general, if nFactors is not known in advance, several different values of nFactors should be used and the most reasonable value kept in the final solution.
- Iterative methods are generally thought to be superior from a theoretical point of view but, in practice, often lead to solutions that differ little from the noniterative methods. For this reason, it is usually suggested that a noniterative method be used in the initial stages of the factor analysis and that the iterative methods be used once issues such as the number of factors have been resolved.
- Initial estimates for the unique variances can be input. If the iterative methods fail for these values, new initial estimates should be tried. These can be obtained by use of another factoring method. (Use the final estimates from the new method as the initial estimates in the old method.)
Examples¶
Example 1¶
In this example, factor analysis is performed for a nine-variable matrix using the default method of unweighted least squares.
from numpy import *
from pyimsl.stat.factorAnalysis import factorAnalysis
from pyimsl.stat.writeMatrix import writeMatrix
covariances = [
[1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
[0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
[0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
[0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
[0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
[0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
[0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
[0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
[0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]
# Perform analysis
a = factorAnalysis(covariances, 3)
# Print results
writeMatrix("Unrotated Loadings", a)
Output¶
Unrotated Loadings
1 2 3
1 0.7018 -0.2316 0.0796
2 0.7200 -0.1372 -0.2082
3 0.5351 -0.2144 -0.2271
4 0.7907 0.4050 0.0070
5 0.6532 0.4221 -0.1046
6 0.7539 0.4842 0.1607
7 0.7127 -0.2819 -0.0701
8 0.4835 -0.2627 0.4620
9 0.8192 -0.3137 -0.0199
Example 2¶
The following data were originally analyzed by Emmett (1949). There are 211 observations on 9 variables. Following Lawley and Maxwell (1971), three factors are obtained by the method of maximum likelihood.
from __future__ import print_function
from numpy import *
from pyimsl.stat.factorAnalysis import factorAnalysis
from pyimsl.stat.writeMatrix import writeMatrix
covariances = [
[1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
[0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
[0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
[0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
[0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
[0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
[0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
[0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
[0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]
evals = []
reliability_coef = []
n_iterations = []
function_min = []
uniq = empty((9), dtype=double)
chsqtest = {}
# Perform analysis
a = factorAnalysis(covariances, 3,
maximumLikelihood=210,
switchExactHessian=0.01,
convergenceEps=0.000001,
maxIterations=30,
maxStepsLineSearch=10,
eigenvalues=evals,
uniqueVariancesOutput=uniq,
chiSquaredTest=chsqtest,
tuckerReliabilityCoefficient=reliability_coef,
nIterations=n_iterations,
functionMin=function_min)
# Print results
writeMatrix("Unrotated Loadings", a)
writeMatrix("Eigenvalues", evals, writeFormat="%6.3f")
writeMatrix("Unique Error Variances", uniq, writeFormat="%6.4f")
print("\n\nchi_squared_df = %d" % chsqtest['df'])
print("chi_squared = %f" % chsqtest['chiSquared'])
print("p_value = %f\n" % chsqtest['pValue'])
print("reliability_coef = %f" % reliability_coef[0])
print("function_min = %f" % function_min[0])
print("n_iterations = %d" % n_iterations[0])
Output¶
chi_squared_df = 12
chi_squared = 7.149363
p_value = 0.847587
reliability_coef = 1.000000
function_min = 0.035017
n_iterations = 5
Unrotated Loadings
1 2 3
1 0.6642 -0.3209 0.0735
2 0.6888 -0.2471 -0.1933
3 0.4926 -0.3022 -0.2224
4 0.8372 0.2924 -0.0354
5 0.7050 0.3148 -0.1528
6 0.8187 0.3767 0.1045
7 0.6615 -0.3960 -0.0777
8 0.4579 -0.2955 0.4913
9 0.7657 -0.4274 -0.0117
Eigenvalues
1 2 3 4 5 6 7 8 9
0.063 0.229 0.541 0.865 0.894 0.974 1.080 1.117 1.140
Unique Error Variances
1 2 3 4 5 6 7 8 9
0.4505 0.4271 0.6166 0.2123 0.3805 0.1769 0.3995 0.4615 0.2309
Example 3¶
This example is a continuation of example 1 and illustrates the use of the
factorStructure
optional argument when the structure and an index of
factor importance for obliquely rotated loadings are desired. A direct
oblimin rotation is used to compute the factors, derived from nine variables
and using \(\gamma=-1\). Note in this example that the elements of
fvar
are not variances since the rotation is oblique.
from numpy import *
from pyimsl.stat.factorAnalysis import factorAnalysis
from pyimsl.stat.writeMatrix import writeMatrix
n_variables = 9
n_factors = 3
w = -1.0
norm = 1
b = []
t = []
fcor = []
oblim = {'w': w, 'norm': norm}
fs = {}
covariances = [
[1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
[0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
[0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
[0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
[0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
[0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
[0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
[0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
[0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]
# Perform analysis
a = factorAnalysis(covariances, 3,
maximumLikelihood=210,
switchExactHessian=0.01,
convergenceEps=0.00001,
maxIterations=30,
maxStepsLineSearch=10,
directObliminRotation=oblim,
factorStructure=fs)
# Print results
writeMatrix('Unrotated Loadings', a)
writeMatrix('Rotated Loadings', oblim['b'])
writeMatrix('Transformation Matrix', oblim['t'])
writeMatrix('Factor Correlation Matrix', oblim['factorCorrelations'])
writeMatrix('Factor Structure', fs['s'])
writeMatrix('Factor Variance', fs['fvar'])
Output¶
Unrotated Loadings
1 2 3
1 0.6642 -0.3209 0.0735
2 0.6888 -0.2471 -0.1933
3 0.4926 -0.3022 -0.2224
4 0.8372 0.2924 -0.0354
5 0.7050 0.3148 -0.1528
6 0.8187 0.3767 0.1045
7 0.6615 -0.3960 -0.0777
8 0.4579 -0.2955 0.4913
9 0.7657 -0.4274 -0.0117
Rotated Loadings
1 2 3
1 0.1128 -0.5144 0.2917
2 0.1847 -0.6602 -0.0018
3 0.0128 -0.6354 -0.0585
4 0.7797 -0.1751 0.0598
5 0.7147 -0.1813 -0.0959
6 0.8520 0.0039 0.1820
7 0.0354 -0.6844 0.1510
8 0.0276 -0.0941 0.6824
9 0.0729 -0.7100 0.2493
Transformation Matrix
1 2 3
1 0.611 -0.462 0.203
2 0.923 0.813 -0.249
3 0.042 0.728 1.050
Factor Correlation Matrix
1 2 3
1 1.000 -0.427 0.217
2 -0.427 1.000 -0.411
3 0.217 -0.411 1.000
Factor Structure
1 2 3
1 0.3958 -0.6824 0.5275
2 0.4662 -0.7383 0.3094
3 0.2714 -0.6169 0.2052
4 0.8675 -0.5326 0.3011
5 0.7713 -0.4471 0.1339
6 0.8899 -0.4347 0.3656
7 0.3605 -0.7616 0.4398
8 0.2161 -0.3861 0.7271
9 0.4302 -0.8435 0.5568
Factor Variance
1 2 3
2.170 2.560 0.914
Warning Errors¶
IMSLS_VARIANCES_INPUT_IGNORED
  When using the principalComponent option, the unique variances are assumed to be zero. Input for uniqueVariancesInput is ignored.
IMSLS_TOO_MANY_ITERATIONS
  Too many iterations. Convergence is assumed.
IMSLS_NO_DEG_FREEDOM
  There are no degrees of freedom for the significance testing.
IMSLS_TOO_MANY_HALVINGS
  Too many step halvings. Convergence is assumed.
IMSLS_NO_ROTATION
  nFactors = 1. No rotation is possible.
IMSLS_SVD_ERROR
  An error occurred in the singular value decomposition of tran(A)X. The rotation matrix, T, may not be correct.
Fatal Errors¶
IMSLS_HESSIAN_NOT_POS_DEF
  The approximate Hessian is not semi-definite on iteration #. The computations cannot proceed. Try using different initial estimates.
IMSLS_FACTOR_EVAL_NOT_POS
  "eigenvalues[#]" = #. An eigenvalue corresponding to a factor is negative or zero. Either use different initial estimates for "uniqueVariancesInput" or reduce the number of factors.
IMSLS_COV_NOT_POS_DEF
  "covariances" is not positive semi-definite. The computations cannot proceed.
IMSLS_COV_IS_SINGULAR
  The matrix "covariances" is singular. The computations cannot continue because variable # is linearly related to the remaining variables.
IMSLS_COV_EVAL_ERROR
  An error occurred in calculating the eigenvalues of the adjusted (inverse) covariance matrix. Check "covariances".
IMSLS_ALPHA_FACTOR_EVAL_NEG
  In alpha factor analysis on iteration #, eigenvalue # is #. As all eigenvalues corresponding to the factors must be positive, either the number of factors must be reduced or new initial estimates for "uniqueVariancesInput" must be given.
IMSLS_RANK_LESS_THAN
  The rank of TRAN(A)target = #. This must be greater than or equal to nFactors = #.