factorAnalysis

Extracts initial factor-loading estimates in factor analysis with rotation options.

Synopsis

factorAnalysis (covariances, nFactors)

Required Arguments

float covariances[[]] (Input)
Array of length nVariables×nVariables containing the variance-covariance or correlation matrix.
int nFactors (Input)
Number of factors in the model.

Return Value

An array of length nVariables×nFactors containing the matrix of factor loadings.

Optional Arguments

maximumLikelihood
Maximum likelihood (common factor model) method used to obtain the estimates. Argument maximumLikelihood is the number of degrees of freedom in covariances.

Default: unweightedLeastSquares

or

principalComponent
Principal component (principal component model) method used to obtain the estimates.

Default: unweightedLeastSquares

or

principalFactor
Principal factor (common factor model) method used to obtain the estimates.

Default: unweightedLeastSquares

or

unweightedLeastSquares
Unweighted least-squares (common factor model) method used to obtain the estimates. This option is the default.

or

generalizedLeastSquares
Generalized least-squares (common factor model) method used to obtain the estimates.

Default: unweightedLeastSquares

or

image
Image-factor analysis (common factor model) method used to obtain the estimates.

Default: unweightedLeastSquares

or

alpha
Alpha-factor analysis (common factor model) method used to obtain the estimates. Argument alpha is the number of degrees of freedom in covariances.

Default: unweightedLeastSquares

uniqueVariancesInput, float[] (Input)

Array of length nVariables containing the initial estimates of the unique variances.

Default: Initial estimates are taken as the constant 1 − nFactors/(2 × nVariables) divided by the diagonal elements of the inverse of covariances.
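
The default initial estimates can be sketched in a few lines of numpy; the correlation matrix and nFactors value here are purely illustrative, not library data:

```python
import numpy as np

# Illustrative correlation matrix (not library data)
S = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
n_variables = S.shape[0]
n_factors = 1

# Default initial unique variances:
# (1 - nFactors/(2 * nVariables)) / diag(inv(covariances))
const = 1.0 - n_factors / (2.0 * n_variables)
psi0 = const / np.diag(np.linalg.inv(S))
```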

uniqueVariancesOutput, float[] (Output)
User-allocated array of length nVariables containing the estimated unique variances.
maxIterations, int (Input)

Maximum number of iterations in the iterative procedure.

Default: maxIterations = 60

maxStepsLineSearch, int (Input)

Maximum number of step halvings allowed during any one iteration.

Default: maxStepsLineSearch = 10

convergenceEps, float (Input)

Convergence criterion used to terminate the iterations. For the unweighted least squares, generalized least squares or maximum likelihood methods, convergence is assumed when the relative change in the criterion is less than convergenceEps. For alpha-factor analysis, convergence is assumed when the maximum change (relative to the variance) of a uniqueness is less than convergenceEps.

Default: convergenceEps = 0.0001

switchExactHessian, float (Input)

Convergence criterion used to switch to exact second derivatives. When the largest relative change in the unique standard deviation vector is less than switchExactHessian, exact second derivative vectors are used. Argument switchExactHessian is not used with the principal component, principal factor, image-factor analysis, or alpha-factor analysis methods.

Default: switchExactHessian = 0.1

eigenvalues (Output)
An array of length nVariables containing the eigenvalues of the matrix from which the factors were extracted.
chiSquaredTest, int df, float chiSquared, float pValue (Output)
Number of degrees of freedom in chi-squared is df; chiSquared is the chi-squared test statistic for testing that nFactors common factors are adequate for the data; pValue is the probability of a greater chi-squared statistic.
tuckerReliabilityCoefficient (Output)
Tucker reliability coefficient.
nIterations (Output)
Number of iterations.
functionMin (Output)
Value of the function minimum.
lastStep (Output)
An array of length nVariables containing the updates of the unique variance estimates when convergence was reached (or the iterations terminated).
orthomaxRotation, float w, int norm, float b, float t (Input/Output)
Nonnegative constant w defines the rotation. If norm = 1, row normalization is performed; otherwise, row normalization is not performed. b contains an array of length nVariables×nFactors containing the rotated factor loading matrix. t contains an array of length nFactors×nFactors containing the rotation transformation matrix. w = 0.0 results in quartimax rotations, w = 1.0 results in varimax rotations, and w = nFactors/2.0 results in equamax rotations. Other nonnegative values of w may also be used, but the best values for w are in the range (0.0, 5 × nFactors).
orthogonalProcrustesRotation, float target, float b, float t (Input/Output)
If specified, the nVariables×nFactors target matrix target will be used to compute an orthogonal Procrustes rotation of the factor-loading matrix. b contains an array of length nVariables×nFactors containing the rotated factor loading matrix. t contains an array of length nFactors×nFactors containing the rotation transformation matrix.
directObliminRotation, float w, int norm, float b, float t, float factorCorrelations (Input/Output)
Computes a direct oblimin rotation. Nonpositive constant w defines the rotation. If norm = 1, row normalization is performed; otherwise, row normalization is not performed. b contains an array of length nVariables×nFactors containing the rotated factor loading matrix. t contains an array of length nFactors×nFactors containing the rotation transformation matrix. factorCorrelations contains an array of length nFactors×nFactors containing the factor correlations. The parameter w determines the type of direct oblimin rotation to be performed. In general, w must be nonpositive. w = 0.0 results in direct quartimin rotations. As w approaches negative infinity, the orthogonality among factors will increase.
obliquePromaxRotation, float w, float power, int norm, float target, float b, float t, float factorCorrelations (Input/Output)

Computes an oblique promax rotation of the factor loading matrix using a power vector. Nonnegative constant w defines the rotation. power, a vector of length nFactors, contains the power vector. If norm = 1, row (Kaiser) normalization is performed; otherwise, row normalization is not performed. b contains an array of length nVariables×nFactors containing the rotated factor loading matrix. t contains an array of length nFactors×nFactors containing the rotation transformation matrix. factorCorrelations contains an array of length nFactors×nFactors containing the factor correlations. target contains an array of length nVariables×nFactors containing the target matrix for rotation, derived from the orthomax rotation. w is used in the orthomax rotation; see the optional argument orthomaxRotation for common values of w.

All power[j] should be greater than 1.0, typically 4.0. Generally, the larger the values of power[j], the more oblique the solution will be.

obliquePivotalPromaxRotation, float w, float pivot, int norm, float target, float b, float t, float factorCorrelations (Input/Output)
Computes an oblique pivotal promax rotation of the factor loading matrix using pivot constants. Nonnegative constant w defines the rotation. pivot, a vector of length nFactors, contains the pivot constants. pivot[j] should be in the interval (0.0, 1.0). If norm = 1, row (Kaiser) normalization is performed; otherwise, row normalization is not performed. b contains an array of length nVariables×nFactors containing the rotated factor loading matrix. t contains an array of length nFactors×nFactors containing the rotation transformation matrix. factorCorrelations contains an array of length nFactors×nFactors containing the factor correlations. target contains an array of length nVariables×nFactors containing the target matrix for rotation, derived from the orthomax rotation. w is used in the orthomax rotation; see the optional argument orthomaxRotation for common values of w.
obliqueProcrustesRotation, float target, float b, float t, float factorCorrelations (Input/Output)
Computes an oblique Procrustes rotation of the factor loading matrix using a target matrix. target is a hypothesized rotated factor loading matrix based upon prior knowledge, with loadings chosen to enhance interpretability. A simple structure solution will have most of the weights target[i][j] either zero or large in magnitude. b contains an array of length nVariables×nFactors containing the rotated factor loading matrix. t contains an array of length nFactors×nFactors containing the rotation transformation matrix. factorCorrelations contains an array of length nFactors×nFactors containing the factor correlations.
factorStructure, float s, float fvar (Output)
Computes the factor structure and the variance explained by each factor. s contains an array of length nVariables×nFactors containing the factor structure matrix. fvar contains an array of length nFactors containing the variance accounted for by each of the nFactors rotated factors. A factor rotation matrix is used to compute the factor structure and the variance. One and only one rotation option argument can be specified.

Description

Function factorAnalysis computes factor loadings in exploratory factor analysis models. Models available in factorAnalysis are the principal component model for factor analysis and the common factor model with additions to the common factor model in alpha-factor analysis and image analysis. Methods of estimation include principal components, principal factor, image analysis, unweighted least squares, generalized least squares, and maximum likelihood.

In the factor analysis model used for factor extraction, the basic model is given as \(\Sigma=\Lambda\Lambda ^T+\Psi\), where Σ is the p × p population covariance matrix, Λ is the p × k matrix of factor loadings relating the factors f to the observed variables x, and Ψ is the p × p matrix of covariances of the unique errors e. Here, p = nVariables and k = nFactors. The relationship between the factors, the unique errors, and the observed variables is given as \(x=\Lambda f+e\), where in addition, the expected values of e, f, and x are assumed to be 0. (The sample means can be subtracted from x if the expected value of x is not 0.) It also is assumed that each factor has unit variance, the factors are independent of each other, and that the factors and the unique errors are mutually independent. In the common factor model, the elements of unique errors e also are assumed to be independent of one another so that the matrix Ψ is diagonal. This is not the case in the principal component model in which the errors may be correlated.
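
As a concrete illustration of the model above (not library code), the following numpy sketch builds Σ = ΛΛᵀ + Ψ from hypothetical loadings and unique variances, then draws one observation x = Λf + e under the stated independence assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
p, k = 5, 2   # p = nVariables, k = nFactors

# Hypothetical loadings and unique variances (illustrative values only)
Lam = rng.uniform(0.3, 0.8, size=(p, k))       # Lambda: p x k factor loadings
Psi = np.diag(rng.uniform(0.2, 0.5, size=p))   # Psi: diagonal (common factor model)

# Covariance structure implied by the model: Sigma = Lambda Lambda^T + Psi
Sigma = Lam @ Lam.T + Psi

# One draw from the model: x = Lambda f + e, with unit-variance factors and
# factors and unique errors mutually independent
f = rng.standard_normal(k)
e = rng.multivariate_normal(np.zeros(p), Psi)
x = Lam @ f + e
```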

Further differences between the various methods concern the criterion that is optimized and the amount of computer effort required to obtain estimates. Generally speaking, the least-squares and maximum likelihood methods, which use iterative algorithms, require the most computer time with the principal factor, principal component and the image methods requiring much less time since the algorithms in these methods are not iterative. The algorithm in alpha-factor analysis is also iterative, but the estimates in this method generally require somewhat less computer effort than the least-squares and maximum likelihood estimates. In all methods, one eigensystem analysis is required on each iteration.

Principal Component and Principal Factor Methods

Both the principal component and principal factor methods compute the factor-loading estimates as

\[\hat{\mathit{\Gamma}} \hat{\mathit{\Delta}}^{1/2}\]

where Γ and the diagonal matrix Δ are the eigenvectors and eigenvalues of a matrix. In the principal component model, the eigensystem analysis is performed on the sample covariance (correlation) matrix S, while in the principal factor model, the matrix \(S - \hat{\mathit{\Psi}}\) is used. If the unique error variances Ψ are not known in the principal factor model, then factorAnalysis obtains estimates for them.
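
For the principal component case, the eigensystem computation can be sketched directly with numpy; the matrix S and the choice k = 2 below are illustrative assumptions, not library data:

```python
import numpy as np

# Illustrative sample correlation matrix S
S = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
k = 2   # number of factors to retain

vals, vecs = np.linalg.eigh(S)           # eigenvalues in ascending order
order = np.argsort(vals)[::-1]           # sort descending
delta = vals[order][:k]                  # k largest eigenvalues
gamma = vecs[:, order][:, :k]            # matching eigenvectors

# Principal component loadings: Gamma Delta^{1/2}
loadings = gamma * np.sqrt(delta)
```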

The basic idea in the principal component method is to find factors that maximize the variance in the original data that is explained by the factors. Because this method allows the unique errors to be correlated, some factor analysts insist that the principal component method is not a factor analytic method. Usually, however, the estimates obtained by the principal component model and factor analysis model will be quite similar.

It should be noted that both the principal component and principal factor methods give different results when the correlation matrix is used in place of the covariance matrix. Indeed, any rescaling of the sample covariance matrix can lead to different estimates with either of these methods. A further difficulty with the principal factor method is the problem of estimating the unique error variances. Theoretically, these must be known in advance and be passed to factorAnalysis using optional argument uniqueVariancesInput. In practice, the estimates of these parameters are produced by factorAnalysis when uniqueVariancesInput is not specified. In either case, the resulting adjusted covariance (correlation) matrix

\[S - \hat{\psi}\]

may not yield the nFactors positive eigenvalues required for nFactors factors to be obtained. If this occurs, the user must either lower the number of factors to be estimated or give new unique error variance values.

Least-squares and Maximum Likelihood Methods

Unlike the previous two methods, the algorithm used to compute estimates in this section is iterative (see Jöreskog 1977). As with the principal factor model, the user may either initialize the unique error variances or allow factorAnalysis to compute initial estimates. Unlike the principal factor method, factorAnalysis optimizes the criterion function with respect to both Ψ and Γ. (In the principal factor method, Ψ is assumed to be known. Given Ψ, estimates for Λ may be obtained.)

The major difference between the methods discussed in this section is in the criterion function that is optimized. Let S denote the sample covariance (correlation) matrix, and let Σ denote the covariance matrix that is to be estimated by the factor model. In the unweighted least-squares method, also called the iterated principal factor method or the minres method (see Harman 1976, p. 177), the function minimized is the sum of squared differences between S and Σ. This is written as \(\Phi_{ul} = 0.5\,\mathrm{trace}\left((S-\Sigma)^2\right)\).

Generalized least-squares and maximum likelihood estimates are asymptotically equivalent. Maximum likelihood estimates maximize the (normal theory) likelihood, which is equivalent to minimizing the criterion \(\Phi_{ml} = \mathrm{trace}\left(\Sigma^{-1}S\right) - \log\left(\left|\Sigma^{-1}S\right|\right)\), while generalized least squares minimizes the function \(\Phi_{gs} = \mathrm{trace}\left(\left(\Sigma S^{-1} - I\right)^2\right)\).
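
The three criteria can be evaluated directly with numpy. This sketch only illustrates the formulas (factorAnalysis performs the minimization internally); note that at a perfect fit, Φul and Φgs are zero while Φml equals p:

```python
import numpy as np

def criteria(S, Sigma):
    """Evaluate the three factor-extraction criteria for a trial Sigma
    against the sample covariance (correlation) matrix S."""
    R = np.linalg.solve(Sigma, S)                       # Sigma^{-1} S
    phi_ul = 0.5 * np.trace((S - Sigma) @ (S - Sigma))
    phi_ml = np.trace(R) - np.log(np.linalg.det(R))
    M = Sigma @ np.linalg.inv(S) - np.eye(S.shape[0])
    phi_gs = np.trace(M @ M)
    return phi_ul, phi_ml, phi_gs

# Illustrative 2x2 correlation matrix; at Sigma = S the fit is exact
S = np.array([[1.0, 0.4],
              [0.4, 1.0]])
phi_ul, phi_ml, phi_gs = criteria(S, S)
```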

In all three methods, a two-stage optimization procedure is used. This proceeds by first solving the likelihood equations for Λ in terms of Ψ and substituting the solution into the likelihood. This gives a criterion \(\phi(\Psi,\Lambda(\Psi))\), which is optimized with respect to Ψ. In the second stage, the estimates \(\hat{\mathit{\Lambda}}\) are obtained from the estimates for Ψ.

The generalized least-squares and maximum likelihood methods allow for the computation of a statistic (chiSquaredTest) for testing that nFactors common factors are adequate to fit the model. This is a chi-squared test that all remaining parameters associated with additional factors are 0. If the probability of a larger chi-squared is so small that the null hypothesis is rejected, then additional factors are needed (although these factors may not be of any practical importance). Failure to reject does not legitimize the model. The statistic chiSquaredTest is a likelihood ratio statistic in maximum likelihood estimation. As such, it asymptotically follows a chi-squared distribution with degrees of freedom given by df.

The Tucker and Lewis reliability coefficient, ρ, is returned by tuckerReliabilityCoefficient when the maximum likelihood or generalized least-squares methods are used. This coefficient is an estimate of the ratio of explained variation to the total variation in the data. It is computed as follows:

\[\rho = \frac{mM_0 - mM_k}{mM_0 - 1}\]
\[m = d - \frac{2p+5}{6} - \frac{2k}{3}\]
\[M_0 = \frac{-\ln(|S|)}{p(p-1)/2}\]
\[M_k = \frac{\phi}{\left(\left(p-k\right)^2-p-k\right)/2}\]

where |S| is the determinant of covariances, p = nVariables, k = nFactors, ϕ is the optimized criterion, and d is the number of degrees of freedom in covariances (the value passed via maximumLikelihood).
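
These formulas can be transcribed directly; the equicorrelation matrix and the inputs below are illustrative assumptions, not library output:

```python
import numpy as np

def tucker_lewis(S, phi, d, k):
    """Tucker-Lewis reliability coefficient from the formulas above.
    S: sample covariance/correlation matrix, phi: optimized criterion,
    d: degrees of freedom in S, k: number of factors."""
    p = S.shape[0]
    m = d - (2 * p + 5) / 6.0 - 2.0 * k / 3.0
    M0 = -np.log(np.linalg.det(S)) / (p * (p - 1) / 2.0)
    Mk = phi / (((p - k) ** 2 - p - k) / 2.0)
    return (m * M0 - m * Mk) / (m * M0 - 1.0)

# Hypothetical inputs (illustrative values only)
S = np.full((5, 5), 0.3) + 0.7 * np.eye(5)   # 5x5 equicorrelation matrix
rho = tucker_lewis(S, phi=0.05, d=100, k=1)
```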

Image Analysis Method

The term image analysis is used here to denote the noniterative image method of Kaiser (1963). It is not the image analysis discussed by Harman (1976, p. 226). The image method (as well as the alpha-factor analysis method) begins with the notion that only a finite number from an infinite number of possible variables have been measured. The image factor pattern is calculated under the assumption that the ratio of the number of factors to the number of observed variables is near 0, so that a very good estimate for the unique error variances (for standardized variables) is given as 1 minus the squared multiple correlation of the variable under consideration with all variables in the covariance matrix.

First, the matrix \(D^2=(\mathrm{diag} (S^{-1}))^{-1}\) is computed where the operator “diag” results in a matrix consisting of the diagonal elements of its argument and S is the sample covariance (correlation) matrix. Then, the eigenvalues Λ and eigenvectors Γ of the matrix \(D^{-1} SD^{-1}\) are computed. Finally, the unrotated image-factor pattern is computed as \(D \Gamma[(\Lambda-I)^2 \Lambda^{-1} ]^{1/2}\).
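
The three steps above translate directly to numpy; the matrix S is an illustrative assumption:

```python
import numpy as np

# Illustrative sample correlation matrix
S = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])

# Step 1: D^2 = (diag(S^{-1}))^{-1}
D = np.diag(np.sqrt(1.0 / np.diag(np.linalg.inv(S))))
Dinv = np.linalg.inv(D)

# Step 2: eigenvalues Lambda and eigenvectors Gamma of D^{-1} S D^{-1}
lam, gamma = np.linalg.eigh(Dinv @ S @ Dinv)

# Step 3: unrotated image-factor pattern D Gamma [(Lambda - I)^2 Lambda^{-1}]^{1/2}
scale = np.sqrt((lam - 1.0) ** 2 / lam)
pattern = (D @ gamma) * scale
```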

Alpha-factor Analysis Method

The alpha-factor analysis method of Kaiser and Caffrey (1965) finds factor-loading estimates to maximize the correlation between the factors and the complete universe of variables of interest. The basic idea in this method is that only a finite number of variables out of a much larger set of possible variables is observed. The population factors are linearly related to this larger set, while the observed factors are linearly related to the observed variables. Let f denote the factors obtainable from a finite set of observed random variables, and let ξ denote the factors obtainable from the universe of observable variables. Then, the alpha method attempts to find factor-loading estimates so as to maximize the correlation between f and ξ. In order to obtain these estimates, the iterative algorithm of Kaiser and Caffrey (1965) is used.

Rotation Methods

The orthomaxRotation optional argument performs an orthogonal rotation according to an orthomax criterion. In this analytic method of rotation, the criterion function

\[Q = \sum_i \sum_r \lambda_{ir}^4 - \frac{\gamma}{p}\sum_r \left[\sum_i \lambda_{ir}^2\right]^2\]

is minimized by finding an orthogonal rotation matrix T such that \((\lambda_{ij})=\Lambda=AT\) where A is the matrix of unrotated factor loadings. Here, \(\gamma\geq 0\) is a user-specified constant (w) yielding a family of rotations, and p is the number of variables.

Kaiser (row) normalization can be performed on the factor loadings prior to rotation by specifying the parameter norm =1. In Kaiser normalization, the rows of A are first “normalized” by dividing each row by the square root of the sum of its squared elements (Harman 1976). After the rotation is complete, each row of b is “denormalized” by multiplication by its initial normalizing constant.
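
The normalize/denormalize bracketing described above can be sketched as follows (the loadings are illustrative, and the rotation step is elided):

```python
import numpy as np

# Illustrative unrotated loadings (not library output)
A = np.array([[0.7, -0.2],
              [0.6,  0.3],
              [0.5, -0.4]])

# Kaiser (row) normalization: divide each row by the square root of the
# sum of its squared elements
h = np.sqrt((A ** 2).sum(axis=1))
A_norm = A / h[:, None]

# ... an orthogonal rotation would be applied to A_norm here ...

# Denormalization: multiply each row by its initial normalizing constant
B = A_norm * h[:, None]
```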

The method for optimizing Q proceeds by accumulating simple rotations where a simple rotation is defined to be one in which Q is optimized for two columns in Λ and for which the requirement that T be orthogonal is satisfied. A single iteration is defined to be such that each of the nFactors(nFactors - 1)/2 possible simple rotations is performed where nFactors is the number of factors. When the relative change in Q from one iteration to the next is less than convergenceEps (the user-specified convergence criterion), the algorithm stops. convergenceEps = 0.0001 is usually sufficient. Alternatively, the algorithm stops when the user-specified maximum number of iterations, maxIterations, is reached. maxIterations = 30 is usually sufficient.

The parameter in the rotation, \(\gamma\), is used to provide a family of rotations. When \(\gamma=0.0\), a direct quartimax rotation results. Other values of \(\gamma\) yield other rotations.
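
For orientation, here is a common SVD-based formulation of orthomax rotation (γ = 1.0 gives varimax). It is not the pairwise simple-rotation algorithm described above, only a compact sketch that seeks a stationary point of the same criterion family; the loadings are illustrative:

```python
import numpy as np

def orthomax(A, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthomax rotation of loadings A; gamma = 1.0 gives varimax,
    gamma = 0.0 quartimax, gamma = k/2 equamax."""
    p, k = A.shape
    T = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = A @ T
        # SVD-based update: A^T (L^3 - (gamma/p) L diag(column sums of L^2))
        u, s, vt = np.linalg.svd(
            A.T @ (L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0))))
        T = u @ vt           # nearest orthogonal matrix
        d_new = s.sum()
        if d_new < d * (1.0 + tol):
            break
        d = d_new
    return A @ T, T

# Illustrative unrotated loadings (not library output)
A = np.array([[0.7, -0.2],
              [0.6,  0.3],
              [0.5, -0.4]])
B, T = orthomax(A, gamma=1.0)
```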

The orthogonalProcrustesRotation optional argument performs orthogonal Procrustes rotation according to a method proposed by Schöneman (1966). Let k = nFactors denote the number of factors, p = nVariables denote the number of variables, A denote the p × k matrix of unrotated factor loadings, T denote the k × k orthogonal rotation matrix (orthogonality requires that \(T^TT\) be a k × k identity matrix), and let X denote the target matrix. The basic idea in orthogonal Procrustes rotation is to find an orthogonal rotation matrix T such that \(B=AT\) and T provides a least-squares fit between the target matrix X and the rotated loading matrix B. Schöneman’s algorithm proceeds by finding the singular value decomposition of the matrix \(A^TX=U\Sigma V^T\). The rotation matrix is computed as \(T=UV^T\).
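
Schöneman's procedure is short enough to sketch directly in numpy; here the target X is constructed from a known rotation of illustrative loadings so the recovered T can be checked:

```python
import numpy as np

rng = np.random.default_rng(1)
p, k = 6, 2

# Illustrative unrotated loadings A, and a target X = A Q for a known
# orthogonal rotation Q (assumptions for this sketch)
A = rng.standard_normal((p, k))
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = A @ Q

# Schoneman's solution: SVD of A^T X = U Sigma V^T, then T = U V^T
U, s, Vt = np.linalg.svd(A.T @ X)
T = U @ Vt
B = A @ T   # rotated loadings, least-squares fit to the target X
```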

The directObliminRotation optional argument performs direct oblimin rotation. In this analytic method of rotation, the criterion function

\[Q = \sum_{r \neq s} \left[\sum_i \lambda_{ir}^2 \lambda_{is}^2 - \frac{\gamma}{p} \sum_i \lambda_{ir}^2 \sum_i \lambda_{is}^2\right]\]

is minimized by finding a rotation matrix T such that \((\lambda_{ir})=\Lambda= AT\) and \((T^TT)^{-1}\) is a correlation matrix. Here, \(\gamma\leq 0\) is a user-specified constant (w) yielding a family of rotations, and p is the number of variables. The rotation is said to be direct because it minimizes Q with respect to the factor loadings directly, ignoring the reference structure.

Kaiser normalization can be performed on the factor loadings prior to rotation via the parameter norm. In Kaiser normalization (see Harman 1976), the rows of the factor loading matrix are first “normalized” by dividing each row by the square root of the sum of its squared elements. After the rotation is complete, each row of b is “denormalized” by multiplication by its initial normalizing constant.

The method for optimizing Q is essentially the method first proposed by Jennrich and Sampson (1966). It proceeds by accumulating simple rotations, where a simple rotation is defined to be one in which Q is optimized for a given factor in the plane of a second factor, and for which the requirement that \((T^T T)^{-1}\) be a correlation matrix is satisfied. An iteration is defined to be such that each of the nFactors(nFactors − 1) possible simple rotations is performed, where nFactors is the number of factors. When the relative change in Q from one iteration to the next is less than convergenceEps (the user-specified convergence criterion), the algorithm stops. convergenceEps = 0.0001 is usually sufficient. Alternatively, the algorithm stops when the user-specified maximum number of iterations, maxIterations, is reached. maxIterations = 30 is usually sufficient.

The parameter in the rotation, \(\gamma\), is used to provide a family of rotations. Harman (1976) recommends that \(\gamma\) be strictly less than or equal to zero. When \(\gamma=0.0\), a direct quartimin rotation results. Other values of \(\gamma\) yield other rotations. Harman (1976) suggests that the direct quartimin rotations yield the most highly correlated factors while more orthogonal factors result as \(\gamma\) approaches -∞.

The obliquePromaxRotation, obliquePivotalPromaxRotation, and obliqueProcrustesRotation optional arguments perform oblique rotations using the Promax, pivotal Promax, or oblique Procrustes methods. In all of these methods, a target matrix X is first either computed or specified by the user. The differences in the methods relate to how the target matrix is first obtained.

Given a p × k target matrix, X, and a p × k orthogonal matrix of unrotated factor loadings, A, compute the rotation matrix T as follows: First regress each column of A on X yielding a k × k matrix β. Then, let \(\gamma=\mathrm{diag}(\beta^T \beta)\) where diag denotes the diagonal matrix obtained from the diagonal of the square matrix. Standardize β to obtain \(T=\gamma^{-1/2} \beta\). The rotated loadings are computed as \(B= AT\) while the factor correlations can be computed as the inverse of the \(T^TT\) matrix.
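
A literal numpy transcription of these steps (the loadings A and target X are random illustrative matrices, not library data):

```python
import numpy as np

rng = np.random.default_rng(2)
p, k = 6, 2
A = rng.standard_normal((p, k))   # unrotated loadings (illustrative)
X = rng.standard_normal((p, k))   # target matrix (illustrative)

# Regress each column of A on X: beta = (X^T X)^{-1} X^T A, a k x k matrix
beta = np.linalg.lstsq(X, A, rcond=None)[0]

# gamma = diag(beta^T beta), then standardize: T = gamma^{-1/2} beta
gamma = np.diag(beta.T @ beta)
T = np.diag(gamma ** -0.5) @ beta

B = A @ T                              # rotated loadings
phi = np.linalg.inv(T.T @ T)           # factor correlations
```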

In the Promax method, the unrotated factor loadings are first rotated according to an orthomax criterion via optional argument orthomaxRotation. The target matrix X is taken as the elements of B raised to a power greater than one, but retaining the same sign as the original loadings. Column i of the rotated matrix B is raised to the power power[i]. A power of four is commonly used. Generally, the larger the power, the more oblique the solution.

In the pivotal Promax method, the unrotated matrix is first rotated to an orthomax orthogonal solution as in the Promax case. Then, rather than raising the i-th column in B to the power pivot[i], the elements \(x_{ij}\) of X are obtained from the elements \(b_{ij}\) of B by raising the ij element of B to the power pivot[i]/\(b_{ij}\). This has the effects of greatly increasing in X those elements in B that are greater in magnitude than the pivot elements pivot[i], and of greatly decreasing those elements that are less than pivot[i].
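
The two target constructions can be sketched as follows; magnitudes are used with the sign restored afterwards so that fractional powers of negative loadings stay well defined (an implementation detail assumed for this sketch, not taken from the source), and the loadings, power, and pivot values are illustrative:

```python
import numpy as np

# Illustrative orthomax-rotated loadings B (not library output)
B = np.array([[ 0.70, -0.20],
              [ 0.65,  0.30],
              [-0.10,  0.80]])

# Promax target: raise |b_ij| to power[i] for column i, keep the sign
power = np.array([4.0, 4.0])
X_promax = np.sign(B) * np.abs(B) ** power

# Pivotal Promax target: element-wise exponent pivot[i] / |b_ij|;
# entries above the pivot grow, entries below it shrink
pivot = np.array([0.5, 0.5])
X_pivotal = np.sign(B) * np.abs(B) ** (pivot / np.abs(B))
```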

In the oblique Procrustes method, the elements of X are specified by the user as input to the routine via the target argument. No orthogonal rotation is performed in the oblique Procrustes method.

Factor Structure and Variance

The factorStructure optional argument computes the factor structure matrix (the matrix of correlations between the observed variables and the hypothesized factors) and the variance explained by each of the factors (for orthogonal rotations). For oblique rotations, factorStructure computes a measure of the importance of the factors, the sum of the squared elements in each column.

Let Δ denote the diagonal matrix containing the elements of the variance of the original data along its diagonal. The estimated factor structure matrix S is computed as

\[S = \mathit{\Delta}^{-\frac{1}{2}} A\left(T^{-1}\right)^T\]

while the elements of fvar are computed as the diagonal elements of

\[S^T \mathit{\Delta}^{\frac{1}{2}} AT\]

If the factors were obtained from a correlation matrix (or the factor variances for standardized variables are desired), then the variances should all be 1.0.
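
For an orthogonal rotation of standardized variables, these formulas reduce to S = AT, with fvar equal to the column sums of squares of the rotated loadings, as the following sketch (with illustrative A and T) shows:

```python
import numpy as np

# Illustrative unrotated loadings A, orthogonal rotation T, and variances
A = np.array([[0.7, -0.2],
              [0.6,  0.3],
              [0.5, -0.4]])
T = np.array([[0.8, -0.6],
              [0.6,  0.8]])          # orthogonal: T^T T = I
var = np.ones(3)                     # standardized variables

Dm = np.diag(var ** -0.5)            # Delta^{-1/2}
Dp = np.diag(var ** 0.5)             # Delta^{1/2}

# Factor structure matrix: S = Delta^{-1/2} A (T^{-1})^T
S = Dm @ A @ np.linalg.inv(T).T

# Variance explained by each factor: diag(S^T Delta^{1/2} A T)
fvar = np.diag(S.T @ Dp @ A @ T)
```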

Comments

  1. Function factorAnalysis makes no attempt to solve for nFactors. In general, if nFactors is not known in advance, several different values of nFactors should be used and the most reasonable value kept in the final solution.
  2. Iterative methods are generally thought to be superior from a theoretical point of view, but in practice, often lead to solutions that differ little from the noniterative methods. For this reason, it is usually suggested that a noniterative method be used in the initial stages of the factor analysis and that the iterative methods be used when issues such as the number of factors have been resolved.
  3. Initial estimates for the unique variances can be input. If the iterative methods fail for these values, new initial estimates should be tried. These can be obtained by use of another factoring method. (Use the final estimates from the new method as the initial estimates in the old method.)

Examples

Example 1

In this example, factor analysis is performed for a nine-variable matrix using the default method of unweighted least squares.

from pyimsl.stat.factorAnalysis import factorAnalysis
from pyimsl.stat.writeMatrix import writeMatrix

covariances = [
    [1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
    [0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
    [0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
    [0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
    [0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
    [0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
    [0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
    [0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
    [0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]

# Perform analysis
a = factorAnalysis(covariances, 3)

# Print results
writeMatrix("Unrotated Loadings", a)

Output

 
           Unrotated Loadings
             1            2            3
1       0.7018      -0.2316       0.0796
2       0.7200      -0.1372      -0.2082
3       0.5351      -0.2144      -0.2271
4       0.7907       0.4050       0.0070
5       0.6532       0.4221      -0.1046
6       0.7539       0.4842       0.1607
7       0.7127      -0.2819      -0.0701
8       0.4835      -0.2627       0.4620
9       0.8192      -0.3137      -0.0199

Example 2

The following data were originally analyzed by Emmett (1949). There are 211 observations on 9 variables. Following Lawley and Maxwell (1971), three factors are obtained by the method of maximum likelihood.

from numpy import empty, double
from pyimsl.stat.factorAnalysis import factorAnalysis
from pyimsl.stat.writeMatrix import writeMatrix

covariances = [
    [1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
    [0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
    [0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
    [0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
    [0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
    [0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
    [0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
    [0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
    [0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]
evals = []
reliability_coef = []
n_iterations = []
function_min = []
uniq = empty((9), dtype=double)
chsqtest = {}

# Perform analysis
a = factorAnalysis(covariances, 3,
                   maximumLikelihood=210,
                   switchExactHessian=0.01,
                   convergenceEps=0.000001,
                   maxIterations=30,
                   maxStepsLineSearch=10,
                   eigenvalues=evals,
                   uniqueVariancesOutput=uniq,
                   chiSquaredTest=chsqtest,
                   tuckerReliabilityCoefficient=reliability_coef,
                   nIterations=n_iterations,
                   functionMin=function_min)
# Print results
writeMatrix("Unrotated Loadings", a)
writeMatrix("Eigenvalues", evals, writeFormat="%6.3f")
writeMatrix("Unique Error Variances", uniq, writeFormat="%6.4f")
print("\n\nchi_squared_df =    %d" % chsqtest['df'])
print("chi_squared =       %f" % chsqtest['chiSquared'])
print("p_value =           %f\n" % chsqtest['pValue'])
print("reliability_coef = %f" % reliability_coef[0])
print("function_min =      %f" % function_min[0])
print("n_iterations =      %d" % n_iterations[0])

Output

chi_squared_df =    12
chi_squared =       7.149363
p_value =           0.847587

reliability_coef = 1.000000
function_min =      0.035017
n_iterations =      5
 
           Unrotated Loadings
             1            2            3
1       0.6642      -0.3209       0.0735
2       0.6888      -0.2471      -0.1933
3       0.4926      -0.3022      -0.2224
4       0.8372       0.2924      -0.0354
5       0.7050       0.3148      -0.1528
6       0.8187       0.3767       0.1045
7       0.6615      -0.3960      -0.0777
8       0.4579      -0.2955       0.4913
9       0.7657      -0.4274      -0.0117
 
                              Eigenvalues
     1       2       3       4       5       6       7       8       9
 0.063   0.229   0.541   0.865   0.894   0.974   1.080   1.117   1.140
 
                        Unique Error Variances
     1       2       3       4       5       6       7       8       9
0.4505  0.4271  0.6166  0.2123  0.3805  0.1769  0.3995  0.4615  0.2309

Example 3

This example is a continuation of Example 1 and illustrates the use of the factorStructure optional argument when the structure matrix and an index of factor importance for obliquely rotated loadings are desired. A direct oblimin rotation with \(\gamma=-1\) is applied to the three factors extracted from the nine variables. Note that in this example the elements of fvar are not variances, since the rotation is oblique.


from numpy import *
from pyimsl.stat.factorAnalysis import factorAnalysis
from pyimsl.stat.writeMatrix import writeMatrix

n_variables = 9
n_factors = 3
w = -1.0
norm = 1
oblim = {'w': w, 'norm': norm}
fs = {}

covariances = [
    [1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
    [0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
    [0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
    [0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
    [0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
    [0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
    [0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
    [0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
    [0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]

# Perform analysis
a = factorAnalysis(covariances, 3,
                   maximumLikelihood=210,
                   switchExactHessian=0.01,
                   convergenceEps=0.00001,
                   maxIterations=30,
                   maxStepsLineSearch=10,
                   directObliminRotation=oblim,
                   factorStructure=fs)

# Print results
writeMatrix('Unrotated Loadings', a)
writeMatrix('Rotated Loadings', oblim['b'])
writeMatrix('Transformation Matrix', oblim['t'])
writeMatrix('Factor Correlation Matrix', oblim['factorCorrelations'])
writeMatrix('Factor Structure', fs['s'])
writeMatrix('Factor Variance', fs['fvar'])

Output

 
           Unrotated Loadings
             1            2            3
1       0.6642      -0.3209       0.0735
2       0.6888      -0.2471      -0.1933
3       0.4926      -0.3022      -0.2224
4       0.8372       0.2924      -0.0354
5       0.7050       0.3148      -0.1528
6       0.8187       0.3767       0.1045
7       0.6615      -0.3960      -0.0777
8       0.4579      -0.2955       0.4913
9       0.7657      -0.4274      -0.0117
 
            Rotated Loadings
             1            2            3
1       0.1128      -0.5144       0.2917
2       0.1847      -0.6602      -0.0018
3       0.0128      -0.6354      -0.0585
4       0.7797      -0.1751       0.0598
5       0.7147      -0.1813      -0.0959
6       0.8520       0.0039       0.1820
7       0.0354      -0.6844       0.1510
8       0.0276      -0.0941       0.6824
9       0.0729      -0.7100       0.2493
 
          Transformation Matrix
             1            2            3
1        0.611       -0.462        0.203
2        0.923        0.813       -0.249
3        0.042        0.728        1.050
 
        Factor Correlation Matrix
             1            2            3
1        1.000       -0.427        0.217
2       -0.427        1.000       -0.411
3        0.217       -0.411        1.000
 
            Factor Structure
             1            2            3
1       0.3958      -0.6824       0.5275
2       0.4662      -0.7383       0.3094
3       0.2714      -0.6169       0.2052
4       0.8675      -0.5326       0.3011
5       0.7713      -0.4471       0.1339
6       0.8899      -0.4347       0.3656
7       0.3605      -0.7616       0.4398
8       0.2161      -0.3861       0.7271
9       0.4302      -0.8435       0.5568
 
           Factor Variance
          1            2            3
      2.170        2.560        0.914
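For an oblique rotation, the structure matrix is the rotated (pattern) loadings postmultiplied by the factor correlation matrix, \(S = B\Phi\). A quick NumPy sanity check of the first row, using the rounded values from the output above:

```python
import numpy as np

# First row of the rotated loadings B and the factor correlation
# matrix phi, copied from the example output above.
b_row1 = np.array([0.1128, -0.5144, 0.2917])
phi = np.array([[1.000, -0.427, 0.217],
                [-0.427, 1.000, -0.411],
                [0.217, -0.411, 1.000]])

# S = B @ phi reproduces the first row of the factor structure.
print(b_row1 @ phi)  # approximately [0.3958, -0.6824, 0.5275]
```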

Warning Errors

IMSLS_VARIANCES_INPUT_IGNORED When using the principalComponent option, the unique variances are assumed to be zero. Input for uniqueVariancesInput is ignored.
IMSLS_TOO_MANY_ITERATIONS Too many iterations. Convergence is assumed.
IMSLS_NO_DEG_FREEDOM There are no degrees of freedom for the significance testing.
IMSLS_TOO_MANY_HALVINGS Too many step halvings. Convergence is assumed.
IMSLS_NO_ROTATION nFactors = 1. No rotation is possible.
IMSLS_SVD_ERROR An error occurred in the singular value decomposition of tran(A)X. The rotation matrix, T, may not be correct.

Fatal Errors

IMSLS_HESSIAN_NOT_POS_DEF The approximate Hessian is not semi-definite on iteration #. The computations cannot proceed. Try using different initial estimates.
IMSLS_FACTOR_EVAL_NOT_POS “eigenvalues[#]” = #. An eigenvalue corresponding to a factor is negative or zero. Either use different initial estimates for “uniqueVariancesInput” or reduce the number of factors.
IMSLS_COV_NOT_POS_DEF “covariances” is not positive semi-definite. The computations cannot proceed.
IMSLS_COV_IS_SINGULAR The matrix “covariances” is singular. The computations cannot continue because variable # is linearly related to the remaining variables.
IMSLS_COV_EVAL_ERROR An error occurred in calculating the eigenvalues of the adjusted (inverse) covariance matrix. Check “covariances.”
IMSLS_ALPHA_FACTOR_EVAL_NEG In alpha factor analysis on iteration #, eigenvalue # is #. As all eigenvalues corresponding to the factors must be positive, either the number of factors must be reduced or new initial estimates for “uniqueVariancesInput” must be given.
IMSLS_RANK_LESS_THAN The rank of TRAN(A)target = #. This must be greater than or equal to nFactors = #.
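Several of the fatal errors above (IMSLS_COV_NOT_POS_DEF, IMSLS_COV_IS_SINGULAR) stem from an ill-conditioned input matrix. A minimal pre-check sketch using plain NumPy (this check is not part of the factorAnalysis API) can flag such a matrix before the call:

```python
import numpy as np

# Correlation matrix from Example 3 above.
covariances = np.array([
    [1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
    [0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
    [0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
    [0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
    [0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
    [0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
    [0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
    [0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
    [0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]])

# eigvalsh is appropriate for symmetric matrices; every eigenvalue
# must be strictly positive for the matrix to be positive definite.
evals = np.linalg.eigvalsh(covariances)
if evals.min() <= 0.0:
    raise ValueError("covariances is not positive definite")
```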