factor_analysis

Chapter 9: Multivariate Analysis

Extracts initial factor-loading estimates in factor analysis with rotation options.

Synopsis

#include <imsls.h>

float *imsls_f_factor_analysis (int n_variables, float covariances[], int n_factors, ..., 0)

The type double function is imsls_d_factor_analysis.

Required Arguments

int n_variables (Input)
Number of variables.

float covariances[] (Input)
Array of length n_variables*n_variables containing the variance-covariance or correlation matrix.

int n_factors (Input)
Number of factors in the model.

Return Value

An array of length n_variables*n_factors containing the matrix of factor loadings.

Synopsis with Optional Arguments

#include <imsls.h>

float *imsls_f_factor_analysis (int n_variables, float covariances[], int n_factors,
IMSLS_MAXIMUM_LIKELIHOOD, int df_covariances, or
IMSLS_PRINCIPAL_COMPONENT, or
IMSLS_PRINCIPAL_FACTOR, or
IMSLS_UNWEIGHTED_LEAST_SQUARES, or
IMSLS_GENERALIZED_LEAST_SQUARES, int df_covariances, or
IMSLS_IMAGE, or
IMSLS_ALPHA, int df_covariances,
IMSLS_UNIQUE_VARIANCES_INPUT, float unique_variances[],
IMSLS_UNIQUE_VARIANCES_OUTPUT, float unique_variances[],
IMSLS_MAX_ITERATIONS, int max_iterations,
IMSLS_MAX_STEPS_LINE_SEARCH, int max_steps_line_search,
IMSLS_CONVERGENCE_EPS, float convergence_eps,
IMSLS_SWITCH_EXACT_HESSIAN, float switch_epsilon,
IMSLS_EIGENVALUES, float **eigenvalues,
IMSLS_EIGENVALUES_USER, float eigenvalues[],
IMSLS_CHI_SQUARED_TEST, int *df, float *chi_squared, float *p_value,
IMSLS_TUCKER_RELIABILITY_COEFFICIENT, float *coefficient,
IMSLS_N_ITERATIONS, int *n_iterations,
IMSLS_FUNCTION_MIN, float *function_min,
IMSLS_LAST_STEP, float **last_step,
IMSLS_LAST_STEP_USER, float last_step[],
IMSLS_ORTHOMAX_ROTATION, float w, int norm, float **b, float **t,
IMSLS_ORTHOMAX_ROTATION_USER, float w, int norm, float b[], float t[],
IMSLS_ORTHOGONAL_PROCRUSTES_ROTATION, float target[], float **b,
float **t,
IMSLS_ORTHOGONAL_PROCRUSTES_ROTATION_USER, float target[], float b[],
float t[],
IMSLS_DIRECT_OBLIMIN_ROTATION, float w, int norm, float **b, float **t,
float **factor_correlations,
IMSLS_DIRECT_OBLIMIN_ROTATION_USER, float w, int norm, float b[],
float t[], float factor_correlations[],
IMSLS_OBLIQUE_PROMAX_ROTATION, float w, float power[], int norm,
float **target, float **b, float **t, float **factor_correlations,
IMSLS_OBLIQUE_PROMAX_ROTATION_USER, float w, float power[], int norm,
float target[], float b[], float t[], float factor_correlations[],
IMSLS_OBLIQUE_PIVOTAL_PROMAX_ROTATION, float w, float pivot[],
int norm, float **target, float **b, float **t, float **factor_correlations,
IMSLS_OBLIQUE_PIVOTAL_PROMAX_ROTATION_USER, float w, float pivot[],
int norm, float target[], float b[], float t[], float factor_correlations[],
IMSLS_OBLIQUE_PROCRUSTES_ROTATION, float target[], float **b, float **t,
float **factor_correlations,
IMSLS_OBLIQUE_PROCRUSTES_ROTATION_USER, float target[], float b[],
float t[], float factor_correlations[],
IMSLS_FACTOR_STRUCTURE, float **s, float **fvar,
IMSLS_FACTOR_STRUCTURE_USER, float s[], float fvar[],
IMSLS_COV_COL_DIM, int cov_col_dim,
IMSLS_RETURN_USER, float factor_loadings[],
0)

Optional Arguments

IMSLS_MAXIMUM_LIKELIHOOD, int df_covariances (Input)
Maximum likelihood (common factor model) method used to obtain the estimates. Argument df_covariances is the number of degrees of freedom in covariances.
or

IMSLS_PRINCIPAL_COMPONENT
Principal component (principal component model) method used to obtain the estimates.
or

IMSLS_PRINCIPAL_FACTOR
Principal factor (common factor model) method used to obtain the estimates.
or

IMSLS_UNWEIGHTED_LEAST_SQUARES
Unweighted least-squares (common factor model) method used to obtain the estimates. This option is the default.
or

IMSLS_GENERALIZED_LEAST_SQUARES, int df_covariances (Input)
Generalized least-squares (common factor model) method used to obtain the estimates. Argument df_covariances is the number of degrees of freedom in covariances.
or

IMSLS_IMAGE
Image-factor analysis (common factor model) method used to obtain the estimates.
or

IMSLS_ALPHA, int df_covariances (Input)
Alpha-factor analysis (common factor model) method used to obtain the estimates. Argument df_covariances is the number of degrees of freedom in covariances.

IMSLS_UNIQUE_VARIANCES_INPUT, float unique_variances[] (Input)
Array of length n_variables containing the initial estimates of the unique variances.
Default: Initial estimates are taken as the constant 1 − n_factors/(2 * n_variables) divided by the diagonal elements of the inverse of covariances.

IMSLS_UNIQUE_VARIANCES_OUTPUT, float unique_variances[] (Output)
User-allocated array of length n_variables containing the estimated unique variances.

IMSLS_MAX_ITERATIONS, int max_iterations (Input)
Maximum number of iterations in the iterative procedure.
Default: max_iterations = 60

IMSLS_MAX_STEPS_LINE_SEARCH, int max_steps_line_search (Input)
Maximum number of step halvings allowed during any one iteration.
Default: max_steps_line_search = 10

IMSLS_CONVERGENCE_EPS, float convergence_eps (Input)
Convergence criterion used to terminate the iterations. For the unweighted least squares, generalized least squares or maximum likelihood methods, convergence is assumed when the relative change in the criterion is less than convergence_eps. For alpha-factor analysis, convergence is assumed when the maximum change (relative to the variance) of a uniqueness is less than convergence_eps.
Default: convergence_eps = 0.0001

IMSLS_SWITCH_EXACT_HESSIAN, float switch_epsilon (Input)
Convergence criterion used to switch to exact second derivatives. When the largest relative change in the unique standard deviation vector is less than switch_epsilon, exact second derivative vectors are used. Argument switch_epsilon is not used with the principal component, principal factor, image-factor analysis, or alpha-factor analysis methods.
Default: switch_epsilon = 0.1

IMSLS_EIGENVALUES, float **eigenvalues (Output)
The address of a pointer to an internally allocated array of length n_variables containing the eigenvalues of the matrix from which the factors were extracted.

IMSLS_EIGENVALUES_USER, float eigenvalues[] (Output)
Storage for array eigenvalues is provided by the user. See IMSLS_EIGENVALUES.

IMSLS_CHI_SQUARED_TEST, int *df, float *chi_squared, float *p_value (Output)
Number of degrees of freedom in chi-squared is df; chi_squared is the chi-squared test statistic for testing that n_factors common factors are adequate for the data; p_value is the probability of a greater chi-squared statistic.

IMSLS_TUCKER_RELIABILITY_COEFFICIENT, float *coefficient (Output)
Tucker reliability coefficient.

IMSLS_N_ITERATIONS, int *n_iterations (Output)
Number of iterations.

IMSLS_FUNCTION_MIN, float *function_min (Output)
Value of the function minimum.

IMSLS_LAST_STEP, float **last_step (Output)
Address of a pointer to an internally allocated array of length n_variables containing the updates of the unique variance estimates when convergence was reached (or the iterations terminated).

IMSLS_LAST_STEP_USER, float last_step[] (Output)
Storage for array last_step is provided by the user. See IMSLS_LAST_STEP.

IMSLS_ORTHOMAX_ROTATION, float w, int norm, float **b, float **t (Input/Output)
Nonnegative constant w defines the rotation. If norm = 1, row normalization is performed. Otherwise, row normalization is not performed. b contains the address of a pointer to the internally allocated array of length n_variables*n_factors containing the rotated factor loading matrix. t contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the rotation transformation matrix. w = 0.0 results in quartimax rotations, w = 1.0 results in varimax rotations, and w = n_factors/2.0 results in equamax rotations. Other nonnegative values of w may also be used, but the best values for w are in the range (0.0, 5 * n_factors).

IMSLS_ORTHOMAX_ROTATION_USER, float w, int norm, float b[], float t[] (Input/Output)
Storage for b and t is provided by the user. See IMSLS_ORTHOMAX_ROTATION.

IMSLS_ORTHOGONAL_PROCRUSTES_ROTATION, float target[], float **b, float **t (Input/Output)
If specified, the n_variables by n_factors target matrix target will be used to compute an orthogonal Procrustes rotation of the factor-loading matrix. b contains the address of a pointer to the internally allocated array of length n_variables*n_factors containing the rotated factor loading matrix. t contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the rotation transformation matrix.

IMSLS_ORTHOGONAL_PROCRUSTES_ROTATION_USER, float target[], float b[], float t[] (Input/Output)
Storage for b and t is provided by the user. See IMSLS_ORTHOGONAL_PROCRUSTES_ROTATION.

IMSLS_DIRECT_OBLIMIN_ROTATION, float w, int norm, float **b, float **t, float **factor_correlations (Input/Output)
Computes a direct oblimin rotation. Nonpositive constant w defines the rotation. If norm = 1, row normalization is performed. Otherwise, row normalization is not performed. b contains the address of a pointer to the internally allocated array of length n_variables*n_factors containing the rotated factor loading matrix. t contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the rotation transformation matrix. factor_correlations contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the factor correlations. The parameter w determines the type of direct oblimin rotation to be performed. In general, w should be less than or equal to zero. w = 0.0 results in direct quartimin rotations. As w approaches negative infinity, the orthogonality among factors will increase.

IMSLS_DIRECT_OBLIMIN_ROTATION_USER, float w, int norm, float b[], float t[], float factor_correlations[] (Input/Output)
Storage for b, t, and factor_correlations is provided by the user. See IMSLS_DIRECT_OBLIMIN_ROTATION.

IMSLS_OBLIQUE_PROMAX_ROTATION, float w, float power[], int norm, float **target, float **b, float **t, float **factor_correlations (Input/Output)
Computes an oblique promax rotation of the factor loading matrix using a power vector. Nonnegative constant w defines the rotation. power is a vector of length n_factors containing the power applied to each column. If norm = 1, row (Kaiser) normalization is performed. Otherwise, row normalization is not performed. b contains the address of a pointer to the internally allocated array of length n_variables*n_factors containing the rotated factor loading matrix. t contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the rotation transformation matrix. factor_correlations contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the factor correlations. target contains the address of a pointer to the internally allocated array of length n_variables*n_factors containing the target matrix for rotation, derived from the orthomax rotation. w is used in the orthomax rotation; see the optional argument IMSLS_ORTHOMAX_ROTATION for common values of w.

All power[j] should be greater than 1.0, typically 4.0. Generally, the larger the values of power[j], the more oblique the solution will be.

IMSLS_OBLIQUE_PROMAX_ROTATION_USER, float w, float power[], int norm, float target[], float b[], float t[], float factor_correlations[] (Input/Output)
Storage for b, t, factor_correlations, and target is provided by the user. See IMSLS_OBLIQUE_PROMAX_ROTATION.

IMSLS_OBLIQUE_PIVOTAL_PROMAX_ROTATION, float w, float pivot[], int norm, float **target, float **b, float **t, float **factor_correlations (Input/Output)
Computes an oblique pivotal promax rotation of the factor loading matrix using pivot constants. Nonnegative constant w defines the rotation. pivot is a vector of length n_factors containing the pivot constants. Each pivot[j] should be in the interval (0.0, 1.0). If norm = 1, row (Kaiser) normalization is performed. Otherwise, row normalization is not performed. b contains the address of a pointer to the internally allocated array of length n_variables*n_factors containing the rotated factor loading matrix. t contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the rotation transformation matrix. factor_correlations contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the factor correlations. target contains the address of a pointer to the internally allocated array of length n_variables*n_factors containing the target matrix for rotation, derived from the orthomax rotation. w is used in the orthomax rotation; see the optional argument IMSLS_ORTHOMAX_ROTATION for common values of w.

IMSLS_OBLIQUE_PIVOTAL_PROMAX_ROTATION_USER, float w, float pivot[], int norm, float target[], float b[], float t[], float factor_correlations[] (Input/Output)
Storage for b, t, factor_correlations, and target is provided by the user. See IMSLS_OBLIQUE_PIVOTAL_PROMAX_ROTATION.

IMSLS_OBLIQUE_PROCRUSTES_ROTATION, float target[], float **b, float **t, float **factor_correlations (Input/Output)
Computes an oblique Procrustes rotation of the factor loading matrix using a target matrix. target is a hypothesized rotated factor loading matrix based upon prior knowledge, with loadings chosen to enhance interpretability. A simple structure solution will have most of the weights target[i][j] either zero or large in magnitude. b contains the address of a pointer to the internally allocated array of length n_variables*n_factors containing the rotated factor loading matrix. t contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the rotation transformation matrix. factor_correlations contains the address of a pointer to the internally allocated array of length n_factors*n_factors containing the factor correlations.

IMSLS_OBLIQUE_PROCRUSTES_ROTATION_USER, float target[], float b[], float t[], float factor_correlations[] (Input/Output)
Storage for b, t, and factor_correlations is provided by the user. See IMSLS_OBLIQUE_PROCRUSTES_ROTATION.

IMSLS_FACTOR_STRUCTURE, float **s, float **fvar (Output)
Computes the factor structure and the variance explained by each factor. s contains the address of a pointer to the internally allocated array of length n_variables*n_factors containing the factor structure matrix. fvar contains the address of a pointer to the internally allocated array of length n_factors containing the variance accounted for by each of the n_factors rotated factors. A factor rotation matrix is used to compute the factor structure and the variance. One and only one rotation option argument can be specified.

IMSLS_FACTOR_STRUCTURE_USER, float s[], float fvar[] (Output)
Storage for s and fvar is provided by the user. See IMSLS_FACTOR_STRUCTURE.

IMSLS_COV_COL_DIM, int cov_col_dim (Input)
Column dimension of the matrix covariances.
Default: cov_col_dim = n_variables

IMSLS_RETURN_USER, float factor_loadings[] (Output)
User-allocated array of length n_variables*n_factors containing the unrotated factor loadings.
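The default starting values described under IMSLS_UNIQUE_VARIANCES_INPUT can be sketched as follows. This is only an illustration: it reads the constant as 1 − n_factors/(2 * n_variables), and the helper default_unique_variances and its inverse_diag argument are hypothetical names, not part of the library.

```c
/* Hypothetical helper, not an IMSL routine.
 * inverse_diag[i] is the i-th diagonal element of the inverse of
 * covariances; unique_variances[i] receives the starting value. */
void default_unique_variances(int n_variables, int n_factors,
                              const float inverse_diag[],
                              float unique_variances[])
{
    /* Assumed reading of the constant: 1 - n_factors/(2*n_variables) */
    float c = 1.0f - (float)n_factors / (2.0f * (float)n_variables);
    int i;

    for (i = 0; i < n_variables; i++)
        unique_variances[i] = c / inverse_diag[i];
}
```

Values computed this way could be passed through IMSLS_UNIQUE_VARIANCES_INPUT when a different constant or a different inverse is preferred.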

Description

Function imsls_f_factor_analysis computes factor loadings in exploratory factor analysis models. Models available in imsls_f_factor_analysis are the principal component model for factor analysis and the common factor model with additions to the common factor model in alpha-factor analysis and image analysis. Methods of estimation include principal components, principal factor, image analysis, unweighted least squares, generalized least squares, and maximum likelihood.

In the factor analysis model used for factor extraction, the basic model is given as $\Sigma = \Lambda\Lambda^T + \Psi$, where Σ is the p × p population covariance matrix, Λ is the p × k matrix of factor loadings relating the factors f to the observed variables x, and Ψ is the p × p matrix of covariances of the unique errors e. Here, p = n_variables and k = n_factors. The relationship between the factors, the unique errors, and the observed variables is given as x = Λf + e, where in addition, the expected values of e, f, and x are assumed to be 0. (The sample means can be subtracted from x if the expected value of x is not 0.) It also is assumed that each factor has unit variance, the factors are independent of each other, and that the factors and the unique errors are mutually independent. In the common factor model, the elements of unique errors e also are assumed to be independent of one another so that the matrix Ψ is diagonal. This is not the case in the principal component model in which the errors may be correlated.
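To make the basic model concrete, the sketch below forms the covariance matrix implied by a set of loadings and unique variances under the common factor model (diagonal Ψ). The function implied_covariance is a hypothetical helper written for illustration, and the row-major storage is an assumption.

```c
/* Hypothetical helper, not an IMSL routine.
 * lambda is p x k (row-major), psi holds the p diagonal elements of Psi,
 * and sigma (p x p, row-major) receives Lambda * Lambda^T + Psi. */
void implied_covariance(int p, int k, const float lambda[],
                        const float psi[], float sigma[])
{
    int i, j, m;

    for (i = 0; i < p; i++) {
        for (j = 0; j < p; j++) {
            float sum = 0.0f;
            for (m = 0; m < k; m++)
                sum += lambda[i * k + m] * lambda[j * k + m];
            /* Unique variances contribute only to the diagonal */
            sigma[i * p + j] = (i == j) ? sum + psi[i] : sum;
        }
    }
}
```

Comparing this matrix with the input covariances gives a quick check on how well a fitted solution reproduces the data.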

Further differences between the various methods concern the criterion that is optimized and the amount of computer effort required to obtain estimates. Generally speaking, the least-squares and maximum likelihood methods, which use iterative algorithms, require the most computer time with the principal factor, principal component and the image methods requiring much less time since the algorithms in these methods are not iterative. The algorithm in alpha-factor analysis is also iterative, but the estimates in this method generally require somewhat less computer effort than the least-squares and maximum likelihood estimates. In all methods, one eigensystem analysis is required on each iteration.

Principal Component and Principal Factor Methods

Both the principal component and principal factor methods compute the factor-loading estimates as

$$\hat{\Lambda} = \hat{\Gamma}\hat{\Delta}^{1/2}$$

where Γ and the diagonal matrix Δ are the eigenvectors and eigenvalues of a matrix. In the principal component model, the eigensystem analysis is performed on the sample covariance (correlation) matrix S, while in the principal factor model, the matrix (S − Ψ) is used. If the unique error variances Ψ are not known in the principal factor model, then imsls_f_factor_analysis obtains estimates for them.

The basic idea in the principal component method is to find factors that maximize the variance in the original data that is explained by the factors. Because this method allows the unique errors to be correlated, some factor analysts insist that the principal component method is not a factor analytic method. Usually, however, the estimates obtained by the principal component model and factor analysis model will be quite similar.

It should be noted that both the principal component and principal factor methods give different results when the correlation matrix is used in place of the covariance matrix. Indeed, any rescaling of the sample covariance matrix can lead to different estimates with either of these methods. A further difficulty with the principal factor method is the problem of estimating the unique error variances. Theoretically, these must be known in advance and be passed to imsls_f_factor_analysis using optional argument IMSLS_UNIQUE_VARIANCES_INPUT. In practice, the estimates of these parameters are produced by imsls_f_factor_analysis when IMSLS_UNIQUE_VARIANCES_INPUT is not specified. In either case, the resulting adjusted covariance (correlation) matrix

$$S^{*} = S - \hat{\Psi}$$

may not yield the n_factors positive eigenvalues required for n_factors factors to be obtained. If this occurs, the user must either lower the number of factors to be estimated or give new unique error variance values.

Least-squares and Maximum Likelihood Methods

Unlike the previous two methods, the algorithm used to compute estimates in this section is iterative (see Jöreskog 1977). As with the principal factor model, the user may either initialize the unique error variances or allow imsls_f_factor_analysis to compute initial estimates. Unlike the principal factor method, imsls_f_factor_analysis optimizes the criterion function with respect to both Ψ and Γ. (In the principal factor method, Ψ is assumed to be known. Given Ψ, estimates for Λ may be obtained.)

The major difference between the methods discussed in this section is in the criterion function that is optimized. Let S denote the sample covariance (correlation) matrix, and let Σ denote the covariance matrix that is to be estimated by the factor model. In the unweighted least-squares method, also called the iterated principal factor method or the minres method (see Harman 1976, p. 177), the function minimized is the sum-of-squared differences between S and Σ. This is written as $\Phi_{ul} = \tfrac{1}{2}\,\mathrm{trace}\left[(S - \Sigma)^2\right]$.
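The unweighted least-squares criterion is simple enough to evaluate directly. The sketch below does so for symmetric S and Σ stored row-major; uls_criterion is a hypothetical helper, not a library routine.

```c
/* Hypothetical helper, not an IMSL routine.
 * s and sigma are symmetric p x p matrices (row-major). For symmetric
 * arguments, trace((S - Sigma)^2) equals the sum of squared elements
 * of S - Sigma. */
float uls_criterion(int p, const float s[], const float sigma[])
{
    float sum = 0.0f;
    int i, j;

    for (i = 0; i < p; i++) {
        for (j = 0; j < p; j++) {
            float d = s[i * p + j] - sigma[i * p + j];
            sum += d * d;
        }
    }
    return 0.5f * sum;
}
```

A value near zero indicates that the fitted model reproduces the input matrix closely.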

Generalized least-squares and maximum likelihood estimates are asymptotically equivalent. Maximum likelihood estimates maximize the (normal theory) likelihood by minimizing $\Phi_{ml} = \mathrm{trace}(\Sigma^{-1}S) - \log|\Sigma^{-1}S|$, while generalized least squares minimizes the function $\Phi_{gs} = \mathrm{trace}\left[(\Sigma S^{-1} - I)^2\right]$.

In all three methods, a two-stage optimization procedure is used. This proceeds by first solving the likelihood equations for Λ in terms of Ψ and substituting the solution into the likelihood. This gives a criterion Φ(Ψ, Λ(Ψ)), which is optimized with respect to Ψ. In the second stage, the estimates of Λ are obtained from the estimates for Ψ.

The generalized least-squares and maximum likelihood methods allow for the computation of a statistic (IMSLS_CHI_SQUARED_TEST) for testing that n_factors common factors are adequate to fit the model. This is a chi-squared test that all remaining parameters associated with additional factors are 0. If the probability of a larger chi-squared is so small that the null hypothesis is rejected, then additional factors are needed (although these factors may not be of any practical importance). Failure to reject does not legitimize the model. The statistic IMSLS_CHI_SQUARED_TEST is a likelihood ratio statistic in maximum likelihood estimation. As such, it asymptotically follows a chi-squared distribution with degrees of freedom given by df.

The Tucker and Lewis reliability coefficient, ρ, is returned by IMSLS_TUCKER_RELIABILITY_COEFFICIENT when the maximum likelihood or generalized least-squares methods are used. This coefficient is an estimate of the ratio of explained variation to the total variation in the data. It is computed as follows:

$$\rho = \frac{mM_0 - mM_k}{mM_0 - 1}$$

$$m = d - \frac{2p + 5}{6} - \frac{2k}{3}$$

$$M_0 = \frac{-\ln|S|}{p(p-1)/2}$$

$$M_k = \frac{\phi}{\left[(p-k)^2 - p - k\right]/2}$$

where |S| is the determinant of covariances, p = n_variables, k = n_factors, φ is the optimized criterion, and d = df_covariances.

Image Analysis Method

The term image analysis is used here to denote the noniterative image method of Kaiser (1963). It is not the image analysis discussed by Harman (1976, p. 226). The image method (as well as the alpha-factor analysis method) begins with the notion that only a finite number from an infinite number of possible variables have been measured. The image factor pattern is calculated under the assumption that the ratio of the number of factors to the number of observed variables is near 0, so that a very good estimate for the unique error variances (for standardized variables) is given as 1 minus the squared multiple correlation of the variable under consideration with all variables in the covariance matrix.

First, the matrix $D^2 = (\mathrm{diag}(S^{-1}))^{-1}$ is computed, where the operator “diag” results in a matrix consisting of the diagonal elements of its argument and S is the sample covariance (correlation) matrix. Then, the eigenvalues Λ and eigenvectors Γ of the matrix $D^{-1}SD^{-1}$ are computed. Finally, the unrotated image-factor pattern is computed as $D\Gamma\left[(\Lambda - I)^2\Lambda^{-1}\right]^{1/2}$.

Alpha-factor Analysis Method

The alpha-factor analysis method of Kaiser and Caffrey (1965) finds factor-loading estimates to maximize the correlation between the factors and the complete universe of variables of interest. The basic idea in this method is that only a finite number of variables out of a much larger set of possible variables is observed. The population factors are linearly related to this larger set, while the observed factors are linearly related to the observed variables. Let f denote the factors obtainable from a finite set of observed random variables, and let ξ denote the factors obtainable from the universe of observable variables. Then, the alpha method attempts to find factor-loading estimates so as to maximize the correlation between f and ξ. In order to obtain these estimates, the iterative algorithm of Kaiser and Caffrey (1965) is used.

Rotation Methods

The IMSLS_ORTHOMAX_ROTATION optional argument performs an orthogonal rotation according to an orthomax criterion. In this analytic method of rotation, the criterion function

$$Q = \sum_{r}\left\{\frac{\gamma}{p}\left[\sum_{i}\lambda_{ir}^{2}\right]^{2} - \sum_{i}\lambda_{ir}^{4}\right\}$$

is minimized by finding an orthogonal rotation matrix T such that $(\lambda_{ij}) = \Lambda = AT$, where A is the matrix of unrotated factor loadings. Here, γ ≥ 0 is a user-specified constant (w) yielding a family of rotations, and p is the number of variables.

Kaiser (row) normalization can be performed on the factor loadings prior to rotation by specifying the parameter norm = 1. In Kaiser normalization, the rows of A are first “normalized” by dividing each row by the square root of the sum of its squared elements (Harman 1976). After the rotation is complete, each row of b is “denormalized” by multiplication by its initial normalizing constant.

The method for optimizing Q proceeds by accumulating simple rotations, where a simple rotation is defined to be one in which Q is optimized for two columns of the rotated loading matrix and for which the requirement that T be orthogonal is satisfied. A single iteration is defined to be such that each of the n_factors(n_factors − 1)/2 possible simple rotations is performed, where n_factors is the number of factors. When the relative change in Q from one iteration to the next is less than eps (the user-specified convergence criterion), the algorithm stops. eps = 0.0001 is usually sufficient. Alternatively, the algorithm stops when the user-specified maximum number of iterations, max_iterations, is reached. max_iterations = 30 is usually sufficient.

The rotation parameter γ (the argument w) is used to provide a family of rotations. When γ = 0.0, a direct quartimax rotation results. Other values of γ yield other rotations.
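To see how the rotation constant affects the criterion, the sketch below evaluates an orthomax criterion for a given loading matrix. It assumes Q is the negative of the familiar orthomax simplicity measure, that is, the sum over columns of (w/p) times the squared column sum of squares minus the column sum of fourth powers, so that smaller values indicate simpler structure. That sign convention and the name orthomax_criterion are assumptions made for this illustration.

```c
/* Hypothetical helper, not an IMSL routine.
 * lambda is the p x k rotated loading matrix (row-major); w is the
 * rotation constant (0 = quartimax, 1 = varimax, k/2 = equamax). */
float orthomax_criterion(int p, int k, const float lambda[], float w)
{
    float q = 0.0f;
    int i, r;

    for (r = 0; r < k; r++) {
        float s2 = 0.0f;   /* sum of squared loadings in column r */
        float s4 = 0.0f;   /* sum of fourth powers in column r    */
        for (i = 0; i < p; i++) {
            float l2 = lambda[i * k + r] * lambda[i * k + r];
            s2 += l2;
            s4 += l2 * l2;
        }
        q += (w / (float)p) * s2 * s2 - s4;
    }
    return q;
}
```

With w = 1.0, a loading matrix in which each variable loads on a single factor gives a smaller value than the same matrix rotated by 45 degrees, which is the behavior a varimax rotation exploits.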

The IMSLS_ORTHOGONAL_PROCRUSTES_ROTATION optional argument performs orthogonal Procrustes rotation according to a method proposed by Schönemann (1966). Let k = n_factors denote the number of factors, p = n_variables denote the number of variables, A denote the p × k matrix of unrotated factor loadings, T denote the k × k orthogonal rotation matrix (orthogonality requires that $T^TT$ be a k × k identity matrix), and let X denote the target matrix. The basic idea in orthogonal Procrustes rotation is to find an orthogonal rotation matrix T such that B = AT and T provides a least-squares fit between the target matrix X and the rotated loading matrix B. Schönemann's algorithm proceeds by finding the singular value decomposition of the matrix $A^TX = USV^T$. The rotation matrix is computed as $T = UV^T$.

The IMSLS_DIRECT_OBLIMIN_ROTATION optional argument performs direct oblimin rotation. In this analytic method of rotation, the criterion function

$$Q = \sum_{r \ne s}\left[\sum_{i}\lambda_{ir}^{2}\lambda_{is}^{2} - \frac{\gamma}{p}\sum_{i}\lambda_{ir}^{2}\sum_{i}\lambda_{is}^{2}\right]$$

is minimized by finding a rotation matrix T such that $(\lambda_{ir}) = \Lambda = AT$ and $(T^TT)^{-1}$ is a correlation matrix. Here, γ ≤ 0 is a user-specified constant (w) yielding a family of rotations, and p is the number of variables. The rotation is said to be direct because it minimizes Q with respect to the factor loadings directly, ignoring the reference structure.

Kaiser normalization can be performed on the factor loadings prior to rotation via the parameter norm. In Kaiser normalization (see Harman 1976), the rows of the factor loading matrix are first “normalized” by dividing each row by the square root of the sum of its squared elements. After the rotation is complete, each row of b is “denormalized” by multiplication by its initial normalizing constant.

The method for optimizing Q is essentially the method first proposed by Jennrich and Sampson (1966). It proceeds by accumulating simple rotations, where a simple rotation is defined to be one in which Q is optimized for a given factor in the plane of a second factor, and for which the requirement that $(T^TT)^{-1}$ be a correlation matrix is satisfied. An iteration is defined to be such that each of the n_factors(n_factors − 1) possible simple rotations is performed, where n_factors is the number of factors. When the relative change in Q from one iteration to the next is less than eps (the user-specified convergence criterion), the algorithm stops. eps = 0.0001 is usually sufficient. Alternatively, the algorithm stops when the user-specified maximum number of iterations, max_iterations, is reached. max_iterations = 30 is usually sufficient.

The rotation parameter γ (the argument w) is used to provide a family of rotations. Harman (1976) recommends that γ be less than or equal to zero. When γ = 0.0, a direct quartimin rotation results. Other values of γ yield other rotations. Harman (1976) suggests that the direct quartimin rotations yield the most highly correlated factors, while more orthogonal factors result as γ approaches −∞.
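A direct oblimin criterion can be evaluated for a candidate loading matrix as in the sketch below. It assumes the common form in which, for every ordered pair of distinct factors r and s, the sum over variables of the product of squared loadings is reduced by (w/p) times the product of the two column sums of squares. Whether the library sums over ordered or unordered pairs is not stated here, so that detail and the name oblimin_criterion are assumptions.

```c
/* Hypothetical helper, not an IMSL routine.
 * lambda is the p x k rotated loading matrix (row-major); w <= 0 is
 * the rotation constant (w = 0 gives direct quartimin). */
float oblimin_criterion(int p, int k, const float lambda[], float w)
{
    float q = 0.0f;
    int i, r, s;

    for (r = 0; r < k; r++) {
        for (s = 0; s < k; s++) {
            float cross = 0.0f, ssr = 0.0f, sss = 0.0f;
            if (r == s)
                continue;
            for (i = 0; i < p; i++) {
                float lr2 = lambda[i * k + r] * lambda[i * k + r];
                float ls2 = lambda[i * k + s] * lambda[i * k + s];
                cross += lr2 * ls2;
                ssr += lr2;
                sss += ls2;
            }
            q += cross - (w / (float)p) * ssr * sss;
        }
    }
    return q;
}
```

With w = 0.0 the criterion is zero exactly when no variable loads on more than one factor, which is the simple-structure ideal that quartimin aims for.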

The IMSLS_OBLIQUE_PROMAX_ROTATION, IMSLS_OBLIQUE_PIVOTAL_PROMAX_ROTATION, and IMSLS_OBLIQUE_PROCRUSTES_ROTATION optional arguments perform oblique rotations using the Promax, pivotal Promax, or oblique Procrustes methods. In all of these methods, a target matrix X is first either computed or specified by the user. The differences in the methods relate to how the target matrix is first obtained.

Given a p × k target matrix, X, and a p × k orthogonal matrix of unrotated factor loadings, A, compute the rotation matrix T as follows: First regress each column of A on X yielding a k × k matrix β. Then, let $\gamma = \mathrm{diag}(\beta^T\beta)$, where diag denotes the diagonal matrix obtained from the diagonal of the square matrix. Standardize β to obtain $T = \gamma^{-1/2}\beta$. The rotated loadings are computed as B = AT, while the factor correlations can be computed as the inverse of the $T^TT$ matrix.

In the Promax method, the unrotated factor loadings are first rotated according to an orthomax criterion via optional argument IMSLS_ORTHOMAX_ROTATION. The target matrix X is taken as the elements of B raised to a power greater than one but retaining the same sign as the original loadings. Column i of the rotated matrix B is raised to the power power[i]. A power of four is commonly used. Generally, the larger the power, the more oblique the solution.

In the pivotal Promax method, the unrotated matrix is first rotated to an orthomax orthogonal solution as in the Promax case. Then, rather than raising the i-th column in B to the power pivot[i], the elements $x_{ij}$ of X are obtained from the elements $b_{ij}$ of B by raising the ij element of B to the power pivot[i]/$b_{ij}$. This has the effect of greatly increasing in X those elements in B that are greater in magnitude than the pivot elements pivot[i], and of greatly decreasing those elements that are less than pivot[i].

In the oblique Procrustes method, the elements of X are specified by the user as input to the routine via the target argument. No orthogonal rotation is performed in the oblique Procrustes method.

Factor Structure and Variance

The IMSLS_FACTOR_STRUCTURE optional argument computes the factor structure matrix (the matrix of correlations between the observed variables and the hypothesized factors) and the variance explained by each of the factors (for orthogonal rotations). For oblique rotations, IMSLS_FACTOR_STRUCTURE computes a measure of the importance of the factors, the sum of the squared elements in each column.

Let D denote the diagonal matrix containing the elements of the variance of the original data along its diagonal. The estimated factor structure matrix S is computed as

while the elements of fvar are computed as the diagonal elements of

If the factors were obtained from a correlation matrix (or the factor variances for standardized variables are desired), then the variances should all be 1.0.

Comments

1. Function imsls_f_factor_analysis makes no attempt to solve for n_factors. In general, if n_factors is not known in advance, several different values of n_factors should be used and the most reasonable value kept in the final solution.

2. Iterative methods are generally thought to be superior from a theoretical point of view but, in practice, often lead to solutions that differ little from those of the noniterative methods. For this reason, it is usually suggested that a noniterative method be used in the initial stages of the factor analysis and that the iterative methods be used once issues such as the number of factors have been resolved.

3. Initial estimates for the unique variances can be input. If the iterative methods fail for these values, new initial estimates should be tried. These can be obtained by use of another factoring method. (Use the final estimates from the new method as the initial estimates in the old method.)

Examples

Example 1

In this example, factor analysis is performed for a nine-variable matrix using the default method of unweighted least squares.

#include <stdio.h>
#include <imsls.h>
#include <stdlib.h>

int main()
{
#define N_VARIABLES 9
#define N_FACTORS   3
    float *a;

    float covariances[N_VARIABLES][N_VARIABLES] = {
        1.0,   0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639,
        0.523, 1.0,   0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645,
        0.395, 0.479, 1.0,   0.355, 0.27, 0.254, 0.452, 0.219, 0.504,
        0.471, 0.506, 0.355, 1.0,   0.691, 0.791, 0.443, 0.285, 0.505,
        0.346, 0.418, 0.27, 0.691, 1.0,   0.679, 0.383, 0.149, 0.409,
        0.426, 0.462, 0.254, 0.791, 0.679, 1.0,   0.372, 0.314, 0.472,
        0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0,   0.385, 0.68,
        0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0,   0.47,
        0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0};

                        /* Perform analysis */
    a = imsls_f_factor_analysis (9, (float *)covariances, 3, 0);

                        /* Print results */
    imsls_f_write_matrix("Unrotated Loadings", N_VARIABLES, N_FACTORS,
        a, 0);

    imsls_free(a);
}

Output

         Unrotated Loadings
            1           2           3
1      0.7018     -0.2316      0.0796
2      0.7200     -0.1372     -0.2082
3      0.5351     -0.2144     -0.2271
4      0.7907      0.4050      0.0070
5      0.6532      0.4221     -0.1046
6      0.7539      0.4842      0.1607
7      0.7127     -0.2819     -0.0701
8      0.4835     -0.2627      0.4620
9      0.8192     -0.3137     -0.0199

Example 2

The following data were originally analyzed by Emmett (1949). There are 211 observations on 9 variables. Following Lawley and Maxwell (1971), three factors are obtained by the method of maximum likelihood.

#include <stdio.h>
#include <imsls.h>
#include <stdlib.h>

int main()
{
#define N_VARIABLES 9
#define N_FACTORS   3
    float *a;
    float *evals;
    float chi_squared, p_value, reliability_coef, function_min;
    int   chi_squared_df, n_iterations;
    float uniq[N_VARIABLES];

    float covariances[N_VARIABLES][N_VARIABLES] = {
        1.0,   0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639,
        0.523, 1.0,   0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645,
        0.395, 0.479, 1.0,   0.355, 0.27, 0.254, 0.452, 0.219, 0.504,
        0.471, 0.506, 0.355, 1.0,   0.691, 0.791, 0.443, 0.285, 0.505,
        0.346, 0.418, 0.27, 0.691, 1.0,   0.679, 0.383, 0.149, 0.409,
        0.426, 0.462, 0.254, 0.791, 0.679, 1.0,   0.372, 0.314, 0.472,
        0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0,   0.385, 0.68,
        0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0,   0.47,
        0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0};

                           /* Perform analysis */
    a = imsls_f_factor_analysis (9, (float *)covariances, 3,
        IMSLS_MAXIMUM_LIKELIHOOD,           210,
        IMSLS_SWITCH_EXACT_HESSIAN,         0.01,
        IMSLS_CONVERGENCE_EPS,              0.000001,
        IMSLS_MAX_ITERATIONS,               30,
        IMSLS_MAX_STEPS_LINE_SEARCH,        10,
        IMSLS_EIGENVALUES,                  &evals,
        IMSLS_UNIQUE_VARIANCES_OUTPUT,      uniq,
        IMSLS_CHI_SQUARED_TEST,
            &chi_squared_df,
            &chi_squared,
            &p_value,
        IMSLS_TUCKER_RELIABILITY_COEFFICIENT, &reliability_coef,
        IMSLS_N_ITERATIONS,                 &n_iterations,
        IMSLS_FUNCTION_MIN,                 &function_min,
        0);

                         /* Print results */
    imsls_f_write_matrix("Unrotated Loadings", N_VARIABLES, N_FACTORS,
        a, 0);
    imsls_f_write_matrix("Eigenvalues", 1, N_VARIABLES, evals, 0);
    imsls_f_write_matrix("Unique Error Variances", 1, N_VARIABLES,
        uniq, 0);
    printf("\n\nchi_squared_df =    %d\n", chi_squared_df);
    printf("chi_squared =       %f\n", chi_squared);
    printf("p_value =           %f\n\n", p_value);
    printf("reliability_coef = %f\n", reliability_coef);
    printf("function_min =      %f\n", function_min);
    printf("n_iterations =      %d\n", n_iterations);

    imsls_free(evals);
    imsls_free(a);
}

Output

         Unrotated Loadings
            1           2           3
1      0.6642     -0.3209      0.0735
2      0.6888     -0.2471     -0.1933
3      0.4926     -0.3022     -0.2224
4      0.8372      0.2924     -0.0354
5      0.7050      0.3148     -0.1528
6      0.8187      0.3767      0.1045
7      0.6615     -0.3960     -0.0777
8      0.4579     -0.2955      0.4913
9      0.7657     -0.4274     -0.0117

                              Eigenvalues
         1           2           3           4           5           6
     0.063       0.229       0.541       0.865       0.894       0.974

         7           8           9
     1.080       1.117       1.140

                        Unique Error Variances
         1           2           3           4           5           6
    0.4505      0.4271      0.6166      0.2123      0.3805      0.1769

         7           8           9
    0.3995      0.4615      0.2309

chi_squared_df =    12
chi_squared =       7.149356
p_value =           0.847588

reliability_coef = 1.000000
function_min =      0.035017
n_iterations =      5
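Two of the printed quantities can be cross-checked by hand. The sketch below uses hypothetical helpers (not IMSL routines) and rests on two assumptions: that the chi-squared test uses the usual maximum likelihood degrees of freedom ((p - k)^2 - (p + k))/2 for p variables and k factors, and that for correlation-matrix input each unique variance equals one minus the sum of squared loadings in that variable's row.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: degrees of freedom ((p - k)^2 - (p + k)) / 2.
 * The numerator is always even; a result <= 0 means no test is possible. */
int ml_chi_squared_df(int n_variables, int n_factors)
{
    int diff = n_variables - n_factors;
    assert(n_variables > 0 && n_factors > 0);
    return (diff * diff - (n_variables + n_factors)) / 2;
}

/* Hypothetical helper: uniq[i] = 1 - sum_j a[i][j]^2 for standardized variables.
 * a is row-major, n_variables by n_factors. */
void unique_variances_from_loadings(size_t n_variables, size_t n_factors,
                                    const double *a, double *uniq)
{
    assert(a != NULL && uniq != NULL);
    for (size_t i = 0; i < n_variables; ++i) {
        double communality = 0.0;
        for (size_t j = 0; j < n_factors; ++j) {
            double v = a[i * n_factors + j];
            communality += v * v;
        }
        uniq[i] = 1.0 - communality;
    }
}
```

For nine variables and three factors the first helper gives 12, matching chi_squared_df above. Feeding the first row of the unrotated loadings (0.6642, -0.3209, 0.0735) to the second helper gives approximately 0.4505, the first printed unique error variance.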

Example 3

This example is a continuation of Example 2 and illustrates the use of the IMSLS_FACTOR_STRUCTURE optional argument when the structure and an index of factor importance for obliquely rotated loadings are desired. A direct oblimin rotation with g = -1 (passed as w in the code) is applied to the three maximum likelihood factors derived from the nine variables. Note that in this example the elements of fvar are not variances, since the rotation is oblique.

#include <stdio.h>
#include <imsls.h>
#include <stdlib.h>

int main()
{
#define N_VARIABLES 9
#define N_FACTORS   3
    float *a;
    float w = -1.0;
    int   norm = 1;
    float *b, *t, *fcor;
    float *s, *fvar;

    float covariances[9][9] = {
        1.0,   0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639,
        0.523, 1.0,   0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645,
        0.395, 0.479, 1.0,   0.355, 0.27, 0.254, 0.452, 0.219, 0.504,
        0.471, 0.506, 0.355, 1.0,   0.691, 0.791, 0.443, 0.285, 0.505,
        0.346, 0.418, 0.27, 0.691, 1.0,   0.679, 0.383, 0.149, 0.409,
        0.426, 0.462, 0.254, 0.791, 0.679, 1.0,   0.372, 0.314, 0.472,
        0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0,   0.385, 0.68,
        0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0,   0.47,
        0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0};

                        /* Perform analysis */
    a = imsls_f_factor_analysis (9, (float *)covariances, 3,
        IMSLS_MAXIMUM_LIKELIHOOD,        210,
        IMSLS_SWITCH_EXACT_HESSIAN,      0.01,
        IMSLS_CONVERGENCE_EPS,           0.00001,
        IMSLS_MAX_ITERATIONS,            30,
        IMSLS_MAX_STEPS_LINE_SEARCH,     10,
        IMSLS_DIRECT_OBLIMIN_ROTATION,   w, norm, &b, &t, &fcor,
        IMSLS_FACTOR_STRUCTURE,          &s, &fvar,
        0);

                        /* Print results */
    imsls_f_write_matrix("Unrotated Loadings", N_VARIABLES, N_FACTORS,
        a, 0);
    imsls_f_write_matrix("Rotated Loadings", N_VARIABLES, N_FACTORS,
        b, 0);
    imsls_f_write_matrix("Transformation Matrix", N_FACTORS, N_FACTORS,
        t, 0);
    imsls_f_write_matrix("Factor Correlation Matrix", N_FACTORS, N_FACTORS,
        fcor, 0);
    imsls_f_write_matrix("Factor Structure", N_VARIABLES, N_FACTORS,
        s, 0);
    imsls_f_write_matrix("Factor Variance", 1, N_FACTORS, fvar, 0);
}

Output

         Unrotated Loadings
            1           2           3
1      0.6642     -0.3209      0.0735
2      0.6888     -0.2471     -0.1933
3      0.4926     -0.3022     -0.2224
4      0.8372      0.2924     -0.0354
5      0.7050      0.3148     -0.1528
6      0.8187      0.3767      0.1045
7      0.6615     -0.3960     -0.0777
8      0.4579     -0.2955      0.4913
9      0.7657     -0.4274     -0.0117

          Rotated Loadings
            1           2           3
1      0.1128     -0.5144      0.2917
2      0.1847     -0.6602     -0.0018
3      0.0128     -0.6354     -0.0585
4      0.7797     -0.1751      0.0598
5      0.7147     -0.1813     -0.0959
6      0.8520      0.0039      0.1820
7      0.0354     -0.6844      0.1510
8      0.0276     -0.0941      0.6824
9      0.0729     -0.7100      0.2493

        Transformation Matrix
            1           2           3
1       0.611      -0.462       0.203
2       0.923       0.813      -0.249
3       0.042       0.728       1.050

      Factor Correlation Matrix
            1           2           3
1       1.000      -0.427       0.217
2      -0.427       1.000      -0.411
3       0.217      -0.411       1.000

          Factor Structure
            1           2           3
1      0.3958     -0.6824      0.5275
2      0.4662     -0.7383      0.3094
3      0.2714     -0.6169      0.2052
4      0.8675     -0.5326      0.3011
5      0.7713     -0.4471      0.1339
6      0.8899     -0.4347      0.3656
7      0.3605     -0.7616      0.4398
8      0.2161     -0.3861      0.7271
9      0.4302     -0.8435      0.5568

           Factor Variance
         1           2           3
     2.170       2.560       0.914
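The printed matrices can be checked against one another. The sketch below assumes the standard relation for oblique rotations of standardized variables, in which the factor structure equals the rotated loadings multiplied by the factor correlation matrix; structure_from_pattern is a hypothetical helper, not an IMSL routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of IMSL: s = b * fcor.
 * b   : rotated loadings, row-major, n_variables by n_factors
 * fcor: factor correlation matrix, row-major, n_factors by n_factors
 * s   : receives the factor structure, same layout as b */
void structure_from_pattern(size_t n_variables, size_t n_factors,
                            const double *b, const double *fcor, double *s)
{
    assert(b != NULL && fcor != NULL && s != NULL);
    for (size_t i = 0; i < n_variables; ++i) {
        for (size_t j = 0; j < n_factors; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < n_factors; ++k) {
                sum += b[i * n_factors + k] * fcor[k * n_factors + j];
            }
            s[i * n_factors + j] = sum;
        }
    }
}
```

Using the first row of the rotated loadings (0.1128, -0.5144, 0.2917) and the factor correlation matrix above, the product agrees with the first row of the factor structure (0.3958, -0.6824, 0.5275) up to rounding of the printed values.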

Warning Errors

IMSLS_VARIANCES_INPUT_IGNORED When using the IMSLS_PRINCIPAL_COMPONENT option, the unique variances are assumed to be zero. Input for IMSLS_UNIQUE_VARIANCES_INPUT is ignored.

IMSLS_TOO_MANY_ITERATIONS Too many iterations. Convergence is assumed.

IMSLS_NO_DEG_FREEDOM There are no degrees of freedom for the significance testing.

IMSLS_TOO_MANY_HALVINGS Too many step halvings. Convergence is assumed.

IMSLS_NO_ROTATION n_factors = 1. No rotation is possible.

IMSLS_SVD_ERROR An error occurred in the singular value decomposition of tran(A)*X. The rotation matrix, T, may not be correct.

Fatal Errors

IMSLS_HESSIAN_NOT_POS_DEF The approximate Hessian is not semi-definite on iteration #. The computations cannot proceed. Try using different initial estimates.

IMSLS_FACTOR_EVAL_NOT_POS “eigenvalues[#]” = #. An eigenvalue corresponding to a factor is negative or zero. Either use different initial estimates for “unique_variances” or reduce the number of factors.

IMSLS_COV_NOT_POS_DEF “covariances” is not positive semi-definite. The computations cannot proceed.

IMSLS_COV_IS_SINGULAR The matrix “covariances” is singular. The computations cannot continue because variable # is linearly related to the remaining variables.

IMSLS_COV_EVAL_ERROR An error occurred in calculating the eigenvalues of the adjusted (inverse) covariance matrix. Check “covariances.”

IMSLS_ALPHA_FACTOR_EVAL_NEG In alpha factor analysis on iteration #, eigenvalue # is #. As all eigenvalues corresponding to the factors must be positive, either the number of factors must be reduced or new initial estimates for “unique_variances” must be given.

IMSLS_RANK_LESS_THAN The rank of TRAN(A)*target = #. This must be greater than or equal to n_factors = #.
