FactorAnalysis.FactorLoadingEstimationMethod Property

The factor loading estimation method.

Namespace: Imsl.Stat
Assembly: ImslCS (in ImslCS.dll) Version: 6.5.2.0

Syntax

public FactorAnalysis.Model FactorLoadingEstimationMethod { get; set; }

Public Property FactorLoadingEstimationMethod As FactorAnalysis.Model
	Get
	Set

public:
property FactorAnalysis::Model FactorLoadingEstimationMethod {
	FactorAnalysis::Model get ();
	void set (FactorAnalysis::Model value);
}

member FactorLoadingEstimationMethod : FactorAnalysis.Model with get, set

Property Value

Type: FactorAnalysis.Model
Indicates the method used to obtain the factor loadings. Set FactorLoadingEstimationMethod to one of the FactorAnalysis.Model values PrincipalComponent, PrincipalFactor, UnweightedLeastSquares, GeneralizedLeastSquares, MaximumLikelihood, ImageFactorAnalysis, or AlphaFactorAnalysis. The default is PrincipalComponent.

Remarks

For the principal component and principal factor methods, the factor loading estimates are computed as

$\hat{\Gamma}\hat{\Delta}^{1/2}$

where $\hat{\Gamma}$ and the diagonal matrix $\hat{\Delta}$ are the eigenvectors and eigenvalues of a matrix. In the principal component model, the eigensystem analysis is performed on the sample covariance (correlation) matrix $S$, while in the principal factor model the matrix $(S - \Psi)$ is used. If the unique error variances $\Psi$ are not known in the principal factor model, then they are estimated. This is achieved by setting the property VarianceEstimationMethod to 0. If the principal component model is used, the error variances in the Variances property are set to 0.0 automatically.
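
As a small numerical illustration (not part of the IMSL API), the following Python sketch carries out the eigensystem step for a 2 by 2 correlation matrix. It assumes the usual principal component scaling, in which each retained eigenvector is multiplied by the square root of its eigenvalue, so that the loadings reproduce the analyzed matrix when all factors are kept. The function names and the example matrix are hypothetical.

```python
import math

def sym2x2_eigen(a, b, d):
    """Eigenvalues (descending) and unit eigenvectors of [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    values = [mean + radius, mean - radius]
    if b == 0.0:
        # Already diagonal: the coordinate axes are the eigenvectors.
        vectors = [(1.0, 0.0), (0.0, 1.0)] if a >= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        vectors = []
        for lam in values:
            # (b, lam - a) solves the first row of (M - lam*I) v = 0.
            norm = math.hypot(b, lam - a)
            vectors.append((b / norm, (lam - a) / norm))
    return values, vectors

def pc_loadings(s, n_factors):
    """Loadings = eigenvectors scaled by sqrt(eigenvalue) (assumed convention)."""
    values, vectors = sym2x2_eigen(s[0][0], s[0][1], s[1][1])
    loadings = [[0.0] * n_factors for _ in range(2)]
    for j in range(n_factors):
        scale = math.sqrt(values[j])  # requires a nonnegative eigenvalue
        for i in range(2):
            loadings[i][j] = vectors[j][i] * scale
    return loadings

# Hypothetical correlation matrix for two variables.
S = [[1.0, 0.6], [0.6, 1.0]]
one_factor = pc_loadings(S, 1)
two_factor = pc_loadings(S, 2)

# With both factors kept, loadings times their transpose gives back S.
reconstructed = [[sum(two_factor[i][k] * two_factor[j][k] for k in range(2))
                  for j in range(2)] for i in range(2)]
print(one_factor)
print(reconstructed)
```

For the principal factor model, the same computation would be applied to $(S - \Psi)$ instead of $S$.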

The basic idea in the principal component method is to find factors that maximize the variance in the original data that is explained by the factors. Because this method allows the unique errors to be correlated, some factor analysts insist that the principal component method is not a factor analytic method. Usually, however, the estimates obtained via the principal component model and other models in factor analysis will be quite similar.

Both the principal component and the principal factor methods give different results when the correlation matrix is used in place of the covariance matrix. Indeed, any rescaling of the sample covariance matrix can lead to different estimates with either of these methods. A further difficulty with the principal factor method is the problem of estimating the unique error variances. Theoretically, these must be known in advance and set using the Variances property. In practice, the estimates of these parameters produced by setting the property VarianceEstimationMethod to 0 are often used. In either case, the resulting adjusted covariance (correlation) matrix

$(S - \hat{\Psi})$

may not yield the nfactors positive eigenvalues required for nfactors factors to be obtained. If this occurs, the user must either lower the number of factors to be estimated or give new unique error variance values.
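
The eigenvalue condition can be checked by hand on a small case. The Python sketch below (an illustration only, not IMSL code) counts the positive eigenvalues of the adjusted matrix for a 2 by 2 example. The helper names, the tolerance, and the unique error variance values are assumptions chosen for the example.

```python
import math

def sym2x2_eigenvalues(a, b, d):
    """Eigenvalues (descending) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean + radius, mean - radius]

def max_extractable_factors(s, psi, tol=1e-12):
    """Number of positive eigenvalues of S - diag(psi) for a 2 x 2 matrix S."""
    a = s[0][0] - psi[0]
    d = s[1][1] - psi[1]
    b = s[0][1]  # off-diagonal entries are unchanged by subtracting diag(psi)
    return sum(1 for lam in sym2x2_eigenvalues(a, b, d) if lam > tol)

S = [[1.0, 0.6], [0.6, 1.0]]

# Large unique variances leave one positive eigenvalue, so two factors fail.
n_large_psi = max_extractable_factors(S, [0.9, 0.9])

# Smaller unique variances leave both eigenvalues positive.
n_small_psi = max_extractable_factors(S, [0.2, 0.2])

print(n_large_psi, n_small_psi)
```

In the first case the remedy described above applies: request fewer factors or supply different unique error variances.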

For the least-squares and maximum likelihood methods, an iterative algorithm is used to obtain the estimates (see Jöreskog 1977). As with the principal factor model, the user may either input the initial unique error variances or allow the algorithm to compute initial estimates. Unlike the principal factor method, the code then optimizes the criterion function with respect to both $\Psi$ and $\Lambda$. (In the principal factor method, $\Psi$ is assumed to be known. Given $\Psi$, estimates for $\Lambda$ may be obtained.)

The major differences between the estimation methods described here are in the criterion function that is optimized. Let $S$ denote the sample covariance (correlation) matrix, and let $\Sigma$ denote the covariance matrix that is to be estimated by the factor model. In the unweighted least-squares method, also called the iterated principal factor method or the minres method (see Harman 1976, page 177), the function minimized is the sum of the squared differences between $S$ and $\Sigma$. This is written as $\Phi_{ul} = 0.5\,\mathrm{trace}((S - \Sigma)^2)$.
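
To make the unweighted least-squares criterion concrete, the Python sketch below evaluates it for a small example. It is an illustration only, not the library's implementation. It assumes the standard orthogonal factor model, in which the fitted matrix is the loadings times their transpose plus the diagonal matrix of unique error variances; the function names and numeric values are hypothetical.

```python
def model_covariance(loadings, psi):
    """Fitted matrix: loadings * loadings^T + diag(psi) (assumed model form)."""
    p = len(loadings)
    k = len(loadings[0])
    sigma = [[sum(loadings[i][f] * loadings[j][f] for f in range(k))
              for j in range(p)] for i in range(p)]
    for i in range(p):
        sigma[i][i] += psi[i]
    return sigma

def uls_criterion(s, sigma):
    """0.5 * trace((S - Sigma)^2) for square matrices given as nested lists."""
    p = len(s)
    diff = [[s[i][j] - sigma[i][j] for j in range(p)] for i in range(p)]
    # trace(D D) = sum over i, j of D[i][j] * D[j][i]
    return 0.5 * sum(diff[i][j] * diff[j][i] for i in range(p) for j in range(p))

# Hypothetical sample correlation matrix and one-factor candidate solution.
S = [[1.0, 0.6], [0.6, 1.0]]
Sigma = model_covariance([[0.8], [0.7]], [0.36, 0.51])

phi_ul = uls_criterion(S, Sigma)
print(phi_ul)
```

Because both matrices are symmetric, the trace form equals half the sum of the squared elementwise differences, which is the "sum of squared differences" reading of the criterion.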

The generalized least-squares and maximum likelihood methods are asymptotically equivalent. Maximum likelihood estimates maximize the (normal theory) likelihood, which is equivalent to minimizing $\Phi_{ml} = \mathrm{trace}(\Sigma^{-1}S) - \log(|\Sigma^{-1}S|)$, while generalized least squares minimizes the function $\Phi_{gs} = \mathrm{trace}((\Sigma S^{-1} - I)^2)$.
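
The two criteria can be evaluated directly in a 2 by 2 case. The Python sketch below is an illustration under stated assumptions, not library code: the helper names and the candidate matrix are hypothetical, and the generalized least-squares criterion is read as the trace of the squared matrix. As written, the maximum likelihood criterion takes its smallest value, the number of variables, when the fitted matrix equals the sample matrix, and the generalized least-squares criterion is then zero.

```python
import math

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    det = det2(m)
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace2(m):
    return m[0][0] + m[1][1]

def ml_criterion(s, sigma):
    """trace(Sigma^-1 S) - log(det(Sigma^-1 S))."""
    w = matmul2(inv2(sigma), s)
    return trace2(w) - math.log(det2(w))

def gls_criterion(s, sigma):
    """trace((Sigma S^-1 - I)^2)."""
    w = matmul2(sigma, inv2(s))
    e = [[w[i][j] - (1.0 if i == j else 0.0) for j in range(2)]
         for i in range(2)]
    return trace2(matmul2(e, e))

# Hypothetical sample matrix and a candidate fitted matrix.
S = [[1.0, 0.6], [0.6, 1.0]]
Sigma = [[1.0, 0.56], [0.56, 1.0]]

print(ml_criterion(S, S), gls_criterion(S, S))          # perfect fit
print(ml_criterion(S, Sigma), gls_criterion(S, Sigma))  # imperfect fit
```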

In all three methods (unweighted least squares, generalized least squares, and maximum likelihood), a two-stage optimization procedure is used. This proceeds by first solving the likelihood equations for $\Lambda$ in terms of $\Psi$ and substituting the solution into the likelihood. This gives a criterion $\Phi(\Psi, \Lambda(\Psi))$, which is optimized with respect to $\Psi$. In the second stage, the estimates $\hat{\Lambda}$ are obtained from the estimates for $\Psi$.

The generalized least-squares and the maximum likelihood methods allow for the computation of a statistic for testing that nfactors common factors are adequate to fit the model. This is a chi-squared test that all remaining parameters associated with additional factors are zero. If the probability of a larger chi-squared is small (see stat[4]), so that the null hypothesis is rejected, then additional factors are needed (although these factors may not be of any practical importance). Failure to reject does not legitimize the model. Under maximum likelihood estimation, the statistic stat[2] is a likelihood ratio statistic. As such, it asymptotically follows a chi-squared distribution with degrees of freedom given in stat[3].

The Tucker and Lewis (1973) reliability coefficient, $\rho$, is returned in stat[1] when the maximum likelihood or generalized least-squares methods are used. This coefficient is an estimate of the ratio of explained to total variation in the data. It is computed as follows:

$\rho = \frac{mM_o - mM_k}{mM_o - 1}$

$m = d - \frac{2p + 5}{6} - \frac{2k}{6}$

$M_o = \frac{-\ln(|S|)}{p(p-1)/2}$

$M_k = \frac{\Phi}{((p-k)^2 - p - k)/2}$

where $|S|$ is the determinant of cov, p is the number of variables, k is the number of factors, $\Phi$ is the optimized criterion, and d is the number of degrees of freedom.
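
The four formulas above translate directly into a few lines of arithmetic. The Python sketch below is a plain transcription for illustration; the function name and all input values are made up and do not come from any data set or from the library.

```python
import math

def tucker_lewis(det_s, phi, p, k, d):
    """Tucker-Lewis reliability coefficient from the formulas in the text.

    det_s: determinant of the sample covariance (correlation) matrix
    phi:   optimized criterion value
    p:     number of variables
    k:     number of factors
    d:     number of degrees of freedom
    """
    m = d - (2.0 * p + 5.0) / 6.0 - 2.0 * k / 6.0
    m_o = -math.log(det_s) / (p * (p - 1) / 2.0)
    m_k = phi / (((p - k) ** 2 - p - k) / 2.0)
    return (m * m_o - m * m_k) / (m * m_o - 1.0)

# Hypothetical inputs: 6 variables, 2 factors, 100 degrees of freedom.
rho = tucker_lewis(det_s=0.05, phi=0.1, p=6, k=2, d=100)
print(rho)  # about 0.92 for these made-up inputs
```

The denominator of $M_k$ must be positive for the coefficient to be defined, which restricts how many factors can be requested for a given number of variables.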

The term "image analysis" is used here to denote the noniterative image method of Kaiser (1963). It is not the image factor analysis discussed by Harman (1976, page 226). The image method (as well as the alpha factor analysis method) begins with the notion that only a finite number from an infinite number of possible variables have been measured. The image factor pattern is calculated under the assumption that the ratio of the number of factors to the number of observed variables is near zero so that a very good estimate for the unique error variances (for standardized variables) is given as one minus the squared multiple correlation of the variable under consideration with all variables in the covariance matrix.

First, the matrix $D^2 = (diag(S^{-1}))^{-1}$ is computed, where the operator "diag" results in a matrix consisting of the diagonal elements of its argument, and $S$ is the sample covariance (correlation) matrix. Then, the eigenvalues $\Lambda$ and eigenvectors $\Gamma$ of the matrix $D^{-1} S D^{-1}$ are computed. Finally, the unrotated image factor pattern matrix is computed as $A = D\Gamma[(\Lambda - I)^2 \Lambda^{-1}]^{1/2}$.
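
The first two steps can be sketched in a few lines of Python. This is an illustration only, not the library's code; the function names and the example matrix are hypothetical, and the eigensystem step is left out. For a correlation matrix, each diagonal element of $D^2$ equals one minus the squared multiple correlation of that variable with the others, which is the unique error variance estimate described above.

```python
import math

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def image_d_squared(s):
    """Diagonal of D^2 = (diag(S^-1))^-1."""
    s_inv = invert(s)
    return [1.0 / s_inv[i][i] for i in range(len(s))]

def scaled_matrix(s, d2):
    """D^-1 S D^-1, the matrix whose eigensystem is analyzed next."""
    d = [math.sqrt(v) for v in d2]
    n = len(s)
    return [[s[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]

# Hypothetical correlation matrix: three variables, all correlations 0.5.
S = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
d2 = image_d_squared(S)
scaled = scaled_matrix(S, d2)
print(d2)
print(scaled)
```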

The alpha factor analysis method of Kaiser and Caffrey (1965) finds factor-loading estimates to maximize the correlation between the factors and the complete universe of variables of interest. The basic idea in this method is as follows: only a finite number of variables out of a much larger set of possible variables is observed. The population factors are linearly related to this larger set while the observed factors are linearly related to the observed variables. Let f denote the factors obtainable from a finite set of observed random variables, and let $\xi$ denote the factors obtainable from the universe of observable variables. Then, the alpha method attempts to find factor-loading estimates so as to maximize the correlation between f and $\xi$ . In order to obtain these estimates, the iterative algorithm of Kaiser and Caffrey (1965) is used.

Reference

FactorAnalysis Class

Imsl.Stat Namespace