DiscriminantAnalysis Class
Namespace: Imsl.Stat
The DiscriminantAnalysis type exposes the following members.

Constructors

Name | Description
---|---
DiscriminantAnalysis | Constructs a DiscriminantAnalysis.

Methods

Name | Description
---|---
Classify(Double) | Classify a set of observations using the linear or quadratic discriminant functions generated during the training process.
Classify(Double, Int32) | Classify a set of observations using the linear or quadratic discriminant functions generated during the training process.
Classify(Double, Int32, Double) | Classify a set of observations and associated frequencies and weights using the linear or quadratic discriminant functions generated during the training process.
Classify(Double, Int32, Int32) | Classify a set of observations and compare against known groups using the linear or quadratic discriminant functions generated during the training process.
Classify(Double, Int32, Int32, Double) | Classify a set of observations and associated frequencies and weights using the linear or quadratic discriminant functions generated during the training process.
Classify(Double, Int32, Int32, Int32, Double) | Classify a set of observations, associated frequencies and weights, and compare against known groups using the linear or quadratic discriminant functions generated during the training process.
Downdate(Double, Int32) | Removes a set of observations from the discriminant functions.
Downdate(Double, Int32, Int32) | Removes a set of observations from the discriminant functions.
Downdate(Double, Int32, Int32, Double) | Removes a set of observations and associated frequencies and weights from the discriminant functions.
Downdate(Double, Int32, Int32, Int32, Double) | Removes a set of observations and associated frequencies and weights from the discriminant functions.
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.)
Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
GetClassMembership | Returns the group number to which the observation was classified.
GetClassTable | Returns the classification table.
GetCoefficients | Returns the linear discriminant function coefficients.
GetCovariance | Returns the array of covariances.
GetGroupCounts | Returns the group counts.
GetHashCode | Serves as a hash function for a particular type. (Inherited from Object.)
GetMahalanobis | Returns the Mahalanobis distances between the group means.
GetMeans | Returns the variable means.
GetPrior | Returns the prior probabilities.
GetProbability | Returns the posterior probabilities for each observation.
GetStatistics | Returns statistics.
GetType | Gets the Type of the current instance. (Inherited from Object.)
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
SetPrior | Specifies user supplied prior probabilities.
ToString | Returns a string that represents the current object. (Inherited from Object.)

Properties

Name | Description
---|---
ClassificationMethod | The classification method.
CovarianceComputation | The type of covariance matrices to be computed.
DiscriminationMethod | The discrimination method.
NumberOfRowsMissing | The number of rows of data encountered containing missing values (Double.NaN).
PriorType | The type of prior probabilities to be calculated.

DiscriminantAnalysis allows linear or quadratic discrimination and the use of the reclassification, split-sample, or leaving-out-one method to evaluate the rule. One or more observations can be added to the rule during each invocation of the Update method.
DiscriminantAnalysis produces a measure of distance between the groups (see GetMahalanobis), a table summarizing the classification results (see GetClassTable), a matrix containing the posterior probabilities of group membership for each classified observation (see GetProbability), the within-sample means (see GetMeans), and covariance matrices computed from their LU factorizations (see GetCovariance). The linear discriminant function coefficients are also computed (see GetCoefficients).
All observations can be input during one call to the Update method; this has the advantage of simplicity. Alternatively, one or more rows of observations can be input during separate calls to Update. This does not require that all observations be memory resident, a significant advantage with large data sets. Note, however, that classifying the same data set requires a second pass over the data, this time through the Classify method. During the first pass, through the Update method, the discriminant functions are computed; during the second pass, through the Classify method, the observations are classified. When known groups are available, the GetClassTable method is useful for assessing how well the algorithm classifies. Multiple calls to the Classify method are also allowed. The class table returned by GetClassTable is an accumulation over all observations classified. The class membership and probabilities, returned by GetClassMembership and GetProbability, contain the results for each observation from the most recent invocation of the Classify method only.
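The two-pass pattern and the difference between accumulated and most-recent results can be sketched as follows. This is a toy Python illustration, not the IMSL API: `ToyDiscriminant`, its one-variable nearest-mean rule, and the 1-based group numbers are all assumptions made for the example.

```python
class ToyDiscriminant:
    """Hypothetical stand-in (NOT the IMSL class): 1-D nearest-mean rule."""

    def __init__(self, n_groups):
        self.sums = [0.0] * n_groups
        self.counts = [0] * n_groups
        # class_table[i][j]: rows with known group i+1 classified into group j+1
        # (orientation is an assumption of this sketch)
        self.class_table = [[0] * n_groups for _ in range(n_groups)]
        self.membership = []

    def update(self, xs, groups):
        # Training pass: fold a chunk of observations into the group sums.
        for x, g in zip(xs, groups):
            self.sums[g - 1] += x
            self.counts[g - 1] += 1

    def classify(self, xs, groups):
        # Classification pass: membership is reset on every call,
        # while the class table keeps accumulating.
        means = [s / c for s, c in zip(self.sums, self.counts)]
        self.membership = []
        for x, g in zip(xs, groups):
            pred = min(range(len(means)), key=lambda i: abs(x - means[i])) + 1
            self.membership.append(pred)
            self.class_table[g - 1][pred - 1] += 1


# Two chunks stand in for a data set that is too large to hold at once.
chunks = [([1.0, 2.0, 9.0], [1, 1, 2]), ([3.0, 10.0, 11.0], [1, 2, 2])]
rule = ToyDiscriminant(2)
for xs, gs in chunks:      # first pass: build the rule
    rule.update(xs, gs)
for xs, gs in chunks:      # second pass: classify the same data
    rule.classify(xs, gs)

print(rule.class_table)    # counts from both classify calls
print(rule.membership)     # results of the last classify call only
```

After both passes the table holds counts for all six observations, whereas the membership list has only the three entries from the final chunk.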
Pooled-only and pooled-with-group covariance computations cannot be mixed. By default, both pooled and group covariance matrices are computed. An InvalidOperationException is thrown if an attempt is made to change the covariance computation after the first call to the Update method. See the CovarianceComputation property for more details on specifying the covariance computation.
The within-group means are updated for all valid observations in x. Observations with invalid group numbers are ignored, as are observations with missing values (Double.NaN). The LU factorizations of the covariance matrices are updated by adding (or deleting) observations via Givens rotations. See the Downdate method to delete observations.
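The handling of invalid rows during a means update can be sketched in Python as below. This is only an illustration of the rule stated above, not the library's implementation: the function names, the dictionary layout, the 1-based group numbers, and the unit weights and frequencies are assumptions, and the covariance factorization update is omitted.

```python
import math


def new_state(n_groups, n_vars):
    # Hypothetical container for the running statistics.
    return {
        "count": [0] * n_groups,
        "mean": [[0.0] * n_vars for _ in range(n_groups)],
        "rows_missing": 0,
    }


def update_group_means(state, x, groups, n_groups):
    """Fold rows of x into running within-group means, skipping invalid rows."""
    for row, g in zip(x, groups):
        if any(math.isnan(v) for v in row):
            state["rows_missing"] += 1   # row has a missing value: ignored
            continue
        if not 1 <= g <= n_groups:
            continue                     # invalid group number: ignored
        i = g - 1
        state["count"][i] += 1
        n = state["count"][i]
        mean = state["mean"][i]
        for k, v in enumerate(row):
            mean[k] += (v - mean[k]) / n  # incremental mean update
    return state


state = new_state(n_groups=2, n_vars=2)
update_group_means(state, [[1.0, 2.0], [3.0, 4.0]], [1, 1], 2)
update_group_means(
    state,
    [[5.0, float("nan")], [10.0, 20.0], [7.0, 7.0]],
    [1, 2, 9],   # 9 is out of range for two groups
    2,
)
print(state["mean"], state["count"], state["rows_missing"])
```

The row containing NaN is only counted as missing, and the row with group number 9 leaves every statistic unchanged.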
During the algorithm's training process, that is, during each invocation of the Update method, each observation in x is added to the means and the factorizations of the covariance matrices. Statistics of interest are computed: the linear discriminant functions, the prior probabilities, the log of the determinant of each of the covariance matrices, and a test statistic for testing that all of the within-group covariance matrices are equal. The matrix of Mahalanobis distances, which consists of the distances between the groups, is computed via the pooled covariance matrix when linear discrimination is specified. The row covariance matrix is used when the discrimination is quadratic.

Covariance matrices are defined as follows. Let $N_i$ denote the sum of the frequencies of the observations in group i, and let $M_i$ denote the number of observations in group i. Then, if $S_i$ denotes the within-group i covariance matrix,

$$S_i = \frac{1}{N_i - 1} \sum_{j=1}^{M_i} w_j f_j (x_j - \bar{x}_i)(x_j - \bar{x}_i)^T$$

where $w_j$ is the weight of the j-th observation in group i, $f_j$ is its frequency, $x_j$ is the j-th observation column vector (in group i), and $\bar{x}_i$ denotes the mean vector of the observations in group i. The mean vectors are computed as

$$\bar{x}_i = \frac{1}{W_i} \sum_{j=1}^{M_i} w_j f_j x_j$$

where

$$W_i = \sum_{j=1}^{M_i} w_j f_j$$

Given the means and the covariance matrices, the linear discriminant function for group i is computed as:

$$z_i = \ln(p_i) - 0.5\, \bar{x}_i^T S_p^{-1} \bar{x}_i + x^T S_p^{-1} \bar{x}_i$$

where $\ln(p_i)$ is the natural log of the prior probability for the i-th group, x is the observation to be classified, and $S_p$ denotes the pooled covariance matrix.

Let S denote either the pooled covariance matrix or one of the within-group covariance matrices $S_i$. (S will be the pooled covariance matrix in linear discrimination, and $S_i$ otherwise.) The Mahalanobis distance between group i and group j is computed as:

$$D_{ij}^2 = (\bar{x}_i - \bar{x}_j)^T S^{-1} (\bar{x}_i - \bar{x}_j)$$
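As a worked illustration of these definitions, the Python sketch below computes within-group covariances, a pooled covariance, the Mahalanobis distance between two group means, and the linear discriminant function for a tiny two-variable data set. It is a direct, unfactored calculation with unit weights and frequencies, not the library's Givens-rotation update. The helper names and data are hypothetical, and the pooling formula (a degrees-of-freedom-weighted average of the within-group matrices) is the usual textbook one, assumed here because the text does not spell it out.

```python
import math


def mean_vec(rows):
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]


def cov2(rows, m):
    """2x2 within-group covariance, unit weights/frequencies, divisor N_i - 1."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return [[v / (len(rows) - 1) for v in row] for row in s]


def inv2(s):
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    return [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]


def quad(u, a, v):
    """u^T A v for 2-vectors u, v and a 2x2 matrix A."""
    return sum(u[i] * a[i][j] * v[j] for i in range(2) for j in range(2))


def linear_discriminant(x, m, s_inv, prior):
    # z_i = ln(p_i) - 0.5 * m^T S_p^{-1} m + x^T S_p^{-1} m
    return math.log(prior) - 0.5 * quad(m, s_inv, m) + quad(x, s_inv, m)


groups = [[[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]],
          [[6.0, 5.0], [7.0, 7.0], [8.0, 6.0]]]
means = [mean_vec(g) for g in groups]
covs = [cov2(g, m) for g, m in zip(groups, means)]

# Pooled covariance: weight each S_i by its degrees of freedom (assumed formula).
dof = [len(g) - 1 for g in groups]
pooled = [[sum(n * s[a][b] for n, s in zip(dof, covs)) / sum(dof)
           for b in range(2)] for a in range(2)]
s_inv = inv2(pooled)

diff = [means[0][k] - means[1][k] for k in range(2)]
d2 = quad(diff, s_inv, diff)   # squared Mahalanobis distance between the groups

x = [2.5, 2.0]
z = [linear_discriminant(x, m, s_inv, 0.5) for m in means]
print(d2, z)
```

With equal priors, the observation is assigned to the group with the larger $z_i$; here that is group 1, whose mean it lies closest to.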
Finally, the asymptotic chi-squared test for the equality of covariance matrices is computed as follows (Morrison 1976, page 252):

$$\gamma = C^{-1} \sum_{i=1}^{k} n_i \left\{ \ln|S_p| - \ln|S_i| \right\}$$

where $n_i$ is the number of degrees of freedom in the i-th sample covariance matrix $S_i$, $S_p$ is the pooled covariance matrix, $k$ is the number of groups, and

$$C^{-1} = 1 - \frac{2p^2 + 3p - 1}{6(p+1)(k-1)} \left( \sum_{i=1}^{k} \frac{1}{n_i} - \frac{1}{\sum_{j=1}^{k} n_j} \right)$$

where $p$ is the number of variables.

The estimated posterior probability of each observation x belonging to group i is computed using the prior probabilities and the sample mean vectors and estimated covariance matrices under a multivariate normal assumption. Under quadratic discrimination, the within-group covariance matrices are used to compute the estimated posterior probabilities. The estimated posterior probability of an observation x belonging to group i is

$$\hat{q}_i(x) = \frac{e^{-\frac{1}{2} D_i^2(x)}}{\sum_{j=1}^{k} e^{-\frac{1}{2} D_j^2(x)}}$$

where

$$D_i^2(x) = \begin{cases} (x - \bar{x}_i)^T S_i^{-1} (x - \bar{x}_i) + \ln|S_i| - 2\ln(p_i) & \text{quadratic discrimination} \\ (x - \bar{x}_i)^T S_p^{-1} (x - \bar{x}_i) - 2\ln(p_i) & \text{linear discrimination} \end{cases}$$

with $\bar{x}_i$ the mean vector of group i and $p_i$ its prior probability.

For the leaving-out-one method of classification, the sample mean vector and sample covariance matrices in the formula for $D_i^2(x)$ are adjusted so as to remove the observation x from their computation. For linear discrimination, the linear discriminant function coefficients are actually used to compute the same posterior probabilities.

Using the posterior probabilities, each observation in x is classified into a group; the result is tabulated in the matrix returned by GetClassTable and saved in the vector returned by GetClassMembership. If a group variable is provided and the group number is out of range, the classification table is not altered at this stage. If the reclassification method is specified, then all observations with no missing values are classified. When the leaving-out-one method is used, observations with invalid group numbers, weights, frequencies, or classification variables are not classified. Regardless of the frequency, a 1 is added to (or subtracted from) the classification table for each row of x that is classified and contains a valid group number.

When the leaving-out-one method is used, an adjustment is made to the posterior probabilities to remove the effect of the observation on the classification rule. In this adjustment, each observation is presumed to have a weight of $w_j$ and a frequency of 1.0. See Lachenbruch (1975, page 36) for the required adjustment.
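The final classification step can be sketched in Python as follows. The sketch takes the generalized squared distances $D_i^2(x)$ as given, converts them to posterior probabilities, assigns each row to the most probable group, and tallies one count per classified row with a valid known group. The function names, the 1-based group numbers, and the table orientation (rows indexed by known group, columns by assigned group) are assumptions of this illustration, not documented behavior; the leaving-out-one adjustment is not shown.

```python
import math


def posterior(d2):
    """Posterior probabilities from generalized squared distances D_i^2(x)."""
    shift = min(d2)  # common shift cancels in the ratio and avoids underflow
    e = [math.exp(-0.5 * (d - shift)) for d in d2]
    total = sum(e)
    return [v / total for v in e]


def classify(d2_rows, known_groups, n_groups):
    table = [[0] * n_groups for _ in range(n_groups)]
    membership, probabilities = [], []
    for d2, g in zip(d2_rows, known_groups):
        q = posterior(d2)
        pred = q.index(max(q)) + 1        # most probable group, 1-based
        membership.append(pred)
        probabilities.append(q)
        if 1 <= g <= n_groups:            # out-of-range group: table untouched
            table[g - 1][pred - 1] += 1   # one count per row, whatever its frequency
    return membership, probabilities, table


# Hypothetical distances for four observations and two groups.
d2_rows = [[1.0, 5.0], [4.0, 2.0], [0.5, 3.0], [2.0, 6.0]]
known = [1, 2, 2, 7]                      # 7 is not a valid group number
membership, probabilities, table = classify(d2_rows, known, 2)
print(membership)
print(table)
```

The fourth observation is still assigned a group, but because its known group number is out of range it contributes nothing to the table.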