Class DiscriminantAnalysis
DiscriminantAnalysis performs linear or quadratic
discrimination and supports several classification rules, such as
reclassification, split-sample, or leave-out-one methods. One or more
observations can be added to the rule during each invocation of the
update method.
DiscriminantAnalysis produces a measure of the distance
between the groups (see getMahalanobis), a table
summarizing the classification results (see getClassTable), a
matrix containing the posterior probabilities of group membership for each
classified observation (see getProbability), the
within-sample means (see getMeans), and the covariance matrices
computed from their LU factorizations (see getCovariance). The
linear discriminant function coefficients are also computed
(see getCoefficients).
All observations can be input during one call to the update
method; this has the advantage of simplicity. Alternatively, one or more
rows of observations can be input during separate calls to
update. This does not require all observations to be memory
resident, a significant advantage with large data sets. Note, however, that
classifying the same data set requires a second pass of the data through the
classify method. During the first pass, through the
update method, the discriminant functions are computed, while in
the second pass, through the classify method, the observations are
classified. When known groups are available, getClassTable is
useful for assessing how well the algorithm classifies. Multiple calls to
the classify method are also allowed. The class table,
returned by getClassTable, accumulates all observations
classified. The class membership and probabilities, returned by
getClassMembership and getProbability, contain
the results for each observation from the most recent
invocation of the classify method only.
Pooled-only and pooled-with-group covariance computations cannot be mixed. By default,
both pooled and group covariance matrices will be computed. An IllegalStateException
will be thrown if an attempt is made to change the covariance computation
after the first call to the update method. See the
setCovarianceComputation method for more details on specifying
the covariance computation.
The within-group means are updated for all valid observations in
x. Observations with invalid group numbers are ignored, as are
observations with missing values (Double.NaN). The LU
factorizations of the covariance matrices are updated by adding (or deleting)
observations via Givens rotations. See the downdate method to
delete observations.
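The bookkeeping described above can be illustrated for the group means alone. This is a minimal sketch, not the library's implementation: the class RunningGroupMeans and its methods are hypothetical, and it tracks only running sums (no covariance factorizations, frequencies, or weights). It shows rows with out-of-range group numbers or NaN values being skipped, and a downdate call reversing an earlier update.

```java
import java.util.Arrays;

/** Hypothetical helper: per-group means that can be updated or downdated in batches. */
final class RunningGroupMeans {
    private final int nVariables;
    private final double[][] sum; // sum[g - 1][a]: running sum of variable a in group g
    private final int[] count;    // count[g - 1]: number of valid rows held for group g

    RunningGroupMeans(int nVariables, int nGroups) {
        this.nVariables = nVariables;
        this.sum = new double[nGroups][nVariables];
        this.count = new int[nGroups];
    }

    /** Adds rows of x to the running sums; group numbers are 1-based. */
    void update(double[][] x, int[] group) { apply(x, group, 1); }

    /** Removes previously added rows of x from the running sums. */
    void downdate(double[][] x, int[] group) { apply(x, group, -1); }

    private void apply(double[][] x, int[] group, int sign) {
        for (int r = 0; r < x.length; r++) {
            int g = group[r];
            // Ignore invalid group numbers and rows with missing values.
            if (g < 1 || g > count.length || hasMissing(x[r])) continue;
            for (int a = 0; a < nVariables; a++) sum[g - 1][a] += sign * x[r][a];
            count[g - 1] += sign;
        }
    }

    private boolean hasMissing(double[] row) {
        for (int a = 0; a < nVariables; a++) {
            if (Double.isNaN(row[a])) return true;
        }
        return false;
    }

    /** Mean vector of group g (1-based); assumes the group is not empty. */
    double[] mean(int g) {
        double[] m = new double[nVariables];
        for (int a = 0; a < nVariables; a++) m[a] = sum[g - 1][a] / count[g - 1];
        return m;
    }

    public static void main(String[] args) {
        RunningGroupMeans acc = new RunningGroupMeans(2, 2);
        acc.update(new double[][] {{1, 2}, {3, 4}}, new int[] {1, 1});
        acc.update(new double[][] {{5, 6}, {Double.NaN, 1}, {7, 7}}, new int[] {1, 1, 9});
        System.out.println(Arrays.toString(acc.mean(1)));
        acc.downdate(new double[][] {{5, 6}}, new int[] {1});
        System.out.println(Arrays.toString(acc.mean(1)));
    }
}
```

Because only sums and counts are kept, the data can arrive in any number of batches without being held in memory, which mirrors the motivation for calling update repeatedly.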
During the algorithm's training process, that is, during each invocation of the
update method, each observation in x is added to
the means and the factorizations of the covariance matrices. Statistics of
interest are computed: the linear discriminant functions, the prior
probabilities, the log of the determinant of each of the covariance matrices,
and a test statistic for testing that all of the within-group covariance
matrices are equal. The matrix of Mahalanobis distances, which consists of the
distances between the groups, is computed via the pooled covariance matrix
when linear discrimination is specified. The row covariance matrix is used
when the discrimination is quadratic.
Covariance matrices are defined as follows. Let \(N_i\)
denote the sum of the frequencies of the observations in group i, and
let \(M_i\) denote the number of observations in group
i. Then, if \(S_i\) denotes the within-group
i covariance matrix,
$$S_i = \frac{1}{N_i - 1} \sum_{j=1}^{M_i} w_j f_j (x_j - \overline{x})(x_j - \overline{x})^T$$
where \(w_j\) is the weight of the j-th observation
in group i, \(f_j\) is its frequency,
\(x_j\) is the j-th observation column vector (in
group i), and \(\overline{x}\) denotes the mean
vector of the observations in group i. The mean vectors are computed as
$$\overline{x} = \frac{1}{W_i} \sum_{j=1}^{M_i} w_j f_j x_j$$
where
$$W_i = \sum_{j=1}^{M_i} w_j f_j$$
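The definitions above can be checked numerically with a direct, single-group computation. The sketch below is illustrative only: the class GroupMoments and its method names are hypothetical, and it evaluates the formulas literally rather than through the factorization updates the class uses internally.

```java
import java.util.Arrays;

/** Hypothetical helper: weighted mean and covariance of one group, computed directly. */
final class GroupMoments {
    /** Mean vector: (1 / W) * sum_j w_j f_j x_j, with W = sum_j w_j f_j. */
    static double[] mean(double[][] x, double[] w, int[] f) {
        int p = x[0].length;
        double[] m = new double[p];
        double wSum = 0.0;
        for (int j = 0; j < x.length; j++) {
            double wf = w[j] * f[j];
            wSum += wf;
            for (int a = 0; a < p; a++) m[a] += wf * x[j][a];
        }
        for (int a = 0; a < p; a++) m[a] /= wSum;
        return m;
    }

    /** Covariance: (1 / (N - 1)) * sum_j w_j f_j (x_j - m)(x_j - m)^T, with N = sum_j f_j. */
    static double[][] covariance(double[][] x, double[] w, int[] f) {
        double[] m = mean(x, w, f);
        int p = m.length;
        int n = 0;
        double[][] s = new double[p][p];
        for (int j = 0; j < x.length; j++) {
            n += f[j];
            double wf = w[j] * f[j];
            for (int a = 0; a < p; a++) {
                for (int b = 0; b < p; b++) {
                    s[a][b] += wf * (x[j][a] - m[a]) * (x[j][b] - m[b]);
                }
            }
        }
        for (int a = 0; a < p; a++) {
            for (int b = 0; b < p; b++) s[a][b] /= (n - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}, {5, 9}};
        double[] w = {1, 1, 1};
        int[] f = {1, 1, 1};
        System.out.println(Arrays.toString(mean(x, w, f)));
        System.out.println(Arrays.deepToString(covariance(x, w, f)));
    }
}
```

With unit weights and frequencies the result reduces to the ordinary sample mean and sample covariance.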
Given the means and the covariance matrices, the linear discriminant
function for group i is computed as:
$$z_i = \ln(p_i)-0.5\overline{x_i}^T S_{p}^{-1} \overline{x_i} + x^T S_{p}^{-1} \overline{x_i}$$
where \(\ln(p_i)\) is the natural log of the prior
probability for the i-th group, x is the observation to be
classified, and \(S_p\) denotes the pooled covariance
matrix.
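As an illustration of the discriminant function above, the following sketch evaluates \(z_i\) for each group and picks the largest score. It is a sketch under stated assumptions: the class LinearScore is hypothetical, and the inverse of the pooled covariance matrix is assumed to be supplied by the caller rather than derived from a factorization.

```java
/** Hypothetical helper: evaluates z_i = ln(p_i) - 0.5 m_i^T Sp^{-1} m_i + x^T Sp^{-1} m_i. */
final class LinearScore {
    static double score(double[] x, double[] mean, double[][] pooledInverse, double prior) {
        double[] v = multiply(pooledInverse, mean); // Sp^{-1} * mean
        return Math.log(prior) - 0.5 * dot(mean, v) + dot(x, v);
    }

    /** Returns the 1-based number of the group with the largest score. */
    static int classify(double[] x, double[][] means, double[][] pooledInverse, double[] priors) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < means.length; i++) {
            double z = score(x, means[i], pooledInverse, priors[i]);
            if (z > bestScore) {
                bestScore = z;
                best = i;
            }
        }
        return best + 1;
    }

    static double[] multiply(double[][] a, double[] v) {
        double[] out = new double[a.length];
        for (int r = 0; r < a.length; r++) out[r] = dot(a[r], v);
        return out;
    }

    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    public static void main(String[] args) {
        double[][] identity = {{1, 0}, {0, 1}};
        double[][] means = {{1, 2}, {0, 0}};
        double[] priors = {0.5, 0.5};
        System.out.println(classify(new double[] {3, 4}, means, identity, priors));
        System.out.println(classify(new double[] {-1, -1}, means, identity, priors));
    }
}
```

Because the same \(S_p^{-1}\) is shared by every group, the score is linear in x, which is what distinguishes this rule from the quadratic case.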
Let S denote either the pooled covariance matrix or one of the within-group covariance matrices \(S_i\). (S will be the pooled covariance matrix in linear discrimination, and \(S_i\) otherwise.) The Mahalanobis distance between group i and group j is computed as: $$D_{ij}^{2} = (\overline{x_i} - \overline{x_j})^T S^{-1} (\overline{x_i} - \overline{x_j})$$
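The distance formula can be evaluated without forming \(S^{-1}\) explicitly by solving \(S y = \overline{x_i} - \overline{x_j}\) and taking the inner product with the difference vector. The sketch below does this with plain Gaussian elimination; the class GroupDistance is hypothetical, it assumes S is nonsingular, and it is not meant to reflect how the library computes the distances.

```java
/** Hypothetical helper: squared Mahalanobis distance between two group means. */
final class GroupDistance {
    /** Solves s * y = d by Gaussian elimination with partial pivoting (s assumed nonsingular). */
    static double[] solve(double[][] s, double[] d) {
        int n = d.length;
        double[][] m = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(s[i], 0, m[i], 0, n);
            m[i][n] = d[i];
        }
        for (int c = 0; c < n; c++) {
            int piv = c;
            for (int r = c + 1; r < n; r++) {
                if (Math.abs(m[r][c]) > Math.abs(m[piv][c])) piv = r;
            }
            double[] t = m[c];
            m[c] = m[piv];
            m[piv] = t;
            for (int r = c + 1; r < n; r++) {
                double factor = m[r][c] / m[c][c];
                for (int k = c; k <= n; k++) m[r][k] -= factor * m[c][k];
            }
        }
        double[] y = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double sum = m[i][n];
            for (int k = i + 1; k < n; k++) sum -= m[i][k] * y[k];
            y[i] = sum / m[i][i];
        }
        return y;
    }

    /** D^2 = (mi - mj)^T S^{-1} (mi - mj). */
    static double squaredDistance(double[] meanI, double[] meanJ, double[][] s) {
        int p = meanI.length;
        double[] d = new double[p];
        for (int a = 0; a < p; a++) d[a] = meanI[a] - meanJ[a];
        double[] y = solve(s, d);
        double sum = 0.0;
        for (int a = 0; a < p; a++) sum += d[a] * y[a];
        return sum;
    }

    public static void main(String[] args) {
        double[][] s = {{2, 1}, {1, 2}};
        System.out.println(squaredDistance(new double[] {1, 1}, new double[] {0, 0}, s));
    }
}
```

Passing the pooled matrix for S corresponds to the linear case; passing a within-group matrix corresponds to the quadratic case.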
Finally, the asymptotic chi-squared test for the equality of covariance matrices is computed as follows (Morrison 1976, page 252): $$\gamma = C^{-1} \sum_{i=1}^{k} n_i \{ \ln( \left| S_p \right| ) - \ln( \left| S_i \right| ) \}$$ where \(n_i\) is the number of degrees of freedom in the i-th sample covariance matrix, \(k\) is the number of groups, and $$C^{-1} = 1 - \frac{2p^2 + 3p - 1}{6(p + 1)(k - 1)} \left(\sum_{i=1}^{k} \frac{1}{n_i} - \frac{1}{\sum_{j}n_j} \right)$$ where \(p\) is the number of variables.
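A small numeric sketch of this test statistic follows. It assumes the usual Box form of the correction factor, \(C^{-1} = 1 - \frac{2p^2 + 3p - 1}{6(p + 1)(k - 1)} \left(\sum_i \frac{1}{n_i} - \frac{1}{\sum_j n_j}\right)\), and takes the log-determinants as inputs; the class CovarianceEqualityTest and its method names are hypothetical.

```java
/** Hypothetical helper: chi-squared statistic for equality of covariance matrices. */
final class CovarianceEqualityTest {
    /** Correction factor C^{-1} for p variables and per-group degrees of freedom n. */
    static double correction(int p, int[] n) {
        int k = n.length;
        double sumInverse = 0.0;
        double total = 0.0;
        for (int ni : n) {
            sumInverse += 1.0 / ni;
            total += ni;
        }
        double factor = (2.0 * p * p + 3.0 * p - 1.0) / (6.0 * (p + 1) * (k - 1));
        return 1.0 - factor * (sumInverse - 1.0 / total);
    }

    /** gamma = C^{-1} * sum_i n_i (ln|S_p| - ln|S_i|). */
    static double statistic(int p, int[] n, double logDetPooled, double[] logDetGroup) {
        double sum = 0.0;
        for (int i = 0; i < n.length; i++) {
            sum += n[i] * (logDetPooled - logDetGroup[i]);
        }
        return correction(p, n) * sum;
    }

    public static void main(String[] args) {
        int[] n = {10, 10};
        System.out.println(correction(2, n));
        System.out.println(statistic(2, n, 1.0, new double[] {0.8, 0.9}));
    }
}
```

The statistic is zero when every \(\ln|S_i|\) equals \(\ln|S_p|\) and grows as the group covariance matrices diverge from the pooled one.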
The estimated posterior probability of each observation x belonging to group i is computed using the prior probabilities and the sample mean vectors and estimated covariance matrices under a multivariate normal assumption. Under quadratic discrimination, the within-group covariance matrices are used to compute the estimated posterior probabilities. The estimated posterior probability of an observation x belonging to group i is $$\hat{q_i}(x) = \frac{e^{-\frac{1}{2}D_{i}^{2}(x)}}{\sum_{j=1}^{k} e^{-\frac{1}{2}D_{j}^{2}(x)}}$$ where $$D_{i}^{2}(x) = \left\{ \begin{array}{ll} (x - \overline{x_i})^T S_{i}^{-1} (x - \overline{x_i}) + ln \left|S_i \right| - 2 ln(p_i) & \mbox{Linear or Quadratic, pooled, group} \\ (x - \overline{x_i})^T S_{p}^{-1} (x - \overline{x_i}) - 2 ln(p_i) & \mbox{Linear, Pooled} \end{array} \right. $$
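Given the values \(D_i^2(x)\) for one observation, the posterior probabilities follow from a normalized exponential. The sketch below is illustrative: the class PosteriorFromDistances is hypothetical, and subtracting the smallest \(D_i^2\) before exponentiating is an added numerical safeguard that leaves the ratios unchanged; it is not something the source states the library does.

```java
import java.util.Arrays;

/** Hypothetical helper: posterior probabilities from the generalized squared distances. */
final class PosteriorFromDistances {
    /** q_i = exp(-D_i^2 / 2) / sum_j exp(-D_j^2 / 2), shifted by min(D^2) to avoid underflow. */
    static double[] posterior(double[] d2) {
        double min = Double.POSITIVE_INFINITY;
        for (double d : d2) min = Math.min(min, d);
        double[] q = new double[d2.length];
        double total = 0.0;
        for (int i = 0; i < d2.length; i++) {
            q[i] = Math.exp(-0.5 * (d2[i] - min));
            total += q[i];
        }
        for (int i = 0; i < q.length; i++) q[i] /= total;
        return q;
    }

    /** Returns the 1-based number of the group with the largest posterior probability. */
    static int mostProbableGroup(double[] q) {
        int best = 0;
        for (int i = 1; i < q.length; i++) {
            if (q[i] > q[best]) best = i;
        }
        return best + 1;
    }

    public static void main(String[] args) {
        double[] q = posterior(new double[] {0.0, 2.0 * Math.log(3.0)});
        System.out.println(Arrays.toString(q));
        System.out.println(mostProbableGroup(q));
    }
}
```

The group with the smallest \(D_i^2(x)\) always receives the largest posterior probability, so classification amounts to choosing that group.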
For the leave-out-one method of classification, the sample mean vector and sample covariance matrices in the formula for \(D_{i}^{2}(x)\) are adjusted so as to remove the observation x from their computation. For linear discrimination, the linear discriminant function coefficients are used to compute the same posterior probabilities.
Using the posterior probabilities, each observation in x is
classified into a group; the result is tabulated in the matrix returned by
getClassTable and saved in the vector returned by
getClassMembership. If a group variable is provided and the
group number is out of range, the classification table is not altered at
this stage. If the reclassification method is specified, then all
observations with no missing values are classified. When the leave-out-one
method is used, observations with invalid group numbers, weights, frequencies,
or classification variables are not classified. Regardless of the frequency,
1 is added to (or subtracted from) the classification table for each row of
x that is classified and contains a valid group number.
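The tabulation rule can be sketched as follows. The layout is an assumption made for illustration: rows index the known group and columns index the predicted group, which the text above does not specify. The class ClassTableSketch is hypothetical. Each classified row with a valid known group contributes exactly 1, and rows whose known group is out of range leave the table unchanged.

```java
import java.util.Arrays;

/** Hypothetical helper: accumulates a known-by-predicted classification table. */
final class ClassTableSketch {
    /**
     * known[r] and predicted[r] are 1-based group numbers for row r.
     * Assumed layout: table[known - 1][predicted - 1].
     */
    static double[][] tabulate(int[] known, int[] predicted, int nGroups) {
        double[][] table = new double[nGroups][nGroups];
        for (int r = 0; r < known.length; r++) {
            int g = known[r];
            if (g < 1 || g > nGroups) continue; // out-of-range group: table not altered
            table[g - 1][predicted[r] - 1] += 1.0; // one count per row, regardless of frequency
        }
        return table;
    }

    public static void main(String[] args) {
        int[] known = {1, 1, 2, 5};
        int[] predicted = {1, 2, 2, 1};
        System.out.println(Arrays.deepToString(tabulate(known, predicted, 2)));
    }
}
```

Under this layout the diagonal holds the correctly classified rows, so its sum divided by the table total gives the apparent accuracy.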
When the leave-out-one method is used, an adjustment is made to the posterior
probabilities to remove the effect of the observation on the classification
rule. In this adjustment, each observation is presumed to have a weight of
\(w_j\) and a frequency of 1.0. See Lachenbruch (1975, page 36)
for the required adjustment.
Nested Class Summary
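Nested Classes
Modifier and Type, Class - Description
static class DiscriminantAnalysis.CovarianceSingularException - The variance-covariance matrix is singular.
static class DiscriminantAnalysis.EmptyGroupException - There are no observations in a group.
static class DiscriminantAnalysis.SumOfWeightsNegException - The sum of the weights has become negative.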
Field Summary
Fields
Modifier and Type, Field - Description
static final int LEAVE_OUT_ONE - Indicates leave-out-one classification method.
static final int LINEAR - Indicates a linear discrimination method.
static final int POOLED - Indicates pooled covariances computation.
static final int POOLED_GROUP - Indicates pooled, group covariances computation.
static final int PRIOR_EQUAL - Indicates prior equal probabilities.
static final int PRIOR_PROPORTIONAL - Indicates prior proportional probabilities.
static final int QUADRATIC - Indicates a quadratic discrimination method.
static final int RECLASSIFICATION - Indicates reclassification classification method.
Constructor Summary
Constructors
Constructor - Description
DiscriminantAnalysis(int nVariables, int nGroups) - Constructs a DiscriminantAnalysis.
Method Summary
Modifier and Type, Method - Description
void classify(double[][] x) - Classify a set of observations using the linear or quadratic discriminant functions generated during the training process.
void classify(double[][] x, int[] varIndex) - Classify a set of observations using the linear or quadratic discriminant functions generated during the training process.
void classify(double[][] x, int[] frequencies, double[] weights) - Classify a set of observations and associated frequencies and weights using the linear or quadratic discriminant functions generated during the training process.
void classify(double[][] x, int[] group, int[] varIndex) - Classify a set of observations and compare against known groups using the linear or quadratic discriminant functions generated during the training process.
void classify(double[][] x, int[] varIndex, int[] frequencies, double[] weights) - Classify a set of observations and associated frequencies and weights using the linear or quadratic discriminant functions generated during the training process.
void classify(double[][] x, int[] group, int[] varIndex, int[] frequencies, double[] weights) - Classify a set of observations, associated frequencies and weights, and compare against known groups using the linear or quadratic discriminant functions generated during the training process.
void downdate(double[][] x, int[] group) - Removes a set of observations from the discriminant functions.
void downdate(double[][] x, int[] group, int[] varIndex) - Removes a set of observations from the discriminant functions.
void downdate(double[][] x, int[] group, int[] frequencies, double[] weights) - Removes a set of observations and associated frequencies and weights from the discriminant functions.
void downdate(double[][] x, int[] group, int[] varIndex, int[] frequencies, double[] weights) - Removes a set of observations and associated frequencies and weights from the discriminant functions.
int[] getClassMembership() - Returns the group number to which the observation was classified.
double[][] getClassTable() - Returns the classification table.
double[][] getCoefficients() - Returns the linear discriminant function coefficients.
double[][][] getCovariance() - Returns the array of covariances.
int[] - Returns the group counts.
double[][] getMahalanobis() - Returns the Mahalanobis distances between the group means.
double[][] getMeans() - Returns the variable means.
int - Deprecated.
int - Returns the number of rows of data encountered containing missing values (Double.NaN).
double[] getPrior() - Returns the prior probabilities.
double[][] getProbability() - Returns the posterior probabilities for each observation.
double[] - Returns statistics.
void setClassificationMethod(int method) - Specifies the classification method to be either reclassification or leave-out-one.
void setCovarianceComputation(int type) - Specifies the covariance matrix computation to be either pooled or pooled, group.
void setDiscriminationMethod(int method) - Specifies the discrimination method used to be either linear or quadratic discrimination.
void setPrior(double[] prior) - Specifies user supplied prior probabilities.
void setPrior(int prior) - Specifies the prior probabilities to be calculated as either equal or proportional priors.
void update(double[][] x) - Deprecated. Use update(double[][], int[]) instead.
void update(double[][] x, double[] frequencies, double[] weights) - Deprecated. Use update(double[][], int[], int[], double[]) instead.
void update(double[][] x, int groupIndex) - Deprecated. Use update(double[][], int[]) instead.
void update(double[][] x, int[] group) - Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
void update(double[][] x, int[] varIndex, double[] frequencies, double[] weights) - Deprecated. Use update(double[][], int[], int[], int[], double[]) instead.
void update(double[][] x, int[] group, int[] varIndex) - Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
void update(double[][] x, int[] group, int[] frequencies, double[] weights) - Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
void update(double[][] x, int[] group, int[] varIndex, int[] frequencies, double[] weights) - Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
void update(double[][] x, int groupIndex, double[] frequencies, double[] weights) - Deprecated. Use update(double[][], int[], int[], double[]) instead.
void update(double[][] x, int groupIndex, int[] varIndex) - Deprecated. Use update(double[][], int[], int[]) instead.
void update(double[][] x, int groupIndex, int[] varIndex, double[] frequencies, double[] weights) - Deprecated. Use update(double[][], int[], int[], int[], double[]) instead.
-
Field Details
-
LINEAR
public static final int LINEAR
Indicates a linear discrimination method.
-
QUADRATIC
public static final int QUADRATIC
Indicates a quadratic discrimination method.
-
POOLED
public static final int POOLED
Indicates pooled covariances computation.
-
POOLED_GROUP
public static final int POOLED_GROUP
Indicates pooled, group covariances computation.
-
RECLASSIFICATION
public static final int RECLASSIFICATION
Indicates reclassification classification method.
-
LEAVE_OUT_ONE
public static final int LEAVE_OUT_ONE
Indicates leave-out-one classification method.
-
PRIOR_PROPORTIONAL
public static final int PRIOR_PROPORTIONAL
Indicates prior proportional probabilities.
-
PRIOR_EQUAL
public static final int PRIOR_EQUAL
Indicates prior equal probabilities.
-
-
Constructor Details
-
DiscriminantAnalysis
public DiscriminantAnalysis(int nVariables, int nGroups)
Constructs a DiscriminantAnalysis.
Parameters:
nVariables - an int representing the number of variables to be used in the discrimination
nGroups - an int representing the number of groups in the data
-
-
Method Details
-
update
Deprecated. Use update(double[][], int[]) instead.
Trains a set of observations by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables + 1 columns. The first nVariables columns correspond to the variables, and column nVariables, which immediately follows them, contains the group numbers. The groups must be numbered 1, 2, ..., nGroups. Any additional columns will be ignored.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
update
public void update(double[][] x, int groupIndex) throws DiscriminantAnalysis.SumOfWeightsNegException
Deprecated. Use update(double[][], int[]) instead.
Trains a set of observations by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables + 1 columns. The first nVariables columns, excluding the groupIndex column, correspond to the variables. The groupIndex column contains the group numbers. Any additional columns will be ignored.
groupIndex - an int containing the column index of x in which the group numbers are stored. The groups must be numbered 1, 2, ..., nGroups. Any observations with a group number outside of this range will be skipped.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
update
public void update(double[][] x, int groupIndex, int[] varIndex) throws DiscriminantAnalysis.SumOfWeightsNegException
Deprecated. Use update(double[][], int[], int[]) instead.
Trains a set of observations by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables + 1 columns. The columns indicated in varIndex correspond to the variables, and the groupIndex column contains the group numbers. Any additional columns will be ignored.
groupIndex - an int containing the column index of x in which the group numbers are stored. The groups must be numbered 1, 2, ..., nGroups.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
update
public void update(double[][] x, double[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException
Deprecated. Use update(double[][], int[], int[], double[]) instead.
Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables + 1 columns. The first nVariables columns correspond to the variables, and the last column (column nVariables) contains the group numbers. The groups must be numbered 1, 2, ..., nGroups.
frequencies - a double array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
-
update
public void update(double[][] x, int groupIndex, double[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException
Deprecated. Use update(double[][], int[], int[], double[]) instead.
Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables + 1 columns. The first nVariables columns, excluding the groupIndex column, correspond to the variables.
groupIndex - an int containing the column index of x in which the group numbers are stored. The groups must be numbered 1, 2, ..., nGroups.
frequencies - a double array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
-
update
public void update(double[][] x, int[] varIndex, double[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException
Deprecated. Use update(double[][], int[], int[], int[], double[]) instead.
Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables + 1 columns. The columns indicated in varIndex correspond to the variables, and the last column (column nVariables) contains the group numbers. The groups must be numbered 1, 2, ..., nGroups.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis.
frequencies - a double array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
-
update
public void update(double[][] x, int groupIndex, int[] varIndex, double[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException
Deprecated. Use update(double[][], int[], int[], int[], double[]) instead.
Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables + 1 columns. The columns indicated in varIndex correspond to the variables, and the groupIndex column contains the group numbers.
groupIndex - an int containing the column index of x in which the group numbers are stored. The groups must be numbered 1, 2, ..., nGroups.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis.
frequencies - a double array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
update
Trains a set of observations by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables columns. The first nVariables columns correspond to the variables. Any additional columns will be ignored.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
update
public void update(double[][] x, int[] group, int[] varIndex) throws DiscriminantAnalysis.SumOfWeightsNegException
Trains a set of observations by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables columns. The columns indicated in varIndex correspond to the variables. Any additional columns will be ignored.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
update
public void update(double[][] x, int[] group, int[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException
Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables columns. The first nVariables columns correspond to the variables. Any additional columns will be ignored.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
frequencies - an int array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
update
public void update(double[][] x, int[] group, int[] varIndex, int[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException
Trains a set of observations and associated frequencies and weights by performing a linear or quadratic discriminant function analysis among several known groups.
Parameters:
x - a double matrix containing the observations with at least nVariables columns. The columns indicated in varIndex correspond to the variables.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis.
frequencies - an int array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
downdate
public void downdate(double[][] x, int[] group) throws DiscriminantAnalysis.SumOfWeightsNegException
Removes a set of observations from the discriminant functions.
Parameters:
x - a double matrix containing the observations to be removed, with at least nVariables columns. The first nVariables columns correspond to the variables. Any additional columns will be ignored.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
downdate
public void downdate(double[][] x, int[] group, int[] varIndex) throws DiscriminantAnalysis.SumOfWeightsNegException
Removes a set of observations from the discriminant functions.
Parameters:
x - a double matrix containing the observations to be removed, with at least nVariables columns. The columns indicated in varIndex correspond to the variables.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
downdate
public void downdate(double[][] x, int[] group, int[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException
Removes a set of observations and associated frequencies and weights from the discriminant functions.
Parameters:
x - a double matrix containing the observations to be removed, with at least nVariables columns. The first nVariables columns correspond to the variables.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
frequencies - an int array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
downdate
public void downdate(double[][] x, int[] group, int[] varIndex, int[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException
Removes a set of observations and associated frequencies and weights from the discriminant functions.
Parameters:
x - a double matrix containing the observations to be removed, with at least nVariables columns. The columns indicated in varIndex correspond to the variables.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis.
frequencies - an int array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
-
classify
public void classify(double[][] x) throws DiscriminantAnalysis.SumOfWeightsNegException, DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Classify a set of observations using the linear or quadratic discriminant functions generated during the training process.
Parameters:
x - a double matrix containing the observations with at least nVariables columns. The first nVariables columns correspond to the variables. Reclassification does not require group numbers to be present. Any additional columns will be ignored.
Throws:
IllegalStateException - is thrown if the leave-out-one classification method is chosen.
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
-
classify
public void classify(double[][] x, int[] varIndex) throws DiscriminantAnalysis.SumOfWeightsNegException, DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Classify a set of observations using the linear or quadratic discriminant functions generated during the training process.
Parameters:
x - a double matrix containing the observations with at least nVariables columns. The columns indicated in varIndex correspond to the variables. Reclassification does not require group numbers to be present. Any additional columns will be ignored.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis.
Throws:
IllegalStateException - is thrown if the leave-out-one classification method is chosen.
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
-
classify
public void classify(double[][] x, int[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException, DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Classify a set of observations and associated frequencies and weights using the linear or quadratic discriminant functions generated during the training process.
Parameters:
x - a double matrix containing the observations with at least nVariables columns. The first nVariables columns correspond to the variables. Reclassification does not require group numbers to be present. Any additional columns will be ignored.
frequencies - an int array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
IllegalStateException - is thrown if the leave-out-one classification method is chosen.
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
-
classify
public void classify(double[][] x, int[] varIndex, int[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException, DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Classify a set of observations and associated frequencies and weights using the linear or quadratic discriminant functions generated during the training process.
Parameters:
x - a double matrix containing the observations with at least nVariables columns. The columns indicated in varIndex correspond to the variables. Reclassification does not require group numbers to be present. Any additional columns in x will be ignored.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis.
frequencies - an int array containing the associated frequencies for each observation.
weights - a double array containing the associated weights for each observation.
Throws:
IllegalStateException - is thrown if the leave-out-one classification method is chosen.
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative.
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
-
classify
public void classify(double[][] x, int[] group, int[] varIndex) throws DiscriminantAnalysis.SumOfWeightsNegException, DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Classifies a set of observations and compares the results against known groups using the linear or quadratic discriminant functions generated during the training process.
- Parameters:
x - a double matrix containing the observations with at least nVariables columns. The columns indicated in varIndex correspond to the variables. Any additional columns will be ignored.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis
- Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular
-
classify
public void classify(double[][] x, int[] group, int[] varIndex, int[] frequencies, double[] weights) throws DiscriminantAnalysis.SumOfWeightsNegException, DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Classifies a set of observations with associated frequencies and weights, and compares the results against known groups using the linear or quadratic discriminant functions generated during the training process.
- Parameters:
x - a double matrix containing the observations with at least nVariables columns. The columns indicated in varIndex correspond to the variables. Additional columns are ignored.
group - an int array containing the group number for each observation. The groups must be numbered 1, 2, ..., nGroups.
varIndex - an int array containing the column indices in x that correspond to the variables to be used in the analysis
frequencies - an int array containing the associated frequencies for each observation
weights - a double array containing the associated weights for each observation
- Throws:
DiscriminantAnalysis.SumOfWeightsNegException - is thrown when the sum of the weights has become negative
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular
-
setDiscriminationMethod
public void setDiscriminationMethod(int method)
Specifies the discrimination method to be either linear or quadratic discrimination.
- Parameters:
method - an int scalar indicating the method of discrimination. Use class member LINEAR or QUADRATIC. By default, the LINEAR method is used.
-
setCovarianceComputation
public void setCovarianceComputation(int type)
Specifies the covariance matrix computation to be either pooled only or both pooled and group.
- Parameters:
type - an int scalar indicating the type of covariance matrices to be computed. Use class member POOLED or POOLED_GROUP. By default, POOLED_GROUP is used.
-
setClassificationMethod
public void setClassificationMethod(int method)
Specifies the classification method to be either reclassification or leave-out-one.
- Parameters:
method - an int indicating the method of classification. Use class member RECLASSIFICATION or LEAVE_OUT_ONE. By default, the RECLASSIFICATION method is used.
-
setPrior
public void setPrior(int prior)
Specifies the prior probabilities to be calculated as either equal or proportional priors.
- Parameters:
prior - an int specifying whether the prior probabilities are calculated as equal or proportional. Use class member PRIOR_EQUAL to set equal prior probabilities, calculated as 1.0/nGroups. Use class member PRIOR_PROPORTIONAL to calculate priors proportional to the sample size in each group. The sum of all prior probabilities is equal to 1.0. Any calculated prior less than 1.0e-20 is converted to StrictMath.log(1.0e-20). Prior probabilities are used in calculating the statistics, coefficients, Mahalanobis distances, and classification probabilities. By default, PRIOR_EQUAL is used.
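The two options can be sketched in a few lines. This is an illustrative computation under the stated definitions (1.0/nGroups for equal priors, group count divided by total count for proportional priors); the class and method names are hypothetical, and the library may compute proportional priors from weighted counts rather than raw counts.

```java
// Hypothetical illustration of the two prior options; not the library's code.
final class PriorSketch {
    // Equal priors: every group receives 1.0 / nGroups.
    static double[] equalPriors(int nGroups) {
        double[] p = new double[nGroups];
        for (int g = 0; g < nGroups; g++) {
            p[g] = 1.0 / nGroups;
        }
        return p;
    }

    // Proportional priors: group count divided by the total count.
    // Assumption: at least one observation exists, so the total is positive.
    static double[] proportionalPriors(int[] groupCounts) {
        long total = 0;
        for (int c : groupCounts) {
            total += c;
        }
        double[] p = new double[groupCounts.length];
        for (int g = 0; g < groupCounts.length; g++) {
            p[g] = (double) groupCounts[g] / total;
        }
        return p;
    }

    public static void main(String[] args) {
        double[] p = proportionalPriors(new int[] {50, 30, 20});
        System.out.println(p[0] + " " + p[1] + " " + p[2]); // prints "0.5 0.3 0.2"
    }
}
```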
-
setPrior
public void setPrior(double[] prior)
Specifies user-supplied prior probabilities.
- Parameters:
prior - a double vector of length nGroups containing the prior probabilities for each group. The elements of prior should sum to 1.0. Any element of prior less than 1.0e-20 is converted to StrictMath.log(1.0e-20). By default, the prior probabilities are calculated to be equal; see setPrior(int).
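Because the elements of prior should sum to 1.0, it can be useful to check a vector before passing it in. The sketch below is one possible pre-check; the names and the tolerance parameter are hypothetical, and the source does not say whether the library performs any such validation itself.

```java
// Hypothetical pre-check for a user-supplied prior vector; not part of the library.
final class PriorCheckSketch {
    // True when prior has nGroups entries, each in [0, 1], summing to 1.0 within tol.
    static boolean isValidPrior(double[] prior, int nGroups, double tol) {
        if (prior == null || prior.length != nGroups) {
            return false;
        }
        double sum = 0.0;
        for (double p : prior) {
            if (Double.isNaN(p) || p < 0.0 || p > 1.0) {
                return false;
            }
            sum += p;
        }
        return Math.abs(sum - 1.0) <= tol;
    }

    public static void main(String[] args) {
        System.out.println(isValidPrior(new double[] {0.2, 0.3, 0.5}, 3, 1e-12)); // prints "true"
        System.out.println(isValidPrior(new double[] {0.5, 0.6}, 2, 1e-12));      // prints "false"
    }
}
```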
-
getPrior
public double[] getPrior()
Returns the prior probabilities.
- Returns:
a double array of length nGroups containing the prior probabilities for each group.
-
getGroupCounts
public int[] getGroupCounts()
Returns the group counts.
- Returns:
an int array of length nGroups containing the number of observations in each group. If update has not been called before this method is invoked, an array of all zeros is returned.
-
getMeans
public double[][] getMeans() throws DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Returns the variable means.
- Returns:
an nGroups by nVariables double matrix containing the variable means. The i-th row contains the variable means for group i. If this method is invoked before classification, the unscaled means will be returned.
- Throws:
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
-
getCovariance
public double[][][] getCovariance() throws DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Returns the array of covariances.
- Returns:
a g by nVariables by nVariables double array containing the covariances, where g = nGroups+1 if pooled, group covariance computation is specified, or g = 1 if pooled covariance computation is specified. When only the pooled covariance matrix is computed, the within-group covariance matrices are not computed. The pooled covariance matrix is always computed and is returned as the g-th covariance matrix. If this method is invoked before classification, the unscaled covariance matrix will be returned.
- Throws:
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
-
getCoefficients
public double[][] getCoefficients() throws DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Returns the linear discriminant function coefficients.
- Returns:
an nGroups by (nVariables+1) double matrix containing the linear discriminant function coefficients. The first column of the matrix contains the constant term, and the remaining columns contain the variable coefficients. The i-th row of the returned matrix corresponds to group i. The coefficients are always computed as linear discriminant function coefficients even when quadratic discrimination is specified.
- Throws:
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
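As a sketch of how such a coefficient matrix can be applied, the code below evaluates one linear score per group (constant term plus the dot product of the variable coefficients with the observation) and picks the group with the largest score. The names are hypothetical, and the assignment rule is the usual textbook one; the source does not state how the library combines the coefficients with the priors, so treat this as an assumption.

```java
// Hypothetical illustration of using a coefficient matrix; not the library's classifier.
final class LinearScoreSketch {
    // coefRow[0] is the constant term; coefRow[k + 1] multiplies obs[k].
    static double score(double[] coefRow, double[] obs) {
        double s = coefRow[0];
        for (int k = 0; k < obs.length; k++) {
            s += coefRow[k + 1] * obs[k];
        }
        return s;
    }

    // Returns the 1-based number of the group with the largest linear score.
    static int classify(double[][] coef, double[] obs) {
        int best = 0;
        for (int g = 1; g < coef.length; g++) {
            if (score(coef[g], obs) > score(coef[best], obs)) {
                best = g;
            }
        }
        return best + 1; // groups are numbered 1, 2, ..., nGroups
    }

    public static void main(String[] args) {
        double[][] coef = {{-1.0, 2.0, 0.0}, {0.5, 0.0, 1.0}}; // made-up values
        System.out.println(classify(coef, new double[] {1.0, 1.0})); // prints "2"
    }
}
```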
-
getClassTable
public double[][] getClassTable()
Returns the classification table.
- Returns:
an nGroups by nGroups double matrix containing the classification table. Each classified observation whose group number is 1, 2, ..., nGroups is accumulated into the table. If a known group is provided, the rows of the table correspond to the known group membership and the columns to the group to which the observation was classified. If a known group is not provided, the table contains only the accumulated counts in the column corresponding to the group to which each observation was classified.
- Throws:
IllegalStateException - is thrown if no data has been classified.
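When known groups were supplied, the diagonal of the table holds the correctly classified observations, so an overall accuracy follows directly. The helper below is a hypothetical sketch of that calculation on a table shaped like the one described above.

```java
// Hypothetical helper for summarizing a classification table; not part of the library.
final class ClassTableSketch {
    // Rows are known groups, columns are predicted groups.
    // Returns the diagonal sum divided by the table total, or NaN for an empty table.
    static double accuracy(double[][] table) {
        double correct = 0.0;
        double total = 0.0;
        for (int i = 0; i < table.length; i++) {
            for (int j = 0; j < table[i].length; j++) {
                total += table[i][j];
                if (i == j) {
                    correct += table[i][j];
                }
            }
        }
        return total > 0.0 ? correct / total : Double.NaN;
    }

    public static void main(String[] args) {
        double[][] table = {{9.0, 1.0}, {2.0, 8.0}}; // made-up counts
        System.out.println(accuracy(table)); // prints "0.85"
    }
}
```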
-
getMahalanobis
public double[][] getMahalanobis() throws DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Returns the Mahalanobis distances between the group means.
- Returns:
an nGroups by nGroups double matrix containing the Mahalanobis distances between the group means. For linear discrimination, the Mahalanobis distance is computed using the pooled covariance matrix. Otherwise, the Mahalanobis distance $$D_{ij}^2(x)$$ between group means i and j is computed using the within covariance matrix for group i in place of the pooled covariance matrix.
- Throws:
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
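For reference, the squared Mahalanobis distance between two mean vectors under a covariance matrix S is (m_i - m_j)' S^{-1} (m_i - m_j). The sketch below computes it by solving S y = d with plain Gaussian elimination, which is a stand-in chosen for brevity and not the factorization the library uses; all names are hypothetical.

```java
// Hypothetical illustration of a squared Mahalanobis distance; not the library's code.
final class MahalanobisSketch {
    // Solves s * y = d by Gaussian elimination with partial pivoting.
    static double[] solve(double[][] s, double[] d) {
        int n = d.length;
        double[][] a = new double[n][n + 1]; // augmented matrix [s | d]
        for (int i = 0; i < n; i++) {
            System.arraycopy(s[i], 0, a[i], 0, n);
            a[i][n] = d[i];
        }
        for (int c = 0; c < n; c++) {
            int p = c;
            for (int r = c + 1; r < n; r++) {
                if (Math.abs(a[r][c]) > Math.abs(a[p][c])) {
                    p = r;
                }
            }
            if (a[p][c] == 0.0) {
                throw new ArithmeticException("singular covariance matrix");
            }
            double[] tmp = a[c];
            a[c] = a[p];
            a[p] = tmp;
            for (int r = c + 1; r < n; r++) {
                double f = a[r][c] / a[c][c];
                for (int k = c; k <= n; k++) {
                    a[r][k] -= f * a[c][k];
                }
            }
        }
        double[] y = new double[n];
        for (int i = n - 1; i >= 0; i--) { // back substitution
            double sum = a[i][n];
            for (int k = i + 1; k < n; k++) {
                sum -= a[i][k] * y[k];
            }
            y[i] = sum / a[i][i];
        }
        return y;
    }

    // Returns (meanI - meanJ)' * inverse(cov) * (meanI - meanJ).
    static double squaredDistance(double[] meanI, double[] meanJ, double[][] cov) {
        int n = meanI.length;
        double[] d = new double[n];
        for (int k = 0; k < n; k++) {
            d[k] = meanI[k] - meanJ[k];
        }
        double[] y = solve(cov, d);
        double dist = 0.0;
        for (int k = 0; k < n; k++) {
            dist += d[k] * y[k];
        }
        return dist;
    }

    public static void main(String[] args) {
        double[][] cov = {{2.0, 0.0}, {0.0, 0.5}}; // made-up covariance matrix
        System.out.println(squaredDistance(new double[] {3.0, 2.0}, new double[] {1.0, 1.0}, cov)); // prints "4.0"
    }
}
```

Passing the pooled covariance matrix or a within-group matrix as cov gives the distance under either convention.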
-
getStatistics
public double[] getStatistics() throws DiscriminantAnalysis.EmptyGroupException, DiscriminantAnalysis.CovarianceSingularException
Returns statistics.
- Returns:
a double array containing output statistics.
index                        Description
0                            Sum of the degrees of freedom for the within-covariance matrices.
1                            Chi-squared statistic.
2                            The degrees of freedom in the chi-squared statistic.
3                            Probability of a greater chi-squared. Indices 1 through 3 refer to a test of the homogeneity of the within-covariance matrices. (Not computed when the pooled only covariance matrix is computed.)
4 thru (4+nGroups)           Log of the determinant of each group's covariance matrix (not computed when the pooled only covariance matrix is computed) and of the pooled covariance matrix.
Last (nGroups + 1) elements  Sum of the weights within each group.
Last element                 Sum of the weights in all groups.
- Throws:
DiscriminantAnalysis.EmptyGroupException - is thrown when there are no observations in a group.
DiscriminantAnalysis.CovarianceSingularException - is thrown when the variance-covariance matrix is singular.
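Read literally, the table implies an array of length 4 + (nGroups + 1) + (nGroups + 1). The index helpers below follow from that reading; the array length is not stated in the source, so the layout is an inference and the names are hypothetical.

```java
// Hypothetical index helpers inferred from the statistics table; verify against real output.
final class StatisticsLayoutSketch {
    // Four leading scalars, nGroups + 1 log-determinants, nGroups + 1 weight sums.
    static int expectedLength(int nGroups) {
        return 4 + (nGroups + 1) + (nGroups + 1);
    }

    // group is 1..nGroups for a within-group matrix, nGroups + 1 for the pooled matrix.
    static int logDeterminantIndex(int group) {
        return 4 + (group - 1);
    }

    // Sum of the weights within the given 1-based group.
    static int groupWeightIndex(int nGroups, int group) {
        return 4 + (nGroups + 1) + (group - 1);
    }

    // Sum of the weights over all groups (the last element).
    static int totalWeightIndex(int nGroups) {
        return expectedLength(nGroups) - 1;
    }

    public static void main(String[] args) {
        int nGroups = 3;
        System.out.println(expectedLength(nGroups));      // prints "12"
        System.out.println(groupWeightIndex(nGroups, 1)); // prints "8"
        System.out.println(totalWeightIndex(nGroups));    // prints "11"
    }
}
```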
-
getClassMembership
public int[] getClassMembership()
Returns the group number to which each observation was classified.
- Returns:
an int array containing the group to which each observation was classified. If an observation has an invalid group number, frequency, or weight when the leave-out-one method has been specified, then the observation is not classified and the corresponding element of the array is set to zero. Note that this returns the class membership of the last set of observations classified.
- Throws:
IllegalStateException - is thrown if no data has been classified.
-
getProbability
public double[][] getProbability()
Returns the posterior probabilities for each observation.
- Returns:
an x.length by nGroups double matrix containing the posterior probabilities for each observation. Note that this returns the probabilities of the last set of observations classified.
- Throws:
IllegalStateException - is thrown if no data has been classified.
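A posterior probability matrix of this shape maps to group labels by taking, for each row, the column with the largest probability. The sketch below shows that mapping with hypothetical names; it assumes column g holds the probability for group g + 1 and is not a substitute for getClassMembership.

```java
// Hypothetical illustration of reading a posterior probability matrix; not the library's code.
final class PosteriorSketch {
    // Returns, for each row, the 1-based group number with the largest probability.
    static int[] membership(double[][] prob) {
        int[] groups = new int[prob.length];
        for (int i = 0; i < prob.length; i++) {
            int best = 0;
            for (int g = 1; g < prob[i].length; g++) {
                if (prob[i][g] > prob[i][best]) {
                    best = g;
                }
            }
            groups[i] = best + 1; // groups are numbered 1, 2, ..., nGroups
        }
        return groups;
    }

    public static void main(String[] args) {
        double[][] prob = {{0.7, 0.2, 0.1}, {0.1, 0.3, 0.6}}; // made-up probabilities
        int[] groups = membership(prob);
        System.out.println(groups[0] + " " + groups[1]); // prints "1 3"
    }
}
```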
-
getNRowsMissing
public int getNRowsMissing()
Deprecated. Use getNumberOfRowsMissing() instead.
Returns the number of rows of data encountered containing missing values (NaN).
- Returns:
an int representing the number of rows of data encountered containing missing values (NaN) for the classification, group, weight, and/or frequency variables. If a row of data contains a missing value (NaN) for any of these variables, that row is excluded from the computations.
-
getNumberOfRowsMissing
public int getNumberOfRowsMissing()
Returns the number of rows of data encountered containing missing values (Double.NaN).
- Returns:
an int representing the number of rows of data encountered containing missing values (Double.NaN) for the classification, group, weight, and/or frequency variables. If a row of data contains a missing value (Double.NaN) for any of these variables, that row is excluded from the computations.
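The counting rule can be sketched as follows: a row counts once if any of its values is Double.NaN. This hypothetical helper inspects only the observation matrix, whereas the description above also covers the group, weight, and frequency variables.

```java
// Hypothetical illustration of counting rows with missing values; not the library's code.
final class MissingRowsSketch {
    // Counts the rows of x that contain at least one Double.NaN.
    static int countRowsWithNaN(double[][] x) {
        int missing = 0;
        for (double[] row : x) {
            for (double v : row) {
                if (Double.isNaN(v)) {
                    missing++;
                    break; // count each row at most once
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        double[][] x = {{1.0, 2.0}, {Double.NaN, 3.0}, {4.0, Double.NaN}, {5.0, 6.0}};
        System.out.println(countRowsWithNaN(x)); // prints "2"
    }
}
```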
-