public abstract class PredictiveModel extends Object implements Serializable, Cloneable

| Modifier and Type | Class | Description |
|---|---|---|
| static class | PredictiveModel.PredictiveModelException | An exception class intended to be the parent of all nested Exception classes where the enclosing class extends PredictiveModel. |
| static class | PredictiveModel.StateChangeException | Exception thrown when an input parameter has changed that might affect the model estimates or predictions. |
| static class | PredictiveModel.SumOfProbabilitiesNotOneException | Exception thrown when the sum of probabilities is not approximately one. |
| static class | PredictiveModel.VariableType | An enumeration of data types/characteristics. |

| Modifier | Constructor | Description |
|---|---|---|
| protected | PredictiveModel(double[][] xy, int responseColumnIndex, PredictiveModel.VariableType[] varType) | Constructs a PredictiveModel object for a single response variable and multiple predictor variables. |

| Modifier and Type | Method | Description |
|---|---|---|
| void | fitModel() | Fits the predictive model to the training data (estimates the model using the training data and current configuration settings). |
| double[] | getClassCounts() | Returns the counts of each class (level) of the categorical response variable. |
| double[][] | getCostMatrix() | Returns the cost matrix for a categorical response variable. |
| int | getMaxNumberOfCategories() | Returns the maximum number of categorical variables allowed. |
| int | getNumberOfClasses() | Returns the number of unique classes found in the categorical response data. |
| int | getNumberOfColumns() | Returns the number of columns in xy. |
| int | getNumberOfMissing() | Returns the number of missing values of the response variable found in the data xy. |
| int | getNumberOfPredictors() | Returns the number of predictors. |
| int | getNumberOfRows() | Returns the number of rows in xy (observations). |
| int[] | getNumberOfUniquePredictorValues() | Returns an array containing the number of distinct values of each categorical or ordinal predictor found in the input data. |
| int[] | getPredictorIndexes() | Returns an array of indices into xy where the predictor variables reside. |
| PredictiveModel.VariableType[] | getPredictorTypes() | Returns an array of VariableType objects that correspond to the predictor data types in xy. |
| int | getPrintLevel() | Returns the current print level. |
| double[] | getPriorProbabilities() | Returns an array containing the prior probabilities. |
| int | getResponseColumnIndex() | Returns the column index in xy containing the response variable. |
| double | getResponseVariableAverage() | Returns the weighted average value of the response variable. |
| int | getResponseVariableMostFrequentClass() | Returns the most frequent value of the response variable. |
| PredictiveModel.VariableType | getResponseVariableType() | Returns the variable type of the response variable. |
| double | getTotalWeight() | Returns the sum of the active case weights. |
| PredictiveModel.VariableType[] | getVariableType() | Returns an array containing the variable types in xy. |
| double[] | getWeights() | Returns an array containing the case weights. |
| double[][] | getXY() | Returns a copy of the xy data. |
| boolean | isMustFitModelFlag() | Returns the current value of the mustFitModel flag. |
| boolean | isUserFixedNClasses() | Returns true if the number of classes was fixed by the user. |
| abstract double[] | predict() | Predicts the response variable using the most recent fit. |
| abstract double[] | predict(double[][] testData) | Predicts the response values using the most recent fit and the provided test data. |
| abstract double[] | predict(double[][] testData, double[] testDataWeights) | Predicts the response values using the most recent fit, the provided test data, and the test data case weights. |
| void | setClassCounts(double[] classCounts) | Sets the counts of each class of the response variable. |
| protected abstract void | setConfiguration(PredictiveModel pm) | Sets the configuration of PredictiveModel to that of the input model. |
| void | setCostMatrix(double[][] costMatrix) | Specifies the cost matrix for a categorical response variable. |
| void | setMaxNumberOfCategories(int maxCategories) | Sets the maximum number of categories allowed within categorical predictor variables. |
| void | setNumberOfClasses(int nClasses) | Sets the number of distinct classes of the response variable. |
| void | setPredictorIndex(int[] predIdx) | Sets the array of indices into xy where the predictor variables reside. |
| void | setPredictorTypes(PredictiveModel.VariableType[] predVarType) | Sets the VariableType objects that correspond to the predictor data types in xy. |
| void | setPrintLevel(int printLevel) | Sets a print level that determines the information printed for a PredictiveModel. |
| void | setPriorProbabilities(double[] priors) | Sets the prior probabilities for class membership. |
| void | setWeights(double[] weights) | Specifies the case weights. |
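
The summary above implies a fit-then-predict contract: a subclass overrides fitModel(), calls the inherited version, and predictions rely on the most recent fit. The following is a minimal, self-contained sketch of that pattern only. SketchModel, MeanModel, and FitThenPredictDemo are hypothetical stand-ins, not library classes, and throwing IllegalStateException when the model has not been fit is an assumption made for illustration.

```java
import java.util.Arrays;

/** Hypothetical stand-in for an abstract predictive model; not the library class. */
abstract class SketchModel {
    protected final double[][] xy;
    protected final int responseColumnIndex;
    private boolean mustFitModel = true;

    protected SketchModel(double[][] xy, int responseColumnIndex) {
        this.xy = xy;
        this.responseColumnIndex = responseColumnIndex;
    }

    /** Shared fitting work; subclasses override this and call it via super. */
    public void fitModel() {
        mustFitModel = false;
    }

    public boolean isMustFitModelFlag() {
        return mustFitModel;
    }

    public abstract double[] predict();
}

/** Hypothetical subclass that predicts the unweighted mean response for every row. */
final class MeanModel extends SketchModel {
    private double mean;

    MeanModel(double[][] xy, int responseColumnIndex) {
        super(xy, responseColumnIndex);
    }

    @Override
    public void fitModel() {
        super.fitModel(); // run the shared work first
        double sum = 0.0;
        for (double[] row : xy) {
            sum += row[responseColumnIndex];
        }
        mean = sum / xy.length;
    }

    @Override
    public double[] predict() {
        // Assumption: refuse to predict until the model has been fit.
        if (isMustFitModelFlag()) {
            throw new IllegalStateException("fitModel() must be called before predict()");
        }
        double[] out = new double[xy.length];
        Arrays.fill(out, mean);
        return out;
    }
}

final class FitThenPredictDemo {
    public static void main(String[] args) {
        double[][] xy = {{1.0, 10.0}, {2.0, 20.0}, {3.0, 30.0}};
        MeanModel model = new MeanModel(xy, 1);
        model.fitModel();
        System.out.println(Arrays.toString(model.predict()));
    }
}
```

A real subclass would also implement the two predict overloads that take test data and the setConfiguration method.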

protected PredictiveModel(double[][] xy, int responseColumnIndex, PredictiveModel.VariableType[] varType)
Constructs a PredictiveModel object for a single response variable and multiple predictor variables.
This constructor should be called by all classes extending PredictiveModel.
Parameters:
xy - a double matrix of dimension number of observations by number of variables.
responseColumnIndex - an int specifying the column index of the response variable.
varType - a PredictiveModel.VariableType array of length equal to xy[0].length containing the type of each variable.

public void fitModel() throws PredictiveModel.PredictiveModelException
Fits the predictive model to the training data (estimates the model using the training data and current configuration settings).
Each PredictiveModel subclass must override and call this method.
Throws:
PredictiveModel.PredictiveModelException - an exception has occurred in the common PredictiveModel methods. Implementing or overriding methods from this class may require that exceptions be thrown. Exceptions thrown from these methods will necessarily extend PredictiveModelException.

public double[] getClassCounts()
Returns the counts of each class (level) of the categorical response variable.
If the response variable is neither PredictiveModel.VariableType.CATEGORICAL nor PredictiveModel.VariableType.ORDERED_DISCRETE, null is returned.
Returns:
a double array containing the summation of the case weights for each occurrence of a particular class found in the categorical response data.

public double[][] getCostMatrix()
Returns the cost matrix for a categorical response variable.
The cost matrix has elements C(i, j) = cost of misclassifying a response in class j as in class i. The diagonal elements of the cost matrix must be 0. In the case that nClasses has not been determined (usually because fitModel() has not been called), an array of length zero is returned.
Returns:
a double matrix of dimension nClasses by nClasses containing the cost matrix for a categorical response variable, where nClasses is the number of classes the response variable may assume.

public int getMaxNumberOfCategories()
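Returns the maximum number of categorical variables allowed.
Returns:
an int indicating the maximum number of categorical variables allowed.

public int getNumberOfClasses()
Returns the number of unique classes found in the categorical response data.
Returns:
an int indicating the number of unique classes found in the categorical response data.

public int getNumberOfColumns()
Returns the number of columns in xy.
Returns:
an int that indicates the number of columns.

public int getNumberOfMissing()
Returns the number of missing values of the response variable found in the data xy.
Returns:
an int indicating the number of missing values.

public int getNumberOfPredictors()
Returns the number of predictors.
Returns:
an int equal to the number of predictors.

public int getNumberOfRows()
Returns the number of rows in xy (observations).
Returns:
an int equal to the number of rows in xy (observations).

public int[] getNumberOfUniquePredictorValues()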
Returns an array containing the number of distinct values of each categorical or ordinal predictor found in the input data.
For predictors of type PredictiveModel.VariableType.QUANTITATIVE_CONTINUOUS, the value is set to 0 but is not meaningful.
Returns:
an int array containing the number of distinct values.

public int[] getPredictorIndexes()
Returns an array of indices into xy where the predictor variables reside.
Returns:
an int array containing indices into xy where the predictor variables reside.

public PredictiveModel.VariableType[] getPredictorTypes()
Returns an array of VariableType objects that correspond to the predictor data types in xy.
Returns:
a VariableType array that corresponds to the predictor data types in xy.

public int getPrintLevel()
Returns the current print level.
Returns:
an int indicating the current print level.

| printLevel | Action |
|---|---|
| 0 | No printing. |
| 1 | Prints final results only. |
| 2 | Prints intermediate and final results. |

Default: printLevel = 0.

public double[] getPriorProbabilities()
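Returns an array containing the prior probabilities.
Returns:
a double array containing the prior probabilities.

public int getResponseColumnIndex()
Returns the column index in xy containing the response variable.
Returns:
an int specifying the column index for the response variable.

public double getResponseVariableAverage()
Returns the weighted average value of the response variable.
Returns:
a double equal to the weighted average value.

public int getResponseVariableMostFrequentClass()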
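Returns the most frequent value of the response variable.
Only applies when the response variable is of type VariableType.CATEGORICAL or VariableType.ORDERED_DISCRETE.
Returns:
an int equal to the most frequent class of the response variable.

public PredictiveModel.VariableType getResponseVariableType()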
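Returns the variable type of the response variable.
Returns:
the VariableType of the response variable.

public double getTotalWeight()
Returns the sum of the active case weights.
Returns:
a double indicating the sum of the active case weights.

public PredictiveModel.VariableType[] getVariableType()
Returns an array containing the variable types in xy.
Returns:
a VariableType array containing the variable types in xy.

public double[] getWeights()
Returns an array containing the case weights.
Returns:
a double array containing the case weights.

public double[][] getXY()
Returns a copy of the xy data.
Returns:
a double matrix containing the training data.

public boolean isMustFitModelFlag()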
Returns the current value of the mustFitModel flag.
When true, the fitModel() method should be called before doing any predictions or other analysis.
Returns:
a boolean value indicating the state of the flag.

public boolean isUserFixedNClasses()
Returns true if the number of classes was fixed by the user.
Returns:
a boolean value indicating whether or not the number of classes has been fixed by the user.

public abstract double[] predict() throws PredictiveModel.PredictiveModelException
Predicts the response variable using the most recent fit.
Each PredictiveModel subclass must override this method.
Returns:
a double array containing the predicted values.
Throws:
PredictiveModel.PredictiveModelException - an exception has occurred in the common PredictiveModel methods. Implementing or overriding methods from this class may require that exceptions be thrown. Exceptions thrown from these methods will necessarily extend PredictiveModelException.

public abstract double[] predict(double[][] testData) throws PredictiveModel.PredictiveModelException
Predicts the response values using the most recent fit and the provided test data.
Each PredictiveModel subclass must override this method.
Parameters:
testData - a double matrix containing data to be predicted. testData must have the same number of columns, in the same arrangement, as xy (the observations).
Returns:
a double array containing the predicted values.
Throws:
PredictiveModel.PredictiveModelException - an exception has occurred in the common PredictiveModel methods. Implementing or overriding methods from this class may require that exceptions be thrown. Exceptions thrown from these methods will necessarily extend PredictiveModelException.

public abstract double[] predict(double[][] testData, double[] testDataWeights) throws PredictiveModel.PredictiveModelException
Predicts the response values using the most recent fit, the provided test data, and the test data case weights.
Each PredictiveModel subclass must override this method.
Parameters:
testData - a double matrix containing data to be predicted. testData must have the same number of columns, in the same arrangement, as xy (the observations).
testDataWeights - a double array containing weights for each row of testData.
Returns:
a double array containing the predicted values.
Throws:
PredictiveModel.PredictiveModelException - an exception has occurred in the common PredictiveModel methods. Implementing or overriding methods from this class may require that exceptions be thrown. Exceptions thrown from these methods will necessarily extend PredictiveModelException.

public void setClassCounts(double[] classCounts)
Sets the counts of each class of the response variable.
Use this method to set the class counts when one or more classes do not occur in the training data due to sampling but are otherwise valid, or when the data is distributed and the global counts are available. Only applies when the response variable is of type PredictiveModel.VariableType.CATEGORICAL or PredictiveModel.VariableType.ORDERED_DISCRETE.
Parameters:
classCounts - a double array containing the class counts of the response variable.
The default is to use the class counts discovered in the input matrix, xy, weighted by the values in weights.
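
The default class counts described above are sums of case weights per class. The following sketch shows one way such counts could be computed outside the library. ClassCountsSketch and weightedClassCounts are hypothetical names, and the sketch assumes that class labels are coded as 0 through nClasses - 1 and that a missing response is coded as NaN; neither assumption is confirmed by this page.

```java
import java.util.Arrays;

final class ClassCountsSketch {
    /**
     * Sums the case weights per class of the response column.
     * Assumes labels are coded 0..nClasses-1 and missing responses are NaN.
     */
    static double[] weightedClassCounts(double[][] xy, int responseColumnIndex,
                                        double[] weights, int nClasses) {
        double[] counts = new double[nClasses];
        for (int i = 0; i < xy.length; i++) {
            double response = xy[i][responseColumnIndex];
            if (Double.isNaN(response)) {
                continue; // skip rows with a missing response
            }
            counts[(int) response] += weights[i];
        }
        return counts;
    }

    public static void main(String[] args) {
        double[][] xy = {{0.5, 0}, {1.5, 1}, {2.5, 1}, {3.5, 0}};
        double[] weights = {1.0, 2.0, 1.0, 1.0};
        // nClasses = 3: class 2 is valid but does not occur in this sample.
        System.out.println(Arrays.toString(weightedClassCounts(xy, 1, weights, 3)));
    }
}
```

An array built this way, possibly combined across partitions of distributed data, is the kind of input setClassCounts is described as accepting.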

protected abstract void setConfiguration(PredictiveModel pm) throws PredictiveModel.PredictiveModelException
Sets the configuration of PredictiveModel to that of the input model.
The implementation should include model-specific input methods (subclass methods) that must necessarily be set. An example would be problem-specific constraints such as tree depth for a DecisionTree in an ensemble prediction algorithm (see BootstrapAggregation). Here DecisionTree.setMaxDepth(int) is called within this overridden method to constrain the permuted trees to the same maximum dimensions.
Each PredictiveModel subclass must override this method.
Parameters:
pm - a PredictiveModel object.
Throws:
PredictiveModel.PredictiveModelException - an exception class intended to be the parent of all nested Exception classes where the enclosing class extends PredictiveModel.

public void setCostMatrix(double[][] costMatrix)
Specifies the cost matrix for a categorical response variable.
Parameters:
costMatrix - a square double matrix of dimension nClasses by nClasses containing elements C(i, j), the cost of misclassifying a response in class j as in class i. The diagonal elements of the cost matrix must be 0. Both dimensions of costMatrix should agree with the number of classes found in the data. Otherwise an exception will be thrown.
Default: costMatrix[i][j]=1.0 where i ≠ j, and costMatrix[i][i]=0.0.
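
To make the default and the C(i, j) convention concrete, here is a small sketch that builds the default cost matrix, checks the stated constraints, and picks the class with the lowest expected cost. CostMatrixSketch and its methods are hypothetical names. Choosing the minimum expected cost class is a standard decision rule shown for illustration; this page does not state how a particular model uses the cost matrix.

```java
final class CostMatrixSketch {
    /** Builds the default described above: 1.0 off the diagonal, 0.0 on it. */
    static double[][] defaultCostMatrix(int nClasses) {
        double[][] c = new double[nClasses][nClasses];
        for (int i = 0; i < nClasses; i++) {
            for (int j = 0; j < nClasses; j++) {
                c[i][j] = (i == j) ? 0.0 : 1.0;
            }
        }
        return c;
    }

    /** Checks that the matrix is nClasses by nClasses with a zero diagonal. */
    static boolean isValid(double[][] c, int nClasses) {
        if (c.length != nClasses) {
            return false;
        }
        for (int i = 0; i < nClasses; i++) {
            if (c[i].length != nClasses || c[i][i] != 0.0) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the class i that minimizes sum_j C(i, j) * p[j], where p[j] is
     * the probability that the true class is j.
     */
    static int minExpectedCostClass(double[][] c, double[] p) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int i = 0; i < c.length; i++) {
            double cost = 0.0;
            for (int j = 0; j < p.length; j++) {
                cost += c[i][j] * p[j];
            }
            if (cost < bestCost) {
                bestCost = cost;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] c = defaultCostMatrix(3);
        double[] p = {0.2, 0.5, 0.3};
        System.out.println(isValid(c, 3));
        System.out.println(minExpectedCostClass(c, p));
    }
}
```

With the default matrix the expected cost of predicting class i is 1 - p[i], so the rule reduces to choosing the most probable class.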

public void setMaxNumberOfCategories(int maxCategories)
Sets the maximum number of categories allowed within categorical predictor variables.
Parameters:
maxCategories - an int specifying the maximum number of categories a predictor variable can have.
Default: maxCategories=10

public void setNumberOfClasses(int nClasses)
Sets the number of distinct classes of the response variable.
Only applies when the response variable is of type PredictiveModel.VariableType.CATEGORICAL or PredictiveModel.VariableType.ORDERED_DISCRETE.
Parameters:
nClasses - an int representing the number of distinct classes or categories of the response variable. An error is generated if more than nClasses categories are discovered in the data.
Default: nClasses is 0.

public void setPredictorIndex(int[] predIdx)
Sets the array of indices into xy where the predictor variables reside.
This may be used to subset the full set of predictor variables (getPredictorTypes()).
Parameters:
predIdx - an int array containing the column index for each predictor variable.
Default: All columns other than the column containing the response variable are indicated.

public void setPredictorTypes(PredictiveModel.VariableType[] predVarType)
Sets the VariableType objects that correspond to the predictor data types in xy.
Parameters:
predVarType - a VariableType array of length equal to the number of predictors specifying the data type of each predictor.

public void setPrintLevel(int printLevel)
Sets a print level that determines the information printed for a PredictiveModel.
Parameters:
printLevel - an int specifying the level of printing to perform.

| printLevel | Action |
|---|---|
| 0 | No printing. |
| 1 | Prints final results only. |
| 2 | Prints intermediate and final results. |

Default: printLevel = 0.

public void setPriorProbabilities(double[] priors) throws PredictiveModel.SumOfProbabilitiesNotOneException
Sets the prior probabilities for class membership.
Parameters:
priors - a double array specifying the prior probabilities of class membership for each class. Each prior probability must lie between 0.0 and 1.0 inclusive, and the probabilities must sum to 1.0. The length of priors should agree with the number of classes found in the data. Otherwise an exception is thrown. Calling this method overwrites any existing values.
Default: Determined from the data.
Throws:
PredictiveModel.SumOfProbabilitiesNotOneException - prior probabilities must sum to 1.0.

public void setWeights(double[] weights)
Specifies the case weights.
Parameters:
weights - a double array specifying case weights.
Default: weights[i] = 1.0 for all i.
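
The constraints listed for setPriorProbabilities (each value in [0.0, 1.0], a sum of approximately one, and a length equal to the number of classes) can be checked before calling the setter. The sketch below is one way to do that. PriorsSketch, checkPriors, and the tolerance value are hypothetical, and the sketch throws IllegalArgumentException rather than the library's SumOfProbabilitiesNotOneException so that it stays self-contained.

```java
final class PriorsSketch {
    // Hypothetical tolerance; the library's own tolerance is not stated on this page.
    static final double TOLERANCE = 1e-10;

    /** Throws IllegalArgumentException if priors violates a documented constraint. */
    static void checkPriors(double[] priors, int nClasses) {
        if (priors.length != nClasses) {
            throw new IllegalArgumentException("priors length must equal the number of classes");
        }
        double sum = 0.0;
        for (double p : priors) {
            // The negated form also rejects NaN.
            if (!(p >= 0.0 && p <= 1.0)) {
                throw new IllegalArgumentException("each prior must lie in [0.0, 1.0]");
            }
            sum += p;
        }
        if (Math.abs(sum - 1.0) > TOLERANCE) {
            throw new IllegalArgumentException("priors must sum to 1.0");
        }
    }

    public static void main(String[] args) {
        checkPriors(new double[] {0.25, 0.25, 0.5}, 3);
        System.out.println("priors accepted");
    }
}
```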
Copyright © 1970-2015 Rogue Wave Software
Built June 18 2015.