RandomTrees (JMSL Numerical Library (jmsl) 2021.0.0 API)

java.lang.Object
- com.imsl.datamining.PredictiveModel
- - com.imsl.datamining.decisionTree.RandomTrees

All Implemented Interfaces:

Serializable, Cloneable
```
public class RandomTrees
extends PredictiveModel
implements Serializable, Cloneable
```
Generates predictions using a random forest of decision trees.
A random forest is an ensemble of decision trees. As in bootstrap aggregation (bagging), a tree is fit to each of M bootstrap samples drawn from the training data, and each tree is then used to generate predictions. For a regression problem (continuous response variable), the M predictions are combined into a single predicted value by averaging. For classification (categorical response variable), majority vote is used.

A random forest also randomizes the predictors: in every tree, the splitting variable at every node is selected from a random subset of the predictors. Randomizing the predictors reduces correlation among the individual trees.

The random forest was introduced by Leo Breiman (Breiman, 2001). Random Forests™ is the trademarked term for this approach. See also Hastie, Tibshirani, and Friedman (2008) for further discussion.
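
The aggregation step described above can be illustrated with a small, library-independent sketch. The class `EnsembleAggregation` and its methods are hypothetical names introduced here for illustration; they are not part of JMSL, and the tie-breaking rule is an arbitrary assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative helper, NOT part of JMSL: combines the M per-tree predictions
// for a single observation into one ensemble prediction.
final class EnsembleAggregation {

    // Regression: average the M per-tree predictions.
    static double average(double[] treePredictions) {
        double sum = 0.0;
        for (double p : treePredictions) {
            sum += p;
        }
        return sum / treePredictions.length;
    }

    // Classification: return the class label predicted by the most trees.
    // Assumption for this sketch: ties go to the smallest label.
    // The input must contain at least one prediction.
    static int majorityVote(int[] treePredictions) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int label : treePredictions) {
            counts.merge(label, 1, Integer::sum);
        }
        int best = treePredictions[0];
        int bestCount = 0;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            int c = e.getValue();
            if (c > bestCount || (c == bestCount && e.getKey() < best)) {
                best = e.getKey();
                bestCount = c;
            }
        }
        return best;
    }
}
```

For example, three trees predicting 1.0, 2.0, and 6.0 for a regression response average to 3.0, and four trees voting for classes 2, 0, 2, and 1 yield class 2.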

See Also:

Example 1, Example 2, Example 3, Serialized Form

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`RandomTrees.ReflectiveOperationException` Class that wraps exceptions thrown by reflective operations in core reflection.

Nested classes/interfaces inherited from class com.imsl.datamining.PredictiveModel
PredictiveModel.CloneNotSupportedException, PredictiveModel.PredictiveModelException, PredictiveModel.StateChangeException, PredictiveModel.SumOfProbabilitiesNotOneException, PredictiveModel.VariableType

Constructor Summary

Constructors
Constructor and Description
`RandomTrees(DecisionTree dt)` Constructs a `RandomTrees` random forest of the input decision tree.
`RandomTrees(double[][] xy, int responseColumnIndex, PredictiveModel.VariableType[] varType)` Constructs a `RandomTrees` random forest of `ALACART` decision trees.
`RandomTrees(RandomTrees rtModel)` Constructs a copy of the input `RandomTrees` predictive model.

Method Summary

Modifier and Type	Method and Description
`RandomTrees`	`clone()` Clones a `RandomTrees` predictive model.
`void`	`fitModel()` Fits the random forest to the training data.
`int`	`getNumberOfRandomFeatures()` Returns the number of random features used in the splitting rules.
`int`	`getNumberOfTrees()` Returns the number of trees.
`double`	`getOutOfBagPredictionError()` Returns the out-of-bag prediction error.
`double[]`	`getOutOfBagPredictions()` Returns the out-of-bag predicted values for the examples in the training data.
`double[]`	`getVariableImportance()` Returns the variable importance measure based on the out-of-bag prediction error.
`boolean`	`isCalculateVariableImportance()` Returns the current setting of the boolean to calculate variable importance.
`double[]`	`predict()` Returns the predicted values generated by the random forest on the training data.
`double[]`	`predict(double[][] testData)` Returns the predicted values on the input test data.
`double[]`	`predict(double[][] testData, double[] testDataWeights)` Returns the predicted values on the input test data, using the supplied test data weights.
`void`	`setCalculateVariableImportance(boolean calculate)` Sets the boolean to calculate variable importance.
`protected void`	`setConfiguration(PredictiveModel pm)` Sets the configuration of `RandomTrees` to that of the input model.
`void`	`setNumberOfRandomFeatures(int numberOfRandomFeatures)` Sets the number of random features used in the splitting rules.
`void`	`setNumberOfThreads(int numberOfThreads)` Sets the maximum number of `java.lang.Thread` instances that may be used for parallel processing.
`void`	`setNumberOfTrees(int numberOfTrees)` Sets the number of trees to generate in the random forest.

Methods inherited from class com.imsl.datamining.PredictiveModel
getClassCounts, getClassErrors, getClassLabels, getClassProbabilities, getCostMatrix, getMaxNumberOfCategories, getMaxNumberOfIterations, getNumberOfClasses, getNumberOfColumns, getNumberOfMissing, getNumberOfPredictors, getNumberOfRows, getNumberOfUniquePredictorValues, getPredictorIndexes, getPredictorTypes, getPrintLevel, getPriorProbabilities, getRandomObject, getResponseColumnIndex, getResponseVariableAverage, getResponseVariableMostFrequentClass, getResponseVariableType, getTotalWeight, getVariableType, getWeights, getXY, isConstantSeries, isMustFitModel, isUserFixedNClasses, setClassCounts, setClassLabels, setClassProbabilities, setCostMatrix, setMaxNumberOfCategories, setMaxNumberOfIterations, setMustFitModel, setNumberOfClasses, setPredictorIndex, setPredictorTypes, setPrintLevel, setPriorProbabilities, setRandomObject, setResponseColumnIndex, setTrainingData, setVariableType, setWeights

Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - RandomTrees
```
public RandomTrees(double[][] xy,
                   int responseColumnIndex,
                   PredictiveModel.VariableType[] varType)
```
    Constructs a RandomTrees random forest of ALACART decision trees.
    
    Parameters:
    
    xy - a double matrix containing the training data
    
    responseColumnIndex - an int, the column index for the response variable
    
    varType - a PredictiveModel.VariableType array containing the type of each variable
  - RandomTrees
```
public RandomTrees(DecisionTree dt)
```
    Constructs a RandomTrees random forest of the input decision tree.
    
    Parameters:
    
    dt - a DecisionTree object
  - RandomTrees
```
public RandomTrees(RandomTrees rtModel)
```
    Constructs a copy of the input RandomTrees predictive model.
    
    Parameters:
    
    rtModel - a RandomTrees predictive model
- Method Detail
  - clone
```
public RandomTrees clone()
```
    Clones a RandomTrees predictive model.
    
    Specified by:
    
    clone in class PredictiveModel
    
    Returns:
    
    a clone of the RandomTrees predictive model
  - setNumberOfTrees
```
public void setNumberOfTrees(int numberOfTrees)
```
    Sets the number of trees to generate in the random forest.
    The number of trees is equivalent to the number of bootstrap samples.
    
    Parameters:
    
    numberOfTrees - an int, the number of trees to generate
    Default: numberOfTrees=50
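    
    Because each tree is fit to its own bootstrap sample, the number of trees equals the number of samples drawn. The following is a minimal sketch of that sampling step in plain Java. The class `BootstrapSketch` and its methods are hypothetical and do not reflect how JMSL draws its samples internally.

```java
import java.util.Random;

// Illustrative only, NOT part of JMSL: draws one bootstrap sample per tree.
final class BootstrapSketch {

    // Draws n row indices uniformly at random, with replacement, from 0..n-1.
    static int[] bootstrapIndices(int n, Random rng) {
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) {
            idx[i] = rng.nextInt(n);
        }
        return idx;
    }

    // Returns numberOfTrees index sets, one bootstrap sample for each tree.
    static int[][] samplesForForest(int numberOfTrees, int n, long seed) {
        Random rng = new Random(seed);
        int[][] samples = new int[numberOfTrees][];
        for (int t = 0; t < numberOfTrees; t++) {
            samples[t] = bootstrapIndices(n, rng);
        }
        return samples;
    }
}
```

    With the default of 50 trees and a training set of 20 rows, `samplesForForest(50, 20, seed)` returns 50 index arrays of length 20. Rows that a given sample never draws are that tree's out-of-bag rows.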
  - setNumberOfRandomFeatures
```
public void setNumberOfRandomFeatures(int numberOfRandomFeatures)
```
    Sets the number of random features used in the splitting rules.
    
    Parameters:
    
    numberOfRandomFeatures - an int, the number of predictors in the random subset
    Default: numberOfRandomFeatures = $\sqrt{p}$ for classification problems and $\frac{p}{3}$ for regression problems, where $p$ is the number of predictors in the training data.
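    
    The defaults above can be sketched as follows, together with drawing a random subset of that size. The class `FeatureSubsetSketch` is hypothetical. Rounding down and enforcing a minimum of one feature are assumptions of this sketch, since the rounding rule is not stated here.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative only, NOT part of JMSL.
final class FeatureSubsetSketch {

    // sqrt(p) for classification; rounded down, at least 1 (assumed rounding).
    static int defaultForClassification(int p) {
        return Math.max(1, (int) Math.floor(Math.sqrt(p)));
    }

    // p/3 for regression; integer division rounds down, at least 1 (assumed).
    static int defaultForRegression(int p) {
        return Math.max(1, p / 3);
    }

    // Picks m distinct predictor indices out of 0..p-1 using a partial
    // Fisher-Yates shuffle. Requires 0 <= m <= p.
    static int[] randomSubset(int p, int m, Random rng) {
        int[] idx = new int[p];
        for (int i = 0; i < p; i++) {
            idx[i] = i;
        }
        for (int i = 0; i < m; i++) {
            int j = i + rng.nextInt(p - i);
            int tmp = idx[i];
            idx[i] = idx[j];
            idx[j] = tmp;
        }
        return Arrays.copyOf(idx, m);
    }
}
```

    Under these assumptions, 16 predictors give 4 random features for classification and 12 predictors give 4 for regression.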
  - getNumberOfRandomFeatures
```
public int getNumberOfRandomFeatures()
```
    Returns the number of random features used in the splitting rules.
    
    Returns:
    
    an int, the number of random features
  - setCalculateVariableImportance
```
public void setCalculateVariableImportance(boolean calculate)
```
    Sets the boolean to calculate variable importance.
    When true, a permutation type variable importance measure is calculated during bootstrap aggregation.
    
    Parameters:
    
    calculate - a boolean indicating whether or not to calculate variable importance
    Default: calculate = false
  - isCalculateVariableImportance
```
public boolean isCalculateVariableImportance()
```
    Returns the current setting of the boolean to calculate variable importance.
    
    Returns:
    
    a boolean, the current setting of the flag
  - getNumberOfTrees
```
public int getNumberOfTrees()
```
    Returns the number of trees.
    
    Returns:
    
    an int, the number of trees
  - setNumberOfThreads
```
public void setNumberOfThreads(int numberOfThreads)
```
    Sets the maximum number of java.lang.Thread instances that may be used for parallel processing.
    
    Parameters:
    
    numberOfThreads - an int specifying the maximum number of java.lang.Thread instances that may be used for parallel processing.
    The actual number of threads used in parallel processing will be the lesser of numberOfThreads and numberOfTrees, the number of trees in the random forest. This assessment is made to optimize use of resources.
    
    Default: numberOfThreads = 1.
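    
    The thread cap described above can be illustrated with a standard `java.util.concurrent` thread pool. This is only a sketch of the stated rule (the lesser of numberOfThreads and numberOfTrees). The class `ThreadCapSketch` and its placeholder tasks are hypothetical and say nothing about how JMSL schedules its work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only, NOT part of JMSL.
final class ThreadCapSketch {

    // The lesser of the requested thread count and the number of trees.
    static int effectiveThreads(int numberOfThreads, int numberOfTrees) {
        return Math.min(numberOfThreads, numberOfTrees);
    }

    // Runs one placeholder task per tree on a pool capped at the effective
    // thread count and returns how many tasks completed.
    // Both arguments must be at least 1.
    static int runTreeTasks(int numberOfThreads, int numberOfTrees) {
        ExecutorService pool = Executors.newFixedThreadPool(
                effectiveThreads(numberOfThreads, numberOfTrees));
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int t = 0; t < numberOfTrees; t++) {
                final int treeId = t;
                tasks.add(() -> treeId); // stand-in for "fit tree treeId"
            }
            int completed = 0;
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                f.get();
                completed++;
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

    Requesting 8 threads for a forest of 5 trees therefore uses at most 5 threads, since additional threads would have no tree to work on.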
  - fitModel
```
public void fitModel()
              throws PredictiveModel.PredictiveModelException
```
    Fits the random forest to the training data.
    
    Overrides:
    
    fitModel in class PredictiveModel
    
    Throws:
    
    PredictiveModel.PredictiveModelException - thrown when an exception occurs in com.imsl.datamining.PredictiveModel. Exceptions defined in the superclass, such as com.imsl.datamining.PredictiveModel.StateChangeException and com.imsl.datamining.PredictiveModel.SumOfProbabilitiesNotOneException, should also be considered.
  - setConfiguration
```
protected void setConfiguration(PredictiveModel pm)
                         throws PredictiveModel.PredictiveModelException
```
    Sets the configuration of RandomTrees to that of the input model.
    
    Specified by:
    
    setConfiguration in class PredictiveModel
    
    Parameters:
    
    pm - a RandomTrees object
    
    Throws:
    
    PredictiveModel.PredictiveModelException - thrown when an exception occurs in com.imsl.datamining.PredictiveModel. Exceptions defined in the superclass, such as com.imsl.datamining.PredictiveModel.StateChangeException and com.imsl.datamining.PredictiveModel.SumOfProbabilitiesNotOneException, should also be considered.
  - predict
```
public double[] predict()
                 throws PredictiveModel.PredictiveModelException
```
    Returns the predicted values generated by the random forest on the training data.
    
    Specified by:
    
    predict in class PredictiveModel
    
    Returns:
    
    a double array containing the fitted values
    
    Throws:
    
    PredictiveModel.PredictiveModelException - thrown when an exception occurs in com.imsl.datamining.PredictiveModel. Exceptions defined in the superclass, such as com.imsl.datamining.PredictiveModel.StateChangeException and com.imsl.datamining.PredictiveModel.SumOfProbabilitiesNotOneException, should also be considered.
  - predict
```
public double[] predict(double[][] testData)
                 throws PredictiveModel.PredictiveModelException
```
    Returns the predicted values on the input test data.
    
    Specified by:
    
    predict in class PredictiveModel
    
    Parameters:
    
    testData - a double matrix containing test data
    Note: testData must have the same number of columns as xy and the columns must be in the same arrangement as in xy.
    
    Returns:
    
    a double array containing the predicted values
    
    Throws:
    
    PredictiveModel.PredictiveModelException - thrown when an exception occurs in com.imsl.datamining.PredictiveModel. Exceptions defined in the superclass, such as com.imsl.datamining.PredictiveModel.StateChangeException and com.imsl.datamining.PredictiveModel.SumOfProbabilitiesNotOneException, should also be considered.
  - predict
```
public double[] predict(double[][] testData,
                        double[] testDataWeights)
                 throws PredictiveModel.PredictiveModelException
```
    Returns the predicted values on the input test data, using the supplied test data weights.
    
    Specified by:
    
    predict in class PredictiveModel
    
    Parameters:
    
    testData - a double matrix containing test data
    
    testDataWeights - a double array containing weight values for each row of testData
    Note: testData must have the same number of columns as xy and the columns must be in the same arrangement as in xy.
    
    Returns:
    
    a double array containing the predicted values
    
    Throws:
    
    PredictiveModel.PredictiveModelException - thrown when an exception occurs in com.imsl.datamining.PredictiveModel. Exceptions defined in the superclass, such as com.imsl.datamining.PredictiveModel.StateChangeException and com.imsl.datamining.PredictiveModel.SumOfProbabilitiesNotOneException, should also be considered.
  - getOutOfBagPredictions
```
public double[] getOutOfBagPredictions()
```
    Returns the out-of-bag predicted values for the examples in the training data.
    
    Returns:
    
    a double array containing the out-of-bag predictions
  - getOutOfBagPredictionError
```
public double getOutOfBagPredictionError()
```
    Returns the out-of-bag prediction error.
    
    Returns:
    
    a double, the out-of-bag prediction error
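    
    As a rough illustration of the out-of-bag idea for a regression response, the sketch below predicts each training row using only the trees whose bootstrap sample did not contain that row. The class `OutOfBagSketch`, its array layout, and the choice of mean squared error as the error measure are assumptions of this sketch. The error measure this class actually reports is not specified here.

```java
// Illustrative only, NOT part of JMSL.
final class OutOfBagSketch {

    // treePredictions[t][i]: prediction of tree t for training row i.
    // inBag[t][i]: true if row i was drawn into tree t's bootstrap sample.
    // Returns, for each row, the average prediction over the trees for which
    // the row is out-of-bag, or NaN if the row was in every sample.
    static double[] oobPredictions(double[][] treePredictions, boolean[][] inBag) {
        int nTrees = treePredictions.length;
        int nRows = treePredictions[0].length;
        double[] oob = new double[nRows];
        for (int i = 0; i < nRows; i++) {
            double sum = 0.0;
            int count = 0;
            for (int t = 0; t < nTrees; t++) {
                if (!inBag[t][i]) {
                    sum += treePredictions[t][i];
                    count++;
                }
            }
            oob[i] = count > 0 ? sum / count : Double.NaN;
        }
        return oob;
    }

    // Mean squared error over the rows that have an out-of-bag prediction
    // (assumed error measure for this sketch).
    static double oobMeanSquaredError(double[] y, double[] oob) {
        double sse = 0.0;
        int n = 0;
        for (int i = 0; i < y.length; i++) {
            if (!Double.isNaN(oob[i])) {
                double d = y[i] - oob[i];
                sse += d * d;
                n++;
            }
        }
        return n > 0 ? sse / n : Double.NaN;
    }
}
```

    Because every prediction comes from trees that never saw the row, the resulting error behaves like a test-set estimate without requiring a separate hold-out sample.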
  - getVariableImportance
```
public double[] getVariableImportance()
```
    Returns the variable importance measure based on the out-of-bag prediction error.
    Variable importance for a predictor is obtained by randomly permuting the out-of-bag values of the predictor and calculating the difference in predictive accuracy, before and after the permutation. The measure is averaged over all the trees.
    
    Returns:
    
    a double array containing variable importance for each predictor
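    
    The permutation measure described above can be sketched for a single tree and a single predictor. The class `PermutationImportanceSketch` is hypothetical, the tree is represented by an arbitrary function from a row to a prediction, and the sign convention (error after permutation minus error before) and the use of mean squared error are assumptions of this sketch. In a forest, the per-tree values would then be averaged.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Illustrative only, NOT part of JMSL.
final class PermutationImportanceSketch {

    // Mean squared error of the tree's predictions on rows x with responses y.
    static double mse(ToDoubleFunction<double[]> tree, double[][] x, double[] y) {
        double sse = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = y[i] - tree.applyAsDouble(x[i]);
            sse += d * d;
        }
        return sse / x.length;
    }

    // Importance of one predictor for one tree: the increase in out-of-bag
    // error after shuffling that predictor's column among the out-of-bag rows.
    static double importance(ToDoubleFunction<double[]> tree, double[][] oobX,
                             double[] oobY, int predictor, Random rng) {
        double before = mse(tree, oobX, oobY);
        // Copy the rows so the caller's data is left untouched.
        double[][] permuted = new double[oobX.length][];
        for (int i = 0; i < oobX.length; i++) {
            permuted[i] = oobX[i].clone();
        }
        // Fisher-Yates shuffle of the chosen column only.
        for (int i = permuted.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            double tmp = permuted[i][predictor];
            permuted[i][predictor] = permuted[j][predictor];
            permuted[j][predictor] = tmp;
        }
        return mse(tree, permuted, oobY) - before;
    }
}
```

    A predictor that the tree never uses has an importance of exactly zero under this sketch, because shuffling its values cannot change any prediction.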
