This example applies the QUEST method to a simulated data set with 50 cases and three predictors of mixed type. Shown are the maximal tree grown under default controls and the subtree that results from pruning with a cost-complexity value of 0.0.
import com.imsl.datamining.decisionTree.*;

public class DecisionTreeEx4 {

    public static void main(String[] args) throws Exception {
        // Simulated training data: columns 0-2 are the predictors
        // (categorical, quantitative continuous, categorical) and
        // column 3 is the categorical response with classes 0-3.
        double[][] xy = {
            {2, 25.928690, 0, 0},
            {1, 51.632450, 1, 1},
            {1, 25.784321, 0, 2},
            {0, 39.379478, 0, 3},
            {2, 24.650579, 0, 2},
            {2, 45.200840, 0, 2},
            {2, 52.679600, 1, 3},
            {1, 44.283421, 1, 3},
            {2, 40.635231, 1, 3},
            {2, 51.760941, 0, 3},
            {2, 26.303680, 0, 1},
            {2, 20.702299, 1, 0},
            {2, 38.742729, 1, 3},
            {2, 19.473330, 0, 0},
            {1, 26.422110, 0, 0},
            {2, 37.059860, 1, 0},
            {1, 51.670429, 1, 3},
            {0, 42.401562, 0, 3},
            {2, 33.900269, 1, 2},
            {1, 35.432819, 0, 0},
            {1, 44.303692, 0, 1},
            {0, 46.723869, 0, 2},
            {1, 46.992619, 0, 2},
            {0, 36.059231, 0, 3},
            {2, 36.831970, 1, 1},
            {1, 61.662571, 1, 2},
            {0, 25.677139, 0, 3},
            {1, 39.085670, 1, 0},
            {0, 48.843410, 1, 1},
            {1, 39.343910, 0, 3},
            {2, 24.735220, 0, 2},
            {1, 50.552509, 1, 3},
            {0, 31.342630, 1, 3},
            {1, 27.157949, 1, 0},
            {0, 31.726851, 0, 2},
            {0, 25.004080, 0, 3},
            {1, 26.354570, 1, 3},
            {2, 38.123428, 0, 1},
            {0, 49.940300, 0, 2},
            {1, 42.457790, 1, 3},
            {0, 38.809479, 1, 1},
            {0, 43.227989, 1, 1},
            {0, 41.876240, 0, 3},
            {2, 48.078201, 0, 2},
            {0, 43.236729, 1, 0},
            {2, 39.412941, 0, 3},
            {1, 23.933460, 0, 2},
            {2, 42.841301, 1, 3},
            {2, 30.406691, 0, 1},
            {0, 37.773891, 0, 2}
        };

        // Variable types for the three predictors and the response.
        DecisionTree.VariableType[] varType = {
            DecisionTree.VariableType.CATEGORICAL,
            DecisionTree.VariableType.QUANTITATIVE_CONTINUOUS,
            DecisionTree.VariableType.CATEGORICAL,
            DecisionTree.VariableType.CATEGORICAL
        };

        // The response variable is in column 3.
        QUEST dt = new QUEST(xy, 3, varType);
        dt.setPrintLevel(1);

        // Grow the maximal tree, then prune with a
        // cost-complexity value of 0.0.
        dt.fitModel();
        dt.pruneTree(0.0);

        System.out.println("\nMaximal tree: \n");
        dt.printDecisionTree(true);
        System.out.println("\nPruned subtree (cost-complexity = 0): \n");
        dt.printDecisionTree(false);
    }
}
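Although the example only prints the trees, the fitted model could also be used to classify new observations. The fragment below is a minimal sketch, not part of the original example: it assumes the predict(double[][]) method that QUEST inherits from com.imsl.datamining.PredictiveModel, and the two test rows are hypothetical cases laid out like the training rows. It would be appended at the end of main:

        // Hypothetical new cases arranged like the training rows;
        // the trailing response value is a placeholder here and is
        // assumed not to affect the predictions.
        double[][] newCases = {
            {1, 30.1, 0, 0},
            {2, 47.5, 1, 0}
        };
        double[] yHat = dt.predict(newCases);
        for (int i = 0; i < yHat.length; i++) {
            System.out.println("Case " + i + " predicted class: "
                    + Math.round(yHat[i]));
        }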
Output

Growing the maximal tree using method QUEST:

Maximal tree:

Decision Tree:

Node 0: Cost = 0.620, N= 50, Level = 0, Child nodes: 1 2
P(Y=0)= 0.180
P(Y=1)= 0.180
P(Y=2)= 0.260
P(Y=3)= 0.380
Predicted Y: 3

   Node 1: Cost = 0.220, N= 17, Level = 1
   Rule: X1 <= 35.031
   P(Y=0)= 0.294
   P(Y=1)= 0.118
   P(Y=2)= 0.353
   P(Y=3)= 0.235
   Predicted Y: 2

   Node 2: Cost = 0.360, N= 33, Level = 1, Child nodes: 3 4
   Rule: X1 > 35.031
   P(Y=0)= 0.121
   P(Y=1)= 0.212
   P(Y=2)= 0.212
   P(Y=3)= 0.455
   Predicted Y: 3

      Node 3: Cost = 0.180, N= 19, Level = 2
      Rule: X1 <= 43.265
      P(Y=0)= 0.211
      P(Y=1)= 0.211
      P(Y=2)= 0.053
      P(Y=3)= 0.526
      Predicted Y: 3

      Node 4: Cost = 0.160, N= 14, Level = 2
      Rule: X1 > 43.265
      P(Y=0)= 0.000
      P(Y=1)= 0.214
      P(Y=2)= 0.429
      P(Y=3)= 0.357
      Predicted Y: 2
Pruned subtree (cost-complexity = 0):

Decision Tree:

Node 0: Cost = 0.620, N= 50, Level = 0, Child nodes: 1 2
P(Y=0)= 0.180
P(Y=1)= 0.180
P(Y=2)= 0.260
P(Y=3)= 0.380
Predicted Y: 3

   Node 1: Cost = 0.220, N= 17, Level = 1
   Rule: X1 <= 35.031
   P(Y=0)= 0.294
   P(Y=1)= 0.118
   P(Y=2)= 0.353
   P(Y=3)= 0.235
   Predicted Y: 2

   Node 2: Cost = 0.360, N= 33, Level = 1
   Rule: X1 > 35.031
   P(Y=0)= 0.121
   P(Y=1)= 0.212
   P(Y=2)= 0.212
   P(Y=3)= 0.455
   Predicted Y: 3

Pruned at Node id 2.
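A note on the reported values: each node's Cost is its contribution to the tree's overall misclassification rate, that is, the fraction of the total sample falling in the node times the probability of misclassifying those cases, assuming equal misclassification costs. For node 2, for example, cost = (33/50) x (1 - 0.455) = 0.360, and for the root node, cost = 1 - 0.380 = 0.620, matching the printed values. The "Pruned at Node id 2" message indicates that pruning replaced the subtree rooted at node 2 (removing child nodes 3 and 4) with a single terminal node.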