This example applies the QUEST method to a simulated data set with 50 cases and three predictors of mixed type. Shown are the maximal tree grown under default controls and the subtree that results from pruning with a cost-complexity value of 0.0.
import com.imsl.datamining.decisionTree.*;

public class DecisionTreeEx4 {

    public static void main(String[] args) throws Exception {
        // Simulated training data: columns 0-2 are the predictors
        // (categorical, quantitative continuous, categorical) and
        // column 3 is the categorical response with classes 0-3.
        double[][] xy = {
            {2, 25.928690, 0, 0},
            {1, 51.632450, 1, 1},
            {1, 25.784321, 0, 2},
            {0, 39.379478, 0, 3},
            {2, 24.650579, 0, 2},
            {2, 45.200840, 0, 2},
            {2, 52.679600, 1, 3},
            {1, 44.283421, 1, 3},
            {2, 40.635231, 1, 3},
            {2, 51.760941, 0, 3},
            {2, 26.303680, 0, 1},
            {2, 20.702299, 1, 0},
            {2, 38.742729, 1, 3},
            {2, 19.473330, 0, 0},
            {1, 26.422110, 0, 0},
            {2, 37.059860, 1, 0},
            {1, 51.670429, 1, 3},
            {0, 42.401562, 0, 3},
            {2, 33.900269, 1, 2},
            {1, 35.432819, 0, 0},
            {1, 44.303692, 0, 1},
            {0, 46.723869, 0, 2},
            {1, 46.992619, 0, 2},
            {0, 36.059231, 0, 3},
            {2, 36.831970, 1, 1},
            {1, 61.662571, 1, 2},
            {0, 25.677139, 0, 3},
            {1, 39.085670, 1, 0},
            {0, 48.843410, 1, 1},
            {1, 39.343910, 0, 3},
            {2, 24.735220, 0, 2},
            {1, 50.552509, 1, 3},
            {0, 31.342630, 1, 3},
            {1, 27.157949, 1, 0},
            {0, 31.726851, 0, 2},
            {0, 25.004080, 0, 3},
            {1, 26.354570, 1, 3},
            {2, 38.123428, 0, 1},
            {0, 49.940300, 0, 2},
            {1, 42.457790, 1, 3},
            {0, 38.809479, 1, 1},
            {0, 43.227989, 1, 1},
            {0, 41.876240, 0, 3},
            {2, 48.078201, 0, 2},
            {0, 43.236729, 1, 0},
            {2, 39.412941, 0, 3},
            {1, 23.933460, 0, 2},
            {2, 42.841301, 1, 3},
            {2, 30.406691, 0, 1},
            {0, 37.773891, 0, 2}
        };

        // Variable types for the three predictors and the response.
        DecisionTree.VariableType[] varType = {
            DecisionTree.VariableType.CATEGORICAL,
            DecisionTree.VariableType.QUANTITATIVE_CONTINUOUS,
            DecisionTree.VariableType.CATEGORICAL,
            DecisionTree.VariableType.CATEGORICAL
        };

        // The response variable is in column 3.
        QUEST dt = new QUEST(xy, 3, varType);
        dt.setPrintLevel(1);

        // Grow the maximal tree, then prune with a
        // cost-complexity value of 0.0.
        dt.fitModel();
        dt.pruneTree(0.0);

        System.out.println("\nMaximal tree: \n");
        dt.printDecisionTree(true);
        System.out.println("\nPruned subtree (cost-complexity = 0): \n");
        dt.printDecisionTree(false);
    }
}
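Although the example only prints the trees, the fitted model could also be used to classify new observations. The fragment below is a minimal sketch, not part of the original example: it assumes the predict(double[][]) method that QUEST inherits from com.imsl.datamining.PredictiveModel, and the two test rows are hypothetical cases laid out like the training rows. It would be appended at the end of main:

        // Hypothetical new cases arranged like the training rows;
        // the trailing response value is a placeholder here and is
        // assumed not to affect the predictions.
        double[][] newCases = {
            {1, 30.1, 0, 0},
            {2, 47.5, 1, 0}
        };
        double[] yHat = dt.predict(newCases);
        for (int i = 0; i < yHat.length; i++) {
            System.out.println("Case " + i + " predicted class: "
                    + Math.round(yHat[i]));
        }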
Output

Growing the maximal tree using method QUEST:

Maximal tree:

Decision Tree:

Node 0: Cost = 0.620, N= 50, Level = 0, Child nodes: 1 2
P(Y=0)= 0.180
P(Y=1)= 0.180
P(Y=2)= 0.260
P(Y=3)= 0.380
Predicted Y: 3

   Node 1: Cost = 0.220, N= 17, Level = 1
   Rule: X1 <= 35.031
   P(Y=0)= 0.294
   P(Y=1)= 0.118
   P(Y=2)= 0.353
   P(Y=3)= 0.235
   Predicted Y: 2

   Node 2: Cost = 0.360, N= 33, Level = 1, Child nodes: 3 4
   Rule: X1 > 35.031
   P(Y=0)= 0.121
   P(Y=1)= 0.212
   P(Y=2)= 0.212
   P(Y=3)= 0.455
   Predicted Y: 3

      Node 3: Cost = 0.180, N= 19, Level = 2
      Rule: X1 <= 43.265
      P(Y=0)= 0.211
      P(Y=1)= 0.211
      P(Y=2)= 0.053
      P(Y=3)= 0.526
      Predicted Y: 3

      Node 4: Cost = 0.160, N= 14, Level = 2
      Rule: X1 > 43.265
      P(Y=0)= 0.000
      P(Y=1)= 0.214
      P(Y=2)= 0.429
      P(Y=3)= 0.357
      Predicted Y: 2
Pruned subtree (cost-complexity = 0):

Decision Tree:

Node 0: Cost = 0.620, N= 50, Level = 0, Child nodes: 1 2
P(Y=0)= 0.180
P(Y=1)= 0.180
P(Y=2)= 0.260
P(Y=3)= 0.380
Predicted Y: 3

   Node 1: Cost = 0.220, N= 17, Level = 1
   Rule: X1 <= 35.031
   P(Y=0)= 0.294
   P(Y=1)= 0.118
   P(Y=2)= 0.353
   P(Y=3)= 0.235
   Predicted Y: 2

   Node 2: Cost = 0.360, N= 33, Level = 1
   Rule: X1 > 35.031
   P(Y=0)= 0.121
   P(Y=1)= 0.212
   P(Y=2)= 0.212
   P(Y=3)= 0.455
   Predicted Y: 3

Pruned at Node id 2.
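A note on the reported values: each node's Cost is its contribution to the tree's overall misclassification rate, that is, the fraction of the total sample falling in the node times the probability of misclassifying those cases, assuming equal misclassification costs. For node 2, for example, cost = (33/50) x (1 - 0.455) = 0.360, and for the root node, cost = 1 - 0.380 = 0.620, matching the printed values. The "Pruned at Node id 2" message indicates that pruning replaced the subtree rooted at node 2 (removing child nodes 3 and 4) with a single terminal node.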