This example applies the C45 method to simulated categorical data and demonstrates printing the fitted tree structure twice: once with automatically generated labels and once with user-supplied custom labels.
import com.imsl.datamining.decisionTree.*;

public class DecisionTreeEx2 {

    public static void main(String[] args) throws Exception {
        // Simulated data: two categorical predictors (columns 0-1)
        // and a categorical response variable (column 2).
        double[][] xy = {
            {2, 0, 2}, {1, 0, 0}, {2, 1, 3}, {0, 1, 0}, {1, 2, 0},
            {2, 2, 3}, {2, 2, 3}, {0, 1, 0}, {0, 0, 0}, {0, 1, 0},
            {1, 2, 0}, {2, 0, 2}, {0, 2, 0}, {2, 0, 1}, {0, 0, 0},
            {2, 0, 1}, {1, 0, 0}, {0, 2, 0}, {2, 0, 1}, {1, 2, 0},
            {0, 2, 2}, {2, 1, 3}, {1, 1, 0}, {2, 2, 3}, {1, 2, 0},
            {2, 2, 3}, {2, 0, 1}, {2, 1, 3}, {1, 2, 0}, {1, 1, 0}
        };

        DecisionTree.VariableType[] varType = {
            DecisionTree.VariableType.CATEGORICAL,
            DecisionTree.VariableType.CATEGORICAL,
            DecisionTree.VariableType.CATEGORICAL
        };

        // Labels used for the custom printout.
        String responseName = "Response";
        String[] names = {"Var1", "Var2"};
        String[] classNames = {"c1", "c2", "c3", "c4"};
        String[] varLabels = {"L1", "L2", "L3", "A", "B", "C"};

        // Fit a C45 tree with the response in column 2.
        C45 dt = new C45(xy, 2, varType);
        dt.setMinObsPerChildNode(5);
        dt.setMinObsPerNode(10);
        dt.setMaxNodes(50);
        dt.fitModel();

        System.out.println("\nGenerated labels:");
        dt.printDecisionTree(true);

        System.out.println("\nCustom labels:");
        dt.printDecisionTree(responseName, names, classNames,
                varLabels, false);
    }
}
Output

Generated labels:
Decision Tree:

Node 0: Cost = 0.467, N= 30, Level = 0, Child nodes: 1 2 3
P(Y=0)= 0.533
P(Y=1)= 0.133
P(Y=2)= 0.100
P(Y=3)= 0.233
Predicted Y: 0

Node 1: Cost = 0.033, N= 8, Level = 1
Rule: X0 in: { 0 }
P(Y=0)= 0.875
P(Y=1)= 0.000
P(Y=2)= 0.125
P(Y=3)= 0.000
Predicted Y: 0

Node 2: Cost = 0.000, N= 9, Level = 1
Rule: X0 in: { 1 }
P(Y=0)= 1.000
P(Y=1)= 0.000
P(Y=2)= 0.000
P(Y=3)= 0.000
Predicted Y: 0

Node 3: Cost = 0.200, N= 13, Level = 1
Rule: X0 in: { 2 }
P(Y=0)= 0.000
P(Y=1)= 0.308
P(Y=2)= 0.154
P(Y=3)= 0.538
Predicted Y: 3

Custom labels:
Decision Tree:

Node 0: Cost = 0.467, N= 30, Level = 0, Child nodes: 1 2 3
P(Y=0)= 0.533
P(Y=1)= 0.133
P(Y=2)= 0.100
P(Y=3)= 0.233
Predicted Response: c1

Node 1: Cost = 0.033, N= 8, Level = 1
Rule: Var1 in: { L1 }
P(Y=0)= 0.875
P(Y=1)= 0.000
P(Y=2)= 0.125
P(Y=3)= 0.000
Predicted Response: c1

Node 2: Cost = 0.000, N= 9, Level = 1
Rule: Var1 in: { L2 }
P(Y=0)= 1.000
P(Y=1)= 0.000
P(Y=2)= 0.000
P(Y=3)= 0.000
Predicted Response: c1

Node 3: Cost = 0.200, N= 13, Level = 1
Rule: Var1 in: { L3 }
P(Y=0)= 0.000
P(Y=1)= 0.308
P(Y=2)= 0.154
P(Y=3)= 0.538
Predicted Response: c4