decisionTreePrint¶
Prints a decision tree.
Synopsis¶
decisionTreePrint (tree)
Required Arguments¶
- structure tree (Input)
- An estimated decision tree.
Optional Arguments¶
- respName, char (Input)
- An array of length 1 containing a character string representing the name of the response variable.
  Default: respName[0] = "Y".
- varNames, char[] (Input)
- An array of length tree.npreds containing character strings representing the names of the predictors.
  Default: varNames[0]="X0", varNames[1]="X1", etc.
- classNames, char[] (Input)
- An array of length tree.nclasses containing character strings representing the names of the different classes in Y, assuming Y is of categorical type.
  Default: classNames[0]="0", classNames[1]="1", etc.
- categNames, char[] (Input)
- An array of length tree.predNvalues[0] + tree.predNvalues[1] + ... + tree.predNvalues[tree.npreds-1] containing character strings representing the names of the different category levels for each predictor of categorical type.
  Default: categNames[0]="0", categNames[1]="1", etc.
- printMax (Input)
- If present, the maximal tree is printed despite any pruning information.
  Default: Accounts for pruning.
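The required length of categNames is the total number of category levels summed over all predictors. As a quick sketch, with stand-in values for tree.predNvalues (in the example below, both predictors take the three values 0, 1, and 2):

```python
# Stand-in for tree.predNvalues: the number of category levels
# per predictor (illustrative values, matching the example below).
predNvalues = [3, 3]

# categNames must supply one name per level, predictor by predictor,
# so its required length is the sum over all predictors.
nCategNames = sum(predNvalues)
print(nCategNames)  # 6, e.g. ["L1", "L2", "L3", "A", "B", "C"]
```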
Description¶
Function decisionTreePrint provides a convenient way to quickly see the structure of the tree. More elaborate visualization methods or summaries can be written for the decision tree structure described in Structure Definitions for function decisionTree, and in Figure 13.1 in the Overview section.
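As noted above, richer summaries can be built by walking the tree structure directly. The sketch below uses a plain-dict stand-in for the node data (the dict keys here are illustrative only, not the actual fields of the pyimsl tree structure) to show what such a custom summary might look like, mirroring the first tree in the example output:

```python
# A stand-in for the estimated tree: each node records its id, level,
# cost, predicted class, and children, mirroring the printed output
# in the Example section. These dict keys are illustrative only.
nodes = [
    {"id": 0, "level": 0, "cost": 0.467, "predicted": 0, "children": [1, 2, 3]},
    {"id": 1, "level": 1, "cost": 0.033, "predicted": 0, "children": []},
    {"id": 2, "level": 1, "cost": 0.000, "predicted": 0, "children": []},
    {"id": 3, "level": 1, "cost": 0.200, "predicted": 3, "children": []},
]

def summarize(nodes):
    """Return one formatted line per node, indented by tree level."""
    lines = []
    for node in nodes:
        indent = "  " * node["level"]
        tag = "split" if node["children"] else "leaf"
        lines.append("%sNode %d (%s): cost=%.3f, predicted=%d"
                     % (indent, node["id"], tag, node["cost"], node["predicted"]))
    return lines

for line in summarize(nodes):
    print(line)
```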
Comments¶
1. The nodes are labeled as the tree was grown. In other words, the first child of the root node is labeled Node 1, the first child node of Node 1 is labeled Node 2, and so on, until the branch stops growing. The numbering continues with the most recent split one level up.
2. If the tree has fewer than five levels, each new level is indented. Otherwise, there is no indentation.
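The growth-order numbering described in comment 1 is a depth-first, pre-order labeling, which can be sketched with a small recursive walk (the nested-list tree here is purely illustrative):

```python
# Each node is a list of its children. Labels are assigned in the
# order the tree was grown: a node, then its first child's whole
# subtree, then the next child, and so on (depth-first, pre-order).
def label_nodes(children, counter=None):
    if counter is None:
        counter = [0]
    label = counter[0]
    counter[0] += 1
    labels = [label]
    for child in children:
        labels.extend(label_nodes(child, counter))
    return labels

# Root with two children; the first child itself has one child.
tree = [[[]], []]
print(label_nodes(tree))  # [0, 1, 2, 3]: Node 2 is the grandchild,
                          # and Node 3 resumes one level up
```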
Example¶
This example operates on simulated categorical data.
from __future__ import print_function
from numpy import *
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.decisionTree import decisionTree
from pyimsl.stat.decisionTreePrint import decisionTreePrint
from pyimsl.stat.decisionTreeFree import decisionTreeFree
xy = [[2, 0, 2],
      [1, 0, 0],
      [2, 1, 3],
      [0, 1, 0],
      [1, 2, 0],
      [2, 2, 3],
      [2, 2, 3],
      [0, 1, 0],
      [0, 0, 0],
      [0, 1, 0],
      [1, 2, 0],
      [2, 0, 2],
      [0, 2, 0],
      [2, 0, 1],
      [0, 0, 0],
      [2, 0, 1],
      [1, 0, 0],
      [0, 2, 0],
      [2, 0, 1],
      [1, 2, 0],
      [0, 2, 2],
      [2, 1, 3],
      [1, 1, 0],
      [2, 2, 3],
      [1, 2, 0],
      [2, 2, 3],
      [2, 0, 1],
      [2, 1, 3],
      [1, 2, 0],
      [1, 1, 0]]
responseColIdx = 2
method = 1
varType = [0, 0, 0]
control = [5, 10, 10, 50, 10]
names = ["Var1", "Var2"]
classNames = ["c1", "c2", "c3", "c4"]
responseName = ["Response"]
varLevels = ["L1", "L2", "L3", "A", "B", "C"]
tree = decisionTree(xy, responseColIdx, method, varType,
                    control=control)

print("Generated labels:")
decisionTreePrint(tree, printMax=True)

print("\nCustom labels:")
decisionTreePrint(tree, printMax=True,
                  varNames=names, classNames=classNames,
                  categNames=varLevels, respName=responseName)
decisionTreeFree(tree)
Output¶
Generated labels:

Decision Tree:

Node 0: Cost = 0.467, N= 30, Level = 0, Child nodes: 1 2 3
P(Y=0)= 0.533
P(Y=1)= 0.133
P(Y=2)= 0.100
P(Y=3)= 0.233
Predicted Y: 0
   Node 1: Cost = 0.033, N= 8, Level = 1
   Rule: X0 in: { 0 }
   P(Y=0)= 0.875
   P(Y=1)= 0.000
   P(Y=2)= 0.125
   P(Y=3)= 0.000
   Predicted Y: 0
   Node 2: Cost = 0.000, N= 9, Level = 1
   Rule: X0 in: { 1 }
   P(Y=0)= 1.000
   P(Y=1)= 0.000
   P(Y=2)= 0.000
   P(Y=3)= 0.000
   Predicted Y: 0
   Node 3: Cost = 0.200, N= 13, Level = 1
   Rule: X0 in: { 2 }
   P(Y=0)= 0.000
   P(Y=1)= 0.308
   P(Y=2)= 0.154
   P(Y=3)= 0.538
   Predicted Y: 3

Custom labels:

Decision Tree:

Node 0: Cost = 0.467, N= 30, Level = 0, Child nodes: 1 2 3
P(Y=0)= 0.533
P(Y=1)= 0.133
P(Y=2)= 0.100
P(Y=3)= 0.233
Predicted Response: c1
   Node 1: Cost = 0.033, N= 8, Level = 1
   Rule: Var1 in: { L1 }
   P(Y=0)= 0.875
   P(Y=1)= 0.000
   P(Y=2)= 0.125
   P(Y=3)= 0.000
   Predicted Response: c1
   Node 2: Cost = 0.000, N= 9, Level = 1
   Rule: Var1 in: { L2 }
   P(Y=0)= 1.000
   P(Y=1)= 0.000
   P(Y=2)= 0.000
   P(Y=3)= 0.000
   Predicted Response: c1
   Node 3: Cost = 0.200, N= 13, Level = 1
   Rule: Var1 in: { L3 }
   P(Y=0)= 0.000
   P(Y=1)= 0.308
   P(Y=2)= 0.154
   P(Y=3)= 0.538
   Predicted Response: c4