decision_tree_print

Prints a decision tree.

Synopsis

#include <imsls.h>

void imsls_f_decision_tree_print (Imsls_f_decision_tree *tree, ..., 0)

The type double function is imsls_d_decision_tree_print.

Required Arguments

Imsls_f_decision_tree *tree (Input)
An estimated decision tree.

Synopsis with Optional Arguments

#include <imsls.h>

void imsls_f_decision_tree_print (Imsls_f_decision_tree *tree,
IMSLS_RESP_NAME, char *response_name[],
IMSLS_VAR_NAMES, char *var_names[],
IMSLS_CLASS_NAMES, char *class_names[],
IMSLS_CATEG_NAMES, char *categ_names[],
IMSLS_PRINT_MAX,
0)

Optional Arguments

IMSLS_RESP_NAME, char *response_name[] (Input)
An array of length 1 containing a pointer to a character string representing the name of the response variable.
Default: response_name[0] = "Y".

IMSLS_VAR_NAMES, char *var_names[] (Input)
An array of length tree->npreds containing pointers to character strings representing the names of the predictors.
Default: var_names[0] = "X0", var_names[1] = "X1", etc.

IMSLS_CLASS_NAMES, char *class_names[] (Input)
An array of length tree->nclasses containing pointers to character strings representing the names of the different classes in Y, assuming Y is of categorical type.
Default: class_names[0] = "0", class_names[1] = "1", etc.

IMSLS_CATEG_NAMES, char *categ_names[] (Input)
An array of length tree->pred_nvalues[0] + tree->pred_nvalues[1] + … + tree->pred_nvalues[tree->npreds-1] containing pointers to character strings representing the names of the different category levels for each predictor of categorical type.
Default: categ_names[0] = "0", categ_names[1] = "1", etc.

IMSLS_PRINT_MAX, (Input)
If present, the maximal (unpruned) tree is printed, ignoring any pruning information.
Default: Pruning information is taken into account.

Description

Function imsls_f_decision_tree_print provides a convenient way to quickly see the structure of the tree. More elaborate visualization methods or summaries can be written for the decision tree structure described in Structure Definitions for function decision_tree, and illustrated in Figure 1, "Play Golf?", in the Overview section; that tree has a size of 6, 4 terminal nodes, and a height or depth of 2.

Comments

1. The nodes are labeled as the tree was grown. In other words, the first child of the root node is labeled Node 1, the first child node of Node 1 is labeled Node 2, and so on, until the branch stops growing. The numbering continues with the most recent split one level up.

2. If the tree has fewer than five levels, each new level is indented. Otherwise, there is no indentation.

Example

This example operates on simulated categorical data.

 

#include <imsls.h>
#include <stdio.h>

int main()
{
    float xy[30*3] = {
        2, 0, 2,
        1, 0, 0,
        2, 1, 3,
        0, 1, 0,
        1, 2, 0,
        2, 2, 3,
        2, 2, 3,
        0, 1, 0,
        0, 0, 0,
        0, 1, 0,
        1, 2, 0,
        2, 0, 2,
        0, 2, 0,
        2, 0, 1,
        0, 0, 0,
        2, 0, 1,
        1, 0, 0,
        0, 2, 0,
        2, 0, 1,
        1, 2, 0,
        0, 2, 2,
        2, 1, 3,
        1, 1, 0,
        2, 2, 3,
        1, 2, 0,
        2, 2, 3,
        2, 0, 1,
        2, 1, 3,
        1, 2, 0,
        1, 1, 0
    };

    int n = 30;
    int ncols = 3;
    int response_col_idx = 2;
    int var_type[] = {0, 0, 0};
    int control[] = {5, 10, 10, 50, 10};

    const char *names[] = {"Var1", "Var2"};
    const char *class_names[] = {"c1", "c2", "c3", "c4"};
    const char *response_name = "Response";
    const char *var_levels[] = {"L1", "L2", "L3", "A", "B", "C"};
    Imsls_f_decision_tree *tree = NULL;

    tree = imsls_f_decision_tree(n, ncols, xy, response_col_idx, var_type,
        IMSLS_CONTROL, control,
        0);

    printf("\nGenerated labels:\n");
    imsls_f_decision_tree_print(tree,
        IMSLS_PRINT_MAX,
        0);

    printf("\nCustom labels:\n");
    imsls_f_decision_tree_print(tree,
        IMSLS_RESP_NAME, &response_name,
        IMSLS_VAR_NAMES, names,
        IMSLS_CATEG_NAMES, var_levels,
        IMSLS_CLASS_NAMES, class_names,
        IMSLS_PRINT_MAX,
        0);

    imsls_f_decision_tree_free(tree);
}

Output

 

Generated labels:

Decision Tree:

Node 0: Cost = 0.467, N= 30, Level = 0, Child nodes: 1 2 3
    P(Y=0)= 0.533
    P(Y=1)= 0.133
    P(Y=2)= 0.100
    P(Y=3)= 0.233
    Predicted Y: 0
    Node 1: Cost = 0.033, N= 8, Level = 1
        Rule: X0 in: { 0 }
        P(Y=0)= 0.875
        P(Y=1)= 0.000
        P(Y=2)= 0.125
        P(Y=3)= 0.000
        Predicted Y: 0
    Node 2: Cost = 0.000, N= 9, Level = 1
        Rule: X0 in: { 1 }
        P(Y=0)= 1.000
        P(Y=1)= 0.000
        P(Y=2)= 0.000
        P(Y=3)= 0.000
        Predicted Y: 0
    Node 3: Cost = 0.200, N= 13, Level = 1
        Rule: X0 in: { 2 }
        P(Y=0)= 0.000
        P(Y=1)= 0.308
        P(Y=2)= 0.154
        P(Y=3)= 0.538
        Predicted Y: 3

Custom labels:

Decision Tree:

Node 0: Cost = 0.467, N= 30, Level = 0, Child nodes: 1 2 3
    P(Y=0)= 0.533
    P(Y=1)= 0.133
    P(Y=2)= 0.100
    P(Y=3)= 0.233
    Predicted Response: c1
    Node 1: Cost = 0.033, N= 8, Level = 1
        Rule: Var1 in: { L1 }
        P(Y=0)= 0.875
        P(Y=1)= 0.000
        P(Y=2)= 0.125
        P(Y=3)= 0.000
        Predicted Response: c1
    Node 2: Cost = 0.000, N= 9, Level = 1
        Rule: Var1 in: { L2 }
        P(Y=0)= 1.000
        P(Y=1)= 0.000
        P(Y=2)= 0.000
        P(Y=3)= 0.000
        Predicted Response: c1
    Node 3: Cost = 0.200, N= 13, Level = 1
        Rule: Var1 in: { L3 }
        P(Y=0)= 0.000
        P(Y=1)= 0.308
        P(Y=2)= 0.154
        P(Y=3)= 0.538
        Predicted Response: c4