nbClassifierWrite

Writes a Naive Bayes Classifier to an ASCII file for later retrieval using nbClassifierRead.

Synopsis

nbClassifierWrite (nbClassifier, filename)

Required Arguments

Imsls_d_nb_classifier nbClassifier (Input)
A trained Naive Bayes Classifier.
char filename (Input)
The name of an ASCII file to be created. A full or relative path can be given. If this file exists, it is replaced with the Naive Bayes Classifier. If it does not exist, it is created. If the optional argument file is used, filename is ignored.

Optional Arguments

t_print (Input)

Prints status of file opening, writing and closing.

Default: No printing.

file, FILE (Input/Output)
A FILE pointer to a file opened for writing. The classifier is written to this file, but the file is not closed. If this option is provided, filename is ignored. This option allows users to write additional data and multiple classifiers to the same file (see Example 2). To ensure the file is opened and closed with the same run-time library used by the product, open and close this file using fopen and fclose.

Description

This function stores an Imsls_d_nb_classifier data structure containing a trained Naive Bayes Classifier in an ASCII file. If the optional argument file is provided, nbClassifierWrite writes the data structure to that open file and returns without closing it. If this argument is not provided, nbClassifierWrite creates a file using the path and name provided in filename, writes the data structure to that file, and then closes the file before returning.
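
The two calling modes described above can be pictured with a small pure-Python sketch. It is only an illustration of the filename/file convention, not the library's implementation: write_record is a hypothetical helper, plain text strings stand in for classifiers, and Python's built-in open is used instead of the product's fopen and fclose.

```python
import os
import tempfile


def write_record(text, filename, file=None):
    """Hypothetical stand-in for the filename/file calling convention."""
    if file is not None:
        # Caller owns the handle: write the record and leave the file open.
        file.write(text + "\n")
        return
    # No handle supplied: create (or replace) the file, write, then close.
    with open(filename, "w") as handle:
        handle.write(text + "\n")


workdir = tempfile.mkdtemp()

# Mode 1: filename only. The file is created, written and closed.
single_path = os.path.join(workdir, "single.txt")
write_record("classifier A", single_path)

# Mode 2: caller-supplied handle. filename is ignored, so None is passed,
# and several records can share one file.
shared_path = os.path.join(workdir, "shared.txt")
shared = open(shared_path, "w")
write_record("classifier A", None, file=shared)
write_record("classifier B", None, file=shared)
still_open = not shared.closed  # the helper did not close the handle
shared.close()

with open(single_path) as handle:
    single_lines = handle.read().splitlines()
with open(shared_path) as handle:
    shared_lines = handle.read().splitlines()
```

In the second mode the caller is responsible for closing the file, which is why Example 2 ends with an explicit fclose.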

Examples

Example 1

This example trains a classifier using Fisher’s Iris data. These data consist of 150 patterns. The input attributes consist of four continuous attributes and one classification attribute with three classes. The classifier is stored into four lines of an ASCII file named NB_Classifier_Ex1.txt.

from __future__ import print_function
from numpy import empty, double  # built-in int serves as the integer dtype
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.naiveBayesTrainer import naiveBayesTrainer
from pyimsl.stat.nbClassifierWrite import nbClassifierWrite

filename = "NB_Classifier_Ex1.txt"
n_patterns = 150   # 150 training patterns
n_continuous = 4   # four continuous input attributes
n_classes = 3   # three classification categories
dashes = "------------------------------------------------------"

classification = empty([150], dtype=int)
continuous = empty([150, 4], dtype=double)
classLabel = ["Setosa     ", "Versicolour", "Virginica  "]

irisData = dataSets(3)

# setup the required input arrays from the data matrix
for i in range(0, n_patterns):
    classification[i] = int(irisData[i][0] - 1)
    for j in range(1, n_continuous + 1):
        continuous[i][j - 1] = irisData[i][j]

nb_classifier = []

classErrors = naiveBayesTrainer(n_classes, classification,
                                continuous=continuous,
                                nbClassifier=nb_classifier)

print("     Iris Classification Error Rates")
print("----------------------------------------------")
print("   Setosa  Versicolour  Virginica   |   TOTAL")
print("    %d/%d      %d/%d         %d/%d     |   %d/%d\n"
      % (classErrors[0][0], classErrors[0][1],
         classErrors[1][0], classErrors[1][1],
         classErrors[2][0], classErrors[2][1],
         classErrors[3][0], classErrors[3][1]))
print("----------------------------------------------\n")

nbClassifierWrite(nb_classifier, filename)

Output

     Iris Classification Error Rates
----------------------------------------------
   Setosa  Versicolour  Virginica   |   TOTAL
    0/50      3/50         3/50     |   6/150

----------------------------------------------
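
The nested loop that fills classification and continuous in Example 1 can also be written with NumPy slicing. The sketch below assumes the data matrix is (or has been converted to) a 2-D NumPy array with the layout the loop implies: class in column 0, measurements in columns 1 to 4. The three-row iris_like array is a hypothetical stand-in for the matrix returned by dataSets(3).

```python
import numpy as np

# Hypothetical stand-in for the matrix returned by dataSets(3):
# column 0 holds the class (1, 2 or 3), columns 1-4 hold the measurements.
iris_like = np.array([
    [1.0, 5.1, 3.5, 1.4, 0.2],
    [2.0, 7.0, 3.2, 4.7, 1.4],
    [3.0, 6.3, 3.3, 6.0, 2.5],
])

n_continuous = 4

# Class labels shifted to start at 0, as in classification[i] = int(x - 1).
classification = (iris_like[:, 0] - 1).astype(int)

# Columns 1..n_continuous copied into a contiguous double array.
continuous = np.ascontiguousarray(iris_like[:, 1:n_continuous + 1],
                                  dtype=np.double)
```

Slicing with a smaller upper bound, for example iris_like[:, 1:3], yields the two-attribute input used for the second classifier in Example 2.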

Example 2

This example illustrates the use of the optional argument file to store multiple classifiers in one file. Two Naive Bayes classifiers are trained using Fisher’s Iris data. These data consist of 150 patterns. The input attributes consist of four continuous attributes and one classification attribute. The first classifier is trained using all four continuous inputs and the second using only the first two. The classifiers are stored into 10 lines of an ASCII file named NB_Classifier_Ex2.txt.

from __future__ import print_function
from numpy import empty, double, zeros  # built-in int serves as the integer dtype
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.fclose import fclose
from pyimsl.stat.fopen import fopen
from pyimsl.stat.free import free
from pyimsl.stat.naiveBayesTrainer import naiveBayesTrainer
from pyimsl.stat.nbClassifierFree import nbClassifierFree
from pyimsl.stat.nbClassifierWrite import nbClassifierWrite

filename = "NB_Classifier_Ex2.txt"

n_patterns = 150   # 150 training patterns
n_cont4 = 4     # four continuous input attributes
n_cont2 = 2     # two continuous input attributes
n_classes = 3     # three classification categories
n_classifiers = 2     # two classifiers in this example
classification = empty([150], dtype=int)
cont4 = zeros([150, 4], dtype=double)
cont2 = zeros([150, 2], dtype=double)
classLabel = ["Setosa     ", "Versicolour", "Virginica  "]

irisData = dataSets(3)

# setup the required input arrays from the data matrix
for i in range(0, n_patterns):
    classification[i] = int(irisData[i][0] - 1)
    for j in range(1, n_cont4 + 1):
        cont4[i][j - 1] = irisData[i][j]
        if (j < 3):
            cont2[i][j - 1] = irisData[i][j]

print("Opening file ", filename)
file = fopen(filename, "w")
nb_classifier = []
classErrors = naiveBayesTrainer(n_classes, classification,
                                continuous=cont4,
                                nbClassifier=nb_classifier)

print("     Iris Classification Error Rates")
print("----------------------------------------------")
print("   Setosa  Versicolour  Virginica   |   TOTAL")
print("    %d/%d      %d/%d         %d/%d     |   %d/%d\n"
      % (classErrors[0][0], classErrors[0][1],
         classErrors[1][0], classErrors[1][1],
         classErrors[2][0], classErrors[2][1],
         classErrors[3][0], classErrors[3][1]))
print("----------------------------------------------\n")

# write first classifier
nbClassifierWrite(nb_classifier, None,
                  file=file)

nbClassifierFree(nb_classifier)

classErrors = naiveBayesTrainer(n_classes, classification,
                                continuous=cont2,
                                nbClassifier=nb_classifier)
print("     Iris Classification Error Rates")
print("----------------------------------------------")
print("   Setosa  Versicolour  Virginica   |   TOTAL")
print("    %d/%d      %d/%d         %d/%d     |   %d/%d\n"
      % (classErrors[0][0], classErrors[0][1],
         classErrors[1][0], classErrors[1][1],
         classErrors[2][0], classErrors[2][1],
         classErrors[3][0], classErrors[3][1]))
print("----------------------------------------------\n")

nbClassifierWrite(nb_classifier, None,
                  file=file)

nbClassifierFree(nb_classifier)
print("Closing Classifier File")
fclose(file)

Output

Opening file  NB_Classifier_Ex2.txt
     Iris Classification Error Rates
----------------------------------------------
   Setosa  Versicolour  Virginica   |   TOTAL
    0/50      3/50         3/50     |   6/150

----------------------------------------------

     Iris Classification Error Rates
----------------------------------------------
   Setosa  Versicolour  Virginica   |   TOTAL
    1/50      13/50         19/50     |   33/150

----------------------------------------------

Closing Classifier File

Fatal Errors

IMSLS_FILE_OPEN_FAILURE Unable to open file for writing network.