nbClassifierRead

Retrieves a Naive Bayes Classifier previously saved using nbClassifierWrite.

Synopsis

nbClassifierRead (filename)

Required Arguments

char filename (Input)
The name of an ASCII file containing a Naive Bayes Classifier previously saved using nbClassifierWrite. A full or relative path can be given. If the optional argument file is used, filename is ignored.

Return Value

An Imsls_d_nb_classifier data structure containing a Naive Bayes Classifier previously stored using nbClassifierWrite.

Optional Arguments

t_print (Input)

Prints status of file opening, reading and closing.

Default: No printing.

file, FILE (Input)
A FILE pointer to a file opened for reading. The file is read but not closed. If this option is provided, filename is ignored. This argument lets users read additional user-defined data and multiple classifiers from the same file (see Example 2 below). To ensure the file is opened and closed with the same run-time library used by the product, open and close it with the product's own fopen and fclose functions (pyimsl.stat.fopen and pyimsl.stat.fclose, as imported in Example 2).

Description

Function nbClassifierRead reads a classifier from an ASCII file previously written by nbClassifierWrite and returns it as an Imsls_d_nb_classifier data structure. If the optional argument file is provided, the classifier is read from that open file and returned, and the file is left open. If file is not provided, nbClassifierRead opens the file named by filename, reads the classifier, closes the file, and returns the data structure.
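The difference between the two calling modes can be illustrated without the library. The following is a minimal sketch in plain Python, not PyIMSL code: read_record is a hypothetical helper, and it assumes (purely for illustration) that each stored record occupies one line of text. It only mimics the open/close contract described above. When file is given, the handle is read and left open, so consecutive calls return consecutive records. When file is omitted, the helper opens filename, reads one record, and closes the file again.

```python
import os
import tempfile


def read_record(filename, file=None):
    """Hypothetical stand-in for a reader with a filename/file contract.

    If `file` is given, `filename` is ignored and the handle is left open.
    Otherwise `filename` is opened, one record is read, and it is closed.
    """
    if file is not None:
        # The caller owns the handle: read one record, do not close it.
        return file.readline().rstrip("\n")
    with open(filename, "r") as handle:
        # This function owns the handle: it is closed when the block exits.
        return handle.readline().rstrip("\n")


# Write two one-line records to a temporary file (illustrative data only).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "records.txt")
with open(path, "w") as out:
    out.write("record-1\nrecord-2\n")

# Without `file`, every call reopens the file and sees the first record.
first_by_name = read_record(path)
again_by_name = read_record(path)

# With `file`, the read position advances between calls.
handle = open(path, "r")
first_by_handle = read_record(" ", file=handle)
second_by_handle = read_record(" ", file=handle)
still_open = not handle.closed
handle.close()  # closing remains the caller's responsibility
```

This is why Example 2 passes a placeholder string for filename and closes the file itself after both classifiers have been read.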

Examples

Example 1

This example reads a classifier previously trained using Fisher’s Iris data (see Example 2 of nbClassifierWrite). These data consist of 150 patterns, each with four continuous attributes and one dependent variable. The classifier is read from an ASCII file named NB_Classifier_Ex1.txt.

from __future__ import print_function
# the built-in int is used as the dtype below; the numpy.int alias was removed in NumPy 1.24
from numpy import empty, double, zeros
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.naiveBayesClassification import naiveBayesClassification
from pyimsl.stat.nbClassifierRead import nbClassifierRead

filename = "NB_Classifier_Ex1.txt"
n_patterns = 150   # 150 training patterns
n_continuous = 4   # four continuous input attributes
n_classes = 3   # three classification categories

classification = empty([150], dtype=int)
continuous = empty([150, 4], dtype=double)
classLabel = ["Setosa     ", "Versicolour", "Virginica  "]

irisData = dataSets(3)

# set up the required input arrays from the data matrix
for i in range(0, n_patterns):
    classification[i] = int(irisData[i][0] - 1)
    for j in range(1, n_continuous + 1):
        continuous[i][j - 1] = irisData[i][j]

nb_classifier = nbClassifierRead(filename, t_print=True)
predictedClass = naiveBayesClassification(
    nb_classifier, n_patterns, continuous=continuous)

classErrors = zeros([4, 2], dtype=int)
for i in range(0, n_patterns):
    if (classification[i] == 0):
        classErrors[0][1] += 1
        if (classification[i] != predictedClass[i]):
            classErrors[0][0] += 1
    elif (classification[i] == 1):
        classErrors[1][1] += 1
        if (classification[i] != predictedClass[i]):
            classErrors[1][0] += 1
    elif (classification[i] == 2):
        classErrors[2][1] += 1
        if (classification[i] != predictedClass[i]):
            classErrors[2][0] += 1

classErrors[3][0] = classErrors[0][0] + classErrors[1][0] + classErrors[2][0]
classErrors[3][1] = classErrors[0][1] + classErrors[1][1] + classErrors[2][1]

print("     Iris Classification Error Rates")
print("----------------------------------------------")
print("   Setosa  Versicolour  Virginica   |   TOTAL")
print("    %d/%d      %d/%d         %d/%d     |   %d/%d\n"
      % (classErrors[0][0], classErrors[0][1],
         classErrors[1][0], classErrors[1][1],
         classErrors[2][0], classErrors[2][1],
         classErrors[3][0], classErrors[3][1]))
print("----------------------------------------------\n")

Output

Attempting to open NB_Classifier_Ex1.txt
for reading Naive Bayes data structure
File NB_Classifier_Ex1.txt Successfully Opened
File NB_Classifier_Ex1.txt closed
   Iris Classification Error Rates
----------------------------------------------
   Setosa  Versicolour  Virginica   |   TOTAL
   0/50      3/50         3/50     |   6/150

----------------------------------------------
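The per-class counting loop in the example can be written more compactly. The sketch below uses a hypothetical helper, class_error_table, which is not part of PyIMSL. It assumes class labels are integers in the range 0 to n_classes - 1, as in the example, and returns the same layout as classErrors: one [errors, count] row per class followed by a totals row. The labels used here are made up solely to exercise the helper.

```python
def class_error_table(actual, predicted, n_classes):
    """Return [errors, count] rows per class plus a final totals row."""
    table = [[0, 0] for _ in range(n_classes + 1)]
    for truth, guess in zip(actual, predicted):
        table[truth][1] += 1          # one more pattern of this class
        if truth != guess:
            table[truth][0] += 1      # one more misclassified pattern
    # The last row holds the column sums over all classes.
    table[n_classes][0] = sum(row[0] for row in table[:n_classes])
    table[n_classes][1] = sum(row[1] for row in table[:n_classes])
    return table


# Tiny made-up labels, only to exercise the helper.
actual = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 2, 2, 1]
table = class_error_table(actual, predicted, 3)
```

With the example's arrays, the call would take the form class_error_table(classification, predictedClass, n_classes), and the rows could be passed to the same print statements.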

Example 2

This example illustrates the use of the optional argument file to read multiple classifiers stored previously into a single file using nbClassifierWrite (see Example 2 of nbClassifierWrite). Two Naive Bayes classifiers were trained using Fisher’s Iris data. These data consist of 150 patterns. The input attributes consist of four continuous attributes and one classification attribute with three classes. The first classifier was trained using all four inputs and the second using only the first two. The classifiers are read from an ASCII file named NB_Classifier_Ex2.txt.

from __future__ import print_function
# the built-in int is used as the dtype below; the numpy.int alias was removed in NumPy 1.24
from numpy import empty, double, zeros
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.fclose import fclose
from pyimsl.stat.fopen import fopen
from pyimsl.stat.naiveBayesClassification import naiveBayesClassification
from pyimsl.stat.nbClassifierRead import nbClassifierRead

filename = "NB_Classifier_Ex2.txt"
n_patterns = 150   # 150 training patterns
n_cont4 = 4     # four continuous input attributes
n_cont2 = 2     # two continuous input attributes
n_classes = 3     # three classification categories
n_classifiers = 2     # two classifiers in this example
classification = empty([150], dtype=int)
classErrors = zeros([4, 2], dtype=int)
cont4 = zeros([150, 4], dtype=double)
cont2 = zeros([150, 2], dtype=double)
classLabel = ["Setosa     ", "Versicolour", "Virginica  "]
irisData = dataSets(3)

# set up the required input arrays from the data matrix
for i in range(0, n_patterns):
    classification[i] = int(irisData[i][0] - 1)
    for j in range(1, n_cont4 + 1):
        cont4[i][j - 1] = irisData[i][j]
        if (j < 3):
            cont2[i][j - 1] = irisData[i][j]

print("Opening file %s\n" % (filename))
file = fopen(filename, "r")

nb_classifier4 = nbClassifierRead(" ", file=file)
predictedClass = naiveBayesClassification(
    nb_classifier4, n_patterns, continuous=cont4)
classErrors = zeros([4, 2], dtype=int)
for i in range(0, n_patterns):
    if (classification[i] == 0):
        classErrors[0][1] += 1
        if (classification[i] != predictedClass[i]):
            classErrors[0][0] += 1
    elif (classification[i] == 1):
        classErrors[1][1] += 1
        if (classification[i] != predictedClass[i]):
            classErrors[1][0] += 1
    elif (classification[i] == 2):
        classErrors[2][1] += 1
        if (classification[i] != predictedClass[i]):
            classErrors[2][0] += 1

classErrors[3][0] = classErrors[0][0] + classErrors[1][0] + classErrors[2][0]
classErrors[3][1] = classErrors[0][1] + classErrors[1][1] + classErrors[2][1]

print("     Iris Classification Error Rates")
print("----------------------------------------------")
print("   Setosa  Versicolour  Virginica   |   TOTAL")
print("    %d/%d      %d/%d         %d/%d     |   %d/%d"
      % (classErrors[0][0], classErrors[0][1],
         classErrors[1][0], classErrors[1][1],
         classErrors[2][0], classErrors[2][1],
         classErrors[3][0], classErrors[3][1]))
print("----------------------------------------------\n")

nb_classifier2 = nbClassifierRead(" ", file=file)
predictedClass = naiveBayesClassification(
    nb_classifier2, n_patterns, continuous=cont2)
classErrors = zeros([4, 2], dtype=int)
for i in range(0, n_patterns):
    if (classification[i] == 0):
        classErrors[0][1] += 1
        if (classification[i] != predictedClass[i]):
            classErrors[0][0] += 1
    elif (classification[i] == 1):
        classErrors[1][1] += 1
        if (classification[i] != predictedClass[i]):
            classErrors[1][0] += 1
    elif (classification[i] == 2):
        classErrors[2][1] += 1
        if (classification[i] != predictedClass[i]):
            classErrors[2][0] += 1

classErrors[3][0] = classErrors[0][0] + classErrors[1][0] + classErrors[2][0]
classErrors[3][1] = classErrors[0][1] + classErrors[1][1] + classErrors[2][1]

print("     Iris Classification Error Rates")
print("----------------------------------------------")
print("   Setosa  Versicolour  Virginica   |   TOTAL")
print("    %d/%d      %d/%d         %d/%d     |   %d/%d\n"
      % (classErrors[0][0], classErrors[0][1],
         classErrors[1][0], classErrors[1][1],
         classErrors[2][0], classErrors[2][1],
         classErrors[3][0], classErrors[3][1]))
print("----------------------------------------------\n")

print("Closing Classifier File.")
fclose(file)

Output

Opening file  NB_Classifier_Ex2.txt
   Iris Classification Error Rates
----------------------------------------------
   Setosa  Versicolour  Virginica   |   TOTAL
   0/50      3/50         3/50     |   6/150

----------------------------------------------

   Iris Classification Error Rates
----------------------------------------------
   Setosa  Versicolour  Virginica   |   TOTAL
   1/50      13/50         19/50     |   33/150

----------------------------------------------

Closing Classifier File.

Fatal Errors

IMSLS_FILE_OPEN_FAILURE Unable to open file for reading neural network.
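
Because a file that cannot be opened is reported as a fatal error, a caller may prefer to check the path before calling nbClassifierRead. The following is a minimal sketch using only the Python standard library. can_open_for_reading is a hypothetical helper, not a PyIMSL function, and the file names are placeholders created in a temporary directory.

```python
import os
import tempfile


def can_open_for_reading(path):
    """Return True if `path` names an existing file that opens for reading."""
    if not os.path.isfile(path):
        return False
    try:
        with open(path, "r"):
            return True
    except OSError:
        return False


# Create one placeholder file and leave a second name unused.
tmp_dir = tempfile.mkdtemp()
present = os.path.join(tmp_dir, "NB_Classifier_Ex1.txt")
with open(present, "w") as out:
    out.write("placeholder\n")
missing = os.path.join(tmp_dir, "no_such_file.txt")

present_ok = can_open_for_reading(present)
missing_ok = can_open_for_reading(missing)
```

A check of this kind only confirms that the file can be opened. It does not confirm that the contents were written by nbClassifierWrite.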