unsupervisedNominalFilter¶
Converts nominal data into a series of binary encoded columns for input to a neural network. Optionally, it can also reverse the binary encoding, accepting a series of binary encoded columns and returning a single column of nominal classes.
Synopsis¶
unsupervisedNominalFilter (nClasses, x)
Required Arguments¶
- int
nClasses(Input/Output) - The number of classes in
x[].nClassesis output forencodeand input fordecode. - int
x[](Input) - A one or two-dimensional array depending upon whether encoding or
decoding is requested. If encoding is requested,
xis an array of lengthnPatternscontaining the categories for a nominal variable numbered from 1 tonClasses. If decoding is requested, thenxis an array of sizenPatternsbynClasses. In this case, the columns contain only zeros and ones that are interpreted as binary encoded representations for a single nominal variable.
Return Value¶
An array, z[]. The values in z are either the encoded or decoded
values for x, depending upon whether encode or decode is
requested. If errors are encountered, None is returned.
Optional Arguments¶
encode, (Input)Specifies binary encoding. Classes must be numbered sequentially from 1 to
nClasses. Optional Argumentsencodeanddecodeare mutually exclusive.Default:
encode.
or
decode, (Input)Specifies that
xwill be decoded. The values in each column should be zeros and ones. The values in the i-th column ofxare associated with the i-th class of the nominal variable. Optional Argumentsencodeanddecodeare mutually exclusive.Default:
encode.
Description¶
The function unsupervisedNominalFilter is designed to either encode or
decode nominal variables using a simple binary mapping.
Binary Encoding: encode¶
In this case, x[] is an input array to which a binary filter is applied.
Binary encoding takes each category in x[], and creates a column in
z[], the output matrix, containing all zeros and ones. A value of zero
indicates that this category is not present and a value of one indicates
that it is present.
For example, if x[]={2,1,3,4,2,4} then nClasses=4, and
Notice that the number of columns in z is equal to the number of
distinct classes in x. The number of rows in z is equal to the
length of x.
Binary Decoding: decode¶
Binary decoding takes each column in x[], and returns the appropriate
class in z[].
For example, if x[] is the same as described above:
then z[] would be returned as z[]={2, 1, 3, 4, 2, 4}. Notice this
is the same as the original array because classes are numbered sequentially
from 1 to nClasses. This ensures that the i-th column of x[] is
associated with the i-th class in the output array.
Example¶
This example illustrates nominal binary encoding and decoding for \(x=\{3,3,1,2,2,1,2\}\).
from __future__ import print_function
from numpy import *
from pyimsl.stat.unsupervisedNominalFilter import unsupervisedNominalFilter
from pyimsl.stat.writeMatrix import writeMatrix
x = array([3, 3, 1, 2, 2, 1, 2])
# Binary filtering
nClasses = [] # Output for encode, input for decode
z = unsupervisedNominalFilter(nClasses, x, encode=True)
print("nClasses = ", nClasses[0])
writeMatrix("x", x, writeFormat="%5i", column=True)
writeMatrix("z", z, writeFormat="%5i")
# Binary Unfiltering.
nClasses = nClasses[0] # Output for encode, input for decode
x2 = unsupervisedNominalFilter(nClasses, z, decode=True)
writeMatrix("Unfiltering result", x2, writeFormat="%5i", column=True)
Output¶
nClasses = 3
x
1 3
2 3
3 1
4 2
5 2
6 1
7 2
z
1 2 3
1 0 0 1
2 0 0 1
3 1 0 0
4 0 1 0
5 0 1 0
6 1 0 0
7 0 1 0
Unfiltering result
1 3
2 3
3 1
4 2
5 2
6 1
7 2