unsupervisedNominalFilter¶
Converts nominal data into a series of binary encoded columns for input to a neural network. Optionally, it can also reverse the binary encoding, accepting a series of binary encoded columns and returning a single column of nominal classes.
Synopsis¶
unsupervisedNominalFilter (nClasses, x)
Required Arguments¶
- int
nClasses
(Input/Output) - The number of classes in
x[]
.nClasses
is output forencode
and input fordecode
. - int
x[]
(Input) - A one or two-dimensional array depending upon whether encoding or
decoding is requested. If encoding is requested,
x
is an array of lengthnPatterns
containing the categories for a nominal variable numbered from 1 tonClasses
. If decoding is requested, thenx
is an array of sizenPatterns
bynClasses
. In this case, the columns contain only zeros and ones that are interpreted as binary encoded representations for a single nominal variable.
Return Value¶
An array, z[]
. The values in z
are either the encoded or decoded
values for x
, depending upon whether encode
or decode
is
requested. If errors are encountered, None
is returned.
Optional Arguments¶
encode
, (Input)Specifies binary encoding. Classes must be numbered sequentially from 1 to
nClasses
. Optional Argumentsencode
anddecode
are mutually exclusive.Default:
encode
.
or
decode
, (Input)Specifies that
x
will be decoded. The values in each column should be zeros and ones. The values in the i-th column ofx
are associated with the i-th class of the nominal variable. Optional Argumentsencode
anddecode
are mutually exclusive.Default:
encode
.
Description¶
The function unsupervisedNominalFilter
is designed to either encode or
decode nominal variables using a simple binary mapping.
Binary Encoding: encode¶
In this case, x[]
is an input array to which a binary filter is applied.
Binary encoding takes each category in x[]
, and creates a column in
z[]
, the output matrix, containing all zeros and ones. A value of zero
indicates that this category is not present and a value of one indicates
that it is present.
For example, if x[]={2,1,3,4,2,4}
then nClasses=4
, and
Notice that the number of columns in z
is equal to the number of
distinct classes in x
. The number of rows in z
is equal to the
length of x
.
Binary Decoding: decode¶
Binary decoding takes each column in x[]
, and returns the appropriate
class in z[].
For example, if x[]
is the same as described above:
then z[]
would be returned as z[]=
{2, 1, 3, 4, 2, 4}. Notice this
is the same as the original array because classes are numbered sequentially
from 1 to nClasses
. This ensures that the i-th column of x[]
is
associated with the i-th class in the output array.
Example¶
This example illustrates nominal binary encoding and decoding for \(x=\{3,3,1,2,2,1,2\}\).
from __future__ import print_function
from numpy import *
from pyimsl.stat.unsupervisedNominalFilter import unsupervisedNominalFilter
from pyimsl.stat.writeMatrix import writeMatrix
x = array([3, 3, 1, 2, 2, 1, 2])
# Binary filtering
nClasses = [] # Output for encode, input for decode
z = unsupervisedNominalFilter(nClasses, x, encode=True)
print("nClasses = ", nClasses[0])
writeMatrix("x", x, writeFormat="%5i", column=True)
writeMatrix("z", z, writeFormat="%5i")
# Binary Unfiltering.
nClasses = nClasses[0] # Output for encode, input for decode
x2 = unsupervisedNominalFilter(nClasses, z, decode=True)
writeMatrix("Unfiltering result", x2, writeFormat="%5i", column=True)
Output¶
nClasses = 3
x
1 3
2 3
3 1
4 2
5 2
6 1
7 2
z
1 2 3
1 0 0 1
2 0 0 1
3 1 0 0
4 0 1 0
5 0 1 0
6 1 0 0
7 0 1 0
Unfiltering result
1 3
2 3
3 1
4 2
5 2
6 1
7 2