unsupervisedOrdinalFilter

Converts ordinal data into cumulative proportions. Optionally, it can reverse the encoding, accepting proportions and converting them back into ordinal values.

Synopsis

unsupervisedOrdinalFilter (x, z)

Required Arguments

int x[] (Input/Output)
An array of length nPatterns containing the classes for the ordinal data. Classes must be numbered 1 to nClasses. This is an output argument if decode is specified; otherwise, it is an input argument.
float z[] (Input/Output)
An array of length nPatterns containing the encoded values for x, represented as the cumulative proportions associated with each ordinal class (values between 0.0 and 1.0 inclusive). This is an input argument if decode is specified; otherwise, it is an output argument.

Optional Arguments

encode, (Input)

Specifies z as an output array and x an input array that is filtered by converting each ordinal class value into a cumulative proportion (a value between 0.0 and 1.0 inclusive). Optional Arguments encode and decode are mutually exclusive.

Default: encode.

or

decode, (Input)

Specifies x as an output array and z an input array that contains transformed cumulative proportions. In this case, the transformed cumulative proportions are converted into ordinal class values using the coding class = 1, 2, …, nClasses. Optional Arguments encode and decode are mutually exclusive.

Default: encode.

noTransform, (Input)

Indicates that the cumulative proportions used to encode the ordinal variable are not transformed. Optional Arguments noTransform, squareRoot, and arcSin are mutually exclusive.

Default: noTransform.

or

squareRoot, (Input)

Indicates cumulative proportions are transformed using the square root transformation. Optional Arguments noTransform, squareRoot, and arcSin are mutually exclusive.

Default: noTransform.

or

arcSin, (Input)

Indicates cumulative proportions are transformed using the arcsin of the square root of the cumulative proportions. Optional Arguments noTransform, squareRoot, and arcSin are mutually exclusive.

Default: noTransform.

nClasses (Output)
The number of ordinal classes in x and the number of unique proportions in z.

Description

The function unsupervisedOrdinalFilter is designed to either encode or decode ordinal variables. Filtering consists of transforming the ordinal classes into proportions, with each proportion being equal to the proportion of the data at or below this class.

Ordinal Filtering: encode

In this case, x is an input array that is filtered by converting each ordinal class value into a cumulative proportion.

For example, if x[]={2,1,3,4,2,4,1,1,3,3}, then nPatterns=10 and nClasses=4. The function fills z with the cumulative proportions shown in the table below. Each cumulative proportion is the fraction of the data in that class or a lower class.

Ordinal Class   Frequency   Cumulative Proportion
1               3           30%
2               2           50%
3               3           80%
4               2           100%

If noTransform is specified, then the equivalent proportions in z are

\[\texttt{z[]}=\{0.50, 0.30, 0.80, 1.00, 0.50, 1.00, 0.30, 0.30, 0.80, 0.80\}.\]
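The encoding step described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `encode_ordinal` is hypothetical, and it assumes the classes in x are integers numbered 1 to nClasses with the largest value present in the data.

```python
from collections import Counter


def encode_ordinal(x):
    """Map each ordinal class in x to its cumulative proportion.

    Hypothetical helper for illustration only; not part of PyIMSL.
    Assumes classes are integers numbered 1..nClasses.
    """
    n_patterns = len(x)
    counts = Counter(x)
    n_classes = max(x)  # assumption: the highest class occurs in x

    # Running count of observations at or below each class.
    cumulative = {}
    running = 0
    for cls in range(1, n_classes + 1):
        running += counts[cls]
        cumulative[cls] = running / n_patterns

    z = [cumulative[cls] for cls in x]
    return z, n_classes


x = [2, 1, 3, 4, 2, 4, 1, 1, 3, 3]
z, n_classes = encode_ordinal(x)
print(n_classes)
print([round(v, 2) for v in z])
```

For the x above, the sketch reproduces the proportions listed for noTransform.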

If squareRoot is specified, then the square root of each proportion is returned. That is, where \(p_i\) is the cumulative percentage from the table above for the class in x[i],

\[\mathtt{z}[i] = \sqrt{\frac{p_i}{100}}\]
\[\texttt{z[]}=\{0.71, 0.55, 0.89, 1.00, 0.71, 1.00, 0.55, 0.55, 0.89, 0.89\}.\]

If arcSin is specified, then the arcsin of the square root of each proportion is returned. Where \(p_i\) is the cumulative percentage from the table above for the class in x[i], the calculation is:

\[\mathtt{z}[i] = \arcsin\left(\sqrt{\frac{p_i}{100}}\right)\]
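The three transformation options can be illustrated with Python's standard math module. This is a sketch under stated assumptions: the helper `transform_proportions` is hypothetical, its method names simply mirror the optional argument names, and it takes proportions already scaled to the range 0.0 to 1.0 (so no division by 100 is needed). Note that `math.asin` returns radians; whether the library reports the arcSin result in the same unit is an assumption here.

```python
import math


def transform_proportions(z, method="noTransform"):
    """Apply one of the three transformations to proportions in [0, 1].

    Hypothetical helper for illustration only; not part of PyIMSL.
    """
    if method == "noTransform":
        return list(z)
    if method == "squareRoot":
        return [math.sqrt(p) for p in z]
    if method == "arcSin":
        # Arcsin of the square root, in radians.
        return [math.asin(math.sqrt(p)) for p in z]
    raise ValueError("unknown method: " + method)


z = [0.50, 0.30, 0.80, 1.00, 0.50, 1.00, 0.30, 0.30, 0.80, 0.80]
print([round(v, 2) for v in transform_proportions(z, "squareRoot")])
print([round(v, 4) for v in transform_proportions(z, "arcSin")])
```

Both transformations are monotonically increasing on the interval from 0 to 1, so they change the spacing of the encoded values but not their order.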

Ordinal Unfiltering: decode

Ordinal Unfiltering takes the transformed cumulative proportions in z and converts them into ordinal class values using the coding class = 1, 2, …, nClasses.

For example, if noTransform is specified and z[]={0.20, 1.00, 0.20, 0.40, 1.00, 1.00, 0.40, 0.10, 1.00, 1.00} then upon return, the output array would consist of the ordinal classes x[]={2, 4, 2, 3, 4, 4, 3, 1, 4, 4}.

If one of the transforms is specified, the same operation is performed, since the transformations are monotonically increasing and therefore preserve the ordering of the proportions. For example, if the original observations consisted of {2.8, 5.6, 5.6, 1.2, 4.5, 7.1}, then the input x for encoding would be x[]={2, 4, 4, 1, 3, 5} and the output nClasses=5. After decoding, the output array x would again consist of the ordinal classes x[]={2, 4, 4, 1, 3, 5}.
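Decoding as described here amounts to ranking the distinct values in z. The following sketch is one way to express that, not the library's implementation: the helper name `decode_ordinal` is hypothetical, and it assumes equal proportions are represented by exactly equal floating-point values (a real implementation might compare with a tolerance).

```python
import math


def decode_ordinal(z):
    """Map each value in z to the 1-based rank of that value among
    the distinct values of z.

    Hypothetical helper for illustration only; not part of PyIMSL.
    """
    levels = sorted(set(z))
    rank = {value: i for i, value in enumerate(levels, start=1)}
    x = [rank[value] for value in z]
    return x, len(levels)


z = [0.20, 1.00, 0.20, 0.40, 1.00, 1.00, 0.40, 0.10, 1.00, 1.00]
x, n_classes = decode_ordinal(z)
print(x, n_classes)

# A monotonically increasing transform leaves the ranks unchanged.
z_sqrt = [math.sqrt(v) for v in z]
print(decode_ordinal(z_sqrt) == (x, n_classes))
```

Because only the ordering of the values matters, the same ranking applied to the raw observations {2.8, 5.6, 5.6, 1.2, 4.5, 7.1} yields the classes {2, 4, 4, 1, 3, 5}.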

Example

A taste test was conducted yielding the following data:

Individual   Rating
1            Poor
2            Good
3            Very Good
4            Very Poor
5            Very Good

The data in the table above would have the coded values shown below. This assumes that the rating scale is: very poor, poor, good, and very good.

\[\texttt{x}=\{2, 3, 4, 1, 4\}\]

The returned values are:

\[\texttt{z}=\{0.40, 0.60, 1.00, 0.20, 1.00\}.\]

from __future__ import print_function
from pyimsl.stat.unsupervisedOrdinalFilter import unsupervisedOrdinalFilter
from pyimsl.stat.writeMatrix import writeMatrix

# Coded ratings: 1=very poor, 2=poor, 3=good, 4=very good
x = [2, 3, 4, 1, 4]
z = []         # receives the cumulative proportions
nClasses = []  # receives the number of ordinal classes

# Filtering: encode x into cumulative proportions z
unsupervisedOrdinalFilter(x, z, encode=True, nClasses=nClasses)
writeMatrix("x", x, writeFormat="%5i", column=True)
writeMatrix("z", z, column=True)

# Unfiltering: decode z back into ordinal classes x2
x2 = []
unsupervisedOrdinalFilter(x2, z, decode=True, nClasses=nClasses)
print("\nnClasses: ", nClasses[0])
writeMatrix("x-unfiltered", x2, writeFormat="%5i", column=True)

Output

nClasses:  4
 
    x
1      2
2      3
3      4
4      1
5      4
 
       z
1          0.4
2          0.6
3          1.0
4          0.2
5          1.0
 
x-unfiltered
  1      2
  2      3
  3      4
  4      1
  5      4