Class ScaleFilter
- All Implemented Interfaces:
Serializable
Bounded scaling is used to ensure that the values in the scaled array fall between a lower and upper bound. The scale limits have the following interpretation:
| Argument | Interpretation |
| --- | --- |
| realMin | The lowest value expected in x. |
| realMax | The largest value expected in x. |
| targetMin | The lower bound for the values in the scaled data. |
| targetMax | The upper bound for the values in the scaled data. |
The scale limits are set using the method setBounds.
The specific scaling used is controlled by the argument
scalingMethod used when constructing the filter object.
If scalingMethod is NO_SCALING, then no scaling is performed
on the data.
If the scalingMethod is BOUNDED_SCALING then the bounded method
of scaling and unscaling is applied to x. The scaling operation
is conducted using the scale limits set in method setBounds,
using the following calculation:
$$z = r(x - \text{realMin}) + \text{targetMin},$$
where
$$r = \frac{\text{targetMax} - \text{targetMin}}{\text{realMax} - \text{realMin}}.$$
If scalingMethod is UNBOUNDED_Z_SCORE_SCALING_MEAN_STDEV
or BOUNDED_Z_SCORE_SCALING_MEAN_STDEV, then the center \(a\) and spread \(b\) used for z-score scaling are the arithmetic average
and sample standard deviation of the training data.
If scalingMethod is UNBOUNDED_Z_SCORE_SCALING_MEDIAN_MAD
or BOUNDED_Z_SCORE_SCALING_MEDIAN_MAD, then \(a\) and \(b\) are the median and
\(\tilde{s}\), where \(\tilde{s}\) is a
robust estimate of the population standard deviation:
$$\tilde{s} = \frac{\text{MAD}}{0.6745},$$
where MAD is the Median Absolute Deviation
$$\text{MAD} = \text{median} \{ \mid x - \text{median}\{x\}\mid \}.$$
The Median Absolute Deviation is a robust measure of spread calculated by
finding the median of the absolute value of differences between each
non-missing value for the \(i\)th variable and the median of those values.
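The robust spread estimate can be illustrated with a short, self-contained sketch. The class RobustSpreadSketch and its methods are hypothetical names introduced here for illustration; they are not part of ScaleFilter. The sketch assumes a non-empty array with no missing values.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of ScaleFilter: robust spread from the MAD. */
final class RobustSpreadSketch {

    /** Median of a sorted copy of the data; the input array is left unchanged. */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1)
                ? sorted[n / 2]
                : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }

    /** MAD = median{ |x - median{x}| }. */
    static double mad(double[] x) {
        double m = median(x);
        double[] deviations = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            deviations[i] = Math.abs(x[i] - m);
        }
        return median(deviations);
    }

    /** Robust estimate of the standard deviation: MAD / 0.6745. */
    static double robustStdDev(double[] x) {
        return mad(x) / 0.6745;
    }

    public static void main(String[] args) {
        // One large outlier barely affects the median or the MAD.
        double[] x = {1.0, 2.0, 3.0, 4.0, 100.0};
        System.out.println("median = " + median(x));
        System.out.println("MAD    = " + mad(x));
        System.out.println("s~     = " + robustStdDev(x));
    }
}
```

For the sample data the median is 3 and the absolute deviations are 2, 1, 0, 1, and 97, so the MAD is 1 regardless of how extreme the outlier is.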
For bounded scaling, if the method decode is called
then an unscaling operation is conducted by inverting the bounded scaling calculation:
$$x = \frac{(z - \text{targetMin})}{r} + \text{realMin}.$$
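The bounded scaling formula and its inverse can be sketched as follows. BoundedScalingSketch is a hypothetical class written only to mirror the two equations; it is not the ScaleFilter implementation, and it makes no assumption about how the library treats values outside the range from realMin to realMax.

```java
/** Hypothetical illustration of the bounded scaling equations, not the library class. */
final class BoundedScalingSketch {
    private final double realMin;
    private final double realMax;
    private final double targetMin;
    private final double targetMax;

    BoundedScalingSketch(double realMin, double realMax,
                         double targetMin, double targetMax) {
        this.realMin = realMin;
        this.realMax = realMax;
        this.targetMin = targetMin;
        this.targetMax = targetMax;
    }

    /** r = (targetMax - targetMin) / (realMax - realMin). */
    double ratio() {
        return (targetMax - targetMin) / (realMax - realMin);
    }

    /** z = r (x - realMin) + targetMin. */
    double encode(double x) {
        return ratio() * (x - realMin) + targetMin;
    }

    /** x = (z - targetMin) / r + realMin. */
    double decode(double z) {
        return (z - targetMin) / ratio() + realMin;
    }

    public static void main(String[] args) {
        // Map data expected in [0, 10] onto [-1, 1].
        BoundedScalingSketch s = new BoundedScalingSketch(0.0, 10.0, -1.0, 1.0);
        double z = s.encode(7.5);
        System.out.println("encode(7.5) = " + z);
        System.out.println("decode(z)   = " + s.decode(z));
    }
}
```

With these limits r is 0.2, so realMin maps to targetMin, realMax maps to targetMax, and decoding an encoded value recovers the original up to rounding.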
Unbounded z-score Scaling
If scalingMethod is
UNBOUNDED_Z_SCORE_SCALING_MEAN_STDEV or
UNBOUNDED_Z_SCORE_SCALING_MEDIAN_MAD, then a scaling operation
is conducted
using the z-score calculation:
$$z = \frac{(x - center)}{spread},$$
If scalingMethod is UNBOUNDED_Z_SCORE_SCALING_MEAN_STDEV then
\(center\) is set equal to the arithmetic average
\(\bar{x}\) of x, and \(spread\) is set equal to the
sample standard deviation of x.
If scalingMethod is UNBOUNDED_Z_SCORE_SCALING_MEDIAN_MAD then
\(center\) is set equal to the median \(\tilde{m}\)
of x, and \(spread\) is set equal to the
Median Absolute Deviation (MAD).
The method decode can be used to unfilter data using
the inverse calculation for the above equation:
$$x = spread \cdot z + center.$$
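A minimal sketch of the mean and standard deviation variant of unbounded z-score scaling follows. ZScoreSketch and its methods are hypothetical names used for illustration, not library calls; the sample standard deviation is computed with the usual n - 1 divisor, and the input is assumed to hold at least two values.

```java
/** Hypothetical illustration of unbounded z-score scaling, not the library class. */
final class ZScoreSketch {

    /** Arithmetic average of x. */
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Sample standard deviation of x (divisor n - 1). */
    static double sampleStdDev(double[] x) {
        double m = mean(x);
        double sumSq = 0.0;
        for (double v : x) {
            sumSq += (v - m) * (v - m);
        }
        return Math.sqrt(sumSq / (x.length - 1));
    }

    /** z = (x - center) / spread. */
    static double encode(double x, double center, double spread) {
        return (x - center) / spread;
    }

    /** x = spread * z + center. */
    static double decode(double z, double center, double spread) {
        return spread * z + center;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0, 5.0};
        double center = mean(x);
        double spread = sampleStdDev(x);
        for (double v : x) {
            double z = encode(v, center, spread);
            System.out.println(v + " -> " + z + " -> " + decode(z, center, spread));
        }
    }
}
```

For this data the center is 3 and the spread is the square root of 2.5, so the value 3 encodes to a z-score of 0.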
Bounded z-score Scaling
This method is essentially the same as the z-score calculation described
above, with additional scaling or unscaling using
the scale limits set in method setBounds. The scaling operation is conducted
using the z-score calculation combined with the bounded scaling ratio \(r\):
$$z = \frac{r \cdot (x - center)}{spread} - r \cdot \text{realMin} + \text{targetMin}.$$
If scalingMethod is BOUNDED_Z_SCORE_SCALING_MEAN_STDEV then
\(center\) is set equal to the arithmetic average
\(\bar{x}\) of x, and \(spread\) is set equal to the
sample standard deviation of x.
If scalingMethod is BOUNDED_Z_SCORE_SCALING_MEDIAN_MAD then
\(center\) is set equal to the median \(\tilde{m}\)
of x, and \(spread\) is set equal to the
Median Absolute Deviation (MAD).
The method decode can be used to unfilter data using
the inverse calculation for the above equation:
$$x = \frac{spread \cdot (z - \text{targetMin})}{r} + spread \cdot \text{realMin} + center.$$
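The bounded z-score equations can be sketched in the same way. The forward formula is algebraically the bounded scaling applied to the z-score, that is \(z = r\,((x - center)/spread - \text{realMin}) + \text{targetMin}\), so in this sketch realMin and realMax are limits on the z-score scale. BoundedZScoreSketch is a hypothetical name, and the parameter values in main are illustrative assumptions.

```java
/** Hypothetical illustration of the bounded z-score equations, not the library class. */
final class BoundedZScoreSketch {

    /** r = (targetMax - targetMin) / (realMax - realMin). */
    static double ratio(double realMin, double realMax,
                        double targetMin, double targetMax) {
        return (targetMax - targetMin) / (realMax - realMin);
    }

    /** z = r (x - center) / spread - r realMin + targetMin. */
    static double encode(double x, double center, double spread,
                         double r, double realMin, double targetMin) {
        return r * (x - center) / spread - r * realMin + targetMin;
    }

    /** x = spread (z - targetMin) / r + spread realMin + center. */
    static double decode(double z, double center, double spread,
                         double r, double realMin, double targetMin) {
        return spread * (z - targetMin) / r + spread * realMin + center;
    }

    public static void main(String[] args) {
        double center = 3.0;
        double spread = 2.0;
        // z-scores expected in [-3, 3] are mapped onto [0, 1].
        double realMin = -3.0, realMax = 3.0, targetMin = 0.0, targetMax = 1.0;
        double r = ratio(realMin, realMax, targetMin, targetMax);

        double z = encode(3.0, center, spread, r, realMin, targetMin);
        System.out.println("encode(center) = " + z);
        System.out.println("decode(z)      = "
                + decode(z, center, spread, r, realMin, targetMin));
    }
}
```

Under these assumed limits the center encodes to the midpoint of the target range, and a value three spreads below or above the center encodes to targetMin or targetMax respectively.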
-
Field Summary
Fields
| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final int | BOUNDED_SCALING | Flag to indicate bounded scaling. |
| static final int | BOUNDED_Z_SCORE_SCALING_MEAN_STDEV | Flag to indicate bounded z-score scaling using the mean and standard deviation. |
| static final int | BOUNDED_Z_SCORE_SCALING_MEDIAN_MAD | Flag to indicate bounded z-score scaling using the median and median absolute deviation. |
| static final int | NO_SCALING | Flag to indicate no scaling. |
| static final int | UNBOUNDED_Z_SCORE_SCALING_MEAN_STDEV | Flag to indicate unbounded z-score scaling using the mean and standard deviation. |
| static final int | UNBOUNDED_Z_SCORE_SCALING_MEDIAN_MAD | Flag to indicate unbounded z-score scaling using the median and median absolute deviation. |
-
Constructor Summary
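Constructors
| Constructor | Description |
| --- | --- |
| ScaleFilter(int scalingMethod) | Constructor for ScaleFilter. |
-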
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| double | decode(double z) | Unscales a value. |
| double[] | decode(double[] z) | Unscales an array of values. |
| void | decode(int columnIndex, double[][] z) | Unscales a single column of a two dimensional array of values. |
| double | encode(double x) | Scales a value. |
| double[] | encode(double[] x) | Scales an array of values. |
| void | encode(int columnIndex, double[][] x) | Scales a single column of a two dimensional array of values. |
| double[] | getBounds() | Retrieves bounds used during bounded scaling. |
| double | getCenter() | Retrieves the measure of center to be used during z-score scaling. |
| double | getSpread() | Retrieves the measure of spread to be used during scaling. |
| void | setBounds(double realMin, double realMax, double targetMin, double targetMax) | Sets bounds to be used during bounded scaling and unscaling. |
| void | setCenter(double center) | Set the measure of center to be used during z-score scaling. |
| void | setSpread(double spread) | Set the measure of spread to be used during z-score scaling. |
-
Field Details
-
NO_SCALING
public static final int NO_SCALING
Flag to indicate no scaling.
-
BOUNDED_SCALING
public static final int BOUNDED_SCALING
Flag to indicate bounded scaling.
-
UNBOUNDED_Z_SCORE_SCALING_MEAN_STDEV
public static final int UNBOUNDED_Z_SCORE_SCALING_MEAN_STDEV
Flag to indicate unbounded z-score scaling using the mean and standard deviation.
-
UNBOUNDED_Z_SCORE_SCALING_MEDIAN_MAD
public static final int UNBOUNDED_Z_SCORE_SCALING_MEDIAN_MAD
Flag to indicate unbounded z-score scaling using the median and median absolute deviation.
-
BOUNDED_Z_SCORE_SCALING_MEAN_STDEV
public static final int BOUNDED_Z_SCORE_SCALING_MEAN_STDEV
Flag to indicate bounded z-score scaling using the mean and standard deviation.
-
BOUNDED_Z_SCORE_SCALING_MEDIAN_MAD
public static final int BOUNDED_Z_SCORE_SCALING_MEDIAN_MAD
Flag to indicate bounded z-score scaling using the median and median absolute deviation.
-
-
Constructor Details
-
ScaleFilter
public ScaleFilter(int scalingMethod)
Constructor for ScaleFilter.
- Parameters:
scalingMethod - An int specifying the scaling method to be applied. scalingMethod is specified by: NO_SCALING, BOUNDED_SCALING, UNBOUNDED_Z_SCORE_SCALING_MEAN_STDEV, UNBOUNDED_Z_SCORE_SCALING_MEDIAN_MAD, BOUNDED_Z_SCORE_SCALING_MEAN_STDEV, or BOUNDED_Z_SCORE_SCALING_MEDIAN_MAD.
-
-
Method Details
-
setBounds
public void setBounds(double realMin, double realMax, double targetMin, double targetMax)
Sets bounds to be used during bounded scaling and unscaling. This method is normally called prior to calls to encode or decode. Otherwise the default bounds are realMin = 0, realMax = 1, targetMin = 0, and targetMax = 1. These bounds are ignored for unbounded scaling.
- Parameters:
realMin - A double containing the lowest expected value in the data to be filtered.
realMax - A double containing the largest expected value in the data to be filtered.
targetMin - A double containing the lowest allowed value in the filtered data.
targetMax - A double containing the largest allowed value in the filtered data.
-
getBounds
public double[] getBounds()
Retrieves bounds used during bounded scaling.
- Returns:
A double array of length 4 containing the values
| i | result[i] |
| --- | --- |
| 0 | realMin. Lowest expected value in the data to be filtered. |
| 1 | realMax. Largest expected value in the data to be filtered. |
| 2 | targetMin. Lowest allowed value in the filtered data. |
| 3 | targetMax. Largest allowed value in the filtered data. |
-
setCenter
public void setCenter(double center)
Set the measure of center to be used during z-score scaling.
- Parameters:
center - A double containing the measure of center to be used during scaling. If this method is not called then the measure of center is computed from the data.
-
getCenter
public double getCenter()
Retrieves the measure of center to be used during z-score scaling.
- Returns:
A double containing the measure of center to be used during z-score scaling.
-
setSpread
public void setSpread(double spread)
Set the measure of spread to be used during z-score scaling.
- Parameters:
spread - A double containing the measure of spread to be used during z-score scaling. If this method is not called then the measure of spread is computed from the data.
-
getSpread
public double getSpread()
Retrieves the measure of spread to be used during scaling.
- Returns:
A double containing the measure of spread to be used during scaling.
-
encode
public double encode(double x)
Scales a value.
- Parameters:
x - A double containing the value to be scaled.
- Returns:
A double containing the scaled value.
-
encode
public double[] encode(double[] x)
Scales an array of values.
- Parameters:
x - A double array containing the data to be scaled.
- Returns:
A double array containing the scaled data.
-
encode
public void encode(int columnIndex, double[][] x)
Scales a single column of a two dimensional array of values.
- Parameters:
columnIndex - An int specifying the index of the column of x to scale. Indexing is zero-based.
x - A double matrix containing the values to be scaled. Its columnIndex-th column is modified in place.
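The in-place, column-wise behavior described for this overload can be sketched as follows. ColumnScalingSketch and encodeColumn are hypothetical names, and the scaling function passed in is an assumed example; the sketch only illustrates that one column is overwritten while the others are left untouched.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper, not part of ScaleFilter: in-place scaling of one column. */
final class ColumnScalingSketch {

    /** Applies the given scaling to column columnIndex of x, overwriting that column. */
    static void encodeColumn(int columnIndex, double[][] x, DoubleUnaryOperator scale) {
        for (double[] row : x) {
            row[columnIndex] = scale.applyAsDouble(row[columnIndex]);
        }
    }

    public static void main(String[] args) {
        double[][] x = {
            {1.0, 10.0},
            {2.0, 20.0},
            {3.0, 30.0}
        };
        // Bounded scaling of column 1 from [10, 30] onto [0, 1]; column 0 is unchanged.
        encodeColumn(1, x, v -> (v - 10.0) / 20.0);
        for (double[] row : x) {
            System.out.println(row[0] + " " + row[1]);
        }
    }
}
```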
-
decode
public double decode(double z)
Unscales a value.
- Parameters:
z - A double containing the value to be unscaled.
- Returns:
A double containing the unscaled value.
-
decode
public double[] decode(double[] z)
Unscales an array of values.
- Parameters:
z - A double array of values to be unscaled.
- Returns:
A double array containing the unscaled data.
-
decode
public void decode(int columnIndex, double[][] z)
Unscales a single column of a two dimensional array of values.
- Parameters:
columnIndex - An int specifying the index of the column of z to unscale. Indexing is zero-based.
z - A double matrix containing the values to be unscaled. Its columnIndex-th column is modified in place.
-