public class ClusterKNN extends Object implements Serializable, Cloneable
Performs a k-Nearest Neighbor classification.
ClusterKNN implements an algorithm to classify objects based on a training set. It is among the simpler classification algorithms: a new object is classified essentially by a majority vote of its k closest neighbors. k must be a positive integer and is typically small and odd. The method is straightforward: the distance from the new point to every point in the training set is computed and sorted, the k closest points are examined, and the new object is assigned to the class that is most common in that set. For the case k = 1, the object is assigned to the class of its nearest neighbor.
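The procedure described above can be sketched in plain Java. This is an illustration only and not the ClusterKNN implementation: the class name KnnSketch, its methods, and the tie-breaking rule (the first class to reach the highest count wins) are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Illustrative k-nearest-neighbor vote; hypothetical, not the library's code. */
class KnnSketch {

    /** Squared Euclidean distance; sorting by it gives the same order as Euclidean distance. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the most common class among the k training points closest to value. */
    static int classify(double[][] x, int[] c, double[] value, int k) {
        if (k < 1 || k > x.length) {
            throw new IllegalArgumentException("k must be between 1 and x.length");
        }
        // Compute the distance from the new point to every training point.
        double[] dist = new double[x.length];
        Integer[] order = new Integer[x.length];
        for (int i = 0; i < x.length; i++) {
            dist[i] = squaredDistance(x[i], value);
            order[i] = i;
        }
        // Sort the training indices by increasing distance.
        Arrays.sort(order, Comparator.comparingDouble(i -> dist[i]));

        // Count class labels among the k closest points and track the leader.
        // Assumption: on a tie, the class that reached the top count first wins.
        Map<Integer, Integer> votes = new HashMap<>();
        int best = c[order[0]];
        int bestCount = 0;
        for (int j = 0; j < k; j++) {
            int label = c[order[j]];
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = label;
            }
        }
        return best;
    }
}
```

With two well-separated groups of training points, a new point near one group receives that group's class for any small k. An odd k avoids ties when there are only two classes.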
The default distance method is the Euclidean distance, but other options are available through the setDistanceMethod method. The supported methods are:
Method  Description
L2_NORM  The Euclidean distance method, $l_2$ norm, defined as the square root of the sum of the squares of the differences in each coordinate. (Default)
L1_NORM  The rectilinear norm or city block method, $l_1$ norm, defined as the sum of the absolute values of the differences in each coordinate. This is most useful for integer input data.
INFINITY_NORM  The Chebyshev distance method, $l_\infty$ norm, defined as the maximum of the absolute values of the differences in each coordinate.
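The three distance measures can be written out as a short sketch. The class and method names below are hypothetical and are not part of ClusterKNN. The square root in the Euclidean case does not change the ordering of neighbors, so an implementation may omit it.

```java
/** Illustrative distance functions; hypothetical helpers, not the library's code. */
class DistanceSketch {

    /** Euclidean (l2) distance: square root of the sum of squared differences. */
    static double l2(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** City block (l1) distance: sum of absolute differences. */
    static double l1(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Chebyshev (l-infinity) distance: largest absolute difference. */
    static double lInfinity(double[] a, double[] b) {
        double max = 0.0;
        for (int i = 0; i < a.length; i++) {
            max = Math.max(max, Math.abs(a[i] - b[i]));
        }
        return max;
    }
}
```

For the points (0, 0) and (3, 4), these give 5, 7, and 4 respectively.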
For cases where the data are poorly scaled, it may be necessary to normalize the input data first. For example, if in a 2D space the X values range from 0 to 1 and the Y values range from 0 to 1000, the distance calculations will be dominated by the Y coordinate unless the data are normalized.
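One common way to normalize is to rescale each variable linearly to the range 0 to 1 (min-max scaling). The sketch below shows that approach. It is one possible choice, not something ClusterKNN prescribes, and the class NormalizeSketch and its methods are hypothetical names for this example.

```java
import java.util.Arrays;

/** Illustrative min-max scaling; hypothetical helper, not part of the library. */
class NormalizeSketch {

    /** Smallest value found in each column of x. */
    static double[] columnMin(double[][] x) {
        double[] min = new double[x[0].length];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        for (double[] row : x) {
            for (int j = 0; j < min.length; j++) {
                min[j] = Math.min(min[j], row[j]);
            }
        }
        return min;
    }

    /** Largest value found in each column of x. */
    static double[] columnMax(double[][] x) {
        double[] max = new double[x[0].length];
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (double[] row : x) {
            for (int j = 0; j < max.length; j++) {
                max[j] = Math.max(max[j], row[j]);
            }
        }
        return max;
    }

    /** Returns a scaled copy of data; a constant column maps to 0. */
    static double[][] scale(double[][] data, double[] min, double[] max) {
        double[][] scaled = new double[data.length][min.length];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < min.length; j++) {
                double range = max[j] - min[j];
                scaled[i][j] = range == 0.0 ? 0.0 : (data[i][j] - min[j]) / range;
            }
        }
        return scaled;
    }
}
```

The minimum and maximum should be computed from the training data and then applied unchanged to the observations being classified, so that both sets share the same scale.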
Modifier and Type  Field and Description
static int  INFINITY_NORM
Indicates the distance is computed using the $l_\infty$ norm method.
static int  L1_NORM
Indicates the distance is computed using the $l_1$ norm method.
static int  L2_NORM
Indicates the distance is computed using the $l_2$ norm, or Euclidean distance measurement.

Constructor and Description
ClusterKNN(double[][] x, int[] c)
Constructor for ClusterKNN.
Modifier and Type  Method and Description
int[]  classify(double[][] value, int k)
Classify a set of observations using k nearest neighbors.
int  classify(double[] value, int k)
Classify an observation using k nearest neighbors.
void  setDistanceMethod(int method)
Sets the distance calculation method to be used.

public static final int INFINITY_NORM
public static final int L1_NORM
public static final int L2_NORM
public ClusterKNN(double[][] x, int[] c)
Constructor for ClusterKNN.
Parameters:
x - A double matrix containing the known x.length observations of x[0].length variables.
c - An int array containing the categories for the x.length observations. All integer values are valid.
public int[] classify(double[][] value, int k)
Classify a set of observations using k nearest neighbors.
Parameters:
value - A double matrix of value.length observations and x[0].length variables to classify.
k - An int containing the number of nearest neighbors to use. An odd value is recommended.
Returns:
An int array containing the cluster to which each of the observations belongs.
public int classify(double[] value, int k)
Classify an observation using k nearest neighbors.
Parameters:
value - A double array of x[0].length variables containing the observation to classify.
k - An int containing the number of nearest neighbors to use. An odd value is recommended.
Returns:
An int containing the cluster to which the observation belongs.
public void setDistanceMethod(int method)
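Sets the distance calculation method to be used.
Parameters:
method - An int identifying the distance calculation method to be used. By default, method = L2_NORM.
Method  Description
L2_NORM  Euclidean distance ($l_2$ norm)
L1_NORM  Rectilinear or city block distance ($l_1$ norm)
INFINITY_NORM  Chebyshev distance ($l_\infty$ norm)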
Copyright © 1970-2015 Rogue Wave Software
Built October 13 2015.