public class ClusterHierarchical extends Object implements Serializable, Cloneable
Class ClusterHierarchical conducts a hierarchical cluster analysis based upon a distance matrix or, by appropriate use of the transformation specified in the method setTransformType, based upon a similarity matrix. Only the upper triangular part of the input matrix is used.
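Hierarchical clustering in ClusterHierarchical proceeds as follows:
1. Initially, each data point is considered to be a cluster, numbered 1 to n, where n is the number of rows in the input matrix, dist.
2. If the values in dist are not distances, they are converted to distances using the transformation specified in setTransformType. Set k = 1.
3. The distance matrix is searched for the two closest clusters. These clusters are merged to form a new cluster, numbered n + k. The cluster numbers of the two merged clusters are saved as the left and right sons, and the distance between them is saved as the cluster level.
4. Based upon the clustering method, the distance measures in the row and column of dist corresponding to the new cluster are updated.
5. Set k = k + 1. If k is less than n, go to step 3.
The five methods differ primarily in how the distance matrix is updated after two clusters have been joined. The argument method in setMethod specifies how the distance of the cluster just merged with each of the remaining clusters will be updated. Class ClusterHierarchical allows five methods for computing the distances.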
To understand these measures, suppose in the following discussion that clusters A
and B have just been joined to form cluster Z, and interest is in
computing the distance of Z with another cluster called C.
method  Description
LINKAGE_SINGLE  Single linkage (minimum distance). The distance from Z to C is the minimum of the distances (A to C, B to C).
LINKAGE_COMPLETE  Complete linkage (maximum distance). The distance from Z to C is the maximum of the distances (A to C, B to C).
LINKAGE_AVG_WITHIN_CLUSTERS  Average-distance-within-clusters method. The distance from Z to C is the average distance of all objects that would be within the cluster formed by merging clusters Z and C. This average may be computed according to formulas given by Anderberg (1973, page 139).
LINKAGE_AVG_BETWEEN_CLUSTERS  Average-distance-between-clusters method. The distance from Z to C is the average distance of objects within cluster Z to objects within cluster C. This average may be computed according to methods given by Anderberg (1973, page 140).
LINKAGE_WARDS  Ward's method. Clusters are formed so as to minimize the increase in the within-cluster sums of squares. The distance between two clusters is the increase in these sums of squares if the two clusters were merged. A method for computing this distance from a squared Euclidean distance matrix is given by Anderberg (1973, pages 142-145).
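As an illustration of the update rules in the table, the following sketch computes the distance from the merged cluster Z to C from the distances A to C and B to C. It is not the library's implementation. The class and method names are hypothetical, the size-weighted formula for the average-between case and the Lance-Williams form used for Ward's method are standard textbook recurrences rather than something stated on this page, and the Ward variant assumes its inputs are squared Euclidean distances. The within-clusters average is omitted because it needs additional within-cluster sums.

```java
// Hypothetical helper for illustration; not part of ClusterHierarchical.
final class LinkageUpdateSketch {
    private LinkageUpdateSketch() {}

    /** Single linkage: the minimum of the distances A to C and B to C. */
    static double single(double dAC, double dBC) {
        return Math.min(dAC, dBC);
    }

    /** Complete linkage: the maximum of the distances A to C and B to C. */
    static double complete(double dAC, double dBC) {
        return Math.max(dAC, dBC);
    }

    /**
     * Average distance between clusters. Z is the union of A and B, so the
     * mean over all (object in Z, object in C) pairs is the mean of the two
     * previous averages weighted by the sizes nA and nB (assumed formula).
     */
    static double averageBetween(double dAC, double dBC, int nA, int nB) {
        return (nA * dAC + nB * dBC) / (nA + nB);
    }

    /**
     * Ward's method in Lance-Williams form (assumed formula). All three
     * inputs are assumed to be squared Euclidean (Ward) distances.
     */
    static double wards(double dAC, double dBC, double dAB,
                        int nA, int nB, int nC) {
        double total = nA + nB + nC;
        return ((nA + nC) * dAC + (nB + nC) * dBC - nC * dAB) / total;
    }

    public static void main(String[] args) {
        System.out.println(single(2.0, 4.0));
        System.out.println(complete(2.0, 4.0));
        System.out.println(averageBetween(2.0, 4.0, 1, 3));
        // Points 0, 2 and 4 on a line: squared distances 16 (A-C), 4 (B-C), 4 (A-B).
        System.out.println(wards(16.0, 4.0, 4.0, 1, 1, 1));
    }
}
```

Whether the library scales Ward distances in exactly this way is not stated here, so the absolute values from `wards` should be read as one common convention only.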
In general, single linkage will yield long thin clusters while complete linkage will yield clusters that are more spherical. Average linkage and Ward's linkage tend to yield clusters that are similar to those obtained with complete linkage.
Class ClusterHierarchical
produces a unique
representation of the binary cluster tree via the following three
conventions; the fact that the tree is unique should aid in interpreting the
clusters. First, when two clusters are joined and each cluster contains two
or more data points, the cluster that was initially formed with the smallest
level becomes the left son. Second, when a cluster containing more than one
data point is joined with a cluster containing a single data point, the
cluster with the single data point becomes the right son. Finally, when two
clusters containing only one object are joined, the cluster with the
smallest cluster number becomes the right son.
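The merge loop and the three tree conventions can be sketched in a few lines. The code below is a minimal sketch under stated assumptions, not the ClusterHierarchical implementation: the class name and fields are hypothetical, only single linkage is implemented, ties are broken by taking the first closest pair found, and the input is assumed to have at least one row. Merged clusters are numbered n + k for the k-th merge.

```java
import java.util.Arrays;

// Hypothetical sketch for illustration; not the ClusterHierarchical implementation.
final class AgglomerationSketch {
    final int[] leftSons;    // element k-1: left son of cluster n + k
    final int[] rightSons;   // element k-1: right son of cluster n + k
    final double[] levels;   // element k-1: level at which cluster n + k was formed

    AgglomerationSketch(double[][] dist) {
        int n = dist.length;
        leftSons = new int[n - 1];
        rightSons = new int[n - 1];
        levels = new double[n - 1];

        // Working copy built from the upper triangle only.
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                d[i][j] = dist[i][j];
                d[j][i] = dist[i][j];
            }
        }

        int[] id = new int[n];            // cluster number held by each slot
        int[] size = new int[n];          // number of data points in the slot's cluster
        double[] formed = new double[n];  // level at which the slot's cluster was formed
        boolean[] active = new boolean[n];
        for (int i = 0; i < n; i++) {
            id[i] = i + 1;
            size[i] = 1;
            active[i] = true;
        }

        for (int k = 1; k < n; k++) {
            // Find the two closest active clusters.
            int a = -1, b = -1;
            for (int i = 0; i < n; i++) {
                if (!active[i]) continue;
                for (int j = i + 1; j < n; j++) {
                    if (active[j] && (a < 0 || d[i][j] < d[a][b])) {
                        a = i;
                        b = j;
                    }
                }
            }
            double level = d[a][b];

            int left, right;
            if (size[a] > 1 && size[b] > 1) {
                // Both merged earlier: the one formed at the smaller level is the left son.
                boolean aFirst = formed[a] <= formed[b];
                left = aFirst ? id[a] : id[b];
                right = aFirst ? id[b] : id[a];
            } else if (size[a] > 1 || size[b] > 1) {
                // The single data point becomes the right son.
                left = size[a] > 1 ? id[a] : id[b];
                right = size[a] > 1 ? id[b] : id[a];
            } else {
                // Two single data points: the smaller cluster number is the right son.
                left = Math.max(id[a], id[b]);
                right = Math.min(id[a], id[b]);
            }
            leftSons[k - 1] = left;
            rightSons[k - 1] = right;
            levels[k - 1] = level;

            // Single-linkage update; slot a now holds the new cluster n + k.
            for (int c = 0; c < n; c++) {
                if (active[c] && c != a && c != b) {
                    d[a][c] = Math.min(d[a][c], d[b][c]);
                    d[c][a] = d[a][c];
                }
            }
            active[b] = false;
            id[a] = n + k;
            size[a] += size[b];
            formed[a] = level;
        }
    }

    public static void main(String[] args) {
        // Four points on a line at 0, 1, 5 and 7; only the upper triangle is filled in.
        double[][] dist = {{0, 1, 5, 7}, {0, 0, 4, 6}, {0, 0, 0, 2}, {0, 0, 0, 0}};
        AgglomerationSketch s = new AgglomerationSketch(dist);
        System.out.println(Arrays.toString(s.leftSons));
        System.out.println(Arrays.toString(s.rightSons));
        System.out.println(Arrays.toString(s.levels));
    }
}
```

In the example, points 1 and 2 merge first into cluster 5, points 3 and 4 then form cluster 6, and the last merge joins clusters 5 and 6 with cluster 5 as the left son because it was formed at the smaller level.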
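The clusters corresponding to the original data points are numbered from 1 to n, where n is the number of rows in dist. The n - 1 clusters formed by merging clusters are numbered n + 1 to n + (n - 1).
Correlations used as similarities can be converted to a distance measure by specifying transform = RECIPROCAL_ABS in the setTransformType method.
Either variables or observations may be clustered with ClusterHierarchical since a dissimilarity matrix, not the original data, is used. Class Dissimilarities may be used to compute the matrix dist for either the variables or observations.
Modifier and Type  Field and Description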

static int 
LINKAGE_AVG_BETWEEN_CLUSTERS
Indicates the average distance between (average distance between objects
in the two clusters) method.

static int 
LINKAGE_AVG_WITHIN_CLUSTERS
Indicates the average distance within (average distance between objects
within the merged cluster) method.

static int 
LINKAGE_COMPLETE
Indicates the complete linkage (maximum distance) method.

static int 
LINKAGE_SINGLE
Indicates the single linkage (minimum distance) method.

static int 
LINKAGE_WARDS
Indicates Ward's method.

static int 
MULTIPLICATION
Indicates transformation by multiplication by -1.0.

static int 
NONE
Indicates no transformation.

static int 
RECIPROCAL_ABS
Indicates transformation by taking the reciprocal of the absolute value.

Constructor and Description 

ClusterHierarchical(double[][] dist)
Constructor for ClusterHierarchical.
Modifier and Type  Method and Description 

void 
compute()
Performs a hierarchical cluster analysis.

int[] 
getClusterLeftSons()
Returns the left sons of each merged cluster.

double[] 
getClusterLevel()
Returns the level at which the clusters are joined.

int[] 
getClusterMembership(int nClusters)
Returns the cluster membership of each observation.

int[] 
getClusterRightSons()
Returns the right sons of each merged cluster.

int 
getMethod()
Returns the clustering method used.

int[] 
getObsPerCluster(int nClusters)
Returns the number of observations in each cluster.

int 
getTransformType()
Returns the type of transformation.

void 
setMethod(int method)
Sets the clustering method to be used.

void 
setTransformType(int transform)
Sets the type of transformation.

public static final int LINKAGE_AVG_BETWEEN_CLUSTERS
public static final int LINKAGE_AVG_WITHIN_CLUSTERS
public static final int LINKAGE_COMPLETE
public static final int LINKAGE_SINGLE
public static final int LINKAGE_WARDS
public static final int MULTIPLICATION
public static final int NONE
public static final int RECIPROCAL_ABS
public ClusterHierarchical(double[][] dist)
Constructor for ClusterHierarchical.
Parameters: dist - A double symmetric matrix containing the distance (or similarity) matrix. Only the upper triangular part is used.
public void compute()
public final int[] getClusterLeftSons()
Returns: an int array containing the left sons of each merged cluster.
public final double[] getClusterLevel()
Returns: a double array containing the level at which the clusters are joined. Element [k-1] contains the distance (or similarity) level at which cluster n + k was formed.
public final int[] getClusterMembership(int nClusters)
Parameters: nClusters - An int which specifies the desired number of clusters.
Returns: an int array containing the cluster membership of each observation.
public final int[] getClusterRightSons()
Returns: an int array containing the right sons of each merged cluster.
public int getMethod()
public final int[] getObsPerCluster(int nClusters)
Parameters: nClusters - An int which specifies the desired number of clusters.
Returns: an int array containing the number of observations in each cluster.
public int getTransformType()
public void setMethod(int method)
Parameters: method - An int identifying the clustering method to be used. By default, method = LINKAGE_SINGLE.
method  Description
LINKAGE_SINGLE  Single linkage (minimum distance).
LINKAGE_COMPLETE  Complete linkage (maximum distance).
LINKAGE_AVG_WITHIN_CLUSTERS  Average distance within (average distance between objects within the merged cluster).
LINKAGE_AVG_BETWEEN_CLUSTERS  Average distance between (average distance between objects in the two clusters).
LINKAGE_WARDS  Ward's method (minimize the within-cluster sums of squares). For Ward's method, the elements of dist are assumed to be Euclidean distances.
public void setTransformType(int transform)
Parameters: transform - An int identifying the type of transformation applied to the measures in dist. By default, transform = NONE.
transform  Description
NONE  No transformation is required. The elements of dist are distances.
MULTIPLICATION  Convert similarities to distances by multiplication by -1.0.
RECIPROCAL_ABS  Convert similarities (usually correlations) to distances by taking the reciprocal of the absolute value.
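The effect of the three transform options on a similarity matrix can be illustrated with a small sketch. This is an assumption-laden illustration rather than the library's code: the class, the enum and the method name are hypothetical, the enum stands in for the int constants (whose numeric values are not given here), and only the entries strictly above the diagonal are transformed, on the assumption that nothing else is read.

```java
// Hypothetical sketch for illustration; not part of ClusterHierarchical.
final class TransformSketch {
    // Stand-ins for the int constants NONE, MULTIPLICATION and RECIPROCAL_ABS.
    enum Transform { NONE, MULTIPLICATION, RECIPROCAL_ABS }

    private TransformSketch() {}

    /**
     * Returns a new matrix whose strict upper triangle holds the transformed
     * values; the diagonal and lower triangle are left at zero.
     */
    static double[][] toDistances(double[][] m, Transform t) {
        int n = m.length;
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double v = m[i][j];
                switch (t) {
                    case MULTIPLICATION:
                        v = -1.0 * v;            // larger similarity gives smaller distance
                        break;
                    case RECIPROCAL_ABS:
                        v = 1.0 / Math.abs(v);   // a zero similarity yields Infinity
                        break;
                    default:
                        break;                   // NONE: values are already distances
                }
                out[i][j] = v;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] corr = {{1.0, 0.5, -0.25}, {0.5, 1.0, 0.125}, {-0.25, 0.125, 1.0}};
        double[][] d = toDistances(corr, Transform.RECIPROCAL_ABS);
        System.out.println(d[0][1] + " " + d[0][2] + " " + d[1][2]);
    }
}
```

With RECIPROCAL_ABS a correlation of -0.25 and a correlation of 0.25 give the same distance, so strongly negative correlations are treated as close.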
Copyright © 1970-2015 Rogue Wave Software
Built October 13 2015.