Computes cluster membership for a hierarchical cluster tree.
#include <imsls.h>
int *imsls_cluster_number (int npt, int *iclson, int *icrson, int k, …, 0)
int npt
(Input)
Number of data points to be clustered.
int *iclson (Input)
Vector of length npt − 1 containing the
left son cluster numbers.
Cluster npt + i is formed by merging
clusters iclson[i-1] and icrson[i-1].
int *icrson (Input)
Vector of length npt − 1 containing the
left son cluster numbers.
Cluster npt + i is formed by merging
clusters iclson[i-1] and icrson[i-1].
int k (Input)
Desired number of clusters.
Vector of length npt containing the cluster membership of each observation.
#include <imsls.h>
int
*imsls_cluster_number (int
npt, int
*iclson, int
*icrson, int
k,
IMSLS_OBS_PER_CLUSTER,
int **nclus,
IMSLS_OBS_PER_CLUSTER_USER,
int
nclus[],
IMSLS_RETURN_USER, int
iclus[],
0)
IMSLS_OBS_PER_CLUSTER,
int **nclus (Output)
Address of a pointer to
an internally allocated array of length k containing the
number of observations in each cluster.
IMSLS_OBS_PER_CLUSTER_USER,
int nclus[]
(Output)
Storage for array nclus is provided by
the user. See IMSLS_OBS_PER_CLUSTER.
IMSLS_RETURN_USER, float iclus[]
(Output)
User allocated array of length npt containing the
cluster membership of each observation.
Given a fixed number of clusters (K) and the cluster tree (vectors icrson and iclson) produced by the hierarchical clustering algorithm (see function imsls_f_cluster_hierarchical, function imsls_cluster_number determines the cluster membership of each observation. The function imsls_cluster_number first determines the root nodes for the K distinct subtrees forming the K clusters and then traverses each subtree to determine the cluster membership of each observation. The function imsls_cluster_number also returns the number of observations found in each cluster.
In the following example, cluster membership for K = 2 clusters is found for the displayed cluster tree. The output vector iclus contains the cluster numbers for each observation.
#include <imsls.h>
int main()
{
int k = 2, npt = 5, *iclus;
int iclson[] = {5, 6, 4, 7};
int icrson[] = {3, 1, 2, 8};
iclus = imsls_cluster_number(npt, iclson, icrson, k, 0);
imsls_i_write_matrix("iclus", 1, 5, iclus, 0);
}
iclus
1 2 3 4 5
1 2 1 2 1
This example illustrates the typical usage of imsls_cluster_number. The Fisher Iris data (see function imsls_f_data_sets, see Chapter 15, “Utilities”) is clustered. First the distance between the irises is computed using function imsls_f_dissimilarities. The resulting distance matrix is then clustered using function imsls_f_cluster_hierarchical. The cluster membership for 5 clusters is then obtained via function imsls_cluster_number using the output from imsls_f_cluster_hierarchical. The need for 5 clusters can be obtained either by theoretical means or by examining a cluster tree. The cluster membership for each of the iris observations is printed.
#include <imsls.h>
#include <stdio.h>
#define MAX(A,B) ((A)>(B)?(A): (B))
int main()
{
int ncol = 5, nrow = 150, nvar = 4, npt = 150, k = 5;
int i, j, *iclson, *icrson, *iclus, *nclus;
int ind[4] = {1, 2, 3, 4};
float *clevel, dist[150][150], *x, f_rand;
int *p_iclus = NULL, *p_nclus = NULL;
x = imsls_f_data_sets (3, 0);
imsls_f_dissimilarities(nrow, ncol, x,
IMSLS_INDEX, nvar, ind,
IMSLS_RETURN_USER, dist,
0);
imsls_random_seed_set (4);
for (i = 0; i < npt; i++)
{
for (j = i + 1; j < npt; j++)
{
imsls_f_random_uniform (1, IMSLS_RETURN_USER, &f_rand, 0);
dist[i][j] = MAX (0.0, dist[i][j] + .001 * f_rand);
dist[j][i] = dist[i][j];
}
dist[i][i] = 0.;
}
imsls_f_cluster_hierarchical (npt, (float*)dist,
IMSLS_CLUSTERS, &clevel, &iclson, &icrson,
0);
iclus = imsls_cluster_number (npt, iclson, icrson, k,
IMSLS_OBS_PER_CLUSTER, &nclus,
0);
imsls_i_write_matrix ("iclus", 25, 5, iclus, 0);
imsls_i_write_matrix ("nclus", 1, 5, nclus, 0); }
iclus
1 2 3 4 5
1 5 5 5 5 5
2 5 5 5 5 5
3 5 5 5 5 5
4 5 5 5 5 5
5 5 5 5 5 5
6 5 5 5 5 5
7 5 5 5 5 5
8 5 5 5 5 5
9 5 5 5 5 5
10 5 5 5 5 5
11 2 2 2 2 2
12 2 2 1 2 2
13 1 2 2 2 2
14 2 2 2 2 2
15 2 2 2 2 2
16 2 2 2 2 2
17 2 2 2 2 2
18 2 2 2 2 2
19 2 2 2 1 2
20 2 2 2 1 2
21 2 2 2 2 2
22 2 3 2 2 2
23 2 2 2 2 2
24 2 2 4 2 2
25 2 2 2 2 2
nclus
1 2 3 4 5
4 93 1 2 50