Class KohonenSOM
- All Implemented Interfaces:
Serializable, Cloneable
A self-organizing map (SOM), also known as a Kohonen map or Kohonen SOM, is a technique for gathering high-dimensional data into clusters that are constrained to lie in a low-dimensional space, usually two dimensions. A Kohonen map is widely used for feature extraction and visualization of very high-dimensional data in situations where classifications are not known beforehand. The Kohonen SOM is equivalent to an artificial neural network having inputs linked to every node in the network. Self-organizing maps use a neighborhood function to preserve the topological properties of the input space.
In a Kohonen map, nodes are arranged in a rectangular or hexagonal grid or lattice. The input is connected to each node, and the output of the Kohonen map is the zero-based (i, j) index of the node that is closest to the input. A Kohonen map involves two steps: training and forecasting. Training builds the map using input examples (vectors), and forecasting classifies a new input.
During training, an input vector is fed to the network. The input's Euclidean distance from all the nodes is calculated. The node with the shortest distance is identified and is called the Best Matching Unit, or BMU. After identifying the BMU, the weights of the BMU and the nodes closest to it in the SOM lattice are updated towards the input vector. The magnitude of the update decreases with time and with distance (within the lattice) from the BMU. The weights of the nodes surrounding the BMU are updated according to:
$${W_{t + 1}} = {W_t} + \alpha \left( t \right) * h\left( {d,t} \right) * \left( {{D_t} - {W_t}} \right)$$
where \({W_t}\) represents the node weights, \(\alpha \left( t \right)\) is the monotonically decreasing learning coefficient function, \(h\left( {d,t} \right)\) is the neighborhood function, d is the lattice distance between the node and the BMU, and \({D_t}\) is the input vector.
The monotonically decreasing learning coefficient function \(\alpha \left( t \right)\) is a scalar factor that defines the size of the update correction. The value of \(\alpha \left( t \right)\) decreases with the step index t.
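The update rule can be illustrated with a minimal sketch. It is not the library's implementation: the class name `SomUpdateSketch`, the method names, and in particular the linear decay schedule for \(\alpha \left( t \right)\) are assumptions chosen for illustration, since the text above only requires that the coefficient decrease with t.

```java
import java.util.Arrays;

/** Illustrative sketch of one SOM weight update; hypothetical names, not the library's code. */
final class SomUpdateSketch {

    /**
     * Assumed schedule: decays linearly from alpha0 at t = 0 to 0 at t = tMax.
     * Any monotonically decreasing function would satisfy the description.
     */
    static double linearAlpha(int t, int tMax, double alpha0) {
        return alpha0 * (1.0 - (double) t / tMax);
    }

    /** Applies W_{t+1} = W_t + alpha * h * (D_t - W_t) to one node, in place. */
    static void updateNode(double[] w, double[] d, double alpha, double h) {
        for (int k = 0; k < w.length; k++) {
            w[k] += alpha * h * (d[k] - w[k]);
        }
    }

    public static void main(String[] args) {
        double[] w = {0.0, 1.0};   // current node weights W_t
        double[] d = {1.0, 0.0};   // input vector D_t
        updateNode(w, d, linearAlpha(0, 10, 0.5), 1.0);
        System.out.println(Arrays.toString(w)); // prints [0.5, 0.5]
    }
}
```

With \(\alpha = 0.5\) and \(h = 1\), the node moves halfway toward the input; smaller values of either factor give a proportionally smaller correction.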
The neighborhood function \(h\left( {d,t} \right)\) depends on the lattice distance d between the node and the BMU, and represents the strength of the coupling between the node and the BMU. In the simplest form, the value of \(h\left( {d,t} \right)\) is 1 for all nodes closest to the BMU and 0 for others, but a Gaussian function is also commonly used. Regardless of the functional form, the neighborhood function shrinks with time (Hollmén, 15.2.1996). Early on, when the neighborhood is broad, self-organization takes place on a global scale. When the neighborhood has shrunk to just a couple of nodes, the weights converge to local estimates.
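The two functional forms mentioned above can be sketched as follows. The class and method names are hypothetical, and the exponential shrink schedule in `width` is an assumption made only to show a neighborhood that narrows with time; the actual schedule is not specified here.

```java
/** Illustrative neighborhood functions; hypothetical names, not the library's code. */
final class NeighborhoodSketch {

    /** Step form: 1 for nodes within the given lattice radius of the BMU, 0 otherwise. */
    static double step(double d, double radius) {
        return d <= radius ? 1.0 : 0.0;
    }

    /** Gaussian form: exp(-d^2 / (2 * sigma^2)). */
    static double gaussian(double d, double sigma) {
        return Math.exp(-(d * d) / (2.0 * sigma * sigma));
    }

    /** Assumed shrink schedule: width decays exponentially with step index t. */
    static double width(double sigma0, int t, double tau) {
        return sigma0 * Math.exp(-t / tau);
    }

    public static void main(String[] args) {
        // Coupling of a node at lattice distance 2, early and late in training.
        System.out.println(gaussian(2.0, width(3.0, 0, 5.0)));
        System.out.println(gaussian(2.0, width(3.0, 10, 5.0)));
    }
}
```

The second value printed is smaller than the first: as the width shrinks, a node at a fixed lattice distance from the BMU is coupled to it less strongly.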
Note that in a rectangular grid, the BMU has four closest nodes for the Von Neumann neighborhood type, or eight closest nodes for the Moore neighborhood type. In a hexagonal grid, the BMU has six closest nodes.
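The neighbor counts can be checked with a small sketch that enumerates the closest nodes of an interior cell. All names are hypothetical. For the hexagonal case the sketch assumes an offset layout in which odd rows are shifted half a cell to the right; the layout actually used by the class is not stated here.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative neighbor enumeration; hypothetical names, not the library's code. */
final class LatticeNeighborsSketch {

    // (row, column) offsets of the closest nodes in a rectangular grid.
    static final int[][] VON_NEUMANN = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
    static final int[][] MOORE = {
        {-1, -1}, {-1, 0}, {-1, 1}, {0, -1}, {0, 1}, {1, -1}, {1, 0}, {1, 1}
    };

    // Assumed hexagonal layout: odd rows shifted right by half a cell.
    static final int[][] HEX_EVEN_ROW = {{0, -1}, {0, 1}, {-1, -1}, {-1, 0}, {1, -1}, {1, 0}};
    static final int[][] HEX_ODD_ROW = {{0, -1}, {0, 1}, {-1, 0}, {-1, 1}, {1, 0}, {1, 1}};

    /** Returns the in-bounds neighbors of node (i, j) for the given offsets (no wrap-around). */
    static List<int[]> neighbors(int i, int j, int nrow, int ncol, int[][] offsets) {
        List<int[]> out = new ArrayList<>();
        for (int[] o : offsets) {
            int r = i + o[0];
            int c = j + o[1];
            if (r >= 0 && r < nrow && c >= 0 && c < ncol) {
                out.add(new int[] {r, c});
            }
        }
        return out;
    }

    /** Neighbors in the assumed hexagonal layout; the offsets depend on row parity. */
    static List<int[]> hexNeighbors(int i, int j, int nrow, int ncol) {
        return neighbors(i, j, nrow, ncol, (i % 2 == 0) ? HEX_EVEN_ROW : HEX_ODD_ROW);
    }

    public static void main(String[] args) {
        System.out.println(neighbors(2, 2, 5, 5, VON_NEUMANN).size()); // prints 4
        System.out.println(neighbors(2, 2, 5, 5, MOORE).size());       // prints 8
        System.out.println(hexNeighbors(2, 2, 5, 5).size());           // prints 6
    }
}
```

Nodes on the border of the grid have fewer neighbors in this sketch because it does not connect opposite edges.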
During training, this process is repeated for a number of iterations on all input vectors.
During forecasting, the node with the shortest Euclidean distance is the winning node, and its (i, j) index is the output.
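The winning-node search can be sketched as a scan over the grid. This is an illustration under assumptions, not the library's code: the class and method names are hypothetical, ties are resolved in favor of the first node found, and squared distances are compared because they rank nodes in the same order as Euclidean distances.

```java
import java.util.Arrays;

/** Illustrative best-matching-unit search; hypothetical names, not the library's code. */
final class BmuSketch {

    /**
     * Returns the zero-based (i, j) index of the node whose weights are closest
     * to the input. weights is an nrow by ncol grid of weight vectors.
     */
    static int[] bestMatchingUnit(double[][][] weights, double[] input) {
        int bestI = 0;
        int bestJ = 0;
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < weights.length; i++) {
            for (int j = 0; j < weights[i].length; j++) {
                double sum = 0.0;
                for (int k = 0; k < input.length; k++) {
                    double diff = input[k] - weights[i][j][k];
                    sum += diff * diff;
                }
                if (sum < best) {   // strict comparison keeps the first node on ties
                    best = sum;
                    bestI = i;
                    bestJ = j;
                }
            }
        }
        return new int[] {bestI, bestJ};
    }

    public static void main(String[] args) {
        double[][][] weights = {
            {{0.0, 0.0}, {0.0, 1.0}},
            {{1.0, 0.0}, {1.0, 1.0}}
        };
        int[] bmu = bestMatchingUnit(weights, new double[] {0.9, 0.2});
        System.out.println(Arrays.toString(bmu)); // prints [1, 0]
    }
}
```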
-
Field Summary
Fields
Modifier and Type    Field    Description
static final int    GRID_HEXAGONAL    Indicates a hexagonal grid.
static final int    GRID_RECTANGULAR    Indicates a rectangular grid.
static final int    TYPE_MOORE    Indicates a Moore neighborhood type.
static final int    TYPE_VON_NEUMANN    Indicates a Von Neumann neighborhood type.
-
Constructor Summary
Constructors
Constructor    Description
KohonenSOM(int dim, int nrow, int ncol)    Constructor for a KohonenSOM object.
-
Method Summary
Modifier and Type    Method    Description
int[]    forecast(double[] input)    Returns a forecast computed using the KohonenSOM object.
int[][]    forecast(double[][] input)    Returns forecasts computed using the KohonenSOM object.
int    getDimension()    Returns the number of weights for each node.
int    getGridType()    Returns the grid type.
int    getNeighborhoodType()    Returns the neighborhood type for the rectangular grid.
int    getNumberOfColumns()    Returns the number of columns of the node grid.
int    getNumberOfRows()    Returns the number of rows of the node grid.
double[][][]    getWeights()    Returns the weights of the nodes.
double[]    getWeights(int i, int j)    Returns the weights of the node at (i, j) in the node grid.
boolean    isWrapAround()    Returns whether the opposite edges are connected or not.
void    setGridType(int type)    Sets the grid type.
void    setNeighborhoodType(int type)    Sets the neighborhood type.
void    setWeights()    Sets the weights of the nodes using random numbers.
void    setWeights(double[][][] weights)    Sets the weights of the nodes.
void    setWeights(int i, int j, double[] weights)    Sets the weights of the node at (i, j) in the node grid.
void    setWeights(Random random)    Sets the weights of the nodes using a Random object.
void    wrapAround()    Sets a flag to indicate the map should wrap around or connect opposite edges.
-
Field Details
-
TYPE_VON_NEUMANN
public static final int TYPE_VON_NEUMANN
Indicates a Von Neumann neighborhood type.
-
TYPE_MOORE
public static final int TYPE_MOORE
Indicates a Moore neighborhood type.
-
GRID_RECTANGULAR
public static final int GRID_RECTANGULAR
Indicates a rectangular grid.
-
GRID_HEXAGONAL
public static final int GRID_HEXAGONAL
Indicates a hexagonal grid.
-
-
Constructor Details
-
KohonenSOM
public KohonenSOM(int dim, int nrow, int ncol)
Constructor for a KohonenSOM object.
- Parameters:
dim - An int scalar containing the number of weights for each node in the node grid. dim must be greater than zero.
nrow - An int scalar containing the number of rows in the node grid. nrow must be greater than zero.
ncol - An int scalar containing the number of columns in the node grid. ncol must be greater than zero.
-
-
Method Details
-
getDimension
public int getDimension()
Returns the number of weights for each node.
- Returns: An int scalar containing the number of weights for each node.
-
getNumberOfRows
public int getNumberOfRows()
Returns the number of rows of the node grid.
- Returns: An int scalar containing the number of rows of the node grid.
-
getNumberOfColumns
public int getNumberOfColumns()
Returns the number of columns of the node grid.
- Returns: An int scalar containing the number of columns of the node grid.
-
wrapAround
public void wrapAround()
Sets a flag to indicate the map should wrap around or connect opposite edges. A hexagonal grid must have an even number of rows to wrap around. By default, opposite edges are not connected.
-
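The effect of connecting opposite edges on lattice distance can be sketched for a rectangular grid. This is an assumption-laden illustration rather than the library's algorithm: the names are hypothetical, the distance is the Manhattan distance associated with a Von Neumann neighborhood, and the hexagonal case is not covered.

```java
/** Illustrative wrap-around lattice distance; hypothetical names, not the library's code. */
final class WrapAroundSketch {

    /** Shortest separation between indices a and b on an axis of length n whose ends are joined. */
    static int wrappedOffset(int a, int b, int n) {
        int direct = Math.abs(a - b);
        return Math.min(direct, n - direct);
    }

    /** Manhattan lattice distance between nodes (i1, j1) and (i2, j2), with or without wrap-around. */
    static int latticeDistance(int i1, int j1, int i2, int j2,
                               int nrow, int ncol, boolean wrap) {
        int di = wrap ? wrappedOffset(i1, i2, nrow) : Math.abs(i1 - i2);
        int dj = wrap ? wrappedOffset(j1, j2, ncol) : Math.abs(j1 - j2);
        return di + dj;
    }

    public static void main(String[] args) {
        // Opposite corners of a 5 by 5 grid.
        System.out.println(latticeDistance(0, 0, 4, 4, 5, 5, false)); // prints 8
        System.out.println(latticeDistance(0, 0, 4, 4, 5, 5, true));  // prints 2
    }
}
```

With opposite edges connected, nodes on the border gain the same number of neighbors as interior nodes, so they are updated as often as any other node.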
isWrapAround
public boolean isWrapAround()
Returns whether the opposite edges are connected or not.
- Returns: A boolean indicating whether or not the opposite edges are connected. It is true if the opposite edges are connected. Otherwise, it is false.
-
setWeights
public void setWeights()
Sets the weights of the nodes using random numbers. The weights are in [0.0, 1.0].
-
setWeights
public void setWeights(Random random)
Sets the weights of the nodes using a Random object. The weights are generated using the Random.nextDouble method.
- Parameters:
random - A Random object used to generate random numbers for the nodes.
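Random initialization of this kind can be sketched as filling an nrow by ncol grid of weight vectors from a `java.util.Random` instance. The class and method names below are hypothetical, and the sketch does not claim to reproduce the order in which the library draws values.

```java
import java.util.Random;

/** Illustrative random weight initialization; hypothetical names, not the library's code. */
final class RandomWeightsSketch {

    /** Fills an nrow by ncol grid of dim-length weight vectors using Random.nextDouble. */
    static double[][][] randomWeights(int nrow, int ncol, int dim, Random random) {
        double[][][] w = new double[nrow][ncol][dim];
        for (int i = 0; i < nrow; i++) {
            for (int j = 0; j < ncol; j++) {
                for (int k = 0; k < dim; k++) {
                    w[i][j][k] = random.nextDouble();
                }
            }
        }
        return w;
    }

    public static void main(String[] args) {
        // A fixed seed makes the initial map reproducible between runs.
        double[][][] w = randomWeights(2, 3, 4, new Random(123457L));
        System.out.println(w.length + " x " + w[0].length + " x " + w[0][0].length); // prints 2 x 3 x 4
    }
}
```

Passing a seeded `Random` is the usual way to obtain the same starting weights, and therefore comparable training runs, across executions.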
-
setWeights
public void setWeights(double[][][] weights)
Sets the weights of the nodes.
- Parameters:
weights - An nrow by ncol matrix of double arrays containing the weights of the nodes. weights[i][j].length must be equal to dim.
-
setWeights
public void setWeights(int i, int j, double[] weights)
Sets the weights of the node at (i, j) in the node grid.
- Parameters:
i - An int scalar containing the row index of the node in the node grid, where \(0 \leq \mbox{i} \leq \mbox{nrow} - 1\).
j - An int scalar containing the column index of the node in the node grid, where \(0 \leq \mbox{j} \leq \mbox{ncol} - 1\).
weights - A double array containing the weights. weights.length must be equal to dim.
-
getWeights
public double[] getWeights(int i, int j)
Returns the weights of the node at (i, j) in the node grid.
- Parameters:
i - An int scalar containing the row index of the node in the node grid, where \(0 \leq \mbox{i} \leq \mbox{nrow} - 1\).
j - An int scalar containing the column index of the node in the node grid, where \(0 \leq \mbox{j} \leq \mbox{ncol} - 1\).
- Returns: A double array containing the weights of the node at (i, j) in the node grid.
-
getWeights
public double[][][] getWeights()
Returns the weights of the nodes.
- Returns: An nrow by ncol matrix of double arrays containing the weights of the nodes.
-
setNeighborhoodType
public void setNeighborhoodType(int type)
Sets the neighborhood type.
- Parameters:
type - An int scalar containing the neighborhood type, Von Neumann (KohonenSOM.TYPE_VON_NEUMANN) or Moore (KohonenSOM.TYPE_MOORE). This method is ignored for a hexagonal grid. Default: type = TYPE_VON_NEUMANN.
type    Description
TYPE_VON_NEUMANN    Use the Von Neumann (type = 0) neighborhood type.
TYPE_MOORE    Use the Moore (type = 1) neighborhood type.
-
setGridType
public void setGridType(int type)
Sets the grid type.
- Parameters:
type - An int scalar containing the grid type, rectangular (KohonenSOM.GRID_RECTANGULAR) or hexagonal (KohonenSOM.GRID_HEXAGONAL). Default: type = GRID_RECTANGULAR.
type    Description
GRID_RECTANGULAR    Use a rectangular grid (type = 0).
GRID_HEXAGONAL    Use a hexagonal grid (type = 1).
-
getGridType
public int getGridType()
Returns the grid type.
- Returns: An int scalar containing the grid type. The return value is either KohonenSOM.GRID_RECTANGULAR or KohonenSOM.GRID_HEXAGONAL.
-
getNeighborhoodType
public int getNeighborhoodType()
Returns the neighborhood type for the rectangular grid.
- Returns: An int scalar containing the neighborhood type. The return value is either KohonenSOM.TYPE_VON_NEUMANN or KohonenSOM.TYPE_MOORE.
-
forecast
public int[] forecast(double[] input)
Returns a forecast computed using the KohonenSOM object.
- Parameters:
input - A double array containing the input data. input.length must be equal to dim.
- Returns: An int array of length 2 containing the (i, j) index of the output node.
-
forecast
public int[][] forecast(double[][] input)
Returns forecasts computed using the KohonenSOM object.
- Parameters:
input - A double matrix containing input.length observations of data. The length of each row of input must be equal to dim.
- Returns: An int matrix containing the output indices of the nodes. Row k contains the (i, j) index of the output node for input[k].
-