Draw the architecture of the Kohonen Network and explain the algorithm for training the weights of the network.
Architecture of the Kohonen Network (Self-Organizing Map, SOM)
The Kohonen Network, also known as a Self-Organizing Map (SOM), is a type of competitive learning neural network used for unsupervised learning tasks such as clustering and dimensionality reduction. The key idea behind SOM is to map high-dimensional input data into a lower-dimensional (typically two-dimensional) grid of neurons while preserving the topological properties of the input data.
Architecture Diagram
```
           Input Layer
     (High-Dimensional Data)
          /  /  \  \
         /  /    \  \
        /  /      \  \
       /  /        \  \
O---O---O---O---O---O---O   --> Output Layer (2D Grid of Neurons)
|   |   |   |   |   |   |
O---O---O---O---O---O---O
|   |   |   |   |   |   |
O---O---O---O---O---O---O
|   |   |   |   |   |   |
O---O---O---O---O---O---O
|   |   |   |   |   |   |
O---O---O---O---O---O---O
```
Components:
- Input Layer: Consists of input vectors from the high-dimensional space.
- Output Layer (2D Grid of Neurons): A grid where each neuron is connected to all input features through weight vectors. Neurons in this layer compete to be activated based on the input data.
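Concretely, this architecture is just a grid of weight vectors. Here is a minimal NumPy sketch; the 5x7 grid matches the diagram above, while the input dimension of 4 is an arbitrary choice for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_rows, n_cols = 5, 7  # size of the 2D output grid, as drawn above
input_dim = 4          # dimensionality of the input vectors (arbitrary here)

# One weight vector per grid neuron: every neuron is connected to all
# input features through its own weight vector of length input_dim.
weights = rng.random((n_rows, n_cols, input_dim))

print(weights.shape)  # (5, 7, 4)
```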
Training Algorithm for the Kohonen Network
The training algorithm for the Kohonen Network (SOM) involves the following steps:
1. Initialization:
   - Initialize the weight vectors of the neurons in the 2D grid with small random values or by sampling from the input data distribution.
2. Input Presentation:
   - Present an input vector \(\mathbf{x}\) to the network.
3. Best Matching Unit (BMU) Identification:
   - Calculate the distance between the input vector \(\mathbf{x}\) and all the weight vectors \(\mathbf{w}_i\) in the grid.
   - Identify the neuron (BMU) with the smallest distance to the input vector. The Euclidean distance is commonly used:
     \[
     d_i = \|\mathbf{x} - \mathbf{w}_i\|, \qquad \text{BMU} = \arg\min_i d_i
     \]
4. Weight Update:
   - Update the weight vector of the BMU and its neighboring neurons to move closer to the input vector. The update rule is:
     \[
     \mathbf{w}_i(t+1) = \mathbf{w}_i(t) + \eta(t) \cdot h_{i,\text{BMU}}(t) \cdot (\mathbf{x}(t) - \mathbf{w}_i(t))
     \]
     where:
     - \(\eta(t)\) is the learning rate, which decreases over time.
     - \(h_{i,\text{BMU}}(t)\) is the neighborhood function, which decreases with the distance between neuron \(i\) and the BMU and also decreases over time.
5. Neighborhood Function:
   - The neighborhood function \(h_{i,\text{BMU}}(t)\) determines the influence of the BMU on its neighbors. A common choice is the Gaussian function:
     \[
     h_{i,\text{BMU}}(t) = \exp\left(-\frac{\|\mathbf{r}_i - \mathbf{r}_{\text{BMU}}\|^2}{2\sigma^2(t)}\right)
     \]
     where:
     - \(\mathbf{r}_i\) and \(\mathbf{r}_{\text{BMU}}\) are the positions of neuron \(i\) and the BMU in the grid.
     - \(\sigma(t)\) is the width of the neighborhood, which decreases over time.
6. Repeat:
   - Repeat the process for each input vector and iterate through the entire dataset multiple times (epochs) until the weight vectors stabilize, as in the sketch below.
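Putting steps 1-6 together, here is a minimal NumPy sketch of the training loop, assuming the Gaussian neighborhood defined above and an exponential decay for both \(\eta(t)\) and \(\sigma(t)\); the grid size, epoch count, and decay constant are illustrative choices, not part of the algorithm itself:

```python
import numpy as np

def train_som(data, n_rows=10, n_cols=10, n_epochs=20,
              eta0=0.5, sigma0=3.0, seed=0):
    """Train a Kohonen SOM on `data` (shape: n_samples x input_dim)."""
    rng = np.random.default_rng(seed)
    input_dim = data.shape[1]

    # 1. Initialization: one random weight vector per grid neuron.
    weights = rng.random((n_rows, n_cols, input_dim))

    # Grid coordinates r_i of every neuron, used by the neighborhood function.
    rows, cols = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing="ij")
    positions = np.stack([rows, cols], axis=-1)  # shape (n_rows, n_cols, 2)

    n_steps = n_epochs * len(data)
    t = 0
    for epoch in range(n_epochs):
        # 2. Input presentation, one vector at a time in random order.
        for x in rng.permutation(data):
            # Decaying learning rate eta(t) and neighborhood width sigma(t).
            frac = t / n_steps
            eta = eta0 * np.exp(-3.0 * frac)
            sigma = sigma0 * np.exp(-3.0 * frac)

            # 3. BMU identification: neuron whose weights are closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)

            # 5. Gaussian neighborhood h_{i,BMU}(t) over grid distances.
            grid_dist2 = np.sum((positions - positions[bmu]) ** 2, axis=-1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))

            # 4. Weight update: pull the BMU and its neighbors toward x.
            weights += eta * h[..., None] * (x - weights)
            t += 1
    return weights

# Example usage on random 3-D data (illustrative only):
data = np.random.default_rng(1).random((200, 3))
som = train_som(data)
print(som.shape)  # (10, 10, 3)
```

Updating all neurons at once with the broadcast factor `h` is equivalent to looping over the grid: neurons far from the BMU receive a vanishingly small update, so the explicit "neighborhood radius" cutoff used in some formulations is optional here.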
Detailed Steps with Example
1. Initialization:
   - Assume we have a 2D grid of neurons (e.g., 3x3) and each neuron has a weight vector initialized randomly. For simplicity, assume the input vectors are 2-dimensional.
2. Input Presentation:
   - Present an input vector, e.g., \(\mathbf{x} = [0.5, 0.7]\).
3. BMU Identification:
   - Calculate the Euclidean distance between \(\mathbf{x}\) and each neuron's weight vector. Suppose the weight vectors are:
     \[
     \mathbf{w}_1 = [0.1, 0.2], \quad \mathbf{w}_2 = [0.8, 0.9], \quad \mathbf{w}_3 = [0.3, 0.4], \quad \ldots
     \]
   - Compute the distances:
     \[
     d_1 = \sqrt{(0.5 - 0.1)^2 + (0.7 - 0.2)^2}, \qquad d_2 = \sqrt{(0.5 - 0.8)^2 + (0.7 - 0.9)^2}, \qquad \ldots
     \]
   - Identify the BMU with the smallest distance, say \(\mathbf{w}_3\).
4. Weight Update:
   - Update the weight vector of the BMU and its neighbors. Suppose \(\eta(t) = 0.1\) and \(h_{i,\text{BMU}}(t) = 0.5\) for the immediate neighbors:
     \[
     \mathbf{w}_3(t+1) = \mathbf{w}_3(t) + 0.1 \cdot 0.5 \cdot ([0.5, 0.7] - \mathbf{w}_3(t))
     \]
     \[
     \mathbf{w}_3(t+1) = [0.3, 0.4] + 0.05 \cdot ([0.5, 0.7] - [0.3, 0.4])
     \]
     \[
     \mathbf{w}_3(t+1) = [0.3, 0.4] + [0.01, 0.015] = [0.31, 0.415]
     \]
5. Neighborhood Function:
   - Update the neighboring neurons' weights similarly, but with a reduced influence based on their distance from the BMU.
6. Repeat:
   - Repeat the process for each input vector and iterate through the dataset multiple times, gradually decreasing the learning rate \(\eta(t)\) and the neighborhood width \(\sigma(t)\); the short script below checks the arithmetic of this example.
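As a check, the arithmetic of this worked example can be reproduced in a few lines of NumPy, using the weight vectors, \(\eta = 0.1\), and \(h = 0.5\) assumed above:

```python
import numpy as np

x = np.array([0.5, 0.7])
w1, w2, w3 = np.array([0.1, 0.2]), np.array([0.8, 0.9]), np.array([0.3, 0.4])

# BMU identification: Euclidean distance from x to each weight vector.
# (With these particular numbers d_2 and d_3 happen to be equal; the
# example picks w_3 as the BMU.)
for name, w in [("d_1", w1), ("d_2", w2), ("d_3", w3)]:
    print(name, "=", round(float(np.linalg.norm(x - w)), 3))

# Weight update for the BMU w_3 with eta = 0.1 and h = 0.5.
eta, h = 0.1, 0.5
w3_new = w3 + eta * h * (x - w3)
print("w_3(t+1) =", w3_new)  # [0.31  0.415]
```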
Conclusion
The Kohonen Network (Self-Organizing Map) architecture consists of an input layer and a 2D grid of neurons, with each neuron connected to all input features through weight vectors. The training algorithm involves initializing the weights, presenting input vectors, identifying the Best Matching Unit (BMU), updating the weights of the BMU and its neighbors, and repeating the process over multiple iterations. This results in a topologically ordered map where similar input patterns activate neighboring neurons, making SOMs effective for clustering and visualization of high-dimensional data.