2D1432
Exercise 4:
Self-Organizing Maps
1 Objectives
When you are finished you should understand:

• how the SOM algorithm maps high-dimensional data onto a low-dimensional output grid while preserving topology
• the role of the neighborhood and why its size should shrink during training
• how the same core algorithm can be used to order data, to find a cyclic tour, and to make two-dimensional maps of data
2 Introduction
Self-Organising Maps (SOM) are networks which map points in the input space to points in an output space while preserving the topology. Topology preservation means that points which are close in the input space should also be close in the output space. Normally, the input space is of high dimension while the output is one- or two-dimensional.
This kind of network is particularly useful in situations where complex high-
dimensional data needs to be presented in a form understandable for humans.
Visual information is essentially two-dimensional, and it is often desirable to
organise data in planar graphs or tables.
In this exercise you will implement the core algorithm of SOM and use it for three different tasks. The first is to order objects (animals) in a sequential order according to their attributes. The second is to find a circular tour which passes ten prescribed points in the plane. The third is to make a two-dimensional map over voting behaviour of members of the Swedish parliament. In all three cases the algorithm is supposed to find a low-dimensional representation of higher-dimensional data.
2.1 The SOM algorithm
The core of the algorithm can be summarized in four steps:

1. Select an input pattern from the training set.
2. Measure the similarity between the input pattern and the weight vectors of all output nodes.
3. Find the winning node, i.e. the node with the most similar weight vector, and determine its neighborhood.
4. Update the weights of all nodes in the neighborhood such that their weights are moved closer to the input pattern.
We will now go through these steps in somewhat more detail.
Measuring similarity
Similarity is normally measured by calculating the Euclidean distance between the input pattern and the weight vector. Note that this means that the weights are not really used as proper weights, i.e. they are not multiplied with the input. If we have the input pattern x̄ and the i-th output node has a weight vector w̄i, then the distance for that output node is

di = √((x̄ − w̄i)ᵀ · (x̄ − w̄i))
We only need to find out which output node has the minimal distance to the input pattern. The actual distance values are not interesting, so we can skip the square root operation to save some computer time. The same node will still be the winner.
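As a minimal sketch, assuming the weights are stored in a matrix w with one row per output node and that p is the current input pattern as a row vector, the winner search could look like this:

diff = w - repmat(p, size(w,1), 1);       % difference between each weight vector and the input
[dummy, winner] = min(sum(diff.^2, 2));   % squared distances; the square root is skipped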
Neighborhood
The neighborhood defines the set of output nodes close enough to the winner to get the privilege of having their weights updated. It is important not to confuse the distances in the previous step with the distances in the neighborhood calculation. Earlier we talked about distances in the input space, while the neighborhood is defined in terms of the output space. The output nodes are arranged in a grid, see figures 1 and 2. When a winning node has been found, the neighborhood constitutes the surrounding nodes in this grid. In a one-dimensional grid, the neighbors are simply the nodes whose index differs from that of the winner by less than a prescribed amount. In the two-dimensional case, it is normally sufficient to use a so-called Manhattan distance, i.e. to add the absolute values of the index differences in the row and column directions.
One important consideration is how large the neighborhood should be. The best strategy is normally to start off with a rather large neighborhood and gradually make it smaller. The large neighborhood at the beginning is necessary to get an overall organization, while the small neighborhood at the end makes the detailed positioning correct. Often some trial-and-error experimenting is needed to find a good strategy for handling the neighborhood sizes.
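One possible schedule, purely as a sketch (the exact numbers are something you will have to tune yourself), is to let the neighborhood radius decrease linearly with the epoch number:

nepochs = 20;
for epoch = 1:nepochs
    nbh = round(50 * (nepochs - epoch) / nepochs);   % radius shrinks from about 50 down to 0
    % ... present all training patterns using this neighborhood size ...
end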
Figure 1: SOM network structure for mapping a 6-dimensional input (left) to a 1-dimensional grid (right). All input nodes are connected to all output nodes but in the figure, only connections to one particular output node are shown. The algorithm picks the output node which has the shortest distance between the input pattern and its weight vector. The weights are then updated for this winning node and for its neighbours in the output grid.

In one of the tasks in this exercise you will need a circular one-dimensional neighborhood. This means that nodes at one end of the vector of output nodes are close to those at the other end. This can be achieved by modifying how differences between indices are calculated.
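For example, with n output nodes arranged in a ring, the circular distance between two node indices i and j can be computed as follows (a sketch; n, i and j are hypothetical variables):

d = abs(i - j);
d = min(d, n - d);   % going the other way around the ring may be shorter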
Weight modification

Only the nodes sufficiently close to the winner, i.e. the neighbours, will have their weights updated. The update rule is very simple: the weight vector is simply moved a bit closer to the input pattern:

w̄i ← w̄i + η(x̄ − w̄i)

where η is the step size. A reasonable step size in these tasks is η = 0.2.
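As a sketch of the update for a non-circular one-dimensional grid, assuming winner holds the index of the winning node, nbh is the current neighborhood radius, and w and p are as before:

lo = max(1, winner - nbh);                 % clip the neighborhood at the ends of the grid
hi = min(size(w,1), winner + nbh);
w(lo:hi,:) = w(lo:hi,:) + eta * (repmat(p, hi-lo+1, 1) - w(lo:hi,:));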
Now it is time to start with the three tasks. You will have to write the
central parts of the algorithm yourself. Remember that Matlab is good at
handling vectors and matrices so try to avoid dealing with individual elements.
The main learning loop can be written in less than 20 lines.
Figure 2: SOM network structure for mapping a 6-dimensional input (left) to a 2-dimensional grid (right). Like in figure 1, only the connections to one of the output nodes are shown.
3 Ordering of Animals
Your task is to write the core algorithm in Matlab. Use a weight matrix of size 100 × 84 initialized with random numbers between zero and one (hint: use the function rand). Use an outer loop to train the network for about 20 epochs, and an inner loop which loops through the 32 animals, one at a time.

For each animal you will have to pick out the corresponding row from the props matrix. This can be done using the Matlab statement p = props(a,:) where a is the row index. Then find the row of the weight matrix with the shortest distance to this attribute vector (p). Note that you cannot use a scalar product since the attribute vectors are not normalized. Therefore you have to take the difference between the two vectors and calculate the length of this difference vector. (Hint: the Matlab function min returns two values, where the second one is the index of the minimal value.)
Once you have the index to the winning node, it is time to update the weights. Update the weights so that they come a bit closer to the input pattern. A suitable step size is 0.2. Note that only weights to the winning node and its neighbours should be updated. The neighbours are in this case the nodes with an index close to that of the winning one. You should start with a large neighbourhood and gradually make it smaller. Make the size of the neighbourhood depend on the epoch loop variable so that you start with a neighbourhood of about 50 and end up close to one or zero.
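Putting the pieces together, the learning loop might look something like the sketch below. The neighborhood schedule is only one possible choice, and the variable names are of course up to you:

w = rand(100, 84);                          % random initial weights
eta = 0.2;
nepochs = 20;
for epoch = 1:nepochs
    nbh = round(50 * (nepochs - epoch) / nepochs);   % shrink from about 50 to 0
    for a = 1:32
        p = props(a,:);                     % attribute vector for animal a
        [dummy, winner] = min(sum((w - repmat(p, 100, 1)).^2, 2));
        lo = max(1, winner - nbh);          % one-dimensional neighborhood, clipped at the ends
        hi = min(100, winner + nbh);
        w(lo:hi,:) = w(lo:hi,:) + eta * (repmat(p, hi-lo+1, 1) - w(lo:hi,:));
    end
end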
Finally, you have to print out the result, i.e. the animals in a natural order.
Do this by looping through all animals once more, again calculating the index
of the winning output node. Save these indices in a 32 element vector pos.
By sorting this vector we will get the animals in the desired order. If we save the permutation vector from the sorting, we can print the names of the animals in this order.
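For example, assuming the animal names are stored in a cell array names (the actual variable name depends on the data files handed out with the exercise), the printout could be done like this:

[dummy, order] = sort(pos);   % order is the permutation vector from the sort
for i = 1:32
    disp(names{order(i)})     % print the animal names in grid order
end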
4 Cyclic Tour
In the previous example, the SOM algorithm in effect positioned a one-dimensional curve in the 84-dimensional input space so that it passed close to the places where the training examples were located. We will now use the same technique to lay out a curve in a two-dimensional plane so that it passes through a set of points. In fact, we can interpret this as a variant of the TSP (travelling sales-person) problem. The training points correspond to the cities and the curve corresponds to the tour. With some luck, the SOM algorithm will be able to find a fairly short route which passes through all cities.
The actual algorithm is very similar to what you implemented in the previous task. In fact, you might be able to reuse much of the code. The main differences are:
• The input space has two dimensions instead of 84. The output grid should
have 10 nodes, corresponding to the ten cities used in this example.
• When presenting the result, it is better to plot the suggested tour graphically than to sort the cities.
The location of the ten cities is defined in the file cities.m which defines the 10 × 2 matrix city. Each row contains the coordinates of one city (values between zero and one).
If you use a 10 × 2 matrix w to store the weights, then the following code
can be used to plot both the tour and the training points:
tour = [w;w(1,:)];
plot(tour(:,1),tour(:,2),'b-*',city(:,1),city(:,2),'+')
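A sketch of the tour-specific parts is given below. It assumes that the circular neighborhood radius nbh is small enough (at most 4 with ten nodes) that the wrapped indices do not overlap, and the shrinking schedule is again just one possibility:

cities;                                    % defines the 10 x 2 matrix city
w = rand(10, 2);                           % one output node per city
eta = 0.2;
for epoch = 1:20
    nbh = max(0, 2 - floor(epoch/10));     % radius 2, then 1, then 0
    for a = 1:10
        p = city(a,:);
        [dummy, winner] = min(sum((w - repmat(p, 10, 1)).^2, 2));
        idx = mod((winner-nbh:winner+nbh) - 1, 10) + 1;   % circular neighborhood, wrapped
        w(idx,:) = w(idx,:) + eta * (repmat(p, numel(idx), 1) - w(idx,:));
    end
end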
5 Votes of Members of Parliament

The third task is to make a two-dimensional map over the voting behaviour of the 349 members of the Swedish parliament. Here the output nodes are arranged in a two-dimensional 10 × 10 grid, so the neighbourhood must be computed from both row and column indices. The following code computes the grid position of each of the 100 output nodes:

[x,y] = meshgrid([1:10],[1:10]);
xpos = reshape(x, 1, 100);
ypos = reshape(y, 1, 100);
Now, you can write, e.g. xpos(winner) to get the x-position in the grid of
the winner node.
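With these vectors, the Manhattan distance in the output grid between an arbitrary node k and the winner becomes (a sketch; k is a hypothetical loop variable):

d = abs(xpos(k) - xpos(winner)) + abs(ypos(k) - ypos(winner));
if d <= nbh
    w(k,:) = w(k,:) + eta * (p - w(k,:));   % update only nodes inside the neighborhood
end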
After training, loop through all 349 MPs once more and save the index of each MP's winning node in a vector pos, as in the animal task. If you then execute

a = ones(1,100)*350;
a(pos) = 1:349;

you will get a vector a where each element corresponds to one output node and contains the index of an MP who belongs to that position. In fact, several MPs may end up at the same output node and this code will simply keep the last one. Nodes which never win will have the index number 350 (remember: there are only 349 MPs).
Load the file mpparty.m which defines a vector mpparty with the party for each MP. The parties are coded with numbers from 1 to 7. The following code will add a 350th element and then draw a colored map of where the parties have ended up.
p=[mpparty;0];
image(p(reshape(a,10,10))+1);
The +1 is needed because the image function plots both index zero and one as black. The color codes for the different parties are defined in the mpparty.m file.
By replacing mpparty with mpsex or mpdistrict, you can look at the distribution of women versus men or the distribution of the different voting districts, respectively.
Good luck!