If the learning rate is constant, i.e. $\epsilon_t = \epsilon_0$ with $0 < \epsilon_0 \le 1$, then from (4.8) and (4.11) it is obvious that
the influence of past input signals decays exponentially fast with the
number of further input signals for which c is winner (see also figure
4.2). The most recent input signal, however, always
determines a fraction $\epsilon_0$ of the current value of $w_c$. This
has two consequences. First, such a system stays adaptive and is
therefore in principle able to follow non-stationary signal
distributions. Second (and for the same reason), there is no
convergence. Even after a large number of input signals the current
input signal can cause a considerable change of the reference vector
of the winner. A typical behavior of such a system in the case of a
stationary signal distribution is the following: the reference vectors drift
from their initial positions to quasi-stationary positions where
they start to wander around a dynamic equilibrium. Better
quasi-stationary positions in terms of mean square error are
achieved with smaller learning rates. In this case, however, the
system also needs more adaptation steps to reach the
quasi-stationary positions.
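The exponential decay can be made explicit by unrolling the update rule. The following is a sketch assuming the standard hard competitive learning step $\Delta w_c = \epsilon_0 (\xi - w_c)$ and counting only those adaptation steps for which $c$ is the winner:
\[
 w_c(t) \;=\; (1-\epsilon_0)\, w_c(t-1) + \epsilon_0\, \xi_t
        \;=\; (1-\epsilon_0)^t\, w_c(0) \;+\; \epsilon_0 \sum_{i=1}^{t} (1-\epsilon_0)^{t-i}\, \xi_i .
\]
Each past signal $\xi_i$ thus contributes with weight $\epsilon_0 (1-\epsilon_0)^{t-i}$, which shrinks geometrically with the number $t-i$ of later winning signals, whereas the most recent signal $\xi_t$ always contributes the fraction $\epsilon_0$.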
Figure 4.2: Influence of an input signal $\xi$ on the reference vector of its winner $s$ as a function of the number of following input signals for which $s$ is winner (including $\xi$ itself). Results for different constant adaptation rates are shown. The respective intersection with the x-axis indicates how many signals are needed until the influence of $\xi$ has fallen below a given threshold. For example, if the learning rate $\epsilon_0$ is set to 0.5, about 10 additional signals are needed (the intersection with the x-axis is near 11).
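The counts in the caption follow directly from the geometric decay derived above. A minimal sketch; the threshold value `theta` is an assumption here, since the exact cutoff used in the figure is not stated in the text:

```python
import math

def signals_until_below(eps0, theta):
    """Further winning signals n after which an input's influence
    eps0 * (1 - eps0)**n has dropped below theta."""
    return math.ceil(math.log(theta / eps0) / math.log(1.0 - eps0))

# With eps0 = 0.5 the influence halves with every further signal,
# so a threshold of 0.001 is reached after roughly ten signals.
print(signals_until_below(0.5, 0.001))  # -> 9
```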
If the distribution is non-stationary, then information about the
non-stationarity (how rapidly the distribution changes) can be
used to set an appropriate learning rate: for rapidly changing
distributions relatively large learning rates should be used, and vice
versa. Figure 4.3 shows some stages of a simulation for a simple ring-shaped data distribution. Figure 4.4 displays the final results after 40000 adaptation steps for three other distributions. In both cases a constant learning rate
was used.
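The simulations of figures 4.3 and 4.4 can be reproduced in a few lines. Below is a minimal sketch of hard competitive learning with a constant learning rate on a ring-shaped uniform distribution; the unit count, learning rate, and ring radii are illustrative assumptions, not values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ring(n, r_inner=0.7, r_outer=1.0):
    """Draw n points uniformly (by area) from a ring."""
    r = np.sqrt(rng.uniform(r_inner**2, r_outer**2, n))
    phi = rng.uniform(0.0, 2.0 * np.pi, n)
    return np.column_stack((r * np.cos(phi), r * np.sin(phi)))

n_units, eps0, n_steps = 20, 0.05, 40000
w = sample_ring(n_units)  # initialize reference vectors from the distribution

for xi in sample_ring(n_steps):
    s = np.argmin(((w - xi) ** 2).sum(axis=1))  # winner: nearest reference vector
    w[s] += eps0 * (xi - w[s])                  # adapt only the winner toward xi

# Estimate the mean squared quantization error on fresh samples.
test = sample_ring(5000)
err = ((test[:, None, :] - w[None, :, :]) ** 2).sum(-1).min(axis=1).mean()
print(f"mean squared error: {err:.5f}")
```

Because $\epsilon_0$ is constant, rerunning the adaptation loop keeps moving the reference vectors around their quasi-stationary positions even though the error no longer improves systematically, which is the lack of convergence described above.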
Figure 4.3: Hard competitive learning simulation sequence for a ring-shaped uniform probability distribution. A constant adaptation rate was used. a) Initial state. b-f) Intermediate states. g) Final state. h) Voronoi tessellation corresponding to the final state.
Figure 4.4: Hard competitive learning simulation results after 40000 input signals for three different probability distributions. A constant learning rate was used. a) This distribution is uniform within both shaded areas. The probability density in the upper shaded area, however, is 10 times as high as in the lower one. b) The distribution is uniform in the shaded area. c) In this distribution each of the 11 circles indicates the standard deviation of one of the Gaussian kernels used to generate the data. All Gaussian kernels have the same a priori probability.