## Abstract

There exists solid experimental evidence that synaptic weights are subject to spike-timing-dependent plasticity (STDP). However, it is not clear how STDP contributes to the emergence and maintenance of powerful computations in networks of neurons. We report a surprising, theoretically founded connection between STDP in the context of a ubiquitous motif of cortical networks of neurons, winner-take-all (WTA) circuits [Douglas et al., Annu. Rev. Neurosci., 27:419-451, 2004], and theoretically optimal methods for unsupervised learning. More specifically, we prove that in this context STDP approximates stochastic online expectation maximization (EM) for the discovery of hidden causes of complex input patterns: each application of STDP to a neuron in the competitive layer that fires can be understood as an approximation of the M-step of stochastic online EM, while the E-step consists simply of the application of the WTA circuit, with its slightly changed competitive balance, to the next spike inputs. At the heart of this theoretical approach lies the observation that for certain forms of STDP the weights converge to values that can be interpreted as the log of the conditional probability that the presynaptic neuron has fired just before, given that the postsynaptic neuron has fired. This principle provides a direct link between STDP and EM. On the basis of this principle one can achieve surprising network learning results in computer simulations of STDP in networks of spiking neurons. For example, we demonstrate that such networks can learn without supervision to discriminate handwritten digits after having seen a few thousand examples from the MNIST database (transformed into high-dimensional spike patterns through standard population coding), and to detect and discriminate repeating spatio-temporal patterns of spikes within continuous high-dimensional spike input streams.
Furthermore, STDP induces in the weight vectors internal models for characteristic prototypes of input patterns, such as prototypes of handwritten digits, as predicted by our theoretical analysis. Our results also show that STDP is able to learn optimal Bayesian inference if both inputs and outputs are realized as probabilistic population codes [Ma et al., Nat. Neurosci., 9:1432-1438, 2006]. Our theoretical framework predicts that unsupervised learning with STDP works best if weight increases are additive but depend negative-exponentially on the current value of the weight, and that the size of weight decreases is independent of the current weight. Both predictions have been confirmed by experimental data (see Fig. 1 in [Montgomery et al., Neuron, 29:691-701, 2001] and Fig. 5c as well as the text on p. 1153 of [Sjöström et al., Neuron, 32:1149-1164, 2001]).
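Why these two features (negative-exponential potentiation, weight-independent depression) matter can be seen from a one-line fixed-point argument, in our notation: let $p$ denote the probability that a presynaptic input is active when the postsynaptic neuron fires, and take the update to be $\Delta w = e^{-w}-1$ for an active input and $\Delta w = -1$ otherwise. Then

$$
\mathbb{E}[\Delta w] \;=\; p\,(e^{-w}-1) \;+\; (1-p)\,(-1) \;=\; p\,e^{-w} - 1 \;=\; 0
\quad\Longrightarrow\quad w^{*} = \log p .
$$

The exponential weight dependence of potentiation together with the weight-independence of depression is thus exactly what makes the equilibrium weight equal the log conditional probability, which is the form required for the EM interpretation.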