R. Legenstein, S. A. Chase, A. B. Schwartz, and W. Maass
Recent experimental results have shown that the direction preference of neurons
in monkey motor cortex changes to compensate for a deliberate misreading of
their preferred directions during brain control of a robot arm. We show
that a simple neural network model in combination with a new rule for
reward-modulated Hebbian plasticity can explain this effect. This rule
requires substantial trial-to-trial variability of the neuronal output for
exploration. In contrast to previously proposed rules for reward-modulated
Hebbian plasticity, the new rule does not require that the plasticity
mechanism `knows' the noise explicitly. It is able to optimize the
performance of the model system within biologically realistic periods of time
and under high noise levels. When the neuronal noise is fitted to
experimental data, the model produces learning effects similar to those found
in monkey experiments; quantifying these effects yields a surprisingly
good match to the experimental observations. This study shows that
reward-modulated learning can explain detailed experimental results about
neuronal tuning changes in a motor control task and suggests that
reward-modulated learning is an essential plasticity mechanism in the cortex
for the acquisition of goal-directed behavior. Self-tuning effects of the
type considered in this model are important for the successful use of
neuroprosthetic devices.
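
To make the kind of rule described above concrete, the following is a minimal sketch of one possible reward-modulated Hebbian update, not a statement of the exact rule used in the study. It assumes that each weight change is proportional to the presynaptic input times the deviation of the postsynaptic output from its recent average times the deviation of the reward from its recent average, so that the plasticity mechanism never needs access to the noise itself. All names (low_pass, noisy_output, reward_modulated_hebbian_step) and the parameters eta, alpha, and noise_sd are hypothetical choices made for illustration.

```python
import random


def low_pass(avg, value, alpha=0.8):
    """Exponential running average: keep a fraction alpha of the old estimate."""
    return alpha * avg + (1.0 - alpha) * value


def noisy_output(w, x, noise_sd, rng):
    """Linear unit outputs plus Gaussian trial-to-trial variability (assumed noise model)."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + rng.gauss(0.0, noise_sd)
        for w_i in w
    ]


def reward_modulated_hebbian_step(w, x, a, a_bar, r, r_bar, eta=0.1):
    """Return updated weights w[i][j] after one trial.

    Assumed update: eta * x[j] * (a[i] - a_bar[i]) * (r - r_bar).
    The output deviation (a - a_bar) stands in for the unknown noise, so the
    rule only uses quantities that are observable at the synapse plus a
    global reward signal.
    """
    return [
        [w_ij + eta * x_j * (a[i] - a_bar[i]) * (r - r_bar)
         for w_ij, x_j in zip(w_i, x)]
        for i, w_i in enumerate(w)
    ]


# One illustrative trial with made-up numbers.
rng = random.Random(0)
w = [[0.0, 0.0]]                         # one output unit, two inputs
x = [1.0, 2.0]                           # presynaptic activity
a = noisy_output(w, x, noise_sd=0.5, rng=rng)
a_bar = [0.0]                            # running average of the output so far
r, r_bar = 1.0, 0.5                      # reward on this trial and its running average
w = reward_modulated_hebbian_step(w, x, a, a_bar, r, r_bar)
a_bar = [low_pass(ab, ai) for ab, ai in zip(a_bar, a)]
r_bar = low_pass(r_bar, r)
```

In this sketch a weight grows when an above-average output coincides with an above-average reward and shrinks when the two deviations have opposite signs, which is how output variability can serve as exploration.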