Reducing Communication for Distributed Learning in Neural Networks
A learning algorithm is presented for circuits consisting of a single layer of
perceptrons. We refer to such circuits as parallel perceptrons. In spite of
their simplicity, these circuits are universal approximators for arbitrary
Boolean and continuous functions. In contrast to backprop for multi-layer
perceptrons, our new learning algorithm - the parallel delta rule (p-delta
rule) - only has to tune a single layer of weights, and it does not require
the computation and communication of analog values with high precision. This
also distinguishes our new learning rule from other learning rules for such
circuits, such as MADALINE, which require far more communication. Our algorithm also
provides an interesting new hypothesis for the organization of learning in
biological neural systems. A theoretical analysis shows that the p-delta rule
does in fact implement gradient descent - with regard to a suitable error
measure - although it does not require the computation of derivatives.
Furthermore, experiments on common real-world benchmark datasets show that its
performance is competitive with that of other learning approaches from neural
networks and machine learning.
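
The idea of a parallel perceptron trained with little communication can be illustrated with a minimal sketch. It is not the p-delta rule as defined in the paper: it assumes binary targets in {-1, +1}, a majority vote as the circuit output, and a plain perceptron-style correction, and it leaves out any further terms the full rule may use. The names perceptron_votes, predict, p_delta_step and eta are hypothetical and chosen for illustration only. The point it shows is that the only signal broadcast to the perceptrons is whether the circuit output was too high, too low, or correct, and that only a single layer of weights is changed.

```python
def perceptron_votes(weights, x):
    """Return the binary vote (+1 or -1) of every perceptron for input x."""
    return [
        1 if sum(w_j * x_j for w_j, x_j in zip(w, x)) >= 0 else -1
        for w in weights
    ]


def predict(weights, x):
    """Circuit output: majority vote of the single layer (+1 or -1)."""
    return 1 if sum(perceptron_votes(weights, x)) >= 0 else -1


def p_delta_step(weights, x, target, eta=0.1):
    """One simplified update step (an assumption, not the paper's exact rule).

    Returns the feedback that would be broadcast to all perceptrons:
    -1 if the circuit output was too high, +1 if too low, 0 if correct.
    Weights are modified in place.
    """
    output = predict(weights, x)
    if output == target:
        return 0  # nothing to communicate, nothing to change
    feedback = -1 if output > target else 1
    votes = perceptron_votes(weights, x)
    for w, vote in zip(weights, votes):
        # Only perceptrons that voted against the target are adjusted,
        # each using its own input and the shared coarse feedback.
        if vote != target:
            for j in range(len(w)):
                w[j] += eta * feedback * x[j]
    return feedback


# Hand-set weights (bias input first) for which three perceptrons
# compute XOR by majority vote; no training is involved here.
xor_weights = [[-0.5, 1.0, 1.0], [1.5, -1.0, -1.0], [-1.0, 0.0, 0.0]]
xor_outputs = [predict(xor_weights, [1.0, a, b])
               for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

In this sketch each perceptron needs only its own input vector and the shared feedback value, so no high-precision analog quantities are passed between units. Whether repeated calls to p_delta_step converge on a given dataset is not claimed here.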
Reference: P. Auer, H. Burgsteiner, and W. Maass.
Reducing communication for distributed learning in neural networks.
In J. R. Dorronsoro, editor, Proc. of the International Conference on
Artificial Neural Networks - ICANN 2002, volume 2415 of Lecture Notes in
Computer Science (http://www.springer.de/comp/lncs/index.html), pages
123-128. Springer, 2002.