Reducing Communication for Distributed Learning in Neural Networks
P. Auer, H. Burgsteiner, and W. Maass
 
Abstract:
A learning algorithm is presented for circuits consisting of a single layer of
  perceptrons. We refer to such circuits as parallel perceptrons. In spite of
  their simplicity, these circuits are universal approximators for arbitrary
  Boolean and continuous functions. In contrast to backprop for multi-layer
  perceptrons, our new learning algorithm - the parallel delta rule (p-delta
  rule) - only has to tune a single layer of weights, and it does not require
  the computation and communication of analog values with high precision. This
  also distinguishes our new learning rule from other learning rules for such
  circuits, such as MADALINE, which require far more communication. Our algorithm also
  provides an interesting new hypothesis for the organization of learning in
  biological neural systems. A theoretical analysis shows that the p-delta rule
  does in fact implement gradient descent - with regard to a suitable error
  measure - although it does not require the computation of derivatives. Furthermore, it
  is shown through experiments on common real-world benchmark datasets that its
  performance is competitive with that of other learning approaches from neural
  networks and machine learning.
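The p-delta rule itself is not spelled out in this abstract, so the following is only a minimal sketch, in Python, of a parallel perceptron (a single layer of +1/-1-valued perceptrons whose one-bit outputs are combined by thresholding their sum) together with a simplified p-delta-style update for binary classification. The number of units, the learning rate eta, the margin gamma, the margin weight mu, and the exact update conditions are illustrative assumptions, not the authors' published pseudocode.

  # Illustrative sketch of a parallel perceptron with a simplified
  # p-delta-style update. Hyperparameters and update conditions are
  # assumptions for illustration, not the paper's exact rule.
  import numpy as np

  class ParallelPerceptron:
      def __init__(self, n_inputs, n_units=11, eta=0.01, gamma=0.1, mu=1.0, seed=0):
          rng = np.random.default_rng(seed)
          self.W = rng.normal(size=(n_units, n_inputs))
          self.W /= np.linalg.norm(self.W, axis=1, keepdims=True)  # unit-norm rows
          self.eta, self.gamma, self.mu = eta, gamma, mu

      def predict(self, x):
          # Each unit contributes a one-bit vote (+1 or -1); the circuit output is
          # the sign of the vote sum, so no high-precision analog values are shared.
          votes = np.sign(self.W @ x)
          return 1 if votes.sum() >= 0 else -1

      def update(self, x, target):
          # target is +1 or -1; only the single layer of weights W is adjusted.
          a = self.W @ x                       # activations of the individual units
          y = 1 if np.sign(a).sum() >= 0 else -1
          for i in range(len(self.W)):
              if y != target:
                  # Circuit output is wrong: nudge units voting for the wrong side
                  # towards the target side.
                  if target > 0 and a[i] < 0:
                      self.W[i] += self.eta * x
                  elif target < 0 and a[i] >= 0:
                      self.W[i] -= self.eta * x
              else:
                  # Circuit output is correct: push units with small activation away
                  # from the decision boundary (a margin-stabilizing term).
                  if 0 <= a[i] < self.gamma:
                      self.W[i] += self.eta * self.mu * x
                  elif -self.gamma < a[i] < 0:
                      self.W[i] -= self.eta * self.mu * x
              # Keep each weight vector bounded (unit norm).
              self.W[i] /= max(np.linalg.norm(self.W[i]), 1e-12)

  # Toy usage: learn to predict the sign of the first input coordinate.
  pp = ParallelPerceptron(n_inputs=5)
  rng = np.random.default_rng(1)
  for _ in range(2000):
      x = rng.normal(size=5)
      pp.update(x, 1 if x[0] > 0 else -1)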
Reference: P. Auer, H. Burgsteiner, and W. Maass.
 Reducing communication for distributed learning in neural networks.
 In J. R. Dorronsoro, editor, Proc. of the International Conference on
  Artificial Neural Networks (ICANN 2002), volume 2415 of Lecture Notes in
  Computer Science, pages 123-128. Springer, 2002.