
# Advanced Training Methods for Neural Networks [3 P]

Compare different Quasi-Newton methods with standard backpropagation for a digit classification task. You are required to use MATLAB for this assignment.

a)

Download the digit data set HW11.zip. The file digits.mat contains training samples (learn.X, learn.C) and test samples (test.X, test.C). Each sample consists of 64 pixel values.
(use `d = reshape(learn.X(7,:), 8, 8)'; imagesc(d); colormap(1-gray);` to visualize training sample 7)

Normalize the data using mapstd (or, in older toolbox versions, prestd and trastd).
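A possible normalization step is sketched below. It assumes the samples are stored row-wise in learn.X (as the reshape call in a) suggests); since mapstd treats each row as one variable, the data are transposed first so that columns become samples:

```matlab
% Load the data set (digits.mat from HW11.zip).
load digits.mat

% mapstd normalizes each row to zero mean and unit variance.
% Transpose so that columns are samples and rows are pixels.
[Xn, ps] = mapstd(learn.X');           % ps stores the settings

% Apply the SAME normalization settings to the test data.
XnTest = mapstd('apply', test.X', ps);
```

Reusing the settings ps on the test set is important: the test data must be scaled with the training-set statistics, not its own.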

b)

Initialize the MATLAB random number generators with `rand('state',MatrNmr);` and `randn('state',MatrNmr);`, where MatrNmr is the matriculation number of one team member (use only one member's number).

c)

Use standard backpropagation (`traingdx`) and a quasi-Newton method (e.g. `trainlm` or `trainbfg`; see `help nnet` for a list of available training functions) to train a network of your choice (choose an appropriate network architecture and activation functions), so that it achieves good generalization on the test data. If necessary, adjust the training parameters `net.trainParam` of the network (e.g. see `help traincgf` for a list of training parameters for `traincgf`). Avoid overfitting by means of a method of your choice (early stopping or weight decay). Use `train(net,X,T,[],[],V)` to hand over a validation set V to the training algorithm and automatically activate early stopping.

Choose an appropriate training and validation set from learn.X and learn.C. (Hint: because of the large size of the training set, do not use all training samples to train the network.)

Before network training, set the weight and bias values to small but nonzero values.
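The steps above can be sketched as follows. This is only one possible setup: the 20-unit hidden layer, the subset sizes, and the assumption that learn.C holds class indices (1–10) are illustrative choices, not required values.

```matlab
% Convert class labels to 1-of-10 target vectors (assumes learn.C
% contains class indices starting at 1; adapt if coded differently).
T = full(ind2vec(learn.C));

% Split the normalized, transposed data (Xn, columns = samples)
% into training and validation parts; sizes chosen arbitrarily here.
P   = Xn(:, 1:500);    Ttrain = T(:, 1:500);
V.P = Xn(:, 501:700);  V.T    = T(:, 501:700);

% Two-layer network: 20 tansig hidden units, 10 logsig outputs,
% trained with the quasi-Newton method trainbfg.
net = newff(minmax(P), [20 10], {'tansig', 'logsig'}, 'trainbfg');

% newff initializes small random weights; re-initialize explicitly
% after seeding the generators so runs are reproducible.
net = init(net);

% Adjust training parameters if necessary.
net.trainParam.epochs = 200;

% Passing the struct V activates early stopping: training halts
% when the validation error starts to increase.
[net, tr] = train(net, P, Ttrain, [], [], V);
```

The second output tr is the training record; it contains the error curves needed for the comparison in d).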

Justify and explain your choice of network parameters and training functions.

d)

Compare the convergence speed of the different training methods for the network architecture obtained in c).
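The training records returned by train can be plotted against each other, for example as below (assuming tr_gdx and tr_bfg are the records from two runs of the same network, one per training function):

```matlab
% Compare training-error curves of the two methods on a log scale.
semilogy(tr_gdx.epoch, tr_gdx.perf, 'b-'); hold on;
semilogy(tr_bfg.epoch, tr_bfg.perf, 'r-'); hold off;
legend('traingdx', 'trainbfg');
xlabel('epoch'); ylabel('mse (log scale)');
```

For a fair comparison, start both runs from identical initial weights (seed the generators as in b) before each run) and report epochs as well as wall-clock time, since one quasi-Newton epoch is more expensive than one gradient-descent epoch.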

Present your results in a clear, structured, and legible form. Document them in such a way that anybody can easily reproduce them.

Haeusler Stefan 2013-01-16