Compare different Quasi-Newton methods with standard backpropagation for a digit classification task. You are required to use MATLAB for this assignment.
Download the digit data set HW11.zip
The file digits.mat contains training samples (learn.X, learn.C) and test samples (test.X, test.C). Each sample consists of 64 pixel values in the range
(to visualize training sample 7, use d = reshape(learn.X(7,:),8,8)'; imagesc(d); colormap(1-gray);)
Normalize the data using mapstd or prestd and trastd.
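A minimal sketch of the normalization step using mapstd, assuming learn.X and test.X hold one sample per row (so they are transposed to the one-sample-per-column layout the toolbox expects); the variable names are illustrative:

```matlab
% Normalize inputs to zero mean and unit variance per feature (row).
[Xn, ps] = mapstd(learn.X');              % compute statistics on the training data
Xtest    = mapstd('apply', test.X', ps);  % reuse the SAME statistics on the test data
```

Note that the test set must be transformed with the statistics computed on the training data, never with its own; with the older prestd/trastd pair, trastd plays the role of mapstd('apply', ...).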
Initialize the MATLAB random number generator with rand('state',MatrNmr); and randn('state',MatrNmr); (use the matriculation number MatrNmr of only one team member).
Use standard backpropagation (traingdx) and a quasi-Newton method (e.g. trainlm or trainbfg; see help nnet for a list of available training functions) to train a network of your choice (choose an appropriate network architecture and activation functions), so that it achieves good generalization on the test data. If necessary, adjust the training parameters net.trainParam of the network (e.g. see help traincgf for the training parameters of traincgf). Avoid overfitting with a method of your choice (early stopping or weight decay). Use train(net,X,T,[],[],V) to hand a validation set V over to the training algorithm and automatically activate early stopping.
Choose an appropriate training and validation set from learn.X and learn.C. (Hint: Because of the huge size of the training set do not use all training samples to train the network.)
Before network training set the weight and bias values to small but nonzero values.
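The steps above can be sketched as follows, using the older toolbox syntax the sheet refers to (newff and the six-argument form of train). The layer sizes, activation functions, and index ranges are illustrative assumptions, not part of the assignment:

```matlab
rand('state', MatrNmr); randn('state', MatrNmr);  % MatrNmr: your matriculation number

Xn = mapstd(learn.X');                 % normalized inputs, one sample per column
T  = learn.C';                         % targets, one sample per column

idxTrain = 1:500;  idxVal = 501:700;   % example subsets; pick sizes that suit your data
V.P = Xn(:, idxVal);                   % validation struct handed to train(...)
V.T = T(:, idxVal);

% Example architecture: one hidden layer of 20 tansig units (an assumption),
% trained here with the quasi-Newton method trainbfg.
net = newff(minmax(Xn), [20 size(T,1)], {'tansig','logsig'}, 'trainbfg');
net = init(net);                       % small, nonzero random weights and biases
net.trainParam.epochs = 200;

[net, tr] = train(net, Xn(:, idxTrain), T(:, idxTrain), [], [], V);
```

Passing the validation struct as the sixth argument activates early stopping automatically; the two empty matrices stand in for the unused initial input and layer delay states.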
Justify and explain your choice of network parameters and training functions.
Compare the convergence speed of the different training methods for the network architecture obtained in part c).
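One way to make this comparison fair is to retrain the same architecture with each training function from identical initial weights, recording epochs and wall-clock time. A sketch, assuming Xtrain, Ttrain, and the validation struct V have been prepared as in the earlier steps (all names and the architecture are assumptions):

```matlab
% Compare convergence of several training functions on the same network.
fcns = {'traingdx', 'trainbfg', 'trainlm'};
for k = 1:numel(fcns)
    rand('state', MatrNmr); randn('state', MatrNmr);   % identical initial weights
    net = newff(minmax(Xtrain), [20 size(Ttrain,1)], {'tansig','logsig'}, fcns{k});
    tic;
    [net, tr] = train(net, Xtrain, Ttrain, [], [], V);
    t = toc;
    fprintf('%s: %d epochs, %.1f s, best validation error %.4f\n', ...
            fcns{k}, length(tr.epoch) - 1, t, min(tr.vperf));
end
```

The training record tr stores the per-epoch validation performance in tr.vperf when a validation set is supplied, which makes the methods easy to compare both by epoch count and by runtime.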