Deep Learning - Exercise 03

Learning goal:
In this exercise you shall implement a simple neural network classifier for the 28x28 MNIST images: a Perceptron.

1. Implement a Perceptron

Write a Perceptron class that lets you freely choose the number of inputs and the number of output neurons of the Perceptron. In our case we need 10 output neurons for a 1-out-of-10 classification of the MNIST digit type.

The Perceptron class shall also store a weight from each input to each output neuron.

The class shall also be able to compute the output values of all output neurons for the current input vector, and shall return a classification result by determining the index of the neuron that has the largest output value.
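One possible shape for such a class is sketched below. It is only an illustration under several assumptions that the exercise does not fix: NumPy is used, the weights are drawn uniformly from [-0.1, 0.1], the output neurons use an identity transfer function, and there is no bias weight. The names `Perceptron`, `compute_outputs` and `classify` are chosen here for illustration.

```python
import numpy as np


class Perceptron:
    """Single-layer Perceptron with one weight per (output neuron, input) pair."""

    def __init__(self, n_inputs, n_outputs, seed=None):
        rng = np.random.default_rng(seed)
        self.n_inputs = n_inputs
        self.n_outputs = n_outputs
        # Assumption: small random start values; row j holds the weights of neuron j.
        self.weights = rng.uniform(-0.1, 0.1, size=(n_outputs, n_inputs))

    def compute_outputs(self, x):
        """Return the output value of every output neuron for input x."""
        x = np.asarray(x, dtype=float).ravel()  # a 28x28 image becomes 784 values
        return self.weights @ x                 # assumption: identity transfer function

    def classify(self, x):
        """Return the index of the output neuron with the largest output value."""
        return int(np.argmax(self.compute_outputs(x)))


# Usage: 28x28 = 784 inputs, 10 output neurons (one per digit).
p = Perceptron(n_inputs=28 * 28, n_outputs=10, seed=0)
image = np.zeros((28, 28))  # placeholder image, not real MNIST data
print(p.compute_outputs(image).shape)  # (10,)
```

Storing the weights as one matrix keeps the output computation a single matrix-vector product, which matters once the test function calls it 10,000 times.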

2. Implement a classification rate test function

Then write a function that determines the classification rate of a given Perceptron (initialized, e.g., with random weights) on the 10,000 MNIST test images.

The function shall determine how many of the images the Perceptron classifies correctly.
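A minimal sketch of such a test function follows. To stay independent of any particular Perceptron class it takes a callable `classify` that maps one image to a class index; this interface and the name `classification_rate` are assumptions made for illustration. The usage part runs on tiny synthetic data, not on MNIST.

```python
import numpy as np


def classification_rate(classify, images, labels):
    """Return the fraction of images whose predicted class equals the label.

    classify: any callable mapping one image to a class index
              (hypothetical interface, e.g. a Perceptron's classify method).
    """
    correct = sum(1 for img, lab in zip(images, labels) if classify(img) == lab)
    return correct / len(labels)


# Usage with synthetic stand-in data: the "class" is stored in pixel (0, 0).
images = np.zeros((4, 28, 28))
labels = np.array([0, 1, 2, 3])
images[:, 0, 0] = labels


def dummy_classifier(img):
    """Toy classifier that is right for classes 0..2 and wrong for class 3."""
    value = int(img[0, 0])
    return value if value < 3 else 0


print(classification_rate(dummy_classifier, images, labels))  # 0.75
```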

Question: If you start with a Perceptron with randomly initialized weights, which classification rate do you expect for a 1-out-of-10 classification task?

3. Implement the Perceptron learning rule

Using the 60,000 MNIST training images, implement a learning procedure that repeats the following steps: randomly select one of the training images, feed its input values into the Perceptron, compute the output values of the 10 output neurons, and then adapt all input weights of each of the 10 output neurons given a teacher vector. The teacher vector (the "should-be" values for all 10 output neurons) can be produced from the ground truth label, which is available for each of the 60,000 training images.

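Note: remember the simple learning rule: Δw_i = α (t − out) x_i
Here α is the learning rate, t the teacher value and out the actual output of the neuron, and x_i the i-th input value.
(Have a look at the slides again.)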
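The rule can be applied to all 10 output neurons at once, as in the sketch below. It assumes a one-hot teacher vector (1 for the correct digit, 0 otherwise), an identity transfer function, and a weight matrix with one row per output neuron; the slides may define these details differently. The helper names `one_hot` and `train_step` are hypothetical, and the training data in the usage part is random stand-in data, not MNIST.

```python
import numpy as np


def one_hot(label, n_classes=10):
    """Teacher vector (assumed coding): 1 for the correct neuron, 0 for the rest."""
    t = np.zeros(n_classes)
    t[label] = 1.0
    return t


def train_step(weights, x, label, alpha):
    """Apply delta_w_i = alpha * (t - out) * x_i to every output neuron, in place."""
    x = np.asarray(x, dtype=float).ravel()
    out = weights @ x                        # outputs of all neurons (identity transfer assumed)
    t = one_hot(label, weights.shape[0])
    weights += alpha * np.outer(t - out, x)  # row j changes by alpha * (t_j - out_j) * x
    return weights


# Usage with random stand-in data in place of the MNIST training set.
rng = np.random.default_rng(0)
weights = rng.uniform(-0.1, 0.1, size=(10, 784))
train_images = rng.random((100, 784))
train_labels = rng.integers(0, 10, size=100)
for _ in range(1000):
    k = rng.integers(0, len(train_images))  # randomly select one training image
    train_step(weights, train_images[k], train_labels[k], alpha=0.001)
```

The outer product builds the full matrix of weight changes in one call, so no explicit loop over neurons and inputs is needed.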

4. Experiment with the Perceptron!

Observe what happens if you train the Perceptron on hundreds of thousands of training images.

Can you see the classification rate of the Perceptron improve?

What happens if you choose a large or a small learning rate?

During learning, write the classification rate (on the 10,000 MNIST test images) into a file after every N training images, and plot the resulting classification rate curve with a tool of your choice.
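One way to organize this logging is sketched below. The function `train_and_log` and its two callables `train_one` (performs one training step) and `evaluate` (returns the current classification rate) are hypothetical names; the CSV layout is an assumption as well. The usage part uses placeholder callables instead of a real Perceptron.

```python
import csv
import os
import tempfile


def train_and_log(train_one, evaluate, total_steps, n, path):
    """Call train_one() total_steps times; after every n steps log evaluate()."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["trained_images", "classification_rate"])
        for step in range(1, total_steps + 1):
            train_one()
            if step % n == 0:
                writer.writerow([step, evaluate()])


# Usage with placeholder callables standing in for training and testing.
state = {"steps": 0}


def fake_train():
    state["steps"] += 1


def fake_evaluate():
    return min(1.0, state["steps"] / 100)


path = os.path.join(tempfile.gettempdir(), "classification_rate.csv")
train_and_log(fake_train, fake_evaluate, total_steps=50, n=10, path=path)
```

The resulting two-column file can be loaded by most plotting tools, for example a spreadsheet program, gnuplot, or matplotlib.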

5. Grayscale vs. filter responses as Perceptron input

First feed the Perceptron with the 28x28 grayscale values. Then try to feed the Perceptron with a larger input vector, namely a vector consisting of all output values of the filter response matrices produced by the filter bank from exercise 02.
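Building the larger input vector amounts to flattening every filter response matrix and concatenating the results, as the sketch below shows. The function name `responses_to_input` is hypothetical, and the 24x24 response size is only a placeholder: the real size depends on the filters used in exercise 02.

```python
import numpy as np


def responses_to_input(responses):
    """Concatenate all filter response matrices into one flat input vector."""
    return np.concatenate([np.asarray(r, dtype=float).ravel() for r in responses])


# Placeholder: 8 response matrices of an assumed size of 24x24 each.
responses = [np.zeros((24, 24)) for _ in range(8)]
x = responses_to_input(responses)
print(x.size)  # 4608
```

The number of inputs of the Perceptron then has to be set to the length of this vector instead of 784.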

Compare the Perceptron learning behavior using grayscale vs. filter response inputs!

Perceptron with filter bank inputs (filter bank size 8):
~90% correct classification rate (on MNIST test dataset)

vs.

Perceptron with grayscale images as input:
~85% correct classification rate (on MNIST test dataset)

However, today CNNs reach classification rates of up to 99.77% (= 0.23% error rate)!
See the table on the MNIST website.