Deep Learning - Exercise 04

Learning goal:
In this exercise you shall program two different implementations of a Multi-Layer Perceptron (MLP), restricted to the feedforward step (no backpropagation yet!), and compare the speed of the two implementations.

Multi-layer perceptrons are feedforward networks.

This means that you first set the outputs of the neurons in the input layer (layer #0) to the current input values (e.g. sensor data). Then each neuron in layer #1 computes its net input and its output value. Afterwards, all neurons in layer #2 compute their net input and output values based on the output values of the neurons in the previous layer, and so on: the output values of the neurons are updated layer by layer. The neurons in the last layer of the MLP (the output neurons) are the last ones to compute their net input and output values. The overall procedure is called a feedforward step.
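
In formulas: if out_i denotes the output of neuron i in the previous layer, w_ij the weight of the connection from neuron i to neuron j, and f the activation function (e.g. the logistic function; the concrete choice is up to you), then each neuron j computes

            net_j = sum_i ( w_ij * out_i )
            out_j = f(net_j)

(plus a bias term in the net input, if your MLP uses biases).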

1. Implement an MLP using the OOP approach

Implement an MLP that can be flexibly parametrized to have any number N of layers. Use a straightforward object-oriented programming (OOP) approach in which you implement the MLP using a class mlp_oop, a class neuronlayer, a class neuron and a class weight.
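
One possible skeleton for these four classes could look as follows (just a sketch of one way to structure it; everything beyond the four required class names, such as the member names and the logistic activation, is an assumption):

            #include <cmath>
            #include <vector>

            class neuron;  // forward declaration, needed by weight

            // A single connection: its weight value and the source neuron.
            class weight {
            public:
                double value = 0.0;
                neuron* from = nullptr;   // neuron in the previous layer
            };

            class neuron {
            public:
                std::vector<weight> incoming;   // one weight per neuron in the previous layer
                double net = 0.0;               // net input
                double out = 0.0;               // output value

                // net_j = sum_i w_ij * out_i ; out_j = f(net_j), logistic f assumed
                void update() {
                    net = 0.0;
                    for (const weight& w : incoming)
                        net += w.value * w.from->out;
                    out = 1.0 / (1.0 + std::exp(-net));
                }
            };

            class neuronlayer {
            public:
                std::vector<neuron> neurons;
                void update() { for (neuron& n : neurons) n.update(); }
            };

            class mlp_oop {
            public:
                std::vector<neuronlayer> layers;   // layers[0] is the input layer

                // set the outputs of layer #0 to the input values, then update layer by layer
                void feedforward(const std::vector<double>& input) {
                    for (size_t i = 0; i < input.size(); ++i)
                        layers[0].neurons[i].out = input[i];
                    for (size_t l = 1; l < layers.size(); ++l)
                        layers[l].update();
                }
            };

Building the network, i.e. creating the layers and wiring up the incoming weights of each neuron to the neurons of the previous layer, is left to you.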

Then create an MLP with 3 layers (784 input neurons, 10 hidden neurons, 10 output neurons) and measure the time your OOP MLP implementation needs for 50,000 feedforward steps. In each step, an MNIST image shall be used to initialize the output values of the input neurons; then a layer-wise update of the neurons shall be performed as described above.
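
For the time measurement, std::chrono is a good fit. Here is a sketch; load_mnist_images is a hypothetical helper (not part of any library) that returns the 28x28 MNIST images as vectors of 784 pixel values:

            #include <chrono>
            #include <cstdio>
            #include <vector>

            // hypothetical helper, to be written by you:
            std::vector<std::vector<double>> load_mnist_images(const char* filename);

            // Times how long one net needs for a given number of feedforward steps.
            // Works for mlp_oop and (later) mlp_fast alike, as long as both offer feedforward().
            template <typename MLP>
            long long time_feedforwards(MLP& net, const std::vector<std::vector<double>>& images, int steps) {
                auto t0 = std::chrono::high_resolution_clock::now();
                for (int s = 0; s < steps; ++s) {
                    net.feedforward(images[s % images.size()]);
                    if ((s + 1) % (steps / 10) == 0)
                        std::printf("Number of images forwarded: %d\n", s + 1);
                }
                auto t1 = std::chrono::high_resolution_clock::now();
                return std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
            }

            // usage:
            //   long long ms = time_feedforwards(net, images, 50000);
            //   std::printf("-->Time needed: %lldms\n", ms);

The same helper can later be reused unchanged for mlp_fast in part 2.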

2. Program a faster MLP implementation

Now forget about the OOP world! ;-) Implement only a single class mlp_fast in which you avoid any OOP overhead and arrive at a faster MLP implementation that can do the 50,000 feedforward steps much faster.
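
"OOP overhead" here means mainly the per-neuron objects, the pointer chasing through weight objects, and the many small method calls per neuron. A common remedy is to store all output values and weights in flat, contiguous arrays, so that the feedforward step becomes plain nested loops. A sketch, under the same assumptions as above (logistic activation, layer sizes passed to the constructor):

            #include <cmath>
            #include <cstdio>
            #include <vector>

            class mlp_fast {
            public:
                std::vector<int> sizes;                  // number of neurons per layer
                std::vector<std::vector<double>> out;    // out[l][j]: output of neuron j in layer l
                std::vector<std::vector<double>> w;      // w[l]: (sizes[l] x sizes[l-1]) weight matrix, row-major

                mlp_fast(const std::vector<int>& layer_sizes) : sizes(layer_sizes) {
                    out.resize(sizes.size());
                    w.resize(sizes.size());
                    for (size_t l = 0; l < sizes.size(); ++l) {
                        out[l].assign(sizes[l], 0.0);
                        if (l > 0)
                            w[l].assign((size_t)sizes[l] * sizes[l - 1], 0.1);  // dummy initialization
                        std::printf("Ok, in layer %zu you want %d neurons.\n", l, sizes[l]);
                    }
                }

                void feedforward(const std::vector<double>& input) {
                    for (int i = 0; i < sizes[0]; ++i)
                        out[0][i] = input[i];
                    for (size_t l = 1; l < sizes.size(); ++l) {
                        const double* prev = out[l - 1].data();
                        const double* wrow = w[l].data();          // weight row of neuron j
                        for (int j = 0; j < sizes[l]; ++j) {
                            double net = 0.0;
                            for (int i = 0; i < sizes[l - 1]; ++i)
                                net += wrow[i] * prev[i];          // inner multiply-accumulate loop
                            wrow += sizes[l - 1];
                            out[l][j] = 1.0 / (1.0 + std::exp(-net));
                        }
                    }
                }
            };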

Compare the speed of your mlp_oop class with the speed of your mlp_fast class by feeding 50,000 MNIST images through each of them.

Note: This exercise is an important step towards the following exercises. You shall continue with your own fast MLP implementation in the next exercise and augment it with the backpropagation learning algorithm. For this, a fast implementation is essential.

3. Now experiment with your two MLP implementations in Debug/Release mode

First run the comparison in Debug mode:

Here is the output of my speed comparison for 50,000 feedforwards of 28x28 images compiled in Debug mode:

        
            ######################################################
            Feedforward Speed Test MLP implementation #1: mlp_oop
            ######################################################
            generated neuron layer with index #0 with 784 neurons.
            generated neuron layer with index #1 with 10 neurons.
            generated neuron layer with index #2 with 10 neurons.
            Number of images forwarded: 5000
            Number of images forwarded: 10000
            Number of images forwarded: 15000
            Number of images forwarded: 20000
            Number of images forwarded: 25000
            Number of images forwarded: 30000
            Number of images forwarded: 35000
            Number of images forwarded: 40000
            Number of images forwarded: 45000
            Number of images forwarded: 50000
            -->Time needed: 107863ms

            ######################################################
            Feedforward Speed Test MLP implementation #2: mlp_fast
            ######################################################
            Ok, in layer 0 you want 784 neurons.
            Ok, in layer 1 you want 10 neurons.
            Ok, in layer 2 you want 10 neurons.
            Number of images forwarded: 5000
            Number of images forwarded: 10000
            Number of images forwarded: 15000
            Number of images forwarded: 20000
            Number of images forwarded: 25000
            Number of images forwarded: 30000
            Number of images forwarded: 35000
            Number of images forwarded: 40000
            Number of images forwarded: 45000
            Number of images forwarded: 50000
            -->Time needed: 5559ms

            The implementation mlp_fast is 19.40 times faster than mlp_oop
        

So it seems we should avoid OOP overhead, right?

But you have probably heard that compiler optimizations are switched on in Release mode and can make a big difference compared to Debug mode. So switch to Release mode, make sure that compiler optimization is set to "optimize for speed" (/O2 in Visual C++; -O2 or -O3 with g++ or clang++), and do the speed comparison again. Here is what I got:

  
            ######################################################
            Feedforward Speed Test MLP implementation #1: mlp_oop
            ######################################################
            generated neuron layer with index #0 with 784 neurons.
            generated neuron layer with index #1 with 10 neurons.
            generated neuron layer with index #2 with 10 neurons.
            Number of images forwarded: 5000
            Number of images forwarded: 10000
            Number of images forwarded: 15000
            Number of images forwarded: 20000
            Number of images forwarded: 25000
            Number of images forwarded: 30000
            Number of images forwarded: 35000
            Number of images forwarded: 40000
            Number of images forwarded: 45000
            Number of images forwarded: 50000
            -->Time needed: 1215ms

            ######################################################
            Feedforward Speed Test MLP implementation #2: mlp_fast
            ######################################################
            Ok, in layer 0 you want 784 neurons.
            Ok, in layer 1 you want 10 neurons.
            Ok, in layer 2 you want 10 neurons.
            Number of images forwarded: 5000
            Number of images forwarded: 10000
            Number of images forwarded: 15000
            Number of images forwarded: 20000
            Number of images forwarded: 25000
            Number of images forwarded: 30000
            Number of images forwarded: 35000
            Number of images forwarded: 40000
            Number of images forwarded: 45000
            Number of images forwarded: 50000
            -->Time needed: 752ms

            The implementation mlp_fast is 1.62 times faster than mlp_oop

            Good bye! Press a key to exit
        

Things have changed dramatically! Both implementations have become drastically faster. However, mlp_oop has gained more than mlp_fast, so the difference is no longer that large! A plausible explanation: much of the OOP overhead consists of many small method calls, which the optimizer can inline away in Release mode, whereas the plain loops of mlp_fast leave the optimizer less to improve.

Now that both implementations run dramatically faster, increase the number of feedforward steps from 50,000 to 1,000,000 and run the test again:

  
            ######################################################
            Feedforward Speed Test MLP implementation #1: mlp_oop
            ######################################################
            generated neuron layer with index #0 with 784 neurons.
            generated neuron layer with index #1 with 10 neurons.
            generated neuron layer with index #2 with 10 neurons.
            Number of images forwarded: 100000
            Number of images forwarded: 200000
            Number of images forwarded: 300000
            Number of images forwarded: 400000
            Number of images forwarded: 500000
            Number of images forwarded: 600000
            Number of images forwarded: 700000
            Number of images forwarded: 800000
            Number of images forwarded: 900000
            Number of images forwarded: 1000000
            -->Time needed: 22657ms

            ######################################################
            Feedforward Speed Test MLP implementation #2: mlp_fast
            ######################################################
            Ok, in layer 0 you want 784 neurons.
            Ok, in layer 1 you want 10 neurons.
            Ok, in layer 2 you want 10 neurons.
            Number of images forwarded: 100000
            Number of images forwarded: 200000
            Number of images forwarded: 300000
            Number of images forwarded: 400000
            Number of images forwarded: 500000
            Number of images forwarded: 600000
            Number of images forwarded: 700000
            Number of images forwarded: 800000
            Number of images forwarded: 900000
            Number of images forwarded: 1000000
            -->Time needed: 13763ms

            The implementation mlp_fast is 1.65 times faster than mlp_oop                        
        

4. Change the number of layers and the MLP layer sizes and conduct speed experiments again

Change the MLP size from 784-10-10 (3 layers) to 784-800-400-200-100-50 (6 layers) and do the speed test again for 10,000 feedforwards.
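
If your mlp_fast constructor takes the list of layer sizes, as in the sketch above, this is a one-line change (the constructor signature is an assumption of that sketch):

            mlp_fast net({784, 800, 400, 200, 100, 50});

Here is my output: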
  
            ######################################################
            Feedforward Speed Test MLP implementation #1: mlp_oop
            ######################################################
            generated neuron layer with index #0 with 784 neurons.
            generated neuron layer with index #1 with 800 neurons.
            generated neuron layer with index #2 with 400 neurons.
            generated neuron layer with index #3 with 200 neurons.
            generated neuron layer with index #4 with 100 neurons.
            generated neuron layer with index #5 with 50 neurons.
            Number of images forwarded: 1000
            Number of images forwarded: 2000
            Number of images forwarded: 3000
            Number of images forwarded: 4000
            Number of images forwarded: 5000
            Number of images forwarded: 6000
            Number of images forwarded: 7000
            Number of images forwarded: 8000
            Number of images forwarded: 9000
            Number of images forwarded: 10000
            -->Time needed: 31462ms

            ######################################################
            Feedforward Speed Test MLP implementation #2: mlp_fast
            ######################################################
            Ok, in layer 0 you want 784 neurons.
            Ok, in layer 1 you want 800 neurons.
            Ok, in layer 2 you want 400 neurons.
            Ok, in layer 3 you want 200 neurons.
            Ok, in layer 4 you want 100 neurons.
            Ok, in layer 5 you want 50 neurons.
            Number of images forwarded: 1000
            Number of images forwarded: 2000
            Number of images forwarded: 3000
            Number of images forwarded: 4000
            Number of images forwarded: 5000
            Number of images forwarded: 6000
            Number of images forwarded: 7000
            Number of images forwarded: 8000
            Number of images forwarded: 9000
            Number of images forwarded: 10000
            -->Time needed: 23296ms

            The implementation mlp_fast is 1.35 times faster than mlp_oop
        
So we can see that the speed ratio between the two implementations also depends strongly on the MLP architecture.
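
This is plausible if you count the multiply-accumulate operations per feedforward step: the 784-10-10 net has 784*10 + 10*10 = 7,940 weights, while the 784-800-400-200-100-50 net has 784*800 + 800*400 + 400*200 + 200*100 + 100*50 = 1,052,200 weights, i.e. roughly 130 times more arithmetic per step. The bigger the layers, the more of the total time is spent in the inner multiply-accumulate loop that both implementations have to execute anyway, and the less the per-neuron bookkeeping overhead of the OOP version matters.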