[The network's top 3 guesses are displayed here, each in the form "? - ??.??% probability".]
Controls:
Click and drag on the black canvas to the left to draw a digit. When the left mouse button is released (or when "Guess" is clicked), the neural network's top 3 guesses for what you have drawn are displayed above.
Press "C" or click "Clear canvas" to clear the canvas (reset to black).
Note:
For the most accurate predictions, try to keep the digit at roughly 65% of the canvas size. This is advised because no pre-processing is done regarding the relative size of the digit (the digit is, however, centered before being fed to the neural network, as sketched below).
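To illustrate what that centering step does, here is a minimal sketch in Python/NumPy. This is only a plausible equivalent, not the demo's actual code (which lives in the library itself); the function name and the 28x28, black-background convention are assumptions.

    import numpy as np

    def center_digit(img: np.ndarray) -> np.ndarray:
        # Center the drawn strokes on the canvas. `img` is a 28x28 array
        # with a black (zero) background and non-zero digit pixels.
        ys, xs = np.nonzero(img)          # pixels belonging to the digit
        if len(ys) == 0:
            return img                    # empty canvas: nothing to center
        crop = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        out = np.zeros_like(img)
        top = (img.shape[0] - crop.shape[0]) // 2
        left = (img.shape[1] - crop.shape[1]) // 2
        out[top:top + crop.shape[0], left:left + crop.shape[1]] = crop
        return out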
Description:
This program utilises a convolutional neural network to predict the digit that the user has drawn. The neural network was implemented using a neural network library created by me, the github repository of which can be accessed here.
This neural network is made up of 6 layers. The first is a convolutional layer with 16 5x5 filters and ReLU activation (which is applied before convolution), and it takes as input a 28x28x1 matrix (nested arrays). The second is a max-pooling layer with a 2x2 filter. The next two layers are of the same type and configuration as the previous two, except that the second convolutional layer has 32 filters. The fifth layer is a fully-connected layer with 256 neurons and ReLU activation that takes as input the flattened outputs of the previous (pooling) layer. Finally, the sixth (output) layer is a fully-connected layer with 10 neurons and softmax activation.
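For clarity, an equivalent architecture expressed in PyTorch might look like the following. This is a sketch, not the original code: it uses the conventional ReLU-after-convolution ordering rather than the before-convolution ordering described above, and it assumes unpadded convolutions, so the shapes run 28x28x1 -> 24x24x16 -> 12x12x16 -> 8x8x32 -> 4x4x32 -> 512.

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=5),   # first convolutional layer, 16 5x5 filters
        nn.ReLU(),
        nn.MaxPool2d(2),                   # 2x2 max pooling
        nn.Conv2d(16, 32, kernel_size=5),  # second convolutional layer, 32 5x5 filters
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),                      # 4 * 4 * 32 = 512 values
        nn.Linear(512, 256),               # fully-connected layer, 256 neurons
        nn.ReLU(),
        nn.Linear(256, 10),                # output layer, one neuron per digit
        nn.Softmax(dim=1),
    )

    probs = model(torch.zeros(1, 1, 28, 28))  # probabilities over the 10 digits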
The model was trained and tested on the MNIST hand-drawn digit dataset. Error was calculated during training using cross-entropy error, and this error was distributed through the network using back-propagation. Stochastic gradient descent was used to update the weights of the model, with an initial learning rate of 0.01 (or 0.001, I can't quite remember) that was manually decreased over time. I'm not sure how many epochs the model was trained for as I did not keep track; however, the model ended with a training accuracy of roughly 98.5% and a test accuracy of roughly 97%.
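A comparable training setup in PyTorch might look like this. It is a sketch only: the original training code used my own library, and `train_loader` (MNIST mini-batches) and the manual learning-rate schedule are assumed rather than shown. PyTorch's CrossEntropyLoss expects raw logits, so the final Softmax layer of the model above is sliced off for training.

    from torch import nn, optim

    train_model = model[:-1]                # drop the Softmax layer
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(train_model.parameters(), lr=0.01)

    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(train_model(images), labels)  # cross-entropy error
        loss.backward()                                # back-propagation
        optimizer.step()                               # SGD weight update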
The library used to create this model unfortunately has a fairly severe bug that makes it impossible for the first layer of weights to be tuned during training. I discovered this bug when I realised that I had made a typo in the for-loop used during back-propagation: the loop iterated backwards with the condition i > 0 instead of i >= 0, so the first layer of weights was never reached. Fixing this typo causes the model's outputs to explode in magnitude (the softmax outputs saturate at exactly 0 and 1, which should not normally be possible), causing the gradient calculation functions to return NaN or Infinity, which then quickly spreads to the rest of the network and leads to complete failure.
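A minimal illustration of that off-by-one (the names here are illustrative, not the library's actual code): iterating backwards over the layer indices, the condition i > 0 stops before index 0, so the first layer's weights are never updated, whereas i >= 0 would include it.

    def backprop(layers):
        i = len(layers) - 1
        while i > 0:          # bug: should be `i >= 0` to reach layers[0]
            print("updating weights of layer", i)
            i -= 1

    backprop(["first", "second", "third"])  # updates layers 2 and 1; layer 0 is skipped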
I have deliberated over fixing this bug extensively, but my attempts have so far been fruitless. Because of this, I have moved on to Python and am using the PyTorch machine learning library for future projects. I hope to fix this bug eventually, or to redesign the library altogether as it is really quite messy, but for the time being the library will remain relatively untouched.