Description

Artificial Neural Networks

This project can be done in pairs

Primary Goal:
In this project you will build an artificial neural network for categorizing images. You will write programs that take images like hand-written numbers and output what numbers the images represent.

Data-set:
MNIST is a data-set composed of handwritten digits and their labels. It is a famous data-set that has been used for testing the performance of new machine-learning algorithms. Each MNIST image is a 28×28 grey-scale image. Data is provided as 28×28 matrices containing numbers ranging from 0 (white pixels) to 255 (black pixels). Labels for each image are also provided as integer values ranging from 0 to 9, corresponding to the actual digit in the image. You can download the whole MNIST database of handwritten digits by using the package Keras in Python. We also provide you with a smaller version of MNIST: images.npy and labels.npy. There are 6,500 images in our version of the database and 6,500 corresponding labels. Since the original MNIST is very large, we recommend that you use the smaller version for this assignment, especially for Task 4.4.

Task 0: Install everything

Make sure that you have all of the software installed. Check the FAQ at the bottom of this page to learn how to set up Keras. Note that Ubuntu 16.04 LTS is the recommended OS for running Keras. Windows 10 is not recommended simply because the installation process is so different there. I also recommend that you read the entire FAQ section before starting the project so that you don’t get hosed trying to follow tutorials that aren’t relevant.

Task 1: Data Preprocessing

Image data is provided as 28-by-28 matrices of integer pixel values. However, the input to the network will be a flat vector of length 28*28 = 784. You will have to flatten each matrix to be a vector.
The label for each image is provided as an integer in the range of 0 to 9. However, the output of the network should be structured as a “one-hot vector” of length 10, like:

0 -> [1,0,0,0,0,0,0,0,0,0],

1 -> [0,1,0,0,0,0,0,0,0,0],

2 -> [0,0,1,0,0,0,0,0,0,0],…,

9 -> [0,0,0,0,0,0,0,0,0,1]
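
The two preprocessing steps above can be sketched in NumPy as follows. This is a minimal sketch, not a required implementation: the function names `preprocess` and `load_and_preprocess` are made up for illustration, and the file names are the ones given in the Data-set section.

```python
import numpy as np

def preprocess(images, labels, num_classes=10):
    """Flatten 28x28 images to 784-vectors and one-hot encode integer labels."""
    images = np.asarray(images)
    labels = np.asarray(labels, dtype=int)
    # (N, 28, 28) -> (N, 784); each row is one flattened image.
    x = images.reshape(len(images), -1).astype("float32")
    # Row i of the identity matrix is the one-hot vector for class i.
    y = np.eye(num_classes, dtype="float32")[labels]
    return x, y

def load_and_preprocess(image_path="images.npy", label_path="labels.npy"):
    """Load the provided .npy files (assumed to be in the working directory)."""
    return preprocess(np.load(image_path), np.load(label_path))
```

Scaling the pixel values to the range 0 to 1 is a common extra step, but the task description does not ask for it.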

Task 2: Building an Artificial Neural Network to classify the preprocessed data

To implement an Artificial Neural Network, you should use Keras, a package for Python 3. Here is a simple tutorial for beginners.
You will build an ANN of three fully connected layers (input layer, hidden layer(s) and output layer). Then train and test your ANN on the MNIST dataset.
You will use stochastic gradient descent to train your ANN. The loss function should be standard categorical cross-entropy. The learning rate should be 0.001. Start by using 10 hidden layers, each with 50 nodes. The batch size should be 512, with 500 epochs.
Split the data into 20% test set, 20% validation set, and 60% training.
Plot the accuracy of the training set and validation set over epochs. What conclusions can you draw about your model based on the plot?
Report the accuracy and the error and the confusion matrix on the test set.
Add a 20% Dropout to the first layer of your model. Again, plot the accuracy of the training set and validation set over epochs. Compare your plot to the previous one. What conclusions can you draw about your model now?
Again, report the accuracy, the error, and the confusion matrix on the test set.
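
One way the pieces of Task 2 might fit together is sketched below. Treat it as a starting point under stated assumptions, not as the expected solution: the helper names are hypothetical, the ReLU and softmax activations are my choice (the task does not name any), “the first layer” is read as the first hidden layer, and the `learning_rate` argument assumes a reasonably recent Keras release.

```python
import numpy as np

def split_indices(n, seed=0):
    """Shuffle 0..n-1 and cut into 60% train, 20% validation, 20% test."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train, n_val = (6 * n) // 10, (2 * n) // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def build_model(n_hidden=10, width=50, dropout=0.0, lr=0.001):
    """Fully connected classifier for 784 inputs and 10 classes."""
    import keras  # imported here so the NumPy helpers work without Keras
    model = keras.Sequential()
    model.add(keras.Input(shape=(784,)))
    for i in range(n_hidden):
        # Assumption: ReLU hidden units; the task does not specify one.
        model.add(keras.layers.Dense(width, activation="relu"))
        if i == 0 and dropout > 0:
            # Assumption: dropout is applied after the first hidden layer.
            model.add(keras.layers.Dropout(dropout))
    model.add(keras.layers.Dense(10, activation="softmax"))
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=lr),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

def confusion_matrix(y_true, y_pred, num_classes=10):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # add.at accumulates correctly even when an index pair repeats.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm
```

Training would then look something like `history = build_model().fit(x[train], y[train], batch_size=512, epochs=500, validation_data=(x[val], y[val]))`. In recent Keras versions the per-epoch curves for the plot are in `history.history["accuracy"]` and `history.history["val_accuracy"]`. For the confusion matrix, take `argmax` over the one-hot labels and over the model's predictions to get integer classes.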

Task 3: Cross validation

You will implement a function that performs 3-fold cross validation. The function takes the model and dataset as parameters, and returns the accuracies on the training and validation sets for each fold.
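
A rough sketch of such a function is shown below. One deliberate deviation, labeled as an assumption: instead of a model object it takes `build_fn`, a hypothetical zero-argument factory that returns a fresh compiled model, so that weights trained on one fold do not leak into the next. The model is only assumed to offer Keras-style `fit` and `evaluate` methods.

```python
import numpy as np

def cross_validate(build_fn, x, y, k=3, seed=0, **fit_kwargs):
    """Return a list of (train_accuracy, val_accuracy), one pair per fold.

    build_fn is a hypothetical factory returning a fresh, compiled
    Keras-style model; fit_kwargs (e.g. batch_size, epochs) go to fit().
    """
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, k)
    results = []
    for i in range(k):
        # Fold i is held out for validation; the rest is used for training.
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = build_fn()
        model.fit(x[train_idx], y[train_idx], verbose=0, **fit_kwargs)
        # With metrics=["accuracy"], Keras evaluate returns [loss, accuracy].
        _, train_acc = model.evaluate(x[train_idx], y[train_idx], verbose=0)
        _, val_acc = model.evaluate(x[val_idx], y[val_idx], verbose=0)
        results.append((float(train_acc), float(val_acc)))
    return results
```

If your assignment requires passing a model instance, you would need to reset or re-create its weights at the start of each fold to get the same effect.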

Task 4: Evaluate Hyper-parameter configuration

You will experiment with the hidden layers of the ANN you implemented in Task 2. Use 3-fold cross validation to evaluate ANNs with 1, 2, and 10 hidden layers. Report the best one and explain why it outperforms the others.
You will evaluate the model with batch sizes 32 and 512 using 3-fold cross validation. Report the best one and explain the advantages and disadvantages of a small batch size.
Run three experiments of your own design and report on your findings.
1. For your own experiments, tell me what you varied. I suggest you try varying batch size. Definitely try a convolutional network versus a non-convolutional one. You should also probably study dropout versus no dropout. Please report on validation accuracy but also time. If something takes 10% longer but gives a 50% boost to accuracy, that is worth talking about in your report. For your convolutional version, try some different kernels and report on what you find. Please make sure your report gives me p-values so I can know if the differences were reliable.
2. You can decrease the number of epochs or other aspects (like the number of images) to make something computationally tractable.
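
For the timing and p-value reporting, something along these lines may help. It is a sketch under assumptions: the helper names are invented, and it uses a paired t-test (`scipy.stats.ttest_rel`), which only makes sense if both configurations were evaluated on the same folds. The assignment does not say which t-test variant is expected.

```python
import time
import numpy as np
from scipy import stats

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def compare_configs(acc_a, acc_b):
    """Paired t-test on per-fold validation accuracies of two configurations.

    Returns (mean difference a - b, t statistic, two-sided p-value).
    Assumption: acc_a[i] and acc_b[i] come from the same fold i.
    """
    t_stat, p_value = stats.ttest_rel(acc_a, acc_b)
    diff = float(np.mean(acc_a) - np.mean(acc_b))
    return diff, float(t_stat), float(p_value)
```

With only three folds the test has very few degrees of freedom, so repeating the cross validation with several seeds gives the comparison more to work with.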

Grading policy:

You will submit the source code and report paper on Canvas.

Coding:
1. Data preprocessing: 10 pts
2. ANN: 15 pts
3. Cross validation: 15 pts
Report:
- Appropriate use of the t-test: 20 pts
- Accuracy, confusion matrix, and analysis for the test set (from Task 2): 20 pts
- Experiments of hidden layers (Task 4): 10 pts
- Experiments of batch size (Task 4): 10 pts

FAQ

Make sure you have installed:

For all OSs:

Your favorite text editor (Notepad++, Visual Studio Code, Visual Studio 2017, Pycharm, Sublime, Vim, Nano, Emacs, Gnome editor, gedit, etc.)

Python 3.5.X

Tensorflow

Keras

All dependencies required for those packages.

You may need the scikit-learn and scipy packages.

For Ubuntu:

Cuda for GPU support: especially if you plan on continuing use of Keras after the class

Terminator

Sublime

0. Why are there so many things to install?

Well, all these libraries have dependencies that must be met. You also need a good way to edit your code (depending on your operating system). I provided some more names if you are using Ubuntu for the first time. It is recommended to use a lighter-weight OS like Ubuntu 16.04 LTS to run these scripts because many of the dependencies are already met. Also, the Keras documentation assumes you use Ubuntu 16.04 and know all of the sly tricks Linux guys know. Dual booting is a popular way to get access to Ubuntu 16.04 if you plan on doing lots of Linux in the future.

This does come with caveats, however: you run the risk of wiping your Windows or macOS partitions if you do not do this properly. If you are like me, you might have NVidia driver issues with dual booting. If you don’t want to risk dual booting your machine, you can always download VirtualBox for free and run Ubuntu 16.04 as a virtual machine. This, however, can harm your machine if you do not properly cool it while running the virtual machine (VM).

If you want to use a VirtualBox VM, this link has installation instructions for all 3 major platforms. If you feel ballsy and want to dual boot, here is another great link that explains how to dual boot. I should say that Ubuntu is the recommended OS for this task because the Keras documentation is written in terms of Linux commands. Also, if you want GPU support, Windows tries to avoid the NVidia driver as much as possible by using the Intel integrated driver, but your mileage will vary.

0.5. Where do I get Ubuntu 16.04 LTS?

The link has been provided here. Note that you will need to download the proper version depending on your hardware, whether that is 64-bit or 32-bit. Also, choose the Desktop installation, not the Server installation, unless you are just that much of a baller. I chose this version to install on my Acer Nitro 5 (May 2018): 64-bit PC (AMD64) desktop image. I have written a guide on installing the correct driver for Ubuntu 16.04 to support Cuda here. Tensorflow with GPU support will speed up your calculations, and these instructions will get your machine set up to accept Tensorflow with GPU support.

1. How do I install Python 3?

Great question!

If you are running Windows or macOS, you can go to this link here and then go to the downloads tab. Make sure to download the most up-to-date version of 3.5.X, as this is the best supported in terms of available libraries in Python 3.

If you are running Ubuntu 16.04 LTS or similar OS you can follow this tutorial linked here.

2. How do I install Keras?

It turns out that Keras is actually a “wrapper”, if you will, of other neural network libraries. You need to have another package installed before you can run Keras. We will be choosing Tensorflow because this is the most widely supported library that I know of, although Caffe and Theano are also popular. Tensorflow is supported by Google, so there are consistent bug fixes for the library. It is also usable for many other applications and MQPs here at WPI, so it is recommended you use it anyways. (Multi-GPU support works only with Tensorflow, so, for those of you that want to prep their code for the WPI Ace or Turing clusters, use Tensorflow.)

The official instructions for installing Tensorflow on all operating systems are provided here on this webpage. Make sure you install the regular version if you do not have CUDA installed.

Once you have finished installing Tensorflow, you can follow either of the following instructions to install Keras.

The easiest way to install Keras is to run this command.

```
pip3 install keras
```
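
Once the install finishes, a quick check like the one below can tell you whether Python can actually find the packages. This is just a convenience sketch: the function name is made up, and the default package list is my guess at what this project needs based on the list at the top of this FAQ.

```python
import importlib.util

def missing_packages(names=("numpy", "tensorflow", "keras", "scipy", "sklearn")):
    """Return the names from the list that Python cannot find for import."""
    # find_spec returns None for a top-level module that is not installed.
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "nothing")
```

Note that this only checks that the packages can be located, not that Tensorflow can see your GPU.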

Here is some more documentation on installing Keras, collected from various blogs.

You can follow this part of the Keras documentation for installing on Ubuntu.

I happened to find this link for macOS for installing Keras.

If you want to install Keras for Windows 10, you can go to this link here. This link isn’t the best because Windows is not the best for installing Keras. It says to use Anaconda3 for Python3 and package management.

That is good if you have never programmed before and must have everything bullet-proof. If you want Anaconda, great, but here we like to live dangerously. Anaconda also messes with your environment variables in Windows more than I would recommend, so avoiding Windows at all costs will be more beneficial in the long run, especially if you plan to use Keras after CS4341. Industry uses Linux for Keras anyway.

3. I have been struggling for hours on trying to install this Tensorflow and Keras thing. Can I get help somewhere?

If you want help, you should look at the troubleshooting links provided for Tensorflow at the bottom of the installation instructions. If that is not helping you, you can always go to Stack Overflow and see what other people have done.

If all else fails, and you have combed the documentation, or if you like dealing with us SAs and TAs, you can come to us and ask us for help. Classmates are also great resources.

The TA for A2018 has dual-booted his Alienware, so he would be a good resource during office hours to figure that out. Some of your fellow RBE friends have done this too…

Professor Heffernan’s note to self: Make sure you understand the t-test and the chi-square test. Consider giving a smaller test case. It is hard to install the stuff.

Make a tutorial around numpy and installation. Maybe make the questions we ask harder.

CS 4341 Project 2 Artificial Neural Networks solved
