CPSC 585 – Artificial Neural Networks: Project 5


Caution: the version of the EMNIST Letters dataset provided by Tensorflow Datasets is incorrect. If you downloaded emnist_letters.npz prior to Friday April 2, follow the link below to download a new version. The corrected version should have size 68,058,233 bytes and MD5 checksum 36ea18093d8de0d8f1d9d2601276a4f8. Please re-read the updated dataset description below, and note that experiment (3) has changed to use a dedicated validation set rather than specifying a validation split.
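If you want to confirm that you have the corrected file, something along these lines should work (a minimal sketch; it assumes the file has been downloaded to the working directory under the name emnist_letters.npz):

```python
import hashlib
import os

path = "emnist_letters.npz"  # assumed to be in the current working directory

# The corrected file should be 68,058,233 bytes.
print("size:", os.path.getsize(path))

# Compute the MD5 checksum in 1 MiB chunks so the whole file is never held in memory.
md5 = hashlib.md5()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        md5.update(chunk)
print("md5:", md5.hexdigest())  # expected: 36ea18093d8de0d8f1d9d2601276a4f8
```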

The character recognition problem in previous projects used low-resolution bitmap images of letters from dot-matrix fonts. This project tackles a much more complex character recognition problem using higher-resolution grayscale images of hand-written letters. This problem requires a significantly more complex network and more advanced approaches to training and tuning.

The project may be completed individually, or in a group of no more than three students.

Platforms

The platform requirements for this project are the same as for Project 1, but the project uses a larger dataset that will require significantly more training. If you do not have access to a physical machine with a GPU, you should consider using Google Colab with a GPU or TPU, or another cloud service such as Kaggle Notebooks (GPU, TPU) or Gradient Community Notebooks.

Libraries and Code

This project should use the NumPy and Keras libraries. You may also wish to use other components of TensorFlow and Python libraries such as scikit-learn and pandas.

Code from A Whirlwind Tour of Python and from the library documentation may be reused. All other code and the results of experiments must be your own original work or the original work of other members of your team.

Warmup

As described in Section 1.8 of the textbook, the MNIST database of handwritten digits is a standard benchmark for computer vision. This dataset is included with Keras and is a good introduction to working with larger datasets.

This notebook by Francois Chollet creates a simple Multilayer Perceptron as described in Section 2.1 of Deep Learning with Python. (Recall that this book is available from the library.) Update the code to use tf.keras, then try running it to see if you can match the reported accuracy of 97.8%.
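As a rough sketch, the updated version might look something like the following; the layer sizes, optimizer, and training settings follow Chollet's MNIST example, but treat the details as a starting point rather than a reference solution:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Load MNIST and flatten the 28x28 images into 784-element float vectors scaled to [0, 1].
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()
train_images = train_images.reshape((60000, 28 * 28)).astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28)).astype("float32") / 255

# One-hot encode the digit labels.
train_labels = keras.utils.to_categorical(train_labels)
test_labels = keras.utils.to_categorical(test_labels)

# Single-hidden-layer MLP as in Chollet's MNIST notebook.
model = keras.Sequential([
    keras.Input(shape=(28 * 28,)),
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="rmsprop",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

model.fit(train_images, train_labels, epochs=5, batch_size=128)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print("test accuracy:", test_acc)
```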

Dataset

The Extended MNIST or EMNIST dataset expands on the original MNIST, adding handwritten letters as well as additional samples of handwritten digits. There are several “splits” of the data by various characteristics. We will be using the “EMNIST Letters” dataset, which contains data split into 27 classes, one unused (class 0) and one for each letter in the English alphabet.

Note: Some classes in this dataset can be challenging to recognize because each class contains images of both upper- and lower-case letters.

Download emnist_letters.npz. This file can be opened with the numpy.load() function. The data contains six arrays: ‘train_images’, ‘train_labels’, ‘validate_images’, ‘validate_labels’, ‘test_images’, and ‘test_labels’. The arrays from the original dataset have been altered to match Chollet’s MNIST notebook:

  • The images have been transposed and scaled to floating point.
  • The labels have been one-hot encoded.
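A quick way to see what the archive contains (a minimal sketch, assuming the file is in the working directory):

```python
import numpy as np

# np.load on a .npz archive returns an NpzFile keyed by array name.
data = np.load("emnist_letters.npz")

# Check what is actually in the file rather than assuming shapes and dtypes.
for name in data.files:
    print(name, data[name].shape, data[name].dtype)

train_images, train_labels = data["train_images"], data["train_labels"]
validate_images, validate_labels = data["validate_images"], data["validate_labels"]
test_images, test_labels = data["test_images"], data["test_labels"]
```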

Experiments

Run the following experiments in a Jupyter notebook, performing each action in a code cell and answering each question in a Markdown cell.

  1. Use plt.imshow() to verify that the image data has been loaded correctly and that the corresponding labels are correct (experiments 1–3 are illustrated in the first sketch after this list).
  2. Begin by applying the network architecture from Chollet’s MNIST notebook to the EMNIST Letters data. What accuracy do you achieve? How does this compare with the accuracy for MNIST?
  3. Keeping the same number of layers in the network (i.e. an MLP with a single hidden layer), modify the architecture to improve the accuracy. You will need to decide on an appropriate number of neurons in the hidden layer. Keep in mind that:
  • There are 27 classes rather than 10, so you will need a larger hidden layer than the MNIST network.
  • In addition to having more classes, EMNIST Letters mixes upper- and lowercase letters within each class, so even with enough neurons in the hidden layer, your accuracy is likely to be lower. See the details in the EMNIST paper for the kind of performance you might reasonably expect.
  • The Keras fit() method can take a validation_data parameter in order to evaluate metrics on the validation set.
  4. Once you have settled on the size of the hidden layer, use the techniques you learned in Chapters 3 and 4 of the textbook to obtain the highest accuracy you can on the validation set (the second sketch after this list shows how a few of these might be wired in). These might include:
  • Preprocessing
  • Weight initialization
  • Choice of activation function
  • Optimizer
  • Batch Normalization
  • Regularization
  • Data augmentation
  • Dropout
  • Early Stopping

You may find the slides for Chapter 3 helpful, particularly the presentation “Neural Network Training [Initialization, Preprocessing, Mini-Batching, Tuning, and Other Black Art].”

  5. Add additional Dense hidden layers as appropriate to improve the accuracy. Note that you may need to adjust your hyperparameters or other aspects of the network architecture in response to these changes. How does the accuracy for your deep network compare with the accuracy you achieved in experiment (4)?
  6. When finished tuning, evaluate your results on the test set. Compare the test performance of your original network from experiment (2) and the final networks for experiments (3) and (4).
  7. Use plt.imshow() to view some of the misclassified images and examine their labels, as in the final sketch below. Describe what you think might have gone wrong.
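The sketches below are illustrations only, not reference solutions. This first one covers experiments 1–3: displaying a sample image with its decoded label, then fitting a single-hidden-layer MLP with 27 output units and monitoring the dedicated validation arrays through validation_data. The hidden-layer size shown is a placeholder, not a recommendation:

```python
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras import layers

data = np.load("emnist_letters.npz")
# reshape(-1, 784) works whether the images are stored as 28x28 or already flattened.
x_train = data["train_images"].reshape(-1, 28 * 28)
y_train = data["train_labels"]
x_val = data["validate_images"].reshape(-1, 28 * 28)
y_val = data["validate_labels"]

# Experiment 1: display a sample and decode its one-hot label
# (class 0 is unused; classes 1-26 correspond to the letters a-z).
i = 0
plt.imshow(x_train[i].reshape(28, 28), cmap="gray")
plt.title(f"label index: {np.argmax(y_train[i])}")
plt.show()

# Experiments 2-3: the MNIST architecture adapted to 27 output classes,
# with the validation set passed via validation_data.
model = keras.Sequential([
    keras.Input(shape=(28 * 28,)),
    layers.Dense(1024, activation="relu"),   # hidden-layer size is yours to choose
    layers.Dense(27, activation="softmax"),  # 27 classes, class 0 unused
])
model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=10, batch_size=128,
          validation_data=(x_val, y_val))
```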
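For experiment 4, this second sketch shows how a few of the listed techniques (weight initialization, batch normalization, dropout, and early stopping) might be wired into the same model; which of them actually help is for you to determine, and the specific rates and patience values here are arbitrary:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

data = np.load("emnist_letters.npz")
x_train = data["train_images"].reshape(-1, 28 * 28)
y_train = data["train_labels"]
x_val = data["validate_images"].reshape(-1, 28 * 28)
y_val = data["validate_labels"]

model = keras.Sequential([
    keras.Input(shape=(28 * 28,)),
    layers.Dense(1024, kernel_initializer="he_normal"),  # one possible weight initialization
    layers.BatchNormalization(),
    layers.Activation("relu"),
    layers.Dropout(0.3),                                  # arbitrary dropout rate
    layers.Dense(27, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Early stopping watches validation accuracy and restores the best weights seen so far.
early_stop = keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=5,
                                           restore_best_weights=True)
model.fit(x_train, y_train, epochs=100, batch_size=128,
          validation_data=(x_val, y_val), callbacks=[early_stop])
```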
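Finally, for experiments 6 and 7, a sketch of evaluating on the test arrays and pulling out misclassified images for inspection; model is assumed to be whichever network you finished tuning above:

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.load("emnist_letters.npz")
x_test = data["test_images"].reshape(-1, 28 * 28)
y_test = data["test_labels"]

# Experiment 6: headline metrics on the test set.
test_loss, test_acc = model.evaluate(x_test, y_test)
print("test accuracy:", test_acc)

# Experiment 7: indices where the predicted class differs from the true class.
predicted = np.argmax(model.predict(x_test), axis=1)
actual = np.argmax(y_test, axis=1)
wrong = np.nonzero(predicted != actual)[0]

# Look at a handful of the misclassified letters.
for i in wrong[:9]:
    plt.imshow(x_test[i].reshape(28, 28), cmap="gray")
    plt.title(f"predicted {predicted[i]}, actual {actual[i]}")
    plt.show()
```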

Submission

A Markdown cell at the top of the notebook should include project summary information as described in the Syllabus for README files.

Since you may be actively editing and making changes to the code cells in your notebook, be certain that each of your code cells still runs correctly before submission. You may wish to do this by selecting Run All from the drop-down menu bar.

Submit your Jupyter .ipynb notebook file through Canvas before class on the due date.

If the assignment is completed by a team, only one submission is required. Be certain to identify the names of all students on your team at the top of the notebook. See the following sections of the Canvas documentation for instructions on group submission: