Single image super resolution (SISR) is a classical image restoration problem which aims to recover a
high-resolution (HR) image from the corresponding low-resolution (LR) image.
In this assignment, you’ll need to implement a super resolution convolutional neural network (SRCNN)
with PyTorch. We use “Learning a Deep Convolutional Network for Image Super-Resolution”  as the
basic reference. The basic network architecture and implementation details will be provided in the
following sections. In the end, you should submit the source code and the well-trained model after your
finish this assignment.
2 Implementation details
SRCNN uses pairs of LR and HR images to learn the mapping between them. For this purpose, image
databases containing LR and HR pairs are created and used as a training set. The learned mapping can be
used to predict HR details in a new image.
The SRCNN consists of the following operations:
1. Preprocessing: Upscales LR image to desired HR size (using bicubic interpolation).
2. Feature extraction: Extracts a set of feature maps from the upscaled LR image.
3. Non-linear mapping: Maps the feature maps representing LR to HR patches.
4. Reconstruction: Produces the HR image from HR patches.
Operations 2–4 above can be cast as a convolutional layer in a CNN that accepts the upscaled images as
input, and outputs the HR image. This CNN consists of three convolutional layers:
• Conv. Layer 1: Patch extraction
o 64 filters of size 3 x 9 x 9 (padding=4, stride=1)
o Activation function: ReLU
o Output: 64 feature maps
• Conv. Layer 2: Non-linear mapping
o 32 filters of size 64 x 1x 1 (padding=0, stride=1)
o Activation function: ReLU
o Output: 32 feature maps
• Conv. Layer 3: Reconstruction
o 3 filter of size 32 x 5 x 5 (padding=2, stride=1)
o Activation function: Identity
o Output: HR image
The overall structure of SRCNN is shown in Figure 1.
Figure 1. Network Architecture of SRCNN with upscaling factor=3
In this assignment, you will need to implement a SRCNN with upscaling factor 3 in PyTorch. Let 𝑌𝑌𝜃𝜃(𝑥𝑥)
denote this SRCNN model in the following sections.
2.2 Model Training
A typical training framework for a neural network is as follows:
• Define the neural network that has some learnable parameters (or weights)
• Iterate over a dataset of inputs
• Process input through the network
• Compute the loss between output and the ground truth (how far is the output from being correct)
• Propagate gradients back into the network’s parameters
• Update the weights of the network, typically using a simple update rule: weight = weight –
learning_rate * gradient
The SRCNN is a simple feed-forward neural network. It upscaled the input LR, feeds the upscaled image
through several layers one after the other, and then finally gives the output. The overall training procedure
of this network is the same as the above framework. To be specific, with PyTorch, the pseudocode of
training procedure for SRCNN can be described as follows:
Note that the actual code might differ from the pseudocode. Please check tutorial notes and PyTorch
document for related APIs. Besides, we use mean squared error (MSE) as the 𝒍𝒍𝒍𝒍𝒍𝒍𝒍𝒍_𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇:
𝐿𝐿(𝜃𝜃) = 1
𝑛𝑛�‖𝑌𝑌𝜃𝜃(𝐿𝐿𝑅𝑅𝑖𝑖) − 𝐻𝐻𝑅𝑅𝑖𝑖‖2
procedure TrainOneEpoch(𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚 𝑌𝑌𝜃𝜃, 𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜,𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡)
for each (𝐿𝐿𝑅𝑅𝑖𝑖, 𝐻𝐻𝑅𝑅𝑖𝑖) pair in 𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 do
zero the gradient buffers of 𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜
compute 𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑡𝑡𝑖𝑖 = 𝑌𝑌𝜃𝜃(𝐿𝐿𝑅𝑅𝑖𝑖)
compute the loss ℓ = 𝒍𝒍𝒍𝒍𝒍𝒍𝒍𝒍_𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇𝒇(𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑡𝑡𝑖𝑖, 𝐻𝐻𝑅𝑅𝑖𝑖)
back-propagate the gradients from ℓ to the parameters 𝜃𝜃 of model 𝑌𝑌𝜃𝜃
use 𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜 to update the parameters 𝜃𝜃
record the loss for training statistics [optional]
where 𝑛𝑛 is the number of training samples. This loss functions can be found in PyTorch APIs. Using MSE
as the loss function favors a high peak signal-to-noise ratio (PSNR). The PSNR is a widely used metric
for quantitatively evaluating image restoration quality and is at least partially related to the perceptual
quality. We will also use PSNR (the higher the better) to measure the performance of the trained model.
The PSNR related snippets are provided in the skeleton code.
In this assignment, we use 91-Image dataset as our training dataset and Set-5 dataset as the testing dataset.
The data related part is provided in the skeleton code.
Other hyperparameters related to training are listed below:
• Training epoch=100; one epoch means completing one loop over whole dataset
• Optimizer: Adam
• Learning rate=0.0001
• Training batch size=128; the number of inputs being feed into the network at once
Note that the above hyperparameters might not lead to reasonable performance. You are encouraged to
find other possible hyperparameters to achieve better performance.
2.3 Skeleton code usage
2.3.1 Project structure
The skeleton code consists of 6 files:
• train.py: a CLI program, which contains the procedure of model training [to be completed]
• model.py: SRCNN model [need to be completed]
• data.py: dataset related codes
• utils.py: helper functions
• super_resolve.py: a CLI program, which can super resolve images given a well-trained model
• info.py: submission info [need to be completed]
In this assignment, you are required to implement a SRCNN in PyTorch 1.2+. In order to make the
skeleton code functional, you need to complete these three files in the skeleton code: train.py,
The usage of train.py can be describe as follows:
# train the SRCNN model using GPU, set learning rate=0.0005, batch size=256,
# make the program train 100 epoches and save a checkpoint every 10 epoches
python train.py train –cuda –lr=0.0005 –batch-size=256 –num-epoch=100 –savefreq=10
# train the SRCNN model using CPU, set learning rate=0.001, batch size=128,
# make the program train 20 epoches and save a checkpoint every 2 epoches
python train.py train –lr=0.001 –batch-size=128 –num-epoch=20 –save-freq=2
# resume training with GPU from “checkpoint.x” with saved hyperparameters
python train.py resume checkpoint.x –cuda
# resume training from “checkpoint.x” and override some of saved hyperparameters
python train.py resume checkpoint.x –batch-size=16 –num-epoch=200
# inspect “checkpoint.x”
python train.py inspect checkpoint.x
Note that the checkpoint consists of the parameters of a trained model, the state of an optimizer, and the
arguments (or hyperparameters) used in current training procedure. Thus, you can use checkpoint to
The usage of super_resolve.py can be describe as follows:
You may use this program to perform qualitative comparison using the images inside the
image_examples.zip file. This file contains LR images, upscaled images with bicubic interpolation, and
ground truth (GT) HR images.
3 Grading Scheme
The assignment will be graded by the following marking scheme:
• Code [40 points]
o The network implementation: 20 points
o The codes for training: 20 points
• Model Training [60 points]
o The testing result of your well-trained model. The higher evaluation metrics are, the higher
your score will be.
4 Submission guidelines
You will need to submit train.py, info.py, model.py and checkpoint.pth to the Blackboard. The
saved checkpoint can have various filenames, you should select one and rename it to checkpoint.pth.
You need to archive all the mentioned files in .zip or .7z format, name this archive file with your name
and student ID (e.g. 1155xxxxxx_lastname_firstname.zip), and then submit this file to the
In case of multiple submissions, only the latest one will be considered.
4.1 Submission requirements
In your source code files, type your full name and student ID, just like:
# use the model stored in “checkpoint.x” to super resolve “lr.bmp”
python super_resolve.py –checkpoint checkpoint.x lr.bmp
Besides, complete the following parts in the info.py. The info object will be saved to the checkpoint, i.e.
4.2 Late submission penalty
If you submit your solution after the due date, 10 marks will be deducted per day, and the maximal
deduction is 30 marks even you delay more than 3 days. But there are hard deadlines as we need time to
grade and submit grade. The hard deadline for this assignment is 8 May 2020. After this day, we will not
grade your assignment.
 Dong, C., Loy, C. C., He, K., & Tang, X. (2015). Image super-resolution using deep convolutional
networks. IEEE transactions on pattern analysis and machine intelligence, 38(2), 295-307.
1. PyTorch tutorial: https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html
2. Tensorboard usage: https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html
3. PyTorch document: https://pytorch.org/docs/stable/index.html
# CSCI3290 Computational Imaging and Vision *
# — Declaration — *
# I declare that the assignment here submitted is original except for source
# material explicitly acknowledged. I also acknowledge that I am aware of
# University policy and regulations on honesty in academic work, and of the
# disciplinary guidelines and procedures applicable to breaches of such policy
# and regulations, as contained in the website
# http://www.cuhk.edu.hk/policy/academichonesty/ *
# Assignment 4
# Student ID:
# Email Addr:
info = Info(