Description
Problem 1 Feature Representations (10 pts)
In our study of model selection, we discussed several options of what to do when learning fails. Perhaps
the most opaque of these was to “change the feature representation of the data.” In our upcoming study
of kernels, we will see how to implicitly work with different feature representations. In this problem, we
will explicitly work with a new feature representation known as random Fourier features, described in the
seminal work here.
You will work with the MNIST dataset using your multiclass ridge regression implementation from
Homework 1. Transform each vector xi ∈ R
d via the equation
x¯i = cos (Gxi + b),
where G ∈ R
p×d
is a Gaussian random matrix whose elements are drawn iid from the normal distribution
with mean zero and variance σ
2 = 0.1, and the elements of b ∈ R
p are chosen iid from the uniform distribution
on [0, 2π].
To select the new feature dimension p, perform a training/validation split of 80%/20% on your training
data and use the validation set to perform model selection. Turn in the test error for your selected value of
p, as well as a plot showing your training and validation error as a function of p.
Notes:
1. You should use the same G to transform the entire dataset.
2. You may run into memory issues for p > 6000.
3. You may use λ = 10−4 as in Homework 1, but feel free to perform model selection on λ and/or σ
2 as
well.
Problem 2 Strong Convexity (10 pts)
Prove that f(w) = λ kwk
2
is 2λ-strongly convex.
Problem 3 SLT (10 pts)
(SLT rules apply.) UML, Ch. 12, Exercise 3. State how long you worked on the problem before looking at
the solution.
Problem 4 SLT (5 pts)
(SLT rules apply.) UML, Ch. 13, Exercise 1, first bullet only. State how long you worked on the problem
before looking at the solution.
1