Problem 1 Multiclass Ridge Regression Classifier (5 pts each)
In this problem, you will derive and implement a multiclass linear classifier based on ridge regression. Suppose
you are given $m$ training examples $\{(x_i, y_i)\}$ from $K$ classes, where $x_i \in \mathbb{R}^d$ and
$y_i \in \{e_1, \ldots, e_K\}$ is a one-hot vector.[1] One way to perform classification is to find the matrix
$W \in \mathbb{R}^{d \times K}$ that minimizes the least squares cost function given below. You can then
estimate the integer-valued class labels as
$$\hat{y}_i = \arg\max_{k \in [K]} \, (W^T x_i)_k ,$$
where $(W^T x_i)_k$ denotes the $k$th entry of the vector $W^T x_i$.
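The encoding and the prediction rule above can be sketched in NumPy as follows. This is only an illustrative sketch: the helper names `one_hot` and `predict` are hypothetical, the matrix `W` is hand-picked rather than trained, and classes are indexed from 0 (so class k in the text corresponds to index k-1 in the code).

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer labels in {0, ..., K-1} to rows of an (m x K) one-hot matrix."""
    labels = np.asarray(labels, dtype=int)
    Y = np.zeros((labels.size, num_classes))
    Y[np.arange(labels.size), labels] = 1.0   # set entry (i, label_i) to one
    return Y

def predict(W, X):
    """Return argmax_k (W^T x_i)_k for every row x_i of X (m x d)."""
    scores = X @ W                            # row i holds the K entries of W^T x_i
    return np.argmax(scores, axis=1)

# Tiny hand-checkable example: d = 2 features, K = 3 classes.
W = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]])
X = np.array([[2.0, 1.0],
              [0.0, 3.0],
              [-1.0, -1.0]])

Y = one_hot([0, 2, 1], 3)
labels_hat = predict(W, X)
print(Y)
print(labels_hat)   # scores are [2, 1, -3], [0, 3, -3], [-1, -1, 2] -> [0 1 2]
```

Storing the examples as rows of `X` means the scores for all examples come from one matrix product, with no loop over i.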
(a) Derive a closed-form solution to the problem
$$\widehat{W} = \arg\min_{W \in \mathbb{R}^{d \times K}} \sum_{i=1}^{m} \left\| W^T x_i - y_i \right\|_2^2 + \lambda \left\| W \right\|_F^2 ,$$
where $\lambda > 0$ is a regularization parameter.
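As a possible starting point for the derivation, the objective can be restated in matrix form. The matrices $X \in \mathbb{R}^{m \times d}$ (rows $x_i^T$) and $Y \in \mathbb{R}^{m \times K}$ (rows $y_i^T$) are notation assumed here for illustration; they do not appear in the problem statement.

```latex
\sum_{i=1}^{m} \left\| W^T x_i - y_i \right\|_2^2 + \lambda \left\| W \right\|_F^2
  = \left\| X W - Y \right\|_F^2 + \lambda \left\| W \right\|_F^2 ,
\qquad
\nabla_W \left( \left\| X W - Y \right\|_F^2 + \lambda \left\| W \right\|_F^2 \right)
  = 2 X^T (X W - Y) + 2 \lambda W .
```

Because $\lambda > 0$, the objective is strictly convex in $W$, so setting this gradient to zero characterizes the unique minimizer.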
(b) Implement your trained classifier with $\lambda = 10^{-4}$ on the full MNIST dataset included in the homework
files. You may not use sklearn or other high-level libraries to perform the one-hot encoding, but you
may use numpy and scipy. Turn in your code, as well as the classification error on the training and
test sets.
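A minimal sketch of the fit-and-evaluate pipeline, under several assumptions: the data are a synthetic stand-in (Gaussian blobs) because the format of the homework's MNIST files is not specified here, the helper names `fit_ridge` and `error_rate` are hypothetical, classes are indexed from 0, and the solver uses the regularized normal equations $(X^T X + \lambda I) W = X^T Y$ with the examples stacked as rows of `X`, without an offset term.

```python
import numpy as np

def fit_ridge(X, Y, lam):
    """Solve (X^T X + lam * I) W = X^T Y for W (d x K); no offset term."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def error_rate(W, X, labels):
    """Fraction of rows of X whose argmax score disagrees with the true label."""
    return np.mean(np.argmax(X @ W, axis=1) != labels)

# Synthetic stand-in for MNIST: K Gaussian blobs in d dimensions.
rng = np.random.default_rng(0)
K, d, m = 3, 5, 300
centers = 5.0 * rng.standard_normal((K, d))
labels = rng.integers(0, K, size=m)
X = centers[labels] + rng.standard_normal((m, d))

lam = 1e-4
Y = np.eye(K)[labels]        # one-hot targets built with numpy only
W = fit_ridge(X, Y, lam)
train_error = error_rate(W, X, labels)
print(f"training error: {train_error:.3f}")
```

Using `np.linalg.solve` rather than forming an explicit inverse is a common numerical choice. On real data the same two helpers would be applied to the training split for fitting and to both splits for reporting error.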
Problem 2 DSS: Using scikit-learn (10 pts)
(DSS rules apply.) One important part of data science is fluency with popular libraries. One library that is
useful for a variety of machine learning tasks (especially those not related to deep learning) is scikit-learn,
a.k.a. sklearn. Use sklearn to perform multiclass ridge regression classification, taking special care to make
sure the regularization parameter $\lambda$ and the use of an offset are the same as in Problem 1. Turn in your code, an explanation
of what functions and options you used to perform the multiclass classification, as well as the classification
error on the training and test sets.
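One possible way to set this up is sketched below; it is not the only valid choice of functions. It assumes that the objective being matched has no offset (hence `fit_intercept=False`) and that `alpha` plays the role of the regularization parameter. The data are a synthetic stand-in, not MNIST, and classes are indexed from 0. The sketch uses `Ridge` on one-hot targets from `LabelBinarizer`; `RidgeClassifier` is an alternative, but as far as its documentation describes it encodes targets as -1/+1 rather than 0/1, so its fitted weights need not coincide with the one-hot least squares solution.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import LabelBinarizer

# Synthetic stand-in data: K Gaussian blobs in d dimensions.
rng = np.random.default_rng(0)
K, d, m = 3, 5, 300
centers = 5.0 * rng.standard_normal((K, d))
labels = rng.integers(0, K, size=m)
X = centers[labels] + rng.standard_normal((m, d))

lam = 1e-4
Y = LabelBinarizer().fit_transform(labels)   # (m, K) one-hot matrix when K > 2

# Multi-output ridge regression on the one-hot targets, without an intercept.
model = Ridge(alpha=lam, fit_intercept=False, solver="cholesky")
model.fit(X, Y)

pred = np.argmax(model.predict(X), axis=1)
train_error = np.mean(pred != labels)

# Sanity check against the regularized normal equations solved with numpy.
W_closed = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)
coef_gap = np.max(np.abs(model.coef_.T - W_closed))   # coef_ has shape (K, d)
print(f"training error: {train_error:.3f}, max coefficient gap: {coef_gap:.2e}")
```

Comparing `model.coef_.T` with a direct numpy solve is a quick way to check that the regularization strength and the offset setting agree between the two implementations.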
Problem 3 SLT (5 pts)
(SLT rules apply.) UML, Ch. 2, Exercise 2. State how long you worked on the problem before looking at
the solution.
Problem 4 SLT (5 pts each)
(SLT rules apply.) UML, Ch. 2, Exercise 3.1-3.2. State how long you worked on the problem before looking
at the solution.
[1] The vector $e_k$ is the $k$th standard basis vector, taking the value zero everywhere except in the $k$th
element, where it takes the value one.