Introduction to ML – Assignment 2 solved




Problem 1 Assume we collected a dataset $D = \{(x^{(i)}, t^{(i)})\}_{i \in 1..7}$ of $N = 7$ points (i.e., observations) with inputs $\{x^{(i)}\}_{i \in 1..7} = (1, 2, 3, 4, 5, 6, 7)$ and outputs $\{t^{(i)}\}_{i \in 1..7} = (6, 4, 2, 1, 3, 6, 10)$
for a regression problem with both scalar inputs and outputs.
1. (1 point) Draw a scatter plot of the dataset using spreadsheet software (e.g., Excel).
2. (6 points) Let us use a linear regression model $g_{w,b}(x) = wx + b$ to model this data.
Write down the analytical expression of the least squares loss (covered in Video 6) of
this model on dataset $D$. Your loss should take the form of
$A_i w^2 + B_i b^2 + C_i wb + D_i w + E_i b + F_i$
where $A_i$, $B_i$, $C_i$, $D_i$, $E_i$, and $F_i$ are expressed only as a function of $x^{(i)}$ and $t^{(i)}$ or constants. Do not fill in any numerical values yet.
3. (4 points) Derive the analytical expressions of $w$ and $b$ by minimizing the mean squared
loss from the previous question. Your expressions for parameters $w$ and $b$ should only
depend on $A = \sum_i A_i$, $B = \sum_i B_i$, $C = \sum_i C_i$, $D = \sum_i D_i$, and $E = \sum_i E_i$. Do not fill in
any numerical values yet.
4. (2 points) Give approximate numerical values for w and b by plugging in numerical
values from the dataset D.
Introduction to ML – Fall 2022 Assignment 2 – Page 2 of 3 Oct 6
5. (0 points) Double-check your solution with the scatter plot from the question earlier:
e.g., you can use Excel to find numerical values of w and b. You do not need to hand
in anything here, this is just for you to verify you obtained the correct solution in the
previous questions.
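For the optional check in question 5, a spreadsheet is not the only option. The following is a minimal sketch, assuming NumPy is available, that fits a straight line to the dataset with `numpy.polyfit`. The variable names are illustrative and not part of the assignment.

```python
import numpy as np

# Dataset D from Problem 1.
x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

# A degree-1 polynomial fit minimizes the sum of squared residuals,
# which has the same minimizer as the mean squared loss.
# polyfit returns coefficients from the highest power down: (slope, intercept).
w, b = np.polyfit(x, t, deg=1)

print(f"w = {w:.4f}, b = {b:.4f}")
```

The printed values can be compared against the numbers obtained by hand in question 4.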
Problem 2 The goal of this problem is to revisit Problem 1, but solving it with a different
technique known as the method of least squares. This will serve as a “warm-up” to Problem
3. In the rest of this problem, any reference to a dataset refers to the dataset described in
Problem 1.
1. (1 point) Verify that one can rewrite the linear regression model $g_{w,b}(x) = wx + b$ in the
simpler form of
$g_{\vec{w}}(\vec{x}) = \vec{x}\vec{w}$
if one assumes each input $\vec{x}$ is a two-dimensional row vector such that a point in our
dataset is now $\vec{x}^{(i)} = (x^{(i)}, 1)$, where $x^{(i)}$ is the scalar input described in Problem 1. Write
the components of the new column vector $\vec{w}$ as a function of $w$ and $b$ from Problem 1.
2. (4 points) Derive analytically $\nabla_{\vec{w}} \|X\vec{w} - \vec{t}\|^2$, where $X$ is an $N \times 2$ matrix such that each
row of $X$ is a vector $\vec{x}^{(i)}$ described in the previous question, and $\vec{t} = \{t^{(i)}\}_{i \in 1..N}$.
3. (1 point) Conclude that the model’s weight value $\vec{w}^*$ which minimizes the least squares
loss (covered in Video 6) must satisfy
$2X^\top X\vec{w}^* - 2X^\top\vec{t} = 0$
4. (1 point) Assuming that $X^\top X$ is invertible, derive analytically the value of $\vec{w}^*$.
5. (0 points) Using NumPy, implement the solution you found in the previous question and
verify that you obtain the same results for $w$ and $b$ as in Problem 1. You do not need
to hand in anything here; this is just a way for you to verify you obtained the correct
solution in the previous questions.
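One possible shape for the NumPy check in question 5 is sketched below. It builds $X$ with rows $(x^{(i)}, 1)$ and solves the linear system from question 3 directly instead of forming an inverse. The names `X`, `w_vec`, and `w_lstsq` are assumptions made for illustration, and the comparison with `numpy.linalg.lstsq` is an extra sanity check rather than something the assignment asks for.

```python
import numpy as np

# Dataset from Problem 1.
x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

# Each row of X is (x_i, 1), so X has shape (N, 2).
X = np.column_stack([x, np.ones_like(x)])

# Solve (X^T X) w = X^T t, which is the condition from question 3
# divided by 2. Solving the system avoids computing an explicit inverse.
w_vec = np.linalg.solve(X.T @ X, X.T @ t)
w, b = w_vec

# Cross-check against NumPy's built-in least-squares routine.
w_lstsq, *_ = np.linalg.lstsq(X, t, rcond=None)

print(f"w = {w:.4f}, b = {b:.4f}")
```

Using `np.linalg.solve` rather than `np.linalg.inv` is a common numerical choice; both give the same result here because $X^\top X$ is a well-conditioned 2 × 2 matrix for this dataset.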
Problem 3 Let us now assume that $D$ is a dataset with $d$ features per input and $N > 0$
inputs. We have $D = \{((x^{(i)}_j)_{j \in 1..d}, t^{(i)})\}_{i \in 1..N}$. In other words, each $\vec{x}^{(i)}$ is a column vector
with $d$ components indexed by $j$ such that $x^{(i)}_j$ is the $j$th component of $\vec{x}^{(i)}$. The output $t^{(i)}$
remains a scalar (real value).
Let us assume for simplicity that we have a simplified linear regression model, as presented
in Question 1 of Problem 2. We would like to train a regularized linear regression model,
where the mean squared loss is augmented with an $\ell_2$ regularization penalty $\frac{1}{2}\|\vec{w}\|_2^2$ on the
weight parameter $\vec{w}$:
$\varepsilon(\vec{w}, D) = \frac{1}{2N} \sum_{i \in 1..N} \left(g_{\vec{w}}(\vec{x}^{(i)}) - t^{(i)}\right)^2 + \frac{\lambda}{2}\|\vec{w}\|_2^2$
where $\lambda > 0$ is a hyperparameter that controls how much importance is given to the penalty.
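To make the role of the penalty concrete, here is a small sketch that evaluates a regularized loss on toy data. It assumes the loss is the halved mean squared error plus $\frac{\lambda}{2}\|\vec{w}\|_2^2$, and it stores each input as a row of a matrix `X`. The function name `regularized_loss` and the toy numbers are made up for illustration.

```python
import numpy as np

def regularized_loss(w, X, t, lam):
    """Halved mean squared error plus (lam / 2) * ||w||_2^2 (assumed form)."""
    n = X.shape[0]
    residuals = X @ w - t  # model predictions minus targets, one per input
    return residuals @ residuals / (2 * n) + 0.5 * lam * (w @ w)

# Toy data (not from the assignment): N = 3 inputs with d = 2 features.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])  # fits the toy data exactly

loss_no_penalty = regularized_loss(w, X, t, lam=0.0)
loss_with_penalty = regularized_loss(w, X, t, lam=0.1)

print(loss_no_penalty, loss_with_penalty)
```

With a perfect fit the data term vanishes, so any remaining loss comes only from the penalty, which grows with both the hyperparameter and the squared norm of the weights.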
1. (3 points) Let $A = \sum_{i \in 1..N} \vec{x}^{(i)} \vec{x}^{(i)\top}$. Give a simple analytical expression for the components of $A$.
2. (6 points) Let us write $\vec{b} = \sum_{i \in 1..N} t^{(i)} \vec{x}^{(i)}$. Prove that the following holds:
$\nabla \varepsilon(\vec{w}, D) = \frac{1}{N}\left(A\vec{w} - \vec{b}\right) + \lambda\vec{w}$
3. (2 points) Write down the matrix equation that $\vec{w}^*$ should satisfy, where:
$\vec{w}^* = \arg\min_{\vec{w}} \varepsilon(\vec{w}, D)$
Your equation should only involve $A$, $\vec{b}$, $\lambda$, $N$, and $\vec{w}^*$.
4. (3 points) Prove that all eigenvalues of A are non-negative.
5. (3 points) Demonstrate that matrix $A + \lambda N I_d$ is invertible by proving that none of its
eigenvalues are zero. Here, $I_d$ is the identity matrix of dimension $d$.
6. (2 points) Using the invertibility of matrix $A + \lambda N I_d$, solve the equation stated in
question 3 and deduce an analytical solution for $\vec{w}^*$. You’ve obtained a linear regression
model regularized with an $\ell_2$ penalty.
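The claims in questions 2 to 6 can be sanity-checked numerically. The sketch below uses randomly generated data (an assumption, not part of the assignment) and stores each input as a row of `X`, so that `X.T @ X` equals $A$ and `X.T @ t` equals $\vec{b}$. It solves the linear system with matrix $A + \lambda N I_d$ and then evaluates the gradient expression from question 2 at the result. This is a numerical check only and does not replace the proofs the questions ask for.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, lam = 50, 3, 0.1

# Synthetic inputs (one per row) and scalar targets, for illustration only.
X = rng.normal(size=(N, d))
t = rng.normal(size=N)

A = X.T @ X   # sum over i of x_i x_i^T, shape (d, d)
b = X.T @ t   # sum over i of t_i x_i, shape (d,)

# A is symmetric, so eigvalsh applies; its eigenvalues should be >= 0
# up to floating-point error.
eig_A = np.linalg.eigvalsh(A)
eig_shifted = np.linalg.eigvalsh(A + lam * N * np.eye(d))

# Candidate minimizer: solve (A + lam * N * I_d) w = b.
w_star = np.linalg.solve(A + lam * N * np.eye(d), b)

# Gradient from question 2 evaluated at w_star; it should be close to zero.
grad = (A @ w_star - b) / N + lam * w_star

print("smallest eigenvalue of A:", eig_A.min())
print("gradient norm at w_star:", np.linalg.norm(grad))
```

Adding $\lambda N I_d$ shifts every eigenvalue of $A$ up by $\lambda N$, which is what the comparison of `eig_A` and `eig_shifted` illustrates.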
