## Description

**Problem 1.** Assume we collected a dataset $D = \{(x^{(i)}, t^{(i)})\}_{i \in 1..7}$ of $N = 7$ points (i.e., observations) with inputs $\{x^{(i)}\}_{i \in 1..7} = (1, 2, 3, 4, 5, 6, 7)$ and outputs $\{t^{(i)}\}_{i \in 1..7} = (6, 4, 2, 1, 3, 6, 10)$ for a regression problem with both scalar inputs and outputs.

1. (1 point) Draw a scatter plot of the dataset using spreadsheet software (e.g., Excel).

2. (6 points) Let us use a linear regression model $g_{w,b}(x) = wx + b$ to model this data. Write down the analytical expression of the least squares loss (covered in Video 6) of this model on dataset $D$. Your loss should take the form

   $$\frac{1}{2N} \sum_{i \in 1..N} \left( A_i w^2 + B_i b^2 + C_i wb + D_i w + E_i b + F_i \right)$$

   where $A_i$, $B_i$, $C_i$, $D_i$, $E_i$, and $F_i$ are expressed only as functions of $x^{(i)}$ and $t^{(i)}$ or constants. Do not fill in any numerical values yet.

3. (4 points) Derive the analytical expressions of $w$ and $b$ by minimizing the mean squared loss from the previous question. Your expressions for parameters $w$ and $b$ should only depend on $A = \sum_i A_i$, $B = \sum_i B_i$, $C = \sum_i C_i$, $D = \sum_i D_i$, and $E = \sum_i E_i$. Do not fill in any numerical values yet.

4. (2 points) Give approximate numerical values for $w$ and $b$ by plugging in numerical values from the dataset $D$.

Introduction to ML – Fall 2022 Assignment 2 – Page 2 of 3 Oct 6

5. (0 points) Double-check your solution against the scatter plot from the earlier question: e.g., you can use Excel to find numerical values of $w$ and $b$. You do not need to hand in anything here; this is just for you to verify that you obtained the correct solution in the previous questions.
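Question 5 suggests verifying the fit in Excel; an equivalent check in NumPy is sketched below (this is only an illustration of the verification step, not part of the required hand-in):

```python
import numpy as np

# Dataset from Problem 1 (N = 7 scalar input/output pairs).
x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

# np.polyfit with deg=1 returns the least-squares slope and intercept,
# i.e. the (w, b) minimizing the loss from Question 2.
w, b = np.polyfit(x, t, deg=1)
print(w, b)  # w ≈ 0.607 (= 17/28), b ≈ 2.143 (= 15/7)
```

These values should match what your analytical expressions from Question 3 give once the dataset's numbers are plugged in.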

**Problem 2.** The goal of this problem is to revisit Problem 1, solving it with a different technique known as the method of least squares. This will serve as a “warm-up” to Problem 3. In the rest of this problem, any reference to a dataset refers to the dataset described in Problem 1.

1. (1 point) Verify that one can rewrite the linear regression model $g_{w,b}(x) = wx + b$ in the simpler form

   $$g_{\vec{w}}(\vec{x}) = \vec{x}\,\vec{w}$$

   if one assumes each input $\vec{x}$ is a two-dimensional row vector, such that a point in our dataset is now $\vec{x}^{(i)} = (x^{(i)}, 1)$, where $x^{(i)}$ is the scalar input described in Problem 1. Write the components of the new column vector $\vec{w}$ as a function of $w$ and $b$ from Problem 1.

2. (4 points) Derive analytically $\nabla_{\vec{w}} \|X\vec{w} - \vec{t}\|^2$, where $X$ is an $N \times 2$ matrix such that each row of $X$ is a vector $\vec{x}^{(i)}$ as described in the previous question, and $\vec{t} = \{t^{(i)}\}_{i \in 1..7}$.

3. (1 point) Conclude that the model's weight value $\vec{w}^*$ which minimizes the least squares loss (covered in Video 6) must satisfy

   $$2X^\top X \vec{w}^* - 2X^\top \vec{t} = 0$$

4. (1 point) Assuming that $X^\top X$ is invertible, derive analytically the value of $\vec{w}^*$.

5. (0 points) Using NumPy, implement the solution you found in the previous question and verify that you obtain the same results for $w$ and $b$ as in Problem 1. You do not need to hand in anything here; this is just a way for you to verify that you obtained the correct solution in the previous questions.
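A minimal NumPy sketch of this verification, assuming the design-matrix layout from Question 1 (each row of $X$ is $(x^{(i)}, 1)$, so the two components of $\vec{w}^*$ are $w$ and $b$):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

# Build the N x 2 design matrix: row i is (x^(i), 1).
X = np.column_stack([x, np.ones_like(x)])

# Solve the normal equation X^T X w* = X^T t from Question 3.
# np.linalg.solve is preferred over forming the explicit inverse.
w_vec = np.linalg.solve(X.T @ X, X.T @ t)
print(w_vec)  # components ≈ [0.607, 2.143], i.e. (w, b) from Problem 1
```

`np.linalg.lstsq(X, t, rcond=None)` would give the same solution without explicitly forming $X^\top X$.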

**Problem 3.** Let us now assume that $D$ is a dataset with $d$ features per input and $N > 0$ inputs. We have $D = \{((x^{(i)}_j)_{j \in 1..d}, t^{(i)})\}_{i \in 1..N}$. In other words, each $\vec{x}^{(i)}$ is a column vector with $d$ components indexed by $j$, such that $x^{(i)}_j$ is the $j$-th component of $\vec{x}^{(i)}$. The output $t^{(i)}$ remains a scalar (real value).

Let us assume for simplicity that we have a simplified linear regression model, as presented in Question 1 of Problem 2. We would like to train a regularized linear regression model, where the mean squared loss is augmented with an $\ell_2$ regularization penalty $\frac{1}{2}\|\vec{w}\|_2^2$ on the weight parameter $\vec{w}$:

$$\varepsilon(\vec{w}, D) = \frac{1}{2N} \sum_{i \in 1..N} \left( g_{\vec{w}}(\vec{x}^{(i)}) - t^{(i)} \right)^2 + \frac{\lambda}{2} \|\vec{w}\|_2^2$$

where $\lambda > 0$ is a hyperparameter that controls how much importance is given to the penalty.
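As a sanity check, the regularized loss above can be evaluated numerically. A minimal sketch, assuming the simplified model $g_{\vec{w}}(\vec{x}) = \vec{x}\,\vec{w}$ from Problem 2 with the inputs stacked as rows of a matrix `X` (the function name `eps` is my own, not part of the assignment):

```python
import numpy as np

def eps(w, X, t, lam):
    """Regularized mean squared loss from Problem 3.

    X   : (N, d) matrix whose i-th row is x^(i)
    w   : (d,) weight vector
    t   : (N,) targets
    lam : regularization strength (lambda > 0)
    """
    N = X.shape[0]
    residuals = X @ w - t  # g_w(x^(i)) - t^(i) for every i
    return (residuals @ residuals) / (2 * N) + (lam / 2) * (w @ w)

# Tiny hand-checkable example: residuals are (1, 2).
X = np.array([[1.0, 1.0], [2.0, 1.0]])
t = np.array([0.0, 0.0])
w = np.array([1.0, 0.0])
print(eps(w, X, t, lam=0.0))  # (1 + 4) / (2*2) = 1.25
print(eps(w, X, t, lam=2.0))  # 1.25 + (2/2)*1 = 2.25
```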


1. (3 points) Let $A = \sum_{i \in 1..N} \vec{x}^{(i)} \vec{x}^{(i)\top}$. Give a simple analytical expression for the components of $A$.

2. (6 points) Let us write $\vec{b} = \sum_{i \in 1..N} t^{(i)} \vec{x}^{(i)}$. Prove that the following holds:

   $$\nabla \varepsilon(\vec{w}, D) = \frac{1}{N} \left( A\vec{w} - \vec{b} \right) + \lambda \vec{w}$$

3. (2 points) Write down the matrix equation that $\vec{w}^*$ should satisfy, where:

   $$\vec{w}^* = \arg\min_{\vec{w}} \varepsilon(\vec{w}, D)$$

   Your equation should only involve $A$, $\vec{b}$, $\lambda$, $N$, and $\vec{w}^*$.

4. (3 points) Prove that all eigenvalues of A are non-negative.

5. (3 points) Demonstrate that the matrix $A + \lambda N I_d$ is invertible by proving that none of its eigenvalues are zero. Here, $I_d$ is the identity matrix of dimension $d$.

6. (2 points) Using the invertibility of matrix $A + \lambda N I_d$, solve the equation stated in Question 3 and deduce an analytical solution for $\vec{w}^*$. You've obtained a linear regression model regularized with an $\ell_2$ penalty.
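A NumPy sketch of the resulting closed-form solution, using synthetic data and assuming the $\vec{x}^{(i)}$ are stacked as the rows of a matrix $X$ (so that $A = X^\top X$ and $\vec{b} = X^\top \vec{t}$):

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, lam = 50, 3, 0.1
X = rng.normal(size=(N, d))  # row i is x^(i)
t = rng.normal(size=N)

# A = sum_i x^(i) x^(i)^T  and  b = sum_i t^(i) x^(i), as defined above.
A = X.T @ X
b = X.T @ t

# Setting the gradient from Question 2 to zero gives
# (A + lam * N * I_d) w* = b, which is invertible by Question 5.
w_star = np.linalg.solve(A + lam * N * np.eye(d), b)

# Stationarity check: the gradient should vanish at w*.
grad = (A @ w_star - b) / N + lam * w_star
print(np.max(np.abs(grad)))  # ≈ 0
```

The penalty makes $A + \lambda N I_d$ strictly positive definite, which is why no invertibility assumption on $X^\top X$ is needed here, unlike in Problem 2.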

∗ ∗ ∗