ECE421 Assignment 2 solved

Problem 1 Assume we collected a dataset $D = \{(x^{(i)}, t^{(i)})\}_{i \in 1..7}$ of $N = 7$ points (i.e., observations) with inputs $\{x^{(i)}\}_{i \in 1..7} = (1, 2, 3, 4, 5, 6, 7)$ and outputs $\{t^{(i)}\}_{i \in 1..7} = (6, 4, 2, 1, 3, 6, 10)$ for a regression problem with both scalar inputs and outputs.

1. (1 point) Draw a scatter plot of the dataset using spreadsheet software (e.g., Excel).

2. (6 points) Let us use a linear regression model $g_{w,b}(x) = wx + b$ to model this data. Write down the analytical expression of the least squares loss (covered in Video 6) of this model on dataset $D$. Your loss should take the form of
\[
\frac{1}{2N} \sum_{i \in 1..N} \left( A_i w^2 + B_i b^2 + C_i wb + D_i w + E_i b + F_i \right)
\]
where $A_i$, $B_i$, $C_i$, $D_i$, $E_i$, and $F_i$ are expressed only as functions of $x^{(i)}$ and $t^{(i)}$ or constants. Do not fill in any numerical values yet. (A worked sketch of the expansion follows this list.)

3. (4 points) Derive the analytical expressions of $w$ and $b$ by minimizing the least squares loss from the previous question. Your expressions for the parameters $w$ and $b$ should only depend on $A = \sum_i A_i$, $B = \sum_i B_i$, $C = \sum_i C_i$, $D = \sum_i D_i$, and $E = \sum_i E_i$. Do not fill in any numerical values yet.

4. (2 points) Give approximate numerical values for $w$ and $b$ by plugging in numerical values from the dataset $D$.

5. (0 points) Double-check your solution against the scatter plot from question 1: e.g., you can use Excel to find numerical values of $w$ and $b$. You do not need to hand in anything here; this is just for you to verify that you obtained the correct solution in the previous questions. (A NumPy sketch follows this list as an alternative check.)
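
One route through questions 2 and 3, offered as a sketch to check your own algebra against rather than as the graded solution: expanding the squared residual of the least squares loss gives
\[
\left(wx^{(i)} + b - t^{(i)}\right)^2 = \underbrace{(x^{(i)})^2}_{A_i} w^2 + \underbrace{1}_{B_i}\, b^2 + \underbrace{2x^{(i)}}_{C_i}\, wb + \underbrace{(-2x^{(i)} t^{(i)})}_{D_i}\, w + \underbrace{(-2t^{(i)})}_{E_i}\, b + \underbrace{(t^{(i)})^2}_{F_i},
\]
and setting the partial derivatives of the summed loss to zero yields the linear system $2Aw + Cb = -D$ and $Cw + 2Bb = -E$, whose solution is $w = (CE - 2BD)/(4AB - C^2)$ and $b = (CD - 2AE)/(4AB - C^2)$.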
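
For question 5, a minimal NumPy alternative to the Excel check; np.polyfit is just one convenient fitting routine here, nothing required by the assignment:

    # Verification aid for question 5 (nothing to hand in): fit the dataset
    # with NumPy and compare against your analytical w and b.
    import numpy as np

    x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
    t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

    # np.polyfit with degree 1 returns the least squares (slope, intercept).
    w, b = np.polyfit(x, t, 1)
    print(f"w = {w:.4f}, b = {b:.4f}")  # should match your question 4 values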

Problem 2 The goal of this problem is to revisit Problem 1, solving it with a different technique known as the method of least squares. This will serve as a “warm-up” for Problem 3. In the rest of this problem, any reference to a dataset refers to the dataset described in Problem 1.

1. (1 point) Verify that one can rewrite the linear regression model $g_{w,b}(x) = wx + b$ in the simpler form
\[
g_{\vec{w}}(\vec{x}) = \vec{x}\vec{w}
\]
if one assumes each input $\vec{x}$ is a two-dimensional row vector such that a point in our dataset is now $\vec{x}^{(i)} = (x^{(i)}, 1)$, where $x^{(i)}$ is the scalar input described in Problem 1. Write the components of the new column vector $\vec{w}$ as a function of $w$ and $b$ from Problem 1.

2. (4 points) Derive analytically $\nabla_{\vec{w}} \|X\vec{w} - \vec{t}\|^2$, where $X$ is an $N \times 2$ matrix such that each row of $X$ is a vector $\vec{x}^{(i)}$ described in the previous question, and $\vec{t} = \{t^{(i)}\}_{i \in 1..7}$. (A sketch of the standard expansion follows this list.)

3. (1 point) Conclude that the model’s weight value $\vec{w}^*$ which minimizes the least squares loss (covered in Video 6) must satisfy
\[
2X^\top X \vec{w}^* - 2X^\top \vec{t} = 0.
\]

4. (1 point) Assuming that $X^\top X$ is invertible, derive analytically the value of $\vec{w}^*$.

5. (0 points) Using NumPy, implement the solution you found in the previous question and verify that you obtain the same results for $w$ and $b$ as in Problem 1. You do not need to hand in anything here; this is just a way for you to verify that you obtained the correct solution in the previous questions. (A minimal sketch follows this list.)
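
For question 2, a standard expansion worth recalling (a sketch of the route, not the full derivation to hand in): since $\|X\vec{w} - \vec{t}\|^2 = \vec{w}^\top X^\top X \vec{w} - 2\vec{t}^\top X \vec{w} + \vec{t}^\top \vec{t}$, the matrix-calculus identities $\nabla_{\vec{w}}(\vec{w}^\top M \vec{w}) = 2M\vec{w}$ (for symmetric $M$) and $\nabla_{\vec{w}}(\vec{c}^\top \vec{w}) = \vec{c}$ give $\nabla_{\vec{w}} \|X\vec{w} - \vec{t}\|^2 = 2X^\top X \vec{w} - 2X^\top \vec{t}$, which is exactly the expression question 3 asks you to set to zero.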
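
For question 5, a minimal sketch under the row-vector convention of question 1 (the variable names are illustration choices, not the assignment’s):

    # Solve the normal equations from question 4, (X^T X) w* = X^T t,
    # for the Problem 1 dataset.
    import numpy as np

    x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
    t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

    # Each row of X is (x^(i), 1); the second component of w* plays the role of b.
    X = np.stack([x, np.ones_like(x)], axis=1)

    # np.linalg.solve is preferred over forming an explicit matrix inverse.
    w_star = np.linalg.solve(X.T @ X, X.T @ t)
    print(w_star)  # expect w_star[0] = w and w_star[1] = b from Problem 1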

Problem 3 Let us now assume that $D$ is a dataset with $d$ features per input and $N > 0$ inputs. We have $D = \{((x^{(i)}_j)_{j \in 1..d}, t^{(i)})\}_{i \in 1..N}$. In other words, each $\vec{x}^{(i)}$ is a column vector with $d$ components indexed by $j$, such that $x^{(i)}_j$ is the $j$th component of $\vec{x}^{(i)}$. The output $t^{(i)}$ remains a scalar (real value).

Let us assume for simplicity that we have a simplified linear regression model, as presented in Question 1 of Problem 2. We would like to train a regularized linear regression model, where the mean squared loss is augmented with an $\ell_2$ regularization penalty $\frac{1}{2}\|\vec{w}\|_2^2$ on the weight parameter $\vec{w}$:
\[
\varepsilon(\vec{w}, D) = \frac{1}{2N} \sum_{i \in 1..N} \left( g_{\vec{w}}(\vec{x}^{(i)}) - t^{(i)} \right)^2 + \frac{\lambda}{2} \|\vec{w}\|_2^2
\]
where $\lambda > 0$ is a hyperparameter that controls how much importance is given to the penalty.

1. (3 points) Let $A = \sum_{i \in 1..N} \vec{x}^{(i)} \vec{x}^{(i)\top}$. Give a simple analytical expression for the components of $A$. (See the sketch after this list.)

2. (6 points) Let us write $\vec{b} = \sum_{i \in 1..N} t^{(i)} \vec{x}^{(i)}$. Prove that the following holds:
\[
\nabla \varepsilon(\vec{w}, D) = \frac{1}{N} \left( A\vec{w} - \vec{b} \right) + \lambda \vec{w}
\]

3. (2 points) Write down the matrix equation that $\vec{w}^*$ should satisfy, where
\[
\vec{w}^* = \arg\min_{\vec{w}}\; \varepsilon(\vec{w}, D).
\]
Your equation should only involve $A$, $\vec{b}$, $\lambda$, $N$, and $\vec{w}^*$.

4. (3 points) Prove that all eigenvalues of $A$ are non-negative. (The sketch after this list outlines one possible argument.)

5. (3 points) Demonstrate that the matrix $A + \lambda N I_d$ is invertible by proving that none of its eigenvalues are zero. Here, $I_d$ is the identity matrix of dimension $d$.

6. (2 points) Using the invertibility of the matrix $A + \lambda N I_d$, solve the equation stated in question 3 and deduce an analytical solution for $\vec{w}^*$. You have obtained a linear regression model regularized with an $\ell_2$ penalty. (A NumPy sketch follows this list.)
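
A sketch for questions 1 and 4, to check your reasoning against (using the outer-product reading of $A$ above): writing the sum of outer products componentwise gives
\[
(A)_{jk} = \sum_{i \in 1..N} x^{(i)}_j x^{(i)}_k,
\]
and for any $\vec{v} \in \mathbb{R}^d$ we have $\vec{v}^\top A \vec{v} = \sum_{i} (\vec{x}^{(i)\top} \vec{v})^2 \ge 0$, so any eigenvalue $\mu$ of $A$ with eigenvector $\vec{v} \neq 0$ satisfies $\mu = \vec{v}^\top A \vec{v} / \|\vec{v}\|^2 \ge 0$.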
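
For question 6, a minimal NumPy sketch of the resulting closed form $\vec{w}^* = (A + \lambda N I_d)^{-1} \vec{b}$ on synthetic data; the dimensions, seed, and variable names are arbitrary illustration choices:

    # Closed-form L2-regularized (ridge) solution on synthetic data.
    import numpy as np

    rng = np.random.default_rng(0)
    N, d, lam = 50, 3, 0.1            # lam stands in for the hyperparameter λ
    X = rng.normal(size=(N, d))       # row i holds the d components of x^(i)
    t = rng.normal(size=N)

    A = X.T @ X                       # equals sum_i x^(i) x^(i)^T
    b = X.T @ t                       # equals sum_i t^(i) x^(i)
    w_star = np.linalg.solve(A + lam * N * np.eye(d), b)
    print(w_star)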
