## Description

Problem 1 [25%]

It is mentioned in Chapter 7 of ISL that a cubic regression spline with one knot at $\xi$ can be obtained using a basis of the form $x$, $x^2$, $x^3$, $[x - \xi]^3_+$, where $[x - \xi]^3_+ = (x - \xi)^3$ if $x > \xi$ and equals $0$ otherwise. We will now show that a function of the form

$$f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 [x - \xi]^3_+$$

is indeed a cubic regression spline, regardless of the values of $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$.

1. Find a cubic polynomial

   $$f_1(x) = a_1 + b_1 x + c_1 x^2 + d_1 x^3$$

   such that $f(x) = f_1(x)$ for all $x \le \xi$. Express $a_1, b_1, c_1, d_1$ in terms of $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$.

2. Find a cubic polynomial

   $$f_2(x) = a_2 + b_2 x + c_2 x^2 + d_2 x^3$$

   such that $f(x) = f_2(x)$ for all $x > \xi$. Express $a_2, b_2, c_2, d_2$ in terms of $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$. We have now established that $f(x)$ is a piecewise polynomial.

3. Show that $f_1(\xi) = f_2(\xi)$. That is, $f(x)$ is continuous at $\xi$.
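A quick numerical sanity check can complement the algebra. The sketch below evaluates f(x) directly from the truncated power basis and compares values just to the left and right of the knot. The coefficients, the knot location, and the helper names `truncated_cubic` and `f` are hypothetical, chosen only for illustration; the check does not replace the derivation.

```python
def truncated_cubic(x, xi):
    """Return [x - xi]^3_+ : (x - xi)^3 when x > xi, and 0 otherwise."""
    return (x - xi) ** 3 if x > xi else 0.0


def f(x, beta, xi):
    """Evaluate b0 + b1*x + b2*x^2 + b3*x^3 + b4*[x - xi]^3_+."""
    b0, b1, b2, b3, b4 = beta
    return b0 + b1 * x + b2 * x**2 + b3 * x**3 + b4 * truncated_cubic(x, xi)


# Hypothetical coefficients and knot, not taken from the assignment.
beta = (1.0, -2.0, 0.5, 0.3, 4.0)
xi = 1.5

# Approach the knot from both sides; for a continuous function the two
# values differ only by an amount proportional to the step size h.
h = 1e-6
left = f(xi - h, beta, xi)
right = f(xi + h, beta, xi)
gap = abs(right - left)
print(gap)
```

Shrinking `h` should shrink `gap` proportionally, which is the numerical signature of continuity at the knot.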

Problem 2 [25%]

Apply the linear, cubic, and natural regression splines investigated in Chapter 7 of ISL to the Auto data set. Is there evidence for non-linear relationships in this data set? Create some informative plots to justify your answer.
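One possible starting point, sketched under stated assumptions: the code below builds truncated power bases for linear and cubic regression splines with NumPy and compares their residual sums of squares against a straight-line fit. The data are synthetic stand-ins (not the Auto data set), the knots are placed at quartiles as an arbitrary choice, and the helpers `spline_basis` and `rss` are hypothetical names. Natural splines and plotting are left out of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one predictor/response pair.
# These values are made up; substitute real columns from the Auto data.
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = 40.0 * np.exp(-2.0 * x) + rng.normal(0.0, 1.0, x.size)


def spline_basis(x, knots, degree):
    """Truncated power basis: 1, x, ..., x^degree, then [x - k]^degree_+ per knot."""
    cols = [x**d for d in range(degree + 1)]
    cols += [np.clip(x - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)


def rss(design, y):
    """Residual sum of squares of the least-squares fit of y on the design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.sum((y - design @ coef) ** 2))


knots = np.quantile(x, [0.25, 0.5, 0.75])  # arbitrary knot placement
rss_line = rss(spline_basis(x, [], 1), y)  # plain straight line, no knots
rss_linear_spline = rss(spline_basis(x, knots, 1), y)
rss_cubic_spline = rss(spline_basis(x, knots, 3), y)
print(rss_line, rss_linear_spline, rss_cubic_spline)
```

A large drop in residual sum of squares from the straight line to the spline fits hints at non-linearity, although a fair comparison should also account for the extra degrees of freedom, for example through cross-validation or an F-test.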

Problem 3 [25%]

You will now derive the Bayesian connection to the lasso as discussed in Section 6.2.2 of ISL.

1. Suppose that $y_i = \beta_0 + \sum_{j=1}^{p} x_{ij}\beta_j + \epsilon_i$, where $\epsilon_1, \ldots, \epsilon_n$ are independent and identically distributed from a normal distribution $N(0, 1)$. Write out the likelihood for the data as a function of the values of $\beta$.

2. Assume that the prior for $\beta$ is that $\beta_1, \ldots, \beta_p$ are independent and identically distributed according to a Laplace distribution with mean zero and variance $c$. Write out the posterior for $\beta$ in this setting using Bayes' theorem.

3. Argue that the lasso estimate is the value of $\beta$ with maximal probability under this posterior distribution. Compute the log of the probability in order to make this point. Hint: The denominator (= the probability of the data) can be ignored in computing the maximum probability.

4. Suppose that $\epsilon_1, \ldots, \epsilon_n$ are independent and identically distributed according to the Laplace distribution. What are the maximum likelihood/MAP estimates of $\beta_i$ under this assumption? Hint: See https://en.wikipedia.org/wiki/Least_absolute_deviations
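The link between the posterior in parts 1 to 3 and the lasso criterion can be checked numerically. The following is a minimal sketch, not a derivation: it assumes $N(0, 1)$ errors, a Laplace prior with variance $c$ (scale $b = \sqrt{c/2}$) on each $\beta_j$, and a flat prior on the intercept $\beta_0$, which the problem statement does not specify. The function names and the random data are hypothetical. If the algebra is right, twice the negative log posterior and the lasso criterion with $\lambda = 2/b$ should differ only by a constant that does not depend on $\beta$.

```python
import math
import random


def residuals(beta0, beta, X, y):
    """y_i - beta0 - sum_j x_ij * beta_j for every observation."""
    return [yi - beta0 - sum(xij * bj for xij, bj in zip(row, beta))
            for row, yi in zip(X, y)]


def neg_log_posterior(beta0, beta, X, y, c):
    """Negative log of (likelihood * prior), dropping the evidence term.

    Errors are N(0, 1); each beta_j has a Laplace prior with mean 0 and
    variance c, i.e. scale b = sqrt(c / 2). The intercept beta0 gets a
    flat prior, which is an assumption of this sketch.
    """
    b = math.sqrt(c / 2.0)
    n, p = len(y), len(beta)
    nll = 0.5 * n * math.log(2 * math.pi) + 0.5 * sum(
        r * r for r in residuals(beta0, beta, X, y))
    nlp = p * math.log(2 * b) + sum(abs(bj) for bj in beta) / b
    return nll + nlp


def lasso_objective(beta0, beta, X, y, lam):
    """Lasso criterion: RSS + lam * sum_j |beta_j|."""
    rss = sum(r * r for r in residuals(beta0, beta, X, y))
    return rss + lam * sum(abs(bj) for bj in beta)


# Made-up data and prior variance, used only to exercise the functions.
random.seed(0)
n, p, c = 20, 3, 0.5
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
lam = 2.0 / math.sqrt(c / 2.0)

# Evaluate the difference at several random coefficient vectors.
offsets = []
for _ in range(5):
    beta0 = random.gauss(0, 1)
    beta = [random.gauss(0, 1) for _ in range(p)]
    offsets.append(2 * neg_log_posterior(beta0, beta, X, y, c)
                   - lasso_objective(beta0, beta, X, y, lam))
spread = max(offsets) - min(offsets)
print(spread)
```

A `spread` at the level of floating-point rounding indicates that the two objectives share the same minimizer, so the posterior mode coincides with the lasso estimate for that value of $\lambda$.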


Problem 4 [25%]

Based on a true story, according to The Drunkard’s Walk: How Randomness Rules Our Lives by Leonard Mlodinow.

Suppose that you applied for life insurance and underwent a physical exam. The bad news is that your application was rejected because you tested positive for HIV. The test’s sensitivity is 99.7% and specificity is 98.5% [https://en.wikipedia.org/wiki/Diagnosis_of_HIV/AIDS#Accuracy_of_HIV_testing]. However, after studying the CDC website, you find that in your ethnic group (age, gender, race, ...) only one in 10,000 people is infected. What is the probability that you actually have HIV?
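The calculation is a direct application of Bayes' theorem, which can be sketched as follows. The function name `posterior_positive` is hypothetical; the inputs are the sensitivity, specificity, and base rate quoted in the problem, and the sketch assumes the base rate is the right prior for the person tested.

```python
def posterior_positive(prior, sensitivity, specificity):
    """P(condition | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prior                   # P(positive and infected)
    false_pos = (1.0 - specificity) * (1.0 - prior)  # P(positive and not infected)
    return true_pos / (true_pos + false_pos)


p = posterior_positive(prior=1 / 10_000, sensitivity=0.997, specificity=0.985)
print(f"{p:.4f}")
```

Because the base rate is so low, false positives from the large uninfected group far outnumber true positives, so the resulting probability stays well below 1%.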
