## Description

1. In a 2-class problem with 1 feature, you are given the following data points:

   $S_1$: $-3,\ -2,\ 0$
   $S_2$: $-1,\ 2$

(a) Give the $k$-nearest neighbor estimate of $p(x \mid S_1)$ with $k = 3$, for all $x$. Give both the algebraic (simplest) form and a plot. On the plot, show the location ($x$ value) and height of each peak.
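
Tip: a short numerical sketch of the 1D $k$-NN density estimate, $\hat{p}(x) = k/(N \cdot V(x))$ with $V(x)$ twice the distance from $x$ to its $k$-th nearest data point, can be used to spot-check the peak locations and heights of your algebraic answer (the function and variable names below are illustrative, not required):

```python
import numpy as np

def knn_density_1d(x, data, k):
    """k-NN density estimate: p_hat(x) = k / (N * V), where V = 2 * r_k
    and r_k is the distance from x to its k-th nearest data point."""
    data = np.asarray(data, dtype=float)
    r_k = np.sort(np.abs(data - x))[k - 1]  # distance to the k-th nearest neighbor
    return k / (len(data) * 2.0 * r_k)

S1 = [-3.0, -2.0, 0.0]
for x in np.linspace(-4.0, 1.0, 11):
    print(f"x = {x:+.1f}   p_hat(x|S1) = {knn_density_1d(x, S1, k=3):.4f}")
```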

(b) Give the Parzen windows estimate of $p(x \mid S_2)$ with window function:

$$\Delta(u) = \begin{cases} 0.25, & -2 \le u < 2 \\ 0, & \text{otherwise.} \end{cases}$$

Give both the algebraic (simplest) form and a plot. On the plot, show the $x$ value of all significant points, and label the values of $p(x \mid S_2)$ clearly.
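
Tip: since the given window integrates to 1, the Parzen estimate is the average of shifted windows, $\hat{p}(x \mid S_2) = \frac{1}{N}\sum_i \Delta(x - x_i)$; a minimal sketch for spot-checking values on your plot (names illustrative):

```python
import numpy as np

def window(u):
    """The given window: 0.25 on [-2, 2), zero elsewhere (integrates to 1)."""
    u = np.asarray(u, dtype=float)
    return np.where((u >= -2.0) & (u < 2.0), 0.25, 0.0)

def parzen_density(x, data):
    """Parzen windows estimate: p_hat(x) = (1/N) * sum_i window(x - x_i)."""
    return float(np.mean(window(x - np.asarray(data, dtype=float))))

S2 = [-1.0, 2.0]
for x in [-3.5, -1.0, 0.5, 2.0, 4.5]:
    print(f"x = {x:+.1f}   p_hat(x|S2) = {parzen_density(x, S2):.3f}")
```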

(c) Estimate the prior probabilities $P(S_1)$ and $P(S_2)$ from the frequency of occurrence of the data points.

(d) Give an expression for the decision rule for a Bayes minimum error classifier using the density and probability estimates from (a)-(c). You may leave your answer in terms of $\hat{p}(x \mid S_1)$, $\hat{p}(x \mid S_2)$, $\hat{P}(S_1)$, $\hat{P}(S_2)$, without plugging in for these quantities.
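
Tip: in code, frequency-based priors are counts over the total, and the minimum-error rule compares the two scores $\hat{p}(x \mid S_i)\,\hat{P}(S_i)$; a sketch, where `p1` and `p2` stand for your density estimates from (a) and (b):

```python
# Priors from frequency of occurrence: n_i / (n_1 + n_2).
n1, n2 = 3, 2                            # |S1| = 3 points, |S2| = 2 points
P1, P2 = n1 / (n1 + n2), n2 / (n1 + n2)

def decide(x, p1, p2):
    """Bayes minimum-error rule: pick the class with the larger score
    p_hat(x|S_i) * P_hat(S_i); ties broken toward S1 here."""
    return "S1" if p1(x) * P1 >= p2(x) * P2 else "S2"
```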

(e) Using the estimates you have made above, solve for the decision boundaries and regions of a Bayes minimum error classifier using the density and probability estimates from (a)-(c). Give your answer in two forms:

(i) algebraic expressions of the decision rule, in simplest form (using numbers and the variable $x$);

(ii) a plot showing the decision boundaries and regions.

Tip: you may find it easiest to develop the algebraic solution and the plot at the same time.
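
A related numerical check: the algebraic boundaries should match the sign changes of the discriminant $g(x) = \hat{p}(x \mid S_1)\hat{P}(S_1) - \hat{p}(x \mid S_2)\hat{P}(S_2)$ on a fine grid; a sketch (assumes `g` wraps the estimates above):

```python
import numpy as np

def boundary_scan(g, lo=-5.0, hi=5.0, n=100001):
    """Approximate decision boundaries as the grid points where the
    discriminant g(x) changes sign (numerical check only)."""
    xs = np.linspace(lo, hi, n)
    gs = np.sign([g(x) for x in xs])
    return xs[np.nonzero(np.diff(gs) != 0)[0]]
```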

(f) Classify the points $x = -0.5,\ 0.1,\ 0.5$ using the classifier you developed in (e).

(g) Separately, use a discriminative 3-NN classifier to classify the points $x = -0.5,\ 0.1,\ 0.5$. (Hint: if this takes you more than a few steps for each data point, you are doing more work than necessary.)
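
Tip: a few lines of code can confirm your hand-worked votes; a minimal sketch of the discriminative 3-NN vote over the pooled training points (labels and names illustrative):

```python
import numpy as np

def knn_classify(x, k=3):
    """Discriminative k-NN: majority vote among the k nearest training
    points; labels are +1 for S1 and -1 for S2."""
    pts = np.array([-3.0, -2.0, 0.0, -1.0, 2.0])   # S1 points, then S2 points
    lab = np.array([+1, +1, +1, -1, -1])
    nearest = np.argsort(np.abs(pts - x))[:k]
    return "S1" if lab[nearest].sum() > 0 else "S2"

for x in [-0.5, 0.1, 0.5]:
    print(f"x = {x:+.1f} -> {knn_classify(x)}")
```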

2. [Comment: this problem is on parameter estimation, which is covered in Lecture 26 on
Monday, 4/27.]
In a 1D problem (1 feature), we will estimate parameters for one class. We model the density $p(x \mid \theta)$ as:

$$p(x \mid \theta) = \begin{cases} \theta e^{-\theta x}, & x \ge 0 \\ 0, & \text{otherwise,} \end{cases}$$

in which $\theta \ge 0$.

You are given a dataset $Z: x_1, x_2, \ldots, x_N$, whose points are drawn i.i.d. from $p(x \mid \theta)$.

In this problem, you may use for convenience the notation:

$$m \triangleq \frac{1}{N} \sum_{i=1}^{N} x_i .$$
(a) Solve for the maximum likelihood (ML) estimate $\hat{\theta}_{ML}$ of $\theta$, in terms of the given data points. Express your result in simplest form.
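
(A quick numerical sanity check for the closed form: maximize the log-likelihood on a grid. The data values below are made up for illustration.)

```python
import numpy as np

def log_likelihood(theta, xs):
    """Log-likelihood of i.i.d. samples under p(x|theta) = theta * exp(-theta * x)."""
    return len(xs) * np.log(theta) - theta * np.sum(xs)

xs = np.array([0.4, 1.1, 2.3, 0.7])        # made-up nonnegative samples
thetas = np.linspace(1e-3, 10.0, 100000)   # search grid over theta
theta_ml_numeric = thetas[np.argmax(log_likelihood(thetas, xs))]
print(theta_ml_numeric)                    # compare with your algebraic answer
```
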
For parts (b) and (c) below, assume there is a prior for $\theta$, as follows:

$$p(\theta) = \begin{cases} a e^{-a\theta}, & \theta \ge 0 \\ 0, & \text{otherwise,} \end{cases}$$

in which $a \ge 0$.
(b) Solve for the maximum a posteriori (MAP) estimate $\hat{\theta}_{MAP}$ of $\theta$, in terms of the given data points. Express your result in simplest form.
(c) Write $\hat{\theta}_{MAP}$ as a function of $\hat{\theta}_{ML}$ and the given parameters. Find $\lim_{\sigma_\theta \to \infty} \hat{\theta}_{MAP}$, in which $\sigma_\theta$ is the standard deviation of the prior on $\theta$. What does this limit correspond to in terms of our prior knowledge of $\theta$?

Hint: the standard deviation of $\theta$ for the given $p(\theta)$ is $\sigma_\theta = 1/a$.
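
For reference, the hint is the standard moment computation for this prior:

$$E[\theta] = \int_0^\infty \theta\, a e^{-a\theta}\, d\theta = \frac{1}{a}, \qquad E[\theta^2] = \int_0^\infty \theta^2\, a e^{-a\theta}\, d\theta = \frac{2}{a^2},$$

so $\mathrm{Var}(\theta) = \frac{2}{a^2} - \frac{1}{a^2} = \frac{1}{a^2}$ and $\sigma_\theta = \frac{1}{a}$; in particular, $\sigma_\theta \to \infty$ is the limit $a \to 0$.
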
3. [Extra credit] Comment: this problem is not more difficult than the regular-credit problems above; it is extra credit because the total length of Problems 1 and 2 above is already sufficient and reasonable for one homework assignment.
In a 2-class problem with $D$ features, you are to use Fisher's Linear Discriminant to find an optimal 1D feature space. You are given that the scatter matrices for each class (calculated from the data for each class) are diagonal:

$$S_1 = \begin{bmatrix} \sigma_1^2 & & & 0 \\ & \sigma_2^2 & & \\ & & \ddots & \\ 0 & & & \sigma_D^2 \end{bmatrix}, \qquad S_2 = \begin{bmatrix} \rho_1^2 & & & 0 \\ & \rho_2^2 & & \\ & & \ddots & \\ 0 & & & \rho_D^2 \end{bmatrix},$$
and you are given the sample means for each class:

$$\mathbf{m}_1 = \begin{pmatrix} m_1^{(1)} \\ m_2^{(1)} \\ \vdots \\ m_D^{(1)} \end{pmatrix}, \qquad \mathbf{m}_2 = \begin{pmatrix} m_1^{(2)} \\ m_2^{(2)} \\ \vdots \\ m_D^{(2)} \end{pmatrix}.$$

(a) Find Fisher's Linear Discriminant $\mathbf{w}$. Express $\mathbf{w}$ in simplest form.

(b) Let $D = 2$. Suppose $\sigma_1^2 = 4\sigma_2^2$ and $\rho_1^2 = 4\rho_2^2$, and:

$$\mathbf{m}_1 = \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \qquad \mathbf{m}_2 = \begin{pmatrix} -2 \\ 1 \end{pmatrix}.$$

Plot the vectors $\mathbf{m}_1$, $\mathbf{m}_2$, $(\mathbf{m}_1 - \mathbf{m}_2)$, and $\mathbf{w}$.

(c) Interpreting your answer to part (b), which makes more sense for a 1D feature-space direction: $(\mathbf{m}_1 - \mathbf{m}_2)$ or $\mathbf{w}$? Justify your answer.
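
Tip: part (b) can be sanity-checked numerically with the textbook Fisher direction $\mathbf{w} \propto S_W^{-1}(\mathbf{m}_1 - \mathbf{m}_2)$, where $S_W = S_1 + S_2$; the scatter values 4 and 1 below are illustrative choices consistent with the given ratios:

```python
import numpy as np

def fisher_direction(S1, S2, m1, m2):
    """Fisher's Linear Discriminant direction: w proportional to
    S_W^{-1} (m1 - m2), with within-class scatter S_W = S1 + S2."""
    Sw = np.asarray(S1, dtype=float) + np.asarray(S2, dtype=float)
    w = np.linalg.solve(Sw, np.asarray(m1, dtype=float) - np.asarray(m2, dtype=float))
    return w / np.linalg.norm(w)   # only the direction matters

# Illustrative values with sigma1^2 = 4*sigma2^2 and rho1^2 = 4*rho2^2:
S1, S2 = np.diag([4.0, 1.0]), np.diag([4.0, 1.0])
m1, m2 = np.array([2.0, 2.0]), np.array([-2.0, 1.0])
print(fisher_direction(S1, S2, m1, m2))
```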