# Stat W 4201 Assignment #8 solved

## Description

1. Chapter 12, problem 17
2. Chapter 12, problem 20
3. Chapter 20, problem 11
4. Chapter 20, problem 15
5. Consider a random vector $(X, Y, Z)$, where $Y$ and $Z$ are marginally Bernoulli random variables such that

   $$P(Y = 1 \mid X) = \frac{e^{\alpha + X^T \beta}}{1 + e^{\alpha + X^T \beta}}, \qquad P(Z = 1 \mid X, Y) = \pi(X)\,\varphi(Y).$$

   Show that

   $$P(Y = 1 \mid X, Z = 1) = \frac{e^{\alpha^* + X^T \beta}}{1 + e^{\alpha^* + X^T \beta}},$$

   and give the expression of $\alpha^*$.
6. Consider a properly normalized density function $f(x)$ on the real line and its natural exponential family,

   $$f(x \mid \theta) = f(x)\, e^{\theta x - \phi(\theta)}.$$

   (a) What conditions do you need for $f(x \mid \theta)$ to be a well-defined density function?

   (b) What choice of $\phi(\theta)$ makes $f(x \mid \theta)$ a properly normalized density function, that is,

   $$\int f(x \mid \theta)\, dx = 1?$$

   (c) Show that the choice of $\phi$ in (b) satisfies

   $$\phi'(\theta) = \int x f(x \mid \theta)\, dx, \qquad \phi''(\theta) = \int \bigl(x - \phi'(\theta)\bigr)^2 f(x \mid \theta)\, dx,$$

   where $\phi'$ and $\phi''$ are the first and second derivatives of $\phi$. If $\phi'(\theta)$ is a strictly increasing function, then $\theta$ is determined by the mean, so the variance is determined by the mean for natural exponential families.
   (d) Consider the following dataset

   | Y | X  |
   |---|----|
   | 0 | −2 |
   | 0 | −1 |
   | 1 | 1  |
   | 1 | 2  |

   and a generalized linear model

   $$\operatorname{logit}\{P(Y = 1 \mid X)\} = \beta_1 X.$$

   Report the "maximum likelihood estimate" of $\beta_1$ from the computer using `glm(Y ~ X - 1, binomial)` (or any other software you use). Does the estimate make sense to you? Can we obtain a reasonable estimate of $P(Y = 1 \mid X = 0.5)$ from the data? Why? Plot the log-likelihood as a function of $\beta_1$. Does the plot explain this phenomenon? This is one of the few situations in which the MLE does not exist. It happens occasionally, especially when the sample size is small. Give a necessary and sufficient condition for the MLE of logistic regression to exist.

   This problem provides a warning for your future work. When the model becomes more complicated, some parameters may not be estimable or identifiable, for different reasons. The data may not contain enough information to identify the best model, which is our current situation. Non-identifiability can also be caused by redundant parameters, that is, $f(x \mid \theta_1) = f(x \mid \theta_2)$ for some $\theta_1 \neq \theta_2$. Pay particular attention to the first situation: the estimates provided by software may not be meaningful and may yield misleading results.
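
The behavior asked about in part (d) can be explored without any fitting routine. The following is a minimal Python sketch, offered as a stand-in for the `glm` call under the "any other software" allowance. The function names `log_likelihood` and `score` and the grid of $\beta_1$ values are illustrative choices, not part of the assignment. It evaluates the log-likelihood of the no-intercept logistic model on the four data points, together with its derivative.

```python
import math

# Dataset from part (d): Y is the binary response, X the single covariate.
Y = [0, 0, 1, 1]
X = [-2.0, -1.0, 1.0, 2.0]


def log_likelihood(beta1):
    """Log-likelihood of the no-intercept model logit P(Y=1|X) = beta1 * X."""
    total = 0.0
    for y, x in zip(Y, X):
        eta = beta1 * x
        # log P(Y=y|X=x) = y*eta - log(1 + exp(eta)).
        # log(1 + exp(eta)) is computed in a numerically stable form.
        total += y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total


def score(beta1):
    """Derivative of the log-likelihood with respect to beta1."""
    return sum((y - 1.0 / (1.0 + math.exp(-beta1 * x))) * x for y, x in zip(Y, X))


# Illustrative grid (an assumption); the pairs can be passed to any plotting tool.
grid = [0.0, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
values = [log_likelihood(b) for b in grid]
for b, v in zip(grid, values):
    print(f"beta1 = {b:5.1f}   log-likelihood = {v:.6f}   score = {score(b):.3e}")
```

On this grid the log-likelihood keeps increasing toward 0 and the derivative stays positive, so no finite $\beta_1$ attains the maximum. The `(beta1, log-likelihood)` pairs can be handed to any plotting library to draw the curve the problem asks for.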