STAT GU4206/GR5206 Lab 6 (Bayesian Estimation and MCMC)

Goals
This lab has two goals. The first goal is to use the Accept-Reject algorithm to simulate from a mixture of
two normals. The second goal is to use Bayesian methods and the famous Markov Chain Monte Carlo
algorithm to estimate the mixture parameter δ.
Background: (Mixture)
A mixture distribution is the probability distribution of a random variable that is derived from a collection
of other random variables (Wikipedia). In our case we consider a mixture of two normal distributions. Here we
assume that our random variable is governed by the probability density f(x), defined by
$$
f(x) = f(x;\, \mu_1, \sigma_1, \mu_2, \sigma_2, \delta)
     = \delta f_1(x;\, \mu_1, \sigma_1) + (1-\delta) f_2(x;\, \mu_2, \sigma_2)
     = \delta \frac{1}{\sqrt{2\pi\sigma_1^2}} \exp\!\left(-\frac{1}{2\sigma_1^2}(x-\mu_1)^2\right)
       + (1-\delta) \frac{1}{\sqrt{2\pi\sigma_2^2}} \exp\!\left(-\frac{1}{2\sigma_2^2}(x-\mu_2)^2\right),
$$
where −∞ < x < ∞ and the parameter space is defined by −∞ < µ1, µ2 < ∞, σ1, σ2 > 0, and 0 ≤ δ ≤ 1.
The mixture parameter δ governs how much mass is placed on the first distribution f1(x; µ1, σ1), and its
complement 1 − δ governs how much mass is placed on the second distribution f2(x; µ2, σ2).
To further motivate this setting, consider simulating n = 10,000 heights from the population of both males
and females. Assume that males are distributed normal with mean µ1 = 70[in] and standard deviation
σ1 = 3[in] and females are distributed normal with mean µ2 = 64[in] and standard deviation σ2 = 2.5[in].
Also assume that each distribution contributes equal mass, i.e., set the mixture parameter to δ = .5. The
distribution of males is governed by
$$
f_1(x;\, \mu_1, \sigma_1) = \frac{1}{\sqrt{2\pi\sigma_1^2}} \exp\!\left(-\frac{1}{2\sigma_1^2}(x-\mu_1)^2\right), \quad -\infty < x < \infty,
$$
and the distribution of females is governed by
$$
f_2(x;\, \mu_2, \sigma_2) = \frac{1}{\sqrt{2\pi\sigma_2^2}} \exp\!\left(-\frac{1}{2\sigma_2^2}(x-\mu_2)^2\right), \quad -\infty < x < \infty.
$$
The plot below shows the pdfs of f1(x; µ1, σ1), f2(x; µ2, σ2), and the mixture f(x), all on the same plot.
x <- seq(45, 90, by = .05)
n.x <- length(x)
f_1 <- dnorm(x, mean = 70, sd = 3)
f_2 <- dnorm(x, mean = 64, sd = 2.5)
f <- function(x) {
  return(.5 * dnorm(x, mean = 70, sd = 3) + .5 * dnorm(x, mean = 64, sd = 2.5))
}
plot_df <- data.frame(x = c(x, x, x),
                      Density = c(f_1, f_2, f(x)),
                      Distribution = c(rep("Male", n.x), rep("Female", n.x), rep("Mixture", n.x)))
library(ggplot2)
ggplot(data = plot_df) +
  geom_line(mapping = aes(x = x, y = Density, color = Distribution)) +
  labs(title = "Mixture of Normals")
[Figure: "Mixture of Normals": density curves for the Female, Male, and Mixture distributions plotted against x.]
Part I: Simulating a Mixture of Normals
The first goal is to simulate from the mixture distribution
δf1(x; µ1, σ1) + (1 − δ)f2(x; µ2, σ2),
where µ1 = 70, σ1 = 3, µ2 = 64, σ2 = 2.5, δ = .5. We use the accept-reject algorithm to accomplish this task.
First we must choose the “easy to simulate” distribution g(x). For this problem choose g(x) to be a Cauchy
distribution centered at 66 with scale parameter 7.
g <- function(x) {
  s <- 7
  l <- 66
  return(1 / (pi * s * (1 + ((x - l) / s)^2)))
}
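As a quick check, this g(x) is exactly the Cauchy density with location 66 and scale 7, so it can be verified against R's built-in dcauchy() (the grid name x.check is ours):

# g(x) should match R's built-in Cauchy density
x.check <- seq(30, 100, by = .1)
all.equal(g(x.check), dcauchy(x.check, location = 66, scale = 7))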
Perform the following tasks:
1) Identify a suitable value of α such that your envelope function e(x) satisfies
f(x) ≤ e(x) = g(x)/α, where 0 < α < 1.
Note that you should choose α so that e(x) is close to f(x); there is no unique solution to this
problem. The plot below shows how α = .20 creates an envelope function that is too large. Validate
your choice of α with a graphic similar to the one below.
# Choose alpha
alpha <- .20

# Define envelope e(x)
e <- function(x) {
  return(g(x) / alpha)
}

# Plot f(x) against the envelope e(x)
x.vec <- seq(30, 100, by = .1)
ggplot() +
  geom_line(mapping = aes(x = x.vec, y = f(x.vec)), col = "purple") +
  geom_line(mapping = aes(x = x.vec, y = e(x.vec)), col = "green")
[Figure: the target density f(x.vec) (purple) and the envelope e(x.vec) (green) plotted against x.vec; the envelope sits far above the target.]
# Is e(x) > f(x) everywhere on the grid?
all(e(x.vec) > f(x.vec))
## [1] TRUE
Solution
## Solution goes here -------
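One possible approach, as a sketch (not the unique solution): the envelope condition f(x) ≤ g(x)/α is equivalent to α ≤ g(x)/f(x) for all x, so a tight α can be read off as the minimum of that ratio over a fine grid, here reusing x.vec from the plot above.

# Sketch: largest alpha satisfying f(x) <= g(x)/alpha on the grid
alpha <- min(g(x.vec) / f(x.vec))
alpha
e <- function(x) {
  return(g(x) / alpha)
}
# Validate the tighter envelope graphically and numerically
ggplot() +
  geom_line(mapping = aes(x = x.vec, y = f(x.vec)), col = "purple") +
  geom_line(mapping = aes(x = x.vec, y = e(x.vec)), col = "green")
all(e(x.vec) >= f(x.vec))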
2) Write a function named r.norm.mix() that simulates n.samps draws from the normal mixture f(x). To
accomplish this task you will wrap a function around the accept-reject algorithm from the lecture notes.
Also include the acceptance rate, i.e., the number of accepted draws divided by the total number of
trials performed. Your function should return a list of two elements: (i) the simulated mixture vector
and (ii) the proportion of accepted cases. Run your function r.norm.mix() to simulate 10,000 cases
and display the first 20 values. What's the proportion of accepted cases? Compare this number to your
chosen α and comment on the result. The code below should help you get started; a completed sketch
follows it.
Solution
r.norm.mix <- function(n.samps) {
  #n <- 0                    # counter for number of samples accepted
  #m <- 0                    # counter for number of trials
  #samps <- numeric(n.samps) # initialize the vector of output
  #while (n < n.samps) {
  #  # Fill in body -------
  #}
  #return(list(x = samps, alpha.hat = n.samps / m))
}
#r.norm.mix(n.samps = 10000)$alpha.hat
#alpha
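A sketch of one possible completion, assuming the standard accept-reject form from the lecture notes (propose Y from g, accept with probability f(Y)/e(Y)); the proposal is drawn with rcauchy() to match g(x):

# Sketch: accept-reject sampler for the normal mixture
r.norm.mix <- function(n.samps) {
  n <- 0                    # number of samples accepted so far
  m <- 0                    # total number of proposals drawn
  samps <- numeric(n.samps) # storage for the accepted draws
  while (n < n.samps) {
    m <- m + 1
    y <- rcauchy(1, location = 66, scale = 7) # propose from g(x)
    if (runif(1) <= f(y) / e(y)) {            # accept with probability f(y)/e(y)
      n <- n + 1
      samps[n] <- y
    }
  }
  return(list(x = samps, alpha.hat = n.samps / m))
}

sim <- r.norm.mix(n.samps = 10000)
head(sim$x, 20)  # first 20 simulated values
sim$alpha.hat    # acceptance rate

The acceptance rate alpha.hat should be close to the chosen α, since α is exactly the expected proportion of proposals that get accepted.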
3) Using ggplot or base R, construct a histogram of the simulated mixture distribution with the true
mixture pdf f(x) overlaid on the plot.
Solution
## Solution goes here -------
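A base R sketch, assuming the draws sim$x from the sketch above:

# Sketch: histogram of simulated draws with the true mixture pdf f(x) overlaid
hist(sim$x, breaks = 30, freq = FALSE, xlab = "X",
     main = "Simulated Mixture of Normals")
lines(x.vec, f(x.vec), col = "purple", lwd = 2) # true mixture density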
Part II: Bayesian Statistics and MCMC
Suppose that the experimenter collected 100 cases from the true mixture-normal distribution f(x). To solve
problems (4) through (8) we analyze one realized sample from our function r.norm.mix(). In practice this
dataset would be collected and not simulated. Uncomment the code below to simulate our dataset x. If you
failed to solve Part I, then read in the csv file mixture_data.csv posted on Canvas.
Solution
# Simulate data
#set.seed(1983)
#x <- r.norm.mix(n.samps = 100)$x
#head(x)
#hist(x, breaks = 20, xlab = "X", main = "")

# Or read data
x <- read.csv("mixture_data.csv")$x
head(x)

## [1] 71.66666 63.91096 67.06554 65.49516 70.34363 65.69982

hist(x, breaks = 20, xlab = "X", main = "")
[Figure: histogram of the 100 observed values x (X from roughly 60 to 75, frequencies up to about 10).]
Further, suppose that we know the true means and standard deviations of the two normal distributions but
the mixture parameter δ is unknown. In this case, we know µ1 = 70, σ1 = 3, µ2 = 64, σ2 = 2.5. The goal of
this exercise is to use maximum likelihood and MCMC Bayesian techniques to estimate the mixture
parameter δ.
Maximum Likelihood Estimator of the Mixture Parameter
4) Set up the likelihood function L(δ|x1, . . . , x100) and define it as mix.like(). The function should have
two inputs: the parameter delta and the data vector x. Evaluate the likelihood at the parameter
values delta=.2, delta=.4, and delta=.6. Note that all three evaluations will be very small numbers.
Which delta (δ = .2, .4, .6) is the most likely to have generated the dataset x?
Solution
## Solution goes here -------
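A sketch of one way to set this up: the likelihood is the product of the mixture density over the observed cases, with the known values µ1 = 70, σ1 = 3, µ2 = 64, σ2 = 2.5 plugged in.

# Sketch: likelihood of delta given the data x
mix.like <- function(delta, x) {
  # mixture density evaluated at each observation
  dens <- delta * dnorm(x, mean = 70, sd = 3) +
    (1 - delta) * dnorm(x, mean = 64, sd = 2.5)
  return(prod(dens))
}
mix.like(delta = .2, x = x)
mix.like(delta = .4, x = x)
mix.like(delta = .6, x = x)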
5) Compute the maximum likelihood estimator of mixture parameter δ. To accomplish this task, apply
your likelihood function mix.like() across the vector seq(.1,.99,by=.001). The solution to this
exercise is given below.
# delta <- seq(.1, .99, by = .001)
# MLE.values <- sapply(delta, mix.like, x = x)
# delta.MLE <- delta[which.max(MLE.values)]
# plot(delta, MLE.values, ylab = "Likelihood", type = "l")
# abline(v = delta.MLE, col = "blue")
# text(x = delta.MLE + .08, y = mix.like(delta = .45, x = x), paste(delta.MLE), col = "blue")
MCMC
6) Run the Metropolis-Hastings algorithm to estimate the mixture parameter δ. In this exercise you will
assume a Beta(α = 10, β = 10) prior distribution on the mixture parameter δ. Some notes follow:
• Run 20000 iterations, i.e., simulate 20000 draws of δ^(t).
• Proposal distribution: Beta(α = 10, β = 10).
• Independence chain with Metropolis-Hastings ratio:
$$
R(\delta^{(t)}, \delta^*) = \frac{L(\delta^* \mid x_1, \ldots, x_{100})}{L(\delta^{(t)} \mid x_1, \ldots, x_{100})}
$$
Display the first 20 simulated cases of δ^(t).
Solution
## Solution goes here -------
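A sketch of one possible implementation (variable names are assumptions; delta_t_vec matches the name in the acf() hint below). Because the proposal density equals the Beta(10, 10) prior, the prior and proposal terms cancel and the Metropolis-Hastings ratio reduces to the likelihood ratio above:

# Sketch: independence Metropolis-Hastings chain for delta
n.iter <- 20000
delta_t_vec <- numeric(n.iter)
delta_t_vec[1] <- rbeta(1, shape1 = 10, shape2 = 10) # initialize from the prior
for (t in 2:n.iter) {
  delta.star <- rbeta(1, shape1 = 10, shape2 = 10)   # propose from Beta(10, 10)
  R <- mix.like(delta.star, x = x) / mix.like(delta_t_vec[t - 1], x = x)
  if (runif(1) <= min(R, 1)) {
    delta_t_vec[t] <- delta.star         # accept the proposal
  } else {
    delta_t_vec[t] <- delta_t_vec[t - 1] # reject: carry the current value forward
  }
}
head(delta_t_vec, 20)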
7) Construct a lineplot of the simulated Markov chain from exercise (6). The vertical axis is the simulated
chain δ^(t) and the horizontal axis is the number of iterations.
Solution
## Solution goes here -------
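A base R sketch, assuming the chain delta_t_vec from exercise (6):

# Sketch: traceplot of the simulated chain
plot(delta_t_vec, type = "l", xlab = "Iteration",
     ylab = expression(delta^{(t)}), main = "Markov Chain: Prior Beta(10,10)")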
8) Plot the empirical autocorrelation function of your simulated chain δ^(t), i.e., run the function acf(). A
quick decay of the chain's autocorrelations indicates good mixing properties.
Solution
#acf(delta_t_vec, main = "ACF: Prior Beta(10,10)")
9) Compute the empirical Bayes estimate δ̂_B of the simulated posterior distribution π(δ|x1, . . . , xn). To
solve this problem, simply compute the sample mean of your simulated chain δ^(t) after discarding a
20% burn-in.
Solution
## Solution goes here -------
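A sketch: with 20000 iterations, a 20% burn-in discards the first 4000 draws before averaging.

# Sketch: posterior mean after discarding a 20% burn-in
burn.in <- floor(.2 * length(delta_t_vec))
delta.hat.B <- mean(delta_t_vec[-(1:burn.in)])
delta.hat.B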
10) Construct a histogram of the simulated posterior π(δ|x1, . . . , xn) after discarding a 20% burn-in.
Solution
## Solution goes here -------
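A base R sketch, reusing burn.in from the previous step:

# Sketch: histogram of the simulated posterior after burn-in
hist(delta_t_vec[-(1:burn.in)], breaks = 30,
     xlab = expression(delta), main = "Posterior: Prior Beta(10,10)")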
11) Run the Metropolis-Hastings algorithm to estimate the mixture parameter δ using a Beta(α = 15, β = 2)
prior distribution on the mixture parameter δ. Repeat exercises 6 through 10 using the updated prior (a
sketch of the updated chain is given below).
Solution
## Solution goes here -------
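A sketch of the updated chain, under the assumption that the proposal distribution is also changed to Beta(15, 2), so that the ratio again reduces to the likelihood ratio (names are assumptions); the lineplot, ACF, Bayes estimate, and posterior steps below then repeat with delta_t_vec2 in place of delta_t_vec:

# Sketch: rerun the independence chain with a Beta(15, 2) prior/proposal
delta_t_vec2 <- numeric(n.iter)
delta_t_vec2[1] <- rbeta(1, shape1 = 15, shape2 = 2)
for (t in 2:n.iter) {
  delta.star <- rbeta(1, shape1 = 15, shape2 = 2) # propose from Beta(15, 2)
  R <- mix.like(delta.star, x = x) / mix.like(delta_t_vec2[t - 1], x = x)
  if (runif(1) <= min(R, 1)) {
    delta_t_vec2[t] <- delta.star
  } else {
    delta_t_vec2[t] <- delta_t_vec2[t - 1]
  }
}
head(delta_t_vec2, 20)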
lineplot:
Construct a lineplot of the simulated Markov chain from exercise (11). The vertical axis is the simulated chain
δ^(t) and the horizontal axis is the number of iterations.
Solution
## Solution goes here -------
ACF:
Plot the empirical autocorrelation function of your simulated chain δ^(t), i.e., run the function acf(). A slow
decay of the chain's autocorrelations indicates poor mixing properties.
Solution
## Solution goes here -------
Bayes estimate:
Compute the empirical Bayes estimate δ̂_B of the simulated posterior distribution π(δ|x1, . . . , xn). To solve
this problem, simply compute the sample mean of your simulated chain δ^(t) after discarding a 20% burn-in.
Your answer should be close to the MLE.
Solution
## Solution goes here -------
Posterior:
Construct a histogram of the simulated posterior π(δ|x1, . . . , xn) after discarding a 20% burn-in.
Solution
## Solution goes here -------