STAT 578 – Advanced Bayesian Modeling Assignment 5 solved

$35.00

Category: You will receive a download link of the .ZIP file upon Payment

Description

5/5 - (1 vote)

File usparkvisits.csv contains annual numbers of recreational visits to 30 different parks
managed by the US National Park Service for the years 2006 through 2018.1 You will build and
compare two different varying-coefficient hierarchical normal regression models for the log-scale
numbers, using JAGS and rjags.
(a) (i) [2 pts] On the same set of axes, plot segmented lines, one for each park, representing
the number of visits versus the year (2006 through 2018). Distinguish the lines for
different parks by using different colors or line types. (You should label the axes, but no
legend is needed.)
(ii) [2 pts] Repeat the previous part using the natural logarithm of the number of visits.
Let yij be the natural logarithm of the number of visits to park j in year i (i = 1, . . . , 13,
j = 1, . . . , 30). For each park, model the log-number as a simple linear regression on the
centered year number:
yij | β
(j)
, σ2
y
, X ∼ indep. N

β
(j)
1 + β
(j)
2
(xi − x¯), σ2
y

where
β
(j) =

β
(j)
1
β
(j)
2
!
j = 1, . . . , 30 xi = i i = 1, . . . , 13
Note that the coefficients are allowed to depend on the park, but the variance is not.
(b) Let βˆ
(j)
1
and βˆ
(j)
2
be the ordinary least squares estimates of β
(j)
1
and β
(j)
2
, estimated for
park j. You may use the lm function in R to compute these estimates. (For this part, the
coefficient pairs are estimated completely separately for each park.)
(i) [1 pt] Produce a scatterplot of the pairs
βˆ
(j)
1
, βˆ
(j)
2

, j = 1, . . . , 30.
(ii) [1 pt] Give the average (sample mean) of βˆ
(j)
1
and also of βˆ
(j)
2
.
(iii) [1 pt] Give the sample variance of βˆ
(j)
1
and also of βˆ
(j)
2
.
(iv) [1 pt] Give the sample correlation between βˆ
(j)
1
and βˆ
(j)
2
.
(c) Consider the bivariate prior
β
(j)
| µβ, Σβ ∼ iid N(µβ, Σβ)
µβ =

µβ1
µβ2
!
Σβ =

σ
2
β1
ρ σβ1 σβ2
ρ σβ1 σβ2 σ
2
β2
!
with hyperpriors
µβ ∼ N

0, 10002
I

Σ
−1
β ∼ Wishart2

Σ
−1
0
/2

1Data from https://irma.nps.gov/STAT
in the notation used in the lecture videos. For your analysis, use
Σ0 =

10 0
0 0.01!
based on preliminary analyses. Let the prior on σ
2
y be
σ
2
y ∼ Inv-gamma(0.0001, 0.0001)
(i) [2 pts] List an appropriate JAGS model. Make sure to create nodes for Σβ, ρ, and σ
2
y
.
Remember that the numbers of visits are to be analyzed on the log scale.
Now run your model using rjags. Make sure to use multiple chains with overdispersed
starting points, check convergence, and monitor µβ, Σβ, σ
2
y
, and ρ (after convergence)
long enough to obtain effective sample sizes of at least 4000 for each parameter.
(ii) [2 pts] Display the coda summary of the results for the monitored parameters.
(iii) [2 pts] Give an approximate 95% central posterior interval for the correlation
parameter ρ, and also produce a graph of its (estimated) posterior density.
(iv) [2 pts] Approximate the posterior probability that ρ < 0. Also, compute the Bayes
factor favoring ρ < 0 versus ρ > 0. (You may use the fact that ρ < 0 and ρ > 0 have
equal prior probability.) Describe the level of data evidence for ρ < 0.
(v) [1 pt] Your model implies that, over the 12 years from 2006 to 2018, the (population)
median number of visits should have changed by a factor of
e
12µβ2
Form an approximate 95% central posterior interval for this quantity.
(vi) [2 pts] Use the rjags function dic.samples to compute the effective number of
parameters (“penalty”) and Plummer’s DIC (“Penalized deviance”). Use at least
100,000 iterations.
(d) Now consider a different model with “univariate” hyperpriors for the model coefficients, which
do not allow for a coefficient correlation parameter:
β
(j)
1
| µβ1
, σβ1 ∼ iid N
µβ1
, σ2
β1

β
(j)
2
| µβ2
, σβ2 ∼ iid N
µβ2
, σ2
β2

with hyperpriors
µβ1
, µβ2 ∼ iid N
0, 10002

σβ1
, σβ2 ∼ iid U(0, 1000)
Let the prior on σ
2
y be the same as in the previous model.
(i) [4 pts] Draw a complete DAG for this new model.
(ii) [2 pts] List an appropriate JAGS model. Make sure that there are nodes for σ
2
β1
, σ
2
β2
,
and σ
2
y
.
Remember that the numbers of visits are to be analyzed on the log scale.
Now run your model using rjags. Make sure to use multiple chains with overdispersed
starting points, check convergence, and monitor µβ1
, µβ2
, σ
2
β1
, σ
2
β2
, σ
2
y
(after
convergence) long enough to obtain effective sample sizes of at least 4000 for each
parameter.
(iii) [2 pts] Display the coda summary of the results for the monitored parameters.
(iv) [2 pts] Recall the (population) median change factor for the number of visits,
e
12µβ2
as considered in the previous analysis. Form an approximate 95% central posterior
interval for this quantity, and compare it with the previous results.
(v) [2 pts] Use the rjags function dic.samples to compute the effective number of
parameters (“penalty”) and Plummer’s DIC (“Penalized deviance”). Use at least
100,000 iterations.
(vi) [1 pt] Compare the (Plummer’s) DIC values for this model and the previous one.
Which is preferred?
Total: 32 pts
3