Homework 4 STAT 547 solved





1. Consider the univariate nonparametric regression setting where we have a sample $(X_i, Y_i)$, $i = 1, \dots, n$, which satisfies $Y_i = \mu(X_i) + \varepsilon_i$, and the error variance $\mathrm{var}(\varepsilon_i) \equiv \sigma^2 > 0$ is a constant. Assume the density of $X_i$ is positive, continuous, and supported on $[0, 1]$. The kernel $K(\cdot)$ is a symmetric continuous density function supported on $[-1, 1]$ with $\int_{-1}^{1} K^2(x)\,dx < \infty$. The regression function $\mu$ is assumed to be twice differentiable with a bounded second derivative. Derive the asymptotic bias and variance for the Nadaraya–Watson estimator at a left boundary point $x_0 = ch$, where $c \in [0, 1)$, as $h \to 0$ and $nh \to \infty$. [For example, $c = 0$ implies $x_0 = 0$, so only design points falling within $[0, h]$ will be utilized.]

2. Investigate the scallop abundance data.

    library(SemiPar)
    data(scallop)

Perform bivariate smoothing with the total catch as the response using local polynomial regression. Vary the bandwidths and degrees and visualize the results.

3. Consider again the yeast gene expression data used in Homework 2. The gene expression profiles may be slightly noisy, so before performing an FPCA we may want to presmooth the individual curves.

(a) Perform smoothing for each gene expression profile using appropriate smoothing parameters (bandwidth, roughness penalty, etc.), and then apply FPCA to the presmoothed curves. Visualize and compare the functional data input, mean functions, and eigenfunctions obtained with and without presmoothing.

(b) Estimate the derivative curve of each raw gene expression profile. To use local polynomials, one can use the deriv=1 argument of locfit. Then apply FPCA to analyze the estimated derivative curves.
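For Problem 1, it may help to fix notation before starting the derivation. The block below is a sketch of the usual setup, not a full solution: the symbols $f$ (the design density), $a_j(c)$ and $b(c)$ (truncated kernel moments) are introduced here for illustration and do not appear in the problem statement, and the bias and variance are understood conditionally on the design points.

```latex
% Nadaraya-Watson estimator at the boundary point x_0 = ch
\hat{\mu}(x_0)
  = \frac{\sum_{i=1}^{n} K\!\left(\frac{X_i - x_0}{h}\right) Y_i}
         {\sum_{i=1}^{n} K\!\left(\frac{X_i - x_0}{h}\right)},
  \qquad x_0 = ch .

% Since X_i >= 0, the rescaled argument u = (X_i - x_0)/h only ranges over [-c, 1],
% so the kernel moments are truncated (assumed notation):
a_j(c) = \int_{-c}^{1} u^{j} K(u)\,du, \qquad
b(c)   = \int_{-c}^{1} K^{2}(u)\,du .

% Leading terms one expects to obtain from a Taylor expansion of \mu around x_0:
\mathrm{Bias}\{\hat{\mu}(x_0)\}
  = h\,\mu'(0+)\,\frac{a_1(c)}{a_0(c)} + o(h),
\qquad
\mathrm{Var}\{\hat{\mu}(x_0)\}
  = \frac{\sigma^{2}\, b(c)}{n h\, f(0)\, a_0(c)^{2}} + o\!\left(\frac{1}{nh}\right).
```

By symmetry of $K$, $a_1(c) = \int_{c}^{1} u K(u)\,du$, which is positive whenever $K$ puts mass on $(c, 1)$, so the boundary bias is of order $h$. In the interior the corresponding first moment vanishes and the bias drops to order $h^2$.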
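The boundary behaviour asked about in Problem 1 can also be checked numerically. The following is a minimal base-R sketch under assumptions that are not part of the assignment: an Epanechnikov kernel, a uniform design on [0, 1], a hypothetical linear regression function `mu`, and helper names `epan` and `nw` chosen for illustration. It compares the simulated bias of the Nadaraya–Watson estimator at $x_0 = 0$ (that is, $c = 0$) with the first-order prediction $h\,\mu'(0)\,a_1/a_0$, where $a_j = \int_0^1 u^j K(u)\,du$.

```r
# Epanechnikov kernel: a symmetric continuous density supported on [-1, 1]
epan <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Nadaraya-Watson estimate of the regression function at a single point x0
nw <- function(x0, x, y, h) {
  w <- epan((x - x0) / h)
  sum(w * y) / sum(w)
}

set.seed(547)
n     <- 2000
h     <- 0.1
n_rep <- 500
mu    <- function(x) 1 + 2 * x   # hypothetical regression function, mu'(x) = 2

# Each replicate returns the estimate at the boundary (x0 = 0) and in the interior (x0 = 0.5)
est <- replicate(n_rep, {
  x <- runif(n)                       # uniform design on [0, 1]
  y <- mu(x) + rnorm(n, sd = 0.2)     # constant error variance
  c(nw(0, x, y, h), nw(0.5, x, y, h))
})

# Truncated kernel moments for c = 0 (analytically 1/2 and 3/16 for this kernel)
a0 <- integrate(epan, 0, 1)$value
a1 <- integrate(function(u) u * epan(u), 0, 1)$value

pred_bias <- h * 2 * a1 / a0                  # first-order boundary bias, using mu'(0) = 2
emp_bias  <- rowMeans(est) - mu(c(0, 0.5))    # simulated bias at x0 = 0 and x0 = 0.5

print(round(c(predicted = pred_bias, boundary = emp_bias[1], interior = emp_bias[2]), 4))
```

Because `mu` is linear here, the interior bias should be close to zero while the boundary bias stays of order h. Swapping in a curved `mu` or a different `h` shows how the two regimes scale.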