STAT 6340 Mini Project 1 solved


1. (10 points) Consider the training and test data posted on eLearning in the files 1-training-data.csv
and 1-test-data.csv, respectively, for a classification problem with two classes.
(a) Fit KNN with K = 1, 6, …, 200.
(b) Plot training and test error rates against K. Explain what you observe. Is it consistent with
what you expect from the class?
(c) What is the optimal value of K? What are the training and test error rates associated with the
optimal K?
(d) Make a plot of the training data that also shows the decision boundary for the optimal K.
Comment on what you observe. Does the decision boundary seem sensible?
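A minimal sketch of parts (a)-(c) in R, using class::knn. Since the eLearning CSV files are not reproduced here, simulated two-class data stand in for them; with the real files, replace the simulation with read.csv("1-training-data.csv") and read.csv("1-test-data.csv") (the column names x1, x2, y below are assumptions, not the actual file layout).

```r
library(class)  # ships with R; provides knn()

# simulated stand-in for the course data: two predictors, two classes
set.seed(1)
n <- 200
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
train$y <- factor(ifelse(train$x1 + train$x2 + rnorm(n, sd = 0.5) > 0, 1, 2))
test <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
test$y <- factor(ifelse(test$x1 + test$x2 + rnorm(n, sd = 0.5) > 0, 1, 2))

ks <- seq(1, 200, by = 5)   # the grid K = 1, 6, ..., up to 200
err <- sapply(ks, function(k) {
  tr.pred <- knn(train[, 1:2], train[, 1:2], train$y, k = k)
  te.pred <- knn(train[, 1:2], test[, 1:2], train$y, k = k)
  c(train = mean(tr.pred != train$y), test = mean(te.pred != test$y))
})

# training and test error rates against K; the optimal K minimizes test error
matplot(ks, t(err), type = "l", lty = 1, col = 1:2,
        xlab = "K", ylab = "error rate")
legend("bottomright", legend = rownames(err), lty = 1, col = 1:2)
k.opt <- ks[which.min(err["test", ])]
```

For part (d), a common approach is to predict the class over a fine grid of (x1, x2) values with k = k.opt and draw the resulting boundary with contour().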
2. (10 points) Read about the problem of image classification at http://cs231n.github.io/classification/.
This website also discusses coding, which you may ignore. Pay particular attention to the CIFAR-10
dataset. You can read a bit more about it at https://www.cs.toronto.edu/~kriz/cifar.html.
Upon doing so, you will see that each image in this dataset is 32 pixels wide, 32 pixels
tall, and has 3 color channels (red, green, and blue; or RGB). Thus, each image is a 32 × 32 × 3 (3D)
array of numbers. Each number represents an intensity that lies between 0 and 255 (dark to light).
These image features, totaling 32 × 32 × 3 = 3072, will serve as predictors. The response is the class
of the object shown in the image. There are 10 classes. Thus, in essence, we have a classification
problem with C = 10 response classes and p = 3072 predictors, and the dataset consists of n = 50000
observations in the training set and 10000 observations in the test set.
Next, closely follow the instructions at https://keras.rstudio.com to download and install the
keras package (which contains the data) together with its TensorFlow backend. Then, read the data
into R and pre-process as follows:
library(keras)
cifar <- dataset_cifar10()
str(cifar)
x.train <- cifar$train$x
y.train <- cifar$train$y
x.test <- cifar$test$x
y.test <- cifar$test$y
# reshape the images as vectors (column-wise)
# (aka flatten or convert into wide format)
# (for row-wise reshaping, see ?array_reshape)
dim(x.train) <- c(nrow(x.train), 32*32*3) # 50000 x 3072
dim(x.test) <- c(nrow(x.test), 32*32*3) # 10000 x 3072
# rescale the x to lie between 0 and 1
x.train <- x.train/255
x.test <- x.test/255
# categorize the response
y.train <- as.factor(y.train)
y.test <- as.factor(y.test)
# randomly sample 1/100 of the test data to reduce computing time
set.seed(2021)
id.test <- sample(1:10000, 100)
x.test <- x.test[id.test, ]
y.test <- y.test[id.test]
Use the entire training set and the resampled test set obtained above to answer the following questions. Note that we work with only 1% of the test set to reduce the computing time.
(a) Fit KNN with K = 50, 100, 200, 300, 400 and examine the test error rates. (Feel free to explore additional values of K.)
(b) For the best value of K (among the ones you have explored), examine the confusion matrix and comment on your findings.
(c) Briefly explain (in no more than one small paragraph) why using KNN for image classification may not be a good idea.
(You may additionally explore working with the entire test data and how to view the images in R. For the latter, see, e.g., https://rdrr.io/a/github/jlmelville/snedata/src/R/cifar.R. This is optional, however. Also, please share with the class any other useful resources that you discover.)
3. (5 bonus points) Consider the following general model for the training data (Yi, xi), i = 1, …, n, in a learning problem:
Yi = f(xi) + εi,
where f is the true mean response function, and the random errors εi have mean zero, variance σ², and are mutually independent. We discussed this model in class. Let f̂ be the estimator of f obtained from the training data. Further, let (x0, Y0) be a test observation.
In other words, x0 is a future value of x at which we want to predict Y, and Y0 is the corresponding true value of Y. The test observation follows the same model as the training data, i.e., Y0 = f(x0) + ε0, where ε0 has the same distribution as the εi for the training data but is independent of them. Let Ŷ0 = f̂(x0) be the predicted value of Y0.
(a) Show that MSE{f̂(x0)} = (Bias{f̂(x0)})² + var{f̂(x0)}.
(b) Show that E(Ŷ0 − Y0)² = (Bias{f̂(x0)})² + var{f̂(x0)} + σ².
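A standard route to the bonus problem, sketched in LaTeX (writing \hat f for the estimator and \varepsilon_0 for the test error term); this is an outline of the usual bias-variance argument, not a full write-up:

```latex
% Part (a): add and subtract E\hat f(x_0) inside the square; the cross term
% vanishes because E[\hat f(x_0) - E\hat f(x_0)] = 0.
\begin{align*}
\mathrm{MSE}\{\hat f(x_0)\}
  &= E\bigl[\hat f(x_0) - f(x_0)\bigr]^2 \\
  &= E\bigl[\hat f(x_0) - E\hat f(x_0)\bigr]^2
     + \bigl[E\hat f(x_0) - f(x_0)\bigr]^2 \\
  &= \mathrm{var}\{\hat f(x_0)\}
     + \bigl(\mathrm{Bias}\{\hat f(x_0)\}\bigr)^2.
\end{align*}
% Part (b): \varepsilon_0 has mean 0 and variance \sigma^2, and is independent
% of \hat f(x_0) (which depends only on the training data), so the cross term
% E[(\hat f(x_0) - f(x_0))\,\varepsilon_0] vanishes as well.
\begin{align*}
E(\hat Y_0 - Y_0)^2
  &= E\bigl[\hat f(x_0) - f(x_0) - \varepsilon_0\bigr]^2 \\
  &= \mathrm{MSE}\{\hat f(x_0)\} + \sigma^2 \\
  &= \bigl(\mathrm{Bias}\{\hat f(x_0)\}\bigr)^2
     + \mathrm{var}\{\hat f(x_0)\} + \sigma^2.
\end{align*}
```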