Due date: 16 September 2018


Write R code to perform the tasks below and answer the specific questions. Save the final R code in a single file. Whenever possible, perform leave-one-out cross-validation without re-fitting the deletion model for each observation. It may be helpful to first write a separate function that computes the cross-validation sum of squares from a fitted model.
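For models fitted with lm(), the leave-one-out residual for observation i equals the ordinary residual divided by one minus its leverage, e_i / (1 - h_ii), so the deletion models never need to be fitted. The sketch below illustrates one possible shape for such a helper. The function name cv_ss is hypothetical, and the built-in cars data set is used only as a stand-in for the assignment data. It is written with lm() fits in mind; loess() fits may need a different approach.

```r
# Hypothetical helper: leave-one-out cross-validation sum of squares
# for a model fitted with lm(), computed without re-fitting.
cv_ss <- function(fit) {
  e <- residuals(fit)   # ordinary residuals e_i
  h <- hatvalues(fit)   # leverages h_ii (diagonal of the hat matrix)
  sum((e / (1 - h))^2)  # sum of squared deleted residuals
}

# Illustration on a built-in data set (not the assignment data).
fit <- lm(dist ~ speed, data = cars)
cv_ss(fit)
```

Comparing the result against an explicit loop that drops one row at a time and re-fits the model is a simple way to check the helper on a small data set.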

When you are done, submit your answers by filling out this form, where you will also need to upload your R code.

  1. Download the file annual.csv and read it into R as a data frame called climate.

  2. Using the loess() function, fit non-parametric LOWESS models for Temp ~ CO2 using family = "gaussian" and span values 0.5, 0.75, and 1. Compute the leave-one-out cross-validation sum of squares for each of these three models and report them.

  3. Fit the simple linear regression model Temp ~ CO2. Compute and report the corresponding leave-one-out cross-validation sum of squares.

  4. Use lm() to fit a model for Temp as a degree-5 polynomial in CO2. Compute and report the corresponding leave-one-out cross-validation sum of squares.

  5. Use lm() to fit a basis spline model for Temp as a piecewise cubic spline function of CO2. The rank of the model should match the rank of the previous polynomial model. Compute and report the corresponding leave-one-out cross-validation sum of squares.

  6. Create a plot of Temp against CO2, and add the fitted curves for the last two models (the degree-5 polynomial and the cubic spline).

  7. Which of the six models above do you prefer? Justify your choice.