Assignment 4 - 2014
Due date(s): 27 February 2014, in class
I strongly recommend you submit this assignment electronically (see instructions on the course website), so that you can practice using the system for the course project.
Question 1 
The Paper basis dataset contains data, sampled 30 seconds apart, of the basis weight (a measure of paper density) from an industrial source.
Show the autocorrelation plot for the data, and interpret the plot.
600-level students: also confirm that your interpretation is correct by sub-sampling the vector and repeating the autocorrelation test. Hint: use the seq(start_from, end_at, step_size) command in R to subsample a vector.
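The workflow described above can be sketched roughly as follows. This is only an illustration: the vector basis is simulated here as a stand-in for the real basis weight data, and the step size k is an arbitrary value that you would instead read off your own autocorrelation plot.

```r
# Synthetic stand-in for the basis weight vector (NOT the real data):
# an AR(1) series, so that neighbouring samples are correlated.
set.seed(42)
basis <- as.numeric(arima.sim(model = list(ar = 0.8), n = 2000))

# Autocorrelation plot of the full series
acf.full <- acf(basis, main = "All samples")

# Keep every k-th sample; k = 10 is an illustrative choice only
k <- 10
idx <- seq(1, length(basis), k)
basis.sub <- basis[idx]

# Repeat the autocorrelation test on the subsampled vector
acf.sub <- acf(basis.sub, main = "Every k-th sample")
```

If the step size is large enough, the bars at non-zero lags in the second plot should mostly fall inside the confidence bounds drawn by acf.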
Question 2 [6, for 600-level students only]
Use the autocorrelation function on this data set, show the plot, and carefully interpret what the results imply. Hint: you will notice there is a missing value in the data set, so pass na.action=na.omit as the second input into the autocorrelation function.
Question 3 
This question uses two data sets. You may answer the question using either one of the data sets (your choice); however, 600-level students are expected to use both data sets and compare the results side-by-side. That is, don't repeat your analysis a second time below the first: analyze both data sets simultaneously and make comparisons between the two. Even 400-level students are encouraged to examine both data sets. Your answer may not exceed 4 pages.
The data record how long students took to write midterm tests, together with the grade each student achieved on the test. One column in the data is labelled Grade [percentage]; the other records the time taken to write the test.
This question is of an exploratory nature. You may consider the ideas below, but also feel free to add to these:
- Explore the data: is there a relationship between the grade and time taken to write the test? (Consider learning about and using the lowess function in R.) How would you describe the relationship?
- Should a regression model use Grade as the input variable?
- Build a suitable regression model using these two variables. What conclusions do you draw from the model?
- Investigate whether the assumptions for regression models hold true.
- What advice would you give to students based on these results?
- What do you learn from these data that is useful for course instructors to know?
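
As a starting point for the ideas above, a minimal R sketch might look like the following. It runs on simulated data, not the course data sets: the data frame tests and its column names duration and grade are hypothetical, and using grade as the output of the model is only one of the two possible choices, which you still need to justify in your answer.

```r
# Synthetic stand-in data (NOT the course data); names are hypothetical
set.seed(1)
n <- 80
duration <- runif(n, min = 30, max = 120)          # minutes spent writing
grade <- 55 + 0.15 * duration + rnorm(n, sd = 10)  # percentage
tests <- data.frame(duration = duration, grade = grade)

# Exploratory view: scatter plot with a lowess smoother overlaid
plot(tests$duration, tests$grade, xlab = "Time [minutes]", ylab = "Grade [%]")
smooth <- lowess(tests$duration, tests$grade)
lines(smooth, col = "red")

# Least squares model; the choice of input variable is illustrative only
model <- lm(grade ~ duration, data = tests)
summary(model)
confint(model)

# Assumption checks: normality of residuals and constant variance
res <- resid(model)
qqnorm(res)
qqline(res)
plot(fitted(model), res, xlab = "Fitted grade", ylab = "Residual")
abline(h = 0, lty = 2)
```

The lowess curve is useful mainly for judging whether a straight line is a reasonable description before any model is fitted; the residual plots then check the model that was actually built.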