Software tutorial/Testing a linear model in R

From Statistics for Engineering


In this section we show how to build a model from part of a data set and then test it on the rest. Construct an x-vector, input, and a y-vector, response, each with 200 observations. Use the first 150 observations to build the model, then use the remaining 50 to test it.

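# Simulate 200 observations: the true slope is 0.7, the true intercept is 50,
# and the noise has a standard deviation of 10.
input <- rnorm(200, mean=50, sd=12)
response <- 0.7*input + 50 + rnorm(200, sd=10)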

# Create index vectors that indicate observations for building and testing:
build.index <- seq(1, 150)
test.index  <- seq(151, 200)

# Build the model:
model <- lm(response ~ input, subset=build.index)
summary(model)

# Test the model. Create a data frame from the rest of the "input" x-variable;
# the column name must match the variable name used in the model formula.
x.new <- data.frame(input = input[test.index])
y.hat.new <- predict(model, newdata=x.new)

# Get the actual y-values from the testing data:
y.actual <- response[test.index]

# Plot the errors first, looking for structure.
errors <- (y.actual - y.hat.new)
plot(errors)

# Calculate RMSEP, then compare it to the model's standard error and to the
# spread of the residuals from the model-building data.
RMSEP <- sqrt(mean(errors^2))
RMSEP
summary(model)$sigma
summary(residuals(model))
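
The build-and-test steps above can also be collected into a single function, which makes it easy to repeat the comparison between RMSEP and the model's standard error. The following is only a minimal sketch: the function name holdout.check, its arguments, and the fixed seed are assumptions made for illustration and are not part of the original tutorial.

```r
# Hypothetical helper (not from the tutorial): simulate the data, fit the
# model on the first n.build observations, and return the test-set RMSEP
# next to the model's residual standard error.
holdout.check <- function(n = 200, n.build = 150, seed = 1) {
  set.seed(seed)  # assumed seed, used only to make the run reproducible
  input <- rnorm(n, mean = 50, sd = 12)
  response <- 0.7 * input + 50 + rnorm(n, sd = 10)

  # Index vectors for the building and testing observations
  build.index <- seq(1, n.build)
  test.index <- seq(n.build + 1, n)

  # Build the model on the building observations only
  model <- lm(response ~ input, subset = build.index)

  # Predict the testing observations and calculate the prediction errors
  x.new <- data.frame(input = input[test.index])
  errors <- response[test.index] - predict(model, newdata = x.new)

  c(RMSEP = sqrt(mean(errors^2)), SE = summary(model)$sigma)
}

result <- holdout.check()
print(result)
```

Because the simulated noise has a standard deviation of 10, both values should be near 10. An RMSEP much larger than the standard error would suggest that the model predicts new data worse than it fits the building data.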