Software tutorial/Testing a linear model in R

From Statistics for Engineering


<rst>
<rst-options: 'toc' = False/>
<rst-options: 'reset-figures' = False/>
In this section we show how to build a model from one part of a data set, and then test it on the remaining part. Construct an x-vector ``input`` and a y-vector ``response``, both with 200 observations. Use the first 150 observations to build the model, then use the remaining 50 to test it.

.. code-block:: s

    input <- rnorm(200, mean=50, sd=12)
    response <- 0.7*input + 50 + rnorm(200, sd=10)

    # Create index vectors that indicate observations for building and testing:
    build.index <- seq(1, 150)
    test.index <- seq(151, 200)

    # Build the model:
    model <- lm(response ~ input, subset=build.index)
    summary(model)

    # Test model. Create data frame from the rest of the "input" x-variable.
    x.new <- data.frame(input = input[test.index])
    y.hat.new <- predict(model, newdata=x.new)

    # Get the actual y-values from the testing data
    y.actual <- response[test.index]

    # Plot the errors first, looking for structure.
    errors <- (y.actual - y.hat.new)
    plot(errors)

    # Calculate RMSEP, and compare to model's standard error, and residuals.
    RMSEP <- sqrt(mean(errors^2))
    summary(residuals(model))
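
The comparison between the RMSEP and the model's standard error can be made explicit by extracting both as numbers. The following is a minimal, self-contained sketch rather than the only way to do it: the call to ``set.seed`` and the variable name ``SE`` are assumptions added here for illustration, and the data are regenerated so the block runs on its own. It relies on ``summary(model)$sigma``, which returns the residual standard error of a fitted ``lm`` object.

.. code-block:: s

    # Assumption: fix the random seed so that the sketch is reproducible
    set.seed(42)
    input <- rnorm(200, mean=50, sd=12)
    response <- 0.7*input + 50 + rnorm(200, sd=10)
    build.index <- seq(1, 150)
    test.index <- seq(151, 200)

    # Build the model on the first 150 observations
    model <- lm(response ~ input, subset=build.index)

    # Standard error of the model, estimated from the building data
    SE <- summary(model)$sigma

    # RMSEP, calculated from the 50 testing observations
    x.new <- data.frame(input = input[test.index])
    errors <- response[test.index] - predict(model, newdata=x.new)
    RMSEP <- sqrt(mean(errors^2))

    # Show the two values side by side
    c(SE = SE, RMSEP = RMSEP)

If the model predicts new data about as well as it fits the building data, the two values should be of similar magnitude. An RMSEP that is much larger than the standard error suggests the model does not generalize well to the testing observations.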

.. remove observations from an existing model and rebuild it: lm(model, subset=build) to update the model </rst>