Software tutorial/Testing a linear model in R

From Statistics for Engineering
Revision as of 18:42, 14 February 2013 by Kevin Dunn


<rst> <rst-options: 'toc' = False/> <rst-options: 'reset-figures' = False/> In this section we show how to build a model from part of a data set and then test it on the rest. Construct an x-vector ``input`` and a y-vector ``response``, both with 200 observations. Use the first 150 observations to build the model, then use the remaining 50 to test it.

.. code-block:: s

    input <- rnorm(200, mean=50, sd=12)
    response <- 0.7*input + 50 + rnorm(200, sd=10)

    # Create index vectors that indicate observations for building and testing:
    build.index <- seq(1, 150)
    test.index <- seq(151, 200)

    # Build the model:
    model <- lm(response ~ input, subset=build.index)
    summary(model)

    # Test model. Create data frame from the rest of the "input" x-variable.
    x.new <- data.frame(input = input[test.index])
    y.hat.new <- predict(model, newdata=x.new)

    # Get the actual y-values from the testing data
    y.actual <- response[test.index]

    # Plot the errors first, looking for structure.
    errors <- (y.actual - y.hat.new)
    plot(errors)

    # Calculate RMSEP, and compare to model's standard error, and residuals.
    RMSEP <- sqrt(mean(errors^2))
    summary(residuals(model))
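
The comparison between RMSEP and the model's standard error can be made explicit by printing the two values side by side. The following is a minimal sketch, not a definitive implementation: the helper ``rmsep()`` and the seed value are assumptions introduced here for illustration, and the data are regenerated so the block runs on its own.

.. code-block:: s

    # Illustrative sketch: rmsep() is a hypothetical helper, not part of base R.
    set.seed(42)  # arbitrary seed (an assumption) so the simulated data are reproducible

    input <- rnorm(200, mean=50, sd=12)
    response <- 0.7*input + 50 + rnorm(200, sd=10)
    build.index <- seq(1, 150)
    test.index <- seq(151, 200)

    # Build the model on the first 150 observations only
    model <- lm(response ~ input, subset=build.index)

    # Root mean square error of prediction for actual and predicted values
    rmsep <- function(actual, predicted) {
        sqrt(mean((actual - predicted)^2))
    }

    # Predict the 50 held-out observations and compute their RMSEP
    y.hat.new <- predict(model, newdata=data.frame(input = input[test.index]))
    RMSEP <- rmsep(response[test.index], y.hat.new)

    # Residual standard error estimated from the building data
    SE <- summary(model)$sigma

    # Show the two values side by side
    print(c(RMSEP=RMSEP, SE=SE))

If the testing data behave like the building data, one would expect the two values to be of similar size. An RMSEP much larger than the standard error suggests that the model does not predict new observations as well as it fits the building data.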

.. remove observations from an existing model and rebuild it: lm(model, subset=build) to update the model </rst>