Software tutorial/Linear models with multiple X-variables (MLR)

From Statistics for Engineering
← Investigating outliers, discrepancies and other influential points (previous step) Tutorial index Next step: Linear models with integer variables →


<rst> <rst-options: 'toc' = False/> <rst-options: 'reset-figures' = False/> Including multiple variables in a linear model in R is straightforward. This case is called multiple linear regression, MLR.

Just extend the formula you normally provide to the ``lm(...)`` function with extra terms. For example:

  • The standard univariate model, :math:`y = b_0 + b_1 x`, is represented as: ``y ~ x``
  • A model with extra explanatory variables, for example :math:`y = b_0 + b_1 x_1 + b_2 x_2`, is represented as: ``y ~ x1 + x2``
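
As a quick check of how the formula maps onto the model, the sketch below fits ``y ~ x1 + x2`` to a small synthetic data set in which :math:`y` is an exact linear function of two variables. The data frame ``toy``, the variable names and the coefficient values 2, 3 and -1 are made up for illustration and are not part of the tutorial's data. Note that the intercept :math:`b_0` is included automatically and does not need to be written in the formula.

.. code-block:: s

    # Hypothetical data: y = 2 + 3*x1 - 1*x2 exactly (no noise added)
    x1 <- c(1, 2, 3, 4, 5, 6)
    x2 <- c(2, 1, 4, 3, 6, 5)
    toy <- data.frame(x1 = x1, x2 = x2, y = 2 + 3 * x1 - 1 * x2)

    # One term per explanatory variable; the intercept is implied
    fit <- lm(y ~ x1 + x2, data = toy)

    # Coefficients come back in the order (Intercept), x1, x2
    coef(fit)

Because the data contain no noise, the fitted coefficients should match the values used to build ``y``, up to floating-point rounding.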

Using the stackloss example from earlier:

.. code-block:: s

    attach(stackloss)
    colnames(stackloss)
    # [1] "Air.Flow"   "Water.Temp" "Acid.Conc." "stack.loss"

    model <- lm(stack.loss ~ Air.Flow + Acid.Conc. + Water.Temp)
    summary(model)

    # Call:
    # lm(formula = stack.loss ~ Air.Flow + Acid.Conc. + Water.Temp)
    #
    # Residuals:
    #     Min      1Q  Median      3Q     Max
    # -7.2377 -1.7117 -0.4551  2.3614  5.6978
    #
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept) -39.9197    11.8960  -3.356  0.00375 **
    # Air.Flow      0.7156     0.1349   5.307  5.8e-05 ***
    # Acid.Conc.   -0.1521     0.1563  -0.973  0.34405
    # Water.Temp    1.2953     0.3680   3.520  0.00263 **
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    #
    # Residual standard error: 3.243 on 17 degrees of freedom
    # Multiple R-squared:  0.9136, Adjusted R-squared:  0.8983
    # F-statistic:  59.9 on 3 and 17 DF,  p-value: 3.016e-09
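
The columns of the coefficient table are related to one another: each t value is the estimate divided by its standard error, for example :math:`-39.9197 / 11.8960 \approx -3.356` for the intercept. The following is a minimal sketch of extracting that table from the summary and checking the relationship. It assumes the built-in ``stackloss`` data set and passes it through the ``data`` argument instead of relying on ``attach``; the names ``tab`` and ``t.check`` are only illustrative.

.. code-block:: s

    model <- lm(stack.loss ~ Air.Flow + Acid.Conc. + Water.Temp, data = stackloss)

    # Coefficient table as a numeric matrix, one row per model term
    tab <- coef(summary(model))
    colnames(tab)
    # [1] "Estimate"   "Std. Error" "t value"    "Pr(>|t|)"

    # Recompute the t values: estimate / standard error
    t.check <- tab[, "Estimate"] / tab[, "Std. Error"]
    all.equal(t.check, tab[, "t value"])

    # Residual degrees of freedom: 21 observations - 4 coefficients = 17
    df.residual(model)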

We can interrogate this ``model`` object in the same way as we did for the single :math:`x`-variable case.

  • ``resid(model)``: the residuals, returned as a numeric vector
  • ``fitted(model)``: the predicted values for the model-building data
  • ``coef(model)``: the model coefficients
  • ``confint(model)``: provides the marginal confidence intervals (recall there are joint and marginal confidence intervals)
  • ``predict(model, newdata = ...)``: predicts the response for new observations. For example, create a new data frame with 2 observations:

.. code-block:: s

    x.new <- data.frame(Air.Flow = c(56, 62), Water.Temp = c(18, 24), Acid.Conc. = c(82, 89))
    x.new
    #   Air.Flow Water.Temp Acid.Conc.
    # 1       56         18         82
    # 2       62         24         89

    y.new <- predict(model, newdata = x.new)
    y.new
    #        1        2
    # 10.99728 21.99798

</rst>
