Software tutorial/Building a least squares model in R



Note

Particularly useful references for the theory of least squares are Chapters 5, 9 and 10 of the book Introductory Statistics with R by Dalgaard. You might be able to access the PDF version through your company or university's subscription.

The lm(...) function is the primary tool for building a linear model in R. Its input must be a formula object (type help(formula) for more information). In the example below the formula is y ~ x. This says: "calculate the linear model that relates \(x\) to \(y\)"; or, equivalently: "build the linear model where \(y\) is regressed on \(x\)".

x <- c(1, 2, 3, 4, 5)
y <- c(2, 3, 4, 4, 5)
model <- lm(y ~ x)
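
The same model can also be built from a data frame, which is the more common pattern when the data are read in from a file. A minimal sketch, using the same values as above (the data frame name demo is illustrative):

demo <- data.frame(x = c(1, 2, 3, 4, 5),
                   y = c(2, 3, 4, 4, 5))
model <- lm(y ~ x, data = demo)   # same fit as lm(y ~ x) above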

The output from lm(...) is a linear model object, also called an lm object. In R you can get a description of most objects with the summary(...) command.

summary(model)
Call:
lm(formula = y ~ x)

Residuals:
        1         2         3         4         5
-2.00e-01  1.00e-01  4.00e-01 -3.00e-01  2.29e-16

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.5000     0.3317   4.523  0.02022 *
x             0.7000     0.1000   7.000  0.00599 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3162 on 3 degrees of freedom
Multiple R-squared: 0.9423,     Adjusted R-squared: 0.9231
F-statistic:    49 on 1 and 3 DF,  p-value: 0.005986

This output gives you the intercept and slope for the equation \(y = b_0 + b_1 x\); in this case it is \(y = 1.5 + 0.7x\). The residual standard error is \(S_E = 0.3162\) and \(R^2 = 0.9423\).
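Since the previous step covered vectors and matrices, these values can be verified directly from the least squares solution \(\mathbf{b} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}\). A short sketch of that check:

X <- cbind(1, x)                     # design matrix: a column of ones, then x
b <- solve(t(X) %*% X, t(X) %*% y)   # solves the normal equations for b0 and b1
b                                    # 1.5 and 0.7, matching the lm output

e <- y - X %*% b                     # residuals
sqrt(sum(e^2) / (length(y) - 2))     # S_E = 0.3162 on n - 2 = 3 degrees of freedom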
