Software tutorial/Basic plots in R

From Statistics for Engineering
← Basic data manipulation in R (previous step) Tutorial index Next step: Plots with multiple series, colour, and legends →


A simple sequence plot

We continue with the data set described in the previous section. Load the data set and plot the column called Visits. There are 4 columns in the data set, so we must specify which one to plot.

website <- read.csv('http://openmv.net/file/website-traffic.csv')
plot(website$Visits)

This produces the figure below. Note that by default R leaves a lot of white space around the figure. We will show later how to remove that.
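As a brief preview of where that white space comes from, here is a minimal sketch using the mar setting of par(), which holds the bottom, left, top and right margins measured in lines of text. The visits vector is made-up data standing in for website$Visits, and the margin values are only an illustrative choice, not the settings used later in this tutorial.

```r
# Made-up values standing in for website$Visits (an assumption for this sketch)
visits <- c(27, 31, 33, 34, 35, 38, 47, 40, 33, 35)

# par() returns the previous settings, so we can restore them afterwards
old.par <- par(mar = c(4, 4, 1, 1))   # bottom, left, top, right margins, in lines
plot(visits)
par(old.par)                          # put the default margins back
```

The default margins are c(5, 4, 4, 2) + 0.1, which is why the first figure has so much space above it, where a title would go.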

Website-traffic-base.jpg

That plot shows only the points (markers), in the order they appear in the data set. What if you want lines between the points? Type help(plot) to find out more about the plot command. For example, the help text says that type="p" shows just the points (the default setting for plot):

plot(website$Visits, type="p")

If you use type="l" you get a line plot:

plot(website$Visits, type="l")

and type="b" shows both lines and points, leaving a gap between each point and the connecting lines:

plot(website$Visits, type="b")

and type="o" will connect (overplot) the lines and points.
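To compare the four settings side by side, the sketch below draws them in a 2 x 2 grid. It uses a short made-up vector, visits, in place of website$Visits so that it runs without downloading the file. The grid layout via par(mfrow=...) is an addition for illustration and is not part of the original steps.

```r
# Made-up values standing in for website$Visits (an assumption for this sketch)
visits <- c(27, 31, 33, 34, 35, 38, 47, 40, 33, 35)

# Split the figure into 2 rows and 2 columns, keeping the old settings
old.par <- par(mfrow = c(2, 2))

# Draw the same data once for each plot type, with the type as the title
for (plot.type in c("p", "l", "b", "o")) {
  plot(visits, type = plot.type, main = paste0('type = "', plot.type, '"'))
}

par(old.par)   # restore the single-plot layout
```

Replace visits with website$Visits to see the same comparison on the real data.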

Box plots

The basic boxplot syntax is:

boxplot(website$Visits)

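To get boxplots for each day of the week, side by side, we rely on the fact that the column DayOfWeek is a categorical variable. R calls these factor variables, and you can check with is.factor(website$DayOfWeek). This returns TRUE in versions of R before 4.0.0; from R 4.0.0 onwards read.csv keeps text columns as character strings by default, so it returns FALSE unless you load the file with stringsAsFactors=TRUE or convert the column with factor(). We can then tell the boxplot command to group the boxplots by that variable. Read the help text, help(boxplot) and help(factor), to understand more clearly how this plot was generated.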

boxplot(website$Visits ~ website$DayOfWeek)

Website-traffic-boxplot-default.jpg
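The grouping step can be sketched on a small made-up data frame. The name traffic and its values are hypothetical and only mimic the DayOfWeek and Visits columns. The sketch converts the day column to a factor explicitly and lists the levels in calendar order, because R otherwise sorts factor levels alphabetically.

```r
# Made-up data mimicking two columns of the website data set (an assumption)
traffic <- data.frame(
  DayOfWeek = rep(c("Monday", "Wednesday", "Friday"), times = 4),
  Visits    = c(27, 31, 33, 30, 35, 38, 25, 40, 33, 29, 36, 34),
  stringsAsFactors = FALSE
)
is.factor(traffic$DayOfWeek)   # FALSE: the column is plain text at this point

# Convert to a factor; the order of 'levels' sets the order of the boxes
traffic$DayOfWeek <- factor(traffic$DayOfWeek,
                            levels = c("Monday", "Wednesday", "Friday"))
is.factor(traffic$DayOfWeek)   # TRUE

# One box per level of DayOfWeek, drawn in the level order given above
boxplot(Visits ~ DayOfWeek, data = traffic)
```

Writing the formula as Visits ~ DayOfWeek with data = traffic is equivalent to spelling out traffic$Visits ~ traffic$DayOfWeek, and it gives shorter axis labels.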