Software tutorial/Reading data into R

From Statistics for Engineering
← Getting started (previous step) Tutorial index Next step: Basic data manipulation in R →


Two videos on reading (and writing) data into R

The basics (CSV and tabular text files)

For advanced users (dput, zipped files, urls)


The most interesting data to analyze is always your own. How do we read your own data files into R? We will look at two cases: the data file is on your computer, or the data is available somewhere on the internet.

Note

For now we only consider comma-separated values (CSV) files. R can read other file types, such as XML files, and can also read data directly from databases and other sources. All the details are here.

Data on your hard drive

Go to the datasets website and download any data set, for example the Website traffic datasets. Save the file, and remember the location. For example: C:/Courses/4C3/data/website-traffic.csv

Note

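Use the "/" character in R to separate directories (folders), not a single "\" character, even in Windows. R treats "\" inside a string as the start of an escape sequence, so a Windows-style path only works if every backslash is doubled ("\\").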
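As a small illustration of the note above, base R's file.path function joins folder names with "/" on every operating system, so the separator never has to be typed by hand. The folder names are taken from the example path; the variable names data_file and windows_style are only illustrative.

```r
# Build the example path piece by piece; file.path() always joins with "/"
data_file <- file.path("C:", "Courses", "4C3", "data", "website-traffic.csv")
print(data_file)
# [1] "C:/Courses/4C3/data/website-traffic.csv"

# A single backslash starts an escape sequence inside an R string,
# so a Windows-style path must double every backslash
windows_style <- "C:\\Courses\\4C3\\data\\website-traffic.csv"
cat(windows_style, "\n")   # displays the path with single backslashes
```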

We will use the read.csv command to read these comma-separated values (CSV) files. If you look inside the website-traffic.csv file you will see how the data is stored: the columns are separated by commas, and each row is on a new line.

> website <- read.csv('C:/Courses/4C3/data/website-traffic.csv')

Linux and Mac users will have something like:

> website <- read.csv('/home/yourname/Courses/4C3/data/website-traffic.csv')

You will get NO output on the screen if the data are read in successfully; you will only see something if an error occurred.
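To try read.csv without downloading anything, the sketch below first writes a tiny CSV file to a temporary location and then reads it back. The four data rows are the first rows of the website traffic data, laid out the way a CSV file would plausibly store them; the temporary file and the names csv_lines, tmp and small are assumptions made only for this illustration.

```r
# Write a few rows in CSV form to a temporary file
# (a stand-in for a file you downloaded)
csv_lines <- c(
  "DayOfWeek,MonthDay,Year,Visits",
  "Monday,June 1,2009,27",
  "Tuesday,June 2,2009,31",
  "Wednesday,June 3,2009,38",
  "Thursday,June 4,2009,38"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# Reading succeeds silently: this line prints nothing
small <- read.csv(tmp)

# Ask for the size and column names explicitly to confirm the data arrived
dim(small)     # 4 rows, 4 columns
names(small)
```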

The <- operation means assign the result of the expression on the right to the variable name on the left. To see what the variable website looks like, just type website at the R command line:

> website

#     DayOfWeek      MonthDay Year Visits
# 1      Monday        June 1 2009     27
# 2     Tuesday        June 2 2009     31
# 3   Wednesday        June 3 2009     38
# 4    Thursday        June 4 2009     38
#  ...
# 211    Monday   December 28 2009     24
# 212   Tuesday   December 29 2009     18
# 213 Wednesday   December 30 2009     10
# 214  Thursday   December 31 2009      7
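Printing a long data frame fills the screen, so it can be handy to look at only part of it. The sketch below uses the base R functions head, tail, str and nrow. So that it runs on its own, it builds a stand-in data frame called demo (a hypothetical name) from the eight rows printed above; with the real data you would use website in its place.

```r
# Stand-in for `website`: the eight rows shown in the printout above
demo <- data.frame(
  DayOfWeek = rep(c("Monday", "Tuesday", "Wednesday", "Thursday"), 2),
  MonthDay  = c("June 1", "June 2", "June 3", "June 4",
                "December 28", "December 29", "December 30", "December 31"),
  Year      = rep(2009, 8),
  Visits    = c(27, 31, 38, 38, 24, 18, 10, 7)
)

head(demo, 3)   # first three rows
tail(demo, 3)   # last three rows
str(demo)       # column names and the type of each column
nrow(demo)      # number of rows
```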

Reading data from the internet

You can read the data directly from the internet. Go to the datasets website again and right-click on the CSV link for the data set you want to download. Your web browser should have the right-click option Copy Link Location, or Copy Shortcut or something similar.

This will copy the address of the data set to your clipboard. Then in R, you type:

website <- read.csv('http://openmv.net/file/website-traffic.csv')

where the part between quotation marks is the web address you copied. Paste the address from your clipboard rather than retyping it, to avoid typing errors.
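Reading over the internet can fail, for example when the address was copied incorrectly or the connection is down. A minimal sketch of one way to handle that uses base R's tryCatch. The helper name read_csv_or_null is hypothetical and not part of R, and the 10 second timeout is an arbitrary choice.

```r
# Give up on slow connections after 10 seconds (arbitrary value)
options(timeout = 10)

# Hypothetical helper: returns the data frame, or NULL if the read fails
read_csv_or_null <- function(address) {
  tryCatch(
    suppressWarnings(read.csv(address)),
    error = function(e) {
      message("Could not read ", address, ": ", conditionMessage(e))
      NULL
    }
  )
}

website <- read_csv_or_null('http://openmv.net/file/website-traffic.csv')
if (is.null(website)) {
  message("Check the address and your internet connection.")
} else {
  print(dim(website))   # number of rows and columns that were read
}
```

The same helper also works for a file on your hard drive, because read.csv accepts either a file location or a web address.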

Getting help

Before continuing further: if you ever need help with an R command, type help("name of command"). For example:

help(read.csv)

This will pop up a new window that tells you what read.csv does and shows examples of how to use it.
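A few related ways of looking up a command, sketched with base R only. The calls that open the help viewer are shown as comments; args prints a function's argument list directly in the console, which is often enough for a quick reminder.

```r
# These open the same help page (remove the leading "#" to try them):
# help("read.csv")
# ?read.csv

# Print the arguments that read.csv accepts, with their default values
args(read.csv)

# The argument names as a character vector
names(formals(read.csv))
```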
