Software tutorial/Reading data into R

From Statistics for Engineering
← Getting started (previous step) Tutorial index Next step: Basic data manipulation in R →


Two videos on reading (and writing) data into R

The basics (CSV and tabular text files)

For advanced users (dput, zipped files, urls)


<rst> <rst-options: 'toc' = False/> <rst-options: 'reset-figures' = False/>

The most interesting data to analyze is always your own. How do we read your own data files into R? We will look at two cases: the data file is on your computer, or the data is available somewhere on the internet.

.. note:: For now we only consider comma-separated values (CSV) files. R can also read other file types, such as XML files, and can read data directly from databases and other sources. All the `details are here <http://cran.r-project.org/doc/manuals/R-data.html>`_.

Data on your hard drive


Go to the `datasets website <http://openmv.net>`_ and download any data set, for example the **Website traffic** data set. Save the file, and remember the location. For example: ``C:/Courses/4C3/data/website-traffic.csv``

.. note:: Use the "``/``" character in R to separate directories (folders), even in Windows. A single "``\``" character starts an escape sequence inside an R string, so it will not work as a separator.
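
One way to sidestep the separator question is to let R assemble the path for you. The sketch below uses the base function ``file.path``, which joins its arguments with "``/``" on every platform. The folder names are the example ones from this tutorial, and the variable name ``csv_path`` is made up for the illustration.

.. code-block:: s

    # Build the path from its parts; file.path() always joins with "/"
    csv_path <- file.path("C:", "Courses", "4C3", "data", "website-traffic.csv")
    print(csv_path)

    # TRUE only if a file really exists at that location on your computer
    print(file.exists(csv_path))

If ``file.exists`` reports ``FALSE``, the path is the first thing to check before trying to read the file.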

We will use the ``read.csv`` command to read comma-separated values (CSV) files. If you look inside the ``website-traffic.csv`` file you will see how the data is stored: each column is separated by a comma, and each row is on a new line.

.. code-block:: s

    > website <- read.csv('C:/Courses/4C3/data/website-traffic.csv')

Linux and Mac users will have something like:

.. code-block:: s

    > website <- read.csv('/home/yourname/Courses/4C3/data/website-traffic.csv')

You will get **NO** output on the screen if the data are successfully read in; you will see a message only if an error occurred.
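
To see the file layout and the silent read in action without downloading anything, here is a minimal sketch that writes a tiny CSV to a temporary file and reads it back. The rows are copied from the start of the website traffic data; the names ``csv_lines``, ``tmp_csv`` and ``small`` are made up for this illustration.

.. code-block:: s

    # Raw CSV text: commas separate the columns, each row is on its own line
    csv_lines <- c("DayOfWeek,MonthDay,Year,Visits",
                   "Monday,June 1,2009,27",
                   "Tuesday,June 2,2009,31",
                   "Wednesday,June 3,2009,38")

    # Write the lines to a temporary file (a stand-in for your own file)
    tmp_csv <- tempfile(fileext = ".csv")
    writeLines(csv_lines, tmp_csv)

    # A successful read prints nothing; the result is stored in 'small'
    small <- read.csv(tmp_csv)

    # Ask for the size explicitly: number of rows, then number of columns
    print(dim(small))

The first line of the file is used for the column names, so it does not count as a row of data.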

The ``<-`` operation means *assign the result of the expression on the right to the variable name on the left*. To see what the variable ``website`` looks like, just type ``website`` at the R command line:

.. code-block:: s

    > website

    #     DayOfWeek    MonthDay Year Visits
    # 1      Monday      June 1 2009     27
    # 2     Tuesday      June 2 2009     31
    # 3   Wednesday      June 3 2009     38
    # 4    Thursday      June 4 2009     38
    # ...
    # 211    Monday December 28 2009     24
    # 212   Tuesday December 29 2009     18
    # 213 Wednesday December 30 2009     10
    # 214  Thursday December 31 2009      7
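
Printing every row becomes unwieldy for larger files. A few base R functions give a shorter overview. The sketch below applies them to a small stand-in data frame, typed in by hand from the first four rows shown above rather than read from the real file, so that it runs on its own; the name ``website_small`` is made up for the illustration.

.. code-block:: s

    # Stand-in for the real data: the first four rows of the website traffic file
    website_small <- data.frame(
        DayOfWeek = c("Monday", "Tuesday", "Wednesday", "Thursday"),
        MonthDay  = c("June 1", "June 2", "June 3", "June 4"),
        Year      = c(2009, 2009, 2009, 2009),
        Visits    = c(27, 31, 38, 38)
    )

    print(head(website_small, 2))         # only the first 2 rows
    print(dim(website_small))             # number of rows, then number of columns
    str(website_small)                    # the type of each column
    print(summary(website_small$Visits))  # minimum, quartiles, mean, maximum

The same calls should work unchanged on the full ``website`` variable once it has been read in.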

Reading data from the internet


You can read the data directly from the internet. Go to the `datasets website <http://openmv.net>`_ again and right-click on the CSV link for the data set you want to download. Your web browser should have the right-click option :menuselection:`Copy Link Location`, or :menuselection:`Copy Shortcut` or something similar.

This will copy the address of the data set to your clipboard. Then in R, you type:

.. code-block:: s

    website <- read.csv('http://openmv.net/file/website-traffic.csv')

where the part between quotation marks is the web address you copied. Paste the address from the clipboard instead of retyping it, to avoid typing errors.
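
Reading over the network can fail, for instance when the address is mistyped or there is no connection. One possible way to handle this, sketched below, is to wrap ``read.csv`` in ``tryCatch``. The helper name ``read_csv_or_null`` is invented for this example and is not part of R; the demonstration uses a file that deliberately does not exist, so that it runs without a connection.

.. code-block:: s

    # Hypothetical helper: returns the data frame, or NULL if reading fails
    read_csv_or_null <- function(address) {
        tryCatch(
            read.csv(address),
            error = function(e) {
                message("Could not read ", address, ": ", conditionMessage(e))
                NULL
            }
        )
    }

    # With a working connection, this would be used as:
    #   website <- read_csv_or_null('http://openmv.net/file/website-traffic.csv')

    # Demonstration with an address that cannot be opened (no such file)
    missing <- read_csv_or_null(file.path(tempdir(), "no-such-file.csv"))
    print(is.null(missing))

Checking the result with ``is.null`` lets a longer script stop early with a clear message instead of failing later on.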

Getting help


Before continuing, note that if you ever need help with an R command, you can type ``help("name of command")``. For example:

.. code-block:: s

    help(read.csv)

This will pop up a new window that tells you what ``read.csv`` does and *shows examples* of how to use it. </rst>
