Assignment 2 - 2012

Due date(s):	23 January 2012, noon
(PDF)	Assignment questions
(PDF)	Assignment solutions

Assignment objectives
=====================

- Use a table of normal distributions to calculate probabilities
- Summarize data by means and standard deviations, and their robust equivalents
- Download data and analyze it

Question 1 [2]
==============

Estimate the following:

#. Without using tables or a computer: the cumulative area under the normal distribution between 15 and 35, with a mean of 25 and a standard deviation of 5.
#. The same as part 1, but using a table of normal distributions from the course notes (or another statistics textbook).
#. Between which lower and upper bounds will we find 60% probability of an event occurring, using the standardized (:math:`z`) normal distribution? Calculate your answer using a printed table, ensuring that the two bounds are symmetrical about zero.
#. Convert these dimensionless :math:`z`-bounds to real-world bounds for a process with a mean of 100 kg and a standard deviation of 25 kg.
#. Verify your previous two answers using R, or other computer software.
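The software check in the last part can be done in any package that exposes the normal cumulative distribution and its inverse. The following is a minimal sketch in Python rather than R, using only the standard library; the variable names are illustrative and not part of the assignment.

```python
from statistics import NormalDist

# Parts 1 and 2: area between 15 and 35 for a normal distribution
# with mean 25 and standard deviation 5
dist = NormalDist(mu=25, sigma=5)
area = dist.cdf(35) - dist.cdf(15)

# Part 3: symmetric bounds enclosing 60% leave 20% in each tail,
# so the upper bound is the 80th percentile of the standard normal
z = NormalDist()  # mean 0, standard deviation 1
z_upper = z.inv_cdf(0.80)
z_lower = -z_upper

# Part 4: convert z-bounds to real-world units with x = mean + z * sd
mean_kg, sd_kg = 100, 25
lower_kg = mean_kg + z_lower * sd_kg
upper_kg = mean_kg + z_upper * sd_kg

print(f"Area between 15 and 35: {area:.4f}")
print(f"z-bounds: {z_lower:.4f} to {z_upper:.4f}")
print(f"Real-world bounds: {lower_kg:.2f} kg to {upper_kg:.2f} kg")
```

In R, the corresponding functions are ``pnorm`` and ``qnorm``.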

Question 2 [3]
==============

A chicken facility produces bags filled with breaded chicken strips. The advertised weight for each package is 750 grams. Each bag contains between 8 and 15 strips, and each strip weighs between 40 and 80 grams, drawn from a uniform distribution. The company sets its target fill weight at 790 grams to avoid breaking regulations that require accurate package labelling.

#. If we take a large sample of bagged chicken strips and weigh each bag, from which distribution do we expect these weights to come?
#. Clearly explain why.
#. If the standard deviation of this large sample of bag weights is 12 grams, out of 10,000 customers, how many will purchase bags below the advertised 750g weight?
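The arithmetic in the last part can be sketched as a lower-tail probability. The sketch below assumes that bag weights are approximately normal and centred on the 790 gram target fill weight; that assumption is the subject of the first two parts and is not established by the code.

```python
from statistics import NormalDist

# Assumption: bag weights are approximately normal, centred on the
# 790 g target fill weight, with the stated 12 g standard deviation
bag_weights = NormalDist(mu=790, sigma=12)

# Fraction of bags lighter than the advertised 750 g
fraction_below = bag_weights.cdf(750)

# Expected number of such bags among 10,000 customers
customers = 10_000
expected_below = customers * fraction_below

print(f"z-value for 750 g: {(750 - 790) / 12:.2f}")
print(f"Fraction below 750 g: {fraction_below:.6f}")
print(f"Expected customers out of {customers}: {expected_below:.1f}")
```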

Question 3 [3]
==============

#. Compute the mean, median, standard deviation and MAD for the salt content of various potato chips `in this report <http://beta.images.theglobeandmail.com/archive/00245/Read_the_report_245543a.pdf>`_ (page 22), as described in the article from the `Globe and Mail <http://www.theglobeandmail.com/life/health/salt-variation-between-brands-raises-call-for-cuts/article1299117/>`_ on 24 September 2009.

#. Plot a boxplot of the data and report the interquartile range (IQR). Comment on the 3 measures of spread you have calculated: standard deviation, MAD, and interquartile range.

#. Comment on the effectiveness of the visualization plots used in the PDF report.
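The summary statistics named above can be sketched as follows. The values in ``salt_mg`` are hypothetical placeholders chosen to include one unusually high reading; they are not taken from the report. The MAD is scaled by 1.4826, the constant that R's ``mad`` function applies by default so that it is comparable to the standard deviation for normally distributed data.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical placeholder values, NOT the data from the report
salt_mg = [120, 150, 160, 170, 180, 190, 200, 210, 230, 480]

location_mean = mean(salt_mg)
location_median = median(salt_mg)
spread_sd = stdev(salt_mg)  # sample standard deviation (n - 1 denominator)

# Median absolute deviation, scaled to be comparable with the
# standard deviation when the data are normally distributed
abs_deviations = [abs(x - location_median) for x in salt_mg]
spread_mad = 1.4826 * median(abs_deviations)

# Interquartile range from the first and third quartiles
q1, _, q3 = quantiles(salt_mg, n=4, method="inclusive")
spread_iqr = q3 - q1

print(f"mean = {location_mean}, median = {location_median}")
print(f"sd = {spread_sd:.2f}, MAD = {spread_mad:.2f}, IQR = {spread_iqr}")
```

With a single extreme value in the data, the standard deviation is inflated far more than the MAD or the IQR, which is the contrast the question asks you to comment on.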

Question 4 [4]
==============

Data `characterizing 200 commuting trips of your instructor <http://openmv.net/info/travel-times>`_ was visualized in the previous assignment.

#. Plot a histogram of the ``TotalTime`` variable (the total time for the commute) to confirm the variable is not normally distributed.

#. How would you characterize the distribution of the ``TotalTime`` variable? Give reasons *why* the variable is not normally distributed.

#. Confirm the variable is not normally distributed using a suitable visual statistical test.

#. The 407 highway speeds are almost always much faster than the 403. Does the ``MaxSpeed`` variable (the maximum speed recorded during the entire trip, usually while travelling the 407) follow a normal distribution? Plot both a histogram and a q-q plot to check.
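A q-q plot pairs each ordered observation with the standard normal quantile at the same cumulative probability; points from normally distributed data fall close to a straight line. The sketch below computes those pairs without a plotting library and uses their correlation as a rough numerical stand-in for "how straight is the line". Both samples are simulated for illustration and are not the travel-times data; the helper names ``qq_points`` and ``pearson`` are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def qq_points(sample):
    """Return (theoretical z-quantile, ordered observation) pairs."""
    ordered = sorted(sample)
    n = len(ordered)
    z = NormalDist()
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1)
    return [(z.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(ordered, start=1)]

def pearson(pairs):
    """Pearson correlation of the x and y values in a list of pairs."""
    xs, ys = zip(*pairs)
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(2012)  # fixed seed so the run is repeatable

# Simulated samples for illustration only
normal_like = [rng.gauss(100, 10) for _ in range(200)]
right_skewed = [20 + rng.expovariate(1 / 10) for _ in range(200)]

r_normal = pearson(qq_points(normal_like))
r_skewed = pearson(qq_points(right_skewed))

print(f"q-q correlation, normal-like sample:  {r_normal:.4f}")
print(f"q-q correlation, right-skewed sample: {r_skewed:.4f}")
```

Passing the pairs from ``qq_points`` to any scatter-plot routine gives the q-q plot itself; in R, ``qqnorm`` draws it directly. The correlation is only a summary and does not replace looking at the plot.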

Question 5 [3]
==============

In this question we investigate the stock prices for the Canadian National Railway Company (ticker ``CNR`` on the Toronto Stock Exchange).

#. Visit https://finance.yahoo.com/
#. Type in ``CNR.TO`` in the symbol (ticker) box
#. Click **Historical Prices** in the left column
#. Change the date range from 01 March 2011 to 01 January 2012
#. Click **Get Prices** to get the "Daily" prices of the stock
#. Scroll to the bottom of the page and click "Download to spreadsheet" to download a CSV file

Once you have loaded the CSV file into R, answer the following questions regarding the ``Adj.Close`` column (the price at which the stock closes at the end of the trading day, after adjusting it for stock splits and dividends paid):

#. Are these closing prices from a normal distribution? Test your answer with a q-q plot.
#. Estimate the distribution's location and spread, assuming the data are from a normal distribution. 600-level students must use the ``fitdistr`` function in R from the MASS package.
#. Are these data points independent?
#. What is the probability of observing a stock value above $77.00?

**Note**: the purpose of this exercise is more for you to become comfortable with web-based data retrieval, which is common in most companies.
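The estimation, independence and tail-probability steps for the closing prices can be sketched as below, in Python rather than R. The prices in ``adj_close`` are hypothetical placeholders, not downloaded ``CNR.TO`` data, and ``lag1_autocorrelation`` is an illustrative helper. ``NormalDist.from_samples`` estimates the spread with the sample standard deviation (n - 1 denominator), whereas maximum-likelihood fitting, as done by ``fitdistr`` for the normal distribution, divides by n, so the two estimates differ slightly.

```python
from statistics import NormalDist, fmean

# Hypothetical adjusted closing prices (placeholders, not real data)
adj_close = [70.1, 70.8, 71.5, 71.2, 72.0, 72.9, 73.4, 73.1, 74.0, 74.6]

# Location and spread, assuming the prices are normally distributed
fitted = NormalDist.from_samples(adj_close)

# Probability of a value above $77.00 under the fitted distribution
p_above = 1 - fitted.cdf(77.00)

def lag1_autocorrelation(series):
    """Correlation between each value and the one that follows it."""
    centre = fmean(series)
    numerator = sum((a - centre) * (b - centre)
                    for a, b in zip(series, series[1:]))
    denominator = sum((x - centre) ** 2 for x in series)
    return numerator / denominator

r1 = lag1_autocorrelation(adj_close)

print(f"location = {fitted.mean:.2f}, spread = {fitted.stdev:.2f}")
print(f"P(price > 77.00) = {p_above:.5f}")
print(f"lag-1 autocorrelation = {r1:.3f}")
```

A lag-1 autocorrelation far from zero suggests that consecutive prices are not independent, in which case the normal-based probability should be interpreted with caution.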

