Assignment 2 - 2012

From Statistics for Engineering
Due date(s): 23 January 2012, noon
(PDF) Assignment questions
(PDF) Assignment solutions

<rst> <rst-options: 'toc' = False/> <rst-options: 'reset-figures' = False/>

Assignment objectives
=====================

.. rubric:: Assignment objectives:

  • Use a table of normal distributions to calculate probabilities
  • Summarize data by means and standard deviations, and their robust equivalents
  • Download data and analyze it

Question 1 [2]
==============

Estimate the following:

  1. Without using tables or a computer: the area under the normal distribution between 15 and 35, with a mean of 25 and a standard deviation of 5.
  2. The same as part 1, but using a table of normal distributions from the course notes (or another statistics textbook).
  3. Between which lower and upper bounds will we find 60% probability of an event occurring, using the standardized (:math:`z`) normal distribution? Calculate your answer using a printed table, ensuring that the two bounds are symmetrical about zero.
  4. Convert these dimensionless :math:`z`-bounds to real-world bounds for a process with a mean of 100 kg and a standard deviation of 25 kg.
  5. Verify your previous two answers using R, or other computer software.
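
Part 5 allows R or other software. The following is a minimal sketch of such a check, assuming Python's standard-library ``statistics.NormalDist`` is an acceptable substitute (in R, ``pnorm`` and ``qnorm`` play the same roles). The helper names are hypothetical and not part of the course material.

```python
from statistics import NormalDist

def central_probability(lower, upper, mean, sd):
    """Area under a normal curve between `lower` and `upper`."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)

def symmetric_z_bounds(coverage):
    """Bounds (-z, +z) on the standard normal that enclose `coverage` probability."""
    tail = (1.0 - coverage) / 2.0          # probability left in each tail
    z_upper = NormalDist().inv_cdf(1.0 - tail)
    return -z_upper, z_upper

def to_real_units(z, mean, sd):
    """Convert a dimensionless z-value to the process's own units."""
    return mean + z * sd

area = central_probability(15, 35, mean=25, sd=5)
z_lo, z_hi = symmetric_z_bounds(0.60)
kg_lo = to_real_units(z_lo, mean=100, sd=25)
kg_hi = to_real_units(z_hi, mean=100, sd=25)

print(f"Area between 15 and 35: {area:.4f}")
print(f"Symmetric 60% z-bounds: {z_lo:.3f} to {z_hi:.3f}")
print(f"Real-world bounds: {kg_lo:.1f} kg to {kg_hi:.1f} kg")
```

Comparing these values against the printed table is a useful check on interpolation and rounding in the hand calculation.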

Question 2 [3]
==============

A chicken facility produces bags filled with breaded chicken strips. The advertised weight for each package is 750 grams. Each bag contains between 8 and 15 strips, given that each chicken strip weighs between 40 and 80 grams and comes from a uniform distribution. The company sets its target fill weight at 790 grams to avoid breaking regulations that require accurate package labelling.

  1. If we take a large sample of bagged chicken strips and weigh each bag, from which distribution would we expect these weights to come?
  2. Clearly explain why.
  3. If the standard deviation of this large sample of bag weights is 12 grams, out of 10,000 customers, how many will purchase bags below the advertised 750 g weight?
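
For part 3, one way to check a hand calculation is sketched below in Python. It assumes the bag weights are normally distributed around the 790 gram target with the stated 12 gram standard deviation. The function name is hypothetical.

```python
from statistics import NormalDist

def expected_underweight(target, sd, advertised, customers):
    """Expected number of customers receiving a bag lighter than `advertised`."""
    # Assumption: bag weights follow a normal distribution centred on the target.
    bag_weights = NormalDist(mu=target, sigma=sd)
    fraction_below = bag_weights.cdf(advertised)
    return fraction_below * customers

count = expected_underweight(target=790, sd=12, advertised=750, customers=10_000)
print(f"Expected underweight bags per 10,000 customers: {count:.1f}")
```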

Question 3 [3]
==============

  1. Compute the mean, median, standard deviation and MAD for salt content of various potato chips `in this report <http://beta.images.theglobeandmail.com/archive/00245/Read_the_report_245543a.pdf>`_ (page 22) as described in the article from the `Globe and Mail <http://www.theglobeandmail.com/life/health/salt-variation-between-brands-raises-call-for-cuts/article1299117/>`_ on 24 September 2009.
  2. Plot a boxplot of the data and report the interquartile range (IQR). Comment on the 3 measures of spread you have calculated: standard deviation, MAD, and interquartile range.
  3. Comment on the effectiveness of the visualization plots used in the PDF report.
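
The summary statistics in parts 1 and 2 can be sketched as follows in Python. The data values are hypothetical (they are not taken from the report) and include one deliberate outlier. The 1.4826 scaling of the MAD is an assumption chosen to make it comparable to a standard deviation for normal data, and the ``inclusive`` quartile method is one of several conventions, so other software may report a slightly different IQR.

```python
from statistics import mean, median, stdev, quantiles

def scaled_mad(values, constant=1.4826):
    """Median absolute deviation, scaled to be comparable to a standard deviation."""
    centre = median(values)
    return constant * median(abs(v - centre) for v in values)

def interquartile_range(values):
    """Distance between the 25th and 75th percentiles ('inclusive' convention)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical salt contents in mg per serving (NOT the values from the report).
salt = [180, 200, 210, 220, 230, 240, 250, 260, 270, 600]

salt_mean = mean(salt)
salt_median = median(salt)
salt_sd = stdev(salt)              # sample standard deviation (n - 1 divisor)
salt_mad = scaled_mad(salt)
salt_iqr = interquartile_range(salt)

print(f"mean={salt_mean}, median={salt_median}")
print(f"sd={salt_sd:.1f}, MAD={salt_mad:.1f}, IQR={salt_iqr:.1f}")
```

With the single outlier in this made-up sample, the standard deviation is inflated far more than the MAD or the IQR, which is the kind of contrast part 2 asks you to comment on.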

Question 4 [4]
==============

Data `characterizing 200 commuting trips of your instructor <http://openmv.net/info/travel-times>`_ was visualized in the previous assignment.

  1. Plot a histogram of the ``TotalTime`` variable (the total time for the commute) to confirm the variable is not normally distributed.
  2. How would you characterize the distribution of the ``TotalTime`` variable? Give reasons *why* the variable is not normally distributed.
  3. Confirm the variable is not normally distributed by using a suitable, visual statistical test.
  4. The 407 highway speeds are almost always much faster than the 403. Does the ``MaxSpeed`` variable (the maximum speed recorded during the entire trip, usually while travelling the 407) follow a normal distribution? Plot both a histogram and a q-q plot to check.
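
A q-q plot pairs each sorted observation with the quantile a normal distribution would predict for that rank. The Python sketch below computes those pairs without drawing them. The sample values are hypothetical, not the commute data, and the plotting position ``(i - 0.5) / n`` is one common convention assumed here; other software may use a slightly different one.

```python
from statistics import NormalDist

def qq_points(values):
    """Pair each sorted observation with its theoretical standard-normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    standard_normal = NormalDist()
    points = []
    for i, value in enumerate(ordered, start=1):
        p = (i - 0.5) / n  # plotting position (an assumed convention)
        points.append((standard_normal.inv_cdf(p), value))
    return points

# Hypothetical, right-skewed commute times in minutes (NOT the course data set).
times = [28, 29, 30, 31, 45]
for z, t in qq_points(times):
    print(f"{z:+.3f}  {t}")
```

If the data were normally distributed, these points would fall close to a straight line when plotted; systematic curvature or a stray tail suggests otherwise.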


Question 5 [3]
==============

In this question we investigate the stock prices for the Canadian National Railway Company (ticker ``CNR`` on the Toronto Stock Exchange).

  • Visit https://finance.yahoo.com/
  • Type in ``CNR.TO`` in the symbol (ticker) box
  • Click **Historical Prices** in the left column
  • Change the date range from 01 March 2011 to 01 January 2012
  • Click **Get Prices** to get the "Daily" prices of the stock
  • Scroll to the bottom of the page and click "Download to spreadsheet" to download a CSV file

Once you have loaded the CSV file into R, answer the following questions regarding the ``Adj.Close`` column (the price at which the stock closes at the end of the trading day, after adjusting it for stock splits and dividends paid):

  1. Are these closing prices from a normal distribution? Test your answer with a q-q plot.
  2. Estimate the distribution's location and spread, assuming the data are from a normal distribution. 600-level students must use the ``fitdistr`` function in R from the MASS package.
  3. Are these data points independent?
  4. What is the probability of observing a stock value above $77.00?

**Note**: the purpose of this exercise is more for you to become comfortable with web-based data retrieval, which is common in most companies.
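
Parts 2 to 4 can be sketched in Python as shown below. The prices are hypothetical, not downloaded CNR data, and the helper names are illustrative. The fit uses the maximum-likelihood form of the standard deviation (divisor ``n``), which is understood to be what ``fitdistr`` reports for a normal distribution. The lag-1 autocorrelation is one simple, assumed way to probe independence.

```python
from math import sqrt
from statistics import NormalDist, fmean

def fit_normal_mle(values):
    """Maximum-likelihood normal fit: sample mean and the divisor-n standard deviation."""
    mean = fmean(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return NormalDist(mu=mean, sigma=sqrt(variance))

def lag1_autocorrelation(values):
    """Correlation between each observation and the one that follows it."""
    mean = fmean(values)
    deviations = [v - mean for v in values]
    numerator = sum(a * b for a, b in zip(deviations, deviations[1:]))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator

# Hypothetical adjusted closing prices in dollars (NOT real CNR.TO data).
prices = [70.1, 70.8, 71.5, 72.3, 71.9, 73.0, 74.2, 75.1, 74.8, 76.0]

fitted = fit_normal_mle(prices)
prob_above = 1.0 - fitted.cdf(77.00)
r1 = lag1_autocorrelation(prices)

print(f"location={fitted.mean:.2f}, spread={fitted.stdev:.2f}")
print(f"P(price > 77.00) = {prob_above:.4f}")
print(f"lag-1 autocorrelation = {r1:.2f}")
```

A lag-1 autocorrelation far from zero, as in this trending made-up series, indicates that consecutive values are not independent, which in turn weakens any probability computed from the fitted normal distribution.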

.. raw:: latex

    \vspace{0.5cm} \hrule \begin{center}END\end{center}

</rst>