Assignment 2 - 2011 - Solution

From Statistics for Engineering
Revision as of 13:23, 22 September 2018 by Kevin Dunn (talk | contribs)

<rst> <rst-options: 'toc' = False/> <rst-options: 'reset-figures' = False/>

.. rubric:: Assignment objectives

- A review of basic probability, histograms and sample statistics.
- Collect data from multiple sources, consolidate it, and analyze it.
- Deal with issues that are prevalent in real data sets.
- Improve your skills with R (if you are using R for the course).

**Notes**:

- I would normally expect you to spend between 3 and 5 hours outside of class on assignments. This assignment should take about that long. Answer with bullet points, not in full paragraphs.
- **Numbers in bold** next to the question are the grading points. Read more about the `assignment grading system <http://stats4eng.connectmv.com/wiki/Assignment_grading_system>`_.
- 600-level students must complete all the questions; 400-level students may attempt the 600-level question for extra credit. 600-level students must also read the paper by PJ Rousseeuw, "`Tutorial to Robust Statistics <http://dx.doi.org/10.1002/cem.1180050103>`_".

Question 1 [1]
==============

Recall from class that :math:`\mu = \mathcal{E}(x) = \frac{1}{N}\sum{x}` and :math:`\mathcal{V}\left\{x\right\} = \mathcal{E}\left\{ (x - \mu )^2\right\} = \sigma^2 = \frac{1}{N}\sum{(x-\mu)^2}`.

#. What is the expected value of a throw of a fair 12-sided die?
#. What is the expected variance of a fair 12-sided die?
#. Simulate 10,000 throws of this die in R, MATLAB, or Python and see if your answers match those above. Record the average value from the 10,000 throws.
#. Repeat the simulation for the average value of the die a total of 10 times. Calculate and report the mean and standard deviation of these 10 simulations and *comment* on the results.


Solution


The objective of this question is to recall basic probability rules.

#. Let :math:`X` represent a discrete random variable for the event of throwing a fair die. Let :math:`x_{i}` for :math:`i=1,\ldots,12` represent the numerical or realized values of the outcome of the random event given by :math:`X`. Now we can define the expected value of :math:`X` as

   .. math::

       \mathcal{E}(X)=\sum_{i=1}^{12}x_{i}P(x_{i})

   where the probability of obtaining a value of :math:`1,\ldots,12` is :math:`P(x_{i})=1/N=1/12 \;\forall\; i=1,\ldots,12`. So we have

   .. math::

       \mathcal{E}(X)=\frac{1}{N}\sum_{i=1}^{12}x_{i}=\frac{1}{12}\left(1+2+\cdots+12\right)=\mathbf{6.5}

#. Continuing the notation from the previous part, we can derive the expected variance as

   .. math::

       \mathcal{V}(X)&=\mathcal{E}\left\{[X-\mathcal{E}(X)]^{2}\right\}\\
       &=\mathcal{E}(X^{2})-[\mathcal{E}(X)]^{2}

   where :math:`\mathcal{E}(X^{2})=\sum_{i}x_{i}^{2}P(x_{i})`. So we can now calculate :math:`\mathcal{V}(X)` as

   .. math::

       \mathcal{V}(X)&=\sum_{i=1}^{12}x_{i}^{2}P(x_{i})-\left[\sum_{i=1}^{12}x_{i}P(x_{i})\right]^{2}\\
       &=\frac{1}{12}(1^{2}+2^{2}+\cdots+12^{2}) - [6.5]^{2}\approx \mathbf{11.9167}

#. Simulating 10,000 throws corresponds to 10,000 independent random events, each with an outcome in the set :math:`\mathcal{S}=\{1,2,\ldots,12\}`. The sample mean and variance from my sample were:

.. math::

    \overline{x} &= 6.4925\\
    s^2 &= 11.77915
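
As a cross-check on the two theoretical values, the standard closed forms for a fair :math:`N`-sided die (a discrete uniform distribution on :math:`1,\ldots,N`) give the same numbers when :math:`N=12`. These closed forms are not part of the original solution; they are added here only as a sanity check:

.. math::

    \mathcal{E}(X) &= \frac{N+1}{2} = \frac{13}{2} = 6.5\\
    \mathcal{V}(X) &= \frac{N^{2}-1}{12} = \frac{143}{12} \approx 11.9167

The sample mean and variance from the 10,000 simulated throws are close to both.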

.. twocolumncode::
    :code1: ../che4c3/Assignments/Assignment-2/code/q1c.R
    :language1: s
    :header1: R code
    :code2: ../che4c3/Assignments/Assignment-2/code/q1c.m
    :language2: matlab
    :header2: MATLAB code
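
For readers working in Python, the following is a minimal sketch of the same idea, not a translation of the course's R or MATLAB files. It first evaluates the two theoretical values exactly from the definitions and then simulates 10,000 throws. The variable names and the seed value are assumptions chosen for illustration; a different seed gives slightly different sample statistics.

.. code-block:: python

    import random
    from fractions import Fraction
    from statistics import mean, variance

    N_SIDES = 12
    faces = range(1, N_SIDES + 1)

    # Exact values from the definitions, with P(x_i) = 1/12 for every face
    p = Fraction(1, N_SIDES)
    exact_mean = sum(x * p for x in faces)
    exact_var = sum(x**2 * p for x in faces) - exact_mean**2

    # Simulate 10,000 independent throws (the seed is arbitrary, for reproducibility)
    rng = random.Random(2011)
    throws = [rng.randint(1, N_SIDES) for _ in range(10_000)]
    sample_mean = mean(throws)
    sample_var = variance(throws)  # sample variance, n - 1 in the denominator

    print("theoretical:", float(exact_mean), float(exact_var))
    print("simulated:  ", sample_mean, sample_var)

Using ``Fraction`` keeps the theoretical calculation free of rounding, so it can be compared directly with the hand calculation above.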

#. Repeating the above simulation 10 times (i.e., 10 independent experiments) produces 10 different estimates of :math:`\mu` and :math:`\sigma^2`. Note that everyone's answer should be slightly different, and different each time you run the simulation.

.. twocolumncode::
    :code1: ../che4c3/Assignments/Assignment-2/code/q1d.R
    :language1: s
    :header1: R code
    :code2: ../che4c3/Assignments/Assignment-2/code/q1d.m
    :language2: matlab
    :header2: MATLAB code
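
A comparable Python sketch of the repeated experiment is shown below. It is an illustration only, not the course's code: the helper name ``average_of_throws`` and the seed are assumptions. It collects the 10 averages, then compares their standard deviation with the value expected from theory, :math:`\sqrt{\sigma^2/n}`.

.. code-block:: python

    import math
    import random
    from statistics import mean, stdev

    N_SIDES = 12
    N_THROWS = 10_000
    N_REPEATS = 10

    def average_of_throws(rng, n_throws=N_THROWS, n_sides=N_SIDES):
        """Average value from n_throws independent throws of a fair die."""
        return mean(rng.randint(1, n_sides) for _ in range(n_throws))

    rng = random.Random(42)  # arbitrary seed, for reproducibility
    averages = [average_of_throws(rng) for _ in range(N_REPEATS)]

    grand_mean = mean(averages)    # average of the 10 averages
    sd_of_means = stdev(averages)  # standard deviation of the 10 averages

    # Theoretical variance of one throw, from the definition used in the solution
    sigma2 = sum(x**2 for x in range(1, N_SIDES + 1)) / N_SIDES - 6.5**2
    theoretical_sd = math.sqrt(sigma2 / N_THROWS)

    print("mean of the 10 averages:", grand_mean)
    print("std dev of the 10 averages:", sd_of_means)
    print("theoretical std dev of an average:", theoretical_sd)

With only 10 averages the estimated standard deviation is itself quite variable, so it should be near the theoretical value but will not match it exactly.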

Note that, by the central limit theorem, each :math:`\overline{x}` is approximately distributed as :math:`\mathcal{N}\left(\mu, \sigma^2/n \right)`, where :math:`n = 10000`. We know what :math:`\sigma^2` is in this case: it is our theoretical value of **11.92**, calculated earlier, and for :math:`n=10000` samples, our :math:`\overline{x} \sim \mathcal{N}\left(6.5, 0.00119167\right)`.

Calculating the average of those 10 means, let's call that :math:`\overline{\overline{x}}`, shows values around 6.5, the theoretical mean.

Calculating the variance of those 10 means shows numbers that are around 0.00119167 (a standard deviation of about 0.0345), as expected. </rst>