Software tutorial/Dealing with distributions

From Statistics for Engineering
← Calculating statistics from a data sample (previous step) Tutorial index Next step: Extending R with packages →

Values from various distribution functions are easily calculated in R.

Direct probability from a distribution


To calculate the probability density value directly from *any* distribution in R, use a function whose name combines ``d`` with the name of the distribution. This is what ``dDIST`` means in the illustration here:

Show-dDIST.jpg

<rst> <rst-options: 'toc' = False/> <rst-options: 'reset-figures' = False/> For the *normal* distribution: ``dnorm(x=...)``

For example, ``dnorm(1)`` returns 0.2419707, the height of the standard normal curve at its point of inflection, one standard deviation above the mean.

For the :math:`t`-distribution: ``dt(x=..., df=...)`` where ``df`` is the number of degrees of freedom in the :math:`t`-distribution.

For the :math:`F`-distribution: ``df(x=..., df1=..., df2=...)`` given the ``df1`` (numerator) and ``df2`` (denominator) degrees of freedom.

For the chi-squared distribution: ``dchisq(x=..., df=...)`` given the ``df`` degrees of freedom.
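As a rough illustration, the density functions listed above can be evaluated as follows. The particular ``x`` values and degrees of freedom are arbitrary choices made for this sketch, not values from the tutorial.

```r
# Density (height of the curve) at chosen points; all inputs are illustrative.
d_norm  <- dnorm(x = 1)                   # standard normal: 0.2419707
d_t     <- dt(x = 1, df = 10)             # t-distribution, 10 degrees of freedom
d_f     <- df(x = 1, df1 = 3, df2 = 20)   # F-distribution, 3 and 20 degrees of freedom
d_chisq <- dchisq(x = 2, df = 2)          # chi-squared; equals 0.5 * exp(-1)

# The d-functions are vectorized, so several points can be evaluated at once.
d_grid <- dnorm(x = c(-1, 0, 1))

print(c(norm = d_norm, t = d_t, F = d_f, chisq = d_chisq))
print(d_grid)
```

Because the normal curve is symmetric about its mean, the first and last entries of ``d_grid`` are equal.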


Values from the cumulative and inverse cumulative distribution


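Similar to the above, we build the function name by combining a prefix with the name of the distribution: ``p`` to get the cumulative area under the distribution (a probability between 0 and 1), and ``q`` to get the quantile, which is the inverse of the cumulative distribution. </rst>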

Show-pDIST-and-qDIST.jpg

<rst> <rst-options: 'toc' = False/> <rst-options: 'reset-figures' = False/>

  • For the *normal* distribution: ``pnorm(...)`` and ``qnorm(...)``
  • For the :math:`t`-distribution: ``pt(...)`` and ``qt(...)``
  • For the :math:`F`-distribution: ``pf(...)`` and ``qf(...)``
  • For the chi-squared distribution: ``pchisq(...)`` and ``qchisq(...)``
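The following sketch shows how the ``p`` and ``q`` functions relate to each other. The probabilities and degrees of freedom used here are illustrative assumptions, chosen only because they are common in confidence interval calculations.

```r
# Cumulative area to the left of a value (p-functions).
area_left <- pnorm(1.96)             # about 0.975 for the standard normal

# Quantile that has a given area to its left (q-functions).
z_975 <- qnorm(0.975)                # about 1.96

# p and q are inverses of each other.
round_trip <- qnorm(pnorm(1.5))      # recovers 1.5, up to floating point error

# Other distributions also need their degrees of freedom (illustrative values).
t_crit     <- qt(0.975, df = 10)
f_area     <- pf(2.5, df1 = 3, df2 = 20)
chisq_crit <- qchisq(0.95, df = 4)

# Area in the upper tail instead of the lower tail.
upper <- pnorm(1.96, lower.tail = FALSE)

print(c(area_left = area_left, z_975 = z_975, upper = upper))
print(c(t_crit = t_crit, f_area = f_area, chisq_crit = chisq_crit))
```

The ``lower.tail = FALSE`` argument gives the same result as ``1 - pnorm(1.96)``.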

Obtaining random numbers from a particular distribution


To obtain a single random number from the normal distribution with a mean of 0 and a standard deviation of 1:

.. code-block:: s

    rnorm(1)
    [1] -0.3451397

For example, to obtain 10 random, normally distributed values:

.. code-block:: s

    rnorm(10)
     [1]  0.4604076 -0.9670948 -0.2624246 -0.2223866  0.2492692
     [6]  0.7160273 -0.2734768  2.4437870  0.4269511 -0.4831478

where the ``r`` prefix indicates we want random numbers.

Notice that R has used the default values of ``mean=0`` and ``sd=1`` (the *standard deviation*). If you'd like your random numbers centred about a different mean, with a different level of spread, then specify both:

.. code-block:: s

    rnorm(n=10, mean=30, sd=4)
     [1] 31.62686 37.83101 28.07470 20.95000 30.47500
     [6] 28.21797 35.81518 28.61481 30.59083 32.94051

Note that this function accepts the *standard deviation*, not the variance. The usual notation in statistics for the previous example is :math:`x \sim \mathcal{N}(30, 16)`, which specifies the variance; the random number generator, however, requires the standard deviation, :math:`\sqrt{16} = 4`.
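One way to avoid mixing up the two is to store the variance and pass its square root to ``rnorm``. The sketch below assumes a sample size of 10000 and a fixed seed, both chosen only for illustration, and checks that the sample statistics land near the intended values.

```r
set.seed(42)                     # fixed seed so the draws are repeatable

variance <- 16
x <- rnorm(n = 10000, mean = 30, sd = sqrt(variance))   # sd is 4, not 16

sample_mean <- mean(x)           # should be close to 30
sample_var  <- var(x)            # should be close to 16
sample_sd   <- sd(x)             # should be close to 4

print(c(mean = sample_mean, var = sample_var, sd = sample_sd))
```

Passing ``sd = 16`` by mistake would give a sample variance near 256 instead.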

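Random numbers from the other distributions are obtained in the same way, by supplying the number of values and the degrees of freedom:

  • For the :math:`t`-distribution: ``rt(...)``
  • For the :math:`F`-distribution: ``rf(...)``
  • For the chi-squared distribution: ``rchisq(...)``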
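A minimal sketch of these three generators is shown here. The sample size, seed, and degrees of freedom are assumptions made for the example; unlike ``rnorm``, these functions have no default for the degrees of freedom, so they must be given.

```r
set.seed(1)                                   # repeatable draws

t_draws     <- rt(n = 5, df = 10)             # t-distribution
f_draws     <- rf(n = 5, df1 = 3, df2 = 20)   # F-distribution
chisq_draws <- rchisq(n = 5, df = 4)          # chi-squared distribution

print(t_draws)
print(f_draws)
print(chisq_draws)
```

Values drawn from the :math:`F` and chi-squared distributions are never negative, while :math:`t` values can have either sign.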

</rst>
