Written midterm 2011
Date: 16 February 2011
<rst> <rst-options: 'toc' = False/> <rst-options: 'reset-figures' = False/>
.. rubric:: Special instructions
- The midterm will be on 17 February 2011, from 19:15 to 21:15.
- You may bring in **any printed materials to the midterm**; any textbooks, any papers, *etc*.
- You may use any calculator during the exam.
- You may not use a cellphone as a calculator, nor may you use any other communication device during the exam.
- **You may answer the questions in any order** in the examination booklet.
- Please do not repeat the question in your answer; answer with bullet point text where appropriate (i.e. do not feel you need to use long paragraphs in your answers).
- Please ensure you provide explanations with your calculations.
- If you are not sure about the meaning of a problem, please write out your interpretation and follow through with the calculation.
- **400-level students**: please answer all the questions, except those marked as 600-level questions. You will get extra credit for answering the 600-level questions though.
- **Total marks**: 75 marks for 400-level; 85 marks for 600-level students.
.. raw:: latex
\pagestyle{plain} \vspace*{0.5cm}
\hrule \vspace*{0.2cm}
**Question 1** :math:`\qquad\textbf{[5 = 3 + 2]}`
Sulphur dioxide is a byproduct from ore smelting, coal-fired power stations, and other sources.
These 11 samples of sulphur dioxide, SO\ :sub:`2`, measured in parts per billion [ppb], were taken from our plant. Environmental regulations require us to report the 90% confidence interval for the mean SO\ :sub:`2` value.
.. math::
180, \,\, 340, \,\,220, \,\,410, \,\,101, \,\,89, \,\,210, \,\,99, \,\,128, \,\,113, \,\,111
#. What is the confidence interval that must be reported, given that the sample average of these 11 points is 181.9 ppb and the sample standard deviation is 106.8 ppb?
#. Why might Environment Canada require you to report the confidence interval instead of the mean?
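As a numerical check on the first part, the sketch below computes a two-sided 90% confidence interval for the mean using the :math:`t`-distribution. It assumes the samples are independent and approximately normally distributed. The critical value of 1.812 (10 degrees of freedom, 5% in each tail) is read from a printed :math:`t`-table, and the function name ``mean_confidence_interval`` is illustrative only.

```python
import math
import statistics

# SO2 samples from the question, in ppb
so2 = [180, 340, 220, 410, 101, 89, 210, 99, 128, 113, 111]


def mean_confidence_interval(data, t_crit):
    """Two-sided confidence interval for the mean when sigma is unknown."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Assumption: critical value from a t-table for 10 degrees of freedom,
# leaving 5% in each tail (90% two-sided interval).
T_CRIT_90_DF10 = 1.812

lower, upper = mean_confidence_interval(so2, T_CRIT_90_DF10)
print(f"90% confidence interval for the mean: {lower:.1f} to {upper:.1f} ppb")
```

The interval is wide because only 11 samples are available and the spread is large relative to the mean, which is one reason a regulator may prefer an interval over a single number.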
**Question 2** :math:`\qquad\textbf{[6 = 2 + 2 + 2]}`
Questions related to process monitoring:
#. In which situation would you use a CUSUM chart? Why, in your given situation, would this chart be more advantageous than, say, a Shewhart chart?
#. How would you select the value of :math:`\lambda` used in the EWMA chart?
#. Explain what is meant by common cause variation in process monitoring.
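To make the role of :math:`\lambda` concrete, the sketch below applies the usual EWMA recursion, :math:`z_t = \lambda x_t + (1-\lambda) z_{t-1}`, to a made-up signal with a step change. The data, the starting value and the two :math:`\lambda` values are hypothetical and chosen only for illustration.

```python
def ewma(values, lam, start):
    """Return the EWMA series z_t = lam * x_t + (1 - lam) * z_{t-1}."""
    if not 0 < lam <= 1:
        raise ValueError("lam must lie in (0, 1]")
    z = start
    smoothed = []
    for x in values:
        z = lam * x + (1 - lam) * z
        smoothed.append(z)
    return smoothed


# Hypothetical measurements: on target at 50, then a step change to 53.
readings = [50.0] * 5 + [53.0] * 10

slow = ewma(readings, lam=0.1, start=50.0)  # heavy smoothing, long memory
fast = ewma(readings, lam=0.8, start=50.0)  # light smoothing, short memory

for label, series in (("lambda = 0.1", slow), ("lambda = 0.8", fast)):
    print(label, [round(z, 2) for z in series[5:9]])
```

With :math:`\lambda = 1` the EWMA simply reproduces the raw data, as on a Shewhart chart; smaller values give the chart a longer memory, closer in spirit to a CUSUM chart.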
**Question 3** :math:`\qquad\textbf{[7]}`
The most recent estimate of the process capability ratio for a key quality variable was 1.30, and the average quality value was 64.0. Your process operates closer to the lower specification limit of 56.0. The upper specification limit is 93.0.
What are the two parameters of the system you could adjust, and by how much, to achieve the capability ratio of 1.67 required by safety regulations? Assume you can adjust these parameters independently.
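One way to organize the arithmetic is sketched below. It assumes the quoted ratio is the uncentered capability ratio, :math:`\text{C}_\text{pk} = \min(\text{USL} - \overline{x},\, \overline{x} - \text{LSL}) / (3\sigma)`, which is an interpretation rather than something stated in the question. The variable names are illustrative.

```python
LSL, USL = 56.0, 93.0  # specification limits from the question
mean_now, cpk_now, cpk_target = 64.0, 1.30, 1.67


def cpk(mean, sigma, lsl=LSL, usl=USL):
    """Uncentered capability ratio (assumed interpretation of the question)."""
    return min(usl - mean, mean - lsl) / (3 * sigma)


# Back-calculate the current standard deviation; the lower limit is the
# binding one because the process operates closer to it.
sigma_now = (mean_now - LSL) / (3 * cpk_now)

# Option 1: keep the mean, reduce the standard deviation.
sigma_needed = (mean_now - LSL) / (3 * cpk_target)

# Option 2: keep the standard deviation, move the mean away from the LSL.
mean_needed = LSL + 3 * sigma_now * cpk_target

print(f"current sigma:  {sigma_now:.3f}")
print(f"option 1 sigma: {sigma_needed:.3f} (reduce by {sigma_now - sigma_needed:.3f})")
print(f"option 2 mean:  {mean_needed:.2f} (raise by {mean_needed - mean_now:.2f})")
```

Calling ``cpk(mean_needed, sigma_now)`` confirms that the upper specification limit does not become the binding constraint after the shift.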
**Question 4** :math:`\qquad\textbf{[25 = 8 + 12 + 3 + 2]}\qquad`
A concrete slump test is used to test the fluidity, or workability, of concrete. It is a crude but quick test, often used to measure the effect of polymer additives that are mixed with the concrete to improve workability.
The concrete mixture is prepared with a polymer additive. The mixture is placed in a mold and filled to the top. The mold is inverted and removed. The height of the mold minus the height of the remaining concrete pile is called the "slump".
.. figure:: ../figuress/types_of_concrete_slump.jpg
   :alt: http://en.wikipedia.org/wiki/File:Types_of_concrete_slump.jpg
   :width: 650px
   :align: center

*Illustration from* `Wikipedia <http://en.wikipedia.org/wiki/File:Types_of_concrete_slump.jpg>`_
Your company provides the polymer additive, and you are developing an improved polymer formulation, call it B, that hopefully provides the same slump values as your existing polymer, call it A. Formulation B costs less money than A, but you don't want to upset, or lose, customers by varying the slump value too much.
#. You have a single day to run your tests (experiments). Preparation, mixing times, measurement and clean up take 1 hour, only allowing you to run 10 experiments. Describe all precautions, and why you take these precautions, when planning and executing your experiment. Be very specific in your answer (use bullet points).
#. The following slump values were recorded over the course of the day:

========== ================
Additive   Slump value [cm]
========== ================
A          5.2
A          3.3
B          5.8
A          4.6
B          6.3
A          5.8
A          4.1
B          6.0
B          5.5
B          4.5
========== ================

What is your conclusion on the performance of the new polymer formulation (system B)? Your conclusion must either be "send the polymer engineers back to the lab" or "let's start making formulation B for our customers". Explain your choice clearly.
To help you, :math:`\overline{x}_A = 4.6` and :math:`s_A = 0.97`. For system B: :math:`\overline{x}_B = 5.62` and :math:`s_B = 0.69`.
*Note*: In your answer you must be clear on which assumptions you are using and, where necessary, why you need to make those assumptions.
#. Describe the circumstances under which you would rather use a paired test for differences between polymer A and B.
#. What are the advantage(s) of the paired test over the unpaired test?
This question is continued for 600-level students at the end of the exam.
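For the second part, the unpaired comparison can be sketched as a confidence interval on :math:`\overline{x}_B - \overline{x}_A` with a pooled variance. This assumes the runs are independent, the slump values are approximately normally distributed, and both formulations share the same variance. The critical value 2.306 (8 degrees of freedom, 95% two-sided) is taken from a :math:`t`-table, and the function name is illustrative.

```python
import math
import statistics

slump_a = [5.2, 3.3, 4.6, 5.8, 4.1]  # existing formulation A [cm]
slump_b = [5.8, 6.3, 6.0, 5.5, 4.5]  # new formulation B [cm]


def pooled_difference_ci(a, b, t_crit):
    """Confidence interval for mean(b) - mean(a), assuming equal variances."""
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    var_pooled = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / dof
    std_err = math.sqrt(var_pooled * (1 / n_a + 1 / n_b))
    diff = statistics.mean(b) - statistics.mean(a)
    return diff, diff - t_crit * std_err, diff + t_crit * std_err


# Assumption: t-table value for 8 degrees of freedom, 2.5% in each tail.
T_CRIT_95_DF8 = 2.306

diff, lower, upper = pooled_difference_ci(slump_a, slump_b, T_CRIT_95_DF8)
print(f"difference B - A: {diff:.2f} cm")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f} cm")
```

Whether the interval spans zero is only the statistical part of the answer; the recommendation should also weigh how large a slump difference customers would actually notice.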
**Question 5** :math:`\qquad\textbf{[23 = 2 + 2 + 2 + 2 + 2 + 8 + 5]}`
A simple linear model relating reactor temperature to polymer viscosity is desirable, because measuring viscosity online, in real time, is far too costly and inaccurate. Temperature, on the other hand, is quick and inexpensive to measure. This is the concept of *soft sensors*, also known as *inferential sensors*.
Data were collected from a rented online viscosity unit and a least squares model built:
.. math::
\hat{v} = 1977 - 3.75 T
where the viscosity, :math:`v`, is measured in Pa.s (Pascal seconds) and the temperature is in Kelvin. A reasonably linear trend was observed over the 86 data points collected. Temperature values were taken over the range of normal operation: 430 to 480 K and the raw temperature data had a sample standard deviation of 8.2 K.
The output from a certain commercial software package was:
.. code-block:: text

    Analysis of Variance
    ---------------------------------------------------------
                           Sum of        Mean
    Source       DF       Squares      Square
    Model         2        9532.7     4766.35
    Error        84        9963.7       118.6
    Total        86       19496.4

    Root MSE    XXXXX
    R-Square    XXXXX

#. Which is the causal direction: does a change in viscosity cause a change in temperature, or does a change in temperature cause a change in viscosity?
#. Calculate the ``Root MSE``, which we have called the standard error, :math:`S_E`, in this course.
#. What is the :math:`R^2` value that would have been reported in the above output?
#. What is the interpretation of the slope coefficient, -3.75, and what are its units?
#. What is the viscosity prediction at 430 K? And at 480 K?
#. In the future you plan to use this model to adjust temperature, in order to meet a certain viscosity target. To do that you must be sure the change in temperature will lead to the desired change in viscosity.
Calculate the 95% confidence interval for the slope coefficient, *and interpret* this confidence interval in the context of how you plan to use this model.
#. The standard error features prominently in all derivations related to least squares. Provide an interpretation of it and be specific in any assumption(s) you require to make this interpretation.
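The numerical parts of this question can be sketched from the ANOVA output alone. The sketch assumes the usual least squares relationships: :math:`S_E^2` is the error mean square, :math:`R^2` is the model sum of squares over the total sum of squares, and :math:`S_E(b_1)^2 = S_E^2 / \sum (T_i - \overline{T})^2`. The degrees of freedom are used exactly as printed by the software, and the critical value 1.989 (84 degrees of freedom, 95% two-sided) is an approximate :math:`t`-table value.

```python
import math

# Values quoted in the question and in the software output
n_points = 86
ss_model, ss_error, ss_total = 9532.7, 9963.7, 19496.4
dof_error = 84
temp_std = 8.2           # sample standard deviation of temperature [K]
b0, b1 = 1977.0, -3.75   # intercept [Pa.s] and slope [Pa.s/K]

root_mse = math.sqrt(ss_error / dof_error)  # standard error, S_E
r_squared = ss_model / ss_total


def predict_viscosity(temp_kelvin):
    """Point prediction from the fitted model v = b0 + b1 * T."""
    return b0 + b1 * temp_kelvin


# Sum of squared temperature deviations, recovered from the sample std. dev.
sum_sq_temp = (n_points - 1) * temp_std ** 2
se_slope = root_mse / math.sqrt(sum_sq_temp)

# Assumption: approximate t-table value for 84 degrees of freedom.
T_CRIT_95_DF84 = 1.989
slope_lower = b1 - T_CRIT_95_DF84 * se_slope
slope_upper = b1 + T_CRIT_95_DF84 * se_slope

print(f"Root MSE = {root_mse:.2f} Pa.s, R^2 = {r_squared:.3f}")
print(f"v(430 K) = {predict_viscosity(430):.1f}, v(480 K) = {predict_viscosity(480):.1f} Pa.s")
print(f"95% CI for slope: {slope_lower:.2f} to {slope_upper:.2f} Pa.s/K")
```

An interval that excludes zero and keeps the same sign throughout supports using temperature to move viscosity in a known direction; the width of the interval indicates how uncertain the size of that move is.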
**Question 6** :math:`\qquad\textbf{[9 = 1 + 1 + 1 + 2 + 2 + 2]}`
Traffic cameras have their proponents (it improves road safety) and opponents (it's just a money grab). The plot below shows the number of cameras per 1000 km of roadway and the number of traffic deaths per 1000 km of roadway. It is taken from the website of `The Economist <http://www.economist.com/node/21015161>`_.

.. figure:: ../figures/univariate/traffic-cameras-road-deaths.jpg
   :alt: http://www.economist.com/node/21015161
   :width: 550px
   :align: center

#. What type of plot is this?
#. If you had to describe this relationship to a colleague, what would you say?
#. Identify and describe anything interesting in this plot that would lead you to search for more information.
#. What is the causal direction (line of reasoning) that the plot's author wants you to follow?
#. Which region of the plot would a linear regression model do an adequate job of describing? Feel free to answer with an illustration of your own.
#. An alternative model is possible to describe this relationship. Describe that model, perhaps providing an illustration of it, and be specific on how you would use that model on new data points.
You will find all the user comments and criticism for this article quite informative (http://www.economist.com/node/21015161).
Questions for 600-level students only
=====================================
**Question 7** :math:`\qquad\textbf{[6 = 3 + 3]}` for 600-level students (400-level students may attempt this question for extra credit)
This question is a continuation of question 4. Please refer back to that question for context.
#. Clearly explain which assumptions are used for paired tests, and why they are likely to be true in this case.
#. The slump tests were actually performed in a paired manner, where pairing was performed based on the cement supplier. Five different cement suppliers were used:

========== ======================= =======================
Supplier   Slump value [cm] from A Slump value [cm] from B
========== ======================= =======================
1          5.2                     5.8
2          3.3                     4.5
3          4.6                     6.0
4          5.8                     5.5
5          4.1                     6.2
========== ======================= =======================

Use these data, and provide, if necessary, an updated recommendation to your manager.
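A paired analysis works on the within-supplier differences rather than on the two groups separately. The sketch below assumes the five differences are independent and approximately normally distributed. The critical value 2.776 (4 degrees of freedom, 95% two-sided) comes from a :math:`t`-table, and the variable names are illustrative.

```python
import math
import statistics

# Slump values [cm] per cement supplier, in supplier order 1 to 5
slump_a = [5.2, 3.3, 4.6, 5.8, 4.1]
slump_b = [5.8, 4.5, 6.0, 5.5, 6.2]

# Pairing removes the supplier-to-supplier variation from the comparison.
differences = [b - a for a, b in zip(slump_a, slump_b)]

n = len(differences)
mean_diff = statistics.mean(differences)
std_err = statistics.stdev(differences) / math.sqrt(n)

# Assumption: t-table value for n - 1 = 4 degrees of freedom, 2.5% per tail.
T_CRIT_95_DF4 = 2.776
lower = mean_diff - T_CRIT_95_DF4 * std_err
upper = mean_diff + T_CRIT_95_DF4 * std_err

print("differences B - A:", [round(d, 1) for d in differences])
print(f"mean difference: {mean_diff:.2f} cm")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f} cm")
```

With only five pairs the interval is driven heavily by any single supplier, so it is worth looking at the individual differences as well as the interval.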
**Question 8** :math:`\qquad\textbf{[4 = 2 + 2]}` for 600-level students (400-level students may attempt this question for extra credit)
#. Describe what is meant by the *breakdown point* of a statistic, such as the standard deviation, or its robust counterpart, the median absolute deviation.
#. What is an advantage of using robust methods over their "classical" counterparts?
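The difference in breakdown behaviour can be seen numerically. The sketch below compares the standard deviation with the median absolute deviation (MAD) on a small made-up data set, before and after one value is corrupted. The data are hypothetical, and the scaling constant 1.4826 is the usual factor that makes the MAD comparable to the standard deviation for normally distributed data.

```python
import statistics


def mad(data, scale=1.4826):
    """Median absolute deviation, scaled to match sigma for normal data."""
    med = statistics.median(data)
    return scale * statistics.median([abs(x - med) for x in data])


# Hypothetical measurements, and the same set with one gross error
# (for example a misplaced decimal point: 6.3 recorded as 63.0).
clean = [4.1, 4.6, 5.2, 5.5, 5.8, 6.0, 6.3]
contaminated = [4.1, 4.6, 5.2, 5.5, 5.8, 6.0, 63.0]

for label, data in (("clean", clean), ("contaminated", contaminated)):
    print(
        f"{label:>12}: std dev = {statistics.stdev(data):6.2f}, "
        f"MAD = {mad(data):5.2f}"
    )
```

A single bad value is enough to inflate the standard deviation, while the MAD is unaffected until a much larger share of the data is corrupted.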
.. raw:: latex
\vspace{0.3cm} \hrule \begin{center} END \end{center} </rst>