Is it fair to say that a monitoring chart is like an online version of a confidence interval? Explain your answer.
Question 2
Use the batch yields data and construct a monitoring chart using the 300 yield values. Use a subgroup of size 5. Report your target value, lower control limit and upper control limit, showing the calculations you made. I recommend that you write your code so that you can reuse it for other questions.
Please see the code below. The Shewhart chart’s parameters are as below, with plots generated from the R code.
Target = 80.4
Lower control limit at 3 standard deviations = 71.1
Upper control limit at 3 standard deviations = 89.6
Try it yourself:
data_file <- 'http://openmv.net/file/batch-yields.csv'
batch <- read.csv(data_file)
# make sure we have the expected data
summary(batch)
attach(batch)
# To get a feel for the data;
# looks pretty good; no unusual outliers
plot(Yield)
N = length(Yield)
N.sub = 5 # subgroup size
subgroup <- matrix(Yield, N.sub, N/N.sub)
N.groups <- ncol(subgroup)
dim(subgroup) # 5 by 60 matrix
subgroup.sd <- apply(subgroup, 2, sd)
subgroup.xbar <- apply(subgroup, 2, mean)
# Take a look at what these numbers mean
plot(subgroup.xbar,
type="b",
ylab="Subgroup average")
plot(subgroup.sd,
type="b",
ylab="Subgroup spread")
# Report your target value, lower control
# limit and upper control limit, showing
# the calculations you made.
target <- mean(subgroup.xbar)
Sbar <- mean(subgroup.sd)
# a_n value from the table for a subgroup
# size of 5 is approximately 0.94; the exact
# value is computed below and overrides it
an <- 0.94
an.num <- sqrt(2)*gamma(N.sub/2)
an.den <- sqrt(N.sub-1)*gamma((N.sub-1)/2)
an <- an.num/an.den
sigma.estimate <- Sbar / an
LCL <- target - 3 * sigma.estimate/sqrt(N.sub)
UCL <- target + 3 * sigma.estimate/sqrt(N.sub)
c(LCL, target, UCL)
plot(subgroup.xbar,
ylim=c(LCL-5, UCL+5),
ylab="Subgroup means",
main="Shewhart chart")
abline(h=target, col="green")
abline(h=UCL, col="red")
abline(h=LCL, col="red")
Question 3
The boards data on the website are from a line which cuts spruce, pine and fir (SPF) to produce general quality lumber that you could purchase at Rona, Home Depot, etc. The price that a saw mill receives for its lumber depends strongly on how accurately the cut is made. Use the data for the 2 by 6 boards (each row is one board) and develop a monitoring system using these steps.
Plot all the data.
Now assume that boards 1 to 500 are the phase 1 data; identify any boards in this subset that appear to be unusual (where the board thickness is not consistent with the rest of the operation).
Remove those unusual boards from the phase 1 data. Calculate the Shewhart monitoring limits and show the phase 1 data with these limits. Note: choose a subgroup size of 7 boards.
Test the Shewhart chart on boards 501 to 2000, the phase 2 data. Show the plot and calculate the type I error rate (\(\alpha\)) from the phase 2 data; assuming, of course, that all the phase 2 data are from in-control operation.
Calculate the ARL and look at the chart to see if the number looks about right. Use the time information in the raw data and your ARL value to calculate the number of minutes between false alarms. Will the operators be happy with this?
Describe how you might calculate the consumer’s risk (\(\beta\)).
How would you monitor if the saws are slowly going out of alignment?
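For the ARL step above, a quick theoretical check is possible, assuming independent, normally distributed subgroup means and \(\pm 3\sigma\) limits. This is only the theoretical in-control value (sketched here in Python); the rate you observe from your phase 2 data will differ:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Theoretical type I error for +/- 3 sigma Shewhart limits,
# assuming independent, normally distributed subgroup means
alpha = 2.0 * phi(-3.0)
ARL = 1.0 / alpha

print(round(alpha, 4))  # 0.0027
print(round(ARL))       # 370 subgroups between false alarms, on average
```

Multiply the ARL (in subgroups) by the subgroup size and the sampling interval from the boards data to get the average time between false alarms.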
Question 4
Your process with Cpk of 2.0 experiences a drift of \(1.5\sigma\) away from the current process operating point towards the closest specification limit. What is the new Cpk value; how many defects per million items did you have before the drift? And after the drift?
The new Cpk value is 1.5. The number of defects per million items at Cpk = 2.0 is 0.00098 (essentially no defects), while at Cpk = 1.5 it is 3.4 defects per million items. You only have to consider one side of the distribution, since Cpk is by definition for an uncentered process, and deals with the side closest to the specification limits.
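The defect counts quoted above follow from the one-sided normal tail area beyond \(3 C_{pk}\) standard deviations. A small Python check of the arithmetic, assuming a normally distributed process:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def defects_per_million(Cpk):
    # The nearest specification limit lies 3*Cpk standard
    # deviations from the process mean (one-sided tail area)
    return phi(-3.0 * Cpk) * 1e6

print(round(defects_per_million(2.0), 5))  # 0.00099 ppm before the drift
print(round(defects_per_million(1.5), 1))  # 3.4 ppm after the drift
```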
Question 5
A Shewhart chart has no memory, and is suited to detecting unusual spikes in your production. CUSUM and EWMA charts have memory, and while they would pick up such a spike, they would also create a long run of false alarms afterwards. So those charts are much less appropriate.
Question 6
A tank uses small air bubbles to keep solid particles in suspension. If too much air is blown into the tank, then excessive foaming and loss of valuable solid product occurs; if too little air is blown into the tank the particles sink and drop out of suspension.
Which monitoring chart would you use to ensure the airflow is always near target?
Use the aeration rate dataset from the website and plot the raw data (total litres of air added in a 1 minute period). Are you able to detect any problems?
Construct the chart you described in part 1, and show its performance on all the data. Make any necessary assumptions to construct the chart.
At what point in time are you able to detect the problem, using this chart?
Construct a Shewhart chart, choosing appropriate data for phase 1, and calculate the Shewhart limits. Then use the entire dataset as if it were phase 2 data.
Show this phase 2 Shewhart chart.
Compare the Shewhart chart’s performance to the chart in part 3 of this question.
Solution based on work by Ryan and Stuart (2011 class)
A CUSUM chart would be a suitable chart to monitor that the airflow is near target. While a Shewhart chart is also intended to monitor the location of a variable, it has a much larger run length for detecting small shifts. An EWMA chart with a small \(\lambda\) (long memory) would approximate a CUSUM chart, and so would also be suitable.
The aeration rate dataset is depicted below:
It is very difficult to assess problems from the raw data plot. There might be a slight upward shift around 300 and 500 minutes.
Assumptions for the CUSUM chart:
We will plot the CUSUM chart on raw data, though you could use subgroups if you wanted to.
The target value can be the mean (24.17) of all the data, or more robustly, use the median (24.1), especially if we expect problems with the raw data (true of almost every real data set).
The CUSUM chart, using the median as target value showed a problem starting to occur around \(t=300\). So we recalculated the median, using only data from 0 to \(t=200\), to avoid biasing the target value. Using this median instead, 23.95, we get the following CUSUM chart:
The revised CUSUM chart suggests that the error occurs around 275 min, as evidenced by the steep positive slope thereafter. It should be noted that the CUSUM chart begins to show a positive slope around 200 min, but this initial increase in the cumulative error would likely not be diagnosable (e.g. using a V-mask).
# Code by Ryan and Stuart (2011 class)
CUSUM <- function(x, target){
  N <- length(x)
  S <- numeric(N)
  S[1] <- x[1] - target
  for (t in 2:N){
    S[t] <- S[t-1] + (x[t] - target)
  }
  return(S)
}

# Import data and remove missing values (NA)
aeration.data <- read.csv('http://openmv.net/file/aeration-rate.csv')
aeration <- na.omit(aeration.data$Aeration)

# Plot raw data
bitmap('aeration-rate-raw-data.png', type="png256",
       width=10, height=4, res=300, pointsize=14)
plot(aeration, type="l", xlab="Time (min)",
     ylab="Aeration rate (L/min)")
grid()
dev.off()

# Plot CUSUM chart
target <- median(aeration[1:200])
bitmap('aeration-CUSUM.png', type="png256",
       width=10, height=4, res=300, pointsize=14)
plot(CUSUM(aeration, target), type="l", xlab="Time (min)",
     ylab="CUSUM cumulative deviations")
grid()
dev.off()

# Plot the Shewhart chart: see code from the other
# question to calculate the control limits
LCL <- 22.1
UCL <- 25.8
N <- 5
subgroups <- matrix(aeration, N, length(aeration)/N)

# Calculate mean and sd of subgroups (see R-tutorial)
x.mean <- apply(subgroups, 2, mean)
x.sd <- apply(subgroups, 2, sd)
ylim <- range(x.mean) + c(-5, +5)
xdb <- target  # use the same CUSUM target!

bitmap('aeration-Shewhart-chart.png', type="png256",
       width=10, height=4, res=300, pointsize=14)
par(mar=c(4.2, 4.2, 0.5, 0.5))
par(cex.lab=1.3, cex.main=1.5, cex.sub=1.5, cex.axis=1.5)
plot(seq(1, length(x.mean)*N, N), x.mean, type="b", pch=".",
     cex=5, main="", ylab="Phase II subgroups",
     xlab="Time order", ylim=ylim)
abline(h=UCL, col="red")
abline(h=LCL, col="red")
abline(h=xdb, col="green")
lines(c(275, 275), ylim, col="blue")
text(280, 29, "CUSUM detected problem at t=275", adj=c(0, 0))
dev.off()
Using the iterative Shewhart code from the previous question, the phase 1 data were taken far enough away from the suspected problem: 0 to 200 minutes.
The Shewhart chart applied to the entire dataset is shown below. In contrast to the CUSUM chart, the Shewhart chart is unable to detect the problem in the aeration rate. Unlike the CUSUM chart, which has infinite memory, the Shewhart chart has no memory and cannot adequately assess the location of the monitored variable in relation to its specified target. Instead, the Shewhart chart merely monitors aeration rate with respect to the control limits for the process. Since the aeration rate does not exceed the control limits for the process (i.e. process remains in control), the Shewhart chart does not detect any abnormalities.
If you used the Western Electric rules, in addition to the Shewhart chart limits, you would have picked up a consecutive sequence of 8 points on one side of the target around \(t=350\).
Question 7
Do you think a Shewhart chart would be suitable for monitoring the closing price of a stock on the stock market? Please explain your answer if you agree, or describe an alternative if you disagree.
No, a Shewhart chart is not suitable for monitoring stock prices. Stock prices are volatile variables (not stable), so there is no sense in monitoring their location. Hopefully the stock is moving up, which it should on average, but the point is that stock prices are not stable. Nor are stock prices independent day-to-day.
So what aspect of a stock price is stable? The difference between the opening and closing price of a stock is remarkably stationary. Monitoring the day-to-day change in a stock price would work. Since you aren’t expected to know this fact, any reasonable answer that attempts to monitor a stable substitute for the price will be accepted. E.g. another alternative is to remove the linear up or down trend from a stock price and monitor the residuals.
There are many alternatives; if this sort of thing interests you, you might find the area called technical analysis worth investigating. An EWMA chart is widely used in this sort of analysis.
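To see why the day-to-day differences are the stable quantity, here is a minimal sketch in Python with synthetic (made-up) prices following a random walk; real stock prices are messier, but the idea is the same:

```python
import random

random.seed(1)

# Synthetic example: the price is a random walk (non-stationary),
# but its day-to-day changes are independent and stable
changes = [random.gauss(0.1, 1.0) for _ in range(1000)]
price = [100.0]
for c in changes:
    price.append(price[-1] + c)

# Differencing the price recovers the stable daily changes:
# these are what a monitoring chart could sensibly track
diffs = [price[i + 1] - price[i] for i in range(len(price) - 1)]
print(round(min(diffs), 1), round(max(diffs), 1))  # a narrow, stable band
print(round(max(price) - min(price), 1))           # the price wanders far more
```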
Question 8
Describe how a monitoring chart could be used to prevent over-control of a batch-to-batch process. (A batch-to-batch process is one where a batch of materials is processed, followed by another batch, and so on).
Over-control of any process takes place when too much corrective action is applied. Using the language of feedback control, your gain is the right sign, but the magnitude is too large. Batch processes are often subject to this phenomenon: e.g. the operator reduces the set-point temperature for the next batch, because the current batch produced product with a viscosity that was too high. But then the next batch has a viscosity that is too low, so the operator increases the temperature set-point for the following batch. This constant switching is known as over-control (the operator is the feedback controller and his/her gain is too high, i.e. they are over-reacting).
A monitoring chart such as a Shewhart chart would help the operator: if the previous batch was within the limits, then s/he should not take any corrective action. Only take action when the viscosity value is outside the limits. An EWMA chart would additionally provide a one-step ahead prediction, which is an advantage.
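The EWMA's one-step-ahead prediction mentioned above follows the recursion \(\hat{x}_{t+1} = \lambda x_t + (1-\lambda)\hat{x}_t\). A short Python sketch, where the viscosity numbers and starting value are made up for illustration:

```python
def ewma_predictions(x, lam, start):
    """One-step-ahead EWMA predictions: entry t+1 is the
    prediction available after observing batch t."""
    pred = [start]  # prediction for the first batch
    for xt in x:
        pred.append(lam * xt + (1.0 - lam) * pred[-1])
    return pred

# Hypothetical batch viscosity values, target around 50
viscosity = [52.0, 49.5, 50.5, 51.0, 48.8]
print(ewma_predictions(viscosity, lam=0.3, start=50.0))
```

A small \(\lambda\) gives long memory (CUSUM-like behaviour), while \(\lambda = 1\) reproduces the raw data, as a Shewhart chart would use.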
Question 9
You need to construct a Shewhart chart. You go to your company’s database and extract data from 10 periods of time lasting 6 hours each. Each time period is taken approximately 1 month apart so that you get a representative data set that covers roughly 1 year of process operation. You choose these time periods so that you are confident each one was from in control operation. Putting these 10 periods of data together, you get one long vector that now represents your phase 1 data.
There are 8900 samples of data in this phase 1 data vector.
You form subgroups: there are 4 samples per subgroup and 2225 subgroups.
You calculate the mean within each subgroup (i.e. 2225 means). The mean of those 2225 means is 714.
The standard deviation within each subgroup is calculated; the mean of those 2225 standard deviations is 98.
What is an unbiased estimate of the process standard deviation?
Calculate lower and upper control limits for operation at \(\pm 3\) of these standard deviations from target. These are called the action limits.
Operators like warning limits on their charts, so they don’t have to wait until an action limit alarm occurs. Discussions with the operators indicate that lines at 590 and 820 might be good warning limits. What percentage of in control operation will lie inside the proposed warning limit region?
Short answer: Unbiased estimate of the process standard deviation = 106.4; UCL = 874; LCL = 554.
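The short answer can be reproduced with the standard unbiasing constant \(a_n\) (often called \(c_4\)); a Python check of the arithmetic:

```python
from math import gamma, sqrt

n = 4         # samples per subgroup
xdb = 714.0   # mean of the 2225 subgroup means
Sbar = 98.0   # mean of the 2225 subgroup standard deviations

# Unbiasing constant a_n for subgroup size n
an = sqrt(2.0) * gamma(n / 2.0) / (sqrt(n - 1.0) * gamma((n - 1) / 2.0))

sigma = Sbar / an                  # unbiased estimate of process sigma
LCL = xdb - 3.0 * sigma / sqrt(n)  # action limits for the subgroup means
UCL = xdb + 3.0 * sigma / sqrt(n)

print(round(sigma, 1))         # 106.4
print(round(LCL), round(UCL))  # 554 874
```

For the warning-limit part, convert 590 and 820 into z-values using the subgroup-mean standard deviation \(\sigma/\sqrt{n}\), then take the normal probability between those two z-values.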
Question 10
If an exponentially weighted moving average (EWMA) chart can be made to approximate either a CUSUM or a Shewhart chart by adjusting the value of \(\lambda\), what is an advantage of the EWMA chart over the other two? Describe a specific situation where you can benefit from this.
Question 11
The most recent estimate of the process capability ratio for a key quality variable was 1.30, and the average quality value was 64.0. Your process operates closer to the lower specification limit of 56.0. The upper specification limit is 93.0.
What are the two parameters of the system you could adjust, and by how much, to achieve a capability ratio of 1.67, as required by recent safety regulations? Assume you can adjust these parameters independently.
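One way to check the arithmetic, assuming the usual definition \(C_{pk} = (\bar{x} - \text{LSL})/(3\sigma)\) for a process operating closest to its lower limit (a sketch in Python, not the only valid answer):

```python
xbar, LSL = 64.0, 56.0
Cpk_now, Cpk_target = 1.30, 1.67

# Back out the current process standard deviation
sigma = (xbar - LSL) / (3.0 * Cpk_now)

# Option 1: keep sigma and move the operating point away from the LSL
xbar_new = LSL + Cpk_target * 3.0 * sigma

# Option 2: keep the operating point and reduce the variability
sigma_new = (xbar - LSL) / (3.0 * Cpk_target)

print(round(sigma, 2))      # 2.05
print(round(xbar_new, 1))   # 66.3
print(round(sigma_new, 2))  # 1.6
```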
Question 12
A bagging system fills bags with a target weight of 37.4 grams and the lower specification limit is 35.0 grams. Assume the bagging system fills the bags with a standard deviation of 0.8 grams:
What is the current Cpk of the process?
To what target weight would you have to set the bagging system to obtain Cpk=1.3?
How can you adjust the Cpk to 1.3 without adjusting the target weight (i.e. keep the target weight at 37.4 grams)?
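A Python sketch of the arithmetic for the three parts, using only the lower specification (the side that matters for this filling process):

```python
target, LSL, sigma = 37.4, 35.0, 0.8

# Part 1: current capability relative to the lower limit
Cpk = (target - LSL) / (3.0 * sigma)
print(round(Cpk, 2))         # 1.0

# Part 2: raise the fill target until Cpk = 1.3 at the same sigma
target_new = LSL + 1.3 * 3.0 * sigma
print(round(target_new, 2))  # 38.12

# Part 3: keep the target and reduce the filling variability instead
sigma_new = (target - LSL) / (3.0 * 1.3)
print(round(sigma_new, 3))   # 0.615
```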
Question 13
Plastic sheets are manufactured on your blown film line. The Cp value is 1.7. You sell the plastic sheets to your customers with specification of 2 mm \(\pm\) 0.4 mm.
List three important assumptions you must make to interpret the Cp value.
What is the theoretical process standard deviation, \(\sigma\)?
What would be the Shewhart chart limits for this system using subgroups of size \(n=4\)?
Illustrate your answer from part 2 and 3 of this question on a diagram of the normal distribution.
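For parts 2 and 3, the arithmetic can be sketched in Python (assuming the Shewhart limits are centred on the 2 mm target):

```python
from math import sqrt

Cp = 1.7
USL, LSL = 2.4, 1.6   # specification: 2 mm +/- 0.4 mm
n = 4                 # subgroup size

# Part 2: Cp = (USL - LSL) / (6 sigma), so back out sigma
sigma = (USL - LSL) / (6.0 * Cp)
print(round(sigma, 4))               # 0.0784 mm

# Part 3: Shewhart limits for the subgroup means
target = 2.0
LCL = target - 3.0 * sigma / sqrt(n)
UCL = target + 3.0 * sigma / sqrt(n)
print(round(LCL, 3), round(UCL, 3))  # 1.882 2.118
```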
Question 14
The following charts show the weight of feed entering your reactor. The variation in product quality leaving the reactor was unacceptably high during this period of time.
What can your group of process engineers learn about the problem, using the time-series plot (100 consecutive measurements, taken 1 minute apart)?
Why is this variability not seen in the Shewhart chart?
Using concepts described elsewhere in this book, why might this sort of input to the reactor have an effect on the quality of the product leaving the reactor?
Question 15
You will come across these terms in the workplace. Investigate one of these topics, using the Wikipedia link below to kick-start your research. Write a paragraph that (a) describes what your topic is and (b) how it can be used when you start working in a company after you graduate, or how you can use it now if you are currently working.
In early 2010 Toyota experienced some of its worst press coverage on this very topic. Here is an article in case you missed it.
Question 16
The Kappa number is a widely used measurement in the pulp and paper industry. It can be measured on-line, and indicates the severity of chemical treatment that must be applied to a wood pulp to obtain a given level of whiteness (i.e. the pulp’s bleachability). Data on the website contain the Kappa values from a pulp mill. Use the first 2000 data points to construct a Shewhart monitoring chart for the Kappa number. You may use any subgroup size you like. Then use the remaining data as your phase 2 (testing) data. Does the chart perform as expected?
Short answer: The intention of this question is for you to experience the process of iteratively calculating limits from phase 1 data and applying them to phase 2 data.
Question 17
In this section we showed how one can monitor any variable in a process. Modern instrumentation, however, captures a wider variety of data. It is common to measure point values, e.g. temperature, pressure, concentration and other hard-to-measure values. But it is increasingly common to measure spectral data. These spectral data are a vector of numbers instead of a single number.
Below is an example from a pharmaceutical process: a complete spectrum can be acquired many times per minute, and it gives a complete chemical fingerprint or signature of the system. There are 460 spectra in the figure below; they could have come, for example, from a process where they are measured 5 seconds apart. It is common to find fibre optic probes embedded into pipelines and reactors to monitor the progress of a reaction or mixing.
Write a few bullet points describing how you might monitor a process where a spectrum (a vector) is your data source, rather than a “traditional” single-point measurement, like a temperature value.
Question 18
The carbon dioxide measurement is available from a gas-fired furnace. These data are from phase 1 operation.
Calculate the Shewhart chart upper and lower control limits that you would use during phase 2 with a subgroup size of \(n=6\).
Is this a useful monitoring chart? What is going on in these data?
The Shewhart chart using a subgroup of size 6 is not a useful monitoring chart. There are too many false alarms, which will cause the operators to simply ignore the chart. The problem is that the assumption of independent samples does not hold, which has a detrimental effect, as shown in a previous question.
One approach to fixing the problem is to subsample the data, i.e. only use every \(k^\text{th}\) data point as the raw data, e.g. \(k=10\), and then form subgroups from that sampled data.
Another approach is to use a larger subgroup size. Use the autocorrelation function, via the corresponding acf(...) function in R, to verify the degree of relationship. Using this function we see the raw data are unrelated after the 17th lag, so we could use subgroups of that size. However, even then the Shewhart chart shows frequent violations, though fewer than before.
Yet another alternative is to use an EWMA chart, which takes the autocorrelation into account. However, the EWMA chart limits are found from the assumption that the subgroup means (or raw data, if subgroup size is 1), are independent.
So we are finally left with the conclusion that perhaps these data really are not from in-control operation, or, if they are, we must manually adjust the limits to be wider.
file <- 'http://openmv.net/file/gas-furnace.csv'
data <- read.csv(file)
CO2 <- data$CO2
N.raw <- length(CO2)
N.sub <- 6
# Change ``N.sub`` to 10, 15, 20, etc
# At N.sub <- 17 we see the
# autocorrelation disappear
# Plot all the data
par(mar=c(4.2, 4.2, 0.5, 0.5))
par(cex.lab=1.3, cex.main=1.5,
cex.sub=1.5, cex.axis=1.5)
plot(CO2, type="p", pch=".", cex=2,
main="", ylab="CO2: raw data",
xlab="Sequence order")
# Create the subgroups on ALL the raw data.
# Form a matrix with `N.subgroup` rows by
# placing the vector of data down each row,
# then going across to form the columns.
# Calculate the mean and standard deviation
# within each subgroup (columns of the matrix)
subgroups <- matrix(CO2, N.sub, N.raw/N.sub)
subgroups.S <- apply(subgroups, 2, sd)
subgroups.xbar <- apply(subgroups, 2, mean)
ylim <- range(subgroups.xbar) + c(-3, +3)
# Keep adjusting N.sub until you don't see
# any autocorrelation between subgroups
acf(subgroups.xbar)
# Create a function to calculate
# Shewhart chart limits
shewhart_limits <- function(xbar, S,
sub.n, N.stdev=3){
# Give the xbar and S vector containing
# the subgroup means and standard
# deviations. Also give the subgroup
# size used. Returns the lower and upper
# control limits for the Shewhart chart
# (UCL and LCL) which are N.stdev away
# from the target.
# xdb = x.double.bar = mean of means
xdb <- mean(xbar)
s.bar <- mean(S)
num.an <- sqrt(2)*gamma(sub.n/2)
den.an <- sqrt(sub.n-1)*gamma((sub.n-1)/2)
an <- num.an / den.an
LCL <- xdb - 3*s.bar/(an*sqrt(sub.n))
UCL <- xdb + 3*s.bar/(an*sqrt(sub.n))
# Return a numeric vector, so the limits can
# be used directly in plotting functions
return(c(LCL, xdb, UCL))
}
limits <- shewhart_limits(subgroups.xbar,
subgroups.S, N.sub)
LCL <- limits[1]
xdb <- limits[2]
UCL <- limits[3]
c(LCL, xdb, UCL)
# Any points outside these limits?
par(mar=c(4.2, 4.2, 0.5, 0.5))
par(cex.lab=1.3, cex.main=1.5,
cex.sub=1.5, cex.axis=1.5)
plot(subgroups.xbar, type="b", pch=".",
cex=5, main="", ylim=ylim,
ylab="Phase I subgroups: round 1",
xlab="Sequence order")
abline(h=UCL, col="red")
abline(h=LCL, col="red")
abline(h=xdb, col="green")
lines(subgroups.xbar, type="b", pch=".",
cex=5)
Question 19
The percentage yield from a batch reactor, and the purity of the feedstock are available as the Batch yield and purity data set. Assume these data are from phase 1 operation and calculate the Shewhart chart upper and lower control limits that you would use during phase 2. Use a subgroup size of \(n=3\).
What is phase 1?
What is phase 2?
Show your calculations for the upper and lower control limits for the Shewhart chart on the yield value.
Show a plot of the Shewhart chart on these phase 1 data.
Solution based on work by Ryan McBride, Stuart Young, and Mudassir Rashid (2011 class)
Phase 1 is the period from which historical data is taken that is known to be “in control”. From this data, upper and lower control limits can be established for the monitored variable that contain a specified percent of all in control data.
Phase 2 is the period during which new, unseen data is collected by process monitoring in real-time. This data can be compared with the limits calculated from the “in control” data.
Assuming the dataset was derived from phase 1 operation, the batch yield data were grouped into subgroups of size 3. However, since the total number of data points (N=241) is not a multiple of three, the data set was truncated to the closest multiple of 3, i.e. \(N_{new} = 240\), by removing the last data point. Subsequently, the mean and standard deviation were calculated for each of the 80 subgroups, and from these statistics the first-round lower and upper control limits were calculated.
Noticing that the mean for subgroup 42, \(\overline{x}_{42}=63.3\), falls below this LCL, the control limits were recalculated excluding this subgroup from phase 1 data (see R-code). Following this adjustment, the new control limits were calculated to be:
LCL = 65.0
UCL = 85.8
Shewhart charts for both rounds of the yield data (before and after removing the outlier):
# Thanks to Mudassir for his source code to
# recursively calculate the limits. Some
# updates were made.
file <- 'http://openmv.net/file/batch-yield-and-purity.csv'
data <- read.csv(file)
y <- data$yield
variable <- "Yield"
N <- 3
# No further changes required. The code
# below will work for any new data set
subgroups <- matrix(y, N, length(y)/N)
x.mean <- numeric(length(y)/N)
x.sd <- numeric(length(y)/N)
# Calculate mean and sd of subgroups
# (see R-tutorial)
x.mean <- apply(subgroups, 2, mean)
x.sd <- apply(subgroups, 2, sd)
ylim <- range(x.mean) + c(-5, +5)
k <- 1
doloop <- TRUE
# Prevent infinite loops
while (doloop & k < 5){
num.an <- sqrt(2)*gamma(N/2)
den.an <- sqrt(N-1)*gamma((N-1)/2)
an <- num.an / den.an
S <- mean(x.sd)
xdb <- mean(x.mean) # x-double bar
LCL <- xdb - (3*S/(an*sqrt(N)))
UCL <- xdb + (3*S/(an*sqrt(N)))
print(c(LCL, UCL))
# Create a figure on every loop
par(mar=c(4.2, 4.2, 0.5, 0.5))
par(cex.lab=1.3, cex.main=1.5,
cex.sub=1.5, cex.axis=1.5)
plot(x.mean, type="b", pch=".",
cex=5, main="",
ylab=paste("Phase I subgroups: round", k),
xlab="Sequence order", ylim=ylim)
abline(h=UCL, col="red")
abline(h=LCL, col="red")
abline(h=xdb, col="green")
lines(x.mean, type="b", pch=".", cex=5)
if (!(any(x.mean < LCL) | any(x.mean > UCL))){
# Finally! No more points to exclude
doloop <- FALSE
}
k <- k + 1
# Retain in x.sd and x.mean only those
# entries that are within the control
# limits
x.sd <- x.sd[x.mean>=LCL]
x.mean <- x.mean[x.mean>=LCL]
x.sd <- x.sd[x.mean<=UCL]
x.mean <- x.mean[x.mean<=UCL]
} # end: while doloop
Question 20
You will hear about 6-sigma processes frequently in your career. What does it mean exactly that a process is “6-sigma capable”? Draw a diagram to help illustrate your answer.