# Assignment 3

From Latent Variable Methods in Engineering

Please provide **short** answers to the questions below.

- Data source: Silicon wafer thickness
- Nine thickness measurements from a silicon wafer.
- \(N=184\) and \(K=9\)

- Build a PCA model using the first 100 rows of the data.
- Plot the scores. What do you notice?
- Investigate the outliers with the contribution tool.
- Verify that the outliers exist in the raw data.
- Exclude any unusual observations and refit the model
- Did you get all the outliers? Check the scores and SPE. Repeat until all outliers are removed.
- Plot a loadings plot for the first component. What is your interpretation of \(p_1\)?
- Given the \(R^2\) and \(Q^2\) values for the first component, what is your interpretation of the variability in this process? (Remember the goal of PCA is to explain variability)
- What is the interpretation of \(p_2\)? From a quality control perspective, if you could remove the variability due to \(p_2\), how much of the variability would you be removing from the process?
- Plot the corresponding time series plot for \(t_1\). What do you notice in the sequence of score values?
- Repeat the above question for the second component.
- Use all the data as testing data (184 observations, of which the first \(\approx 100\) were used to build the model).
- Do the outliers that you excluded earlier show up as outliers still? Do the contribution plots for these outliers give the same diagnosis that you got before?
- Are there any new outliers in points 101 to 184? If so, what is their diagnosis?
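
The workflow in the steps above (fit on the first 100 rows, inspect scores and SPE, use contributions, exclude and refit, then test on all 184 rows) could be sketched as follows. This is only a minimal illustration under stated assumptions: the data are synthetic stand-ins for the wafer measurements, the helper names `fit_pca` and `apply_pca` are hypothetical, the columns are mean-centered and scaled to unit variance, and a single component is used for brevity. It is not the course's software or a model answer.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA by SVD after mean-centering and unit-variance scaling (assumed preprocessing)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                   # loadings, shape (K, A)
    r2 = (s ** 2 / np.sum(s ** 2))[:n_components]  # fraction of variance per component
    return {"mean": mean, "std": std, "P": P, "r2": r2}

def apply_pca(model, X):
    """Project rows of X onto the model; return scores, residuals, and SPE."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                        # scores, shape (N, A)
    E = Z - T @ model["P"].T                  # residuals, shape (N, K)
    spe = np.sum(E ** 2, axis=1)              # squared prediction error per row
    return T, E, spe

# Synthetic stand-in for the wafer data (NOT the real data set):
# a shared thickness level plus small noise, with one planted outlier.
rng = np.random.default_rng(0)
N, K = 184, 9
level = rng.normal(size=(N, 1))
X = 10.0 + level + 0.2 * rng.normal(size=(N, K))
X[38, 3] += 5.0                               # planted outlier at 0-based row 38, column 3

# Fit on the first 100 rows and look for the largest SPE.
model = fit_pca(X[:100], n_components=1)
T, E, spe = apply_pca(model, X[:100])
worst = int(np.argmax(spe))
contrib = E[worst] ** 2                       # per-variable contributions to that row's SPE
suspect_var = int(np.argmax(contrib))

# Exclude the flagged row, refit, then treat all 184 rows as testing data.
keep = np.delete(np.arange(100), worst)
model2 = fit_pca(X[keep], n_components=1)
T_all, E_all, spe_all = apply_pca(model2, X)
```

In practice the exclude-and-refit step would be repeated until neither the score plot nor the SPE plot shows unusual points, and `T_all[:, 0]` plotted against row order gives the time series plot of \(t_1\).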