Difference between revisions of "Assignment 3"

From Latent Variable Methods in Engineering
m (Created page with "<rst> <rst-options: 'toc' = False/> * Data source: `Silicon wafer thickness <http://datasets.connectmv.com/info/silicon-wafer-thickness>`_ * Nine thickness measurements from a si...")
 

Revision as of 12:45, 3 October 2011

<rst> <rst-options: 'toc' = False/>

1. Build a PCA model using the first 100 rows of the data.
2. Plot the scores. What do you notice?
3. Investigate the outliers with the contribution tool.
4. Verify that the outliers exist in the raw data.
5. Exclude any unusual observations and refit the model.
6. Did you get all the outliers? Check the scores and SPE. Repeat until all outliers are removed.
7. Plot a loadings plot for the first component. What is your interpretation of :math:`p_1`?
8. Given the :math:`R^2` and :math:`Q^2` values for the first component, what is your interpretation about the variability in this process? (Remember, the goal of PCA is to explain variability.)
9. What is the interpretation of :math:`p_2`? From a quality control perspective, if you could remove the variability due to :math:`p_2`, how much of the variability would you be removing from the process?
10. Plot the corresponding time series plot for :math:`t_1`. What do you notice in the sequence of score values?
11. Repeat the above question for the second component.
12. Use all the data as testing data (184 observations, of which the first :math:`\approx 100` were used to build the model).
13. Do the outliers that you excluded earlier still show up as outliers? Do the contribution plots for these outliers give the same diagnosis that you got before?
14. Are there any new outliers in points 101 to 184? If so, what is their diagnosis?

</rst>
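The mechanics behind the steps above (fit on the first 100 rows, compute scores and SPE, then project all 184 rows as testing data) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the course software's method: the data are random placeholders standing in for the nine wafer thickness measurements, the helper names ``fit_pca`` and ``apply_pca`` are hypothetical, the SPE limit is a simple empirical percentile rather than a theoretical limit, and :math:`Q^2` (which needs cross-validation) is not computed.

```python
import numpy as np


def fit_pca(X_train, n_components=2):
    """Fit PCA by SVD on mean-centered, unit-variance-scaled training data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T  # loadings p_1, p_2 as columns
    return {"mean": mean, "std": std, "P": P}


def apply_pca(model, X):
    """Project rows of X onto the model; return scores, residuals and SPE."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]            # scores t_1, t_2
    E = Z - T @ model["P"].T      # residuals not explained by the model
    spe = (E ** 2).sum(axis=1)    # squared prediction error per observation
    return T, E, spe


# Placeholder data: 184 observations of 9 variables (NOT the real wafer data).
rng = np.random.default_rng(0)
X = rng.normal(size=(184, 9))

# Build the model on the first 100 rows, then use all rows as testing data.
model = fit_pca(X[:100], n_components=2)
T, E, spe = apply_pca(model, X)

# Cumulative R^2 on the training rows for the two-component model.
Z_train = (X[:100] - model["mean"]) / model["std"]
r2 = 1.0 - (E[:100] ** 2).sum() / (Z_train ** 2).sum()

# Assumed, purely empirical SPE limit: 95th percentile of the training SPE.
limit = np.percentile(spe[:100], 95)
flagged = np.flatnonzero(spe > limit)

# One common definition of SPE contributions: squared residual per variable.
worst = int(np.argmax(spe))
contrib = E[worst] ** 2

print("R2 (2 components):", round(float(r2), 3))
print("Rows above the SPE limit:", flagged)
print("Largest SPE contribution for row", worst, "is variable", int(np.argmax(contrib)))
```

Plotting ``T[:, 0]`` against row order gives the time series plot of :math:`t_1`, and the columns of ``model["P"]`` are the loadings to interpret. With the real data, excluded outliers would be removed from the training rows before calling ``fit_pca`` again.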