# Assignment 1

From Latent Variable Methods in Engineering

This quick assignment considers the food texture data (introduced in class 2). There are 5 variables in the data table:

- `Oil`: percentage oil in the pastry
- `Density`: the product’s density (the higher the number, the more dense the product)
- `Crispy`: a crispiness measurement, on a scale from 7 to 15, with 15 being the most crispy
- `Fracture`: the angle, in degrees, through which the pastry can be slowly bent before it fractures
- `Hardness`: a sharp point is used to measure the amount of force required before breakage occurs

Please provide answers to these questions:

- Calculate the mean centering vector (a \(1 \times 5\) vector).
- Calculate the scaling vector (a \(1 \times 5\) vector) and indicate whether you multiply or divide columns in \(\mathbf{X}\) by the corresponding entries in your vector.
- Draw a scatter plot for `Oil` vs `Density` using all 50 data points from the raw data table.
- Draw a scatter plot for `Oil` vs `Density` after you have centered and scaled the data. Any observations when you compare it to the previous scatter plot?
- Use the software to calculate a PCA model and report the \(R^2\) value for the first and second component. What is the total \(R^2\) using 2 components?
- Report the cumulative \(R^2\) value for each of the 5 variables after adding (a) one component and (b) two components.
- Write down the values of the \(p_1\) loading vector.
- What are the characteristics of pastries with large negative \(t_1\) values?
- What is the second component in the model describing?
- Replicate the calculation for the \(t_1\) value for pastry B758. Show each of the 5 terms that make up this linear combination.
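
If you want to check the preprocessing and score calculations by hand, the steps above can be sketched in NumPy. This is only an illustration under stated assumptions: the matrix `X` below is randomly generated stand-in data with the same shape as the food texture table (50 rows, 5 columns), not the real measurements, and the PCA is computed with a singular value decomposition, which may differ from the algorithm your software uses. The helper `r2_per_variable` is a hypothetical name introduced here. Scaling uses the sample standard deviation (`ddof=1`); your software's convention may differ.

```python
import numpy as np

# Hypothetical stand-in data: 50 rows x 5 columns, in the order
# Oil, Density, Crispy, Fracture, Hardness. Replace with the real table.
rng = np.random.default_rng(0)
X = rng.normal(
    loc=[17.0, 2850.0, 11.0, 21.0, 128.0],
    scale=[1.5, 120.0, 1.8, 5.0, 30.0],
    size=(50, 5),
)

# Mean-centering vector and scaling vector, one entry per column.
center = X.mean(axis=0)
scale = X.std(axis=0, ddof=1)  # columns are DIVIDED by these entries

# Centered and scaled data: zero mean and unit variance in every column.
Z = (X - center) / scale

# PCA via SVD of the preprocessed matrix (an assumed algorithm choice).
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
P = Vt.T      # loadings: column 0 is p_1, column 1 is p_2, ...
T = Z @ P     # scores: column 0 is t_1, column 1 is t_2, ...

# R^2 explained by each component, and the cumulative total.
r2_per_component = s**2 / np.sum(s**2)
r2_cumulative = np.cumsum(r2_per_component)


def r2_per_variable(Z, T, P, n_components):
    """Cumulative R^2 of each column after n_components components."""
    Z_hat = T[:, :n_components] @ P[:, :n_components].T
    residuals = Z - Z_hat
    return 1.0 - (residuals**2).sum(axis=0) / (Z**2).sum(axis=0)


r2_vars_one = r2_per_variable(Z, T, P, 1)
r2_vars_two = r2_per_variable(Z, T, P, 2)

# t_1 for a single row, written out as the five terms of the
# linear combination: (scaled value of variable k) * (k-th entry of p_1).
row = 0  # index of the observation of interest in YOUR table
terms = Z[row] * P[:, 0]
t1_row = terms.sum()

print("centering vector:", center)
print("scaling vector:  ", scale)
print("R2 per component:", r2_per_component[:2])
print("terms of t_1:    ", terms, "sum =", t1_row)
```

The sign of each loading vector is arbitrary, so a package may report \(p_1\) and \(t_1\) with all signs flipped relative to this sketch; the interpretation of the component does not change.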

*Hand in your answers at the next class; we will go through the assignment interactively during the next class*.