Principal Component Analysis

Class date(s):

16, 23, 30 September 2011

Video material (part 1)

Download video: Link (plays in Google Chrome) [290 Mb]

Video material (part 2)

Download video: Link (plays in Google Chrome) [306 Mb]

Video material (part 3)

Download video: Link (plays in Google Chrome) [294 Mb]

Video material (part 4)

Download video: Link (plays in Google Chrome) [152 Mb]

Video material (part 5)

Download video: Link (plays in Google Chrome) [276 Mb]

Video material (part 6)

Download video: Link (plays in Google Chrome) [333 Mb]

Video material (part 7)

Download video: Link (plays in Google Chrome) [198 Mb]

Video material (part 8)

Download video: Link (plays in Google Chrome) [180 Mb]

Class 2 (16 September 2011)

Download these 3 CSV files and bring them to class on your computer:
- Peas dataset: http://openmv.net/info/peas
- Food texture dataset: http://openmv.net/info/food-texture
- Food consumption dataset: http://openmv.net/info/food-consumption

Reading for class 2
Linear algebra topics you should be familiar with before class 2:
- matrix multiplication
- that multiplying a vector by a matrix is a transformation from one coordinate system to another (we will review this in class)
- linear combinations (read the first section of that website; we will review this in class)
- the dot product of 2 vectors, and how it is related to the cosine of the angle between them (see the geometric interpretation section)
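
As a quick self-check on the topics listed above, here is a minimal NumPy sketch. The vectors `a` and `b`, the 45 degree rotation, and the matrix name `P` are made-up values for illustration only; they are not taken from the class material.

```python
import numpy as np

# Two example vectors (values chosen only for illustration)
a = np.array([3.0, 0.0])
b = np.array([2.0, 2.0])

# Dot product and its geometric form: a.b = |a| |b| cos(theta)
dot_ab = a @ b
cos_theta = dot_ab / (np.linalg.norm(a) * np.linalg.norm(b))
theta_deg = np.degrees(np.arccos(cos_theta))

# A rotation matrix whose columns are the axes of a new,
# orthonormal coordinate system (rotated by 45 degrees)
angle = np.radians(45.0)
P = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])

# Matrix multiplication as a change of coordinates:
# each entry of b_new is the dot product of b with one new axis
b_new = b @ P

# Linear combination: weighting the new axes by the new
# coordinates rebuilds the original vector
b_rebuilt = b_new[0] * P[:, 0] + b_new[1] * P[:, 1]

print("angle between a and b (degrees):", theta_deg)
print("b in the rotated coordinates:", b_new)
print("b rebuilt from the new axes:", b_rebuilt)
```

Because `b` happens to lie exactly along the first rotated axis, its second coordinate in the new system is zero.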

This illustration should help better explain what I am trying to get across in class 2B.

$p_{1}$ and $p_{2}$ are the unit vectors for components 1 and 2.
$x_{i}$ is a row of data from matrix $X$.
${\hat{x}}_{i, 1} = t_{i, 1} p_{1}$ is the best prediction of $x_{i}$ using only the first component.
${\hat{x}}_{i, 2} = t_{i, 2} p_{2}$ is the improvement we add after the first component to better predict $x_{i}$.
${\hat{x}}_{i} = {\hat{x}}_{i, 1} + {\hat{x}}_{i, 2}$ is the total prediction of $x_{i}$ using 2 components and is the open blue point lying on the plane defined by $p_{1}$ and $p_{2}$. Notice that this is just the vector summation of ${\hat{x}}_{i, 1}$ and ${\hat{x}}_{i, 2}$.
$e_{i, 2}$ is the prediction error vector: the prediction ${\hat{x}}_{i}$ is not exact because the data point $x_{i}$ lies above the plane defined by $p_{1}$ and $p_{2}$. The length of $e_{i, 2}$ is the residual distance after using 2 components.
$x_{i} = {\hat{x}}_{i} + e_{i, 2}$ is also a vector summation and shows how $x_{i}$ is broken down into two parts: ${\hat{x}}_{i}$ is a vector on the plane, while $e_{i, 2}$ is the vector perpendicular to the plane.
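
The decomposition above can be checked numerically. The sketch below is only an illustration: the loading vectors `p1` and `p2` and the data row `x_i` are hypothetical values, not results from any of the class datasets. It assumes that $p_{1}$ and $p_{2}$ are orthonormal and that $x_{i}$ has already been centered, so each score is the projection $t_{i, a} = x_{i}^{T} p_{a}$.

```python
import numpy as np

# Hypothetical orthonormal loading vectors for a 3-variable dataset
p1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
p2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)

# One row of X, assumed already centered (made-up values)
x_i = np.array([3.0, 1.0, 2.0])

# Scores: projection of x_i onto each unit loading vector
t_i1 = x_i @ p1
t_i2 = x_i @ p2

# Prediction from each component, and the total prediction
x_hat_i1 = t_i1 * p1
x_hat_i2 = t_i2 * p2
x_hat_i = x_hat_i1 + x_hat_i2

# Residual after using 2 components
e_i2 = x_i - x_hat_i

print("prediction on the plane:", x_hat_i)
print("residual vector:", e_i2)
print("residual distance:", np.linalg.norm(e_i2))
print("e . p1 =", e_i2 @ p1, " e . p2 =", e_i2 @ p2)
```

The two dot products at the end are zero (up to rounding), which is the numerical version of saying that $e_{i, 2}$ is perpendicular to the plane defined by $p_{1}$ and $p_{2}$.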

Least squares:
- what is the objective function of least squares
- how to calculate the regression coefficient $b$ for $y = b x + e$ where $x$ and $y$ are centered vectors
- understand that the residuals in least squares are orthogonal to $x$
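
A small sketch of the least squares points above, using made-up data (the arrays `x_raw` and `y_raw` are assumptions for illustration). For centered vectors, minimizing the objective $e^{T} e = (y - b x)^{T} (y - b x)$ gives $b = x^{T} y / x^{T} x$.

```python
import numpy as np

# Made-up raw data, for illustration only
x_raw = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_raw = np.array([2.1, 3.9, 6.2, 7.8, 10.0])

# Center both vectors
x = x_raw - x_raw.mean()
y = y_raw - y_raw.mean()

# Least squares slope for y = b x + e with centered x and y
b = (x @ y) / (x @ x)

# Residuals and the objective function value (sum of squares)
e = y - b * x
sse = e @ e

# Any other slope gives a larger sum of squared residuals
sse_other = np.sum((y - (b + 0.1) * x) ** 2)

print("b =", b)
print("e . x =", e @ x)
print("objective at b:", sse, " at b + 0.1:", sse_other)
```

The dot product `e @ x` is zero up to rounding, which shows that the residuals are orthogonal to $x$.
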
Some optimization theory:
- How an optimization problem is written with equality constraints
- The Lagrange multiplier principle for solving simple, equality constrained optimization problems.
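
As a refresher on both points, here is a short worked example. The problem is chosen only for illustration and is not taken from the class notes. An equality constrained problem is written as

$$\min_{x} \; f(x) \quad \text{subject to} \quad g(x) = 0$$

The Lagrange multiplier principle forms the function $L(x, \lambda) = f(x) - \lambda \, g(x)$ and sets all of its partial derivatives to zero. For example, take $f(x) = x_{1}^{2} + x_{2}^{2}$ with the constraint $g(x) = x_{1} + x_{2} - 1 = 0$:

$$L(x_{1}, x_{2}, \lambda) = x_{1}^{2} + x_{2}^{2} - \lambda (x_{1} + x_{2} - 1)$$

$$\frac{\partial L}{\partial x_{1}} = 2 x_{1} - \lambda = 0, \qquad \frac{\partial L}{\partial x_{2}} = 2 x_{2} - \lambda = 0, \qquad \frac{\partial L}{\partial \lambda} = -(x_{1} + x_{2} - 1) = 0$$

The first two equations give $x_{1} = x_{2} = \lambda / 2$. Substituting into the third gives $\lambda = 1$, so the solution is $x_{1} = x_{2} = 1/2$, where $f(x) = 1/2$.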