6.7.8. Variability explained with each component

Because PLS models both the \(\mathbf{X}\)-space and the \(\mathbf{Y}\)-space, we can calculate \(R^2\) values for each. For the \(\mathbf{X}\)-space we use the residual matrix \(\mathbf{E}_a\), which remains after \(a\) components have been extracted, to calculate the cumulative variance explained.

\[R^2_{\mathbf{X}, a, \text{cum}} = 1 - \dfrac{\text{Var}(\mathbf{E}_a)}{\text{Var}(\mathbf{X}_{a=1})}\]

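Before the first component is extracted we have \(R^2_{\mathbf{X}, a=0} = 0.0\), since \(\mathbf{E}_{a=0} = \mathbf{X}_{a=1}\). After the first component, the residuals, \(\mathbf{E}_{a=1}\), will have decreased, so \(R^2_{\mathbf{X}, a=1}\) will have increased. Each additional component reduces the residuals further, so the cumulative \(R^2_{\mathbf{X}, a, \text{cum}}\) never decreases as \(a\) grows.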
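The calculation above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not a reference implementation: the function name `pls_r2_cumulative` and the simulated data are hypothetical, the data are assumed to be centered and scaled to unit variance, \(\text{Var}(\cdot)\) of a matrix is taken as its total sum of squares (the ratio is the same as a ratio of variances for centered data), and the weight vector is obtained from a singular value decomposition rather than by NIPALS iterations.

```python
import numpy as np


def pls_r2_cumulative(X, Y, n_components):
    """Cumulative R2 for the X- and Y-space after a = 0, 1, ..., A components.

    Hypothetical helper for illustration. X is N x K, Y is N x M (2-D arrays).
    """
    # Assumption: centre and scale every column to unit variance.
    X0 = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Y0 = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

    E, F = X0.copy(), Y0.copy()          # E_0 = X_{a=1}, F_0 = Y_{a=1}
    ss_x, ss_y = np.sum(X0**2), np.sum(Y0**2)
    r2x, r2y = [0.0], [0.0]              # nothing explained at a = 0

    for _ in range(n_components):
        # Weight vector: dominant left singular vector of E'F
        # (swapped in here for the converged NIPALS weights).
        u, s, vt = np.linalg.svd(E.T @ F, full_matrices=False)
        w = u[:, 0]
        t = E @ w                        # scores
        tt = t @ t
        p = E.T @ t / tt                 # X-space loadings
        c = F.T @ t / tt                 # Y-space loadings
        E = E - np.outer(t, p)           # deflate: residuals E_a
        F = F - np.outer(t, c)           # deflate: residuals F_a
        r2x.append(1.0 - np.sum(E**2) / ss_x)
        r2y.append(1.0 - np.sum(F**2) / ss_y)

    return np.array(r2x), np.array(r2y)


# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
B = np.array([[1.0, 0.5], [-0.5, 1.0]])
Y = X[:, :2] @ B + 0.1 * rng.normal(size=(50, 2))

r2x, r2y = pls_r2_cumulative(X, Y, n_components=3)
print("Cumulative R2 for X:", np.round(r2x, 3))
print("Cumulative R2 for Y:", np.round(r2y, 3))
```

The first entry of each returned array corresponds to \(a = 0\) and is zero by construction; the remaining entries are the cumulative values after each successive component.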

We can construct similar \(R^2\) values for the \(\mathbf{Y}\)-space using the \(\mathbf{Y}_a\) and \(\mathbf{F}_a\) matrices. Furthermore, we construct in an analogous manner the \(R^2\) values for each column of \(\mathbf{X}_a\) and \(\mathbf{Y}_a\), exactly as we did for PCA.
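As a rough sketch of the per-column calculation, the same ratio can be applied to each column separately. The helper name `r2_per_column` and the small matrices below are hypothetical, and the columns of the original matrix are assumed to be already centered, so that a column's sum of squares is proportional to its variance.

```python
import numpy as np


def r2_per_column(original, residuals):
    """R2 for each column: 1 - SS(residual column) / SS(original column).

    Hypothetical helper. `original` is the preprocessed X (or Y) before any
    component is extracted; `residuals` is E_a (or F_a) after a components.
    """
    ss_original = np.sum(original**2, axis=0)
    ss_residual = np.sum(residuals**2, axis=0)
    return 1.0 - ss_residual / ss_original


# Tiny made-up example with two centered columns.
X_scaled = np.array([[1.0, -2.0],
                     [-1.0, 2.0],
                     [0.0, 0.0]])
# Pretend the model halved the residuals in column 1 and left column 2 as is.
E_a = X_scaled * np.array([0.5, 1.0])

r2_cols = r2_per_column(X_scaled, E_a)
print(r2_cols)  # -> [0.75 0.  ]
```

Halving a column's residuals leaves a quarter of its sum of squares, hence the value of 0.75 for the first column; the second column is untouched, so none of its variation is explained.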

These \(R^2\) values help us understand which components best explain different sources of variation. Bar plots of the \(R^2\) values for each column in \(\mathbf{X}\) and \(\mathbf{Y}\), after \(A\) components have been fitted, are one of the best ways to visualize this information.