Difference between revisions of "Assignment 2 - 2010 - Solution/Question 2"

From Process Model Formulation and Solution: 3E4
Jump to: navigation, search
m
m
 
Line 9: Line 9:
 
#. Derive, by hand, the lower-triangular matrix :math:`L` (with ones on the diagonal), and upper triangular matrix :math:`U` so that :math:`A = LU`, using the matrix :math:`A` from question 1.  Afterwards, show that :math:`A = LU`.
 
#. Derive, by hand, the lower-triangular matrix :math:`L` (with ones on the diagonal), and upper triangular matrix :math:`U` so that :math:`A = LU`, using the matrix :math:`A` from question 1.  Afterwards, show that :math:`A = LU`.
  
#. Use MATLAB or Python's built-in functions to solve for :math:`L` and :math:`U`.  Show your code and show the matrices from the software.  Explain any disagreement.  See `the software tutorial on matrix operations <http://modelling3e4.connectmv.com/wiki/Software_tutorial/Matrix_operations>`_ for help.
+
#. Use MATLAB or Python's built-in functions to solve for :math:`L` and :math:`U`.  Show your code and show the matrices from the software.  Explain any disagreement.  See `the software tutorial on matrix operations <https://learnche.org/3E4/Software_tutorial/Matrix_operations>`_ for help.
  
 
#. Use **your** matrices for :math:`L` and :math:`U` from the first part of this question, and calculate the inverse of :math:`A`.  Compare your inverse with the inverse calculated by MATLAB or Python - does it agree with the software's values?  Also, does :math:`A A^{-1} = I` for your inverse?
 
#. Use **your** matrices for :math:`L` and :math:`U` from the first part of this question, and calculate the inverse of :math:`A`.  Compare your inverse with the inverse calculated by MATLAB or Python - does it agree with the software's values?  Also, does :math:`A A^{-1} = I` for your inverse?

Latest revision as of 03:26, 16 September 2018

← Question 1 (previous step) Back to all questions Next step: Question 3 →

Question 2 [2]

  1. Derive, by hand, the lower-triangular matrix \(L\) (with ones on the diagonal), and upper triangular matrix \(U\) so that \(A = LU\), using the matrix \(A\) from question 1. Afterwards, show that \(A = LU\).
  2. Use MATLAB or Python's built-in functions to solve for \(L\) and \(U\). Show your code and show the matrices from the software. Explain any disagreement. See the software tutorial on matrix operations for help.
  3. Use your matrices for \(L\) and \(U\) from the first part of this question, and calculate the inverse of \(A\). Compare your inverse with the inverse calculated by MATLAB or Python - does it agree with the software's values? Also, does \(A A^{-1} = I\) for your inverse?

Solution

  1. The algorithm for determining the \(LU\) decomposition of a matrix (without pivoting) is covered on slides 27 - 34 of the Section B: Linear Algebraic Equations Notes. We start with the matrix \(A\):

    \[\begin{split}A = \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 1 & -2 & -2 & -2\\ 1 & 4 & -4 & 1\\ 1 & -5 & -5 & -3\\ \end{array} \right]\end{split}\]

    The basic procedure behind \(LU\) decomposition is to perform forward elimination on the matrix \(A\) to generate the upper triangular matrix \(U\) and, while doing so, also store the row elimination factors used to generate the lower triangular matrix \(L\).

    To eliminate the first column of \(A\) we add \(-\left(\frac{a_{21}}{a_{11}}\right) = -\left(\frac{1}{1}\right)\) times row 1 to row 2, \(-\left(\frac{a_{31}}{a_{11}}\right) = -\left(\frac{1}{1}\right)\) times row 1 to row 3, and \(-\left(\frac{a_{41}}{a_{11}}\right) = -\left(\frac{1}{1}\right)\) times row 1 to row 4. Also recall that when we collect these multiplication terms into our \(L\) we take the negative of the value (i.e. we multiply by negative 1).

    \[\begin{split}A^{(1)} = \left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 1 + (-1)(1) & -2 + (-1)(1) & -2 + (-1)(1) & -2 + (-1)(1)\\ 1 + (-1)(1) & 4 + (-1)(1) & -4 + (-1)(1) & 1 + (-1)(1)\\ 1 + (-1)(1) & -5 + (-1)(1) & -5 + (-1)(1) & -3 + (-1)(1)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 0 & -3 & -3 & -3\\ 0 & 3 & -5 & 0\\ 0 & -6 & -6 & -4\\ \end{array} \right]\end{split}\]
    \[\begin{split}L^{(1)} = \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ \frac{a_{21}}{a_{11}} & 1 & 0 & 0\\ \frac{a_{31}}{a_{11}} & 0 & 1 & 0\\ \frac{a_{41}}{a_{11}} & 0 & 0 & 1\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & 0 & 1 & 0\\ 1 & 0 & 0 & 1\\ \end{array} \right]\end{split}\]

    Next we move to column 2. To remove the elements below the diagonal we add \(-\left(\frac{a_{32}}{a_{22}}\right) = -\left(\frac{3}{-3}\right)\) times row 2 to row 3 and \(-\left(\frac{a_{42}}{a_{22}}\right) = -\left(\frac{-6}{-3}\right)\) times row 2 to row 4. We also add the negative of these factors to the second column of our \(L\) matrix.

    \[\begin{split}A^{(2)} = \left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 0 & -3 & -3 & -3\\ 0 & 3 + (1)(-3) & -5 + (1)(-3) & 0 + (1)(-3)\\ 0 & -6 + (-2)(-3) & -6 + (-2)(-3) & -4 + (-2)(-3)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 0 & -3 & -3 & -3\\ 0 & 0 & -8 & -3\\ 0 & 0 & 0 & 2\\ \end{array} \right]\end{split}\]
    \[\begin{split}L^{(2)} = \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & \frac{a_{32}}{a_{22}} & 1 & 0\\ 1 & \frac{a_{42}}{a_{22}} & 0 & 1\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & -1 & 1 & 0\\ 1 & 2 & 0 & 1\\ \end{array} \right]\end{split}\]

    Since \(A^{(2)}\) is already in upper triangular form we no longer have any reason to continue the forward elimination process. Therefore:

    \[\begin{split}L = \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & -1 & 1 & 0\\ 1 & 2 & 0 & 1\\ \end{array} \right] \,\,\,\, U = \left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 0 & -3 & -3 & -3\\ 0 & 0 & -8 & -3\\ 0 & 0 & 0 & 2\\ \end{array} \right]\end{split}\]

    Next we prove that \(A = LU\). Since matrix multiplications takes up alot of space we will do this column by column.

    \[\begin{split}a_{:1} = \left[\begin{array}{r} l_{11}u_{11} + l_{12}u_{21} + l_{13}u_{31} + l_{14}u_{41}\\ l_{21}u_{11} + l_{22}u_{21} + l_{23}u_{31} + l_{24}u_{41}\\ l_{31}u_{11} + l_{32}u_{21} + l_{33}u_{31} + l_{34}u_{41}\\ l_{41}u_{11} + l_{42}u_{21} + l_{43}u_{31} + l_{44}u_{41}\\ \end{array}\right] = \left[\begin{array}{r} (1)(1) + (0)(0) + (0)(0) + (0)(0)\\ (1)(1) + (1)(0) + (0)(0) + (0)(0)\\ (1)(1) + (-1)(0) + (1)(0) + (0)(0)\\ (1)(1) + (2)(0) + (0)(0) + (1)(0)\\ \end{array}\right] = \left[\begin{array}{r} 1\\ 1\\ 1\\ 1\\ \end{array}\right]\end{split}\]
    \[\begin{split}a_{:2} = \left[\begin{array}{r} l_{11}u_{12} + l_{12}u_{22} + l_{13}u_{32} + l_{14}u_{42}\\ l_{21}u_{12} + l_{22}u_{22} + l_{23}u_{32} + l_{24}u_{42}\\ l_{31}u_{12} + l_{32}u_{22} + l_{33}u_{32} + l_{34}u_{42}\\ l_{41}u_{12} + l_{42}u_{22} + l_{43}u_{32} + l_{44}u_{42}\\ \end{array}\right] = \left[\begin{array}{r} (1)(1) + (0)(-3) + (0)(0) + (0)(0)\\ (1)(1) + (1)(-3) + (0)(0) + (0)(0)\\ (1)(1) + (-1)(-3) + (1)(0) + (0)(0)\\ (1)(1) + (2)(-3) + (0)(0) + (1)(0)\\ \end{array}\right] = \left[\begin{array}{r} 1\\ -2\\ 4\\ -5\\ \end{array}\right]\end{split}\]
    \[\begin{split}a_{:3} = \left[\begin{array}{r} l_{11}u_{13} + l_{12}u_{23} + l_{13}u_{33} + l_{14}u_{43}\\ l_{21}u_{13} + l_{22}u_{23} + l_{23}u_{33} + l_{24}u_{43}\\ l_{31}u_{13} + l_{32}u_{23} + l_{33}u_{33} + l_{34}u_{43}\\ l_{41}u_{13} + l_{42}u_{23} + l_{43}u_{33} + l_{44}u_{43}\\ \end{array}\right] = \left[\begin{array}{r} (1)(1) + (0)(-3) + (0)(-8) + (0)(0)\\ (1)(1) + (1)(-3) + (0)(-8) + (0)(0)\\ (1)(1) + (-1)(-3) + (1)(-8) + (0)(0)\\ (1)(1) + (2)(-3) + (0)(-8) + (1)(0)\\ \end{array}\right] = \left[\begin{array}{r} 1\\ -2\\ -4\\ -5\\ \end{array}\right]\end{split}\]
    \[\begin{split}a_{:4} = \left[\begin{array}{r} l_{11}u_{14} + l_{12}u_{24} + l_{13}u_{34} + l_{14}u_{44}\\ l_{21}u_{14} + l_{22}u_{24} + l_{23}u_{34} + l_{24}u_{44}\\ l_{31}u_{14} + l_{32}u_{24} + l_{33}u_{34} + l_{34}u_{44}\\ l_{41}u_{14} + l_{42}u_{24} + l_{43}u_{34} + l_{44}u_{44}\\ \end{array}\right] = \left[\begin{array}{r} (1)(1) + (0)(-3) + (0)(-3) + (0)(2)\\ (1)(1) + (1)(-3) + (0)(-3) + (0)(2)\\ (1)(1) + (-1)(-3) + (1)(-3) + (0)(2)\\ (1)(1) + (2)(-3) + (0)(-3) + (1)(2)\\ \end{array}\right] = \left[\begin{array}{r} 1\\ -2\\ 1\\ -3\\ \end{array}\right]\end{split}\]

    From the above it is easy to see that

    \[\begin{split}LU = \left[a_{:1},a_{:2},a_{:3},a_{:4}\right] = \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 1 & -2 & -2 & -2\\ 1 & 4 & -4 & 1\\ 1 & -5 & -5 & -3\\ \end{array} \right] = A\end{split}\]

    Therefore we have proven that \(A = LU\).

    Extra information - Derivation of Doolittle Method

    The \(LU\) decomposition of a square matrix \(A\) is defined as a set of lower triangular (i.e. no elements above the diagonal) and upper triangular (i.e. no elements below the diagonal) matrices that, when multiplied together, produce \(A\). In the simplest case, it is easy to show that if \(A\) contains no zero quantities along it's diagonal during a simple forward elimination process (i.e. partial pivoting is not necessary) then matrices \(L\) and \(U\) exist for \(A\) and are directly calculable from \(A\). In the more general case, it is also easy to show that an \(LU\) decomposition exists for any square matrix \(A\) if partial pivoting is allowed. In this case the \(LU\) decomposition will have an associated permutation matrix \(P\), and really you will have solved the sysem \(PA=LU\), but the difference in how these decompositions are used in further calculations (for example to calculate the inverse of a matrix) is minimal (All one must do to use the permuted \(LU\) set to solve a linear system is multiply the system by \(P\) --> \(Ax = b\) --> \(PAx = Pb\) --> \(LUx = Pb\). The solution of the system will be exactly the same). Finally, if one specifies that either the \(L\) or \(U\) matrix must have all ones along its diagonal (in the Doolittle method, and in this problem..., we specify that \(L\) must have a unit diagonal), then the \(L\) and \(U\) set is unique to \(A\).

    So how do we derive the \(LU\) decomposition of a matrix \(A\). The answer lies in applying elementary row operations. There are three classic elementary row operations in linear algebra:

    • Row Switching:

      Performed by left multiplying a matrix \(A\) by an identity matrix that has had its rows \(i\) and \(j\) switched, where \(i\) and \(j\) are the rows desired to be switched in \(A\). For example, if \(A\) were a 4x4 matrix and you wanted to switch rows 2 and 3, you would left multiply \(A\) by the following "row switching" matrix \(E_{1}\) (i.e. \(E_{1}A\)):

      \[\begin{split}E_{1} = \left[\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array}\right]\end{split}\]
    • Multiplication of a row by a constant:

      This one is pretty straightforward. When you multiply a row by some constant \(c\) then you multiply every element of that row by that constant on an element by element basis. This may be represented in elementary matrix form by left multiplying \(A\) by an elementary matrix \(E_{2}\) that contains only 1s along its diagonal (i.e. it is an identity matrix) except for the \(i\text{th}\) element, which contains a value of \(c\) (where \(i\) is the row we wish to multiply by \(c\)). So, for example, if we had an 4x4 matrix \(A\) and we wished to multiply row 4 by 5 then we would left multiply \(A\) by the following elementary matrix \(E_{2}\):

      \[\begin{split}E_{2} = \left[\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 5 \\ \end{array}\right]\end{split}\]
    • Addition of a multiple of one row to another:

      Many will recognize this as the essential operation in solving a linear system of equations. The addition of one row \(i\), multiplied by some constant \(c\), to another row \(j\) may be represented in elementary matrix form by an identity matrix with an off diagonal element equal to \(c\) in the \(i\text{th}\) column and the \(j\text{th}\) row. So, if for example we wanted to add 4 times row 3 to row 1, we would left multiply \(A\) by the following elementary matrix \(E_{3}\) (go ahead, check for yourself, everything works out):

      \[\begin{split}E_{3} = \left[\begin{array}{cccc} 1 & 0 & 4 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array}\right]\end{split}\]

    The final pieces of information we require before we may derive the \(LU\) decomposition of \(A\) are given below (derivation not provided):

    • The inverse of \(E_{1}\) is equal to \(E_{1}^{T}\)
    • The inverse of \(E_{2}\) is equal to a matrix \(E_{2'}\) whose diagonal elements are equal to the inverse of those in \(E_{2}\) (i.e. 1 over the elements in \(E_{2}\))
    • The inverse of \(E_{3}\) is equal to \(E_{3}\) with all its off-diagonal elements multiplied by (-1)
    • A lower triangular matrix multplied by a lower triangular matrix is also lower triangular (the opposite is also true)
    • The inverse of a lower triangular matrix is also lower triangular (the opposite is also true)

    Now that we have the required background information we may derive the \(LU\) decomposition of \(A\). Note that this derivation deals with the simplified case that no intermediate partial pivoting is required during the derivation.

    So let us start by asking a simple question. If one wanted to derive an upper triangular matrix from \(A\), how would one do it? The obvious answer would be to perform forward elimination on the matrix until all elements below the diagonal no longer existed. But how would this look in terms elementary row operations? Essentially we would be continuously multiplying by elementary matrices of form \(E_{3}\). Furthermore, since we are trying to convert matrix \(A\) to an upper triangular matrix we would only be concerned with eliminating elements below the diagonal wouldn't we? Therefore all of the elementary matrices applied would be some multiple of row \(i\) being added to some row \(j\) where \(i > j\). Recall that the off-diagonal element of \(E_{3}\) occurs at location \(e_{ij}\). Therefore the forward elimination step would always consist of lower triangular matrices no? So in the case of the first column of A we would simply left multiply A by the following three elementary row operation matrices to eliminate the below diagonal elements wouldn't we? (go ahead, try it out by hand)

    \[\begin{split}A^{(1)} = \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\frac{a^{(0)}_{41}}{a^{(0)}_{11}} & 0 & 0 & 1 \\ \end{array}\right] \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -\frac{a^{(0)}_{31}}{a^{(0)}_{11}} & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array}\right] \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ -\frac{a^{(0)}_{21}}{a^{(0)}_{11}} & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array}\right] \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 1 & -2 & -2 & -2\\ 1 & 4 & -4 & 1\\ 1 & -5 & -5 & -3\\ \end{array} \right] = \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 0 & -3 & -3 & -3\\ 0 & 3 & -5 & 0\\ 0 & -6 & -6 & -4\\ \end{array} \right]\end{split}\]

    But we can't exactly claim \(A = A^{(1)}\) can we. So what would we need to do to make the equivalence correct? We would need to multiply by the inverse of the elementary row operations.

    \[\begin{split}A = \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ \frac{a^{(0)}_{21}}{a^{(0)}_{11}} & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array}\right] \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ \frac{a^{(0)}_{31}}{a^{(0)}_{11}} & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array}\right] \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \frac{a^{(0)}_{41}}{a^{(0)}_{11}} & 0 & 0 & 1 \\ \end{array}\right]\left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\frac{a^{(0)}_{41}}{a^{(0)}_{11}} & 0 & 0 & 1 \\ \end{array}\right] \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -\frac{a^{(0)}_{31}}{a^{(0)}_{11}} & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array}\right] \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ -\frac{a^{(0)}_{21}}{a^{(0)}_{11}} & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array}\right] \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 1 & -2 & -2 & -2\\ 1 & 4 & -4 & 1\\ 1 & -5 & -5 & -3\\ \end{array} \right]\end{split}\]

    At this point it should be stated that when multiplying elementary matrices of type \(E_{3}\) where no off-diagonal elements occur in the same position (always the case when performing forward elimination) the result of the multiplication is simply a matrix containing all of the off-diagonal elements in the exact same positions they occurred separately (a prove will not be given but the reader is encouraged to investigate the validity of this statement on their own). Therefore if we conduct the multiplication of the row operations and inverse operations respectively then it is easy to see the system simplifies to:

    \[\begin{split}A = \underbrace{\left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ \frac{a^{(0)}_{21}}{a^{(0)}_{11}} & 1 & 0 & 0 \\ \frac{a^{(0)}_{31}}{a^{(0)}_{11}} & 0 & 1 & 0 \\ \frac{a^{(0)}_{41}}{a^{(0)}_{11}} & 0 & 0 & 1 \\ \end{array}\right]}_{(L^{(1)})^{-1}} \underbrace{\left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ -\frac{a^{(0)}_{21}}{a^{(0)}_{11}} & 1 & 0 & 0 \\ -\frac{a^{(0)}_{31}}{a^{(0)}_{11}} & 0 & 1 & 0 \\ -\frac{a^{(0)}_{41}}{a^{(0)}_{11}} & 0 & 0 & 1 \\ \end{array}\right]}_{L^{(1)}} \underbrace{\left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 1 & -2 & -2 & -2\\ 1 & 4 & -4 & 1\\ 1 & -5 & -5 & -3\\ \end{array} \right]}_{A} = (L^{(1)})^{-1}A^{(1)}\end{split}\]

    Now what if we wanted to eliminate the second column of \(A\) (also \(A^{(1)}\))? Well naturally we would follow the exact same process (i.e. we would multiply \(A^{(1)}\) by the elementary row operation matrices required to eliminate all elements below the second diagonal, remembering to also multiply by the inverse of the operations to maintain the equivalence to \(A\)).

    \[\begin{split}\begin{array}{rcl} A & = & (L^{(1)})^{-1}(L^{(2)})^{-1}L^{(2)}A^{(1)}\\ A & = & (L^{(1)})^{-1}(L^{(2)})^{-1}A^{(2)} \end{array}\end{split}\]

    Are you starting to see the pattern now? If we were to repeat this process \(n-1\) times (where \(n\) is the dimension of \(A\)) then the forward elimination would be complete and therefore \(A^{(n-1)}\) would have to be upper triangular (as a quick aside, we repeat the process \(n-1\) times and not \(n\) times because we do not have to eliminate the first row).

    \[\begin{split}\begin{array}{rcl} A & = & (L^{(1)})^{-1}(L^{(2)})^{-1} \cdots (L^{(n-2)})^{-1}(L^{(n-1)})^{-1} L^{(n-1)}L^{(n-2)} \ldots L^{(2)}L^{(1)}A\\ A & = & (L^{(1)})^{-1}(L^{(2)})^{-1} \cdots (L^{(n-2)})^{-1}(L^{(n-1)})^{-1} A^{(n-1)}\\ A & = & (L^{(1)})^{-1}(L^{(2)})^{-1} \cdots (L^{(n-2)})^{-1}(L^{(n-1)})^{-1} U\\ \end{array}\end{split}\]

    Now recall the the inverse of a lower triangular matrix is also a lower triangular matrix and the multiplication of two lower triangular matrices is also lower triangular. Therefore all of the terms to the left of \(U\) collect to form one lower triangular matrix \(L\).

    \[A = LU\]

    But this is exactly what we wanted when we started the LU decomposition process. Furthermore, knowing the simplified multiplication rule for \(E_{3}\) type matrices presented earlier, it is easy to see that \(L\) equals:

    \[\begin{split}L = \left[ \begin{array}{rrrrrrr} 1 & 0 & 0 & \cdots & 0 & 0 & 0\\ \frac{a^{(0)}_{21}}{a^{(0)}_{11}} & 1 & 0 & \cdots & 0 & 0 & 0\\ \frac{a^{(0)}_{31}}{a^{(0)}_{11}} & \frac{a^{(1)}_{32}}{a^{(1)}_{22}} & 1 & \cdots & 0 & 0 & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\ \frac{a^{(0)}_{n-2,1}}{a^{(0)}_{11}} & \frac{a^{(1)}_{n-2,2}}{a^{(1)}_{22}} & \frac{a^{(2)}_{n-2,3}}{a^{(2)}_{33}} & \cdots & 1 & 0 & 0\\ \frac{a^{(0)}_{n-1,1}}{a^{(0)}_{11}} & \frac{a^{(1)}_{n-1,2}}{a^{(1)}_{22}} & \frac{a^{(2)}_{n-1,3}}{a^{(2)}_{33}} & \cdots & \frac{a^{(n-2)}_{n-1,n-2}}{a^{(n-2)}_{n-2,n-2}} & 1 & 0\\ \frac{a^{(0)}_{n,1}}{a^{(0)}_{11}} & \frac{a^{(1)}_{n,2}}{a^{(1)}_{22}} & \frac{a^{(2)}_{n,3}}{a^{(2)}_{33}} & \cdots & \frac{a^{(n-2)}_{n,n-2}}{a^{(n-2)}_{n-2,n-2}} & \frac{a^{(n-1)}_{n,n-1}}{a^{(n-1)}_{n-1,n-1}} & 1\\ \end{array}\right]\end{split}\]

    This is exactly the general form of L presented in the notes. This is where that form comes from.

  2. To perform LU decomposition in MATLAB we use the lu function. The two most common methods of using lu are:

    • [L,U] = lu(A) : this will return the \(L\) and \(U\) matrices with the \(L\) matrix multiplied by a permutation matrix \(P\) (if partial pivoting occurred during the solution procedure).
    • `[L,U,P] = lu(A) : this will return the \(L\) and \(U\) matrices with the implicit understanding that the system solved was \(PA\).

    To perform LU decomposition in Python we use the lu_factor function.

    MATLAB code Python code
    A = [1, 1, 1, 1;
         1,-2,-2,-2;
         1, 4,-4, 1;
         1,-5,-5,-3];
     
    [L,U] = lu(A)
    [L,U,P] = lu(A)
    
    % Running the above script results in the following output
    
    L =
        1.0000         0         0         0
        1.0000    0.5000         0    1.0000
        1.0000   -0.5000    1.0000         0
        1.0000    1.0000         0         0
    
    U =
         1     1     1     1
         0    -6    -6    -4
         0     0    -8    -2
         0     0     0    -1
    
    
    L =
        1.0000         0         0         0
        1.0000    1.0000         0         0
        1.0000   -0.5000    1.0000         0
        1.0000    0.5000         0    1.0000
    
    
    U =
         1     1     1     1
         0    -6    -6    -4
         0     0    -8    -2
         0     0     0    -1
    
    
    P =
         1     0     0     0
         0     0     0     1
         0     0     1     0
         0     1     0     0
    
    import numpy as np
    from numpy.linalg import *
    from scipy.linalg import *
    
    A = np.array([[1,1,1,1],
                  [1,-2,-2,-2],
                  [1,4,-4,1],
                  [1,-5,-5,-3]])
    
    LU, P = lu_factor(A)
    
    print('LU\n %s\n P\n %s' % (LU, P))
    """
    Running the above code results in the following output
    
    LU
    [[ 1.   1.   1.   1. ]
     [ 1.  -6.  -6.  -4. ]
     [ 1.  -0.5 -8.  -2. ]
     [ 1.   0.5  0.  -1. ]]
    P
    [0 3 2 3]
    
    Remember that python outputs L and U in a compressed format
    
    The separated L and U matrices would be
    
    L
    [[ 1.   0.   0.   0. ]
     [ 1.   1.   0.   0. ]
     [ 1.  -0.5  1.   0. ]
     [ 1.   0.5  0.   1. ]]
    
    U
    [[ 1.   1.   1.   1. ]
     [ 0.  -6.  -6.  -4. ]
     [ 0.   0.  -8.  -2. ]
     [ 0.   0.   0.  -1. ]]
    """
    

    We note that the results are different from those we calculated above. This is because MATLAB and Python perform \(LU\) decomposition with partial pivoting. Thus the system solved was actually \(PA = LU\).

  3. To calculate the inverse of a matrix using its \(LU\) decomposition we realize that the system \(AX = B\), where \(X\) and \(B\) are matrices, may be solved by splitting \(X\) and \(B\) into their individual column pairs (i.e. column 1 of \(X\) goes with column 1 of \(B\), column 2 with column 2, etc.) and solving each system of linear equations separately. In this case we know a matrix multiplied by its inverse yields the identity matrix. Therefore:

    \[\begin{split}B = \left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right]\end{split}\]

    Finally, we recall that the \(LU\) decomposition of a matrix may be used to simplify the solution process by splitting the system into two separate systems that are easily solved.

    • \(Ly = b\)
    • \(Ux = y\)

    Invert column 1

    We start with the subsystem \(Ly = b\).

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & -1 & 1 & 0\\ 1 & 2 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{11}\\ y_{21}\\ y_{31}\\ y_{41}\\ \end{array} \right] = \left[\begin{array}{r} 1\\ 0\\ 0\\ 0\\ \end{array} \right]\end{split}\]

    We eliminate the elements in the first column under the diagonal (pivot) element by adding (-1) times row 1 to row 2, (-1) times row 1 to row 3, and (-1) times row 1 to row 4.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 + (-1)(1) & 1 + (-1)(0) & 0 + (-1)(0) & 0 + (-1)(0)\\ 1 + (-1)(1) & -1 + (-1)(0) & 1 + (-1)(0) & 0 + (-1)(0)\\ 1 + (-1)(1) & 2 + (-1)(0) & 0 + (-1)(0) & 1 + (-1)(0)\\ \end{array} \right] \left[\begin{array}{r} y_{11}\\ y_{21}\\ y_{31}\\ y_{41}\\ \end{array} \right] = \left[\begin{array}{r} 1\\ 0 + (-1)(1)\\ 0 + (-1)(1)\\ 0 + (-1)(1)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & -1 & 1 & 0\\ 0 & 2 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{11}\\ y_{21}\\ y_{31}\\ y_{41}\\ \end{array} \right] = \left[\begin{array}{r} 1\\ -1\\ -1\\ -1\\ \end{array} \right]\end{split}\]

    We eliminate the elements in the second column under the diagonal (pivot) element by adding (1) times row 2 to row 3, and (-2) times row 2 to row 4.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & -1 + (1)(1) & 1 + (1)(0) & 0 + (1)(0)\\ 0 & 2 + (-2)(1) & 0 + (-2)(0) & 1 + (-2)(0)\\ \end{array} \right] \left[\begin{array}{r} y_{11}\\ y_{21}\\ y_{31}\\ y_{41}\\ \end{array} \right] = \left[\begin{array}{r} 1\\ -1\\ -1 + (1)(-1)\\ -1 + (-2)(-1)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{11}\\ y_{21}\\ y_{31}\\ y_{41}\\ \end{array} \right] = \left[\begin{array}{r} 1\\ -1\\ -2\\ 1\\ \end{array} \right]\end{split}\]

    Now that we have solved for \(y\) we move on to solve the second subsystem \(Ux = y\).

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 0 & -3 & -3 & -3\\ 0 & 0 & -8 & -3\\ 0 & 0 & 0 & 2\\ \end{array} \right] \left[\begin{array}{r} x_{11}\\ x_{21}\\ x_{31}\\ x_{41}\\ \end{array} \right] = \left[\begin{array}{r} 1\\ -1\\ -2\\ 1\\ \end{array} \right]\end{split}\]

    We start by dividing row 4 by its diagonal element and then adding 3 times row 4 to row 3, 3 times row 4 to row 2, and (-1) times row 4 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 & 1 + (-1)\left(\frac{2}{(2)}\right)\\ 0 & -3 & -3 & -3 + (3)\left(\frac{2}{(2)}\right)\\ 0 & 0 & -8 & -3 + (3)\left(\frac{2}{(2)}\right)\\ 0 & 0 & 0 & \frac{2}{(2)}\\ \end{array} \right] \left[\begin{array}{r} x_{11}\\ x_{21}\\ x_{31}\\ x_{41}\\ \end{array} \right] = \left[\begin{array}{r} 1 + (-1)\left(\frac{1}{(2)}\right)\\ -1 + (3)\left(\frac{1}{(2)}\right)\\ -2 + (3)\left(\frac{1}{(2)}\right)\\ \frac{1}{(2)}\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 1 & 0\\ 0 & -3 & -3 & 0\\ 0 & 0 & -8 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{11}\\ x_{21}\\ x_{31}\\ x_{41}\\ \end{array} \right] = \left[\begin{array}{r} 1/2\\ 1/2\\ -1/2\\ 1/2\\ \end{array} \right]\end{split}\]

    Next we divide row 3 by its diagonal element and then add 3 times row 3 to row 2, and (-1) times row 3 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 + (-1)\left(\frac{-8}{(-8)}\right) & 0\\ 0 & -3 & -3 + (3)\left(\frac{-8}{(-8)}\right) & 0\\ 0 & 0 & \frac{-8}{(-8)} & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{11}\\ x_{21}\\ x_{31}\\ x_{41}\\ \end{array} \right] = \left[\begin{array}{r} 1/2 + (-1)\left(\frac{-1/2}{(-8)}\right)\\ 1/2 + (3)\left(\frac{-1/2}{(-8)}\right)\\ \frac{-1/2}{(-8)}\\ 1/2\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{11}\\ x_{21}\\ x_{31}\\ x_{41}\\ \end{array} \right] = \left[\begin{array}{r} 7/16\\ 11/16\\ 1/16\\ 1/2\\ \end{array} \right]\end{split}\]

    Finally we divide row 2 by its diagonal element and then add (-1) times row 2 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 + (-1)\left(\frac{-3}{(-3)}\right) & 0 & 0\\ 0 & \frac{-3}{(-3)} & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{11}\\ x_{21}\\ x_{31}\\ x_{41}\\ \end{array} \right] = \left[\begin{array}{r} 7/16 + (-1)\frac{11/16}{(-3)}\\ \frac{11/16}{(-3)}\\ 1/16\\ 1/2\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{11}\\ x_{21}\\ x_{31}\\ x_{41}\\ \end{array} \right] = \left[\begin{array}{r} 32/48\\ -11/48\\ 3/48\\ 24/48\\ \end{array} \right]\end{split}\]

    Therefore the first column of the inverse of \(A\) is.

    \[\begin{split}\left[\begin{array}{r} x_{11}\\ x_{21}\\ x_{31}\\ x_{41}\\ \end{array} \right] = \left[\begin{array}{r} 32/48\\ -11/48\\ 3/48\\ 24/48\\ \end{array} \right]\end{split}\]

    Invert column 2

    We start with the subsystem \(Ly = b\).

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & -1 & 1 & 0\\ 1 & 2 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{12}\\ y_{22}\\ y_{32}\\ y_{42}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 1\\ 0\\ 0\\ \end{array} \right]\end{split}\]

    We eliminate the elements in the first column under the diagonal (pivot) element by adding (-1) times row 1 to row 2, (-1) times row 1 to row 3, and (-1) times row 1 to row 4.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 + (-1)(1) & 1 + (-1)(0) & 0 + (-1)(0) & 0 + (-1)(0)\\ 1 + (-1)(1) & -1 + (-1)(0) & 1 + (-1)(0) & 0 + (-1)(0)\\ 1 + (-1)(1) & 2 + (-1)(0) & 0 + (-1)(0) & 1 + (-1)(0)\\ \end{array} \right] \left[\begin{array}{r} y_{12}\\ y_{22}\\ y_{32}\\ y_{42}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 1 + (-1)(0)\\ 0 + (-1)(0)\\ 0 + (-1)(0)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & -1 & 1 & 0\\ 0 & 2 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{12}\\ y_{22}\\ y_{32}\\ y_{42}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 1\\ 0\\ 0\\ \end{array} \right]\end{split}\]

    We eliminate the elements in the second column under the diagonal (pivot) element by adding (1) times row 2 to row 3, and (-2) times row 2 to row 4.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & -1 + (1)(1) & 1 + (1)(0) & 0 + (1)(0)\\ 0 & 2 + (-2)(1) & 0 + (-2)(0) & 1 + (-2)(0)\\ \end{array} \right] \left[\begin{array}{r} y_{12}\\ y_{22}\\ y_{32}\\ y_{42}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 1\\ 0 + (1)(1)\\ 0 + (-2)(1)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{12}\\ y_{22}\\ y_{32}\\ y_{42}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 1\\ 1\\ -2\\ \end{array} \right]\end{split}\]

    Now that we have solved for \(y\) we move on to solve the second subsystem \(Ux = y\).

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 0 & -3 & -3 & -3\\ 0 & 0 & -8 & -3\\ 0 & 0 & 0 & 2\\ \end{array} \right] \left[\begin{array}{r} x_{12}\\ x_{22}\\ x_{32}\\ x_{42}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 1\\ 1\\ -2\\ \end{array} \right]\end{split}\]

    We start by dividing row 4 by its diagonal element and then adding 3 times row 4 to row 3, 3 times row 4 to row 2, and (-1) times row 4 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 & 1 + (-1)\left(\frac{2}{(2)}\right)\\ 0 & -3 & -3 & -3 + (3)\left(\frac{2}{(2)}\right)\\ 0 & 0 & -8 & -3 + (3)\left(\frac{2}{(2)}\right)\\ 0 & 0 & 0 & \frac{2}{(2)}\\ \end{array} \right] \left[\begin{array}{r} x_{12}\\ x_{22}\\ x_{32}\\ x_{42}\\ \end{array} \right] = \left[\begin{array}{r} 0 + (-1)\left(\frac{-2}{(2)}\right)\\ 1 + (3)\left(\frac{-2}{(2)}\right)\\ 1 + (3)\left(\frac{-2}{(2)}\right)\\ \frac{-2}{(2)}\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 1 & 0\\ 0 & -3 & -3 & 0\\ 0 & 0 & -8 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{12}\\ x_{22}\\ x_{32}\\ x_{42}\\ \end{array} \right] = \left[\begin{array}{r} 1\\ -2\\ -2\\ -1\\ \end{array} \right]\end{split}\]

    Next we divide row 3 by its diagonal element and then add 3 times row 3 to row 2, and (-1) times row 3 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 + (-1)\left(\frac{-8}{(-8)}\right) & 0\\ 0 & -3 & -3 + (3)\left(\frac{-8}{(-8)}\right) & 0\\ 0 & 0 & \frac{-8}{(-8)} & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{12}\\ x_{22}\\ x_{32}\\ x_{42}\\ \end{array} \right] = \left[\begin{array}{r} 1 + (-1)\left(\frac{-2}{(-8)}\right)\\ -2 + (3)\left(\frac{-2}{(-8)}\right)\\ \frac{-2}{(-8)}\\ -1\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{12}\\ x_{22}\\ x_{32}\\ x_{42}\\ \end{array} \right] = \left[\begin{array}{r} 3/4\\ -5/4\\ 1/4\\ -1\\ \end{array} \right]\end{split}\]

    Finally we divide row 2 by its diagonal element and then add (-1) times row 2 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 + (-1)\left(\frac{-3}{(-3)}\right) & 0 & 0\\ 0 & \frac{-3}{(-3)} & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{12}\\ x_{22}\\ x_{32}\\ x_{42}\\ \end{array} \right] = \left[\begin{array}{r} 3/4 + (-1)\frac{-5/4}{(-3)}\\ \frac{-5/4}{(-3)}\\ 1/4\\ -1\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{12}\\ x_{22}\\ x_{32}\\ x_{42}\\ \end{array} \right] = \left[\begin{array}{r} 1/3\\ 5/12\\ 1/4\\ -1\\ \end{array} \right]\end{split}\]

    Therefore the first column of the inverse of \(A\) is.

    \[\begin{split}\left[\begin{array}{r} x_{12}\\ x_{22}\\ x_{32}\\ x_{42}\\ \end{array} \right] = \left[\begin{array}{r} 16/48\\ 20/48\\ 12/48\\ -48\\ \end{array} \right]\end{split}\]

    Invert column 3

    We start with the subsystem \(Ly = b\).

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & -1 & 1 & 0\\ 1 & 2 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{13}\\ y_{23}\\ y_{33}\\ y_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 1\\ 0\\ \end{array} \right]\end{split}\]

    We eliminate the elements in the first column under the diagonal (pivot) element by adding (-1) times row 1 to row 2, (-1) times row 1 to row 3, and (-1) times row 1 to row 4.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 + (-1)(1) & 1 + (-1)(0) & 0 + (-1)(0) & 0 + (-1)(0)\\ 1 + (-1)(1) & -1 + (-1)(0) & 1 + (-1)(0) & 0 + (-1)(0)\\ 1 + (-1)(1) & 2 + (-1)(0) & 0 + (-1)(0) & 1 + (-1)(0)\\ \end{array} \right] \left[\begin{array}{r} y_{13}\\ y_{23}\\ y_{33}\\ y_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0 + (-1)(0)\\ 1 + (-1)(0)\\ 0 + (-1)(0)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & -1 & 1 & 0\\ 0 & 2 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{13}\\ y_{23}\\ y_{33}\\ y_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 1\\ 0\\ \end{array} \right]\end{split}\]

    We eliminate the elements in the second column under the diagonal (pivot) element by adding (1) times row 2 to row 3, and (-2) times row 2 to row 4.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & -1 + (1)(1) & 1 + (1)(0) & 0 + (1)(0)\\ 0 & 2 + (-2)(1) & 0 + (-2)(0) & 1 + (-2)(0)\\ \end{array} \right] \left[\begin{array}{r} y_{13}\\ y_{23}\\ y_{33}\\ y_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 1 + (1)(0)\\ 0 + (-2)(0)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{13}\\ y_{23}\\ y_{33}\\ y_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 1\\ 0\\ \end{array} \right]\end{split}\]

    Now that we have solved for \(y\) we move on to solve the second subsystem \(Ux = y\).

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 0 & -3 & -3 & -3\\ 0 & 0 & -8 & -3\\ 0 & 0 & 0 & 2\\ \end{array} \right] \left[\begin{array}{r} x_{13}\\ x_{23}\\ x_{33}\\ x_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 1\\ 0\\ \end{array} \right]\end{split}\]

    We start by dividing row 4 by its diagonal element and then adding 3 times row 4 to row 3, 3 times row 4 to row 2, and (-1) times row 4 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 & 1 + (-1)\left(\frac{2}{(2)}\right)\\ 0 & -3 & -3 & -3 + (3)\left(\frac{2}{(2)}\right)\\ 0 & 0 & -8 & -3 + (3)\left(\frac{2}{(2)}\right)\\ 0 & 0 & 0 & \frac{2}{(2)}\\ \end{array} \right] \left[\begin{array}{r} x_{13}\\ x_{23}\\ x_{33}\\ x_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0 + (-1)\left(\frac{0}{(2)}\right)\\ 0 + (3)\left(\frac{0}{(2)}\right)\\ 1 + (3)\left(\frac{0}{(2)}\right)\\ \frac{0}{(2)}\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 1 & 0\\ 0 & -3 & -3 & 0\\ 0 & 0 & -8 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{13}\\ x_{23}\\ x_{33}\\ x_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 1\\ 0\\ \end{array} \right]\end{split}\]

    Next we divide row 3 by its diagonal element and then add 3 times row 3 to row 2, and (-1) times row 3 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 + (-1)\left(\frac{-8}{(-8)}\right) & 0\\ 0 & -3 & -3 + (3)\left(\frac{-8}{(-8)}\right) & 0\\ 0 & 0 & \frac{-8}{(-8)} & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{13}\\ x_{23}\\ x_{33}\\ x_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0 + (-1)\left(\frac{1}{(-8)}\right)\\ 0 + (3)\left(\frac{1}{(-8)}\right)\\ \frac{1}{(-8)}\\ 0\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{13}\\ x_{23}\\ x_{33}\\ x_{43}\\ \end{array} \right] = \left[\begin{array}{r} 1/8\\ -3/8\\ -1/8\\ 0\\ \end{array} \right]\end{split}\]

    Finally we divide row 2 by its diagonal element and then add (-1) times row 2 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 + (-1)\left(\frac{-3}{(-3)}\right) & 0 & 0\\ 0 & \frac{-3}{(-3)} & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{13}\\ x_{23}\\ x_{33}\\ x_{43}\\ \end{array} \right] = \left[\begin{array}{r} 1/8 + (-1)\frac{-3/8}{(-3)}\\ \frac{-3/8}{(-3)}\\ -1/8\\ 0\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{13}\\ x_{23}\\ x_{33}\\ x_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 1/8\\ -1/8\\ 0\\ \end{array} \right]\end{split}\]

    Therefore the first column of the inverse of \(A\) is.

    \[\begin{split}\left[\begin{array}{r} x_{13}\\ x_{23}\\ x_{33}\\ x_{43}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 6/48\\ -6/48\\ 0\\ \end{array} \right]\end{split}\]

    Invert column 4

    We start with the subsystem \(Ly = b\).

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & -1 & 1 & 0\\ 1 & 2 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{14}\\ y_{24}\\ y_{34}\\ y_{44}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 0\\ 1\\ \end{array} \right]\end{split}\]

    We eliminate the elements in the first column under the diagonal (pivot) element by adding (-1) times row 1 to row 2, (-1) times row 1 to row 3, and (-1) times row 1 to row 4.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 1 + (-1)(1) & 1 + (-1)(0) & 0 + (-1)(0) & 0 + (-1)(0)\\ 1 + (-1)(1) & -1 + (-1)(0) & 1 + (-1)(0) & 0 + (-1)(0)\\ 1 + (-1)(1) & 2 + (-1)(0) & 0 + (-1)(0) & 1 + (-1)(0)\\ \end{array} \right] \left[\begin{array}{r} y_{14}\\ y_{24}\\ y_{34}\\ y_{44}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0 + (-1)(0)\\ 0 + (-1)(0)\\ 1 + (-1)(0)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & -1 & 1 & 0\\ 0 & 2 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{14}\\ y_{24}\\ y_{34}\\ y_{44}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 0\\ 1\\ \end{array} \right]\end{split}\]

    We eliminate the elements in the second column under the diagonal (pivot) element by adding (1) times row 2 to row 3, and (-2) times row 2 to row 4.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & -1 + (1)(1) & 1 + (1)(0) & 0 + (1)(0)\\ 0 & 2 + (-2)(1) & 0 + (-2)(0) & 1 + (-2)(0)\\ \end{array} \right] \left[\begin{array}{r} y_{14}\\ y_{24}\\ y_{34}\\ y_{44}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 0 + (1)(0)\\ 1 + (-2)(0)\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} y_{14}\\ y_{24}\\ y_{34}\\ y_{44}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 0\\ 1\\ \end{array} \right]\end{split}\]

    Now that we have solved for \(y\) we move on to solve the second subsystem \(Ux = y\).

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 & 1\\ 0 & -3 & -3 & -3\\ 0 & 0 & -8 & -3\\ 0 & 0 & 0 & 2\\ \end{array} \right] \left[\begin{array}{r} x_{14}\\ x_{24}\\ x_{34}\\ x_{44}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ 0\\ 0\\ 1\\ \end{array} \right]\end{split}\]

    We start by dividing row 4 by its diagonal element and then adding 3 times row 4 to row 3, 3 times row 4 to row 2, and (-1) times row 4 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 & 1 + (-1)\left(\frac{2}{(2)}\right)\\ 0 & -3 & -3 & -3 + (3)\left(\frac{2}{(2)}\right)\\ 0 & 0 & -8 & -3 + (3)\left(\frac{2}{(2)}\right)\\ 0 & 0 & 0 & \frac{2}{(2)}\\ \end{array} \right] \left[\begin{array}{r} x_{14}\\ x_{24}\\ x_{34}\\ x_{44}\\ \end{array} \right] = \left[\begin{array}{r} 0 + (-1)\left(\frac{1}{(2)}\right)\\ 0 + (3)\left(\frac{1}{(2)}\right)\\ 0 + (3)\left(\frac{1}{(2)}\right)\\ \frac{1}{(2)}\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 1 & 0\\ 0 & -3 & -3 & 0\\ 0 & 0 & -8 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{14}\\ x_{24}\\ x_{34}\\ x_{44}\\ \end{array} \right] = \left[\begin{array}{r} -1/2\\ 3/2\\ 3/2\\ 1/2\\ \end{array} \right]\end{split}\]

    Next we divide row 3 by its diagonal element and then add 3 times row 3 to row 2, and (-1) times row 3 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 & 1 + (-1)\left(\frac{-8}{(-8)}\right) & 0\\ 0 & -3 & -3 + (3)\left(\frac{-8}{(-8)}\right) & 0\\ 0 & 0 & \frac{-8}{(-8)} & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{14}\\ x_{24}\\ x_{34}\\ x_{44}\\ \end{array} \right] = \left[\begin{array}{r} -1/2 + (-1)\left(\frac{3/2}{(-8)}\right)\\ 3/2 + (3)\left(\frac{3/2}{(-8)}\right)\\ \frac{3/2}{(-8)}\\ 1/2\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{14}\\ x_{24}\\ x_{34}\\ x_{44}\\ \end{array} \right] = \left[\begin{array}{r} -5/16\\ 15/16\\ -3/16\\ 1/2\\ \end{array} \right]\end{split}\]

    Finally we divide row 2 by its diagonal element and then add (-1) times row 2 to row 1.

    \[\begin{split}\left[\begin{array}{rrrr} 1 & 1 + (-1)\left(\frac{-3}{(-3)}\right) & 0 & 0\\ 0 & \frac{-3}{(-3)} & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{14}\\ x_{24}\\ x_{34}\\ x_{44}\\ \end{array} \right] = \left[\begin{array}{r} -5/16 + (-1)\frac{15/16}{(-3)}\\ \frac{15/16}{(-3)}\\ -3/16\\ 1/2\\ \end{array} \right] \;\; \Longrightarrow \;\; \left[\begin{array}{rrrr} 1 & 1 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] \left[\begin{array}{r} x_{14}\\ x_{24}\\ x_{34}\\ x_{44}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ -15/48\\ -3/16\\ 1/2\\ \end{array} \right]\end{split}\]

    Therefore the first column of the inverse of \(A\) is.

    \[\begin{split}\left[\begin{array}{r} x_{14}\\ x_{24}\\ x_{34}\\ x_{44}\\ \end{array} \right] = \left[\begin{array}{r} 0\\ -15/48\\ -9/48\\ 24/48\\ \end{array} \right]\end{split}\]

    Therefore the inverse of \(A\) is

    \[\begin{split}A^{-1} = [x_{:1},x_{:2},x_{:3},x_{:4}] = \left[\begin{array}{rrrr} 32/48 & 16/48 & 0 & 0\\ -11/48 & 20/48 & 6/48 & -15/48\\ 3/48 & 12/48 & -6/48 & -9/48\\ 24/48 & -48/48 & 0 & 24/48\\ \end{array}\right] = \left[\begin{array}{rrrr} 0.6667 & 0.3333 & 0 & 0\\ -0.2292 & 0.4167 & 0.1250 & -0.3125\\ 0.0625 & 0.2500 & -0.1250 & -0.1875\\ 0.5000 & -1.0000 & 0 & 0.5000\\ \end{array}\right]\end{split}\]

    Check the inverse

    To check the inverse of \(A\) in MATLAB and Python we use the inv() function.

    MATLAB code Python code
    A = [1, 1, 1, 1;
         1,-2,-2,-2;
         1, 4,-4, 1;
         1,-5,-5,-3];
     
    A_inv = inv(A)
    
    % Running the above script results in the following output
    A_inv =
        0.6667    0.3333         0   -0.0000
       -0.2292    0.4167    0.1250   -0.3125
        0.0625    0.2500   -0.1250   -0.1875
        0.5000   -1.0000         0    0.5000
    
    import numpy as np
    from numpy.linalg import *
    from scipy.linalg import *
    
    A = np.array([[1,1,1,1],
                  [1,-2,-2,-2],
                  [1,4,-4,1],
                  [1,-5,-5,-3]])
    print ('A inverted = \n%s' % inv(A))
    
    """
    Running the above code results in the following output
    
    A inverted = 
    [[  6.66666667e-01   3.33333333e-01   0.00000000e+00  -2.77555756e-17]
     [ -2.29166667e-01   4.16666667e-01   1.25000000e-01  -3.12500000e-01]
     [  6.25000000e-02   2.50000000e-01  -1.25000000e-01  -1.87500000e-01]
     [  5.00000000e-01  -1.00000000e+00   0.00000000e+00   5.00000000e-01]]
    """
    

    Finally we show that \(AA^{-1} = I\). Since matrix multiplication takes up alot of space we will do this column by column.

    \[\begin{split}id_{:1} = \left[\begin{array}{r} a_{11}a^{-1}_{11} + a_{12}a^{-1}_{21} + a_{13}a^{-1}_{31} + a_{14}a^{-1}_{41}\\ a_{21}a^{-1}_{11} + a_{22}a^{-1}_{21} + a_{23}a^{-1}_{31} + a_{24}a^{-1}_{41}\\ a_{31}a^{-1}_{11} + a_{32}a^{-1}_{21} + a_{33}a^{-1}_{31} + a_{34}a^{-1}_{41}\\ a_{41}a^{-1}_{11} + a_{42}a^{-1}_{21} + a_{43}a^{-1}_{31} + a_{44}a^{-1}_{41}\\ \end{array}\right] = \left[\begin{array}{r} (1)(32/48) + (1)(-11/48) + (1)(3/48) + (1)(24/48)\\ (1)(32/48) + (-2)(-11/48) + (-2)(3/48) + (-2)(24/48)\\ (1)(32/48) + (4)(-11/48) + (-4)(3/48) + (1)(24/48)\\ (1)(32/48) + (-5)(-11/48) + (-5)(3/48) + (1)(24/48)\\ \end{array}\right] = \left[\begin{array}{r} 1\\ 0\\ 0\\ 0\\ \end{array}\right]\end{split}\]
    \[\begin{split}id_{:2} = \left[\begin{array}{r} a_{11}a^{-1}_{12} + a_{12}a^{-1}_{22} + a_{13}a^{-1}_{32} + a_{14}a^{-1}_{42}\\ a_{21}a^{-1}_{12} + a_{22}a^{-1}_{22} + a_{23}a^{-1}_{32} + a_{24}a^{-1}_{42}\\ a_{31}a^{-1}_{12} + a_{32}a^{-1}_{22} + a_{33}a^{-1}_{32} + a_{34}a^{-1}_{42}\\ a_{41}a^{-1}_{12} + a_{42}a^{-1}_{22} + a_{43}a^{-1}_{32} + a_{44}a^{-1}_{42}\\ \end{array}\right] = \left[\begin{array}{r} (1)(16/48) + (1)(20/48) + (1)(12/48) + (1)(-48/48)\\ (1)(16/48) + (-2)(20/48) + (-2)(12/48) + (-2)(-48/48)\\ (1)(16/48) + (4)(20/48) + (-4)(12/48) + (1)(-48/48)\\ (1)(16/48) + (-5)(20/48) + (-5)(12/48) + (1)(-48/48)\\ \end{array}\right] = \left[\begin{array}{r} 0\\ 1\\ 0\\ 0\\ \end{array}\right]\end{split}\]
    \[\begin{split}id_{:3} = \left[\begin{array}{r} a_{11}a^{-1}_{13} + a_{12}a^{-1}_{23} + a_{13}a^{-1}_{33} + a_{14}a^{-1}_{43}\\ a_{21}a^{-1}_{13} + a_{22}a^{-1}_{23} + a_{23}a^{-1}_{33} + a_{24}a^{-1}_{43}\\ a_{31}a^{-1}_{13} + a_{32}a^{-1}_{23} + a_{33}a^{-1}_{33} + a_{34}a^{-1}_{43}\\ a_{41}a^{-1}_{13} + a_{42}a^{-1}_{23} + a_{43}a^{-1}_{33} + a_{44}a^{-1}_{43}\\ \end{array}\right] = \left[\begin{array}{r} (1)(0) + (1)(6/48) + (1)(-6/48) + (1)(0)\\ (1)(0) + (-2)(6/48) + (-2)(-6/48) + (-2)(0)\\ (1)(0) + (4)(6/48) + (-4)(-6/48) + (1)(0)\\ (1)(0) + (-5)(6/48) + (-5)(-6/48) + (1)(0)\\ \end{array}\right] = \left[\begin{array}{r} 0\\ 0\\ 1\\ 0\\ \end{array}\right]\end{split}\]
    \[\begin{split}id_{:4} = \left[\begin{array}{r} a_{11}a^{-1}_{14} + a_{12}a^{-1}_{24} + a_{13}a^{-1}_{34} + a_{14}a^{-1}_{44}\\ a_{21}a^{-1}_{14} + a_{22}a^{-1}_{24} + a_{23}a^{-1}_{34} + a_{24}a^{-1}_{44}\\ a_{31}a^{-1}_{14} + a_{32}a^{-1}_{24} + a_{33}a^{-1}_{34} + a_{34}a^{-1}_{44}\\ a_{41}a^{-1}_{14} + a_{42}a^{-1}_{24} + a_{43}a^{-1}_{34} + a_{44}a^{-1}_{44}\\ \end{array}\right] = \left[\begin{array}{r} (1)(0) + (1)(-15/48) + (1)(-9/48) + (1)(24/48)\\ (1)(0) + (-2)(-15/48) + (-2)(-9/48) + (-2)(24/48)\\ (1)(0) + (4)(-15/48) + (-4)(-9/48) + (1)(24/48)\\ (1)(0) + (-5)(-15/48) + (-5)(-9/48) + (1)(24/48)\\ \end{array}\right] = \left[\begin{array}{r} 0\\ 0\\ 0\\ 1\\ \end{array}\right]\end{split}\]

    From the above it is easy to see that

    \[\begin{split}AA^{-1} = \left[id_{:1},id_{:2},id_{:3},id_{:4}\right] = \left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ \end{array} \right] = I\end{split}\]

    Therefore we have proven that \(AA^{-1} = I\).

← Question 1 (previous step) Back to all questions Next step: Question 3 →