Chapter 26 Number Problems


26.2. Matrix Multiplication

26.2.1. The Standard Approach

How do we normally do matrix multiplication? Given \(n \times n\) matrices \(A\) and \(B\), the standard approach is to compute element \(c_{ij}\) in \(C = A \times B\) as follows:

\[c_{ij} = \sum_{k=1}^n a_{ik}b_{kj}.\]

This requires \(\Theta(n^3)\) multiplications and \(\Theta(n^3)\) additions. However, this is not quite as bad as it first looks, since the input size is not \(n\) but \(n^2\).
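To make the summation concrete, here is a minimal Python sketch of the standard algorithm (the function name matmul and the list-of-lists representation are our illustrative choices, not from the text):

    def matmul(A, B):
        """Standard Theta(n^3) multiplication of two n x n matrices,
        represented as lists of lists."""
        n = len(A)
        C = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                # c_ij is the dot product of row i of A and column j of B.
                for k in range(n):
                    C[i][j] += A[i][k] * B[k][j]
        return C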

But is this really the best that we can do? Consider that the naive lower bound for any matrix multiplication algorithm is \(\Omega(n^2)\), because we create \(n^2\) outputs.

For the moment, let's consider multiplying two \(2 \times 2\) matrices. The basic algorithm requires exactly 8 multiplications and 4 additions. But consider this alternative. Compute:

\[\begin{split}\begin{eqnarray*} m_1 &=& (a_{12} - a_{22})(b_{21} + b_{22})\\ m_2 &=& (a_{11} + a_{22})(b_{11} + b_{22})\\ m_3 &=& (a_{11} - a_{21})(b_{11} + b_{12})\\ m_4 &=& (a_{11} + a_{12})b_{22}\\ m_5 &=& a_{11}(b_{12} - b_{22})\\ m_6 &=& a_{22}(b_{21} - b_{11})\\ m_7 &=& (a_{21} + a_{22})b_{11} \end{eqnarray*}\end{split}\]

Then:

\[\begin{split}\begin{eqnarray*} c_{11} &=& m_1 + m_2 - m_4 + m_6\\ c_{12} &=& m_4 + m_5\\ c_{21} &=& m_6 + m_7\\ c_{22} &=& m_2 - m_3 + m_5 - m_7 \end{eqnarray*}\end{split}\]

To verify that this is correct, let's check one of these:

\[\begin{split}\begin{eqnarray*} c_{11} &=& m_1 + m_2 - m_4 + m_6\\ &=& (a_{12} - a_{22})(b_{21} + b_{22}) + (a_{11} + a_{22})(b_{11} + b_{22})\\ &&\qquad- (a_{11} + a_{12})b_{22} + a_{22}(b_{21} - b_{11})\\ &=& a_{12}b_{21} + a_{12}b_{22} - a_{22}b_{21} - a_{22}b_{22} + a_{11}b_{11} + a_{11}b_{22} + a_{22}b_{11}\\ &&\qquad + a_{22}b_{22} - a_{11}b_{22} - a_{12}b_{22} + a_{22}b_{21} - a_{22}b_{11}\\ &=& a_{11}b_{11} + a_{12}b_{21} \end{eqnarray*}\end{split}\]

This requires 7 multiplications and 18 additions/subtractions. If we are actually multiplying \(2 \times 2\) arrays of numbers, that is no win at all, because there is little difference between the cost of a multiplication and the cost of an addition when the entries are integers (or even floating-point numbers). But the same identities work just as well when multiplying two matrices of size \(2n \times 2n\). In that case, the seven multiplications are of half-size (\(n \times n\)) matrices, and at that scale a matrix multiplication (\(\Theta(n^3)\)) is far more expensive than a matrix addition (\(\Theta(n^2)\)).
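As a concrete check, the seven-product scheme can be coded directly for \(2 \times 2\) inputs and compared against the standard result. This is a minimal sketch; the function name strassen_2x2 is ours:

    def strassen_2x2(A, B):
        # Multiply two 2x2 matrices with 7 multiplications instead of 8.
        (a11, a12), (a21, a22) = A
        (b11, b12), (b21, b22) = B
        m1 = (a12 - a22) * (b21 + b22)
        m2 = (a11 + a22) * (b11 + b22)
        m3 = (a11 - a21) * (b11 + b12)
        m4 = (a11 + a12) * b22
        m5 = a11 * (b12 - b22)
        m6 = a22 * (b21 - b11)
        m7 = (a21 + a22) * b11
        return [[m1 + m2 - m4 + m6, m4 + m5],
                [m6 + m7, m2 - m3 + m5 - m7]]

    # Agrees with the standard algorithm on an arbitrary example:
    assert strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]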

26.2.2. Strassen’s Algorithm

The arrangement shown above, with its reduced number of multiplications, is known as Strassen's algorithm. It trades more additions/subtractions for fewer multiplications in the \(2n \times 2n\) case, and uses divide and conquer in an attempt to reduce the total work for multiplying large matrices.

As we have already seen, in the straightforward implementation, the \(2 \times 2\) case will do this work:

\[\begin{split}\begin{eqnarray*} c_{11} = a_{11}b_{11} + a_{12}b_{21}\\ c_{12} = a_{11}b_{12} + a_{12}b_{22}\\ c_{21} = a_{21}b_{11} + a_{22}b_{21}\\ c_{22} = a_{21}b_{12} + a_{22}b_{22} \end{eqnarray*}\end{split}\]

This requires 8 (matrix) multiplications and 4 (matrix) additions to multiply two \(2n \times 2n\) matrices.

Consider this divide and conquer step: Assume \(n\) is a power of 2. Express \(C = A \times B\) in terms of \(\frac{n}{2} \times \frac{n}{2}\) matrices.

\[\begin{split}\left[ \begin{array}{ll} C_{11} & C_{12}\\ C_{21} & C_{22} \end{array} \right] = \left[ \begin{array}{ll} A_{11} & A_{12}\\ A_{21} & A_{22} \end{array} \right] \left[ \begin{array}{ll} B_{11} & B_{12}\\ B_{21} & B_{22} \end{array} \right]\end{split}\]
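Note that computing each quadrant of \(C\) in the straightforward way uses 8 multiplications and 4 additions of \(\frac{n}{2} \times \frac{n}{2}\) matrices, so the recurrence \(T(n) = 8T(n/2) + 4(n/2)^2\) solves to \(\Theta(n^{\log_2 8}) = \Theta(n^3)\): the decomposition by itself gains nothing. The savings must come from reducing the number of multiplications.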

By Strassen’s algorithm, this can be computed with 7 multiplications and 18 additions/subtractions of \(n/2 \times n/2\) matrices.
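Putting the pieces together, here is a minimal recursive sketch in Python, assuming \(n\) is a power of 2 (the helper names add, sub, and quad are ours, and this is an illustration rather than a tuned implementation):

    def add(X, Y):
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

    def sub(X, Y):
        return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

    def strassen(A, B):
        # Multiply n x n matrices (n a power of 2) with 7 recursive products.
        n = len(A)
        if n == 1:
            return [[A[0][0] * B[0][0]]]
        h = n // 2
        def quad(M, r, c):  # extract an h x h quadrant starting at (r, c)
            return [row[c:c + h] for row in M[r:r + h]]
        A11, A12, A21, A22 = quad(A, 0, 0), quad(A, 0, h), quad(A, h, 0), quad(A, h, h)
        B11, B12, B21, B22 = quad(B, 0, 0), quad(B, 0, h), quad(B, h, 0), quad(B, h, h)
        # The seven half-size products.
        m1 = strassen(sub(A12, A22), add(B21, B22))
        m2 = strassen(add(A11, A22), add(B11, B22))
        m3 = strassen(sub(A11, A21), add(B11, B12))
        m4 = strassen(add(A11, A12), B22)
        m5 = strassen(A11, sub(B12, B22))
        m6 = strassen(A22, sub(B21, B11))
        m7 = strassen(add(A21, A22), B11)
        C11 = add(sub(add(m1, m2), m4), m6)   # m1 + m2 - m4 + m6
        C12 = add(m4, m5)                     # m4 + m5
        C21 = add(m6, m7)                     # m6 + m7
        C22 = sub(add(m2, m5), add(m3, m7))   # m2 - m3 + m5 - m7
        # Reassemble the four quadrants into the full result.
        top = [r1 + r2 for r1, r2 in zip(C11, C12)]
        bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
        return top + bottom

In practice the recursion would stop at some cutoff size and fall back to the standard algorithm, since (as noted below) the extra additions make Strassen's method a loss for small matrices.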

This gives the following recurrence relation, along with its solution from applying the Master Theorem:

\[\begin{split}\begin{eqnarray*} T(n) &=& 7T(n/2) + 18(n/2)^2\\ T(n) &=& \Theta(n^{\log_2 7}) = \Theta(n^{2.81}). \end{eqnarray*}\end{split}\]
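To spell out the Master Theorem step: the recurrence matches \(T(n) = aT(n/b) + f(n)\) with \(a = 7\), \(b = 2\), and \(f(n) = 18(n/2)^2 = \Theta(n^2)\). Since \(\log_2 7 \approx 2.81 > 2\), \(f(n)\) is polynomially smaller than \(n^{\log_b a}\), so the leaves of the recursion dominate and \(T(n) = \Theta(n^{\log_2 7})\).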

This clearly grows more slowly than \(\Theta(n^3)\). However, there is a much larger constant factor due to the increased number of additions, so Strassen's algorithm would only be practical to use if the matrices were extremely large.

There are other arrangements that can further reduce the exponent. The current “fastest” algorithm (in an asymptotic sense) is \(\Theta(n^{2.376})\). But it is even more impractical than Strassen’s algorithm due to the overhead of a large number of additions.

It is a theoretical open question whether matrix multiplication can be done in \(O(n^2)\) time.
