An earlier post claimed that there are practical advantages to partitioning a matrix, that is, thinking of the matrix as a matrix of matrices. This post will give an example.
Let M be a square matrix and suppose we need to multiply M by its transpose Mᵀ. We can compute this product faster than multiplying two arbitrary matrices of the same size by exploiting the fact that MMᵀ will be a symmetric matrix.
We start by partitioning M into four blocks:

    M = [ A  B ]
        [ C  D ]

Then

    Mᵀ = [ Aᵀ  Cᵀ ]
         [ Bᵀ  Dᵀ ]

and

    MMᵀ = [ AAᵀ + BBᵀ   ACᵀ + BDᵀ ]
          [ CAᵀ + DBᵀ   CCᵀ + DDᵀ ]
Now for the first clever part: we don’t have to compute both of the off-diagonal blocks because each is the transpose of the other. Each off-diagonal block accounts for two of the eight half-size block products, so by not calculating one of the blocks, say the lower-left block, we reduce our calculation by 25%.
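The block identities above are easy to sanity-check numerically. Here is a minimal NumPy sketch; the 6×6 size and the even split are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))

# Partition M into four 3x3 blocks.
A, B = M[:3, :3], M[:3, 3:]
C, D = M[3:, :3], M[3:, 3:]

upper_right = A @ C.T + B @ D.T  # off-diagonal block of M Mᵀ
lower_left  = C @ A.T + D @ B.T  # the other off-diagonal block

# Each off-diagonal block is the transpose of the other,
# so only one of them needs to be computed.
assert np.allclose(lower_left, upper_right.T)

# The assembled block product matches M Mᵀ.
P = np.block([[A @ A.T + B @ B.T, upper_right],
              [upper_right.T,     C @ C.T + D @ D.T]])
assert np.allclose(P, M @ M.T)
```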
And now for the second clever part: apply the same procedure recursively. The diagonal blocks of MMᵀ are sums of a matrix times its own transpose: AAᵀ + BBᵀ and CCᵀ + DDᵀ. That is, we can partition A and use the same idea to compute AAᵀ, and do the same for BBᵀ, CCᵀ, and DDᵀ. The off-diagonal block requires general matrix multiplications.
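The recursion can be sketched in a few lines. This is an illustrative implementation rather than a tuned one, and it assumes (for simplicity) that the matrix size is a power of 2 so the blocks always split evenly:

```python
import numpy as np

def sym_product(M):
    """Compute M @ M.T recursively, exploiting symmetry.

    Assumes M is square with size a power of 2 (a simplifying
    assumption for this sketch).
    """
    n = M.shape[0]
    if n == 1:
        return M * M
    h = n // 2
    A, B = M[:h, :h], M[:h, h:]
    C, D = M[h:, :h], M[h:, h:]
    # Diagonal blocks: sums of symmetric products, computed recursively.
    upper_left  = sym_product(A) + sym_product(B)   # A Aᵀ + B Bᵀ
    lower_right = sym_product(C) + sym_product(D)   # C Cᵀ + D Dᵀ
    # Off-diagonal block: two general products. The lower-left block
    # is its transpose and is not computed separately.
    upper_right = A @ C.T + B @ D.T
    return np.block([[upper_left,    upper_right],
                     [upper_right.T, lower_right]])

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
assert np.allclose(sym_product(M), M @ M.T)
```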
The net result is that we can compute MMᵀ in about 2/3 the time it would take to multiply two arbitrary matrices of the same size, counting multiplications and using a fast algorithm such as Strassen’s for the general block products. If s is the cost of the symmetric product relative to a general product, the recursion gives s = (4s + 2)/7, whose fixed point is s = 2/3.
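To see where the ratio comes from, count scalar multiplications: Strassen’s algorithm turns one product of size n into 7 products of size n/2, while the symmetric recursion uses 4 symmetric products and 2 general products of size n/2. A short script (the function names are mine) confirms that the ratio tends to 2/3:

```python
def strassen_mults(k):
    """Scalar multiplications for Strassen on a 2^k x 2^k matrix."""
    return 7 ** k

def sym_mults(k):
    """Multiplications for the recursive symmetric product,
    with the off-diagonal block products done by Strassen."""
    if k == 0:
        return 1
    return 4 * sym_mults(k - 1) + 2 * strassen_mults(k - 1)

for k in (1, 5, 10, 15):
    print(k, sym_mults(k) / strassen_mults(k))  # ratio approaches 2/3
```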
Recently a group of researchers found a way to take this idea even further, partitioning a matrix into a 4 by 4 matrix of 16 blocks and doing some clever tricks. Their RXTX algorithm can compute MMᵀ in about 26/41 the time required to multiply arbitrary matrices, about 5% less than the 2/3 ratio above. A 5% improvement may be significant if it appears in the inner loop of a heavy computation. According to the authors, “The algorithm was discovered by combining Machine Learning-based search methods with Combinatorial Optimization.”
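As I understand the paper, RXTX makes 8 recursive calls plus 26 general quarter-size multiplications, the latter each costing 1/49 of a full Strassen multiply, so the cost ratio r relative to Strassen satisfies r = (8r + 26)/49, whose fixed point is 26/41 ≈ 0.634. A quick check with exact rational arithmetic:

```python
from fractions import Fraction

# Fixed point of r = (8 r + 26) / 49, the asymptotic cost of RXTX
# relative to Strassen under the recurrence R(n) = 8 R(n/4) + 26 M(n/4),
# with M(n/4) = M(n)/49 for Strassen. (My reading of the construction.)
r = Fraction(26, 41)
assert r == (8 * r + 26) / 49

print(float(r))                    # about 0.634
print(float(r / Fraction(2, 3)))   # about 0.951, i.e. roughly a 5% savings
```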
Related posts
- Embeddings, Projections, and Inverses
- The million dollar matrix multiply
- Vector spaces over finite fields
The post Multiplying a matrix by its transpose first appeared on John D. Cook.