Rant: Matrix Layouts

Matrices are, by definition, simple mathematical structures: a set of numbers organized into rows and columns. To the graphics programmer, the 4×4 matrix type (4 rows and 4 columns) is usually of most interest due to its ability to perform arbitrary linear transforms on a given (homogeneous) 3D vector. Let’s define some notation first. Let M be a 4×4 matrix as follows:

\mathbf{M} = \begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} &m_{32} & m_{33} & m_{34} \\ m_{41} & m_{42} & m_{43} & m_{44} \end{pmatrix} = \begin{pmatrix}m_{11} & m_{21} & m_{31} & m_{41} \\ m_{12} & m_{22} & m_{32} & m_{42} \\ m_{13} &m_{23} & m_{33} & m_{43} \\ m_{14} & m_{24} & m_{34} & m_{44}\end{pmatrix}^T

where the T denotes the transpose, which swaps rows and columns of a given matrix.
Multiplication with a homogeneous vector \mathbf{v} = \begin{pmatrix}v_x & v_y & v_z & v_w\end{pmatrix}^T from the right can then be defined as follows:

\mathbf{M} \mathbf{v} = \begin{pmatrix}m_{11} v_x + m_{12} v_y + m_{13} v_z + m_{14} v_w \\ m_{21} v_x + m_{22} v_y + m_{23} v_z + m_{24} v_w \\ m_{31} v_x + m_{32} v_y + m_{33} v_z + m_{34} v_w \\ m_{41} v_x + m_{42} v_y + m_{43} v_z + m_{44} v_w \end{pmatrix} = \mathbf{m_{|1}} v_x + \mathbf{m_{|2}} v_y + \mathbf{m_{|3}} v_z + \mathbf{m_{|4}} v_w

where \mathbf{m_{|i}} denotes the i-th matrix column. Multiplying a matrix by a vector from the right thus corresponds to a sum of the matrix columns, weighted by the vector’s components. Conversely, multiplying a matrix by a vector from the left can be seen as a weighted sum of the matrix rows:

\mathbf{v}^T \mathbf{M} = \begin{pmatrix}m_{11} v_x + m_{21} v_y + m_{31} v_z + m_{41} v_w \\ m_{12} v_x + m_{22} v_y + m_{32} v_z + m_{42} v_w\\ m_{13} v_x + m_{23} v_y + m_{33} v_z + m_{43} v_w\\ m_{14} v_x + m_{24} v_y + m_{34} v_z + m_{44} v_w \end{pmatrix}^T = \mathbf{m_{\overline{1}}} v_x + \mathbf{m_{\overline{2}}} v_y + \mathbf{m_{\overline{3}}} v_z + \mathbf{m_{\overline{4}}} v_w

where \mathbf{m_{\overline{i}}} denotes the i-th matrix row. Since the transpose operator swaps a matrix’s rows and columns, we can conclude that multiplying a matrix by a vector from the right yields the same components as multiplying the matrix’s transpose by that vector from the left. This works in accordance with the transposition rules

(\mathbf{M_1} \mathbf{M_2})^T = \mathbf{M_2}^T \mathbf{M_1}^T\quad\text{and}\quad(\mathbf{M}^T)^T = \mathbf{M}
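
To make the two products concrete, here is a minimal sketch in plain Python. The helper names `transpose`, `mat_vec` and `vec_mat` are made up for this illustration and do not come from any graphics package; matrices are nested lists indexed as `M[row][col]`.

```python
def transpose(M):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def mat_vec(M, v):
    """Vector on the right: M v, a weighted sum of the columns of M."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def vec_mat(v, M):
    """Vector on the left: v^T M, a weighted sum of the rows of M."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

M = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
v = [1, 0, 2, 1]

# M v takes column 1 once, column 3 twice and column 4 once.
columns = transpose(M)
weighted_columns = [sum(v[j] * columns[j][i] for j in range(4)) for i in range(4)]
assert mat_vec(M, v) == weighted_columns

# Multiplying from the right with M gives the same numbers as
# multiplying from the left with the transpose of M.
assert mat_vec(M, v) == vec_mat(v, transpose(M))

# Without the transpose the two products generally differ.
assert mat_vec(M, v) != vec_mat(v, M)
```

The result is stored as a flat list in both cases, so the sketch ignores whether the output is a row or a column vector; only the components are compared.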

Given these definitions, let’s now look at the matrix type we encounter most in graphics programming: combined rotation and translation transforms. Matrices of this form consist of a 3×3 rotational part \mathbf{R} and a 3×1 translational part \mathbf{t}:

\mathbf{M} = \begin{pmatrix} \mathbf{R} & \mathbf{t} \\ \mathbf{0} & 1 \end{pmatrix} = \begin{pmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1\end{pmatrix}

Note that this definition already implicitly defines the multiplication order: it only works when multiplying from the right. Let \mathbf{v} = (v_x, v_y, v_z, 1)^T and let \mathbf{v}_{xyz} = (v_x, v_y, v_z)^T denote its 3D part:

\mathbf{v'} = \mathbf{M} \mathbf{v} = \begin{pmatrix} \mathbf{R} \mathbf{v}_{xyz} + \mathbf{t} \\ 1 \end{pmatrix}

If we wanted to multiply \mathbf{v} from the left we’d have to use the transposition rules:

\mathbf{v'}^T = (\mathbf{M} \mathbf{v})^T = \mathbf{v}^T \mathbf{M}^T

which again illustrates the point I made before: multiplying a matrix by a vector from the right is equivalent to multiplying its transpose from the left. Why am I insisting on this fact so much? Guess what: different companies have, as usual, not been able to agree on a common standard. As a result, DirectX assumes multiplication from the left, while OpenGL and Assimp assume multiplication from the right. This means that we can’t just take a matrix from Assimp to DirectX; we first have to transpose it.
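
As a rough sketch of what goes wrong without that transpose, the following plain Python example builds a rotation-plus-translation matrix in the multiply-from-the-right form used above and applies it both ways. The helper names and the concrete numbers (a 90 degree rotation about the z axis and a translation of (10, 20, 30)) are assumptions chosen for illustration only; no real graphics API is involved.

```python
def transpose(M):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def mat_vec(M, v):
    """Vector on the right: M v."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def vec_mat(v, M):
    """Vector on the left: v^T M."""
    return [sum(v[i] * M[i][j] for i in range(4)) for j in range(4)]

# Rotation by 90 degrees about z, (x, y) -> (-y, x), plus translation
# (10, 20, 30). The translation sits in the last COLUMN, so this matrix
# is meant to be multiplied with the vector on the right.
M = [[0, -1, 0, 10],
     [1,  0, 0, 20],
     [0,  0, 1, 30],
     [0,  0, 0,  1]]
v = [1, 2, 3, 1]

right = mat_vec(M, v)             # intended use: M v
left = vec_mat(v, transpose(M))   # vector on the left needs M^T
wrong = vec_mat(v, M)             # vector on the left WITHOUT transposing

print(right)  # [8, 21, 33, 1]
print(left)   # [8, 21, 33, 1]
print(wrong)  # [2, -1, 3, 141]
```

In the wrong case the translation ends up in the w component, which is a typical symptom of a missing transpose.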

Now, why not make things even more complicated: let’s think about how we actually store our matrices in code! Since computer memory is addressed in a linear fashion, we have to map the values of a matrix to a one-dimensional string of numbers. We can do so in two ways: iterate through the matrix column by column (column-major) or row by row (row-major).

[m_{11} m_{12} m_{13} m_{14} m_{21} \cdots]\xleftarrow{\text{row major}}\begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} &m_{32} & m_{33} & m_{34} \\ m_{41} & m_{42} & m_{43} & m_{44} \end{pmatrix}\xrightarrow{\text{column major}}[m_{11} m_{21} m_{31} m_{41} m_{12} \cdots]
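
The two mappings can be sketched in a few lines of plain Python. The function names are hypothetical, and the entries are chosen so that the value 23 stands for m_{23}, which makes the resulting order easy to compare with the formula above.

```python
def flatten_row_major(M):
    """Walk the matrix row by row."""
    return [M[i][j] for i in range(len(M)) for j in range(len(M[0]))]

def flatten_column_major(M):
    """Walk the matrix column by column."""
    return [M[i][j] for j in range(len(M[0])) for i in range(len(M))]

# Entry "ij" encodes m_{ij} (1-based row i, column j).
M = [[11, 12, 13, 14],
     [21, 22, 23, 24],
     [31, 32, 33, 34],
     [41, 42, 43, 44]]

row_major = flatten_row_major(M)
column_major = flatten_column_major(M)

print(row_major[:5])     # [11, 12, 13, 14, 21]
print(column_major[:5])  # [11, 21, 31, 41, 12]

# Index of element (i, j), zero-based, in a flat array of an n x n matrix:
n = 4
i, j = 1, 2  # m_{23}
assert row_major[i * n + j] == 23
assert column_major[j * n + i] == 23
```

The same element lives at a different offset depending on the layout, so code that indexes the flat array directly has to know which of the two was used.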

And guess what: once again the difference between the two corresponds to a transposition, i.e. if you read a row-major matrix as column-major you’ll actually get that matrix’s transpose. So let the fun begin, let’s convert matrices from one software package to another: the first mistake you can make is to mix up the memory layouts when copying the matrix data. This will result in all your matrices being transposed. The second thing that can go wrong is the multiplication order: multiplying from the left instead of from the right, or vice versa. That is yet another transposition of the matrix. And the cruel thing is, if you are ‘in luck’ and make both mistakes at once, you might never notice, because a double transposition cancels out, as shown before. Only once you need to access individual matrix values will you realize that something is off, and you might scratch your head for a while looking for the reason. So it really pays off to figure out the matrix memory layout and multiplication order before starting to mix matrices from two different software packages.
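
The cancellation can be reproduced with a small plain Python sketch. Everything here is hypothetical: the helper names, the example matrix (a 90 degree rotation about z plus a translation of (10, 20, 30), built for multiplication from the right) and the scenario of writing the data column-major and reading it back row-major.

```python
def mat_vec(M, v):
    """Vector on the right: M v."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def vec_mat(v, M):
    """Vector on the left: v^T M."""
    return [sum(v[i] * M[i][j] for i in range(4)) for j in range(4)]

def flatten_column_major(M):
    """Write the matrix to a flat list column by column."""
    return [M[i][j] for j in range(4) for i in range(4)]

def unflatten_row_major(flat):
    """Read a flat list back as if it had been written row by row."""
    return [flat[i * 4:(i + 1) * 4] for i in range(4)]

M = [[0, -1, 0, 10],
     [1,  0, 0, 20],
     [0,  0, 1, 30],
     [0,  0, 0,  1]]
v = [1, 2, 3, 1]
expected = mat_vec(M, v)

# Mistake 1: written column-major, read row-major -> we now hold M^T.
M_read = unflatten_row_major(flatten_column_major(M))

# Only mistake 1: the result is wrong.
only_layout = mat_vec(M_read, v)

# Mistake 1 AND mistake 2 (vector on the left): the result is right again.
both = vec_mat(v, M_read)

print(expected)     # [8, 21, 33, 1]
print(only_layout)  # [2, -1, 3, 141]
print(both)         # [8, 21, 33, 1]

# Element access gives the game away: the translation is now in the
# last row of M_read instead of the last column.
print(M_read[0][3], M_read[3][0])  # 0 10
```

So the transformed vectors look fine, while reading the translation from the last column of `M_read` yields zeros.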
