
Rotation matrix: Difference between revisions

From Wikipedia, the free encyclopedia
KSmrq (talk | contribs)
major rewrite
In [[matrix theory]], a '''rotation matrix''' is a [[real number|real]] [[square matrix]] whose [[transpose]] is its [[invertible matrix|inverse]] and whose [[determinant]] is +1.
A '''rotation matrix''' is a [[matrix (mathematics)|matrix]] which when [[matrix_multiplication|multiplied]] by a [[vector (spatial)|vector]] has the effect of changing the direction of the vector but not its magnitude. Rotation matrices do not include [[Inversion (geometry)|inversions]], which can change a right-handed coordinate system into a left-handed coordinate system and vice versa. The set of all rotation matrices along with inversions forms the set of [[orthogonal matrix|orthogonal matrices]].
:<math>\begin{align}
Q^{T}Q &{}= I = Q Q^{T} \\
\det Q &{}= +1
\end{align}</math>
In other words, it is a [[real number|real]] [[special orthogonal matrix]]. The name refers to the fact that an ''n''×''n'' rotation matrix corresponds to a geometric [[rotation (geometry)|rotation]] about a fixed origin in an ''n-''dimensional [[Euclidean space]], or equivalently, to a rotation of an ''n-''dimensional real [[vector space]] equipped with a Euclidean [[inner product]]. For example, the 3×3 rotation matrix
:<math> Q = \begin{bmatrix} 0.6 & -0.8 & 0 \\ 0.8 & 0.6 & 0 \\ 0 & 0 & 1 \end{bmatrix} </math>
corresponds to a rotation of approximately 53° around the ''z'' axis in three-dimensional space.
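The two defining conditions can be checked numerically. The following is a minimal sketch in plain Python, not part of the article's source; the helper names (<code>transpose</code>, <code>matmul</code>, <code>det3</code>, <code>is_rotation3</code>) and the tolerance are illustrative assumptions, and the reflection <code>R</code> is added only as a counterexample.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_rotation3(q, tol=1e-9):
    """True if the 3x3 matrix q satisfies q^T q = I and det q = +1, up to tol."""
    qtq = matmul(transpose(q), q)
    orthogonal = all(abs(qtq[i][j] - (1.0 if i == j else 0.0)) <= tol
                     for i in range(3) for j in range(3))
    return orthogonal and abs(det3(q) - 1.0) <= tol

# The example matrix from the text (rotation about the z axis).
Q = [[0.6, -0.8, 0.0],
     [0.8,  0.6, 0.0],
     [0.0,  0.0, 1.0]]

# Illustrative counterexample: a reflection is orthogonal but has determinant -1.
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, -1.0]]

print(is_rotation3(Q), is_rotation3(R))
```

A tolerance is used because entries such as 0.6 and 0.8 are not exactly representable in floating point.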


==Properties==
== Examples ==
{{col-begin}}
{{col-1-of-2}}
<ul>
<li>The 2×2 rotation matrix
:<math> Q = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} </math>
corresponds to a 90° planar rotation.</li>


<li>The transpose of the 2×2 matrix
Let <math>\mathcal{M}</math> be a general rotation matrix of any dimension: <math>\mathcal{M}\in\mathbb{R}^{n \times n}</math>
:<math> M = \begin{bmatrix} 0.936 & 0.352 \\ 0.352 & -0.936 \end{bmatrix} </math>
is its inverse, but since its determinant is −1 this is not a rotation matrix; it is a reflection across the line 11''y'' = 2''x''.</li>


<li>The 3×3 rotation matrix
* The [[dot product]] of two vectors remains unchanged when both are operated upon by a rotation matrix:
:<math> Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{\sqrt{3}}{2} & \frac12 \\ 0 & -\frac12 & \frac{\sqrt{3}}{2} \end{bmatrix} </math>
corresponds to a −30° rotation around the ''x'' axis in three-dimensional space.</li>


<li>The 3×3 rotation matrix
::<math>\mathbf{a}\cdot\mathbf{b} = \mathcal{M}\mathbf{a}\cdot\mathcal{M}\mathbf{b}</math>
:<math> Q = \begin{bmatrix} 0.36 & 0.48 & -0.8 \\ -0.8 & 0.60 & 0 \\ 0.48 & 0.64 & 0.60 \end{bmatrix} </math>
corresponds to a rotation of approximately −74° around the axis (−<sup>1</sup>&frasl;<sub>3</sub>,<sup>2</sup>&frasl;<sub>3</sub>,<sup>2</sup>&frasl;<sub>3</sub>) in three-dimensional space.</li>


<li>The 3×3 [[permutation matrix]]
* It follows that the [[matrix inversion|inverse]] of a rotation matrix is its [[transpose]]:
:<math> P = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} </math>
is also a rotation matrix, as is the matrix of any [[even permutation]] (but never of any odd permutation).</li>
</ul>
{{col-2-of-2}}
<ul>
<li>The 3×3 matrix
:<math> M = \begin{bmatrix} 3 & -4 & 1 \\ 5 & 3 & -7 \\ -9 & 2 & 6 \end{bmatrix} </math>
has determinant +1, but its transpose is not its inverse, so it is not a rotation matrix.</li>


<li>The 4×3 matrix
::<math>\mathcal{M}\,\mathcal{M}^{-1}=\mathcal{M}\,\mathcal{M}^\top=\mathcal{I}</math> &nbsp;&nbsp; where <math>\mathcal{I}</math> is the [[Identity_matrix|identity matrix]].
:<math> M = \begin{bmatrix} 0.5 & -0.1 & 0.7 \\ 0.1 & 0.5 & -0.5 \\ -0.7 & 0.5 & 0.5 \\ -0.5 & -0.7 & -0.1 \end{bmatrix} </math>
is not square, and so cannot be a rotation matrix; yet ''M''<sup>''T''</sup>''M'' yields a 3×3 identity matrix (the columns are orthonormal).</li>


<li>The 4×4 rotation matrix
* A matrix is a rotation matrix if and only if it is [[Orthogonal matrix|orthogonal]] and its [[Determinant|determinant]] is unity. The determinant of an orthogonal matrix is ±1; if the determinant is −1, then the matrix also contains a [[reflection]] and is not a rotation matrix.
:<math> Q = \begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} </math>
has no axis of rotation; it reverses the direction of every vector.</li>


<li>The 5×5 rotation matrix
* A real square matrix is [[Orthogonal matrix|orthogonal]] if and only if its column vectors form an [[orthonormal basis]] of <math>\mathbb{R}^{n}</math>, that is, the scalar product between any two different column vectors is zero ([[orthogonality]]) and the magnitude of each column vector is unity ([[Normalized vector|normalization]]).
:<math> Q = \begin{bmatrix} 0 & -1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} </math>
rotates vectors in the plane of the first two coordinate axes 90°, rotates vectors in the plane of the next two axes 180°, and leaves the last coordinate axis unmoved.</li>
</ul>
{{col-end}}


== Geometry ==
* Any rotation matrix can be represented as the exponential of a [[skew-symmetric matrix]] '''A''':
In [[Euclidean geometry]], a rotation is an example of an [[isometry]], a transformation that moves points without changing the distances between them. Rotations are distinguished from other isometries by two additional properties: they leave (at least) one point fixed, and they leave "handedness" unchanged. By contrast, a [[translation (geometry)|translation]] moves every point, a [[reflection (geometry)|reflection]] exchanges left- and right-handed ordering, and a [[glide reflection]] does both.


If we take the fixed point as the origin of a [[Cartesian coordinate system]], then every point can be given coordinates as a displacement from the origin. Thus we may work with the [[vector space]] of displacements instead of the points themselves. Now suppose (''p''<sub>1</sub>,…,''p''<sub>''n''</sub>) are the coordinates of the vector '''p''' from the origin, ''O'', to point ''P''. Choose an [[orthonormal basis]] for our coordinates; then the squared distance to ''P'', by [[Pythagorean theorem|Pythagoras]], is
::<math>\mathcal{M}=\exp (\mathbf{A})=\sum_{k=0}^\infty \frac{\mathbf{A}^k}{k!}</math>
:<math> d^2(O,P) = \| \bold{p} \|^2 = p_1^2 + \cdots + p_n^2 , \,\!</math>
which we can compute using the matrix multiplication
:<math> \| \bold{p} \|^2 = \begin{bmatrix}p_1 \cdots p_n\end{bmatrix} \begin{bmatrix}p_1 \\ \vdots \\ p_n \end{bmatrix} = \bold{p}^T \bold{p} . </math>


A geometric rotation transforms lines to lines, and preserves ratios of distances between points. From these properties we can show that a rotation is a [[linear transformation]] of the vectors, and thus can be written in [[matrix]] form, ''Q'''''p'''. The fact that a rotation preserves, not just ratios, but distances themselves, we can state as
:where the exponential is defined in terms of its [[Taylor series]] and <math>\mathbf{A}^k</math> is defined in terms of [[matrix multiplication]]. The '''A''' matrix is known as the ''generator'' of the rotation. The [[Lie algebra]] of rotation matrices is the algebra of its generators, which is just the algebra of skew-symmetric matrices. The generator can be found by finding the [[logarithm of a matrix|matrix logarithm]] of M.
:<math> \bold{p}^T \bold{p} = (Q \bold{p})^T (Q \bold{p}) , \,\!</math>
or
:<math>\begin{align}
\bold{p}^T I \bold{p}&{}= (\bold{p}^T Q^T) (Q \bold{p}) \\
&{}= \bold{p}^T (Q^T Q) \bold{p} .
\end{align}</math>
Because this equation holds for all vectors, '''p''', we conclude that every rotation matrix, ''Q'', satisfies the ''orthogonality'' condition,
:<math> Q^T Q = I . \,\!</math>
Rotations preserve handedness because they cannot change the ordering of the axes, which implies the ''special matrix'' condition,
:<math> \det Q = +1 . \,\!</math>
Equally important, we can show that any matrix satisfying these two conditions acts as a rotation.


==Two dimensions==
== Multiplication ==
The inverse of a rotation matrix is its transpose, which is also a rotation matrix:
In two dimensions, a rotation can be defined by a single angle, <math>\theta</math>. Conventionally, positive angles represent counter-clockwise rotation. The matrix to rotate a [[column vector]] in [[cartesian coordinates]] about the origin by a counter-clockwise angle of <math>\theta</math> is:
:<math>\begin{align} (Q^T)^T (Q^T) &{}= Q Q^T = I\\ \det Q^T &{}= \det Q = +1 \end{align}</math>
:<math>
The product of two rotation matrices is a rotation matrix:
M(\theta) = \begin{bmatrix}
:<math>\begin{align}
\cos{\theta} & -\sin{\theta} \\
(Q_1 Q_2)^T (Q_1 Q_2) &{}= Q_2^T (Q_1^T Q_1) Q_2 = I \\
\sin{\theta} & \cos{\theta}
\det (Q_1 Q_2) &{}= (\det Q_1) (\det Q_2) = +1
\end{bmatrix}
\end{align}</math>
= \exp\left(\begin{bmatrix}
For ''n'' greater than 2, multiplication of ''n''×''n'' rotation matrices is not commutative.
0 & -\theta \\
:<math>\begin{align}
\theta & 0
Q_1 &{}= \begin{bmatrix}0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1\end{bmatrix} &
\end{bmatrix}\right).
Q_2 &{}= \begin{bmatrix}0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0\end{bmatrix} \\
</math>
Q_1 Q_2 &{}= \begin{bmatrix}0 & -1 & 0 \\ 0 & 0 & 1 \\ -1 & 0 & 0\end{bmatrix} &
Q_2 Q_1 &{}= \begin{bmatrix}0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0\end{bmatrix}
\end{align}</math>
Noting that any [[identity matrix]] is a rotation matrix, and that matrix multiplication is [[associative]], we may summarize all these properties by saying that the ''n''×''n'' rotation matrices form a [[group]], which for ''n''&nbsp;&gt;&nbsp;2 is non-[[abelian]]. Called a [[special orthogonal group]], and denoted by SO(''n''), SO(''n'','''R'''), SO<sub>''n''</sub>, or SO<sub>''n''</sub>('''R'''), the group of ''n''×''n'' rotation matrices is isomorphic to the group of rotations in an ''n-''dimensional space. This means that multiplication of rotation matrices corresponds to composition of rotations.
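The non-commutativity claim can be reproduced directly from the two matrices given above. This is a small sketch in plain Python; the helper <code>matmul</code> is an illustrative name, not something defined in the article.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Q1 is a 90 degree rotation about the z axis, Q2 a 90 degree rotation about the y axis.
Q1 = [[0, -1, 0],
      [1,  0, 0],
      [0,  0, 1]]
Q2 = [[ 0, 0, 1],
      [ 0, 1, 0],
      [-1, 0, 0]]

Q1Q2 = matmul(Q1, Q2)
Q2Q1 = matmul(Q2, Q1)

print(Q1Q2)
print(Q2Q1)
print(Q1Q2 == Q2Q1)
```

Because all entries are integers, the comparison is exact and needs no tolerance.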


==Three dimensions==
== Ambiguities ==
The interpretation of a rotation matrix can be subject to many ambiguities.


[[Image:Alias and alibi rotations.png|thumb|350px|right|Alias and alibi rotations]]
In three dimensions, a rotation matrix has one real [[eigenvalue]], equal to unity. The rotation matrix specifies a rotation about the corresponding [[eigenvector]] ([[Euler's rotation theorem]]). If the angle of rotation is ''θ'' then the other two (complex) eigenvalues of the rotation matrix are ''e<sup>iθ</sup>'' and ''e<sup>–iθ</sup>''. It follows that the [[Trace_(linear_algebra)|trace]] of a 3D rotation matrix is equal to 1 + 2cos''θ'', which can be used to quickly calculate the rotation angle of any 3D rotation matrix.
; Positive or negative sense
: A positive rotation can mean [[clockwise]] or the opposite.
; Row or column vectors
: A square matrix can multiply a [[column vector]] or a [[row vector]].
; Alias or alibi transformation
: The change in a vector's coordinates can indicate a turn of the coordinate system (alias) or a turn of the vector (alibi).
; Right- or left-handed coordinates
: The matrix can be with respect to a [[Cartesian coordinate system#Orientation and handedness|right-handed]] or left-handed coordinate system.
; Row- or column-major storage
: Matrix elements may be stored in computer memory in either [[row-major order]] or column-major order, depending on the [[programming language]] and [[API]].
; World or body axes
: The coordinate axes can be fixed or rotate with a body.
; Cartesian or homogeneous representation
: [[Homogeneous coordinates]] carry an extra dimension compared to [[Cartesian coordinates]] to allow more flexibility.
; Vectors or forms
: The vector space has a [[dual space]] of [[linear form]]s, and the matrix can act on either vectors or forms.


In most cases the effect of the ambiguity is to transpose or invert the matrix.
The generators of 3D rotation matrices are 3D skew symmetric matrices. Since only three real numbers are needed to specify a 3D skew-symmetric matrix, it follows that only three real numbers are needed to specify a 3D rotation matrix.


== Decompositions ==
===Roll, Pitch and Yaw===
=== Independent planes ===
Consider the 3×3 rotation matrix
:<math> Q = \begin{bmatrix} 0.36 & 0.48 & -0.8 \\ -0.8 & 0.60 & 0 \\ 0.48 & 0.64 & 0.60 \end{bmatrix} . </math>
If ''Q'' acts in a certain direction, '''v''', purely as a scaling by a factor λ, then we have
:<math> Q \bold{v} = \lambda \bold{v}, \,\!</math>
so that
:<math> \bold{0} = (\lambda I - Q) \bold{v} . \,\!</math>
Thus λ is a root of the [[characteristic polynomial]] for ''Q'',
:<math>\begin{align}
0 &{}= \det (\lambda I - Q) \\
&{}= \lambda^3 - \tfrac{39}{25} \lambda^2 + \tfrac{39}{25} \lambda - 1 \\
&{}= (\lambda-1) (\lambda^2 - \tfrac{14}{25} \lambda + 1)
\end{align}</math>
Two features are noteworthy. First, one of the roots (or [[eigenvalue]]s) is 1, which tells us that some direction is unaffected by the matrix. For rotations in three dimensions, this is the ''axis'' of the rotation (a concept that has no meaning in any other dimension). Second, the other two roots are a pair of complex conjugates, whose product is 1 (the constant term of the quadratic), and whose sum is 2&nbsp;cos&nbsp;&theta; (the negated linear term). This factorization is of interest for 3×3 rotation matrices because the same thing occurs for all of them. (As special cases, for a null rotation the "complex conjugates" are both 1, and for a 180° rotation they are both −1.) Furthermore, a similar factorization holds for any ''n''×''n'' rotation matrix. If the dimension, ''n'', is odd, there will be a "dangling" eigenvalue of 1; and for any dimension the rest of the polynomial factors into quadratic terms like the one here (with the two special cases noted). We are guaranteed that the characteristic polynomial will have degree ''n'' and thus ''n'' eigenvalues. And since a rotation matrix commutes with its transpose, it is a [[normal matrix]], so can be diagonalized. We conclude that every rotation matrix, when expressed in a suitable coordinate system, partitions into independent rotations of two-dimensional subspaces, at most <sup>''n''</sup>&frasl;<sub>2</sub> of them.


The sum of the entries on the main diagonal of a matrix is called the [[trace (linear algebra)|trace]]; it does not change if we reorient the coordinate system, and always equals the sum of the eigenvalues. This has the convenient implication for 2×2 and 3×3 rotation matrices that the trace reveals the angle of rotation, θ, in the two-dimensional (sub-)space. For a 2×2 matrix the trace is 2 cos(θ), and for a 3×3 matrix it is 1+2 cos(θ). In the three-dimensional case, the subspace consists of all vectors perpendicular to the rotation axis (the invariant direction, with eigenvalue 1). Thus we can extract from any 3×3 rotation matrix a rotation axis and an angle, and these completely determine the rotation.
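As an illustration of extracting an axis and an angle, the sketch below applies the trace relation to the 3×3 matrix ''Q'' of this section. It is a minimal sketch, not a robust routine: the function name <code>axis_angle</code> is hypothetical, the axis is read off from the skew-symmetric part ''Q''&nbsp;−&nbsp;''Q''<sup>T</sup> (an assumption of this example rather than a method stated in the text), and the degenerate angles 0° and 180°, where that part vanishes, are not handled.

```python
import math

def axis_angle(q):
    """Return (unit axis, angle in [0, pi]) of a 3x3 rotation matrix.

    Assumes the angle is neither 0 nor pi, where q - q^T is zero.
    """
    trace = q[0][0] + q[1][1] + q[2][2]
    # trace = 1 + 2 cos(theta); clamp to guard against rounding.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    # q - q^T equals 2 sin(theta) times the cross-product matrix of the axis.
    v = (q[2][1] - q[1][2], q[0][2] - q[2][0], q[1][0] - q[0][1])
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v), theta

Q = [[0.36, 0.48, -0.8],
     [-0.8, 0.60, 0.0],
     [0.48, 0.64, 0.60]]

axis, theta = axis_angle(Q)
print(axis, math.degrees(theta))
```

With the counter-clockwise-positive convention this returns an angle of about 73.74° about the axis (1/3, −2/3, −2/3). Negating both the axis and the angle describes the same rotation.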
A simple way to generate a rotation matrix is to compose it as a sequence of three basic rotations. Rotations about the right-handed cartesian ''x''-, ''y''- and ''z''-axes are known as ''roll'', ''pitch'' and ''yaw'' rotations respectively. Since these rotations are expressed as a rotation about an axis, their generators are easily expressed.


=== Sequential angles ===
* Rotation around the ''x''-axis is defined as:
The constraints on a 2×2 rotation matrix imply that it must have the form
:<math>Q = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}</math>
with ''a''<sup>2</sup>+''b''<sup>2</sup>&nbsp;= 1. Therefore we may set ''a''&nbsp;= cos&nbsp;&theta; and ''b''&nbsp;= sin&nbsp;&theta;, for some angle &theta;. To solve for &theta; it is not enough to look at ''a'' alone or ''b'' alone; we must consider both together to place the angle in the correct [[Cartesian coordinate system#Two-dimensional coordinate system|quadrant]], using a [[atan2|two-argument arctangent]] function.
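The need for both entries can be seen in a short example. The sketch below (plain Python; <code>angle_2x2</code> is an illustrative name) builds a rotation by −150° and compares the angle recovered with the two-argument arctangent against the one obtained from ''a'' alone.

```python
import math

def angle_2x2(q):
    """Angle of a 2x2 rotation matrix [[a, -b], [b, a]], in (-pi, pi]."""
    a, b = q[0][0], q[1][0]
    return math.atan2(b, a)

theta = math.radians(-150.0)
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

recovered = angle_2x2(Q)        # uses both a and b
acos_only = math.acos(Q[0][0])  # uses a alone and loses the sign of the angle

print(math.degrees(recovered), math.degrees(acos_only))
```

The arccosine of ''a'' alone returns +150°, which is the mirror-image rotation; the two-argument form returns −150°.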


Now consider the first column of a 3×3 rotation matrix,
:<math>
:<math>\begin{bmatrix}a\\b\\c\end{bmatrix} . </math>
\mathcal{R}_x(\theta_x)=
Here ''a''<sup>2</sup>+''b''<sup>2</sup> will probably not equal 1, but some value ''r''<sup>2</sup>&nbsp;&lt;&nbsp;1; nevertheless, we can use a slight variation of the previous computation to find a so-called [[Givens rotation]] that transforms the column to
\begin{bmatrix}
:<math>\begin{bmatrix}r\\0\\c\end{bmatrix} , </math>
1 & 0 & 0 \\
zeroing ''b''. This acts on the subspace spanned by the ''x'' and ''y'' axes. We can then repeat the process for the ''xz'' subspace to zero ''c''. Acting on the full matrix, these two rotations produce the schematic form
0 & \cos{\theta_x} & -\sin{\theta_x} \\
:<math>Q_{xz}Q_{xy}Q = \begin{bmatrix}1&0&0\\0&\ast&\ast\\0&\ast&\ast\end{bmatrix} . </math>
0 & \sin{\theta_x} & \cos{\theta_x}
Shifting attention to the second column, a Givens rotation of the ''yz'' subspace can now zero the ''z'' value. This brings the full matrix to the form
\end{bmatrix}
:<math>Q_{yz}Q_{xz}Q_{xy}Q = \begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix} , </math>
=\exp \left(
which is an identity matrix. Thus we have decomposed ''Q'' as
\begin{bmatrix}
:<math>Q = Q_{xy}^{-1}Q_{xz}^{-1}Q_{yz}^{-1} . </math>
0 & 0 & 0 \\
0 & 0 & -\theta_x \\
0 & \theta_x & 0
\end{bmatrix}\right)
</math> where <math>\theta_x</math> is the ''roll angle''.


An ''n''×''n'' rotation matrix will have (''n''−1)+(''n''−2)+⋯+2+1, or
* Rotation around the ''y''-axis is defined as:
:<math>\sum_{k=1}^{n-1} k = \frac{n(n-1)}{2} \,\!</math>
entries below the diagonal to zero. We can zero them by extending the same idea of stepping through the columns with a series of rotations in a fixed sequence of planes. We conclude that the set of ''n''×''n'' rotation matrices, each of which has ''n''<sup>2</sup> entries, can be parameterized by ''n''(''n''−1)/2 angles.
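The column-by-column procedure can be sketched for any ''n''. The code below is an illustrative sketch in plain Python, with a hypothetical function name <code>givens_reduce</code>. It zeroes each entry below the diagonal with one plane rotation whose angle comes from a two-argument arctangent, and records the ''n''(''n''−1)/2 angles used.

```python
import math

def givens_reduce(q):
    """Zero the entries below the diagonal of q with plane rotations.

    Returns the reduced matrix and a list of (j, i, angle) triples, one per
    rotation, where the rotation acts on the plane of axes j and i.
    """
    n = len(q)
    m = [list(map(float, row)) for row in q]
    steps = []
    for j in range(n - 1):
        for i in range(j + 1, n):
            a, b = m[j][j], m[i][j]
            theta = math.atan2(b, a)
            c, s = math.cos(theta), math.sin(theta)
            # Rotate rows j and i by -theta so that (a, b) becomes (r, 0).
            for k in range(n):
                mj, mi = m[j][k], m[i][k]
                m[j][k] = c * mj + s * mi
                m[i][k] = -s * mj + c * mi
            steps.append((j, i, theta))
    return m, steps

Q = [[0.36, 0.48, -0.8],
     [-0.8, 0.60, 0.0],
     [0.48, 0.64, 0.60]]

reduced, steps = givens_reduce(Q)
print(len(steps))
for row in reduced:
    print([round(e, 12) for e in row])
```

For a rotation matrix the reduced matrix is the identity up to rounding, so the recorded angles determine the original matrix.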


{| border="1" cellspacing="0" cellpadding="4" style="float:right; margin-left:1em"
:<math>
|-
\mathcal{R}_y(\theta_y)=
| ''xzx''<sub>w</sub> || ''xzy''<sub>w</sub> || ''xyx''<sub>w</sub> || ''xyz''<sub>w</sub>
|-
| ''yxy''<sub>w</sub> || ''yxz''<sub>w</sub> || ''yzy''<sub>w</sub> || ''yzx''<sub>w</sub>
|-
| ''zyz''<sub>w</sub> || ''zyx''<sub>w</sub> || ''zxz''<sub>w</sub> || ''zxy''<sub>w</sub>
|-
| ''xzx''<sub>b</sub> || ''yzx''<sub>b</sub> || ''xyx''<sub>b</sub> || ''zyx''<sub>b</sub>
|-
| ''yxy''<sub>b</sub> || ''zxy''<sub>b</sub> || ''yzy''<sub>b</sub> || ''xzy''<sub>b</sub>
|-
| ''zyz''<sub>b</sub> || ''xyz''<sub>b</sub> || ''zxz''<sub>b</sub> || ''yxz''<sub>b</sub>
|}
In three dimensions this restates in matrix form an observation made by [[Leonhard Euler|Euler]], so mathematicians call the ordered sequence of three angles [[Euler angles]]. However, the situation is somewhat more complicated than we have so far indicated. Despite the small dimension, we actually have considerable freedom in the sequence of axis pairs we use; and we also have some freedom in the choice of angles. Thus we find many different conventions employed when three-dimensional rotations are parameterized for physics, or medicine, or chemistry, or other disciplines. When we include the option of world axes or body axes, 24 different sequences are possible. And while some disciplines call any sequence Euler angles, others give different names (Euler, Cardano, Tait-Bryan, roll-pitch-yaw) to different sequences.

One reason for the large number of options is that, as noted previously, rotations in three dimensions (and higher) do not commute. If we reverse a given sequence of rotations, we get a different outcome. This also implies that we cannot compose two rotations by adding their corresponding angles. Thus ''Euler angles are not [[vector space|vectors]]'', despite a similarity in appearance as a triple of numbers.

=== Nested dimensions ===
A 3×3 rotation matrix like
:<math>Q_{3 \times 3} = \begin{bmatrix}\cos \theta & -\sin \theta & {\color{CadetBlue}0} \\ \sin \theta & \cos \theta & {\color{CadetBlue}0} \\ {\color{CadetBlue}0} & {\color{CadetBlue}0} & {\color{CadetBlue}1}\end{bmatrix} </math>
suggests a 2×2 rotation matrix,
:<math>Q_{2 \times 2} = \begin{bmatrix}\cos \theta & -\sin \theta \\ \sin \theta & \cos \theta\end{bmatrix} , </math>
is embedded in the upper left corner:
:<math>Q_{3 \times 3} = \left[ \begin{matrix} Q_{2 \times 2} & \bold{0} \\ \bold{0}^T & 1 \end{matrix} \right] . </math>
This is no illusion; not just one, but many, copies of ''n''-dimensional rotations are found within (''n''+1)-dimensional rotations, as [[subgroup]]s. Each embedding leaves one direction fixed, which in the case of 3×3 matrices is the rotation axis. For example, we have
:<math>Q_{\bold{x}}(\theta) = \begin{bmatrix}1 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta \\ 0 & \sin \theta & \cos \theta\end{bmatrix} , </math>
:<math>Q_{\bold{y}}(\theta) = \begin{bmatrix}\cos \theta & 0 & \sin \theta \\ 0 & 1 & 0 \\ -\sin \theta & 0 & \cos \theta\end{bmatrix} , </math>
:<math>Q_{\bold{z}}(\theta) = \begin{bmatrix}\cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1\end{bmatrix} , </math>
fixing the ''x'' axis, the ''y'' axis, and the ''z'' axis, respectively. The rotation axis need not be a coordinate axis; if '''u''' = (''x'',''y'',''z'') is a unit vector in the desired direction, then
:<math>\begin{align}
Q_{\bold{u}}(\theta)
&{}=
\begin{bmatrix}
\begin{bmatrix}
0&-z&y\\
\cos{\theta_y} & 0 & \sin{\theta_y} \\
0 & 1 & 0 \\
z&0&-x\\
-y&x&0
-\sin{\theta_y} & 0 & \cos{\theta_y}
\end{bmatrix} \sin \theta + (I - \bold{u}\bold{u}^T) \cos \theta + \bold{u}\bold{u}^T \\
\end{bmatrix}
&{}=
=\exp\left(
\begin{bmatrix}
\begin{bmatrix}
(1-x^2) c_{\theta} + x^2 & - z s_{\theta} - x y c_{\theta} + x y & y s_{\theta} - x z c_{\theta} + x z \\
0 & 0 & \theta_y \\
z s_{\theta} - x y c_{\theta} + x y & (1-y^2) c_{\theta} + y^2 & -x s_{\theta} - y z c_{\theta} + y z \\
0 & 0 & 0 \\
-y s_{\theta} - x z c_{\theta} + x z & x s_{\theta} - y z c_{\theta} + y z & (1-z^2) c_{\theta} + z^2
-\theta_y & 0 & 0
\end{bmatrix}\right)
\end{bmatrix} \\
&{}=
</math> where <math>\theta_y</math> is the ''pitch angle''.
\begin{bmatrix}
x^2 (1-c_{\theta}) + c_{\theta} & x y (1-c_{\theta}) - z s_{\theta} & x z (1-c_{\theta}) + y s_{\theta} \\
x y (1-c_{\theta}) + z s_{\theta} & y^2 (1-c_{\theta}) + c_{\theta} & y z (1-c_{\theta}) - x s_{\theta} \\
x z (1-c_{\theta}) - y s_{\theta} & y z (1-c_{\theta}) + x s_{\theta} & z^2 (1-c_{\theta}) + c_{\theta}
\end{bmatrix} ,
\end{align}</math>
where ''c''<sub>&theta;</sub>&nbsp;= cos&nbsp;&theta;, ''s''<sub>&theta;</sub>&nbsp;= sin&nbsp;&theta;, is a rotation by angle &theta; leaving axis '''u''' fixed.
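The final form of the axis-angle matrix translates directly into code. The following sketch (plain Python; the function name <code>rotation_about_axis</code> is an illustrative assumption) evaluates it for the unit vector '''u''' = (−1/3, 2/3, 2/3) and the angle θ = −arccos(0.28) ≈ −73.74°, which reproduces the 3×3 example matrix with entries 0.36, 0.48, −0.8 used earlier.

```python
import math

def rotation_about_axis(u, theta):
    """Rotation by angle theta (radians) about the unit vector u = (x, y, z)."""
    x, y, z = u
    c, s = math.cos(theta), math.sin(theta)
    t = 1.0 - c
    return [[x * x * t + c,     x * y * t - z * s, x * z * t + y * s],
            [x * y * t + z * s, y * y * t + c,     y * z * t - x * s],
            [x * z * t - y * s, y * z * t + x * s, z * z * t + c]]

u = (-1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
theta = -math.acos(0.28)
Q = rotation_about_axis(u, theta)

for row in Q:
    print([round(e, 12) for e in row])
```

The caller is assumed to pass a unit vector; the function does not normalize '''u'''.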


A direction in (''n''+1)-dimensional space will be a unit magnitude vector, which we may consider a point on a generalized sphere, ''S''<sup>''n''</sup>. Thus it is natural to describe the rotation group SO(''n''+1) as combining SO(''n'') and ''S''<sup>''n''</sup>. A suitable formalism is the [[fiber bundle]],
* Rotation around the ''z''-axis is defined as:
:<math> SO(n) \hookrightarrow SO(n+1) \to S^n , \,\!</math>
where for every direction in the "base space", ''S''<sup>''n''</sup>, the "fiber" over it in the "total space", SO(''n''+1), is a copy of the "fiber space", SO(''n''), namely the rotations that keep that direction fixed.


Thus we can build an ''n''×''n'' rotation matrix by starting with a 2×2 matrix, aiming its fixed axis on ''S''<sup>2</sup> (the ordinary sphere in three-dimensional space), aiming the resulting rotation on ''S''<sup>3</sup>, and so on up through ''S''<sup>''n''−1</sup>. A point on ''S''<sup>''n''</sup> can be selected using ''n'' numbers, so we again have ''n''(''n''−1)/2 numbers to describe any ''n''×''n'' rotation matrix.
:<math>
\mathcal{R}_z(\theta_z)=
\begin{bmatrix}
\cos{\theta_z} & -\sin{\theta_z} & 0 \\
\sin{\theta_z} & \cos{\theta_z} & 0 \\
0 & 0 & 1
\end{bmatrix}
=\exp\left(
\begin{bmatrix}
0 & -\theta_z & 0 \\
\theta_z & 0 & 0 \\
0 & 0 & 0
\end{bmatrix}\right)
</math> where <math>\theta_z</math> is the ''yaw angle''.


In fact, we can view the sequential angle decomposition, discussed previously, as reversing this process. The composition of ''n''−1 Givens rotations brings the first column (and row) to (1,0,…,0), so that the remainder of the matrix is a rotation matrix of dimension one less, embedded so as to leave (1,0,…,0) fixed.
In [[flight dynamics]], the roll, pitch and yaw angles are usually given the symbols <math>\gamma</math>, <math>\alpha</math>, and <math>\beta</math>, respectively; here, however, the symbols <math>\theta_x</math>, <math>\theta_y</math>, and <math>\theta_z</math> are used to avoid confusion with the [[Euler angles]].


=== Skew parameters ===
Any 3-dimensional rotation matrix <math>\mathcal{M}\in\mathbb{R}^{3\times 3}</math> can be characterised by the three angles <math>\theta_x</math>, <math>\theta_y</math>, and <math>\theta_z</math>, and may be expressed as a product of the roll, pitch and yaw matrices.
When an ''n''×''n'' rotation matrix, ''Q'', does not include −1 as an eigenvalue, so that none of the planar rotations of which it is composed are 180° rotations, then ''Q''+''I'' is an [[invertible matrix]]. Most rotation matrices fit this description, and for them we can show that (''Q''−''I'')(''Q''+''I'')<sup>−1</sup> is a [[skew-symmetric matrix]], ''A''. Thus ''A''<sup>T</sup>&nbsp;= −''A''; and since the diagonal is necessarily zero, and since the upper triangle determines the lower one, ''A'' contains ''n''(''n''−1)/2 independent numbers. Conveniently, ''I''−''A'' is invertible whenever ''A'' is skew-symmetric; thus we can recover the original matrix using the ''[[Cayley transform]]'',
:<math> A \mapsto (I+A)(I-A)^{-1} , \,\!</math>
which maps any skew-symmetric matrix ''A'' to a rotation matrix. In fact, aside from the noted exceptions, we can produce any rotation matrix in this way. Although in practical applications we can hardly afford to ignore 180° rotations, the Cayley transform is still a potentially useful tool, giving a parameterization of most rotation matrices without trigonometric functions.


In three dimensions, for example, we have {{Harv|Cayley|1846}}
:<math>\mathcal{M}</math> is a rotation matrix in <math> \mathbb{R}^{3\times 3}\,\Leftrightarrow\,\exist\,\theta_x,\theta_y,\theta_z\in[0\ldots\pi):\,
:<math>\begin{align}
\mathcal{M}=\mathcal{R}_z(\theta_z)\,\mathcal{R}_y(\theta_y)\,\mathcal{R}_x(\theta_x)</math>
&\begin{bmatrix}0&-z&y\\z&0&-x\\-y&x&0\end{bmatrix} \mapsto {} \\
&\quad \frac{1}{1+x^2+y^2+z^2}
\begin{bmatrix}
1+x^2-y^2-z^2 & 2 x y-2 z & 2 y+2 x z \\
2 x y+2 z & 1-x^2+y^2-z^2 & 2 y z-2 x \\
2 x z-2 y & 2 x+2 y z & 1-x^2-y^2+z^2
\end{bmatrix} .
\end{align}</math>
If we condense the skew entries into a vector, (''x'',''y'',''z''), then we produce a 90° rotation around the ''x'' axis for (1,0,0), around the ''y'' axis for (0,1,0), and around the ''z'' axis for (0,0,1). The 180° rotations are just out of reach; for, in the limit as ''x'' goes to infinity, (''x'',0,0) does approach a 180° rotation around the ''x'' axis, and similarly for other directions.
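The three-dimensional Cayley formula uses no trigonometric functions, which the sketch below makes concrete. It is a plain-Python transcription of the closed form above; the function name <code>cayley3</code> is hypothetical. It evaluates the vector (1,0,0) and a very long vector along the ''x'' axis to show the approach to a 180° rotation.

```python
def cayley3(x, y, z):
    """Rotation matrix produced by the Cayley transform of the skew matrix
    [[0, -z, y], [z, 0, -x], [-y, x, 0]]."""
    d = 1.0 + x * x + y * y + z * z
    rows = [[1 + x * x - y * y - z * z, 2 * x * y - 2 * z,         2 * y + 2 * x * z],
            [2 * x * y + 2 * z,         1 - x * x + y * y - z * z, 2 * y * z - 2 * x],
            [2 * x * z - 2 * y,         2 * x + 2 * y * z,         1 - x * x - y * y + z * z]]
    return [[e / d for e in row] for row in rows]

quarter_turn = cayley3(1, 0, 0)    # 90 degree rotation about the x axis
near_half_turn = cayley3(1e6, 0, 0)  # approaches diag(1, -1, -1)

print(quarter_turn)
print(near_half_turn)
```

No finite input gives exactly diag(1, −1, −1), which is the sense in which the 180° rotations are out of reach.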


== Lie theory ==
The set of all rotations in <math>\mathbb{R}^3</math>, together with the operation of [[function composition|composition]], form the [[rotation group]] SO(3). The matrices discussed here then provide a [[group representation|representation]] of the group.
=== Lie group ===
We have established that ''n''×''n'' rotation matrices form a [[group (mathematics)|group]], the [[special orthogonal group]], SO(''n''). This [[algebraic structure]] is coupled with a [[topological structure]], in that the operations of multiplication and taking the inverse (which here is merely transposition) are continuous functions of the matrix entries. Thus SO(''n'') is a classic example of a [[topological group]]. (In purely topological terms, it is a [[compact manifold]].) Furthermore, the operations are not only continuous, but [[smooth function|smooth]], so SO(''n'') is a [[differentiable manifold]] and a [[Lie group]] ({{Harvtxt|Baker|2003}}; {{Harvtxt|Fulton|Harris|1991}}).


Most properties of rotation matrices depend very little on the dimension, ''n''; yet in Lie group theory we see systematic differences between even dimensions and odd dimensions. As well, there are some irregularities below ''n''&nbsp;= 5; for example, SO(4) is, anomalously, not a [[simple Lie group]], but instead [[isomorphic]] to the [[direct product|product]] of ''S''<sup>3</sup> and SO(3).
=== Angle-Axis representation and quaternion representation ===<!-- This section is linked from [[Quaternions and spatial rotation]] -->
{{main|axis angle|Quaternions and spatial rotation}}
In three dimensions, a rotation can be defined by a single angle of rotation, <math>\theta</math>, and the direction of a [[unit vector]], <math>\hat{\mathbf{v}} = (x,y,z)</math>, about which to rotate.


=== Lie algebra ===
:<math> \mathcal{M}(\hat{\mathbf{v}},\theta) = \begin{bmatrix}
Associated with every Lie group is a [[Lie algebra]], a linear space equipped with a bilinear alternating product called a bracket. The algebra for SO(''n'') is denoted by
\cos \theta + (1 - \cos \theta) x^2
:<math> \mathfrak{so}(n) , \,\!</math>
& (1 - \cos \theta) x y - (\sin \theta) z
and consists of all [[skew-symmetric matrix|skew-symmetric]] ''n''×''n'' matrices (as implied by differentiating the [[orthogonal matrix|orthogonality condition]], ''I''&nbsp;= ''Q''<sup>T</sup>''Q''). The bracket, [''A''<sub>1</sub>,''A''<sub>2</sub>], of two skew-symmetric matrices is defined to be ''A''<sub>1</sub>''A''<sub>2</sub>−''A''<sub>2</sub>''A''<sub>1</sub>, which is again a skew-symmetric matrix. This Lie algebra bracket captures the essence of the Lie group product via infinitesimals.
& (1 - \cos \theta) x z + (\sin \theta) y
\\
(1 - \cos \theta) y x + (\sin \theta) z
& \cos \theta + (1 - \cos \theta) y^2
& (1 - \cos \theta) y z - (\sin \theta) x
\\
(1 - \cos \theta) z x - (\sin \theta) y
& (1 - \cos \theta) z y + (\sin \theta) x
& \cos \theta + (1 - \cos \theta) z^2
\end{bmatrix} </math>

This rotation may be simply expressed in terms of its generator:

:<math> \mathcal{M}(\hat{\mathbf{v}},\theta)
= \exp\left( \begin{bmatrix}
0 & -z\theta & y\theta \\
z\theta & 0 & -x\theta \\
-y\theta & x\theta & 0 \\
\end{bmatrix}\right)
.</math>

When operating on a vector '''r''', this is equivalent to the [[Rodrigues' rotation formula]]

:<math>\mathcal{M} \cdot \mathbf{r} = \mathbf{r} \,\cos(\theta)+\hat{\mathbf{v}}\times \mathbf{r}\, \sin(\theta)+(\hat{\mathbf{v}}\cdot\mathbf{r})\hat{\mathbf{v}}(1-\cos(\theta)).</math>

The angle-axis representation is closely related to the [[Quaternions and spatial rotation|quaternion]] representation. In terms of the axis and angle, the quaternion representation is given by a normalized quaternion ''Q'':


For 2×2 rotation matrices, the Lie algebra is a one-dimensional vector space, multiples of
:<math>J = \begin{bmatrix}0&-1\\1&0\end{bmatrix} . </math>
Here the bracket always vanishes, which tells us that, in two dimensions, rotations commute. Not so in any higher dimension. For 3×3 rotation matrices, we have a three-dimensional vector space with the convenient basis (generators)
:<math>
:<math>
A_{\bold{x}} = \begin{bmatrix}0&0&0\\0&0&-1\\0&1&0\end{bmatrix} , \quad
Q=(xi+yj+zk)\sin(\theta/2)+\cos(\theta/2)\,
A_{\bold{y}} = \begin{bmatrix}0&0&1\\0&0&0\\-1&0&0\end{bmatrix} , \quad
A_{\bold{z}} = \begin{bmatrix}0&-1&0\\1&0&0\\0&0&0\end{bmatrix} .
</math>
The essence of the bracket for these basis vectors works out to be as follows.
:<math>
[A_{\bold{x}}, A_{\bold{y}}] = A_{\bold{z}}, \quad
[A_{\bold{z}}, A_{\bold{x}}] = A_{\bold{y}}, \quad
[A_{\bold{y}}, A_{\bold{z}}] = A_{\bold{x}}.
</math>
</math>


We can conveniently identify any matrix in this Lie algebra with a vector in '''R'''<sup>3</sup>,
where ''i'', ''j'', and ''k'' are the three imaginary parts of ''Q''.
:<math>\begin{align}
\boldsymbol{\omega} &{}= (x,y,z) \\
\tilde{\boldsymbol{\omega}} &{}= x A_{\bold{x}} + y A_{\bold{y}} + z A_{\bold{z}} \\
&{}= \begin{bmatrix}0&-z&y\\z&0&-x\\-y&x&0\end{bmatrix} .
\end{align}</math>
Under this identification, the '''so'''(3) bracket has a memorable description; it is the vector [[cross product]],
:<math> [\tilde{\bold{u}},\tilde{\bold{v}}] = (\bold{u} \times \bold{v})^{\sim} . \,\!</math>
The matrix identified with a vector '''v''' is also memorable, because
:<math> \tilde{\bold{v}} \bold{u} = \bold{v} \times \bold{u} . \,\!</math>
Notice this implies that '''v''' is in the [[null space]] of the skew-symmetric matrix with which it is identified, because '''v'''×'''v''' is always the zero vector.
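
Both identities are easy to confirm numerically. The Python sketch below is only an illustration; <code>hat</code> (the map from a vector to its skew-symmetric matrix), <code>bracket</code>, and the other helper names are assumptions made for this example.

```python
def hat(v):
    """Skew-symmetric matrix identified with the vector v = (x, y, z)."""
    x, y, z = v
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k]*b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, u):
    """Product of a 3x3 matrix with a 3-vector."""
    return tuple(sum(m[i][j]*u[j] for j in range(3)) for i in range(3))

def bracket(a, b):
    """Matrix commutator ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

u = (1.0, 2.0, 3.0)
v = (-2.0, 0.5, 4.0)

hat_v_times_u = matvec(hat(v), u)        # should match cross(v, u)
commutator = bracket(hat(u), hat(v))     # should match hat(cross(u, v))
v_in_null_space = matvec(hat(v), v)      # should be the zero vector
```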


=== Exponential map ===
=== Angle-Axis representation via Rotation Tensor ===
Connecting the Lie algebra to the Lie group is the ''[[exponential map]]'', which we define using the familiar [[power series]] for ''e''<sup>''x''</sup> {{Harv|Wedderburn|1934|loc=&sect;8.02}},
:<math>\begin{align}
\exp \colon \mathfrak{so}(n) &{}\to SO(n) \\
A &{}\mapsto I + A + \tfrac{1}{2} A^2 + \tfrac{1}{6} A^3 + \cdots + \tfrac{1}{k!} A^k + \cdots \\
&{}= \sum_{k=0}^{\infty} \frac{1}{k!} A^k
\end{align}</math>
For any skew-symmetric ''A'', exp(''A'') is always a rotation matrix.


A rotation matrix is not invariant with respect to the [[reference frame]] in which the rotation is considered. The same physical rotation will have different "rotation matrices" with respect to different sets of basis vectors (orthonormal or not). A [[rotation tensor]] is a more general representation of a rotation in space. The representation of rotation by rotation tensors is invariant with respect to a change of reference frame. Each "rotation matrix" representation is then just an "image" of the corresponding rotation tensor in a given reference frame. Rotation tensors are constructed using vector [[dyadics]] (or "ordered pairs of vectors"). Dyadics themselves can be described as matrices in each given reference frame but are actually much more general objects and are also invariant with respect to rotations of the reference frame.

An important practical example is the 3×3 case, where we have seen we can identify every skew-symmetric matrix with a vector '''ω''' = '''u'''θ, where '''u''' = (''x'',''y'',''z'') is a unit magnitude vector. Recall that '''u''' is in the null space of the matrix associated with '''ω''', so that if we use a basis with '''u''' as the ''z'' axis the final column and row will be zero. Thus we know in advance that the exponential matrix must leave '''u''' fixed. It is mathematically impossible to supply a straightforward formula for such a basis as a function of '''u''' (its existence would violate the [[hairy ball theorem]]), but direct exponentiation is possible, and yields
:<math>\begin{align}
\exp( \tilde{\boldsymbol{\omega}} )
&{}= \exp \left( \begin{bmatrix} 0 & -z \theta & y \theta \\ z \theta & 0&-x \theta \\ -y \theta & x \theta & 0 \end{bmatrix} \right) \\
&{}= \begin{bmatrix}
2 (x^2 - 1) s^2 + 1 & 2 x y s^2 - 2 z c s & 2 x z s^2 + 2 y c s \\
2 x y s^2 + 2 z c s & 2 (y^2 - 1) s^2 + 1 & 2 y z s^2 - 2 x c s \\
2 x z s^2 - 2 y c s & 2 y z s^2 + 2 x c s & 2 (z^2 - 1) s^2 + 1
\end{bmatrix} ,
\end{align}</math>
where ''c''&nbsp;= cos&nbsp;<sup>&theta;</sup>&frasl;<sub>2</sub>, ''s''&nbsp;= sin&nbsp;<sup>&theta;</sup>&frasl;<sub>2</sub>. We recognize this as our matrix for a rotation around axis '''u''' by angle &theta;. We also note that this mapping of skew-symmetric matrices is quite different from the Cayley transform discussed earlier.
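
As a sanity check, the truncated power series can be compared with the closed form. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the series is cut off after 30 terms (ample for moderate angles), and '''u''' is assumed to be a unit vector.

```python
import math

def hat(w):
    """Skew-symmetric matrix identified with the vector w = (x, y, z)."""
    x, y, z = w
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k]*b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_series(a, terms=30):
    """Partial sum I + A + A^2/2! + ... of the exponential power series."""
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes A^k / k!
        term = [[e / k for e in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result

def exp_closed_form(u, theta):
    """Closed form with c = cos(theta/2), s = sin(theta/2); u is a unit vector."""
    x, y, z = u
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[2*(x*x - 1)*s*s + 1, 2*x*y*s*s - 2*z*c*s, 2*x*z*s*s + 2*y*c*s],
            [2*x*y*s*s + 2*z*c*s, 2*(y*y - 1)*s*s + 1, 2*y*z*s*s - 2*x*c*s],
            [2*x*z*s*s - 2*y*c*s, 2*y*z*s*s + 2*x*c*s, 2*(z*z - 1)*s*s + 1]]

u = (1/3, 2/3, 2/3)   # unit axis
theta = 1.2           # angle in radians
series = exp_series(hat(tuple(theta * c for c in u)))
closed = exp_closed_form(u, theta)
max_gap = max(abs(series[i][j] - closed[i][j])
              for i in range(3) for j in range(3))
```

Here <code>max_gap</code> should be at the level of floating-point rounding error.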


In any dimension, if we choose some nonzero ''A'' and consider all its scalar multiples, exponentiation yields rotation matrices along a ''[[geodesic]]'' of the group manifold, forming a [[one-parameter subgroup]] of the Lie group. More broadly, the exponential map provides a [[homeomorphism]] between a neighborhood of the origin in the Lie algebra and a neighborhood of the identity in the Lie group. In fact, we can produce any rotation matrix as the exponential of some skew-symmetric matrix, so for these groups the exponential map is a ''[[surjection]]''.


=== Baker–Campbell–Hausdorff formula ===
Suppose we are given ''A'' and ''B'' in the Lie algebra. Their exponentials, exp(''A'') and exp(''B''), are rotation matrices, which we can multiply. Since the exponential map is a surjection, we know that for some ''C'' in the Lie algebra, exp(''A'')exp(''B'') = exp(''C''), and we write
:<math> A \ast B = C . \,\!</math>
When exp(''A'') and exp(''B'') commute (which always happens for 2×2 matrices, but not higher), then ''C''&nbsp;= ''A''+''B'', mimicking the behavior of complex exponentiation. The general case is given by the [[BCH formula]], a series expanded in terms of the bracket ({{Harvnb|Hall|2004|loc=Ch.&nbsp;3}}; {{Harvnb|Varadarajan|1984|loc=&sect;2.15}}). For matrices, the bracket is the same operation as the [[commutator]], which detects lack of commutativity in multiplication. The general formula begins as follows.
:<math> A \ast B = A + B + \tfrac12 [A,B] + \tfrac{1}{12} [A,[A,B]] - \tfrac{1}{12} [B,[A,B]] - \cdots \,\!</math>
Representation of a rotation matrix as a sequential angle decomposition, as in Euler angles, may tempt us to treat rotations as a vector space, but the higher order terms in the BCH formula reveal that to be a mistake.


We again take special interest in the 3×3 case, where [''A'',''B''] equals the cross product, ''A''×''B''. If ''A'' and ''B'' are [[linearly independent]], then ''A'', ''B'', and ''A''×''B'' can be used as a basis; if not, then ''A'' and ''B'' commute. And conveniently, in this dimension the summation in the BCH formula has a closed form {{Harv|Engø|2001}} as &alpha;''A''+&beta;''B''+&gamma;(''A''×''B'').

A rotation tensor <math>\mathcal{M}(\hat{\mathbf{v}},\theta)</math> representing a rotation about unit axis <math>\hat{\mathbf{v}}</math> for angle <math>\theta</math> is given by:
:<math>
\mathcal{M}(\hat{\mathbf{v}},\theta) =
\hat{\mathbf{v}} \otimes \hat{\mathbf{v}} +
\cos\theta \, ( \mathbf{E} - \hat{\mathbf{v}} \otimes \hat{\mathbf{v}} ) +
\sin\theta \, \hat{\mathbf{v}} \times \mathbf{E} \, ,
</math>


=== Spin group ===
The Lie group of ''n''×''n'' rotation matrices, SO(''n''), is a [[compact space|compact]] and [[connected space|path-connected]] [[manifold]], and thus [[locally compact space|locally compact]] and [[connected space|connected]]. However, it is not [[simply connected space|simply connected]], so Lie theory tells us it is a kind of "shadow" (a homomorphic image) of a [[universal covering group]]. Often the covering group, which in this case is the [[spin group]] denoted by Spin(''n''), is simpler and more natural to work with ({{Harvnb|Baker|2003|loc=Ch.&nbsp;5}}; {{Harvnb|Fulton|Harris|1991|pp=299–315}}).

where <math>\mathbf{E}</math> is a [[unit tensor]] of second order, which is a sum of three dyadics <math>\hat{\mathbf{e}}_i \otimes \hat{\mathbf{e}}_i \, , \, i=1,2,3 </math>, where <math>\hat{\mathbf{e}}_i \, , \, i=1,2,3</math> are three orthogonal unit vectors of any orthonormal reference frame. The given representation does not depend on the actual orientation of the reference frame <math>\hat{\mathbf{e}}_i \, , \, i=1,2,3</math> because the unit tensor <math>\mathbf{E}</math> itself has the same representation in any orthonormal reference frame (non-orthonormal reference frames are considered a few lines later).


In the case of planar rotations, SO(2) is topologically a [[sphere|circle]], ''S''<sup>1</sup>. Its universal covering group, Spin(2), is isomorphic to the [[real line]], '''R''', under addition. In other words, whenever we use angles of arbitrary magnitude, which we often do, we are essentially taking advantage of the convenience of the "mother space". Every 2×2 rotation matrix is produced by a countable infinity of angles, separated by integer multiples of 2&pi;. Correspondingly, the [[fundamental group]] of SO(2) is isomorphic to the integers, '''Z'''.


In the case of spatial rotations, SO(3) is topologically equivalent to three-dimensional [[real projective space]], '''RP'''<sup>3</sup>. Its universal covering group, Spin(3), is isomorphic to the 3-sphere, ''S''<sup>3</sup>. Every 3×3 rotation matrix is produced by two opposite points on the sphere. Correspondingly, the [[fundamental group]] of SO(3) is isomorphic to the two-element group, '''Z'''<sub>2</sub>. We can also describe Spin(3) as isomorphic to [[quaternion]]s of unit norm under multiplication, or to certain 4×4 real matrices, or to 2×2 complex [[special unitary group|special unitary matrices]].

The [[Rodrigues' rotation formula]] follows directly from the above representation, because
<math>\mathbf{E} \cdot \mathbf{r} = \mathbf{r}</math>.


Concretely, a unit quaternion, ''q'', with
:<math>\begin{align}
q &{}= w + \bold{i}x + \bold{j}y + \bold{k}z , \\
1 &{}= w^2 + x^2 + y^2 + z^2 ,
\end{align}</math>
produces the rotation matrix
:<math> Q = \begin{bmatrix}
2 x^2 + 2 w^2 - 1 & 2 x y - 2 z w & 2 x z + 2 y w \\
2 x y + 2 z w & 2 y^2 + 2 w^2 - 1 & 2 y z - 2 x w \\
2 x z - 2 y w & 2 y z + 2 x w & 2 z^2 + 2 w^2 - 1
\end{bmatrix} . </math>
This is our third version of this matrix, here as a rotation around non-unit axis vector (''x'',''y'',''z'') by angle 2θ, where cos θ = ''w'' and sin θ = ||(''x'',''y'',''z'')||.


Many features of this case are the same for higher dimensions. The coverings are all two-to-one, with SO(''n''), ''n''&nbsp;&gt;&nbsp;2, having fundamental group '''Z'''<sub>2</sub>. The natural setting for these groups is within a [[Clifford algebra]]. And the action of the rotations is produced by a kind of "sandwich", denoted by ''qvq''<sup>&lowast;</sup>.
The parts of the expression for the rotation tensor are easily recognizable.


== Infinitesimal rotations ==
The matrices in the Lie algebra are not themselves rotations; the skew-symmetric matrices are derivatives, proportional differences of rotations. An actual "differential rotation", or ''infinitesimal rotation matrix'' has the form
:<math> I + A \, d\theta , \,\!</math>
where ''d''θ is vanishingly small. These matrices do not satisfy all the same properties as ordinary finite rotation matrices under the usual treatment of infinitesimals {{Harv|Goldstein|Poole|Safko|2002|loc=§4.8}}. To understand what this means, consider
:<math> dA_{\bold{x}} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -d\theta \\ 0 & d\theta & 1 \end{bmatrix} . </math>
We first test the orthogonality condition, ''Q''<sup>T</sup>''Q''&nbsp;= ''I''. The product is
:<math> dA_{\bold{x}}^T \, dA_{\bold{x}} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1+d\theta^2 & 0 \\ 0 & 0 & 1+d\theta^2 \end{bmatrix} , </math>
differing from an identity matrix by second order infinitesimals, which we discard. So to first order, an infinitesimal rotation matrix is an orthogonal matrix. Next we examine the square of the matrix.
:<math> dA_{\bold{x}}^2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1-d\theta^2 & -2d\theta \\ 0 & 2d\theta & 1-d\theta^2 \end{bmatrix} </math>
Again discarding second order effects, we see that the angle simply doubles. This hints at the most essential difference in behavior, which we can exhibit with the assistance of a second infinitesimal rotation,
:<math> dA_{\bold{y}} = \begin{bmatrix} 1 & 0 & d\phi \\ 0 & 1 & 0 \\ -d\phi & 0 & 1 \end{bmatrix} . </math>
Compare the products ''dA''<sub>'''x'''</sub>''dA''<sub>'''y'''</sub> and ''dA''<sub>'''y'''</sub>''dA''<sub>'''x'''</sub>.
:<math>\begin{align}
dA_{\bold{x}}\,dA_{\bold{y}} &{}= \begin{bmatrix} 1 & 0 & d\phi \\ d\theta\,d\phi & 1 & -d\theta \\ -d\phi & d\theta & 1 \end{bmatrix} \\
dA_{\bold{y}}\,dA_{\bold{x}} &{}= \begin{bmatrix} 1 & d\theta\,d\phi & d\phi \\ 0 & 1 & -d\theta \\ -d\phi & d\theta & 1 \end{bmatrix} \\
\end{align}</math>
Since ''d''θ ''d''φ is second order, we discard it; thus, to first order, multiplication of infinitesimal rotation matrices is commutative. In fact,
:<math> dA_{\bold{x}}\,dA_{\bold{y}} = dA_{\bold{y}}\,dA_{\bold{x}} , \,\!</math>
again to first order.
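
A small numerical experiment makes the order of the discrepancy concrete. The sketch below is illustrative; the helper names and the step size <code>eps</code> are assumptions chosen for this example. The two products differ only in entries of size ''d''θ ''d''φ.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k]*b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def d_ax(dtheta):
    """First-order rotation about the x axis, I + A_x * dtheta."""
    return [[1.0, 0.0, 0.0],
            [0.0, 1.0, -dtheta],
            [0.0, dtheta, 1.0]]

def d_ay(dphi):
    """First-order rotation about the y axis, I + A_y * dphi."""
    return [[1.0, 0.0, dphi],
            [0.0, 1.0, 0.0],
            [-dphi, 0.0, 1.0]]

def max_abs_diff(a, b):
    """Largest entrywise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

eps = 1e-4
xy = matmul(d_ax(eps), d_ay(eps))
yx = matmul(d_ay(eps), d_ax(eps))
gap = max_abs_diff(xy, yx)   # second order: eps * eps
```

Halving <code>eps</code> should reduce <code>gap</code> by a factor of four, which is the numerical signature of a second-order term.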


But we must always be careful to distinguish (the first order treatment of) these infinitesimal rotation matrices from both finite rotation matrices and from derivatives of rotation matrices (namely skew-symmetric matrices). Contrast the behavior of finite rotation matrices in the BCH formula with that of infinitesimal rotation matrices, where all the commutator terms will be second order infinitesimals so we ''do'' have a vector space.
The dyad <math>\hat{\mathbf{v}} \otimes \hat{\mathbf{v}}</math> is responsible for a component
<math>(\hat{\mathbf{v}}\cdot\mathbf{r})\hat{\mathbf{v}}</math> of the vector <math>\mathbf{R}=\mathcal{M} \cdot \mathbf{r}</math>,
which is parallel to the axis of rotation <math>\hat{\mathbf{v}}</math> and is not affected by the multiplication
<math>\mathcal{M} \cdot \mathbf{r}</math>. The length of this component is <math>r \, \cos \alpha</math>,
where
<math> r </math> is the length of the vector <math>\mathbf{r}</math>
and
<math> \alpha </math> is the angle between vectors <math>\hat{\mathbf{v}}</math> and <math>\mathbf{r}</math>.


== Conversions ==
We have seen the existence of several decompositions that apply in any dimension, namely independent planes, sequential angles, and nested dimensions. In all these cases we can either decompose a matrix or construct one. We have also given special attention to 3×3 rotation matrices, and these warrant further attention, in both directions {{Harv|Stuelpnagel|1964}}.


=== Quaternion ===
The projector <math>(\mathbf{E}-\hat{\mathbf{v}}\otimes\hat{\mathbf{v}})</math> gives us a component of the vector <math>\mathbf{r}</math>, which is exactly orthogonal to <math>\hat{\mathbf{v}}</math>. The length of this component is <math>r \, \sin \alpha</math>. This component is then scaled by <math>\cos\theta</math> depending on the actual rotation angle <math>\theta</math>.

Rewrite the 3×3 rotation matrix again, as
:<math> Q = \begin{bmatrix}
1 - 2 y^2 - 2 z^2 & 2 x y - 2 z w & 2 x z + 2 y w \\
2 x y + 2 z w & 1 - 2 x^2 - 2 z^2 & 2 y z - 2 x w \\
2 x z - 2 y w & 2 y z + 2 x w & 1 - 2 x^2 - 2 y^2
\end{bmatrix} . </math>
Now every quaternion component appears multiplied by two in a term of degree two, and if all such terms are zero what's left is an identity matrix. This leads to an efficient, robust conversion from any quaternion — whether unit, nonunit, or even zero — to a 3×3 rotation matrix.


Nq = w^2 + x^2 + y^2 + z^2
if Nq > 0.0 then s = 2/Nq else s = 0.0
X = x*s; Y = y*s; Z = z*s
wX = w*X; wY = w*Y; wZ = w*Z
xX = x*X; xY = x*Y; xZ = x*Z
yY = y*Y; yZ = y*Z; zZ = z*Z
[ 1.0-(yY+zZ) xY-wZ xZ+wY ]
[ xY+wZ 1.0-(xX+zZ) yZ-wX ]
[ xZ-wY yZ+wX 1.0-(xX+yY) ]
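
The pseudocode above translates almost line for line into a runnable form. The following Python version is a sketch; the function name <code>quaternion_to_matrix</code> and the argument order (''w'' first) are choices made for this example.

```python
def quaternion_to_matrix(w, x, y, z):
    """Convert any quaternion (unit, nonunit, or zero) to a 3x3 rotation matrix."""
    nq = w*w + x*x + y*y + z*z
    # Scaling by 2/Nq normalizes implicitly; a zero quaternion yields the identity.
    s = 2.0 / nq if nq > 0.0 else 0.0
    X, Y, Z = x*s, y*s, z*s
    wX, wY, wZ = w*X, w*Y, w*Z
    xX, xY, xZ = x*X, x*Y, x*Z
    yY, yZ, zZ = y*Y, y*Z, z*Z
    return [[1.0 - (yY + zZ), xY - wZ, xZ + wY],
            [xY + wZ, 1.0 - (xX + zZ), yZ - wX],
            [xZ - wY, yZ + wX, 1.0 - (xX + yY)]]

# The nonunit quaternion 1 + k represents a quarter turn about the z axis.
quarter_turn_z = quaternion_to_matrix(1.0, 0.0, 0.0, 1.0)
```

Because the scale factor divides out the squared norm, no square root is needed even when the input is not normalized.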


Freed from the demand for a unit quaternion, we find that nonzero quaternions act as [[homogeneous coordinates]] for 3×3 rotation matrices. The Cayley transform, discussed earlier, is obtained by scaling the quaternion so that its ''w'' component is 1. For a 180° rotation around any axis, ''w'' will be zero, which explains the Cayley limitation.
The last part, <math>\sin \theta \, \hat{\mathbf{v}} \times \mathbf{E}</math>, of the expression for the rotation tensor
is responsible for a component of the final vector <math>\mathbf{R}</math> that is orthogonal to both
<math>\hat{\mathbf{v}}</math> and <math>\mathbf{r}</math>, because
<math>(\hat{\mathbf{v}} \times \mathbf{E}) \cdot \mathbf{r} = \hat{\mathbf{v}} \times \mathbf{r} </math>.
The length of the vector <math>\hat{\mathbf{v}}\times\mathbf{r}</math> is also equal to <math>r \, \sin \alpha</math>, by the definition of the [[cross-product]] of two vectors.


The sum of the entries along the main diagonal (the [[trace (linear algebra)|trace]]), plus one, equals 4−4(''x''<sup>2</sup>+''y''<sup>2</sup>+''z''<sup>2</sup>), which is 4''w''<sup>2</sup>. Thus we can write the trace itself as 2''w''<sup>2</sup>+2''w''<sup>2</sup>−1; and from the previous version of the matrix we see that the diagonal entries themselves have the same form: 2''x''<sup>2</sup>+2''w''<sup>2</sup>−1, 2''y''<sup>2</sup>+2''w''<sup>2</sup>−1, and 2''z''<sup>2</sup>+2''w''<sup>2</sup>−1. So we can easily compare the magnitudes of all four quaternion components using the matrix diagonal. We can, in fact, ''obtain'' all four magnitudes using sums and square roots, and choose consistent signs using the skew-symmetric part of the off-diagonal entries.
w = 0.5*sqrt(1+Q<sub>xx</sub>+Q<sub>yy</sub>+Q<sub>zz</sub>)
x = copysign(0.5*sqrt(1+Q<sub>xx</sub>-Q<sub>yy</sub>-Q<sub>zz</sub>),Q<sub>zy</sub>-Q<sub>yz</sub>)
y = copysign(0.5*sqrt(1-Q<sub>xx</sub>+Q<sub>yy</sub>-Q<sub>zz</sub>),Q<sub>xz</sub>-Q<sub>zx</sub>)
z = copysign(0.5*sqrt(1-Q<sub>xx</sub>-Q<sub>yy</sub>+Q<sub>zz</sub>),Q<sub>yx</sub>-Q<sub>xy</sub>)
Alternatively, use a single square root and division
t = Q<sub>xx</sub>+Q<sub>yy</sub>+Q<sub>zz</sub>
r = sqrt(1+t)
s = 0.5/r
w = 0.5*r
x = (Q<sub>zy</sub>-Q<sub>yz</sub>)*s
y = (Q<sub>xz</sub>-Q<sub>zx</sub>)*s
z = (Q<sub>yx</sub>-Q<sub>xy</sub>)*s
This is numerically stable so long as the trace, <tt>t</tt>, is not negative; otherwise, we risk dividing by (nearly) zero. In that case, suppose <tt>Q<sub>xx</sub></tt> is the largest diagonal entry, so ''x'' will have the largest magnitude (the other cases are similar); then the following is safe.
r = sqrt(1+Q<sub>xx</sub>-Q<sub>yy</sub>-Q<sub>zz</sub>)
s = 0.5/r
w = (Q<sub>zy</sub>-Q<sub>yz</sub>)*s
x = 0.5*r
y = (Q<sub>xy</sub>+Q<sub>yx</sub>)*s
z = (Q<sub>zx</sub>+Q<sub>xz</sub>)*s
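
The two branches can be combined into one routine. The Python sketch below is an illustration, not a reference implementation; the function name and the cyclic index handling for the "other cases" are assumptions of this example, and the input is assumed to be a true rotation matrix.

```python
import math

def matrix_to_quaternion(q):
    """Return (w, x, y, z) for a 3x3 rotation matrix q given as nested lists."""
    t = q[0][0] + q[1][1] + q[2][2]
    if t >= 0.0:
        # Trace is not negative: w is safely away from zero.
        r = math.sqrt(1.0 + t)
        s = 0.5 / r
        return (0.5 * r,
                (q[2][1] - q[1][2]) * s,
                (q[0][2] - q[2][0]) * s,
                (q[1][0] - q[0][1]) * s)
    # Otherwise pivot on the largest diagonal entry; i, j, k are cyclic.
    i = max(range(3), key=lambda n: q[n][n])
    j, k = (i + 1) % 3, (i + 2) % 3
    r = math.sqrt(1.0 + q[i][i] - q[j][j] - q[k][k])
    s = 0.5 / r
    v = [0.0, 0.0, 0.0]
    v[i] = 0.5 * r
    v[j] = (q[i][j] + q[j][i]) * s
    v[k] = (q[k][i] + q[i][k]) * s
    w = (q[k][j] - q[j][k]) * s
    return (w, v[0], v[1], v[2])

# A half turn about the x axis has trace -1 and exercises the second branch.
half_turn_x = matrix_to_quaternion([[1.0, 0.0, 0.0],
                                    [0.0, -1.0, 0.0],
                                    [0.0, 0.0, -1.0]])
```

The result is determined only up to an overall sign, since ''q'' and −''q'' give the same rotation.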
If the matrix contains significant error, such as accumulated numerical error, we may construct a symmetric 4×4 matrix,
:<math> K = \frac13
\begin{bmatrix}
Q_{xx}-Q_{yy}-Q_{zz} & Q_{yx}+Q_{xy} & Q_{zx}+Q_{xz} & Q_{zy}-Q_{yz} \\
Q_{yx}+Q_{xy} & Q_{yy}-Q_{xx}-Q_{zz} & Q_{zy}+Q_{yz} & Q_{xz}-Q_{zx} \\
Q_{zx}+Q_{xz} & Q_{zy}+Q_{yz} & Q_{zz}-Q_{xx}-Q_{yy} & Q_{yx}-Q_{xy} \\
Q_{zy}-Q_{yz} & Q_{xz}-Q_{zx} & Q_{yx}-Q_{xy} & Q_{xx}+Q_{yy}+Q_{zz}
\end{bmatrix} ,
</math>
and find the [[eigenvector]], (''x'',''y'',''z'',''w''), of its largest magnitude eigenvalue. (If ''Q'' is truly a rotation matrix, that value will be 1.) The quaternion so obtained will correspond to the rotation matrix closest to the given matrix {{Harv|Bar-Itzhack|2000}}.


=== Polar decomposition ===
As a result, the three parts <math>\hat{\mathbf{v}} \otimes \hat{\mathbf{v}}</math>, <math>(\mathbf{E}-\hat{\mathbf{v}}\otimes\hat{\mathbf{v}})</math> and <math>\hat{\mathbf{v}}\times\mathbf{E}</math> of the rotation tensor construct a local orthogonal reference frame which is most convenient for describing the actual rotation of any given vector <math>\mathbf{r}</math>.
If the ''n''×''n'' matrix ''M'' is non-singular, its columns are linearly independent vectors; thus the [[Gram–Schmidt process]] can adjust them to be an orthonormal basis. Stated in terms of [[numerical linear algebra]], we convert ''M'' to an orthogonal matrix, ''Q'', using [[QR decomposition]]. However, we often prefer a ''Q'' "closest" to ''M'', which this method does not accomplish. For that, the tool we want is the [[polar decomposition]] ({{Harvnb|Fan|Hoffman|1955}}; {{Harvnb|Higham|1989}}).


To measure closeness, we may use any [[matrix norm]] invariant under orthogonal transformations. A convenient choice is the [[Frobenius norm]], ||''Q''−''M''||<sub>F</sub>, squared, which is the sum of the squares of the element differences. Writing this in terms of the [[trace (linear algebra)|trace]], Tr, our goal is,
* Find ''Q'' minimizing Tr( (''Q''−''M'')<sup>T</sup>(''Q''−''M'') ), subject to ''Q''<sup>T</sup>''Q''&nbsp;= ''I''.
Though written in matrix terms, the [[objective function]] is just a quadratic polynomial. We can minimize it in the usual way, by finding where its derivative is zero. For a 3×3 matrix, the orthogonality constraint implies six scalar equalities that the entries of ''Q'' must satisfy. To incorporate the constraint(s), we may employ a standard technique, [[Lagrange multipliers]], assembled as a symmetric matrix, ''Y''. Thus our method is:
* Differentiate Tr( (''Q''−''M'')<sup>T</sup>(''Q''−''M'') + (''Q''<sup>T</sup>''Q''−''I'')''Y'' ) with respect to (the entries of) ''Q'', and equate to zero.
<div style="float:right;font-size:80%;border:1px solid black;padding:1em">
Consider a 2×2 example. Including constraints, we seek to minimize
:<math>\begin{align}
&\scriptstyle{ (Q_{xx}-M_{xx})^2 + (Q_{xy}-M_{xy})^2 } \\
&\scriptstyle{ {} + (Q_{yx}-M_{yx})^2 + (Q_{yy}-M_{yy})^2 } \\
&\scriptstyle{ {} + (Q_{xx}^2+Q_{yx}^2-1)Y_{xx} + (Q_{xy}^2+Q_{yy}^2-1)Y_{yy} } \\
&\scriptstyle{ {} + 2(Q_{xx} Q_{xy} + Q_{yx} Q_{yy})Y_{xy} . }
\end{align}</math>
Taking the derivative with respect to ''Q''<sub>xx</sub>, ''Q''<sub>xy</sub>, ''Q''<sub>yx</sub>, ''Q''<sub>yy</sub> in turn, we assemble a matrix.
:<math>\scriptstyle{ 2
\begin{bmatrix}
\scriptstyle{ Q_{xx}-M_{xx} + Q_{xx} Y_{xx} + Q_{xy} Y_{xy} } & \scriptstyle{ Q_{xy}-M_{xy} + Q_{xx} Y_{xy} + Q_{xy} Y_{yy} } \\
\scriptstyle{ Q_{yx}-M_{yx} + Q_{yx} Y_{xx} + Q_{yy} Y_{xy} } & \scriptstyle{ Q_{yy}-M_{yy} + Q_{yx} Y_{xy} + Q_{yy} Y_{yy} }
\end{bmatrix}}</math>
</div>
In general, we obtain the equation
:<math> 0 = 2(Q-M) + 2QY , \,\!</math>
so that
:<math> M = Q(I+Y) = QS , \,\!</math>
where ''Q'' is orthogonal and ''S'' is symmetric. To ensure a minimum, the ''Y'' matrix (and hence ''S'') must be positive definite. Linear algebra calls ''QS'' the [[polar decomposition]] of ''M'', with ''S'' the positive square root of ''S''<sup>2</sup>&nbsp;= ''M''<sup>T</sup>''M''.
:<math> S^2 = (Q^T M)^T (Q^T M) = M^T Q Q^ T M = M^T M \,\!</math>
When ''M'' is [[non-singular matrix|non-singular]], the ''Q'' and ''S'' factors of the polar decomposition are uniquely determined. However, the determinant of ''S'' is positive because ''S'' is positive definite, so ''Q'' inherits the sign of the determinant of ''M''. That is, ''Q'' is only guaranteed to be orthogonal, not a rotation matrix. This is unavoidable; an ''M'' with negative determinant has no uniquely-defined closest rotation matrix.
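
In practice the orthogonal factor ''Q'' is often computed iteratively. One standard technique, swapped in here purely for illustration and not claimed to be the method of the references cited above, is the Newton iteration that repeatedly averages the current estimate with its inverse transpose. The sketch below assumes a non-singular 3×3 input; all names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def inverse_transpose(m):
    """Inverse transpose of a 3x3 matrix: its cofactor matrix divided by det."""
    cof = [cross(m[1], m[2]), cross(m[2], m[0]), cross(m[0], m[1])]
    det = sum(m[0][j] * cof[0][j] for j in range(3))
    return [[cof[i][j] / det for j in range(3)] for i in range(3)]

def nearest_orthogonal(m, iterations=20):
    """Orthogonal polar factor of a non-singular 3x3 matrix by Newton iteration."""
    q = [row[:] for row in m]
    for _ in range(iterations):
        qit = inverse_transpose(q)
        # Average the estimate with its inverse transpose.
        q = [[0.5 * (q[i][j] + qit[i][j]) for j in range(3)] for i in range(3)]
    return q

# A rotation matrix contaminated by small errors (made-up illustrative values).
m_noisy = [[0.6, -0.8, 0.01],
           [0.8, 0.62, 0.0],
           [0.02, 0.0, 1.0]]
q_near = nearest_orthogonal(m_noisy)
```

As noted above, the result inherits the sign of the determinant of the input, so a rotation matrix is obtained only when that determinant is positive.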


=== Axis and angle ===
The above representation can be generalized to the case of a non-orthonormal [[reference frame]] by constructing the [[unit tensor]] <math>\mathbf{E}</math> as <math>\mathbf{E} = \hat{\mathbf{e}}_i \otimes \hat{\mathbf{e}}^i </math> (assuming Einstein summation), where <math>\hat{\mathbf{e}}^i \, , \, i=1,2,3</math> are [[covectors]] of vectors <math>\hat{\mathbf{e}}_i \, , \, i=1,2,3</math>.

Determining an axis and angle, like determining a quaternion, is only possible up to sign; that is, ('''u''',θ) and (−'''u''',−θ) correspond to the same rotation matrix, just like ''q'' and −''q''. We might prefer '''u''' to be a unit vector; when θ is zero, however, the direction of '''u''' is undetermined. Also, the angle is only determined to within a multiple of 2π, and we need a [[Arctangent#Two argument variant of arctangent|two-argument arctangent]].
x = Q<sub>zy</sub>-Q<sub>yz</sub>
y = Q<sub>xz</sub>-Q<sub>zx</sub>
z = Q<sub>yx</sub>-Q<sub>xy</sub>
r = sqrt(x*x+y*y+z*z)
θ = atan2(r,Q<sub>xx</sub>+Q<sub>yy</sub>+Q<sub>zz</sub>-1)
To efficiently construct a rotation matrix from an angle θ and a unit axis '''u''', we can take advantage of symmetry and skew-symmetry within the entries.
c = cos(θ); s = sin(θ); C = 1-c
xs = x*s; ys = y*s; zs = z*s
xC = x*C; yC = y*C; zC = z*C
xyC = x*yC; yzC = y*zC; zxC = z*xC
[ x*xC+c xyC-zs zxC+ys ]
[ xyC+zs y*yC+c yzC-xs ]
[ zxC-ys yzC+xs z*zC+c ]
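
Both directions can be exercised together as a round trip. The following Python sketch is illustrative; the function names are assumptions, and the two-argument arctangent is called with the sine-like quantity first, as <code>math.atan2</code> expects. When the skew-symmetric part vanishes (θ equal to 0 or π) this simple recovery cannot determine the axis, so the sketch returns <code>None</code> for it.

```python
import math

def axis_angle_to_matrix(u, theta):
    """Rotation matrix for unit axis u = (x, y, z) and angle theta in radians."""
    x, y, z = u
    c, s = math.cos(theta), math.sin(theta)
    C = 1.0 - c
    xs, ys, zs = x*s, y*s, z*s
    xC, yC, zC = x*C, y*C, z*C
    xyC, yzC, zxC = x*yC, y*zC, z*xC
    return [[x*xC + c, xyC - zs, zxC + ys],
            [xyC + zs, y*yC + c, yzC - xs],
            [zxC - ys, yzC + xs, z*zC + c]]

def matrix_to_axis_angle(q):
    """Recover (axis, angle) with the angle in [0, pi]; axis is None if undetermined."""
    x = q[2][1] - q[1][2]
    y = q[0][2] - q[2][0]
    z = q[1][0] - q[0][1]
    r = math.sqrt(x*x + y*y + z*z)            # equals 2 sin(theta)
    t = q[0][0] + q[1][1] + q[2][2]           # equals 1 + 2 cos(theta)
    theta = math.atan2(r, t - 1.0)
    if r == 0.0:
        return None, theta
    return (x / r, y / r, z / r), theta

axis_in, angle_in = (0.0, 0.6, 0.8), 2.0
axis_out, angle_out = matrix_to_axis_angle(axis_angle_to_matrix(axis_in, angle_in))
```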


=== Euler angles ===
Complexity of conversion escalates with Euler angles (used here in the broad sense). The first difficulty is to establish which of the twenty-four variations of Cartesian axis order we will use. Suppose the three angles are &theta;<sub>1</sub>, &theta;<sub>2</sub>, &theta;<sub>3</sub>; physics and chemistry may interpret these as
:<math> Q(\theta_1,\theta_2,\theta_3)= Q_{\bold{z}}(\theta_1) Q_{\bold{y}}(\theta_2) Q_{\bold{z}}(\theta_3) , \,\!</math>
while aircraft dynamics may use
:<math> Q(\theta_1,\theta_2,\theta_3)= Q_{\bold{z}}(\theta_3) Q_{\bold{y}}(\theta_2) Q_{\bold{x}}(\theta_1) . \,\!</math>
One systematic approach begins with choosing the right-most axis. Among all [[permutation]]s of (''x'',''y'',''z''), only two place that axis first; one is an even permutation and the other odd. Choosing parity thus establishes the middle axis. That leaves two choices for the left-most axis, either duplicating the first or not. These three choices give us 3×2×2 = 12 variations; we double that to 24 by choosing static or rotating axes.


The covectors <math>\hat{\mathbf{e}}^i</math> are built out of <math>\hat{\mathbf{e}}_i</math> as:
:<math>
\hat{\mathbf{e}}^i =
\frac{\hat{\mathbf{e}}_j \times \hat{\mathbf{e}}_k}
{\hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_j \times \hat{\mathbf{e}}_k }\, ,
</math>
where each triplet <math>\{i,j,k\}</math> is a cyclic permutation of the triplet <math>\{1,2,3\}</math>. In orthonormal reference frames the vectors <math>\hat{\mathbf{e}}_i</math> coincide with their "co"-counterparts <math>\hat{\mathbf{e}}^i</math>.

This is enough to construct a matrix from angles, but triples differing in many ways can give the same rotation matrix. For example, suppose we use the '''zyz''' convention above; then we have the following equivalent pairs:
:{| style="text-align:right"
| (90°,||45°,||−105°) || ≡ || (−270°,||−315°,||255°) ||   ''multiples of 360°''
|-
| (72°,||0°,||0°) || ≡ || (40°,||0°,||32°) ||   ''singular alignment''
|-
| (45°,||60°,||−30°) || ≡ || (−135°,||−60°,||150°) ||   ''bistable flip''
|}
The problem of singular alignment, the mathematical analog of physical [[gimbal lock]], occurs when the middle rotation aligns the axes of the first and last rotations. It afflicts every axis order at either even or odd multiples of 90°, causing Euler angles to be abandoned for quaternions in many applications. Setting these unavoidable issues aside, angles for any order can be found using a concise common routine ({{Harvnb|Herter|Lott|1993}}; {{Harvnb|Shoemake|1994}}).
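
The three kinds of equivalence can be checked by building the matrices explicitly. The sketch below assumes the '''zyz''' convention ''Q''<sub>'''z'''</sub>(θ<sub>1</sub>) ''Q''<sub>'''y'''</sub>(θ<sub>2</sub>) ''Q''<sub>'''z'''</sub>(θ<sub>3</sub>) with angles in degrees; the function names are hypothetical and chosen for this example.

```python
import math

def rot_z(deg):
    """Rotation about the z axis by the given angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(deg):
    """Rotation about the y axis by the given angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k]*b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(t1, t2, t3):
    """Q_z(t1) Q_y(t2) Q_z(t3), angles in degrees."""
    return matmul(rot_z(t1), matmul(rot_y(t2), rot_z(t3)))

def same_rotation(a, b, tol=1e-12):
    """True if two matrices agree entrywise within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))

pairs = [((90, 45, -105), (-270, -315, 255)),   # multiples of 360 degrees
         ((72, 0, 0), (40, 0, 32)),             # singular alignment
         ((45, 60, -30), (-135, -60, 150))]     # bistable flip
results = [same_rotation(euler_zyz(*p), euler_zyz(*q)) for p, q in pairs]
```

Each entry of <code>results</code> reports whether the corresponding pair of angle triples produces the same matrix within the tolerance.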



== Uniform random rotation matrices ==
We sometimes need to generate a uniformly distributed random rotation matrix. It seems intuitively clear in two dimensions that this means the rotation angle is uniformly distributed between 0 and 2&pi;. That intuition is correct, but does not carry over to higher dimensions. For example, if we decompose 3×3 rotation matrices in axis-angle form, the angle should ''not'' be uniformly distributed; the probability that (the magnitude of) the angle is at most &theta; should be <sup>1</sup>&frasl;<sub>&pi;</sub>(&theta;&nbsp;−&nbsp;sin&nbsp;&theta;), for 0&nbsp;&le;&nbsp;&theta;&nbsp;&le;&nbsp;&pi;.

As a result, the given description of rotation in 3D space by the rotation tensor is invariant with respect to any (orthonormal or not) reference frame. Any "rotation matrix" representation is an "image" of the rotation tensor taken in the corresponding reference frame.


Since SO(''n'') is a connected and locally compact Lie group, we have a simple standard criterion for uniformity, namely that the distribution be unchanged when composed with any arbitrary rotation (a Lie group "translation"). This definition corresponds to what is called ''[[Haar measure]]''. {{Harvtxt|León|Massé|Rivest|2006}} show how to use the Cayley transform to generate and test matrices according to this criterion.
=== Euler Angle representation ===


We can also generate a uniform distribution in any dimension using the ''subgroup algorithm'' of {{Harvtxt|Diaconis|Shahshahani|1987}}. This recursively exploits the nested dimensions group structure of SO(''n''), as follows. Generate a uniform angle and construct a 2×2 rotation matrix. To step from ''n'' to ''n''+1, generate a vector '''v''' uniformly distributed on the ''n-''sphere, ''S''<sup>''n''</sup>, embed the ''n''×''n'' matrix in the next larger size with last column (0,…,0,1), and rotate the larger matrix so the last column becomes '''v'''.
In three dimensions, a rotation can be defined by three [[Euler angles]], <math>(\alpha,\beta,\gamma)</math>. There are a number of possible definitions of the Euler angles. Each may be expressed in terms of a composition of the roll, pitch, and yaw rotations.
The rotation matrix expressed in terms of the "z-x-z" Euler angles, in right-handed [[cartesian coordinates]] may be expressed as:


As usual, we have special alternatives for the 3×3 case. Each of these methods begins with three independent random scalars uniformly distributed on the unit interval. {{Harvtxt|Arvo|1992}} takes advantage of the odd dimension to change a [[Householder reflection]] to a rotation by negation, and uses that to aim the axis of a uniform planar rotation.
:<math> \mathcal{M}(\alpha,\beta,\gamma)=\mathcal{R}_z(\gamma)\mathcal{R}_x(\beta) \mathcal{R}_z(\alpha)</math>


Another method uses unit quaternions. Multiplication of rotation matrices is homomorphic to multiplication of quaternions, and multiplication by a unit quaternion rotates the unit sphere. Since the homomorphism is a local isometry, we immediately conclude that to produce a uniform distribution on SO(3) we may use a uniform distribution on ''S''<sup>3</sup>.
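
A minimal sketch of the quaternion approach follows. It assumes one well-known way of sampling ''S''<sup>3</sup> uniformly, namely normalizing four independent Gaussian samples; this sampling choice and all function names are assumptions of the example rather than the method of any cited source.

```python
import math
import random

def random_unit_quaternion(rng):
    """Uniform point on the 3-sphere, from four normalized Gaussian samples."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        n = math.sqrt(sum(c * c for c in q))
        if n > 1e-12:               # guard against a (vanishingly rare) zero draw
            return tuple(c / n for c in q)

def unit_quaternion_to_matrix(w, x, y, z):
    """Rotation matrix of a unit quaternion w + ix + jy + kz."""
    return [[1 - 2*y*y - 2*z*z, 2*x*y - 2*z*w, 2*x*z + 2*y*w],
            [2*x*y + 2*z*w, 1 - 2*x*x - 2*z*z, 2*y*z - 2*x*w],
            [2*x*z - 2*y*w, 2*y*z + 2*x*w, 1 - 2*x*x - 2*y*y]]

def random_rotation(rng):
    """Random 3x3 rotation matrix drawn via a uniform unit quaternion."""
    return unit_quaternion_to_matrix(*random_unit_quaternion(rng))

rng = random.Random(0)              # fixed seed so the example is repeatable
sample = random_rotation(rng)
```

Since ''q'' and −''q'' give the same matrix, no correction is needed for the two-to-one covering.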
Carrying out the multiplications yields:


Euler angles can also be used, though not with each angle uniformly distributed ({{Harvnb|Murnaghan|1962}}; {{Harvnb|Miles|1965}}).
:<math> \mathcal{M}(\alpha,\beta,\gamma) = \begin{bmatrix}
\cos\alpha \cos\gamma - \cos\beta \sin\alpha \sin\gamma &
-\cos\beta \cos\alpha \sin\gamma - \cos\gamma \sin\alpha &
\sin\gamma \sin\beta
\\
\cos\alpha \sin\gamma + \cos\gamma \cos\beta \sin\alpha &
\cos\alpha \cos\beta \cos\gamma - \sin\alpha \sin\gamma &
-\cos\gamma \sin\beta
\\
\sin\beta \sin\alpha &
\cos\alpha \sin\beta &
\cos\beta
\end{bmatrix} </math>


For the axis-angle form, the axis is uniformly distributed over the unit sphere of directions, ''S''<sup>2</sup>, while the angle has the non-uniform distribution over [0,&pi;] noted previously {{Harv|Miles|1965}}.
Since this rotation matrix is not expressed as a rotation about a single axis, its generator is not as simply expressed as in the above examples.


=== Symmetry Preserving SVD (Singular Value Decomposition) representation ===
For an axis of rotation <math>q</math> and angle of rotation <math>\theta</math>, the rotation matrix is
:<math> \mathcal{M} = qq^T+QGQ^T</math>
where the columns of <math>Q=\begin{bmatrix}q_1, & q_2\end{bmatrix}</math> span the space orthogonal to <math>q</math> and <math>G</math> is the Givens rotation by angle <math>\theta</math>, i.e.
:<math> G = \begin{bmatrix}
\cos\theta & \sin\theta\\
-\sin\theta & \cos\theta
\end{bmatrix}</math>

== References ==
* {{Citation
| last=Arvo
| first=James
| year=1992
| contribution=Fast random rotation matrices
| title=Graphics Gems III
| editor=David Kirk
| publisher=[[Academic Press]] Professional
| place=San Diego
| pages=117–120
| isbn=978-0-12-409671-4
| url=http://www.graphicsgems.org/
}}
* {{Citation
| last=Baker
| first=Andrew
| title=Matrix Groups: An Introduction to Lie Group Theory
| year=2003
| publisher=[[Springer-Verlag|Springer]]
| isbn=978-1-85233-470-3
}}
* {{Citation
| last=Bar-Itzhack
| first=Itzhack Y.
| author-link=
| year=2000
| month=Nov.–Dec.
| title=New method for extracting the quaternion from a rotation matrix
| journal=AIAA Journal of Guidance, Control and Dynamics
| volume=23
| issue=6
| pages=1085–1087 (Engineering Note)
| issn=0731-5090
}}
* {{Citation
| last1=Björck
| first1=A.
| author1-link=
| last2=Bowie
| first2=C.
| author2-link=
| year=1971
| month=June
| title=An iterative algorithm for computing the best estimate of an orthogonal matrix
| journal=[[SIAM]] Journal on Numerical Analysis
| volume=8
| issue=2
| pages=358–364
| issn=0036-1429
| doi=
}}
* {{Citation
| last=Cayley
| first=Arthur
| author-link=Arthur Cayley
| year=1846
| title=Sur quelques propriétés des déterminants gauches
| journal=Journal für die Reine und Angewandte Mathematik ([[Crelle's Journal]]),
| volume=32
| pages=119–123
| issn=0075-4102
}}; reprinted as article 52 in {{Citation
| last=Cayley
| first=Arthur
| author-link=Arthur Cayley
| year=1889
| title=The collected mathematical papers of Arthur Cayley
| publisher=[[Cambridge University Press]]
| volume=I (1841–1853)
| pages=332–336
| isbn=<!-- none given -->
| url=http://www.hti.umich.edu/cgi/t/text/pageviewer-idx?c=umhistmath;cc=umhistmath;rgn=full%20text;idno=ABS3153.0001.001;didno=ABS3153.0001.001;view=image;seq=00000349
}}
* {{Citation
| last1=Diaconis
| first1=Persi
| author1-link=Persi Diaconis
| last2=Shahshahani
| first2=Mehrdad
| title=The subgroup algorithm for generating uniform random variables
| journal=Probability in the Engineering and Informational Sciences
| volume=1
| pages=15–32
| date=1987
| issn=0269-9648
}}
* {{Citation
| last=Engø
| first=Kenth
| author-link=
| year=2001
| month=June
| title=On the BCH-formula in '''so'''(3)
| journal=BIT Numerical Mathematics
| volume=41
| number=3
| pages=629–632
| issn=0006-3835
| doi=10.1023/A:1021979515229
| url=http://www.ii.uib.no/publikasjoner/texrap/abstract/2000-201.html
}}
* {{Citation
| last1=Fan
| first1=Ky
| author1-link=
| last2=Hoffman
| first2=Alan J.
| author2-link=
| year=1955
| month=February
| title=Some metric inequalities in the space of matrices
| journal=[[Proceedings of the American Mathematical Society|Proc. AMS]]
| volume=6
| issue=1
| pages=111–116
| issn=0002-9939
| doi=10.2307/2032662
}}
* {{Citation
| last1=Fulton
| first1=William
| author1-link=
| last2=Harris
| first2=Joe
| author2-link=
| year=1991
| title=Representation theory: a first course
<!--
| chapter=Spin Representations of '''so'''<sub>''m''</sub>&nbsp;'''C'''
| pages=299–315
-->
| publisher=[[Springer-Verlag|Springer]]
| place=New York
| isbn=0-387-97495-4
}} ([[Graduate Texts in Mathematics|GTM]] 129)
* {{Citation
| last1=Goldstein
| first1=Herbert
| author1-link=Herbert Goldstein
| last2=Poole
| first2=Charles P.
| author2-link=
| last3=Safko
| first3=John L.
| author3-link=
| year=2002<!-- January 15 -->
| title=Classical Mechanics
| edition=third
| publisher=[[Addison Wesley]]
| isbn=978-0-201-65702-9
}}
* {{Citation
| last=Hall
| first=Brian C.
| title=Lie Groups, Lie Algebras, and Representations: An Elementary Introduction
| year=2004
| publisher=[[Springer-Verlag|Springer]]
| isbn=978-0-387-40122-5
}} ([[Graduate Texts in Mathematics|GTM]] 222)
* {{Citation
| last1=Herter
| first1=Thomas
| author1-link=
| last2=Lott
| first2=Klaus
| author2-link=
| year=1993
| month=September–October
| title=Algorithms for decomposing 3-D orthogonal matrices into primitive rotations
| journal=Computers & Graphics
| volume=17
| number=5
| pages=517–527
| issn=0097-8493
}}
* {{Citation
| last=Higham
| first=Nicholas J.
| author-link=
| year=1989
| month=October 1
| contribution=Matrix nearness problems and applications
| title=Applications of Matrix Theory
| editor1-last=Gover
| editor1-first=M. J. C.
| editor2-last=Barnett
| editor2-first=S.
| pages=1–27
| publisher=[[Oxford University Press]]
| isbn=978-0-19-853625-3
| url=http://www.maths.manchester.ac.uk/~higham/pap-misc.html
}}
* {{Citation
| last1=León
| first1=Carlos A.
| author1-link=
| last2=Massé
| first2=Jean-Claude
| author2-link=
| last3=Rivest
| first3=Louis-Paul
| author3-link=
| year=2006
| month=February
| title=A statistical model for random rotations
| journal=Journal of Multivariate Analysis
| volume=97
| number=2
| pages=412–430
| issn=0047-259X
| doi=10.1016/j.jmva.2005.03.009
| url=http://archimede.mat.ulaval.ca/pages/lpr/
}}
* {{Citation
| last=Miles
| first=R. E.
| author-link=
| year=1965
| month=December
| title=On random rotations in ''R''<sup>3</sup>
| journal=[[Biometrika]]
| volume=52
| number=3/4
| pages=636–639
| issn=0006-3444
| doi=10.2307/2333716
}}
* {{Citation
| last=Murnaghan
| first=Francis D.
| author-link=<!-- [[Frank Murnaghan]] -->
| year=1962
| title=The Unitary and Rotation Groups
| series=Lectures on applied mathematics
| publisher=Spartan Books
| place=Washington
| isbn=<!-- none -->
}}
* {{Citation
| last=Prentice
| first=Michael J.
| author-link=
| year=1986
| title=Orientation statistics without parametric assumptions
| journal=Journal of the Royal Statistical Society. Series B (Methodological)
| volume=48
| number=2
| pages=214–222
| issn=0035-9246
}}
* {{Citation
| last=Shepperd
| first=Stanley W.
| author-link=
| year=1978
| month=May–June
| title=Quaternion from rotation matrix
| journal=AIAA Journal of Guidance, Control and Dynamics
| volume=1
| issue=3
| pages=223–224
| issn=0731-5090
}}
* {{Citation
| last=Shoemake
| first=Ken
| author-link=
| year=1994
| contribution=Euler angle conversion
| title=Graphics Gems IV
| editor=Paul Heckbert
| publisher=[[Academic Press]] Professional
| place=San Diego
| pages=222–229
| isbn=978-0-12-336155-4
| url=http://www.graphicsgems.org/
}}
* {{Citation
| last=Stuelpnagel
| first=John
| author-link=
| year=1964
| month=October
| title=On the parameterization of the three-dimensional rotation group
| journal=SIAM Review
| volume=6
| number=4
| pages=422–430
| issn=0036-1445
}} (Also [http://ntrs.nasa.gov/search.jsp NASA-CR-53568].)
* {{Citation
| last=Varadarajan
| first=V. S.
| title=Lie Groups, Lie Algebras, and Their Representations
| year=1984
| publisher=[[Springer-Verlag|Springer]]
| isbn=978-0-387-90969-1
}} ([[Graduate Texts in Mathematics|GTM]] 102)
* {{Citation
| last=Wedderburn
| first=J. H. M.
| author-link=Joseph Wedderburn
| year=1934
| title=Lectures on Matrices
| publisher=[[American Mathematical Society|AMS]]
| isbn=978-0-8218-3204-2
| url=http://www.ams.org/online_bks/coll17/
}}


==See also==


==External links==
*[http://tools.wikimedia.de/~dschwen/tools/rotationmatrix.html Rotation matrix calculator]
* [http://mathworld.wolfram.com/RotationMatrix.html Rotation matrices at Mathworld]
* [http://www.mathaware.org/mam/00/master/dimension/demos/plane-rotate.html Math Awareness Month 2000 interactive demo] (requires [[Java (programming language)|Java]])


[[Category:Rotational symmetry]]

Revision as of 08:11, 30 August 2007

In matrix theory, a rotation matrix is a real square matrix whose transpose is its inverse and whose determinant is +1.
:<math>\begin{align}
Q^{T}Q &{}= I = Q Q^{T} \\
\det Q &{}= +1
\end{align}</math>

In other words, it is a real special orthogonal matrix. The name refers to the fact that an n×n rotation matrix corresponds to a geometric rotation about a fixed origin in an n-dimensional Euclidean space, or equivalently, to a rotation of an n-dimensional real vector space equipped with a Euclidean inner product. For example, the 3×3 rotation matrix
:<math> Q = \begin{bmatrix} 0.6 & -0.8 & 0 \\ 0.8 & 0.6 & 0 \\ 0 & 0 & 1 \end{bmatrix} </math>

corresponds to a rotation of approximately 53° around the z axis in three-dimensional space.

Examples

Geometry

In Euclidean geometry, a rotation is an example of an isometry, a transformation that moves points without changing the distances between them. Rotations are distinguished from other isometries by two additional properties: they leave (at least) one point fixed, and they leave "handedness" unchanged. By contrast, a translation moves every point, a reflection exchanges left- and right-handed ordering, and a glide reflection does both.

If we take the fixed point as the origin of a Cartesian coordinate system, then every point can be given coordinates as a displacement from the origin. Thus we may work with the vector space of displacements instead of the points themselves. Now suppose (p<sub>1</sub>,…,p<sub>n</sub>) are the coordinates of the vector p from the origin, O, to point P. Choose an orthonormal basis for our coordinates; then the squared distance to P, by Pythagoras, is
:<math> d^2(O,P) = \| \mathbf{p} \|^2 = \sum_{r=1}^n p_r^2 </math>

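which we can compute using the matrix multiplication
:<math> \| \mathbf{p} \|^2 = \begin{bmatrix}p_1 & \cdots & p_n\end{bmatrix} \begin{bmatrix}p_1 \\ \vdots \\ p_n \end{bmatrix} = \mathbf{p}^T \mathbf{p} . </math>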

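A geometric rotation transforms lines to lines, and preserves ratios of distances between points. From these properties we can show that a rotation is a linear transformation of the vectors, and thus can be written in matrix form, Qp. The fact that a rotation preserves, not just ratios, but distances themselves, we can state as
:<math> \mathbf{p}^T \mathbf{p} = (Q \mathbf{p})^T (Q \mathbf{p}) , </math>
or
:<math>\begin{align}
\mathbf{p}^T I \mathbf{p} &{}= (\mathbf{p}^T Q^T) (Q \mathbf{p}) \\
&{}= \mathbf{p}^T (Q^T Q) \mathbf{p} .
\end{align}</math>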

Because this equation holds for all vectors, p, we conclude that every rotation matrix, Q, satisfies the orthogonality condition,
:<math> Q^T Q = I . </math>

Rotations preserve handedness because they cannot change the ordering of the axes, which implies the special matrix condition,
:<math> \det Q = +1 . </math>

Equally important, we can show that any matrix satisfying these two conditions acts as a rotation.
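
The two conditions are easy to check numerically. The following is a minimal sketch in plain Python, not a library routine; the helper names (`transpose`, `matmul`, `det3`, `is_rotation3`) and the tolerance are assumptions made for this illustration. It tests the 3×3 example matrix from the introduction and a reflection, which is orthogonal but has determinant −1.

```python
import math

def transpose(m):
    """Return the transpose of a square matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_rotation3(q, tol=1e-12):
    """True if q satisfies Q^T Q = I and det Q = +1 within tol (hypothetical helper)."""
    qtq = matmul(transpose(q), q)
    orthogonal = all(
        math.isclose(qtq[r][c], 1.0 if r == c else 0.0, abs_tol=tol)
        for r in range(3) for c in range(3)
    )
    return orthogonal and math.isclose(det3(q), 1.0, abs_tol=tol)

Q = [[0.6, -0.8, 0.0], [0.8, 0.6, 0.0], [0.0, 0.0, 1.0]]
REFLECTION = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
print(is_rotation3(Q), is_rotation3(REFLECTION))
```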

Multiplication

The inverse of a rotation matrix is its transpose, which is also a rotation matrix:
:<math>\begin{align}
(Q^T)^T (Q^T) &{}= Q Q^T = I \\
\det Q^T &{}= \det Q = +1
\end{align}</math>

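The product of two rotation matrices is a rotation matrix:
:<math>\begin{align}
(Q_1 Q_2)^T (Q_1 Q_2) &{}= Q_2^T (Q_1^T Q_1) Q_2 = I \\
\det (Q_1 Q_2) &{}= (\det Q_1) (\det Q_2) = +1
\end{align}</math>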

For n greater than 2, multiplication of n×n rotation matrices is not commutative.

Noting that any identity matrix is a rotation matrix, and that matrix multiplication is associative, we may summarize all these properties by saying that the n×n rotation matrices form a group, which for n > 2 is non-abelian. Called a special orthogonal group, and denoted by SO(n), SO(n,R), SO<sub>n</sub>, or SO<sub>n</sub>(R), the group of n×n rotation matrices is isomorphic to the group of rotations in an n-dimensional space. This means that multiplication of rotation matrices corresponds to composition of rotations.
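
As a small numerical illustration of commutativity in the plane and its failure in space, the sketch below multiplies two planar rotations in both orders, then two quarter turns about the x and z axes in both orders. The function names and the chosen angles are assumptions made for this example only.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rot2(t):
    """2x2 rotation by angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]

def rot_x(t):
    """3x3 rotation by angle t about the x axis."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    """3x3 rotation by angle t about the z axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def max_abs_diff(a, b):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Planar rotations commute, so the gap is zero up to rounding.
planar_gap = max_abs_diff(matmul(rot2(0.3), rot2(1.1)), matmul(rot2(1.1), rot2(0.3)))

# Quarter turns about x and z do not commute, so the gap is of order one.
quarter = math.pi / 2
spatial_gap = max_abs_diff(matmul(rot_x(quarter), rot_z(quarter)),
                           matmul(rot_z(quarter), rot_x(quarter)))
print(planar_gap, spatial_gap)
```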

Ambiguities

The interpretation of a rotation matrix can be subject to many ambiguities.

Alias and alibi rotations
Positive or negative sense
A positive rotation can mean clockwise or the opposite.
Row or column vectors
A square matrix can multiply a column vector or a row vector.
Alias or alibi transformation
The change in a vector's coordinates can indicate a turn of the coordinate system (alias) or a turn of the vector (alibi).
Right- or left-handed coordinates
The matrix can be with respect to a right-handed or left-handed coordinate system.
Row- or column-major storage
Matrix elements may be stored in computer memory in either row-major order or column-major order, depending on the programming language and API.
World or body axes
The coordinate axes can be fixed or rotate with a body.
Cartesian or homogeneous representation
Homogeneous coordinates carry an extra dimension compared to Cartesian coordinates to allow more flexibility.
Vectors or forms
The vector space has a dual space of linear forms, and the matrix can act on either vectors or forms.

In most cases the effect of the ambiguity is to transpose or invert the matrix.

Decompositions

Independent planes

Consider the 3×3 rotation matrix

If Q acts in a certain direction, v, purely as a scaling by a factor λ, then we have
:<math> Q \mathbf{v} = \lambda \mathbf{v} , </math>

so that
:<math> \mathbf{0} = (\lambda I - Q) \mathbf{v} . </math>

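Thus λ is a root of the characteristic polynomial for Q,
:<math> 0 = \det (\lambda I - Q) = (\lambda - 1)(\lambda^2 - 2 \cos\theta \, \lambda + 1) . </math>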

Two features are noteworthy. First, one of the roots (or eigenvalues) is 1, which tells us that some direction is unaffected by the matrix. For rotations in three dimensions, this is the axis of the rotation (a concept that has no meaning in any other dimension). Second, the other two roots are a pair of complex conjugates, whose product is 1 (the constant term of the quadratic), and whose sum is 2 cos θ (the negated linear term). This factorization is of interest for 3×3 rotation matrices because the same thing occurs for all of them. (As special cases, for a null rotation the "complex conjugates" are both 1, and for a 180° rotation they are both −1.) Furthermore, a similar factorization holds for any n×n rotation matrix. If the dimension, n, is odd, there will be a "dangling" eigenvalue of 1; and for any dimension the rest of the polynomial factors into quadratic terms like the one here (with the two special cases noted). We are guaranteed that the characteristic polynomial will have degree n and thus n eigenvalues. And since a rotation matrix commutes with its transpose, it is a normal matrix, so can be diagonalized. We conclude that every rotation matrix, when expressed in a suitable coordinate system, partitions into independent rotations of two-dimensional subspaces, at most n/2 of them.

The sum of the entries on the main diagonal of a matrix is called the trace; it does not change if we reorient the coordinate system, and always equals the sum of the eigenvalues. This has the convenient implication for 2×2 and 3×3 rotation matrices that the trace reveals the angle of rotation, θ, in the two-dimensional (sub-)space. For a 2×2 matrix the trace is 2 cos(θ), and for a 3×3 matrix it is 1+2 cos(θ). In the three-dimensional case, the subspace consists of all vectors perpendicular to the rotation axis (the invariant direction, with eigenvalue 1). Thus we can extract from any 3×3 rotation matrix a rotation axis and an angle, and these completely determine the rotation.
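
The trace relations above translate directly into code. This is a minimal sketch, assuming the matrix is already a valid 2×2 or 3×3 rotation; the function name is hypothetical. Because only the cosine is used, the result is the unsigned angle in the range 0 to π.

```python
import math

def rotation_angle_from_trace(q):
    """Unsigned rotation angle (radians) of a 2x2 or 3x3 rotation matrix."""
    n = len(q)
    if n not in (2, 3):
        raise ValueError("only 2x2 and 3x3 matrices are handled in this sketch")
    trace = sum(q[i][i] for i in range(n))
    # Trace is 2 cos(theta) for 2x2 and 1 + 2 cos(theta) for 3x3.
    cos_theta = trace / 2.0 if n == 2 else (trace - 1.0) / 2.0
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

Q = [[0.6, -0.8, 0.0], [0.8, 0.6, 0.0], [0.0, 0.0, 1.0]]
angle_degrees = math.degrees(rotation_angle_from_trace(Q))
print(angle_degrees)  # approximately 53 degrees
```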

Sequential angles

The constraints on a 2×2 rotation matrix imply that it must have the form
:<math> Q = \begin{bmatrix} a & -b \\ b & a \end{bmatrix} </math>

with a<sup>2</sup>+b<sup>2</sup> = 1. Therefore we may set a = cos θ and b = sin θ, for some angle θ. To solve for θ it is not enough to look at a alone or b alone; we must consider both together to place the angle in the correct quadrant, using a two-argument arctangent function.
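
A short sketch of the quadrant issue, assuming the matrix has the form with a in the top-left entry and b in the bottom-left entry; the function name is hypothetical. Python's `math.atan2(y, x)` takes the sine-like argument first.

```python
import math

def angle_of_2x2(q):
    """Signed rotation angle (radians) of a 2x2 rotation matrix [[a, -b], [b, a]]."""
    a = q[0][0]  # cos(theta)
    b = q[1][0]  # sin(theta)
    return math.atan2(b, a)

# A rotation in the third quadrant: cos = -0.6, sin = -0.8.
Q = [[-0.6, 0.8], [-0.8, -0.6]]
signed = math.degrees(angle_of_2x2(Q))
from_cosine_only = math.degrees(math.acos(Q[0][0]))
print(signed, from_cosine_only)
```

Using the cosine alone returns an angle of the same magnitude but the wrong sign here, which is the ambiguity the two-argument arctangent removes.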

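Now consider the first column of a 3×3 rotation matrix,
:<math> \begin{bmatrix} a \\ b \\ c \end{bmatrix} . </math>

Although a<sup>2</sup>+b<sup>2</sup> will probably not equal 1, but some value r<sup>2</sup> < 1, we can use a slight variation of the previous computation to find a so-called Givens rotation that transforms the column to
:<math> \begin{bmatrix} r \\ 0 \\ c \end{bmatrix} , </math>

zeroing b. This acts on the subspace spanned by the x and y axes. We can then repeat the process for the xz subspace to zero c. Acting on the full matrix, these two rotations produce the schematic form
:<math> Q_{xz} Q_{xy} Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \ast & \ast \\ 0 & \ast & \ast \end{bmatrix} . </math>

Shifting attention to the second column, a Givens rotation of the yz subspace can now zero the z value. This brings the full matrix to the form
:<math> Q_{yz} Q_{xz} Q_{xy} Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} , </math>

which is an identity matrix. Thus we have decomposed Q as
:<math> Q = Q_{xy}^{-1} Q_{xz}^{-1} Q_{yz}^{-1} . </math>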

An n×n rotation matrix will have (n−1)+(n−2)+⋯+2+1, or
:<math> \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2} </math>

entries below the diagonal to zero. We can zero them by extending the same idea of stepping through the columns with a series of rotations in a fixed sequence of planes. We conclude that the set of n×n rotation matrices, each of which has n<sup>2</sup> entries, can be parameterized by n(n−1)/2 angles.
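
The column-zeroing step can be sketched as follows. This is an illustration under stated assumptions: the helper names `givens` and `count_angles` are hypothetical, the sample column is an arbitrary unit vector chosen for the example, and the 2×2 rotation is applied in the form [[c, s], [−s, c]].

```python
import math

def givens(a, b):
    """Return (c, s, r) such that [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0, 0.0  # nothing to rotate
    return a / r, b / r, r

def count_angles(n):
    """Number of below-diagonal entries, hence angles, for an n x n rotation."""
    return n * (n - 1) // 2

# Hypothetical first column of a 3x3 rotation matrix (unit length).
column = [0.36, -0.8, 0.48]

# Rotate in the xy plane to zero the y entry.
c1, s1, _ = givens(column[0], column[1])
after_xy = [c1 * column[0] + s1 * column[1],
            -s1 * column[0] + c1 * column[1],
            column[2]]

# Rotate in the xz plane to zero the z entry.
c2, s2, _ = givens(after_xy[0], after_xy[2])
after_xz = [c2 * after_xy[0] + s2 * after_xy[2],
            after_xy[1],
            -s2 * after_xy[0] + c2 * after_xy[2]]
print(after_xz, count_angles(3))
```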

xzxw xzyw xyxw xyzw
yxyw yxzw yzyw yzxw
zyzw zyxw zxzw zxyw
xzxb yzxb xyxb zyxb
yxyb zxyb yzyb xzyb
zyzb xyzb zxzb yxzb

In three dimensions this restates in matrix form an observation made by Euler, so mathematicians call the ordered sequence of three angles Euler angles. However, the situation is somewhat more complicated than we have so far indicated. Despite the small dimension, we actually have considerable freedom in the sequence of axis pairs we use; and we also have some freedom in the choice of angles. Thus we find many different conventions employed when three-dimensional rotations are parameterized for physics, or medicine, or chemistry, or other disciplines. When we include the option of world axes or body axes, 24 different sequences are possible. And while some disciplines call any sequence Euler angles, others give different names (Euler, Cardano, Tait–Bryan, roll-pitch-yaw) to different sequences.

One reason for the large number of options is that, as noted previously, rotations in three dimensions (and higher) do not commute. If we reverse a given sequence of rotations, we get a different outcome. This also implies that we cannot compose two rotations by adding their corresponding angles. Thus Euler angles are not vectors, despite a similarity in appearance as a triple of numbers.

Nested dimensions

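A 3×3 rotation matrix like
:<math> Q_{3 \times 3} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} </math>

suggests a 2×2 rotation matrix,
:<math> Q_{2 \times 2} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} , </math>

is embedded in the upper left corner:
:<math> Q_{3 \times 3} = \left[ \begin{matrix} Q_{2 \times 2} & \mathbf{0} \\ \mathbf{0}^T & 1 \end{matrix} \right] . </math>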

This is no illusion; not just one, but many, copies of n-dimensional rotations are found within (n+1)-dimensional rotations, as subgroups. Each embedding leaves one direction fixed, which in the case of 3×3 matrices is the rotation axis. For example, we have
:<math>\begin{align}
Q_{\mathbf{x}}(\theta) &{}= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix} , \\
Q_{\mathbf{y}}(\theta) &{}= \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix} , \\
Q_{\mathbf{z}}(\theta) &{}= \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} ,
\end{align}</math>

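fixing the x axis, the y axis, and the z axis, respectively. The rotation axis need not be a coordinate axis; if u = (x,y,z) is a unit vector in the desired direction, then
:<math>\begin{align}
Q_{\mathbf{u}}(\theta) &{}= \begin{bmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix} s_\theta + (I - \mathbf{u}\mathbf{u}^T) c_\theta + \mathbf{u}\mathbf{u}^T \\
&{}= \begin{bmatrix}
x^2 (1-c_\theta) + c_\theta & xy (1-c_\theta) - z s_\theta & xz (1-c_\theta) + y s_\theta \\
xy (1-c_\theta) + z s_\theta & y^2 (1-c_\theta) + c_\theta & yz (1-c_\theta) - x s_\theta \\
xz (1-c_\theta) - y s_\theta & yz (1-c_\theta) + x s_\theta & z^2 (1-c_\theta) + c_\theta
\end{bmatrix} ,
\end{align}</math>

where c<sub>θ</sub> = cos θ, s<sub>θ</sub> = sin θ, is a rotation by angle θ leaving axis u fixed.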

A direction in (n+1)-dimensional space will be a unit magnitude vector, which we may consider a point on a generalized sphere, S<sup>n</sup>. Thus it is natural to describe the rotation group SO(n+1) as combining SO(n) and S<sup>n</sup>. A suitable formalism is the fiber bundle,
:<math> SO(n) \hookrightarrow SO(n+1) \to S^n , </math>

where for every direction in the "base space", S<sup>n</sup>, the "fiber" over it in the "total space", SO(n+1), is a copy of the "fiber space", SO(n), namely the rotations that keep that direction fixed.

Thus we can build an n×n rotation matrix by starting with a 2×2 matrix, aiming its fixed axis on S<sup>2</sup> (the ordinary sphere in three-dimensional space), aiming the resulting rotation on S<sup>3</sup>, and so on up through S<sup>n−1</sup>. A point on S<sup>n</sup> can be selected using n numbers, so we again have n(n−1)/2 numbers to describe any n×n rotation matrix.

In fact, we can view the sequential angle decomposition, discussed previously, as reversing this process. The composition of n−1 Givens rotations brings the first column (and row) to (1,0,…,0), so that the remainder of the matrix is a rotation matrix of dimension one less, embedded so as to leave (1,0,…,0) fixed.

Skew parameters

When an n×n rotation matrix, Q, does not include −1 as an eigenvalue, so that none of the planar rotations of which it is composed are 180° rotations, then Q+I is an invertible matrix. Most rotation matrices fit this description, and for them we can show that (Q−I)(Q+I)<sup>−1</sup> is a skew-symmetric matrix, A. Thus A<sup>T</sup> = −A; and since the diagonal is necessarily zero, and since the upper triangle determines the lower one, A contains n(n−1)/2 independent numbers. Conveniently, I−A is invertible whenever A is skew-symmetric; thus we can recover the original matrix using the Cayley transform,
:<math> A \mapsto (I+A)(I-A)^{-1} , </math>

which maps any skew-symmetric matrix A to a rotation matrix. In fact, aside from the noted exceptions, we can produce any rotation matrix in this way. Although in practical applications we can hardly afford to ignore 180° rotations, the Cayley transform is still a potentially useful tool, giving a parameterization of most rotation matrices without trigonometric functions.

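In three dimensions, for example, we have (Cayley 1846)
:<math>
\begin{bmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix} \mapsto
\frac{1}{1+x^2+y^2+z^2}
\begin{bmatrix}
1+x^2-y^2-z^2 & 2xy-2z & 2y+2xz \\
2xy+2z & 1-x^2+y^2-z^2 & 2yz-2x \\
2xz-2y & 2x+2yz & 1-x^2-y^2+z^2
\end{bmatrix} .
</math>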

If we condense the skew entries into a vector, (x,y,z), then we produce a 90° rotation around the x axis for (1,0,0), around the y axis for (0,1,0), and around the z axis for (0,0,1). The 180° rotations are just out of reach; for, in the limit as x goes to infinity, (x,0,0) does approach a 180° rotation around the x axis, and similarly for other directions.
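
The three-dimensional case can be tried numerically. The sketch below assumes the skew-symmetric matrix [[0, −z, y], [z, 0, −x], [−y, x, 0]] and the transform (I+A)(I−A)<sup>−1</sup>, written out in closed form; the function names are hypothetical. It uses no trigonometric functions to build the matrix; the arccosine appears only to report the resulting angle.

```python
import math

def cayley3(x, y, z):
    """Rotation matrix from the skew parameters (x, y, z), in closed form."""
    k = 1.0 + x * x + y * y + z * z
    return [
        [(1 + x * x - y * y - z * z) / k, (2 * x * y - 2 * z) / k, (2 * y + 2 * x * z) / k],
        [(2 * x * y + 2 * z) / k, (1 - x * x + y * y - z * z) / k, (2 * y * z - 2 * x) / k],
        [(2 * x * z - 2 * y) / k, (2 * x + 2 * y * z) / k, (1 - x * x - y * y + z * z) / k],
    ]

def angle_degrees(q):
    """Unsigned rotation angle of a 3x3 rotation matrix, from its trace."""
    cos_theta = (q[0][0] + q[1][1] + q[2][2] - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

quarter_turn_z = cayley3(0.0, 0.0, 1.0)      # 90 degrees about z
nearly_half_turn = cayley3(1.0e6, 0.0, 0.0)  # approaches 180 degrees about x
print(quarter_turn_z, angle_degrees(nearly_half_turn))
```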

Lie theory

Lie group

We have established that n×n rotation matrices form a group, the special orthogonal group, SO(n). This algebraic structure is coupled with a topological structure, in that the operations of multiplication and taking the inverse (which here is merely transposition) are continuous functions of the matrix entries. Thus SO(n) is a classic example of a topological group. (In purely topological terms, it is a compact manifold.) Furthermore, the operations are not only continuous, but smooth, so SO(n) is a differentiable manifold and a Lie group (Baker (2003); Fulton & Harris (1991)).

Most properties of rotation matrices depend very little on the dimension, n; yet in Lie group theory we see systematic differences between even dimensions and odd dimensions. As well, there are some irregularities below n = 5; for example, SO(4) is, anomalously, not a simple Lie group: its Lie algebra splits into two copies of that of SO(3), and as a manifold (though not as a group) SO(4) is the product of S<sup>3</sup> and SO(3).

Lie algebra

Associated with every Lie group is a Lie algebra, a linear space equipped with a bilinear alternating product called a bracket. The algebra for SO(n) is denoted by
:<math> \mathfrak{so}(n) </math>

and consists of all skew-symmetric n×n matrices (as implied by differentiating the orthogonality condition, I = Q<sup>T</sup>Q). The bracket, [A<sub>1</sub>,A<sub>2</sub>], of two skew-symmetric matrices is defined to be A<sub>1</sub>A<sub>2</sub>−A<sub>2</sub>A<sub>1</sub>, which is again a skew-symmetric matrix. This Lie algebra bracket captures the essence of the Lie group product via infinitesimals.

For 2×2 rotation matrices, the Lie algebra is a one-dimensional vector space, multiples of
:<math> \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} . </math>

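Here the bracket always vanishes, which tells us that, in two dimensions, rotations commute. Not so in any higher dimension. For 3×3 rotation matrices, we have a three-dimensional vector space with the convenient basis (generators)
:<math>
A_{\mathbf{x}} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} , \quad
A_{\mathbf{y}} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix} , \quad
A_{\mathbf{z}} = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} .
</math>

The essence of the bracket for these basis vectors works out to be as follows.
:<math> [A_{\mathbf{x}}, A_{\mathbf{y}}] = A_{\mathbf{z}} , \quad [A_{\mathbf{z}}, A_{\mathbf{x}}] = A_{\mathbf{y}} , \quad [A_{\mathbf{y}}, A_{\mathbf{z}}] = A_{\mathbf{x}} </math>

We can conveniently identify any matrix in this Lie algebra with a vector in R<sup>3</sup>,
:<math> \boldsymbol{\omega} = (x,y,z) \in \mathbb{R}^3 \quad \leftrightarrow \quad \tilde{\boldsymbol{\omega}} = x A_{\mathbf{x}} + y A_{\mathbf{y}} + z A_{\mathbf{z}} = \begin{bmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix} . </math>

Under this identification, the so(3) bracket has a memorable description; it is the vector cross product,
:<math> [\tilde{\mathbf{u}}, \tilde{\mathbf{v}}] = \widetilde{\mathbf{u} \times \mathbf{v}} . </math>

The matrix identified with a vector v is also memorable, because
:<math> \tilde{\mathbf{v}} \mathbf{u} = \mathbf{v} \times \mathbf{u} . </math>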

Notice this implies that v is in the null space of the skew-symmetric matrix with which it is identified, because v×v is always the zero vector.
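
These identities can be verified with integer arithmetic. The sketch below assumes the identification of (x, y, z) with the skew-symmetric matrix [[0, −z, y], [z, 0, −x], [−y, x, 0]]; the names `hat`, `cross`, `matvec`, `matmul` and `bracket` are hypothetical helpers for this illustration.

```python
def hat(v):
    """Skew-symmetric matrix identified with the vector v = (x, y, z)."""
    x, y, z = v
    return [[0, -z, y], [z, 0, -x], [-y, x, 0]]

def cross(u, v):
    """Vector cross product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matvec(m, v):
    """Matrix times column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def bracket(a, b):
    """Matrix commutator ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(ab, ba)]

u, v = [1, 2, 3], [4, 5, 6]
print(matvec(hat(u), v), cross(u, v))
print(bracket(hat(u), hat(v)) == hat(cross(u, v)))
```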

Exponential map

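Connecting the Lie algebra to the Lie group is the exponential map, which we define using the familiar power series for e<sup>x</sup> (Wedderburn 1934, §8.02),
:<math> \exp(A) = I + A + \tfrac{1}{2!} A^2 + \tfrac{1}{3!} A^3 + \cdots = \sum_{k=0}^{\infty} \frac{1}{k!} A^k . </math>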

For any skew-symmetric A, exp(A) is always a rotation matrix.

An important practical example is the 3×3 case, where we have seen we can identify every skew-symmetric matrix with a vector ω = uθ, where u = (x,y,z) is a unit magnitude vector. Recall that u is in the null space of the matrix associated with ω, so that if we use a basis with u as the z axis the final column and row will be zero. Thus we know in advance that the exponential matrix must leave u fixed. It is mathematically impossible to supply a straightforward formula for such a basis as a function of u (its existence would violate the hairy ball theorem), but direct exponentiation is possible, and yields
:<math>\begin{align}
\exp \left( \theta \begin{bmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix} \right)
&{}= \begin{bmatrix}
2(x^2-1)s^2+1 & 2xys^2-2zcs & 2xzs^2+2ycs \\
2xys^2+2zcs & 2(y^2-1)s^2+1 & 2yzs^2-2xcs \\
2xzs^2-2ycs & 2yzs^2+2xcs & 2(z^2-1)s^2+1
\end{bmatrix} ,
\end{align}</math>

where c = cos(θ/2), s = sin(θ/2). We recognize this as our matrix for a rotation around axis u by angle θ. We also note that this mapping of skew-symmetric matrices is quite different from the Cayley transform discussed earlier.

In any dimension, if we choose some nonzero A and consider all its scalar multiples, exponentiation yields rotation matrices along a geodesic of the group manifold, forming a one-parameter subgroup of the Lie group. More broadly, the exponential map provides a homeomorphism between a neighborhood of the origin in the Lie algebra and a neighborhood of the identity in the Lie group. In fact, we can produce any rotation matrix as the exponential of some skew-symmetric matrix, so for these groups the exponential map is a surjection.
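
To see the exponential map at work, the power series can be summed numerically. This is a rough sketch, assuming a fixed number of terms is adequate for small matrices with modest entries; it is not how a numerical library would compute a matrix exponential, and the function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def expm_series(a, terms=30):
    """Sum the first `terms` terms of I + A + A^2/2! + ... for a square matrix a."""
    n = len(a)
    result = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    term = [row[:] for row in result]  # current term A^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, a)]
        result = [[x + y for x, y in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

theta = 0.5
# Skew-symmetric generator of rotations about the z axis, scaled by theta.
A = [[0.0, -theta, 0.0], [theta, 0.0, 0.0], [0.0, 0.0, 0.0]]
Q = expm_series(A)
expected = [[math.cos(theta), -math.sin(theta), 0.0],
            [math.sin(theta), math.cos(theta), 0.0],
            [0.0, 0.0, 1.0]]
print(Q)
```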

Baker–Campbell–Hausdorff formula

Suppose we are given A and B in the Lie algebra. Their exponentials, exp(A) and exp(B), are rotation matrices, which we can multiply. Since the exponential map is a surjection, we know that for some C in the Lie algebra, exp(A)exp(B) = exp(C), and we write
:<math> C = \log ( \exp(A) \exp(B) ) . </math>

When exp(A) and exp(B) commute (which always happens for 2×2 matrices, but not higher), then C = A+B, mimicking the behavior of complex exponentiation. The general case is given by the BCH formula, a series expanded in terms of the bracket (Hall 2004, Ch. 3; Varadarajan 1984, §2.15). For matrices, the bracket is the same operation as the commutator, which detects lack of commutativity in multiplication. The general formula begins as follows.
:<math> C = A + B + \tfrac{1}{2} [A,B] + \tfrac{1}{12} [A,[A,B]] - \tfrac{1}{12} [B,[A,B]] + \cdots </math>

Representation of a rotation matrix as a sequential angle decomposition, as in Euler angles, may tempt us to treat rotations as a vector space, but the higher order terms in the BCH formula reveal that to be a mistake.

We again take special interest in the 3×3 case, where [A,B] equals the cross product, A×B. If A and B are linearly independent, then A, B, and A×B can be used as a basis; if not, then A and B commute. And conveniently, in this dimension the summation in the BCH formula has a closed form (Engø 2001) as αA+βB+γ(A×B).

Spin group

The Lie group of n×n rotation matrices, SO(n), is a compact and path-connected manifold, and thus locally compact and connected. However, it is not simply connected, so Lie theory tells us it is a kind of "shadow" (a homomorphic image) of a universal covering group. Often the covering group, which in this case is the spin group denoted by Spin(n), is simpler and more natural to work with (Baker 2003, Ch. 5; Fulton & Harris 1991, pp. 299–315).

In the case of planar rotations, SO(2) is topologically a circle, S<sup>1</sup>. Its universal covering group, Spin(2), is isomorphic to the real line, R, under addition. In other words, whenever we use angles of arbitrary magnitude, which we often do, we are essentially taking advantage of the convenience of the "mother space". Every 2×2 rotation matrix is produced by a countable infinity of angles, separated by integer multiples of 2π. Correspondingly, the fundamental group of SO(2) is isomorphic to the integers, Z.

In the case of spatial rotations, SO(3) is topologically equivalent to three-dimensional real projective space, RP<sup>3</sup>. Its universal covering group, Spin(3), is isomorphic to the 3-sphere, S<sup>3</sup>. Every 3×3 rotation matrix is produced by two opposite points on the sphere. Correspondingly, the fundamental group of SO(3) is isomorphic to the two-element group, Z<sub>2</sub>. We can also describe Spin(3) as isomorphic to quaternions of unit norm under multiplication, or to certain 4×4 real matrices, or to 2×2 complex special unitary matrices.

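Concretely, a unit quaternion, q, with
:<math>\begin{align}
q &{}= w + x\mathbf{i} + y\mathbf{j} + z\mathbf{k} , \\
1 &{}= w^2 + x^2 + y^2 + z^2 ,
\end{align}</math>

produces the rotation matrix
:<math> Q = \begin{bmatrix}
w^2+x^2-y^2-z^2 & 2xy-2wz & 2xz+2wy \\
2xy+2wz & w^2-x^2+y^2-z^2 & 2yz-2wx \\
2xz-2wy & 2yz+2wx & w^2-x^2-y^2+z^2
\end{bmatrix} . </math>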

This is our third version of this matrix, here as a rotation around non-unit axis vector (x,y,z) by angle 2θ, where cos θ = w and sin θ = ||(x,y,z)||.

Many features of this case are the same for higher dimensions. The coverings are all two-to-one, with SO(n), n > 2, having fundamental group Z<sub>2</sub>. The natural setting for these groups is within a Clifford algebra. And the action of the rotations is produced by a kind of "sandwich", denoted by qvq<sup>∗</sup>.

Infinitesimal rotations

The matrices in the Lie algebra are not themselves rotations; the skew-symmetric matrices are derivatives, proportional differences of rotations. An actual "differential rotation", or infinitesimal rotation matrix has the form
:<math> dA = I + A \, d\theta , </math>

where dθ is vanishingly small. These matrices do not satisfy all the same properties as ordinary finite rotation matrices under the usual treatment of infinitesimals (Goldstein, Poole & Safko 2002, §4.8). To understand what this means, consider
:<math> dA_{\mathbf{x}} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -d\theta \\ 0 & d\theta & 1 \end{bmatrix} . </math>

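We first test the orthogonality condition, Q<sup>T</sup>Q = I. The product is
:<math> dA_{\mathbf{x}}^T \, dA_{\mathbf{x}} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1+d\theta^2 & 0 \\ 0 & 0 & 1+d\theta^2 \end{bmatrix} , </math>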

differing from an identity matrix by second order infinitesimals, which we discard. So to first order, an infinitesimal rotation matrix is an orthogonal matrix. Next we examine the square of the matrix.
:<math> dA_{\mathbf{x}}^2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1-d\theta^2 & -2\,d\theta \\ 0 & 2\,d\theta & 1-d\theta^2 \end{bmatrix} </math>

Again discarding second order effects, we see that the angle simply doubles. This hints at the most essential difference in behavior, which we can exhibit with the assistance of a second infinitesimal rotation,
:<math> dA_{\mathbf{y}} = \begin{bmatrix} 1 & 0 & d\phi \\ 0 & 1 & 0 \\ -d\phi & 0 & 1 \end{bmatrix} . </math>

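Compare the products dA<sub>x</sub>dA<sub>y</sub> and dA<sub>y</sub>dA<sub>x</sub>.
:<math>\begin{align}
dA_{\mathbf{x}} \, dA_{\mathbf{y}} &{}= \begin{bmatrix} 1 & 0 & d\phi \\ d\theta\,d\phi & 1 & -d\theta \\ -d\phi & d\theta & 1 \end{bmatrix} \\
dA_{\mathbf{y}} \, dA_{\mathbf{x}} &{}= \begin{bmatrix} 1 & d\theta\,d\phi & d\phi \\ 0 & 1 & -d\theta \\ -d\phi & d\theta & 1 \end{bmatrix}
\end{align}</math>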

Since dθ dφ is second order, we discard it; thus, to first order, multiplication of infinitesimal rotation matrices is commutative. In fact,
:<math> dA_{\mathbf{x}} \, dA_{\mathbf{y}} = dA_{\mathbf{y}} \, dA_{\mathbf{x}} , </math>

again to first order.

But we must always be careful to distinguish (the first order treatment of) these infinitesimal rotation matrices from both finite rotation matrices and from derivatives of rotation matrices (namely skew-symmetric matrices). Contrast the behavior of finite rotation matrices in the BCH formula with that of infinitesimal rotation matrices, where all the commutator terms will be second order infinitesimals so we do have a vector space.
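
A numerical stand-in for the first-order argument: the sketch below uses small but finite rotations about the x and y axes (not true infinitesimals) and measures how far the two products differ. If the difference is second order, shrinking the angle by a factor of 10 should shrink the gap by about 100. The function names and the chosen angles are assumptions for this illustration.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rot_x(t):
    """3x3 rotation by angle t about the x axis."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(t):
    """3x3 rotation by angle t about the y axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def commutator_gap(eps):
    """Largest entry of rot_x(eps) rot_y(eps) - rot_y(eps) rot_x(eps), in absolute value."""
    xy = matmul(rot_x(eps), rot_y(eps))
    yx = matmul(rot_y(eps), rot_x(eps))
    return max(abs(p - q) for ra, rb in zip(xy, yx) for p, q in zip(ra, rb))

gap_small = commutator_gap(1e-3)
gap_large = commutator_gap(1e-2)
print(gap_small, gap_large, gap_large / gap_small)
```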

Conversions

We have seen the existence of several decompositions that apply in any dimension, namely independent planes, sequential angles, and nested dimensions. In all these cases we can either decompose a matrix or construct one. We have also given special attention to 3×3 rotation matrices, and these warrant further attention, in both directions (Stuelpnagel 1964).

Quaternion

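Rewrite the 3×3 rotation matrix again, as
:<math> Q = \begin{bmatrix}
1 - 2y^2 - 2z^2 & 2xy - 2zw & 2xz + 2yw \\
2xy + 2zw & 1 - 2x^2 - 2z^2 & 2yz - 2xw \\
2xz - 2yw & 2yz + 2xw & 1 - 2x^2 - 2y^2
\end{bmatrix} . </math>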

Now every quaternion component appears multiplied by two in a term of degree two, and if all such terms are zero what's left is an identity matrix. This leads to an efficient, robust conversion from any quaternion — whether unit, nonunit, or even zero — to a 3×3 rotation matrix.

Nq = w^2 + x^2 + y^2 + z^2
if Nq > 0.0 then s = 2/Nq else s = 0.0
X = x*s; Y = y*s; Z = z*s
wX = w*X; wY = w*Y; wZ = w*Z
xX = x*X; xY = x*Y; xZ = x*Z
yY = y*Y; yZ = y*Z; zZ = z*Z
[ 1.0-(yY+zZ)   xY-wZ      xZ+wY    ]
[    xY+wZ   1.0-(xX+zZ)   yZ-wX    ]
[    xZ-wY      yZ+wX   1.0-(xX+yY) ]
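
The same steps can be written as a runnable function. This is a direct transcription of the pseudocode into Python, offered as a sketch; the function name and the list-of-rows return format are choices made for this illustration.

```python
def quaternion_to_matrix(w, x, y, z):
    """3x3 rotation matrix (list of rows) from any quaternion, unit or not."""
    nq = w * w + x * x + y * y + z * z
    s = 2.0 / nq if nq > 0.0 else 0.0  # a zero quaternion yields the identity
    X, Y, Z = x * s, y * s, z * s
    wX, wY, wZ = w * X, w * Y, w * Z
    xX, xY, xZ = x * X, x * Y, x * Z
    yY, yZ, zZ = y * Y, y * Z, z * Z
    return [
        [1.0 - (yY + zZ), xY - wZ, xZ + wY],
        [xY + wZ, 1.0 - (xX + zZ), yZ - wX],
        [xZ - wY, yZ + wX, 1.0 - (xX + yY)],
    ]

# A non-unit quaternion along z with equal w and z: a quarter turn about z.
quarter_turn_z = quaternion_to_matrix(1.0, 0.0, 0.0, 1.0)
print(quarter_turn_z)
```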

Freed from the demand for a unit quaternion, we find that nonzero quaternions act as homogeneous coordinates for 3×3 rotation matrices. The Cayley transform, discussed earlier, is obtained by scaling the quaternion so that its w component is 1. For a 180° rotation around any axis, w will be zero, which explains the Cayley limitation.

The sum of the entries along the main diagonal (the trace), plus one, equals 4−4(x<sup>2</sup>+y<sup>2</sup>+z<sup>2</sup>), which is 4w<sup>2</sup>. Thus we can write the trace itself as 2w<sup>2</sup>+2w<sup>2</sup>−1; and from the previous version of the matrix we see that the diagonal entries themselves have the same form: 2x<sup>2</sup>+2w<sup>2</sup>−1, 2y<sup>2</sup>+2w<sup>2</sup>−1, and 2z<sup>2</sup>+2w<sup>2</sup>−1. So we can easily compare the magnitudes of all four quaternion components using the matrix diagonal. We can, in fact, obtain all four magnitudes using sums and square roots, and choose consistent signs using the skew-symmetric part of the off-diagonal entries.

w = 0.5*sqrt(1+Qxx+Qyy+Qzz)
x = copysign(0.5*sqrt(1+Qxx-Qyy-Qzz),Qzy-Qyz)
y = copysign(0.5*sqrt(1-Qxx+Qyy-Qzz),Qxz-Qzx)
z = copysign(0.5*sqrt(1-Qxx-Qyy+Qzz),Qyx-Qxy)

Alternatively, use a single square root and division

t = Qxx+Qyy+Qzz
r = sqrt(1+t)
s = 0.5/r
w = 0.5*r
x = (Qzy-Qyz)*s
y = (Qxz-Qzx)*s
z = (Qyx-Qxy)*s

This is numerically stable so long as the trace, t, is not negative; otherwise, we risk dividing by (nearly) zero. In that case, suppose Qxx is the largest diagonal entry, so x will have the largest magnitude (the other cases are similar); then the following is safe.

r = sqrt(1+Qxx-Qyy-Qzz)
s = 0.5/r
w = (Qzy-Qyz)*s
x = 0.5*r
y = (Qxy+Qyx)*s
z = (Qzx+Qxz)*s
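
Putting the branches together gives a complete conversion. In the sketch below the first two branches follow the pseudocode above; the branches for the cases where Qyy or Qzz is the largest diagonal entry are filled in by cyclic analogy and are an assumption of this illustration, as is the function name.

```python
import math

def matrix_to_quaternion(q):
    """Return (w, x, y, z) for a 3x3 rotation matrix q given as a list of rows."""
    (qxx, qxy, qxz), (qyx, qyy, qyz), (qzx, qzy, qzz) = q
    t = qxx + qyy + qzz
    if t >= 0.0:  # w is not small, safe to divide by it
        r = math.sqrt(1.0 + t)
        s = 0.5 / r
        return (0.5 * r, (qzy - qyz) * s, (qxz - qzx) * s, (qyx - qxy) * s)
    if qxx >= qyy and qxx >= qzz:  # x has the largest magnitude
        r = math.sqrt(1.0 + qxx - qyy - qzz)
        s = 0.5 / r
        return ((qzy - qyz) * s, 0.5 * r, (qxy + qyx) * s, (qzx + qxz) * s)
    if qyy >= qzz:  # y has the largest magnitude (by analogy)
        r = math.sqrt(1.0 - qxx + qyy - qzz)
        s = 0.5 / r
        return ((qxz - qzx) * s, (qxy + qyx) * s, 0.5 * r, (qyz + qzy) * s)
    # z has the largest magnitude (by analogy)
    r = math.sqrt(1.0 - qxx - qyy + qzz)
    s = 0.5 / r
    return ((qyx - qxy) * s, (qzx + qxz) * s, (qyz + qzy) * s, 0.5 * r)

QUARTER_TURN_Z = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
HALF_TURN_Y = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
print(matrix_to_quaternion(QUARTER_TURN_Z), matrix_to_quaternion(HALF_TURN_Y))
```

Since q and −q give the same rotation, the sign of the result is a convention; this sketch returns the representative whose largest component is positive in the negative-trace branches.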

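If the matrix contains significant error, such as accumulated numerical error, we may construct a symmetric 4×4 matrix,
:<math> K = \frac{1}{3} \begin{bmatrix}
Q_{xx}-Q_{yy}-Q_{zz} & Q_{yx}+Q_{xy} & Q_{zx}+Q_{xz} & Q_{zy}-Q_{yz} \\
Q_{yx}+Q_{xy} & Q_{yy}-Q_{xx}-Q_{zz} & Q_{zy}+Q_{yz} & Q_{xz}-Q_{zx} \\
Q_{zx}+Q_{xz} & Q_{zy}+Q_{yz} & Q_{zz}-Q_{xx}-Q_{yy} & Q_{yx}-Q_{xy} \\
Q_{zy}-Q_{yz} & Q_{xz}-Q_{zx} & Q_{yx}-Q_{xy} & Q_{xx}+Q_{yy}+Q_{zz}
\end{bmatrix} , </math>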

and find the eigenvector, (x,y,z,w), of its largest magnitude eigenvalue. (If Q is truly a rotation matrix, that value will be 1.) The quaternion so obtained will correspond to the rotation matrix closest to the given matrix (Bar-Itzhack 2000).

Polar decomposition

If the n×n matrix M is non-singular, its columns are linearly independent vectors; thus the Gram–Schmidt process can adjust them to be an orthonormal basis. Stated in terms of numerical linear algebra, we convert M to an orthogonal matrix, Q, using QR decomposition. However, we often prefer a Q "closest" to M, which this method does not accomplish. For that, the tool we want is the polar decomposition (Fan & Hoffman 1955; Higham 1989).

To measure closeness, we may use any matrix norm invariant under orthogonal transformations. A convenient choice is the Frobenius norm, ||Q−M||<sub>F</sub>, squared, which is the sum of the squares of the element differences. Writing this in terms of the trace, Tr, our goal is,

  • Find Q minimizing Tr( (Q−M)<sup>T</sup>(Q−M) ), subject to Q<sup>T</sup>Q = I.

Though written in matrix terms, the objective function is just a quadratic polynomial. We can minimize it in the usual way, by finding where its derivative is zero. For a 3×3 matrix, the orthogonality constraint implies six scalar equalities that the entries of Q must satisfy. To incorporate the constraint(s), we may employ a standard technique, Lagrange multipliers, assembled as a symmetric matrix, Y. Thus our method is:

  • Differentiate Tr( (Q−M)<sup>T</sup>(Q−M) + (Q<sup>T</sup>Q−I)Y ) with respect to (the entries of) Q, and equate to zero.

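Consider a 2×2 example. Including constraints, we seek to minimize
:<math>\begin{align}
&{} (Q_{xx}-M_{xx})^2 + (Q_{xy}-M_{xy})^2 + (Q_{yx}-M_{yx})^2 + (Q_{yy}-M_{yy})^2 \\
&{} + (Q_{xx}^2+Q_{yx}^2-1)Y_{xx} + (Q_{xy}^2+Q_{yy}^2-1)Y_{yy} + 2(Q_{xx}Q_{xy}+Q_{yx}Q_{yy})Y_{xy} .
\end{align}</math>

Taking the derivative with respect to Q<sub>xx</sub>, Q<sub>xy</sub>, Q<sub>yx</sub>, Q<sub>yy</sub> in turn, we assemble a matrix.
:<math> 2 \begin{bmatrix}
Q_{xx}-M_{xx} + Q_{xx}Y_{xx} + Q_{xy}Y_{xy} & Q_{xy}-M_{xy} + Q_{xx}Y_{xy} + Q_{xy}Y_{yy} \\
Q_{yx}-M_{yx} + Q_{yx}Y_{xx} + Q_{yy}Y_{xy} & Q_{yy}-M_{yy} + Q_{yx}Y_{xy} + Q_{yy}Y_{yy}
\end{bmatrix} </math>

In general, we obtain the equation
:<math> 0 = 2(Q-M) + 2QY , </math>

so that
:<math> M = Q(I+Y) = QS , </math>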

where Q is orthogonal and S is symmetric. To ensure a minimum, the Y matrix (and hence S) must be positive definite. Linear algebra calls QS the polar decomposition of M, with S the positive square root of S<sup>2</sup> = M<sup>T</sup>M.

When M is non-singular, the Q and S factors of the polar decomposition are uniquely determined. However, the determinant of S is positive because S is positive definite, so Q inherits the sign of the determinant of M. That is, Q is only guaranteed to be orthogonal, not a rotation matrix. This is unavoidable; an M with negative determinant has no uniquely-defined closest rotation matrix.
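
One way to compute the orthogonal factor in practice is a Newton iteration that repeatedly averages a matrix with its inverse transpose. This is a different technique from the Lagrange-multiplier derivation above, swapped in here because it is short; the sketch is limited to non-singular 2×2 matrices, and the function name, iteration count and test matrices are assumptions for this illustration.

```python
import math

def polar_orthogonal_factor_2x2(m, iterations=30):
    """Approximate the orthogonal polar factor of a non-singular 2x2 matrix m."""
    (a, b), (c, d) = m
    for _ in range(iterations):
        det = a * d - b * c
        # Inverse transpose of [[a, b], [c, d]] is [[d, -c], [-b, a]] / det.
        ia, ib, ic, id_ = d / det, -c / det, -b / det, a / det
        a, b, c, d = 0.5 * (a + ia), 0.5 * (b + ib), 0.5 * (c + ic), 0.5 * (d + id_)
    return [[a, b], [c, d]]

theta = 0.3
R = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]
S = [[2.0, 1.0], [1.0, 3.0]]  # symmetric positive definite
M = [[sum(R[i][k] * S[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

Q = polar_orthogonal_factor_2x2(M)                                 # recovers R
Q_neg = polar_orthogonal_factor_2x2([[1.0, 0.0], [0.0, -2.0]])     # det M < 0
print(Q, Q_neg)
```

The second call illustrates the caveat above: when the determinant of M is negative, the orthogonal factor has determinant −1 and is not a rotation matrix.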

Axis and angle

Determining an axis and angle, like determining a quaternion, is only possible up to sign; that is, (u,θ) and (−u,−θ) correspond to the same rotation matrix, just like q and −q. We might prefer u to be a unit vector; when θ is zero, however, the direction of u is undetermined. Also, the angle is only determined to within a multiple of 2π, and we need a two-argument arctangent.

x = Qzy-Qyz
y = Qxz-Qzx
z = Qyx-Qxy
r = sqrt(x*x+y*y+z*z)
t = Qxx+Qyy+Qzz
θ = atan2(r,t-1)
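A minimal Python sketch of this extraction, assuming Q is a nested list indexed [row][column] with x, y, z mapped to indices 0, 1, 2. It relies on r = 2 sin θ and trace − 1 = 2 cos θ, so the angle is atan2(r, trace − 1). The function name and the `eps` threshold are illustrative assumptions.

```python
import math


def axis_angle_from_matrix(Q, eps=1e-12):
    """Return (unit axis or None, angle in [0, π]) for a 3x3 rotation matrix Q."""
    x = Q[2][1] - Q[1][2]
    y = Q[0][2] - Q[2][0]
    z = Q[1][0] - Q[0][1]
    r = math.sqrt(x * x + y * y + z * z)   # equals 2 sin θ
    t = Q[0][0] + Q[1][1] + Q[2][2]        # trace, equals 1 + 2 cos θ
    theta = math.atan2(r, t - 1.0)
    if r < eps:
        # θ is 0 (axis undetermined) or π (axis not recoverable from the
        # skew-symmetric part alone); this sketch reports no axis.
        return None, theta
    return (x / r, y / r, z / r), theta
```

For the matrix with rows (0.6, −0.8, 0), (0.8, 0.6, 0), (0, 0, 1), this yields the z axis and an angle of about 53.13°.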

To efficiently construct a rotation matrix from an angle θ and a unit axis u, we can take advantage of symmetry and skew-symmetry within the entries.

c = cos(θ); s = sin(θ); C = 1-c
xs = x*s;   ys = y*s;   zs = z*s
xC = x*C;   yC = y*C;   zC = z*C
xyC = x*yC; yzC = y*zC; zxC = z*xC
[ x*xC+c   xyC-zs   zxC+ys ]
[ xyC+zs   y*yC+c   yzC-xs ]
[ zxC-ys   yzC+xs   z*zC+c ]
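The same construction can be written as a short Python function. This is a direct transcription of the steps above, assuming the axis u = (x, y, z) is already a unit vector and the angle is in radians; the function name is an illustrative assumption.

```python
import math


def matrix_from_axis_angle(u, theta):
    """Build the 3x3 rotation matrix for unit axis u = (x, y, z) and angle theta."""
    x, y, z = u
    c = math.cos(theta)
    s = math.sin(theta)
    C = 1.0 - c
    xs, ys, zs = x * s, y * s, z * s          # skew-symmetric contributions
    xC, yC, zC = x * C, y * C, z * C
    xyC, yzC, zxC = x * yC, y * zC, z * xC    # symmetric off-diagonal contributions
    return [[x * xC + c, xyC - zs, zxC + ys],
            [xyC + zs, y * yC + c, yzC - xs],
            [zxC - ys, yzC + xs, z * zC + c]]
```

For example, `matrix_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)` gives, up to rounding, the quarter turn about z with rows (0, −1, 0), (1, 0, 0), (0, 0, 1).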

Euler angles

Complexity of conversion escalates with Euler angles (used here in the broad sense). The first difficulty is to establish which of the twenty-four variations of Cartesian axis order we will use. Suppose the three angles are θ1, θ2, θ3; physics and chemistry may interpret these as

while aircraft dynamics may use

One systematic approach begins with choosing the right-most axis. Among all permutations of (x,y,z), only two place that axis first; one is an even permutation and the other odd. Choosing parity thus establishes the middle axis. That leaves two choices for the left-most axis, either duplicating the first or not. These three choices give us 3×2×2 = 12 variations; we double that to 24 by choosing static or rotating axes.

This is enough to construct a matrix from angles, but triples differing in many ways can give the same rotation matrix. For example, suppose we use the zyz convention above; then we have the following equivalent pairs:

(90°, 45°, −105°)   ≡   (−270°, −315°, 255°)    multiples of 360°
(72°, 0°, 0°)       ≡   (40°, 0°, 32°)          singular alignment
(45°, 60°, −30°)    ≡   (−135°, −60°, 150°)     bistable flip
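These equivalences can be checked numerically. The sketch below assumes that the zyz convention means the product Q_z(θ1) Q_y(θ2) Q_z(θ3) of elementary rotations about fixed axes; that reading, and all helper names, are assumptions made for illustration.

```python
import math


def rot_z(deg):
    """Elementary rotation about the z axis by an angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(deg):
    """Elementary rotation about the y axis by an angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def zyz(t1, t2, t3):
    """Rotation matrix for zyz angles, taken here as Qz(t1) Qy(t2) Qz(t3)."""
    return matmul(matmul(rot_z(t1), rot_y(t2)), rot_z(t3))


def same_rotation(A, B, tol=1e-12):
    """True when two matrices agree entry by entry within tol."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))


equivalent_pairs = [
    ((90, 45, -105), (-270, -315, 255)),   # multiples of 360°
    ((72, 0, 0), (40, 0, 32)),             # singular alignment
    ((45, 60, -30), (-135, -60, 150)),     # bistable flip
]
all_equal = all(same_rotation(zyz(*a), zyz(*b)) for a, b in equivalent_pairs)
```

Under these assumptions `all_equal` evaluates to true, while a triple such as (45°, 60°, 30°) gives a different matrix from (45°, 60°, −30°).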

The problem of singular alignment, the mathematical analog of physical gimbal lock, occurs when the middle rotation aligns the axes of the first and last rotations. It afflicts every axis order at either even or odd multiples of 90°, causing Euler angles to be abandoned for quaternions in many applications. Setting these unavoidable issues aside, angles for any order can be found using a concise common routine (Herter & Lott 1993; Shoemake 1994).

Uniform random rotation matrices

We sometimes need to generate a uniformly distributed random rotation matrix. It seems intuitively clear in two dimensions that this means the rotation angle is uniformly distributed between 0 and 2π. That intuition is correct, but does not carry over to higher dimensions. For example, if we decompose 3×3 rotation matrices in axis-angle form, the angle should not be uniformly distributed; the probability that (the magnitude of) the angle is at most θ should be (1/π)(θ − sin θ), for 0 ≤ θ ≤ π.

Since SO(n) is a connected and locally compact Lie group, we have a simple standard criterion for uniformity, namely that the distribution be unchanged when composed with any arbitrary rotation (a Lie group "translation"). This definition corresponds to what is called Haar measure. León, Massé & Rivest (2006) show how to use the Cayley transform to generate and test matrices according to this criterion.

We can also generate a uniform distribution in any dimension using the subgroup algorithm of Diaconis & Shahshahani (1987). This recursively exploits the nested-dimension group structure of SO(n), as follows. Generate a uniform angle and construct a 2×2 rotation matrix. To step from n to n+1, generate a vector v uniformly distributed on the n-sphere, Sⁿ, embed the n×n matrix in the next larger size with last column (0,…,0,1), and rotate the larger matrix so the last column becomes v.
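The recursion can be sketched in plain Python. Two choices here are assumptions not fixed by the description above: the uniform vector v is obtained by normalizing independent Gaussian samples, and the rotation taking (0,…,0,1) to v is built as a product of two Householder reflections (any rotation with that property would serve). Function names are illustrative.

```python
import math
import random


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def reflection(w):
    """Householder reflection I - 2 w wᵀ / (wᵀ w)."""
    n, ww = len(w), sum(c * c for c in w)
    return [[(1.0 if i == j else 0.0) - 2.0 * w[i] * w[j] / ww
             for j in range(n)] for i in range(n)]


def rotation_to(v):
    """A rotation taking e = (0, ..., 0, 1) to the unit vector v."""
    n = len(v)
    w = list(v[:-1]) + [v[-1] + 1.0]            # w = e + v
    if sum(c * c for c in w) < 1e-12:
        # v is (numerically) -e: negate the last two axes, a half turn.
        return [[(-1.0 if i >= n - 2 else 1.0) if i == j else 0.0
                 for j in range(n)] for i in range(n)]
    # Reflect e to -v, then -v to v; two reflections compose to a rotation.
    return matmul(reflection(list(v)), reflection(w))


def random_rotation(n, rng):
    """Random n x n rotation matrix (n >= 2) built by the subgroup recursion."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(angle), math.sin(angle)
    Q = [[c, -s], [s, c]]
    for m in range(3, n + 1):
        g = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(x * x for x in g))
        v = [x / norm for x in g]               # uniform on the unit sphere
        # Embed Q in the next size with last row and column (0, ..., 0, 1).
        embedded = [row + [0.0] for row in Q] + [[0.0] * (m - 1) + [1.0]]
        Q = matmul(rotation_to(v), embedded)    # last column becomes v
    return Q
```

A call such as `random_rotation(4, random.Random(0))` returns a 4×4 nested list; passing a seeded `random.Random` keeps the result reproducible.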

As usual, we have special alternatives for the 3×3 case. Each of these methods begins with three independent random scalars uniformly distributed on the unit interval. Arvo (1992) takes advantage of the odd dimension to change a Householder reflection to a rotation by negation, and uses that to aim the axis of a uniform planar rotation.

Another method uses unit quaternions. Multiplication of rotation matrices is homomorphic to multiplication of quaternions, and multiplication by a unit quaternion rotates the unit sphere. Since the homomorphism is a local isometry, we immediately conclude that to produce a uniform distribution on SO(3) we may use a uniform distribution on S³.
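A small Python sketch of the quaternion route. The particular map from three uniform scalars to a point on S³ is one well-known choice and is an assumption here, as are the (w, x, y, z) component order and the function names. The helper `rotation_angle` makes it possible to compare sampled angles with the distribution (1/π)(θ − sin θ) stated earlier.

```python
import math
import random


def random_unit_quaternion(rng):
    """Map three uniform scalars in [0, 1) to a uniform point (w, x, y, z) on S³."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    a, b = math.sqrt(1.0 - u1), math.sqrt(u1)
    return (b * math.cos(2.0 * math.pi * u3),
            a * math.sin(2.0 * math.pi * u2),
            a * math.cos(2.0 * math.pi * u2),
            b * math.sin(2.0 * math.pi * u3))


def matrix_from_quaternion(q):
    """3x3 rotation matrix corresponding to the unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return [[1.0 - 2.0 * (y * y + z * z), 2.0 * (x * y - w * z), 2.0 * (x * z + w * y)],
            [2.0 * (x * y + w * z), 1.0 - 2.0 * (x * x + z * z), 2.0 * (y * z - w * x)],
            [2.0 * (x * z - w * y), 2.0 * (y * z + w * x), 1.0 - 2.0 * (x * x + y * y)]]


def rotation_angle(q):
    """Magnitude of the rotation angle, in [0, π], encoded by unit quaternion q."""
    return 2.0 * math.acos(min(1.0, abs(q[0])))


rng = random.Random(0)                      # seeded only for reproducibility
R = matrix_from_quaternion(random_unit_quaternion(rng))
```

Since q and −q give the same matrix, taking |w| in `rotation_angle` folds both representatives onto the same angle.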

Euler angles can also be used, though not with each angle uniformly distributed (Murnaghan 1962; Miles 1965).

For the axis-angle form, the axis is uniformly distributed over the unit sphere of directions, S², while the angle has the non-uniform distribution over [0,π] noted previously (Miles 1965).

References

  • Arvo, James (1992), "Fast random rotation matrices", in David Kirk (ed.), Graphics Gems III, San Diego: Academic Press Professional, pp. 117–120, ISBN 978-0-12-409671-4
  • Baker, Andrew (2003), Matrix Groups: An Introduction to Lie Group Theory, Springer, ISBN 978-1-85233-470-3
  • Bar-Itzhack, Itzhack Y. (2000), "New method for extracting the quaternion from a rotation matrix", AIAA Journal of Guidance, Control and Dynamics, 23 (6): 1085–1087 (Engineering Note), ISSN 0731-5090
  • Björck, A.; Bowie, C. (1971), "An iterative algorithm for computing the best estimate of an orthogonal matrix", SIAM Journal on Numerical Analysis, 8 (2): 358–364, ISSN 0036-1429
  • Cayley, Arthur (1846), "Sur quelques propriétés des déterminants gauches", Journal für die Reine und Angewandte Mathematik (Crelle's Journal), 32: 119–123, ISSN 0075-4102; reprinted as article 52 in Cayley, Arthur (1889), The collected mathematical papers of Arthur Cayley, vol. I (1841–1853), Cambridge University Press, pp. 332–336
  • Diaconis, Persi; Shahshahani, Mehrdad (1987), "The subgroup algorithm for generating uniform random variables", Probability in the Engineering and Informational Sciences, 1: 15–32, ISSN 0269-9648
  • Engø, Kenth (2001), "On the BCH-formula in so(3)", BIT Numerical Mathematics, 41 (3): 629–632, doi:10.1023/A:1021979515229, ISSN 0006-3835
  • Fan, Ky; Hoffman, Alan J. (1955), "Some metric inequalities in the space of matrices", Proc. AMS, 6 (1): 111–116, doi:10.2307/2032662, ISSN 0002-9939
  • Fulton, William; Harris, Joe (1991), Representation theory: a first course, New York: Springer, ISBN 0-387-97495-4 (GTM 129)
  • Goldstein, Herbert; Poole, Charles P.; Safko, John L. (2002), Classical Mechanics (third ed.), Addison Wesley, ISBN 978-0-201-65702-9
  • Hall, Brian C. (2004), Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, Springer, ISBN 978-0-387-40122-5 (GTM 222)
  • Herter, Thomas; Lott, Klaus (1993), "Algorithms for decomposing 3-D orthogonal matrices into primitive rotations", Computers & Graphics, 17 (5): 517–527, ISSN 0097-8493
  • Higham, Nicholas J. (1989), "Matrix nearness problems and applications", in Gover, M. J. C.; Barnett, S. (eds.), Applications of Matrix Theory, Oxford University Press, pp. 1–27, ISBN 978-0-19-853625-3
  • León, Carlos A.; Massé, Jean-Claude; Rivest, Louis-Paul (2006), "A statistical model for random rotations", Journal of Multivariate Analysis, 97 (2): 412–430, doi:10.1016/j.jmva.2005.03.009, ISSN 0047-259X
  • Miles, R. E. (1965), "On random rotations in R³", Biometrika, 52 (3/4): 636–639, doi:10.2307/2333716, ISSN 0006-3444
  • Murnaghan, Francis D. (1962), The Unitary and Rotation Groups, Lectures on applied mathematics, Washington: Spartan Books
  • Prentice, Michael J. (1986), "Orientation statistics without parametric assumptions", Journal of the Royal Statistical Society. Series B (Methodological), 48 (2): 214–222, ISSN 0035-9246
  • Shepperd, Stanley W. (1978), "Quaternion from rotation matrix", AIAA Journal of Guidance, Control and Dynamics, 1 (3): 223–224, ISSN 0731-5090
  • Shoemake, Ken (1994), "Euler angle conversion", in Paul Heckbert (ed.), Graphics Gems IV, San Diego: Academic Press Professional, pp. 222–229, ISBN 978-0-12-336155-4
  • Stuelpnagel, John (1964), "On the parameterization of the three-dimensional rotation group", SIAM Review, 6 (4): 422–430, ISSN 0036-1445 (Also NASA-CR-53568.)
  • Varadarajan, V. S. (1984), Lie Groups, Lie Algebras, and Their Representation, Springer, ISBN 978-0-387-90969-1 (GTM 102)
  • Wedderburn, J. H. M. (1934), Lectures on Matrices, AMS, ISBN 978-0-8218-3204-2

See also