RigidTransform in SciPy

SciPy is a fundamental library of the scientific Python ecosystem. I always wondered why it does not provide proper rigid transformations although they are needed in many scientific disciplines such as robotics, computer vision, and computer graphics. This is one reason why pytransform3d exists.

With the issue ENH: spatial.transform: cover proper rigid transformations with a RigidTransform class (rotations plus translations) #19254 opened in September 2023, Scott Shambaugh took the initiative to change this. It took more than a year to agree on an interface and finally implement the suggested changes with the pull request ENH: spatial.transform: baseline implementation of RigidTransform #22267, which will be part of SciPy 1.16.0.

I contributed to some parts of the new feature:

  • discussion of the interface
  • the initial draft of the conversions from and to exponential coordinates
  • the conversion from and to dual quaternions
  • dual quaternion normalization

This is a summary of the new feature.

The Rotation Class

With the interface, we tried to be as close as possible to the already existing Rotation class from the module scipy.spatial.transform, which looks like this:

In [1]:
import numpy as np
from scipy.spatial.transform import Rotation


# quaternions are use the scalar-last convention, but there is an option
# to use the scalar-first convention
r_BA = Rotation.from_quat([0, 0, np.sin(np.pi / 4), np.cos(np.pi / 4)])
r_BA.as_quat()

r_CB = Rotation.from_matrix(
    [[0, -1, 0],
     [1, 0, 0],
     [0, 0, 1]])
r_CB.as_matrix()

r_DC = Rotation.from_rotvec([0.3, 0.5, -0.2])
r_DC.as_rotvec()

r_ED = Rotation.from_euler(
    "zyx", [[90, 0, 0], [0, 45, 0], [45, 60, 30]], degrees=True)

r_FE = Rotation.identity()

r_ED[0].as_euler('zyx', degrees=True)

r_AB = r_BA.inv()  # inverse rotation
r_BA.apply([1, 2, 3])  # rotate a vector
r_CA = r_CB * r_BA  # compose rotations
# compose fractions of the same rotation
r_BA ** 2
r_BA ** 0.5
Out[1]:
Rotation.from_matrix(array([[ 0.70710678, -0.70710678,  0.        ],
                            [ 0.70710678,  0.70710678,  0.        ],
                            [ 0.        ,  0.        ,  1.        ]]))

In addition, all operations and conversions can also be applied to multiple rotations.

RigidTransform

Many names for the new class were considered, for example, TransformSe3, ProperRigidTransformation, RigidTransformation, and finally RigidTransform as a compromise between technically correct, readable, and short.

In [2]:
from scipy.spatial.transform import RigidTransform

The Interface

Creating Rigid Transformations

Rigid transformations can be constructed from translations and rotations...

In [3]:
t_BA = [1, 2, 3]
tf_BA = RigidTransform.from_translation(t_BA)
tf_BA
Out[3]:
RigidTransform.from_matrix(array([[1., 0., 0., 1.],
                                  [0., 1., 0., 2.],
                                  [0., 0., 1., 3.],
                                  [0., 0., 0., 1.]]))
In [4]:
tf_BA.translation
Out[4]:
array([1., 2., 3.])
In [5]:
tf_CB = RigidTransform.from_rotation(r_CB)
tf_CB
Out[5]:
RigidTransform.from_matrix(array([[ 0., -1.,  0.,  0.],
                                  [ 1.,  0.,  0.,  0.],
                                  [ 0.,  0.,  1.,  0.],
                                  [ 0.,  0.,  0.,  1.]]))
In [6]:
tf_CB.rotation
Out[6]:
Rotation.from_matrix(array([[ 0., -1.,  0.],
                            [ 1.,  0.,  0.],
                            [ 0.,  0.,  1.]]))

... or from both of these components.

In [7]:
tf_DC = RigidTransform.from_components(t_BA, r_BA)
In [8]:
tf_DC.as_components()
Out[8]:
(array([1., 2., 3.]),
 Rotation.from_matrix(array([[ 2.22044605e-16, -1.00000000e+00,  0.00000000e+00],
                             [ 1.00000000e+00,  2.22044605e-16,  0.00000000e+00],
                             [ 0.00000000e+00,  0.00000000e+00,  1.00000000e+00]])))

You can create the identity transformation via

In [9]:
RigidTransform.identity()
Out[9]:
RigidTransform.from_matrix(array([[1., 0., 0., 0.],
                                  [0., 1., 0., 0.],
                                  [0., 0., 1., 0.],
                                  [0., 0., 0., 1.]]))

Variable Naming Conventions

We use the prefix t for translations and these are represented by numpy arrays of shape (3,). We use the prefix r for objects of the Rotation class and tf for objects of the RigidTransform class.

Conversions between Representations of Proper Rigid Transformations

Just like pytransform3d, the RigidTransform class supports conversion from and to transformation matrices, exponential coordinates of transformation, and dual quaternions.

Homogenous Transformation Matrix

In the initial version of the feature, transformation matrices are used as an internal representation. The constructor of the class will normalize transformation matrices automatically. However, the internal representation is hidden and you should only access the transformation matrix with the following functions.

In [10]:
tf_ED = RigidTransform.from_matrix(
    [[0, -1, 0, 1],
     [1, 0, 0, 2],
     [0, 0, 1, 3],
     [0, 0, 0, 1]]
)
tf_ED.as_matrix()
Out[10]:
array([[ 0., -1.,  0.,  1.],
       [ 1.,  0.,  0.,  2.],
       [ 0.,  0.,  1.,  3.],
       [ 0.,  0.,  0.,  1.]])

Exponential Coordinates of Transformation

Exponential coordinates are a minimal representation $\boldsymbol{\tau} \in \mathbb{R}^6$. They are particularly useful to compute fractions of a transformation, for example, for interpolation (screw linear interpolation, ScLERP) or computing derivatives.

In [11]:
import pytransform3d.transformations as pt


rng = np.random.default_rng(42)
exp_coords = pt.random_exponential_coordinates(rng)
exp_coords
Out[11]:
array([ 0.30471708, -1.03998411,  0.7504512 ,  0.94056472, -1.95103519,
       -1.30217951])

There is a so-called exponential map

$$Exp: \boldsymbol{\tau} \in \mathbb{R}^6 \rightarrow \boldsymbol{T} \in SE(3)$$
In [12]:
tf_FE = RigidTransform.from_exp_coords(exp_coords)
tf_FE
Out[12]:
RigidTransform.from_matrix(array([[ 0.29002012, -0.68803531, -0.66520354,  1.96980191],
                                  [ 0.41444466,  0.71681443, -0.56072514, -1.16979835],
                                  [ 0.86262619, -0.11306848,  0.49304317, -0.63744908],
                                  [ 0.        ,  0.        ,  0.        ,  1.        ]]))

The inverse operation is the logarithmic map

$$Log: \boldsymbol{T} \in SE(3) \rightarrow \boldsymbol{\tau} \in \mathbb{R}^6$$
In [13]:
tf_FE.as_exp_coords()
Out[13]:
array([ 0.30471708, -1.03998411,  0.7504512 ,  0.94056472, -1.95103519,
       -1.30217951])

Dual Quaternions

Similarly to unit quaternions for rotations, unit dual quaternions are an alternative to represent transformations. They support similar operations as transformation matrices. A dual quaternion consists of a real quaternion and a dual quaternion: $\boldsymbol{\sigma} = \boldsymbol{p} + \epsilon \boldsymbol{q}$, where $\epsilon^2 = 0$ and $\epsilon \neq 0$.

We use an array of shape (8,) to represent dual quaternions.

In [14]:
dq = pt.dual_quaternion_from_transform(pt.random_transform(rng))
dq
Out[14]:
array([ 0.98545604,  0.06361002, -0.15735398, -0.00835981,  0.09957055,
       -0.42232652,  0.43230695,  0.38672941])

The tricky part when using dual quaternions is that SciPy uses the scalar-last convention by default for unit dual quaternions because the the Rotation class uses this convention by default for unit quaternions. pytransform3d uses the scalar-first convention. However, we can configure SciPy to use scalar-first.

In [15]:
tf_GF = RigidTransform.from_dual_quat(dq, scalar_first=True)
tf_GF
Out[15]:
RigidTransform.from_matrix(array([[ 0.95033968, -0.00354212, -0.3111944 , -0.95951462],
                                  [-0.03649504,  0.99176776, -0.12273885,  0.84123605],
                                  [ 0.30906732,  0.12800065,  0.94238698,  0.68596303],
                                  [ 0.        ,  0.        ,  0.        ,  1.        ]]))
In [16]:
tf_GF.as_dual_quat(scalar_first=True)
Out[16]:
array([ 0.98545604,  0.06361002, -0.15735398, -0.00835981,  0.09957055,
       -0.42232652,  0.43230695,  0.38672941])

Normalization of Dual Quaternions

A thing that I learned when implementing the conversion from dual quaternion to RigidTransform is that it is not sufficient to divide the dual quaternion by the Euclidean norm of the real quaternion to normalize it. Dual quaternions have a second constraint: the real and the dual quaternion must be orthogonal, that is, their scalar product must be 0. When applying normal dual quaternion operations, this constraint is usually not violated easily, but it has to be considered when we accept any arbitrary user input. Hence, we implemented a normalization procedure that enforces both constraints in RigidTransform.from_dual_quat().

In [17]:
dq_unnormalized = rng.normal(size=8)
dq_unnormalized
Out[17]:
array([ 0.0660307 ,  1.12724121,  0.46750934, -0.85929246,  0.36875078,
       -0.9588826 ,  0.8784503 , -0.04992591])
In [18]:
np.round(dq_unnormalized[:4] @ dq_unnormalized[:4], 12)
Out[18]:
np.float64(2.231981313568)
In [19]:
np.round(dq_unnormalized[:4] @ dq_unnormalized[4:], 12)
Out[19]:
np.float64(-0.602958427222)
In [20]:
dq_normalized = RigidTransform.from_dual_quat(dq_unnormalized).as_dual_quat()
dq_normalized
Out[20]:
array([ 0.0441978 ,  0.75452143,  0.31292843, -0.57516934,  0.17320425,
       -0.29317634,  0.4501586 , -0.12637176])
In [21]:
np.round(dq_normalized[:4] @ dq_normalized[:4], 12)
Out[21]:
np.float64(1.0)
In [22]:
np.round(dq_normalized[:4] @ dq_normalized[4:], 12)
Out[22]:
np.float64(0.0)

Operations

The following operations are already provided by RigidTransform.

Application to Vector

The vector could be a position or translation.

In [23]:
tf_BA.apply([2, 3, 4])
Out[23]:
array([3., 5., 7.])

Inversion

In [24]:
tf_BA.inv()
Out[24]:
RigidTransform.from_matrix(array([[ 1.,  0.,  0., -1.],
                                  [ 0.,  1.,  0., -2.],
                                  [ 0.,  0.,  1., -3.],
                                  [ 0.,  0.,  0.,  1.]]))

Composition

In [25]:
tf_CB * tf_BA
Out[25]:
RigidTransform.from_matrix(array([[ 0., -1.,  0., -2.],
                                  [ 1.,  0.,  0.,  1.],
                                  [ 0.,  0.,  1.,  3.],
                                  [ 0.,  0.,  0.,  1.]]))

Exponentiation

This can be done with integers or floats if you want to compute fractions of the transformation (for example, for screw linear interpolation):

In [26]:
tf_BA ** 2
Out[26]:
RigidTransform.from_matrix(array([[1., 0., 0., 2.],
                                  [0., 1., 0., 4.],
                                  [0., 0., 1., 6.],
                                  [0., 0., 0., 1.]]))
In [27]:
tf_BA ** 0.5
Out[27]:
RigidTransform.from_matrix(array([[1. , 0. , 0. , 0.5],
                                  [0. , 1. , 0. , 1. ],
                                  [0. , 0. , 1. , 1.5],
                                  [0. , 0. , 0. , 1. ]]))

The operation is implemented as

In [28]:
RigidTransform.from_exp_coords(tf_BA.as_exp_coords() * 0.5)
Out[28]:
RigidTransform.from_matrix(array([[1. , 0. , 0. , 0.5],
                                  [0. , 1. , 0. , 1. ],
                                  [0. , 0. , 1. , 1.5],
                                  [0. , 0. , 0. , 1. ]]))

Vectorization

The best feature of the RigidTransform class is that it provides a vectorized implementation of everything like the Rotation class. Here is a (maybe not so useful) example:

In [29]:
import pytransform3d.trajectories as ptr


exp_coords = ptr.exponential_coordinates_from_transforms(
    ptr.random_trajectories(rng, n_trajectories=1, n_steps=5))[0]
exp_coords
Out[29]:
array([[ 2.20810174e-01,  4.20608441e-01, -2.55810631e+00,
        -1.34804258e+00, -8.86896742e-01, -2.66618093e-01],
       [ 1.85131157e-01, -6.74173631e-03, -5.15167346e-01,
        -1.25089375e+00, -2.78255916e-01,  2.43839425e+00],
       [ 6.19387890e-02,  4.78173367e-01,  1.60936659e+00,
         7.03915088e-01,  2.60872186e+00,  7.14607252e+00],
       [-5.48148379e-01, -2.02513971e+00, -8.07553759e-01,
         3.17156118e+00,  4.99888882e+00, -1.45942207e+00],
       [ 4.05495194e-01, -2.62440723e+00, -1.05705048e+00,
        -4.78402325e-01,  3.70464340e+00, -1.31443798e+00]])

We can use indices to access individual transformations:

In [30]:
tfs = RigidTransform.from_exp_coords(exp_coords)
tfs[0]
Out[30]:
RigidTransform.from_matrix(array([[-0.84445547,  0.53077281, -0.07193876, -0.9193893 ],
                                  [-0.47979596, -0.80928491, -0.33890082,  0.78792548],
                                  [-0.2380983 , -0.25167072,  0.93806772,  0.04575961],
                                  [ 0.        ,  0.        ,  0.        ,  1.        ]]))

We can use, for instance, composition, exponentiation, inversion, and conversions with this batch of transformations.

In [31]:
((tfs * tfs) ** -0.75).inv().as_dual_quat(scalar_first=True)
Out[31]:
array([[ 9.19175766e-01, -3.34246220e-02, -6.36686159e-02,
         3.87227341e-01,  1.28738511e-03,  2.04367759e-01,
         1.34844273e-01,  3.67560052e-02],
       [ 9.16882411e-01,  1.34979705e-01, -4.91542099e-03,
        -3.75610120e-01,  8.12522311e-01, -8.73995169e-01,
        -2.04262353e-01,  1.67199926e+00],
       [-4.57013511e-01,  3.27920389e-02,  2.53157672e-01,
         8.52041387e-01, -5.07918773e+00,  1.27844919e-01,
        -5.08956690e-01, -2.57804780e+00],
       [ 7.83731604e-01,  1.51443405e-01,  5.59509188e-01,
         2.23112383e-01, -2.21371584e+00,  1.25005624e-01,
         2.31802698e+00,  1.87829285e+00],
       [ 9.77496996e-01, -2.99275784e-02,  1.93694411e-01,
         7.80156251e-02, -4.72004583e-01, -3.06226228e-01,
         1.93702709e+00,  9.87329538e-01]])

What's Next?

SciPy

For SciPy, there is a roadmap issue for this new feature that contains several related tasks: RFC: Feature roadmap for 3D spatial transformations #22370.

pytransform3d

As you have seen, pytransform3d is fully compatible with this class. However, I envision it to be a more complete and transparent set of implementations. While SciPy supports the most useful representations of rigid transformations, it hides its internal representation so that it can be replaced in the future. It also does not need a direct translation from dual quaternions to transformation matrices. pytransform3d does not support it yet either, but it would be in the scope of the library. Furthermore, in the scope of pytransform3d are tools to work effectively with every representation, such as normalization in different ways, detecting numerical issues, and similarity checks for representations with double cover. In addition, the TransformManager and the plotting and visualization tools are most likely out of scope for SciPy.

JAX

I really like the idea of JAX, a differentiable and GPU-compatible NumPy. It supports the Rotation class of SciPy already. A next step would be to adopt the RigidTransform interface.