pytransform

My work often combines motion capture, reinforcement learning, and robotics. There are many different systems involved, such as proprietary motion capture software, our own machine learning library, robot simulators (e.g. MARS, Gazebo, or Bullet), and robotic middleware (e.g. ROS or RoCK). All of them come with their own complex visualization tools and tools for handling transformations, and usually with their own transformation conventions.

For example, the following conventions can be found in various tools:

  • pybullet uses [x, y, z] to represent positions and quaternions [x, y, z, w] to represent orientations.
  • A pose in ROS follows the same convention for quaternions.
  • URDF, the kinematic description format for robots, uses two different formats for orientations: fixed offsets of joints and geometric shapes are given as xyz Euler angles (roll, pitch, yaw), and each joint is defined by a rotation axis. Together with the joint angle, the state of a joint is fully defined.
  • Eigen is used in many libraries and frameworks. Its quaternion constructor accepts values in the order [w, x, y, z]. Eigen also has an angle-axis representation, and for transformations it has a class that internally uses a 4x4 transformation matrix.
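A surprisingly frequent source of bugs is just reordering quaternion components when moving between these conventions. Here is a minimal numpy sketch (the example quaternion is made up) of converting from the scalar-last order [x, y, z, w] used by pybullet and ROS to the scalar-first order [w, x, y, z] used, for example, by Eigen's constructor:

import numpy as np

# made-up quaternion in pybullet / ROS order: [x, y, z, w]
q_xyzw = np.array([0.0, 0.0, 0.38268343, 0.92387953])

# reorder to the scalar-first convention [w, x, y, z]
q_wxyz = np.hstack((q_xyzw[3:], q_xyzw[:3]))

# ... and back to [x, y, z, w]
q_xyzw_again = np.hstack((q_wxyz[1:], q_wxyz[:1]))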

Sometimes I just need a lightweight tool to translate between conventions, to debug, and to visualize. That is why pytransform was developed. It is a Python library for transformations in three dimensions that makes conversions between rotation and transformation conventions as easy as possible. The library focuses on readability and debugging, not on computational efficiency.

pytransform is built on numpy and matplotlib, so everyone who knows the scientific Python ecosystem should have no problems using it. In addition, its code is simple. If you want to integrate an efficient version in your preferred language (C, C++, Julia, ...), doing so based on pytransform's source code will be straightforward.

Example: From Hand to Gripper Trajectories

Let us take a look at a typical problem that involves a lot of transformations and usually several different systems and conventions.

Motion Capture

Step 1: We have a marker-based motion capture (MoCap) system. We measure the pose of an object with three markers ($p_{object,1},p_{object,2},p_{object,3}$), and we have another marker $p_{other}$ (usually more than one) that should be transformed into the object's coordinate system. All measured positions are represented in the MoCap system's coordinate system.

In [1]:
%pylab inline
Populating the interactive namespace from numpy and matplotlib
In [2]:
# this is what we get from the MoCap
object_marker1_2_mocap = np.array([0.13, 0.17, 1.22])
object_marker2_2_mocap = np.array([0.22, 0.16, 1.22])
object_marker3_2_mocap = np.array([0.14, 0.26, 1.22])
another_marker_2_mocap = np.array([0.32, 0.35, 1.41])
markers2mocap = np.vstack((
        object_marker1_2_mocap, object_marker2_2_mocap,
        object_marker3_2_mocap, another_marker_2_mocap))
In [3]:
from pytransform.plot_utils import make_3d_axis
def plot_markers(markers):
    plt.figure(figsize=(8, 8))
    ax = make_3d_axis(ax_s=1)
    ax.set_xlim((0.0, 0.4))
    ax.set_ylim((0.1, 0.5))
    ax.set_zlim((1.1, 1.5))
    ax.scatter(markers[:, 0], markers[:, 1], markers[:, 2])
    return ax
plot_markers(markers2mocap)
Out[3]:
<matplotlib.axes._subplots.Axes3DSubplot at 0x7f59f3f21550>

We can put a coordinate system at the object's markers. $p_{object,2} - p_{object,1}$ and $p_{object,3} - p_{object,1}$ will be normalized and represent the x-axis and the y-axis of this coordinate system. We can compute their cross product to obtain the third basis vector.

In [4]:
from pytransform.rotations import norm_vector, perpendicular_to_vectors
object_x = norm_vector(object_marker2_2_mocap - object_marker1_2_mocap)
object_y = norm_vector(object_marker3_2_mocap - object_marker1_2_mocap)
object_z = perpendicular_to_vectors(object_x, object_y)

The three vectors are not really orthogonal at the moment because of measurement errors or because the markers might not be aligned perfectly. We can use a trick and recompute the second basis vector with the cross product:

In [5]:
object_y = perpendicular_to_vectors(object_z, object_x)

Now we can compute the rotation matrix that rotates a point from the object's coordinate system to the global coordinate system by writing the basis vectors into the columns of the matrix, and we can draw the coordinate system.

In [6]:
from pytransform.rotations import plot_basis

R = np.vstack((object_x, object_y, object_z)).T

ax = plot_markers(markers2mocap)
plot_basis(ax=ax, R=R, p=object_marker1_2_mocap, s=0.05)
Out[6]:
<matplotlib.axes._subplots.Axes3DSubplot at 0x7f59f1ea4810>

Now we have a translation (the first object marker in the MoCap coordinate system) and a rotation. Let us combine both into the transformation from the object's coordinate system to the MoCap coordinate system.

In [7]:
from pytransform.transformations import transform_from
object2mocap = transform_from(R=R, p=object_marker1_2_mocap)
object2mocap
Out[7]:
array([[ 0.99388373,  0.11043153, -0.        ,  0.13      ],
       [-0.11043153,  0.99388373,  0.        ,  0.17      ],
       [ 0.        ,  0.        ,  1.        ,  1.22      ],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])

This is a homogeneous transformation matrix. You can multiply a vector by it to transform the vector from the object's coordinate system to the MoCap coordinate system. For example, this is what happens when we transform the basis vectors (a 1 must be appended as a 4th coordinate if you want to transform positions):

In [8]:
object2mocap.dot(np.array([1, 0, 0, 1]))
Out[8]:
array([ 1.12388373,  0.05956847,  1.22      ,  1.        ])
In [9]:
object2mocap.dot(np.array([0, 1, 0, 1]))
Out[9]:
array([ 0.24043153,  1.16388373,  1.22      ,  1.        ])
In [10]:
object2mocap.dot(np.array([0, 0, 1, 1]))
Out[10]:
array([ 0.13,  0.17,  2.22,  1.  ])

There is a convenience function to plot these transformations directly.

In [11]:
from pytransform.transformations import plot_transform
ax = plot_markers(markers2mocap)
plot_transform(ax, A2B=object2mocap, s=0.05)
Out[11]:
<matplotlib.axes._subplots.Axes3DSubplot at 0x7f59f1df1a90>

Let us transform the point $p_{other}$ to object-relative coordinates. This requires inverting the transformation from the object's coordinate system to the MoCap coordinate system.

In [12]:
from pytransform.transformations import invert_transform
mocap2object = invert_transform(object2mocap)
mocap2object
Out[12]:
array([[ 0.99388373, -0.11043153,  0.        , -0.11043153],
       [ 0.11043153,  0.99388373,  0.        , -0.18331633],
       [ 0.        ,  0.        ,  1.        , -1.22      ],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])

That seems like a lot of work when more transformations are chained together. Fortunately, there is a TransformManager that can compute these kinds of transformations for us automatically:

In [13]:
from pytransform.transform_manager import TransformManager
tm = TransformManager()
tm.add_transform("object", "mocap", object2mocap)
mocap2object = tm.get_transform("mocap", "object")
mocap2object
Out[13]:
array([[ 0.99388373, -0.11043153,  0.        , -0.11043153],
       [ 0.11043153,  0.99388373,  0.        , -0.18331633],
       [ 0.        ,  0.        ,  1.        , -1.22      ],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])

The TransformManager can plot itself:

In [14]:
ax = tm.plot_frames_in("mocap")
ax.set_xlim((-0.5, 1.5))
ax.set_ylim((-0.5, 1.5))
ax.set_zlim((0, 2))
Out[14]:
(0, 2)

Again, we have to add a 1 at the end of the vector to transform the point:

In [15]:
from pytransform.transformations import vector_to_point
another_marker2object = mocap2object.dot(vector_to_point(another_marker_2_mocap))
another_marker2object
Out[15]:
array([ 0.16896023,  0.19988106,  0.19      ,  1.        ])

Transfer to Robot

Step 2: Suppose the point that we just transformed was the position of a hand. Instead of a hand, we now want to move a robotic gripper to this position. Instead of measuring the object's position with the MoCap system, we will measure it with some kind of object detection method. There are several problems:

  • The coordinate system of the detected object is not aligned with the coordinate system of the markers on the object.
  • The coordinate system of the detected object is measured in the camera's coordinate system.
  • The camera is mounted somewhere on the robot and is connected to the gripper through a complex kinematic structure.

Fortunately, we can simply measure the transformation between the object pose that the robot will detect and the object pose that is measured by the MoCap system. (We assume that the position and orientation of the object can be detected very accurately.) We will just write it down manually here.

In [16]:
from pytransform.rotations import matrix_from_angle
detected_object2object = transform_from(
    R=matrix_from_angle(1, np.pi / 8.0),
    p=np.array([0.1, 0.05, -0.05]))
tm.add_transform("detected_object", "object", detected_object2object)
Out[16]:
<pytransform.transform_manager.TransformManager at 0x7f59f1da7850>
In [17]:
ax = plot_markers(markers2mocap)
tm.plot_frames_in("mocap", ax=ax, s=0.05,
                  whitelist=["object", "detected_object"])
Out[17]:
<matplotlib.axes._subplots.Axes3DSubplot at 0x7f59f001f510>

Now we can add the detected position and orientation of the object in camera coordinates. This usually involves complex object detection algorithms that are integrated in some robotic middleware. For example, the ArUco library will directly give you a transformation matrix based on markers that are attached to the object.
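For illustration, ArUco's pose estimation typically returns a rotation vector and a translation vector. A sketch of turning such a pair into a 4x4 matrix could look as follows (rvec and tvec are made-up values standing in for the output of the detection pipeline):

import numpy as np
import cv2

# made-up ArUco output: rotation vector (axis * angle) and translation in camera frame
rvec = np.array([0.1, -2.0, 1.5])
tvec = np.array([0.2, 0.1, 0.2])

R_cam, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
detected_object2camera = np.eye(4)
detected_object2camera[:3, :3] = R_cam
detected_object2camera[:3, 3] = tvec

Here we just write such a matrix down directly: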

In [18]:
detected_object2camera = np.array(
    [[  1.28844494e-01,  -7.11376792e-01,  -6.90899527e-01, 0.2],
     [  1.57788998e-17,  -6.96706709e-01,   7.17356091e-01, 0.1],
     [ -9.91664810e-01,  -9.24273828e-02,  -8.97668236e-02, 0.2],
     [             0.0,              0.0,              0.0, 1.0]])
tm.add_transform("detected_object", "camera", detected_object2camera)
Out[18]:
<pytransform.transform_manager.TransformManager at 0x7f59f1da7850>
In [19]:
ax = plot_markers(markers2mocap)
tm.plot_frames_in("mocap", ax=ax, s=0.05, whitelist=["object", "detected_object", "camera"])
Out[19]:
<matplotlib.axes._subplots.Axes3DSubplot at 0x7f59eaccb7d0>

The camera is mounted somewhere on or in the robot, and it is often not static. We need a kinematic description of the robot and its current joint states to compute the forward kinematics, so that we can determine the pose of the camera in the robot's base frame. This is usually done by the robotic middleware, for example with ROS tf, or you can use KDL directly. There is also a UrdfTransformManager in pytransform that is able to load URDF files and lets you set the joint angles. Here we will just assume that we know the current kinematic configuration of the robot and add the current camera pose in the robot's base frame.
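To give an idea of how the UrdfTransformManager could be used, here is a rough sketch with a made-up one-joint URDF. The module path pytransform.urdf and the load_urdf / set_joint names are assumptions for illustration (they match what later became pytransform3d), so check the documentation for the exact interface:

import numpy as np
from pytransform.urdf import UrdfTransformManager

# made-up minimal URDF: a camera attached to the robot base by one revolute joint
URDF = """
<robot name="example_robot">
  <link name="robot"/>
  <link name="camera"/>
  <joint name="camera_pan_joint" type="revolute">
    <parent link="robot"/>
    <child link="camera"/>
    <origin xyz="0.1 0 0.3" rpy="0 0 0"/>
    <axis xyz="0 0 1"/>
    <limit lower="-1.57" upper="1.57"/>
  </joint>
</robot>
"""

utm = UrdfTransformManager()
utm.load_urdf(URDF)
utm.set_joint("camera_pan_joint", np.pi / 4.0)  # current joint state
camera2robot = utm.get_transform("camera", "robot")

In the example below we just write the resulting camera2robot matrix down directly: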

In [20]:
camera2robot = np.array(
    [[-0.38343259,  0.27452029,  0.88182655,  0.07286918],
     [ 0.71137679,  0.69670671,  0.09242738, -0.08043151],
     [-0.58900128,  0.66275061, -0.46242742,  0.09401068],
     [ 0.        ,  0.        ,  0.        ,  1.        ]])
tm.add_transform("camera", "robot", camera2robot)
Out[20]:
<pytransform.transform_manager.TransformManager at 0x7f59f1da7850>
In [21]:
ax = plot_markers(markers2mocap)
tm.plot_frames_in("mocap", ax=ax, s=0.05,
                  whitelist=["object", "detected_object", "camera", "robot"])
Out[21]:
<matplotlib.axes._subplots.Axes3DSubplot at 0x7f59eac2e2d0>

Our last step here is to compute the transformation of the point $p_{other}$ into the robot's coordinate system.

In [22]:
mocap2robot = tm.get_transform("mocap", "robot")
hand2robot = mocap2robot.dot(vector_to_point(another_marker_2_mocap))
In [23]:
tm.get_transform("camera", "robot")
Out[23]:
array([[-0.38343259,  0.27452029,  0.88182655,  0.07286918],
       [ 0.71137679,  0.69670671,  0.09242738, -0.08043151],
       [-0.58900128,  0.66275061, -0.46242742,  0.09401068],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])
In [24]:
plt.figure(figsize=(10, 10))
ax = tm.plot_frames_in("robot", s=0.05, whitelist=["robot", "camera"])
ax.scatter(hand2robot[0], hand2robot[1], hand2robot[2])
ax.set_xlim((0.0, 0.2))
ax.set_ylim((-0.1, 0.1))
ax.set_zlim((0.0, 0.2))
ax.view_init(20, 50)
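So far we have only transformed a single point, but the TransformManager gives us full poses as well: any frame it knows can be expressed in the robot's base frame and split into rotation and translation with plain numpy slicing. A small sketch, using the object frame as an arbitrary example:

# full pose of the object frame in the robot's base frame
object2robot = tm.get_transform("object", "robot")
target_position = object2robot[:3, 3]   # translation part
target_rotation = object2robot[:3, :3]  # rotation part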

We can use any library that solves the inverse kinematics of the robot to reach this point with the gripper, usually together with a desired orientation of the gripper. Furthermore, we usually do not just want to reach a single pose but would like to reproduce a whole trajectory... Real problems are a little more complicated, but I hope this article gives a good idea of what pytransform can do.

pytransform

pytransform is available on GitHub. You can install it from source with

git clone https://github.com/rock-learning/pytransform.git
cd pytransform
python setup.py install

Documentation is available here.

Update

This software has been renamed to pytransform3d and is now available on GitHub and PyPI.