Working with coordinate transforms is one of those skills that every robotics engineer needs but rarely gets taught properly. This article covers the theory, the gotchas, and a clean C++ implementation.

Why Transforms Matter

Every sensor on your robot lives in its own coordinate frame. The lidar sees the world from its mounting position. The camera has its own origin. The wheels measure motion relative to the robot base. Your job is to unify all of this into a coherent world model.

The fundamental question: if the lidar detects an obstacle at position $(x_l, y_l)$ in the lidar frame, where is that obstacle in the world frame?

The Mathematics

A 2D rigid body transform consists of a rotation and a translation. We represent this as a homogeneous transformation matrix:

$$ T = \begin{bmatrix} \cos\theta & -\sin\theta & t_x \\ \sin\theta & \cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix} $$

Where $\theta$ is the rotation angle and $(t_x, t_y)$ is the translation vector.

A subscript convention to fix up front: $T_{AB}$ denotes the pose of frame B expressed in frame A, so it maps points from B into A. To express a point $p$ given in frame B in frame A's coordinates:

$$ p_A = T_{AB} \cdot p_B $$

The inverse transform (mapping points from A back into B) is:

$$ T_{BA} = T_{AB}^{-1} = \begin{bmatrix} \cos\theta & \sin\theta & -t_x\cos\theta - t_y\sin\theta \\ -\sin\theta & \cos\theta & t_x\sin\theta - t_y\cos\theta \\ 0 & 0 & 1 \end{bmatrix} $$

Chaining Transforms

One of the beautiful properties: transforms compose by matrix multiplication. If you have:

  • $T_{WR}$: robot pose in world frame
  • $T_{RL}$: lidar pose in robot frame

Then the lidar-to-world transform is simply:

$$ T_{WL} = T_{WR} \cdot T_{RL} $$

Implementation

Here’s a clean C++ implementation that I use in production:

#include <cmath>
#include <array>

class Transform2D {
public:
    Transform2D() : x_(0), y_(0), theta_(0) {}
    
    Transform2D(double x, double y, double theta)
        : x_(x), y_(y), theta_(theta) 
    {
        normalizeAngle();
    }
    
    // Transform a point from this frame to the parent frame
    std::array<double, 2> transformPoint(double px, double py) const {
        double cos_t = std::cos(theta_);
        double sin_t = std::sin(theta_);
        
        return {
            cos_t * px - sin_t * py + x_,
            sin_t * px + cos_t * py + y_
        };
    }
    
    // Compose transforms: result = this * other
    Transform2D compose(const Transform2D& other) const {
        double cos_t = std::cos(theta_);
        double sin_t = std::sin(theta_);
        
        return Transform2D(
            cos_t * other.x_ - sin_t * other.y_ + x_,
            sin_t * other.x_ + cos_t * other.y_ + y_,
            theta_ + other.theta_
        );
    }
    
    // Get the inverse transform: for T = (R, t), the inverse is (R^T, -R^T t)
    Transform2D inverse() const {
        double cos_t = std::cos(theta_);
        double sin_t = std::sin(theta_);
        
        return Transform2D(
            -x_ * cos_t - y_ * sin_t,
             x_ * sin_t - y_ * cos_t,
            -theta_
        );
    }
    
    double x() const { return x_; }
    double y() const { return y_; }
    double theta() const { return theta_; }

private:
    // Wrap theta_ into [-pi, pi]. Note that M_PI is POSIX rather than
    // standard C++; on MSVC, define _USE_MATH_DEFINES before <cmath>.
    void normalizeAngle() {
        while (theta_ > M_PI) theta_ -= 2 * M_PI;
        while (theta_ < -M_PI) theta_ += 2 * M_PI;
    }
    
    double x_, y_, theta_;
};

Common Pitfalls

A few things that have bitten me over the years:

Angle normalization: Always keep angles in $[-\pi, \pi]$. Accumulated rotations will drift outside this range and cause subtle bugs.

Frame direction confusion: Does $T_{AB}$ map points from A to B, or describe the pose of B in A? Pick a convention and stick to it. I use “$T_{AB}$ is the pose of B in A, mapping points from B into A” consistently; it is what makes chains like $T_{WL} = T_{WR} \cdot T_{RL}$ read left to right.

Premature optimization: Don’t cache $\cos\theta$ and $\sin\theta$ unless profiling shows it matters. Modern CPUs are fast. Clarity wins.

Diagram

Here’s how the frames relate on a typical mobile robot:

[Figure: robot coordinate frames]

Figure 1: The world frame (W), robot base frame (R), and lidar frame (L). The transform $T_{WR}$ comes from your localization system. The transform $T_{RL}$ is fixed calibration.

Verification

Always write unit tests for your transform code. A useful property: composing a transform with its inverse should give the identity (within floating-point tolerance):

void testInverse() {
    Transform2D t(1.5, -2.0, 0.7);
    Transform2D identity = t.compose(t.inverse());
    
    assert(std::abs(identity.x()) < 1e-10);
    assert(std::abs(identity.y()) < 1e-10);
    assert(std::abs(identity.theta()) < 1e-10);
}

Further Reading

The gold standard reference is Lynch and Park’s Modern Robotics. For a more practical treatment focused on mobile robots, Siegwart and Nourbakhsh’s Introduction to Autonomous Mobile Robots covers the same material with more implementation detail.