Core types & transforms
Pose6D, Keypoint, BBox, and the coordinate-transform helpers in stera.core.
Pose6D
@dataclass
class Pose6D:
rotation: np.ndarray # (3, 3)
translation: np.ndarray # (3,)
timestamp: Optional[float] = None
def as_matrix(self) -> np.ndarray # (4, 4) homogeneous
@classmethod
def from_matrix(cls, mat: np.ndarray, timestamp: float | None = None) -> Pose6D6-DoF pose with rotation matrix + translation vector. Used for camera poses (SyncedFrame.camera_pose, MCAPReader.all_camera_poses()) and TF transforms.
Keypoint
@dataclass
class Keypoint:
x: float
y: float
z: float = 0.0
confidence: float = 1.0
name: Optional[str] = None
def as_array(self) -> np.ndarray # [x, y, z]See HandPose for the typical use.
Pose3D
@dataclass
class Pose3D:
keypoints: list[Keypoint] = field(default_factory=list)
timestamp: Optional[float] = None
def as_array(self) -> np.ndarray # (N, 3)A named set of 3D keypoints.
BBox
@dataclass
class BBox:
x1: float
y1: float
x2: float
y2: float
confidence: float = 1.0
label: Optional[str] = None
@property
def width(self) -> float
@property
def height(self) -> float
@property
def center(self) -> tuple[float, float]Axis-aligned bounding box. Most face-detection paths use raw (N, 4) numpy arrays for batching, but BBox is convenient for one-off rendering.
Transforms
stera.core.transforms exposes the helpers and constants needed to move between optical, link, and world frames.
quat_to_rot
def quat_to_rot(qx: float, qy: float, qz: float, qw: float) -> np.ndarrayxyzw quaternion → (3, 3) rotation matrix.
rot_to_quat
def rot_to_quat(R: np.ndarray) -> tuple[float, float, float, float](3, 3) rotation matrix → (qx, qy, qz, qw) tuple.
optical_to_world
def optical_to_world(
pts: np.ndarray, # (N, 3) in camera optical frame
R_world: np.ndarray, # (3, 3) link → world
t_world: np.ndarray, # (3,) link origin in world
R_o2l: np.ndarray | None = None, # (3, 3) optical → link, default R_OPTICAL_TO_LINK
) -> np.ndarrayTransform points from camera optical frame to world frame. See Coordinate frames.
depth_to_pointcloud
def depth_to_pointcloud(
depth: np.ndarray, # (H, W) uint16 mm
rgb: np.ndarray, # (H, W, 3) uint8
K: np.ndarray, # (3, 3) depth intrinsics
max_pts: int = 50_000,
min_depth: float = 0.3,
max_depth: float = 5.0,
) -> tuple[np.ndarray, np.ndarray]Build a coloured point cloud in the camera optical frame from a depth image + matching RGB.
Constants
R_OPTICAL_TO_LINK # (3, 3), rotation matrix
OPTICAL_QUAT_XYZW # [-0.7071, 0.0, -0.7071, 0.0]The default optical→link rotation used when /tf is empty. For most rigs MCAPReader.R_optical_to_link returns the actual TF, falling back to these constants only when the recording lacks /tf.
See also
- Coordinate frames
SyncedFrame, usesPose6Dfor camera poses.MCAPReader.tf_transforms, yieldsPose6D.