Output formats

A Stera session moves through two formats: MCAP going in, an episode directory coming out.

Input — MCAP

Each recording from the Stera App is a single .mcap file. It bundles every sensor stream — RGB, depth, ARKit pose, IMU — on a shared clock.

Process reads MCAPs through MCAPReader:

from stera.data import MCAPReader

reader = MCAPReader("recording.mcap")
for frame in reader.frames():
    rgb, depth, pose = frame.rgb, frame.depth, frame.cam_pose

Full reference: Process > Guides > Reading MCAP.
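
Before handing a file to MCAPReader, it can help to confirm that it really is an MCAP container and was not truncated during transfer. The sketch below is not part of the Stera API; looks_like_mcap is a hypothetical helper that relies only on the 8-byte magic sequence that the MCAP specification places at both the start and the end of every file.

```python
from pathlib import Path

# MCAP files begin and end with the same 8-byte magic sequence (MCAP spec).
MCAP_MAGIC = b"\x89MCAP0\r\n"


def looks_like_mcap(path):
    """Return True if the file starts and ends with the MCAP magic bytes.

    Hypothetical helper, not part of stera.data.
    """
    path = Path(path)
    if path.stat().st_size < 2 * len(MCAP_MAGIC):
        return False
    with path.open("rb") as fh:
        head = fh.read(len(MCAP_MAGIC))
        fh.seek(-len(MCAP_MAGIC), 2)  # 2 = seek relative to end of file
        tail = fh.read(len(MCAP_MAGIC))
    return head == MCAP_MAGIC and tail == MCAP_MAGIC
```

A missing trailing magic usually points to a recording that was cut off mid-write or an incomplete copy. The check says nothing about which sensor streams the file contains.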

Output — Episode directory

After processing, session.export(out_dir) writes a complete episode you can drop straight into a training pipeline.

File	Format	What it is
`rgb.mp4`	H.264 MP4	Original RGB video at capture framerate.
`mesh.ply`	PLY	Scene mesh reconstructed from depth + ARKit.
`thumbnail.jpg`	JPEG	One-frame preview, useful for dataset browsers.
`annotation.hdf5`	HDF5	All time-series: depth, cam-pose, hand-pose, IMU, metadata.
`visualization.rrd`	Rerun	Optional replay file, opens in `rerun-viewer`.
`calibrations/`	`.npy` + `meta.json`	Intrinsics, distortion, RGB↔depth extrinsics.
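
A training pipeline will often want to verify an episode before loading it. The following is a minimal sketch using only the file names from the table above. The helper missing_outputs and the split into required and optional names are assumptions for illustration, not Stera behaviour.

```python
from pathlib import Path

# File names taken from the table above; visualization.rrd is listed as optional.
REQUIRED = ("rgb.mp4", "mesh.ply", "thumbnail.jpg", "annotation.hdf5", "calibrations")
OPTIONAL = ("visualization.rrd",)


def missing_outputs(episode_dir, required=REQUIRED):
    """Return the names from `required` that are absent from episode_dir."""
    root = Path(episode_dir)
    return [name for name in required if not (root / name).exists()]
```

If an episode was exported without some outputs, pass a shorter tuple as required so the check matches what was actually written.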

annotation.hdf5

The HDF5 file is the most important output for downstream training. It holds every time-series annotation behind one file handle:

annotation.hdf5
├── /depth           per-RGB-frame depth maps, gzip-compressed
├── /cam-pose        camera pose translations + rotations
├── /imu             IMU samples
├── /hand-pose       hand detections (when buffered)
└── /metadata        durations, frame counts, start/end timestamps

Every dataset's shape, dtype, and units are documented in the canonical reference: Process > Concepts > HDF5 schema.
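
To see what a particular file actually contains, you can walk it with the general-purpose h5py library. This sketch assumes h5py is installed and makes no assumption about how datasets are arranged inside each group; describe_datasets is a hypothetical helper, and the schema reference remains the authority on shapes and units.

```python
import h5py


def describe_datasets(path):
    """Map each dataset's full HDF5 path to its (shape, dtype-string)."""
    found = {}

    def visit(name, obj):
        # visititems walks groups and datasets; keep only the datasets.
        if isinstance(obj, h5py.Dataset):
            found["/" + name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found
```

Printing the returned dictionary for annotation.hdf5 gives a quick inventory to compare against the schema page.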

Coordinate frames

All poses in the export use a single convention: right-handed, with +X right, +Y down, and +Z forward in the camera frame. Depth is stored in millimetres. See Process > Concepts > Coordinate frames before you start training.
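
As an illustration of this convention, the sketch below back-projects one depth pixel into a 3D point in the camera frame. It assumes an undistorted pinhole camera, a depth value measured along the optical axis, and an image origin at the top-left corner. The function name and the intrinsics fx, fy, cx, cy are placeholders; real values come from the calibrations/ directory.

```python
def unproject(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in millimetres to camera-frame metres.

    Pinhole model, no distortion. Output axes: +X right, +Y down, +Z forward.
    """
    z = depth_mm / 1000.0   # millimetres -> metres
    x = (u - cx) * z / fx   # pixels right of the principal point give +X
    y = (v - cy) * z / fy   # pixels below the principal point give +Y
    return (x, y, z)
```

A pixel at the principal point maps onto the +Z axis, and a pixel toward the bottom-right corner yields positive X and Y, which is a quick way to confirm the axis directions on your own data.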

Don't mix raw MCAP poses with exported HDF5 poses without checking the frame convention — MCAP carries the ARKit native frame, while session.export rebases to the Stera frame.
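
To make the difference concrete, here is a sketch of a pure axis flip between two camera conventions. It assumes the raw camera frame is +X right, +Y up, +Z backward (the camera looks along -Z) and that only the camera axes change. This is an assumption for illustration, not a description of what session.export does internally; the exported frame may also differ in its world origin or orientation, so check the coordinate frames reference before relying on it.

```python
# Assumption: raw camera axes are +X right, +Y up, +Z backward (the camera
# looks along -Z); the target is +X right, +Y down, +Z forward.
FLIP_YZ = (1.0, -1.0, -1.0)


def point_to_export_axes(p):
    """Re-express a camera-frame point by negating its Y and Z components."""
    return tuple(s * c for s, c in zip(FLIP_YZ, p))


def cam_rotation_to_export_axes(R):
    """Right-multiply a 3x3 camera-to-world rotation by diag(1, -1, -1).

    This negates the second and third columns, i.e. the camera's Y and Z axes
    expressed in world coordinates. The world frame is left untouched.
    """
    return [[row[j] * FLIP_YZ[j] for j in range(3)] for row in R]
```

Because diag(1, -1, -1) has determinant +1, the result is still a proper rotation and both frames stay right-handed.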

Choosing what to export

session.export() writes the full episode by default. To skip outputs or limit which HDF5 groups are written, pass keyword arguments:

session.export(
    "episodes/run_01",
    write_mesh=False,         # skip mesh.ply
    write_rrd=False,          # skip visualization.rrd
    annotations=["depth", "cam-pose"],   # subset of HDF5 groups
)

Full option list: Process > Guides > Episode export.
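
A typo in the annotations list is easy to make, since the group names use hyphens. The guard below is a hypothetical helper, not part of the Stera API, that checks names against the groups listed in the annotation.hdf5 tree on this page. Whether every one of those groups can be selected individually, and whether export() performs its own validation, is not stated here, so treat the list as an assumption.

```python
# Group names as listed in the annotation.hdf5 tree on this page.
KNOWN_GROUPS = ("depth", "cam-pose", "imu", "hand-pose", "metadata")


def check_annotations(requested, known=KNOWN_GROUPS):
    """Return `requested` as a list, raising ValueError on unknown names."""
    requested = list(requested)
    unknown = [name for name in requested if name not in known]
    if unknown:
        raise ValueError(
            f"unknown annotation groups: {unknown}; "
            f"expected a subset of {list(known)}"
        )
    return requested
```

The return value can be passed straight through as the annotations argument of session.export(), so a misspelling such as cam_pose fails before any processing starts.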
