Metrics

Every metric Evaluate computes, grouped by section. Use this as the reference for what each number means.

Evaluate.compute() returns a dict whose top-level keys mirror the report sections. This page is the field-level reference: what each metric is, its units, and where it shows up in the HTML.

metrics = Evaluate(session).compute()
metrics["trajectory"]["path_length_m"]   # 12.43
metrics["hands"]["frames_with_2_hands_pct"]  # 41.2  (frames with exactly 2 hands)

Anything that can't be computed (missing stream, empty buffer) becomes None.
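Because any value can be None, downstream code may want a tolerant lookup. A minimal sketch, assuming the dict layout described above: `get_metric` is a hypothetical helper (not part of Evaluate), the sample dict is made up, and treating a whole section as possibly None is an assumption.

```python
def get_metric(metrics, section, key, default=None):
    """Return metrics[section][key], or `default` when the section or value is missing or None."""
    block = metrics.get(section) or {}  # assumed: a section itself may be absent or None
    value = block.get(key)
    return default if value is None else value

# Made-up stand-in for the dict returned by Evaluate(session).compute()
sample = {
    "trajectory": {"path_length_m": 12.43},
    "hands": None,
}

path_len = get_metric(sample, "trajectory", "path_length_m")
two_hands = get_metric(sample, "hands", "frames_with_2_hands_pct", default=0.0)
```

Choosing the default explicitly at the call site keeps "not computed" distinct from a genuine zero.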

Recording

metrics["recording"] — file-level facts pulled from the MCAP summary.

Key	Type	Notes
`path`, `filename`	`str`	Absolute path and bare filename.
`size_bytes`, `size_mb`	`int`, `float`	File size on disk.
`duration_s`, `duration_hms`	`float`, `str`	Seconds and `HH:MM:SS`.
`start_time`, `end_time`	`float`	Epoch seconds.
`start_iso`, `end_iso`, `weekday`	`str`	Human-readable wrappers.
`message_count`	`int`	Total MCAP messages across all topics.
`topic_counts`	`dict[str, int]`	Per-topic message counts. Rendered as a collapsible table.
`topics_present_count`	`int`
`missing_reference_topics`	`list[str]`	Reference topics from `MCAPReader.REFERENCE_TOPICS` that have zero messages.

RGB stream

metrics["rgb"] — pulled from session._rgb_ts() and session.rgb_intrinsics.

Key	Type	Notes
`frame_count`	`int`
`effective_fps`	`float`	`frame_count / duration`.
`median_dt_ms`, `min_dt_ms`, `max_dt_ms`, `dt_std_ms`	`float`	Inter-frame interval statistics.
`gap_count`	`int`	Number of inter-frame intervals greater than `2 × median_dt`. Feeds into the health score.
`intrinsics`	`dict`	See Intrinsics below.
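The interval statistics and the gap rule can be sketched in plain Python. This is an illustration of the definitions in the table, not Evaluate's own code; `interval_stats` and the timestamps are hypothetical.

```python
from statistics import median

def interval_stats(timestamps_s):
    """Inter-frame interval statistics from a sorted list of timestamps in seconds."""
    dts_ms = [(b - a) * 1000.0 for a, b in zip(timestamps_s, timestamps_s[1:])]
    if not dts_ms:
        return None  # fewer than two frames: nothing to measure
    med = median(dts_ms)
    return {
        "median_dt_ms": med,
        "min_dt_ms": min(dts_ms),
        "max_dt_ms": max(dts_ms),
        # A gap is any interval strictly greater than twice the median.
        "gap_count": sum(1 for dt in dts_ms if dt > 2 * med),
    }

# Made-up timestamps: a steady 2 fps stream with one 2-second stall.
rgb_ts = [0.0, 0.5, 1.0, 1.5, 3.5, 4.0]
stats = interval_stats(rgb_ts)
```

Here the single 2000 ms interval exceeds twice the 500 ms median, so it counts as one gap.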

Intrinsics

Same shape for both rgb.intrinsics and depth.intrinsics.

Key	Type	Notes
`width`, `height`	`int`
`fx`, `fy`, `cx`, `cy`	`float`
`fx_over_fy`	`float`
`fov_x_deg`, `fov_y_deg`	`float`	Derived from focal length and image dimensions.
`aspect_ratio`	`float`
`principal_offset_px`	`(float, float)`	`cx - w/2`, `cy - h/2`.
`principal_offset_pct`	`(float, float)`	Same, normalised to image size.
`distortion_model`	`str`	E.g. `"plumb_bob"`.
`distortion`	`list[float]`	Distortion coefficient values.
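The derived fields can be reproduced from the raw intrinsics. The sketch below assumes the standard pinhole relation `fov = 2 * atan(size / (2 * f))` with distortion ignored, `aspect_ratio = width / height`, and offsets expressed as a percentage of the image size; none of these are confirmed by this page, and `derived_intrinsics` is a hypothetical name.

```python
import math

def derived_intrinsics(width, height, fx, fy, cx, cy):
    """Derived camera fields from pinhole intrinsics (distortion ignored)."""
    off_x, off_y = cx - width / 2, cy - height / 2
    return {
        "fx_over_fy": fx / fy,
        "fov_x_deg": math.degrees(2 * math.atan(width / (2 * fx))),
        "fov_y_deg": math.degrees(2 * math.atan(height / (2 * fy))),
        "aspect_ratio": width / height,  # assumed definition
        "principal_offset_px": (off_x, off_y),
        # Assumed normalisation: percent of the full image dimension.
        "principal_offset_pct": (100 * off_x / width, 100 * off_y / height),
    }

# Made-up intrinsics for a 640x480 camera.
intr = derived_intrinsics(width=640, height=480, fx=320.0, fy=320.0, cx=330.0, cy=235.0)
```

With `fx` equal to half the width, the horizontal field of view comes out at 90 degrees, which is a handy sanity check.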

Depth stream

metrics["depth"] — depth frames are iterated once with stride every_n = max(1, num_depth // 200), so a recording of any length finishes in a few seconds. The stats below come from those sampled frames.

Key	Type	Notes
`frame_count`	`int`	Full count, not the sampled count.
`effective_fps`	`float`
`median_dt_ms`	`float`
`sampled_frames`	`int`	How many frames the stats below came from.
`valid_pct_mean`	`float`	Average % of pixels with depth `> 0`. Colour-coded by `depth_valid_thresholds`. Feeds into the health score.
`valid_pct_min`, `valid_pct_max`, `valid_pct_std`	`float`
`empty_frame_count`	`int`	Frames where every pixel is zero.
`global_min_m`, `global_max_m`	`float`	Across all valid pixels in the sample set.
`depth_percentiles_m`	`dict`	`{"p5", "p50", "p95"}` in metres.
`depth_hist_counts`, `depth_hist_pct`	`dict`	Buckets `<1m / 1-2m / 2-5m / >5m`.
`intrinsics`	`dict`	See Intrinsics.
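A short sketch of the sampling stride and the per-frame validity percentage. The stride formula is the one quoted above; `sample_stride` and `valid_pct` are hypothetical helpers, and depth frames are modelled as nested lists of metres purely for illustration.

```python
def sample_stride(num_depth, target=200):
    """Stride used to subsample depth frames: max(1, num_depth // 200)."""
    return max(1, num_depth // target)

def valid_pct(depth_frame):
    """Percentage of pixels with depth > 0 in a 2D list of depth values."""
    pixels = [d for row in depth_frame for d in row]
    return 100.0 * sum(1 for d in pixels if d > 0) / len(pixels)

# Which frame indices would be visited for a 450-frame recording.
every_n = sample_stride(450)
sampled_indices = list(range(0, 450, every_n))

# Made-up 2x2 frame: two valid pixels, two zero (invalid) pixels.
frame_valid = valid_pct([[0.0, 1.2], [0.8, 0.0]])
```

Note that the integer division means the sampled count is not exactly 200: a 450-frame recording gets stride 2 and therefore 225 sampled frames.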

Camera trajectory

metrics["trajectory"] — derived from session.all_camera_poses(). World frame is the MCAP's pose frame; height = Y.

Key	Type	Notes
`pose_count`	`int`
`effective_rate_hz`	`float`
`path_length_m`	`float`	Sum of segment lengths.
`net_displacement_m`	`float`	Start-to-end distance.
`tortuosity`	`float`	`path_length_m / net_displacement_m`. 1.0 is a straight line; larger values mean a more winding path.
`bbox_min`, `bbox_max`, `bbox_extents`	`list[float]`	Axis-aligned bounding box in world frame.
`bbox_volume_m3`	`float`
`footprint_area_m2`	`float`	Convex-hull area of positions projected onto the XZ plane.
`height_min_m`, `height_max_m`, `height_mean_m`, `height_std_m`	`float`	Y-axis distribution.
`speed_mean_mps`, `speed_median_mps`, `speed_p95_mps`, `speed_max_mps`	`float`	Segment length over segment dt.
`accel_mean_mps2`, `accel_max_mps2`	`float`	Finite difference of speed.
`yaw_rate_deg_per_s`, `pitch_rate_deg_per_s`, `roll_rate_deg_per_s`	`float`	Median absolute angular rate per axis, from rotation-matrix ZYX Euler decomposition.
`cumulative_rotation_deg`	`float`	Sum of \|Δheading\|.
`turn_count`	`int`	Heading steps greater than 45°.
`stationary_duration_s`, `stationary_pct`	`float`	Time with speed `< 0.05 m/s`.

Plot inputs (ts_series, positions, speed_ts, speed_series, headings_deg) are also in the dict.
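The path metrics follow directly from the pose positions and timestamps. A minimal sketch of the definitions in the table, using made-up poses; `path_metrics` is hypothetical, and returning None for tortuosity when the displacement is zero is an assumption.

```python
import math

def path_metrics(ts, positions, still_mps=0.05):
    """Path length, displacement, tortuosity, peak speed and stationary time."""
    seg_len = [math.dist(a, b) for a, b in zip(positions, positions[1:])]
    seg_dt = [t1 - t0 for t0, t1 in zip(ts, ts[1:])]
    # Speed per segment: segment length over segment dt (zero-dt segments skipped).
    moving = [(l / dt, dt) for l, dt in zip(seg_len, seg_dt) if dt > 0]
    path = sum(seg_len)
    disp = math.dist(positions[0], positions[-1])
    return {
        "path_length_m": path,
        "net_displacement_m": disp,
        "tortuosity": path / disp if disp > 0 else None,  # assumed handling
        "speed_max_mps": max(s for s, _ in moving) if moving else None,
        "stationary_duration_s": sum(dt for s, dt in moving if s < still_mps),
    }

# Made-up poses: 3 m along X, 4 m along Z, then one second standing still.
ts = [0.0, 1.0, 2.0, 3.0]
positions = [(0, 0, 0), (3, 0, 0), (3, 0, 4), (3, 0, 4)]
traj = path_metrics(ts, positions)
```

The example walks two sides of a 3-4-5 triangle, so the path is 7 m, the net displacement is 5 m, and the tortuosity is 1.4.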

IMU

metrics["imu"] — from session.all_imu_samples().

Key	Type	Notes
`sample_count`	`int`
`effective_rate_hz`	`float`
`rate_jitter_ms`	`float`	Std-dev of inter-sample intervals.
`accel_axis_mean / std / min / max`	`list[float]`	Per-axis (x, y, z) m/s².
`accel_mag_mean / std / p95 / max`	`float`	Magnitude statistics.
`gyro_axis_mean / std / min / max`	`list[float]`	Per-axis rad/s.
`gyro_mag_mean / std / p95 / max`	`float`
`gravity_vector`	`list[float]`	Mean accel vector — approximates the gravity direction.
`gravity_magnitude`	`float`	\|mean accel\|.
`gravity_deviation`	`float`	\|9.81 − gravity_magnitude\|. Colour-coded by `imu_gravity_max_dev`.
`jolt_count`	`int`	Samples with \|accel\| > 20 m/s².
`high_rotation_events`	`int`	Samples with \|gyro\| > 2 rad/s.
`motion_duration_s`, `still_duration_s`	`float`	Time accel-mag is above / below the gravity-normalised threshold.
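The gravity fields and the jolt count can be illustrated with a few accelerometer samples. This is a sketch of the definitions above, not the library's code; `gravity_metrics` and the sample values are hypothetical.

```python
import math

def gravity_metrics(accel_samples, g=9.81, jolt_mps2=20.0):
    """Gravity estimate from the mean accel vector, plus a count of jolts."""
    n = len(accel_samples)
    mean = [sum(s[i] for s in accel_samples) / n for i in range(3)]
    magnitude = math.hypot(*mean)
    return {
        "gravity_vector": mean,
        "gravity_magnitude": magnitude,
        "gravity_deviation": abs(g - magnitude),
        # A jolt is any single sample whose magnitude exceeds the threshold.
        "jolt_count": sum(1 for s in accel_samples if math.hypot(*s) > jolt_mps2),
    }

# Made-up samples in m/s^2 (x, y, z); the third one is a jolt.
samples = [(0.0, 9.5, 0.0), (0.0, 10.5, 0.0), (0.0, 25.0, 0.0), (0.0, -5.0, 0.0)]
imu = gravity_metrics(samples)
```

Averaging before taking the magnitude matters: motion that cancels out over the recording leaves mostly gravity in the mean vector.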

Tracking state

metrics["tracking_state"] — from /camera/tracking_state if present.

Key	Type	Notes
`message_count`	`int`
`state_counts`	`dict[str, int]`	Counts keyed by `state_str` from the tracking-state message.
`state_pct`	`dict[str, float]`	Percentages of total messages.
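The two dicts are simple tallies. A sketch with made-up state strings (the real `state_str` values are not listed on this page); `state_breakdown` is a hypothetical helper.

```python
from collections import Counter

def state_breakdown(state_strs):
    """Counts and percentages per tracking state string."""
    counts = Counter(state_strs)
    total = sum(counts.values())
    pct = {s: 100.0 * c / total for s, c in counts.items()} if total else {}
    return {"message_count": total, "state_counts": dict(counts), "state_pct": pct}

# Hypothetical state names, for illustration only.
breakdown = state_breakdown(["TRACKING", "TRACKING", "TRACKING", "LOST"])
```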

TF transforms

metrics["tf"] — every /tf message decoded once.

Key	Type	Notes
`message_count`	`int`
`unique_pair_count`	`int`
`pairs`	`list[dict]`	One row per parent→child pair: `{parent, child, count, rate_hz}`.

Trajectory topic

metrics["trajectory_topic"] — the /trajectory topic (separate from /camera/pose).

Key	Type	Notes
`pose_count`	`int`
`path_length_m`	`float`	Same calculation as in `trajectory`, but from the `/trajectory` poses.

Mesh

metrics["mesh"] — from /map/mesh.

Key	Type	Notes
`vertex_count`, `face_count`	`int`
`bbox_extents`, `bbox_volume_m3`	`list[float]`, `float`
`surface_area_m2`	`float`	Sum of triangle areas via cross product.
`edge_length_mean_m`, `edge_length_p5_m`, `edge_length_p95_m`	`float`	Edge-length distribution across all triangles.
`color_coverage_pct`	`float`	% of vertices that aren't the default grey (`[128,128,128]`).
`verts_per_m2`, `faces_per_m2`	`float`	Vertex and face counts divided by `surface_area_m2`.
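Surface area via the cross product means summing half the norm of each triangle's edge cross product. A minimal sketch on a made-up two-triangle mesh; `triangle_area` and `surface_area` are hypothetical names.

```python
import math

def triangle_area(a, b, c):
    """Area of triangle abc: half the norm of (b - a) x (c - a)."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * math.hypot(*cross)

def surface_area(vertices, faces):
    """Sum of triangle areas; each face is a triple of vertex indices."""
    return sum(triangle_area(*(vertices[i] for i in face)) for face in faces)

# Made-up mesh: a unit square split into two triangles.
vertices = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
faces = [(0, 1, 2), (0, 2, 3)]
area_m2 = surface_area(vertices, faces)
verts_per_m2 = len(vertices) / area_m2
```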

Point cloud

metrics["point_cloud"] — from /map/mesh_cloud if present, else /map/point_cloud.

Key	Type	Notes
`point_count`	`int`
`bbox_extents`, `bbox_volume_m3`	`list[float]`, `float`
`density_pts_per_m3`	`float`
`color_coverage_pct`	`float`	% of points with non-zero RGB.

Sync quality

metrics["sync"] — nearest-neighbour offsets between RGB timestamps and each other stream. Each sub-block has:

Key	Type	Notes
`median_ms`, `p95_ms`, `max_ms`	`float`	\|Δt\| statistics across all RGB timestamps.
`within_50ms_pct`, `within_100ms_pct`	`float`	Percentage of RGB frames whose nearest match is within the threshold. The 50 ms number is colour-coded by `sync_thresholds` and feeds into the health score.

Sub-blocks:

rgb_vs_depth
rgb_vs_pose
rgb_vs_imu
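Nearest-neighbour matching against a sorted timestamp list can be done with a binary search per RGB frame. The sketch below illustrates the idea under stated assumptions: both lists are sorted, the thresholds are inclusive, and `nearest_offsets_ms` and `sync_block` are hypothetical names rather than Evaluate's API.

```python
from bisect import bisect_left
from statistics import median

def nearest_offsets_ms(rgb_ts, other_ts):
    """|dt| in ms from each RGB timestamp to its nearest neighbour in sorted other_ts."""
    offsets = []
    for t in rgb_ts:
        i = bisect_left(other_ts, t)
        # The nearest neighbour is either just before or at the insertion point.
        candidates = other_ts[max(0, i - 1): i + 1]
        offsets.append(min(abs(t - c) for c in candidates) * 1000.0)
    return offsets

def sync_block(rgb_ts, other_ts):
    if not rgb_ts or not other_ts:
        return None  # a missing stream yields no sync statistics
    off = nearest_offsets_ms(rgb_ts, other_ts)
    return {
        "median_ms": median(off),
        "max_ms": max(off),
        "within_50ms_pct": 100.0 * sum(1 for o in off if o <= 50) / len(off),
        "within_100ms_pct": 100.0 * sum(1 for o in off if o <= 100) / len(off),
    }

# Made-up timestamps in seconds.
rgb_ts = [0.0, 1.0, 2.0, 3.0]
depth_ts = [0.03125, 1.0, 2.0625, 3.25]
rgb_vs_depth = sync_block(rgb_ts, depth_ts)
```

The per-frame offsets here are 31.25, 0, 62.5 and 250 ms, so half the frames fall within 50 ms and three quarters within 100 ms.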

Hands

metrics["hands"] — only populated when session.add_hand_pose(frame.index, hands) was called during the loop.

Three headline buckets are reported alongside the per-side detection rates:

≥1 hand (frames_with_1plus_hand*) — at least one detection in the frame. Same value as frames_with_any_hand and the natural "did we see hands at all" KPI.
Exactly 2 hands (frames_with_2_hands*) — both hands cleanly detected, no spurious extras. The most useful KPI for two-handed manipulation tasks.
More than 2 hands (frames_with_more_hands*) — typically a detection error (false-positive on background, mirrored reflection, second person in frame). Reported as a metric only; no health-score deduction by default.

The pie chart in the report shows the disjoint exact-count buckets (no hands / exactly 1 / exactly 2 / >2) derived on the fly from counts_per_frame.
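The relationship between the disjoint pie-chart buckets and the headline numbers can be sketched from a per-frame count array. A plain list stands in for the `np.ndarray` here; `hand_buckets`, the bucket labels, and the counts are hypothetical.

```python
def hand_buckets(counts_per_frame):
    """Disjoint exact-count buckets plus the headline counts and percentages."""
    counts = [int(c) for c in counts_per_frame]
    n = len(counts)
    buckets = {
        "no_hands": sum(1 for c in counts if c == 0),
        "exactly_1": sum(1 for c in counts if c == 1),
        "exactly_2": sum(1 for c in counts if c == 2),
        "more_than_2": sum(1 for c in counts if c > 2),
    }
    headline = {
        "frames_with_1plus_hand": n - buckets["no_hands"],
        "frames_with_2_hands": buckets["exactly_2"],
        "frames_with_more_hands": buckets["more_than_2"],
    }
    pct = {k + "_pct": (100.0 * v / n if n else None) for k, v in headline.items()}
    return buckets, {**headline, **pct}

# Made-up detections for a 10-frame recording.
buckets, headline = hand_buckets([0, 1, 2, 2, 3, 0, 1, 2, 0, 2])
```

The four buckets always sum to the frame total, whereas the headline numbers overlap: a frame with two hands also counts toward the one-or-more figure.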

Key	Type	Notes
`backend`	`str`	E.g. `"wilor"`, `"mediapipe"`, `"hamer"`.
`frames_total`	`int`	Same as `num_rgb_frames`.
`frames_with_any_hand`	`int`	Frames with ≥1 hand (alias for `frames_with_1plus_hand`).
`frames_with_any_hand_pct`	`float`	Colour-coded by `hand_any_thresholds`. Feeds into the health score.
`frames_with_1plus_hand`, `frames_with_1plus_hand_pct`	`int`, `float`	Frames with at least 1 hand. Coloured by `hand_1plus_thresholds`.
`frames_with_2_hands`, `frames_with_2_hands_pct`	`int`, `float`	Frames with exactly 2 hands. Coloured by `hand_2_thresholds`.
`frames_with_more_hands`, `frames_with_more_hands_pct`	`int`, `float`	Frames with strictly more than 2 hands. Metric only — no score deduction by default.
`counts_per_frame`	`np.ndarray (frames_total,)`	Number of hand detections per frame (clipped at 100). Used by the pie chart to derive exact buckets.
`left_detection_pct`, `right_detection_pct`	`float`	Per-side rate.
`both_hands_pct`	`float`	Frames with both sides detected.
`left_conf_mean`, `left_conf_p10 / p50 / p90`	`float`	Per-side confidence stats. Same for `right_conf_*`.
`has_3d`	`bool`	True if any hand had non-zero z.
`mano_frames_pct`	`float`	% of any-hand frames where MANO vertices were attached. `None` for non-MANO backends.
`kpts_in_frame_pct`	`float`	Percentage of all 2D keypoints that fall inside the RGB image bounds.
`left_wrist_depth_mean_m`, `right_wrist_depth_mean_m`	`float`	Mean Z (camera-frame depth) of the wrist joint.
`palm_width_mean_m`	`float`	Distance between MCP joints 5 and 17.
`grip_closure_mean_m`	`float`	Mean tip-to-wrist distance across the 5 fingertips.
`left_wrist_track`, `right_wrist_track`	`dict`	`{length_m, speed_mean_mps, speed_max_mps}` in the camera frame, assuming even frame timing.

Skeleton

metrics["skeleton"] — only when you pass skeleton=... to Evaluate.

Key	Type	Notes
`frame_count`	`int`
`detection_pct`	`float`	Ratio of frames with a skeleton present. Currently always 100.
`joint_visibility_pct`	`dict[str, float]`	Per-joint visibility (`head`, `neck`, `spine`, `l_shoulder`, `l_elbow`, `l_wrist`, `r_shoulder`, `r_elbow`, `r_wrist`, `mount_cam`).
`elbow_left_mean_deg`, `elbow_right_mean_deg`	`float`	Angle at the elbow joint, in degrees.
`reach_left_mean_m`, `reach_right_mean_m`	`float`	Shoulder-to-wrist distance.
`head_height_mean_m`	`float`	Mean Y of the head joint.
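The elbow angle and reach are per-frame geometric quantities that are then averaged. A sketch for a single frame, assuming the elbow angle is the angle between the elbow-to-shoulder and elbow-to-wrist vectors (this page does not spell out the convention); `joint_angle_deg` and the joint positions are hypothetical.

```python
import math

def joint_angle_deg(a, b, c):
    """Angle at joint b, in degrees, between the segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    norm = math.hypot(*ba) * math.hypot(*bc)
    if norm == 0:
        return None  # degenerate: two joints coincide
    cos_angle = sum(x * y for x, y in zip(ba, bc)) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

# Made-up joint positions in metres: upper arm vertical, forearm horizontal.
shoulder, elbow, wrist = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
elbow_deg = joint_angle_deg(shoulder, elbow, wrist)
reach_m = math.dist(shoulder, wrist)  # shoulder-to-wrist distance
```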

Streamed rgb.mp4

metrics["streamed_rgb"] — only when session.add_rgb_frame was used.

Key	Type	Notes
`active`	`bool`	True while the H.264 writer is still open.
`tmp_path`, `tmp_size_bytes`	`str`, `int`	Where the temp mp4 lives and how big it is at compute time.
`width`, `height`, `fps`	`int`, `float`	Pulled from `session.rgb_intrinsics`.
`has_thumbnail`	`bool`	Whether `session._rgb_mid_frame` was captured.

Health

metrics["health"] — the rollup. See Health score for how it's computed.

Key	Type	Notes
`score`	`float`	`0..100`, clamped.
`notes`	`list[str]`	One human-readable line per active deduction. Rendered next to the score.

Config snapshot

metrics["config"] — the EvaluateConfig instance used for the run. Surfaced so the report can dump the active thresholds and so downstream code can diff configs.

Plot input arrays (ts_series, positions, accel_mag_series, global_depth_samples, etc.) are kept on the dict alongside the scalars so plotting and metric layers stay decoupled. They're safe to ignore from Python.