Face blurring

FaceBlurrer wraps three face detectors behind a single API. They differ only in how they find faces. The blur compositing is identical across all three (a large-kernel cv2.blur applied through an elliptical mask), so for the same detected boxes the blurred output looks the same regardless of which backend you pick.

from stera.models import FaceBlurrer

blur = FaceBlurrer(model="mediapipe")     # or "egoblur", "retinaface"
for frame in session.frames():
    blurred_rgb = blur.blur(frame)

Picking a backend

Backend	Setup	Detector	When to use
MediaPipe	`pip install "stera-sdk[mediapipe]"`; nothing else	BlazeFace (Tasks API)	CPU, zero setup, best for quick local runs.
RetinaFace	`pip install "stera-sdk[retinaface]"`; weights auto-download	RetinaFace via `batch-face`	GPU, fast, no manual weight management. Good default when you have CUDA.
EgoBlur	Clone Meta's EgoBlur repo + drop `ego_blur_face_gen2.jit` in it	Meta's gen2 EgoBlur (TorchScript)	Highest face recall on egocentric footage; what the original release was tuned for.

MediaPipe

blur = FaceBlurrer(model="mediapipe")
blurred = blur.blur(frame)              # single frame
blurred_list = blur.blur_batch([rgb1, rgb2, rgb3])

The model asset (~3 MB) downloads to ~/.cache/mediapipe/face_detector.task on first call. Inference runs on CPU only.

RetinaFace

blur = FaceBlurrer(
    model="retinaface",
    network="resnet50",   # or "mobilenet" (default, ~2 MB)
    gpu_id=0,             # set to -1 for CPU
)

Weights download to ~/.cache/torch/hub/checkpoints/ on first call. GPU by default.
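
If the same script has to run on machines with and without CUDA, gpu_id can be chosen at runtime. The helper below is a minimal sketch and not part of the SDK: pick_gpu_id is a hypothetical name, and it assumes that "PyTorch reports a CUDA device" is a good enough signal for picking 0 over -1.

```python
def pick_gpu_id() -> int:
    """Return 0 when PyTorch reports a CUDA device, otherwise -1 (CPU)."""
    try:
        import torch  # optional dependency; may be absent on CPU-only installs
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


gpu_id = pick_gpu_id()
# Pass the result through, e.g. FaceBlurrer(model="retinaface", gpu_id=gpu_id)
```

On a multi-GPU machine you would likely still want to pass an explicit index instead.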

EgoBlur

blur = FaceBlurrer(
    model="egoblur",
    model_path="/opt/EgoBlur",   # contains the gen2/ source + the .jit file
    score_thresh=0.8,
    iou_thresh=0.5,
)

The model_path directory must contain both:

ego_blur_face_gen2.jit, the TorchScript model
gen2/script/..., the gen2 Python source (vendored from Meta's repo)

The SDK prepends model_path to sys.path so import gen2.script.predictor resolves.
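
A quick pre-flight check can catch a missing file before the first import fails. The following is a sketch only: prepare_egoblur_dir is a hypothetical helper, not an SDK function, and it merely mirrors the two requirements and the sys.path behavior described above.

```python
import sys
from pathlib import Path


def prepare_egoblur_dir(model_path: str) -> list[str]:
    """Return the required entries that are missing from model_path.

    When nothing is missing, also put model_path at the front of sys.path
    so that `import gen2.script.predictor` can resolve.
    """
    root = Path(model_path)
    required = {
        "ego_blur_face_gen2.jit": root / "ego_blur_face_gen2.jit",
        "gen2/script": root / "gen2" / "script",
    }
    missing = [name for name, path in required.items() if not path.exists()]
    if not missing and str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return missing
```

Calling prepare_egoblur_dir("/opt/EgoBlur") returns an empty list when the layout is complete, or the names of whatever is missing.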

See Installation → EgoBlur.

Detection only

If you want the bounding boxes without applying the blur (e.g. to draw your own overlay):

boxes_per_frame = blur.detect_boxes([rgb1, rgb2, rgb3])
# list of (N, 4) float32 arrays, [x1, y1, x2, y2] in pixel coords

blur.blur(...) is just apply_elliptical_blur(rgb, blur.detect_boxes([rgb])[0], ...), exposed as a single call for the common case.
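
To make the compositing step concrete, here is a NumPy-only sketch of blurring through an elliptical mask. It is not the SDK's apply_elliptical_blur: the function names, the kernel size k, and the integral-image mean filter (standing in for cv2.blur) are all assumptions made for illustration.

```python
import numpy as np


def box_blur(img: np.ndarray, k: int) -> np.ndarray:
    """Mean filter with an odd k x k window, computed from an integral image."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    # Leading zero row/column so window sums are plain differences.
    integral = np.pad(padded.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0), (0, 0)))
    h, w = img.shape[:2]
    window_sum = (
        integral[k:k + h, k:k + w]
        - integral[:h, k:k + w]
        - integral[k:k + h, :w]
        + integral[:h, :w]
    )
    return window_sum / (k * k)


def elliptical_blur_sketch(rgb: np.ndarray, boxes, k: int = 31) -> np.ndarray:
    """Blur the ellipse inscribed in each [x1, y1, x2, y2] box; keep the rest."""
    h, w = rgb.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    mask = np.zeros((h, w), dtype=bool)
    for x1, y1, x2, y2 in boxes:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        rx, ry = max((x2 - x1) / 2, 1e-6), max((y2 - y1) / 2, 1e-6)
        mask |= ((xx - cx) / rx) ** 2 + ((yy - cy) / ry) ** 2 <= 1.0
    blurred = box_blur(rgb, k)  # blurs the whole frame: simple, not optimal
    out = np.where(mask[..., None], np.rint(blurred), rgb)
    return out.astype(rgb.dtype)
```

Pixels outside every ellipse are copied through unchanged, which is why the output keeps the input's shape and dtype.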

What blur accepts

blur.blur(frame)             # SyncedFrame → uses frame.rgb
blur.blur(rgb)               # raw (H, W, 3) uint8 array
blur.blur_batch([f1, f2])    # list of SyncedFrame or raw arrays

blur.blur returns an RGB array with the same shape and dtype as its input; blur.blur_batch returns a list of such arrays, one per input.
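
Accepting both input kinds usually comes down to one small normalization step. The sketch below shows one way this could be done; as_rgb is a hypothetical helper, and the only assumption about SyncedFrame is the .rgb attribute mentioned above.

```python
from types import SimpleNamespace

import numpy as np


def as_rgb(item) -> np.ndarray:
    """Return the (H, W, 3) uint8 array behind a frame-like object or a raw array."""
    rgb = np.asarray(getattr(item, "rgb", item))  # frame-like objects expose .rgb
    if rgb.ndim != 3 or rgb.shape[2] != 3 or rgb.dtype != np.uint8:
        raise ValueError(f"expected (H, W, 3) uint8, got {rgb.shape} {rgb.dtype}")
    return rgb


# A stand-in for a SyncedFrame, used only for this illustration.
fake_frame = SimpleNamespace(rgb=np.zeros((4, 6, 3), dtype=np.uint8))
batch = [as_rgb(x) for x in (fake_frame, fake_frame.rgb)]
```

Validating shape and dtype up front gives a clearer error than a failure deep inside a detector.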

Configuration

Per-backend config fields are forwarded as kwargs from FaceBlurrer(...).

EgoBlur

Field	Default	Notes
`egoblur_dir`	required	Local clone of EgoBlur.
`model_path`	auto	Override the JIT path. Defaults to `<egoblur_dir>/ego_blur_face_gen2.jit`.
`code_dir`	auto	Override the gen2 source dir. Defaults to `egoblur_dir`.
`device`	`"cuda"`
`score_thresh`	`0.8`	Minimum face-detection score.
`iou_thresh`	`0.5`	NMS IoU.
`use_fp16`	`True`	FP16 on CUDA when supported.
`batch_size`	`8`	Internal sub-batch size.
`scale_factor_detections`	`1.15`	bbox inflation before blur.
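
For intuition about how score_thresh and iou_thresh interact, here is a generic score filter followed by greedy NMS. This is a textbook sketch, not EgoBlur's code; filter_detections and iou are hypothetical names, and the real detector may order or break ties differently.

```python
import numpy as np


def iou(box: np.ndarray, others: np.ndarray) -> np.ndarray:
    """IoU between one [x1, y1, x2, y2] box and an (M, 4) array of boxes."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / np.maximum(area + areas - inter, 1e-9)


def filter_detections(boxes, scores, score_thresh=0.8, iou_thresh=0.5):
    """Drop low-score boxes, then suppress boxes overlapping a higher-scoring one."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float32)
    confident = scores >= score_thresh
    boxes, scores = boxes[confident], scores[confident]
    order = np.argsort(-scores)  # highest score first
    kept = []
    while order.size:
        best, rest = order[0], order[1:]
        kept.append(best)
        order = rest[iou(boxes[best], boxes[rest]) <= iou_thresh]
    kept = np.array(kept, dtype=int)
    return boxes[kept], scores[kept]
```

Raising score_thresh trades recall for fewer false positives; lowering iou_thresh merges overlapping detections more aggressively.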

MediaPipe (face)

Field	Default	Notes
`min_detection_confidence`	`0.5`
`model_selection`	`1`	`0` = short-range (close subjects), `1` = full-range (general).
`scale_factor_detections`	`1.15`

RetinaFace

Field	Default	Notes
`network`	`"mobilenet"`	Or `"resnet50"` (~109 MB, more accurate).
`gpu_id`	`0`	`-1` for CPU.
`score_thresh`	`0.8`
`batch_size`	`8`
`scale_factor_detections`	`1.15`
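
All three backends expose scale_factor_detections, described above as bbox inflation before blur. The sketch below shows one plausible reading: scale each box about its center, then clip to the frame. inflate_boxes is a hypothetical helper, and the SDK's exact inflation and clipping rules are assumptions here.

```python
import numpy as np


def inflate_boxes(boxes, scale=1.15, width=None, height=None) -> np.ndarray:
    """Scale [x1, y1, x2, y2] boxes about their centers, optionally clipping."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    cx = (boxes[:, 0] + boxes[:, 2]) / 2
    cy = (boxes[:, 1] + boxes[:, 3]) / 2
    half_w = (boxes[:, 2] - boxes[:, 0]) / 2 * scale
    half_h = (boxes[:, 3] - boxes[:, 1]) / 2 * scale
    out = np.stack([cx - half_w, cy - half_h, cx + half_w, cy + half_h], axis=1)
    if width is not None:
        out[:, [0, 2]] = np.clip(out[:, [0, 2]], 0, width)   # keep x inside the frame
    if height is not None:
        out[:, [1, 3]] = np.clip(out[:, [1, 3]], 0, height)  # keep y inside the frame
    return out
```

With the default 1.15, a 20 x 20 px box grows to 23 x 23 px, which helps cover hairline and chin when the detector's box is tight.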

Patterns

Stream blurred frames into the episode video

for frame in session.frames():
    blurred = blur.blur(frame)
    session.add_rgb_frame(frame.index, blurred)

session.export("episodes/run_01")  # rgb.mp4 contains the blurred stream

session.add_rgb_frame lazily opens an internal H.264 writer. The mp4 ends up in the episode directory with the same fps/intrinsics as the source recording.

Apply only to detected boxes (custom overlay)

import cv2

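for frame in session.frames():
    # OpenCV draws in BGR; cvtColor returns a new array, so frame.rgb is untouched.
    bgr = cv2.cvtColor(frame.rgb, cv2.COLOR_RGB2BGR)
    [boxes] = blur.detect_boxes([frame.rgb])   # one (N, 4) array for the single frame
    for x1, y1, x2, y2 in boxes:
        # Boxes are float32; cv2.rectangle needs integer pixel coordinates.
        cv2.rectangle(bgr, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)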

Combine with hand tracking in one loop

for frame in session.frames():
    blurred = blur.blur(frame)
    hands   = tracker.detect_hands(frame)   # use the original (faces don't matter for hand det)

    session.add_rgb_frame(frame.index, blurred)
    session.add_hand_pose(frame.index, hands)

Run hand tracking on the original, unblurred frame and write the blurred frame to the output. Blurred faces don't help hand detection, and because the two pipelines are independent, feeding the tracker the original frame costs effectively nothing extra.
