Face blurring

FaceBlurrer wraps three face detectors behind a single API. The backends differ only in how they find faces. The blur compositing (a large-kernel cv2.blur applied through an elliptical mask) is identical across all three, so the blur itself looks the same whichever backend you pick; only the detected boxes differ.

from stera.models import FaceBlurrer

blur = FaceBlurrer(model="mediapipe")     # or "egoblur", "retinaface"
for frame in session.frames():
    blurred_rgb = blur.blur(frame)

Picking a backend

| Backend | Setup | Detector | When to use |
|---|---|---|---|
| MediaPipe | pip install "stera-sdk[mediapipe]"; nothing else | BlazeFace (Tasks API) | CPU, zero setup, best for quick local runs. |
| RetinaFace | pip install "stera-sdk[retinaface]"; weights auto-download | RetinaFace via batch-face | GPU, fast, no manual weight management. Good default when you have CUDA. |
| EgoBlur | Clone Meta's EgoBlur repo + drop ego_blur_face_gen2.jit in it | Meta's gen2 EgoBlur (TorchScript) | Highest face recall on egocentric footage; what the original release was tuned for. |

MediaPipe

blur = FaceBlurrer(model="mediapipe")
blurred = blur.blur(frame)              # single frame
blurred_list = blur.blur_batch([rgb1, rgb2, rgb3])

The asset (~3 MB) downloads to ~/.cache/mediapipe/face_detector.task on first call. Pure CPU.

RetinaFace

blur = FaceBlurrer(
    model="retinaface",
    network="resnet50",   # or "mobilenet" (default, ~2 MB)
    gpu_id=0,             # set to -1 for CPU
)

Weights download to ~/.cache/torch/hub/checkpoints/ on first call. GPU by default.

EgoBlur

blur = FaceBlurrer(
    model="egoblur",
    egoblur_dir="/opt/EgoBlur",   # contains the gen2/ source + the .jit file
    score_thresh=0.8,
    iou_thresh=0.5,
)

The egoblur_dir directory must contain both:

  • ego_blur_face_gen2.jit, the TorchScript model
  • gen2/script/..., the gen2 Python source (vendored from Meta's repo)

The SDK prepends this directory to sys.path so that import gen2.script.predictor resolves. To keep the model file or the source somewhere else, override model_path or code_dir (see Configuration).

See Installation → EgoBlur.

Detection only

If you want the bounding boxes without applying the blur (e.g. to draw your own overlay):

boxes_per_frame = blur.detect_boxes([rgb1, rgb2, rgb3])
# list of (N, 4) float32 arrays, [x1, y1, x2, y2] in pixel coords

blur.blur(...) is just apply_elliptical_blur(rgb, blur.detect_boxes([rgb])[0], ...), exposed as a single call for the common case.
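
To make the compositing step concrete, here is a minimal NumPy-only sketch of blurring through an elliptical mask. It is not the SDK's apply_elliptical_blur: the names elliptical_blur_sketch and box_blur, the kernel argument, and the hard-edged mask are assumptions made for illustration, and box_blur stands in for cv2.blur so the example needs nothing beyond NumPy.

```python
import numpy as np

def box_blur(img, k):
    """k x k mean filter with edge padding (a NumPy stand-in for cv2.blur)."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    # Integral image with a leading zero row/column: each window sum is 4 lookups.
    integral = np.pad(padded.cumsum(0).cumsum(1), ((1, 0), (1, 0), (0, 0)))
    h, w = img.shape[:2]
    total = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
             - integral[k:k + h, :w] + integral[:h, :w])
    return total / (k * k)

def elliptical_blur_sketch(rgb, boxes, kernel=51):
    """Replace the ellipse inscribed in each [x1, y1, x2, y2] box with blurred pixels."""
    k = int(kernel) | 1                      # force an odd kernel size
    out = rgb.copy()                         # leave the caller's array untouched
    blurred = box_blur(rgb, k)
    h, w = rgb.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    for x1, y1, x2, y2 in boxes:
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        rx, ry = max((x2 - x1) / 2.0, 1.0), max((y2 - y1) / 2.0, 1.0)
        # Pixels inside the ellipse satisfy the standard ellipse inequality.
        mask = ((xx - cx) / rx) ** 2 + ((yy - cy) / ry) ** 2 <= 1.0
        out[mask] = np.rint(blurred[mask]).astype(rgb.dtype)
    return out
```

Pixels in the corners of a box fall outside the ellipse and stay sharp, which is why an elliptical mask follows a face more closely than blurring the whole rectangle.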

What blur accepts

blur.blur(frame)             # SyncedFrame → uses frame.rgb
blur.blur(rgb)               # raw (H, W, 3) uint8 array
blur.blur_batch([f1, f2])    # list of SyncedFrame or raw arrays

blur returns a single RGB array and blur_batch returns a list of them; each output has the same shape and dtype as its input.
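
One way to accept either input type is to look for an rgb attribute and fall back to the object itself. The helper below is a hypothetical illustration of that idea, not the SDK's internal code; as_rgb_array is an invented name, and the shape and dtype check simply mirrors the (H, W, 3) uint8 contract stated above.

```python
import numpy as np

def as_rgb_array(item):
    """Return the (H, W, 3) uint8 array behind a frame-like object or a raw array."""
    # Frame-like objects expose .rgb; a plain ndarray has no such attribute.
    rgb = np.asarray(getattr(item, "rgb", item))
    if rgb.ndim != 3 or rgb.shape[2] != 3 or rgb.dtype != np.uint8:
        raise ValueError(
            f"expected (H, W, 3) uint8, got shape {rgb.shape} dtype {rgb.dtype}"
        )
    return rgb
```

Validating up front gives a clear error for grayscale or float images instead of a failure deep inside a detector.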

Configuration

Per-backend config fields are forwarded as kwargs from FaceBlurrer(...).

EgoBlur

| Field | Default | Notes |
|---|---|---|
| egoblur_dir | required | Local clone of EgoBlur. |
| model_path | auto | Override the JIT path. Defaults to <egoblur_dir>/ego_blur_face_gen2.jit. |
| code_dir | auto | Override the gen2 source dir. Defaults to egoblur_dir. |
| device | "cuda" | |
| score_thresh | 0.8 | Minimum face-detection score. |
| iou_thresh | 0.5 | NMS IoU. |
| use_fp16 | True | FP16 on CUDA when supported. |
| batch_size | 8 | Internal sub-batch size. |
| scale_factor_detections | 1.15 | bbox inflation before blur. |
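
To show how score_thresh and iou_thresh interact, here is a sketch of score filtering followed by standard greedy non-maximum suppression in NumPy. Whether EgoBlur's predictor performs exactly this procedure is an assumption; filter_and_nms is a hypothetical name, and the defaults are copied from the table above.

```python
import numpy as np

def filter_and_nms(boxes, scores, score_thresh=0.8, iou_thresh=0.5):
    """Drop low-score boxes, then greedily suppress overlapping ones."""
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float64)
    confident = scores >= score_thresh
    boxes, scores = boxes[confident], scores[confident]

    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(-scores)              # highest score first
    kept = []
    while order.size:
        i, rest = order[0], order[1:]
        kept.append(i)
        # Intersection of box i with every remaining box.
        iw = np.minimum(boxes[i, 2], boxes[rest, 2]) - np.maximum(boxes[i, 0], boxes[rest, 0])
        ih = np.minimum(boxes[i, 3], boxes[rest, 3]) - np.maximum(boxes[i, 1], boxes[rest, 1])
        inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)
        order = rest[iou <= iou_thresh]      # keep only boxes that overlap little
    kept = np.array(kept, dtype=int)
    return boxes[kept], scores[kept]
```

Raising score_thresh trades recall for fewer false positives, while lowering iou_thresh merges near-duplicate detections more aggressively.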

MediaPipe (face)

| Field | Default | Notes |
|---|---|---|
| min_detection_confidence | 0.5 | |
| model_selection | 1 | 0 = short-range (close subjects), 1 = full-range (general). |
| scale_factor_detections | 1.15 | |

RetinaFace

| Field | Default | Notes |
|---|---|---|
| network | "mobilenet" | Or "resnet50" (~109 MB, more accurate). |
| gpu_id | 0 | -1 for CPU. |
| score_thresh | 0.8 | |
| batch_size | 8 | |
| scale_factor_detections | 1.15 | |
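
All three backends take scale_factor_detections, which inflates each detected box before the blur is applied. A rough sketch of what that could look like follows. It assumes the box is scaled about its center and then clipped to the image, which this guide does not confirm; inflate_boxes and image_shape are hypothetical names.

```python
import numpy as np

def inflate_boxes(boxes, scale=1.15, image_shape=None):
    """Scale [x1, y1, x2, y2] boxes about their centers, optionally clipping to the image."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    cx = (boxes[:, 0] + boxes[:, 2]) / 2
    cy = (boxes[:, 1] + boxes[:, 3]) / 2
    half_w = (boxes[:, 2] - boxes[:, 0]) * scale / 2
    half_h = (boxes[:, 3] - boxes[:, 1]) * scale / 2
    out = np.stack([cx - half_w, cy - half_h, cx + half_w, cy + half_h], axis=1)
    if image_shape is not None:
        h, w = image_shape[:2]
        out[:, [0, 2]] = out[:, [0, 2]].clip(0, w - 1)   # x coordinates
        out[:, [1, 3]] = out[:, [1, 3]].clip(0, h - 1)   # y coordinates
    return out
```

A factor slightly above 1 gives the blur a margin around the detection, so hairline and chin are covered even when the detector's box is tight.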

Patterns

Stream blurred frames into the episode video

for frame in session.frames():
    blurred = blur.blur(frame)
    session.add_rgb_frame(frame.index, blurred)

session.export("episodes/run_01")  # rgb.mp4 contains the blurred stream

session.add_rgb_frame lazily opens an internal H.264 writer. The mp4 ends up in the episode directory with the same fps/intrinsics as the source recording.

Apply only to detected boxes (custom overlay)

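import cv2

for frame in session.frames():
    # OpenCV drawing and display functions expect BGR channel order.
    bgr = cv2.cvtColor(frame.rgb, cv2.COLOR_RGB2BGR)
    [boxes] = blur.detect_boxes([frame.rgb])   # one (N, 4) array per input frame
    for x1, y1, x2, y2 in boxes:
        cv2.rectangle(bgr, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
    # bgr now holds the frame with green face boxes drawn on it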

Combine with hand tracking in one loop

for frame in session.frames():
    blurred = blur.blur(frame)
    hands   = tracker.detect_hands(frame)   # use the original (faces don't matter for hand det)

    session.add_rgb_frame(frame.index, blurred)
    session.add_hand_pose(frame.index, hands)

Run hand tracking on the original, unblurred frame and write the blurred frame to the output. Blurring faces does nothing for hand detection, and because the two pipelines are independent, feeding the tracker the original frame costs effectively nothing.

See also