Real-time face detection powers everything from access control to live streaming overlays. YuNet—a lightweight, millisecond-level detector bundled with OpenCV—delivers the needed speed and accuracy without a heavyweight GPU setup. This DevTip shows how to pair YuNet with straightforward CLI automation to process images and videos at scale.

Understand YuNet’s strengths

YuNet is a deep-learning face detector that lives inside OpenCV’s DNN module. It balances speed, accuracy, and size, making it ideal for edge devices and serverless functions.

• Detection Range: Detects faces from roughly 10 × 10 pixels up to 300 × 300 pixels.
• Performance: Scores 0.8844 (AP_easy), 0.8656 (AP_medium), and 0.7503 (AP_hard) on the WIDER Face validation set.
• Efficiency: Runs in milliseconds on modest CPUs.

Citation:

@article{wu2023yunet,
  title={Yunet: A tiny millisecond-level face detector},
  author={Wu, Wei and Peng, Hanyang and Yu, Shiqi},
  journal={Machine Intelligence Research},
  volume={20},
  number={5},
  pages={656--665},
  year={2023},
  publisher={Springer}
}

Set up YuNet quickly

Create an isolated Python environment and grab OpenCV:

python3 -m venv venv
source venv/bin/activate   # On Windows: .\venv\Scripts\activate
pip install --upgrade opencv-python   # or opencv-python-headless on servers without a GUI (install only one)

Download the latest model (face_detection_yunet_2023mar.onnx, as of March 2023):

curl -fsSLo face_detection_yunet.onnx \
  https://github.com/opencv/opencv_zoo/raw/main/models/face_detection_yunet/face_detection_yunet_2023mar.onnx
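
Before writing any scripts, it can help to verify both the OpenCV build and the downloaded model. A minimal sanity check (the filename matches the curl command above; FaceDetectorYN needs a recent OpenCV 4.x release):

import cv2

print('OpenCV version:', cv2.__version__)
# Constructing a detector fails loudly if the ONNX file is missing or unreadable.
fd = cv2.FaceDetectorYN_create('face_detection_yunet.onnx', '', (320, 320))
print('YuNet model loaded')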

Automate single-image detection

detect_faces.py accepts a file path, draws rectangles, and writes output.jpg:

import cv2, sys, os

MODEL_PATH = 'face_detection_yunet.onnx'
# Default input size for the YuNet model (width, height) at creation time.
# The script overrides it below with the image's actual dimensions via setInputSize().
NETWORK_INPUT_SIZE = (320, 320)

if not os.path.isfile(MODEL_PATH):
    sys.exit(f"❌ Model file not found: {MODEL_PATH}")

img_path = sys.argv[1] if len(sys.argv) > 1 else ''
if not img_path:
    sys.exit("❌ Please provide an image path as an argument.")
if not os.path.isfile(img_path):
    sys.exit(f"❌ Image not found: {img_path}")

img = cv2.imread(img_path)
if img is None:
    sys.exit(f"❌ Could not read image: {img_path}")

fd = cv2.FaceDetectorYN_create(MODEL_PATH, '', NETWORK_INPUT_SIZE)

# Set the input size for the detection step to the image's actual dimensions.
# This helps the detector scale results correctly to the original image coordinates.
fd.setInputSize((img.shape[1], img.shape[0]))

_, faces = fd.detect(img)

if faces is not None:
    for face_data in faces:
        # face_data: [x, y, w, h, re_x, re_y, le_x, le_y, nt_x, nt_y, rcm_x, rcm_y, lcm_x, lcm_y, score]
        box = list(map(int, face_data[:4]))
        x, y, w, h = box[0], box[1], box[2], box[3]
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
else:
    print(f"⚠️ No faces detected in {img_path}")

output_filename = 'output.jpg'
cv2.imwrite(output_filename, img)
print(f"✅ Saved to {output_filename} (processed {img_path})")

Run it:

python detect_faces.py path/to/photo.jpg
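
Each detection row also carries five facial landmarks and a confidence score, as the comment in the script notes. If you want to visualize them, a hedged sketch of the drawing loop might look like this; it assumes img and faces from detect_faces.py are in scope and swaps in for the for loop inside the if faces is not None: block:

# Drop-in alternative for the drawing loop in detect_faces.py.
for face_data in faces:
    x, y, w, h = map(int, face_data[:4])
    score = float(face_data[14])  # detection confidence
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(img, f'{score:.2f}', (x, y - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    # Landmarks come in (x, y) pairs: right eye, left eye, nose tip,
    # right mouth corner, left mouth corner.
    for i in range(4, 14, 2):
        cv2.circle(img, (int(face_data[i]), int(face_data[i + 1])), 2, (0, 0, 255), -1)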

Batch-process folders

A short shell loop works, but Python’s multiprocessing can fully saturate CPU cores:

# batch_detect.py
from glob import glob
from multiprocessing import Pool
from subprocess import run
import os

# Ensure detect_faces.py is in the same directory or adjust the path
SCRIPT_PATH = os.path.join(os.path.dirname(__file__), 'detect_faces.py')


# Pool workers need a picklable, module-level function (a lambda won't work here).
def process_image(image_path):
    # Each worker invokes detect_faces.py in its own subprocess.
    run(['python', SCRIPT_PATH, image_path])


if __name__ == '__main__':
    images = glob('images/*.jpg')  # Assumes images are in an 'images' subdirectory

    if not images:
        print("No .jpg images found in the 'images' folder.")
    else:
        print(f"Found {len(images)} images to process.")
        with Pool() as p:
            # Note: each run of detect_faces.py writes output.jpg, overwriting the previous result.
            # The in-process sketch after the run command below writes one file per image instead.
            p.map(process_image, images)
        print("Batch processing complete.")

# Create an 'images' folder and put your JPGs there first
python batch_detect.py
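
If you need every annotated result rather than a single overwritten output.jpg, one option is to skip the subprocess round-trip and run YuNet in-process, one detector per worker. This is a minimal sketch under that assumption; the batch_detect_inprocess.py name, the annotated/ output folder, and the _faces.jpg suffix are illustrative choices, not part of detect_faces.py:

# batch_detect_inprocess.py - illustrative alternative to batch_detect.py
from glob import glob
from multiprocessing import Pool
import os
import cv2

MODEL_PATH = 'face_detection_yunet.onnx'
_detector = None  # one detector per worker process, created lazily


def _get_detector():
    global _detector
    if _detector is None:
        _detector = cv2.FaceDetectorYN_create(MODEL_PATH, '', (320, 320))
    return _detector


def process_image(image_path):
    img = cv2.imread(image_path)
    if img is None:
        return f'skipped {image_path}'
    fd = _get_detector()
    fd.setInputSize((img.shape[1], img.shape[0]))
    _, faces = fd.detect(img)
    if faces is not None:
        for face_data in faces:
            x, y, w, h = map(int, face_data[:4])
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    stem, _ = os.path.splitext(os.path.basename(image_path))
    out_path = os.path.join('annotated', f'{stem}_faces.jpg')
    cv2.imwrite(out_path, img)
    return f'wrote {out_path}'


if __name__ == '__main__':
    os.makedirs('annotated', exist_ok=True)
    images = glob('images/*.jpg')
    with Pool() as p:
        for result in p.imap_unordered(process_image, images):
            print(result)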

Handle video streams

The snippet below streams, annotates, and saves output.mp4 while guarding resources:

import cv2, sys, os

MODEL_PATH = 'face_detection_yunet.onnx'
NETWORK_INPUT_SIZE = (320, 320)

if not os.path.isfile(MODEL_PATH):
    sys.exit(f"❌ Model file not found: {MODEL_PATH}")

video_path = sys.argv[1] if len(sys.argv) > 1 else ''
if not video_path:
    sys.exit("❌ Please provide a video path as an argument.")
if not os.path.isfile(video_path):
    sys.exit(f"❌ Video not found: {video_path}")

cap = None
out = None

try:
    fd = cv2.FaceDetectorYN_create(MODEL_PATH, '', NETWORK_INPUT_SIZE)
    cap = cv2.VideoCapture(video_path)

    if not cap.isOpened():
        sys.exit(f"❌ Cannot open video: {video_path}")

    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps = cap.get(cv2.CAP_PROP_FPS)
    if fps == 0:
        fps = 30  # Default to 30 FPS

    fd.setInputSize((frame_width, frame_height))

    output_filename = 'output.mp4'
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter(output_filename, fourcc, fps, (frame_width, frame_height))
    if not out.isOpened():
        sys.exit(f"❌ Could not open VideoWriter for {output_filename}")

    print(f"Processing video: {video_path}...")
    frame_count = 0
    detected_faces_count = 0

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        frame_count += 1
        _, faces = fd.detect(frame)

        if faces is not None:
            detected_faces_count += len(faces)
            for face_data in faces:
                box = list(map(int, face_data[:4]))
                x, y, w, h = box[0], box[1], box[2], box[3]
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        out.write(frame)

    if frame_count > 0:
        print(f"✅ Video saved as {output_filename} ({frame_count} frames processed, {detected_faces_count} faces detected).")
    else:
        print(f"⚠️ No frames processed from {video_path}")

except Exception as e:
    print(f"An error occurred: {e}")
finally:
    if cap is not None:
        cap.release()
    if out is not None:
        out.release()
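
The same loop also works for a live camera feed. A minimal sketch, assuming a default webcam at device index 0 and a desktop session where cv2.imshow can open a window (so the regular opencv-python build, not headless):

import cv2, sys

MODEL_PATH = 'face_detection_yunet.onnx'

cap = cv2.VideoCapture(0)  # assumes the default webcam is device index 0
if not cap.isOpened():
    sys.exit("❌ Cannot open webcam")

fd = cv2.FaceDetectorYN_create(MODEL_PATH, '', (320, 320))
fd.setInputSize((int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
                 int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))))

while True:
    ret, frame = cap.read()
    if not ret:
        break
    _, faces = fd.detect(frame)
    if faces is not None:
        for face_data in faces:
            x, y, w, h = map(int, face_data[:4])
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow('YuNet webcam', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break

cap.release()
cv2.destroyAllWindows()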

Tune for real-time speed

• Network Input Size: Lower the NETWORK_INPUT_SIZE in the Python scripts (e.g., to (240, 240)) to cut inference time. Smaller sizes are faster but might miss small faces or be less accurate. Note that the scripts above call setInputSize with the full frame dimensions, so to benefit you also need to downscale the frame to the smaller size before calling detect (and scale the boxes back up afterwards).
• GPU Acceleration: Prefer OpenCV’s CUDA backend if a compatible NVIDIA GPU and drivers are available. Initialize the detector like this:

# Make sure OpenCV is built with CUDA support
fd = cv2.FaceDetectorYN_create(
    MODEL_PATH, '', NETWORK_INPUT_SIZE,
    score_threshold=0.9, nms_threshold=0.3, top_k=5000,
    backend_id=cv2.dnn.DNN_BACKEND_CUDA,
    target_id=cv2.dnn.DNN_TARGET_CUDA
)

• Processing Strategy: Pin one detector instance per CPU core for parallel processing instead of spawning unbounded threads.
• Frame Skipping: For video, process every Nth frame (e.g., every second or third frame) to maintain a higher output FPS in resource-constrained environments, at the cost of temporal smoothness in detection. A sketch of this approach follows below.
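
As a rough illustration of frame skipping, the loop below runs detection on every third frame and reuses the most recent boxes in between. It is a minimal sketch, assuming the cap, fd, and out objects set up in the video script above; DETECT_EVERY is an illustrative constant, not part of that script:

DETECT_EVERY = 3  # run the detector on every 3rd frame (illustrative value)
frame_count = 0
last_faces = None  # detections reused on skipped frames

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame_count += 1

    if frame_count % DETECT_EVERY == 1:
        # Fresh detection on this frame.
        _, last_faces = fd.detect(frame)
    # On skipped frames, last_faces still holds the previous detections.

    if last_faces is not None:
        for face_data in last_faces:
            x, y, w, h = map(int, face_data[:4])
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    out.write(frame)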

Compare alternative detectors briefly

| Model             | Speed | Accuracy | When to pick it                                 |
| ----------------- | ----: | -------: | ----------------------------------------------- |
| YuNet (this post) | ⚡⚡⚡   | ⚡⚡       | Real-time apps on CPU or edge devices           |
| OpenCV DNN SSD    | ⚡     | ⚡⚡⚡      | Highest accuracy when latency is less critical  |
| Haar Cascades     | ⚡⚡⚡⚡  | ⚡        | Legacy projects or ultra-low-power hardware     |

Wrap-up

YuNet’s lean architecture, paired with a few CLI scripts, lets you batch- or stream-process faces in seconds, not minutes. Give it a spin on your next project—and if you ever need to run computer-vision pipelines at cloud scale, check out Transloadit’s Artificial Intelligence service.