perf(csi): request YUV420 main stream when supported (-28% to -39% CPU) by antoinecellerier · Pull Request #147 · mainsail-crew/spyglass

antoinecellerier · 2026-05-22T16:59:00Z

Use of AI

This PR was written using GitHub Copilot CLI with Claude Opus 4.7. I've confirmed continued working behavior.

Also from past experience coding in VLC and associated projects a couple of decades ago this makes total sense to me ... video & jpeg work in the YUV space so avoiding any need to convert to a different color space is an immediate win.

Summary

Spyglass currently relies on picamera2's default main-stream pixel format, which is XBGR8888 (4 bytes/pixel). The Pi V4L2 H264 and MJPEG hardware encoders both accept YUV420 natively (1.5 bytes/pixel — 2.7× less DMA bandwidth) — so asking libcamera to deliver YUV420 directly avoids both the colour-space conversion inside the encoder and a large amount of memory bandwidth on the camera→ISP→encoder path.

This PR adds proactive format selection in the CSI camera path:

Camera._main_stream_config(width, height) is a new extension point on the abstract Camera base — the default returns {"size": (w, h)} (i.e. USB cameras are unchanged).
CSI._main_stream_config enumerates the formats libcamera actually advertises for the VideoRecording role via picam2.camera.generate_configuration([VideoRecording]).at(0).formats.pixel_formats and picks the first entry from a preference list ("YUV420", "BGR888", "RGB888", "XBGR8888", "XRGB8888") that the camera supports. These five formats are exactly the intersection of what the V4L2 HW encoder and the SW JpegEncoder both accept (see picamera2.encoders.v4l2_encoder._v4l2_format and picamera2.encoders.jpeg_encoder); requesting anything else risks the encoder failing at start_encoder() with RuntimeError("Unrecognised format ...").
If the libcamera query fails, or none of the preferred formats are supported, we fall back to omitting the format key entirely, which preserves the previous picamera2 default — i.e. this PR can only ever match-or-improve the previous behaviour.

A reactive try/except around picam2.configure() was tried first but rejected: libcamera's validate() returns Adjusted (not Invalid) for many unsupported formats, silently substituting e.g. RGB161616 → YUYV. picam2.configure() then succeeds, only for the MJPEG encoder to crash downstream because YUYV is not in its format table. Proactive enumeration avoids that whole class of failure mode.

Measured CPU impact

Raspberry Pi 4B, Camera Module 3 (IMX708), 1920×1080 @ 30 FPS, hardware encoders, sampled with ps -p $PID -o pcpu:

Scenario	`upstream/develop` (XBGR8888)	This PR (YUV420)	Reduction
Idle (encoders running, no WebRTC)	89.1% (87.5–90.6%)	~54% (40–72%)	~39%
Live WebRTC	93.0% (91.1–94.6%)	~67% (52–100%)	~28%

Note: this PR does not include the lazy-encoder changes in #146 — these numbers are with both encoders running eagerly, which is the current develop behaviour. The two PRs compose: with lazy encoders enabled and only one of MJPEG/WebRTC active, idle drops to ~0% (per #146) and the live overhead also benefits from the smaller main stream.

Tested

All existing unit tests pass (129 → 134, +5 new tests for the format-selection path).
Smoke-tested on a Pi 4B + Camera Module 3:
- Happy path: libcamera log shows configuring streams: (0) 1920x1080-YUV420/Rec709 (no “Camera configuration has been adjusted!” message — proof that libcamera honoured our pick exactly), /snapshot returns a valid 1920×1080 JPEG.
- Fall-through path: temporarily prepended a RGB161616 entry to the preference list — confirmed our INFO log fires (Main stream using 'YUV420' (preferred 'RGB161616' not supported by camera)), service starts cleanly, /snapshot still valid.
The new tests/test_camera_configure.py covers: top-preference picked, fall-through to next supported preference, no-overlap returns no format key, libcamera enumeration RuntimeError is handled, base Camera does not enumerate (so USB path is untouched).

Previously, csi.py started both the MJPEG encoder and (when WebRTC was enabled) the H264 encoder unconditionally at server startup, regardless of whether any client was connected. Both encoders ran continuously, burning CPU on every frame even with zero consumers. This change introduces LazyEncoder, a small ref-counted wrapper around picamera2's start_encoder/stop_encoder. Encoders are now started on the first consumer (MJPEG /stream or /snapshot, WebRTC WHEP POST) and stopped when the last consumer disconnects. Measured CPU impact on a Raspberry Pi 4B with a Camera Module 3 (hardware encoders enabled, sampled with `top`, single python process, percentages are of one core; the kernel reports values >100% when the process spans multiple cores): Default config (640x480 @ 15 FPS): Idle (no clients): ~13-15% -> ~5% Live MJPEG / WebRTC client: unchanged Heavier config (1920x1080 @ 30 FPS): Idle (no clients): ~95-100% -> ~15% Live WebRTC only: ~120-135% -> ~60% Live MJPEG only: ~95-104% -> ~50-60% The biggest wins come from two effects: idle is no longer paying for either encoder, and a live WebRTC-only client no longer pays for the unused MJPEG encoder running in parallel (and vice versa). The remaining idle CPU is the picamera2 capture loop itself, which still runs continuously; pausing the camera entirely is left for a follow-up commit. Implementation notes: - LazyEncoder.acquire/release are protected by a threading.Lock; a failed start_encoder rolls back the refcount so callers can retry. - WebRTC release is wired to RTCPeerConnection 'closed' state with an idempotent guard, plus a try/except around setup so a failure during SDP exchange or ICE gathering does not leak an encoder. - USB camera path is unaffected: it never used encoders, and the new code uses getattr(handler, ..., None) so missing attributes fall back to no-op. All 129 existing unit tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Building on the previous LazyEncoder commit, this commit ref-counts the Picamera2 instance itself: the camera only runs while at least one encoder is active. When the last encoder is released, picam2.stop() is called; on the next consumer, picam2.start() runs again. Adds a small CameraSession wrapper next to LazyEncoder; each LazyEncoder takes an optional session and acquires/releases it together with the underlying encoder, so the camera is the union of the encoders' refs. Measured CPU impact, applied on top of the previous commit (Raspberry Pi 4B, Camera Module 3, hardware encoders, sampled with `top`): Default config (640x480 @ 15 FPS): Idle (no clients): ~5% -> ~0% Live MJPEG / WebRTC client: unchanged Heavier config (1920x1080 @ 30 FPS): Idle (no clients): ~15% -> ~0% Live WebRTC client: ~60% (unchanged) Cumulative effect of both commits at 1080p30: Idle: ~95-100% -> ~0% Live WebRTC: ~120-135% -> ~60% Cold-start latency on first connect (camera off -> on -> first JPEG): ~150-500 ms for /snapshot at 1080p30, dominated by libcamera sensor init and AE/AWB warm-up. The first cold start after process boot is typically slowest (~500 ms); subsequent cold starts are faster (~150-200 ms) since some libcamera state appears to be retained. There is no warm path while no consumers are active: every snapshot released triggers a full camera stop. Implementation notes: - CameraSession.acquire/release are protected by a threading.Lock and roll back the refcount if picam2.start() raises, mirroring the LazyEncoder pattern. - LazyEncoder.acquire takes the session ref before calling start_encoder so the camera is always running when the encoder starts. release does the inverse: stop_encoder first, then drop the session ref. If start_encoder fails, the session ref is rolled back too. - Lock acquisition order is consistent (encoder lock then session lock), so two encoders sharing one session cannot deadlock. - CSI.stop() now defensively calls picam2.stop_encoder() and picam2.stop() under try/except, since at systemd shutdown the camera may already be stopped (no consumers) and stop_encoder/stop would otherwise raise. - USB camera path is unaffected; only csi.py wires the session in. All 129 existing unit tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Adds two new options to control how long each encoder (and the underlying camera, when no other encoder is active) keeps running after the last consumer disconnects: --mjpeg-linger-seconds / MJPEG_LINGER_SECONDS (default: -1) --webrtc-linger-seconds / WEBRTC_LINGER_SECONDS (default: 5) Semantics: -1 keeps the encoder running once started; spyglass pre-warms it at startup so the first consumer never pays cold-start latency. 0 stops the encoder immediately when the last consumer releases (the behavior introduced in the first two commits of this branch). > 0 stops the encoder N seconds after the last consumer releases; a fresh acquire within the window cancels the pending stop. The MJPEG default (-1) restores the upstream 'always on' idle behavior so /snapshot use cases (notably timelapse) and the Mainsail Adaptive MJPEG service keep working without paying cold-start latency on each request. The WebRTC default (5s) keeps the bulk of the lazy CPU win while bridging brief peer reconnects. Implementation: - LazyEncoder gains a linger_seconds ctor parameter and a shared threading.Timer for the delayed stop. A monotonically increasing _stop_token invalidates any stale timer callback that already raced past Timer.cancel(). - _stop_now_locked centralizes the stop_encoder / session.release sequence so both the immediate and lingered stop paths use the same code with the same try/finally guarantees. - CSI.start_and_run_server pre-warms each encoder for which linger < 0 by issuing a single never-released acquire() at startup. - Camera.start_and_run_server abstract signature and the USB override gain the new kwargs for compatibility; USB ignores them since it does not use LazyEncoder. Adds focused unit tests for LazyEncoder covering acquire/release ref-counting, all three linger modes, timer cancellation on re-acquire, session sharing across two encoders, and rollback on start_encoder / stop_encoder failure. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

mryel00 · 2026-05-22T17:26:29Z

Thanks for the PR.

Did you do image comparisons between both color formats?
I would expect YUV420 to look more dull in terms of colors or maybe some compression artifacts.
I'm no colorspace expert but the reduced data size obviously comes at a price and this might result in a less appealing image.
Performance should not be the only measured variable for such changes.

Nevertheless this should be a configurable option and if the image quality indeed differs we should write some documentation about it.

antoinecellerier · 2026-05-22T17:30:35Z

YUV420 is what JPEG & modern video formats encode. The formats do not provide options for RGB space chromas. In effect what this PR is doing is forcing the whole processing pipeline to operate in YUV space earlier (at least after sensor debayer & whatever happens there I guess?). This should not have any perceivable visual artifacts.

If you look in how pipelines operate in typical video software, they will always prefer operating on color spaces which are optimal to process. This is what's happening with this proposed change.

Happy to provide further context if needed. In this case though a config option IMO has 0 value and just adds needless complexity.

mryel00 · 2026-05-22T19:13:20Z

I personally also use AI for things I don't understand or get a feature faster but you should make sure that at least some basic code quality is still given.
I don't really like to review AI PRs especially if I'm the first human reading the code.

antoinecellerier · 2026-05-31T17:19:11Z

Happy to update code to better fit your coding style. From my initial reading of the code before posting the PR it was perfectly sensible to me (after a few iterations with the AI). I definitely will not post PRs on any projects without reading the code first. Let me know what you feel would need changing on top of the 2 comments you posted earlier which I would appreciate a bit more clarity on before addressing them. Thanks!

mryel00 · 2026-05-31T17:33:44Z

I just skipped through the code till now, so I made a mistake in one of the comments sry for that 😓
I will soon have time and will actually test and check the code in detail.
This was more like a first scan what felt kinda obvious, but as you saw I made a stupid mistake there. Sry again 🙈

mryel00 · 2026-05-31T17:38:52Z

Just as a general note. I sometimes sound very harsh but it's most of the times not meant like that.
If I make a mistake I'm happy about corrections 😄

antoinecellerier · 2026-05-31T17:54:47Z

No worries. I really appreciate the quick replies

The picamera2 default main-stream format is XBGR8888 (4 bytes/pixel), but the V4L2 H264/MJPEG encoders both accept YUV420 natively. Letting libcamera deliver YUV420 directly avoids an in-encoder colour-space conversion and cuts main-stream memory bandwidth ~2.7x (1.5 bpp vs 4 bpp), reducing CPU substantially on a Pi 4B + Camera Module 3: | scenario | XBGR8888 | YUV420 | reduction | |--------------------------|---------:|-------:|----------:| | idle (encoders, no WHEP) | 89.1% | ~54% | ~39% | | live WebRTC | 93.0% | ~67% | ~28% | CSI._main_stream_config() proactively enumerates the camera's supported pixel formats via libcamera (camera.generate_configuration([Video- Recording]).formats.pixel_formats) and picks the first encoder-compatible format from a preference list. If none of the preferred formats are supported - or the libcamera query fails - we fall back to the picamera2 default (no format hint), which is the pre-change behaviour. USB cameras are untouched. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

mryel00 · 2026-06-09T17:39:08Z

I tested the code and you are right, there are no visual artifacts or similar. Performance is way better too.
Thank you for the contribution ❤️

I would refactor a few lines of code to improve readability.
If you want to do it yourself, I will comment the specific lines.

antoinecellerier · 2026-06-10T07:38:03Z

Thanks. Do feel free to refactor as you see fit.

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

mryel00 · 2026-06-10T09:40:01Z

I adjusted the pick_main_stream_format method to not nest the logic too much.
The logic should still be the exact same.

antoinecellerier · 2026-06-10T12:56:14Z

Looks good thanks.
And I learned about using next() to get the first or default value from an iterable!

mryel00 · 2026-06-21T11:28:02Z

I'm back home now and can finally do the finishing touches.

I came across something weird in your description:

These five formats are exactly the intersection of what the V4L2 HW encoder and the SW JpegEncoder both accept (see picamera2.encoders.v4l2_encoder._v4l2_format and picamera2.encoders.jpeg_encoder)

This is correct, but it's just the same set of formats, isn't it?
The JpegEncoder doesn't even explicitely list the YUV420 format in it's FORMAT_TABLE but checks for it in a condition in the encode_func.
So I just wanna check, if I'm missing something here, as the description is written as if there should be different formats in those two encoders.

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

antoinecellerier · 2026-06-21T14:56:25Z

Good question.

jpeg encoder explicitly lists the formats here: https://github.com/raspberrypi/picamera2/blob/3e6dff099d3982c8fbdb25f7cbe2e4c17bac34f7/picamera2/encoders/jpeg_encoder.py#L33 . From reading the code, YUV420 is handled differently from the RGB colors in the code so isn't listed in FORMAT_TABLE as the encode function looks to be different for YUV vs RGB color spaces.

and yeah, agreed that the AI phrasing is a bit misleading as currently the format list is identical in https://github.com/raspberrypi/picamera2/blob/3e6dff099d3982c8fbdb25f7cbe2e4c17bac34f7/picamera2/encoders/v4l2_encoder.py#L41 . Where it's actually way broader is in the actual kernel v4l2 format listing https://github.com/torvalds/linux/blob/322008f87f917e2217eeac386a9410945092eb2e/include/uapi/linux/videodev2.h#L524 .

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

…sage (#147) The picamera2 default main-stream format is XBGR8888 (4 bytes/pixel), but the V4L2 H264/MJPEG encoders both accept YUV420 natively. Letting libcamera deliver YUV420 directly avoids an in-encoder colour-space conversion and cuts main-stream memory bandwidth ~2.7x (1.5 bpp vs 4 bpp), reducing CPU substantially. Signed-off-by: Patrick Gehrsitz <github@mryel.de> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Patrick Gehrsitz <github@mryel.de>

mryel00 · 2026-06-22T09:56:14Z

Thanks for this contribution. I think I would have never come up with this myself ^^

antoinecellerier · 2026-06-22T10:55:57Z

You're welcome. Thanks for the review & follow-ups

antoinecellerier and others added 3 commits May 20, 2026 20:11

antoinecellerier marked this pull request as ready for review May 22, 2026 17:04

antoinecellerier requested review from KwadFan, mryel00 and roamingthings as code owners May 22, 2026 17:04

antoinecellerier mentioned this pull request May 22, 2026

feat: lazy-start picamera2 encoders and camera to save idle CPU #146

Merged

mryel00 reviewed May 22, 2026

View reviewed changes

Comment thread spyglass/camera/csi.py Outdated

mryel00 reviewed May 22, 2026

View reviewed changes

Comment thread spyglass/camera/camera.py

mryel00 reviewed May 31, 2026

View reviewed changes

Comment thread spyglass/camera/camera.py

Comment thread spyglass/camera/csi.py Outdated

antoinecellerier force-pushed the csi-yuv420-stream branch from 71d323a to e83a40a Compare May 31, 2026 21:06

mryel00 reviewed Jun 9, 2026

View reviewed changes

Comment thread spyglass/camera/csi.py Outdated

refactor: improve code readability

ec60937

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

mryel00 added 3 commits June 21, 2026 15:14

refactor: adjust logging messages

49dd8f3

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

test: adjust test code

10e98ef

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

style: adjust code style

f0ab95e

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

mryel00 added 7 commits June 21, 2026 19:46

style: remove unnecessary comments

6537122

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

refactor: improve code readability

0ad5eb0

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

fix: remove deprecated argument style for new options

e405d2c

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

test(cli): remove unnecessary tests and fix tests

ea0ca4b

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

Merge branch 'develop' into lazy-encoders

d09444c

refactor: merge two early returns

fe782dd

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

fix: fix typo in logging message

238bbb3

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

mryel00 approved these changes Jun 22, 2026

View reviewed changes

mryel00 added 4 commits June 22, 2026 10:57

refactor(lazy_encoder): remove unnecessary condition in release

0a26d79

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

refactor(lazy_encoder): improve release code quality

51c5794

Signed-off-by: Patrick Gehrsitz <github@mryel.de>

Merge branch 'lazy-encoders' into csi-yuv420-stream

318787a

Merge branch 'develop' into csi-yuv420-stream

3321750

mryel00 merged commit 5d3e62f into mainsail-crew:develop Jun 22, 2026
5 checks passed

Conversation

antoinecellerier commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Use of AI

Summary

Measured CPU impact

Tested

Uh oh!

mryel00 commented May 22, 2026

Uh oh!

antoinecellerier commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mryel00 commented May 22, 2026

Uh oh!

antoinecellerier commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mryel00 commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mryel00 commented May 31, 2026

Uh oh!

antoinecellerier commented May 31, 2026

Uh oh!

mryel00 commented Jun 9, 2026

Uh oh!

Uh oh!

antoinecellerier commented Jun 10, 2026

Uh oh!

mryel00 commented Jun 10, 2026

Uh oh!

antoinecellerier commented Jun 10, 2026

Uh oh!

mryel00 commented Jun 21, 2026

Uh oh!

antoinecellerier commented Jun 21, 2026

Uh oh!

Uh oh!

mryel00 commented Jun 22, 2026

Uh oh!

antoinecellerier commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

antoinecellerier commented May 22, 2026 •

edited

Loading

antoinecellerier commented May 22, 2026 •

edited

Loading

antoinecellerier commented May 31, 2026 •

edited

Loading

mryel00 commented May 31, 2026 •

edited

Loading