Drive DisplayID-tiled (5K) monitors as one output#2041
Conversation
The first generation of 5K monitors (LG UltraFine 5K, Dell UP2715K, mid-2010s) couldn't fit 5120x2880@60 down a single DP 1.2 cable and instead carry the image as two streams, one per tile half. Each half presents as its own DRM connector, with a DisplayID Tile Topology block in its EDID describing where it sits in the monitor. The kernel parses that block and exposes the result as the `TILE` immutable blob, formatted as eight ':'-delimited decimals: group_id:flags:num_h_tiles:num_v_tiles:loc_h:loc_v:tile_w:tile_h. Newer 5K panels carry the image on a single cable using DSC over DP 1.4, or uncompressed over DP 2.1 UHBR. `DrmDevice::tile_info()` reads the blob, parses it into a new `backend::drm::TileInfo` struct, and returns `None` for connectors without one. Compositors can use the shared `group_id` to identify connectors that belong to the same physical monitor. The kernel's `flags` field is parsed and discarded. It isn't documented and Mutter ignores it. AI-Assisted
A tiled DrmCompositor (added in a following commit) drives several CRTCs as one logical, frame-locked output, each CRTC scanning out a sub-rect of that output. `TilePlacement` describes one CRTC's contribution: its `DrmSurface`, the planes it may use, and its `Rectangle` sub-rect in the logical output's untransformed coordinate space. `validate_tile_regions` checks that a set of regions exactly tiles a gap-free, overlap-free rectangle anchored at the origin, returning the size of the logical output they compose. Pure geometry, factored into a free function with unit tests. The same-device requirement is enforced later by the constructor. AI-Assisted
A tiled DrmCompositor renders the logical scene per CRTC by relocating elements so each tile's slice lands in its framebuffer. The offset must be derived *through* the output transform: tile regions are fixed in framebuffer space, but the transform is applied to the whole logical scene before tiles are sliced, so a raw framebuffer-space offset is only correct at Transform::Normal. `tile_logical_region` maps a tile's fixed framebuffer-space region into the logical (post-transform) coordinate space via `transform_rect_in`, and `tile_render_offset` returns the element relocation that places that slice at the tile framebuffer's origin. Both are exposed as `TilePlacement` methods for the renderer to call. The geometry is pure, so rotation correctness is unit-tested across all eight transforms — the property test asserts the fixed halves still partition the rotated logical output with no gaps or overlaps, which a framebuffer-space offset would violate for non-Normal transforms. AI-Assisted
Pure refactor in preparation for tiled output: move the per-scan-out pipeline fields (surface, planes, swapchain, frame-state machine, element-state caches, cursor plane, …) out of DrmCompositor into a new private TileState struct, and have DrmCompositor hold a SmallVec<[TileState; 1]>. The logical-output state (output_mode_source, framebuffer_exporter, cursor_size, debug_flags, span) stays on DrmCompositor. With exactly one tile this is behaviour-identical — every method now operates on self.tiles[0]; a following commit adds the multi-tile constructor and render path. Debug is now a manual impl that reports the tile count, since deriving it through SmallVec<TileState> would demand a Debug bound on the allocator buffer that the previous flat derive avoided. No functional change; the full lib test suite is unchanged (89 passing). AI-Assisted
Add the multi-CRTC constructor. `new_tiled(output_mode_source, config: impl Into<TileConfig>, …)` validates that the tiles tile a gap-free origin-anchored rectangle (TileConfig::logical_size) and share one DRM device, then builds one TileState per CRTC — each with an OutputModeSource::Static sized to its region and carrying the logical output's scale/transform. The per-surface construction is factored into a shared build_tile helper; new() is reduced to the single-tile case that delegates to it (region = the whole output), so there is one construction path. FrameError gains an InvalidTileConfig variant (From<TileConfigError>) for the validation failures. The TileState.region field is set but not yet read — the multi-tile render path that consumes it lands next. No change to single-tile behaviour; 89 lib tests unchanged. AI-Assisted
The plane-assignment helpers (try_assign_element and the per-plane try_assign_* / element_config it drives) always operated on `self.tiles[0]`. Thread a `tile: usize` through them so they act on an arbitrary tile, in preparation for the multi-tile render loop. The sole caller (render_frame) still passes 0, so behaviour is unchanged. AI-Assisted
Generalize the render path from a single CRTC to N. The former
render_frame body becomes `render_tile`, which renders one tile's slice
into that tile's own swapchain and damage tracker (its mode is read from
the tile's tracker, so a `Static` per-tile slice and the single-tile
whole-output case are handled uniformly) and stashes the prepared frame
in `self.tiles[tile].next_frame`.
`render_frame` keeps its signature and, for the single-tile case,
delegates straight to `render_tile(0, ..)` — behaviour is unchanged. For
a tiled output it relocates the elements into each tile's framebuffer
space (offset derived through the output transform, so rotation stays
correct) and renders every tile, then returns one frame-locked
`RenderFrameResult`:
- `is_empty` is the AND across tiles, so a change confined to one tile
still queues the whole group (avoids dropping to one tile's refresh);
- the per-tile `RenderElementStates` are merged the same way a single
render folds an element seen on several planes;
- the plane elements describe the primary tile, recovered as references
into the original elements via a new `RelocateRenderElement::element`
accessor (full-logical screencopy is left for a later change).
Pure aggregation logic is covered by unit tests.
AI-Assisted
A tiled DrmCompositor flips all of its tiles in a single atomic commit, so the caller's per-frame `user_data` token is one value for the whole group, not one per tile (and it generally is not `Clone` — e.g. cosmic-comp passes presentation feedback that must be submitted exactly once). Move `user_data` out of the per-tile `QueuedFrame`/`PendingFrame` wrappers and onto the `DrmCompositor` as `queued_user_data`/`pending_user_data`. The wrappers (and `TileState`) lose their `U` parameter and keep only the scan-out `FrameState`. `submit`/`handle_flip`/`frame_submitted` thread the token through the group fields. Single-tile behaviour is unchanged; this prepares the lifecycle for driving several tiles from one queued frame. AI-Assisted
|
@Drakulix I hope this is in line with what you were imagining. Happy to take pointers and rework stuff if it's not right. The PR should be possible to both compile and review commit-by-commit. |
e9f2d77 to
2c01971
Compare
A tiled DrmCompositor flips all its CRTCs together in a single atomic commit so they stay frame-locked. To assemble that commit, every CRTC's state has to land in one `AtomicModeReq`. The property handles all resolve through the device's shared `PropMapping`, so one request can carry every CRTC's properties. `DrmSurface` gains three internal primitives. `append_to_request` emits a surface's page-flip or modeset state into a shared `&mut AtomicModeReq` without committing. `commit_request` commits such a request (for a modeset it test-commits first, since a multi-CRTC commit can fail where each CRTC passes alone). `mark_request_submitted` advances state after success. The release-mode `AtomicRequest` property logic is extracted into free functions (`req_set_plane` etc.) so the per-surface builder and the append path share one implementation. Legacy devices have no atomic API and return `Error::AtomicUnsupported`. `queue_frame`/`commit_frame`/`submit`/`frame_submitted` dispatch to tiled variants when driving more than one tile. They store even an unchanged tile's frame (`render_tile`'s `store_empty`) so the tiles always flip together, the primary tile included, then build one request from all tiles, commit it via the primary surface, and advance every tile on success. Because the flip is atomic, the primary CRTC's vblank stands in for the whole group, so the caller routes that one vblank to `frame_submitted` (unchanged signature) to advance every tile. The other CRTCs' redundant vblanks belong to no logical output and are ignored. A pending modeset on any tile routes the whole commit through the modeset path. The single-tile path is untouched. AI-Assisted
`DrmCompositor::new_tiled` drives several CRTCs as one frame-locked output, but consumers go through the `DrmOutputManager`/`DrmOutput` layer, which only had a single-CRTC `initialize_output`. Add the tiled counterpart: it takes one `DrmTilePlacement` per CRTC (CRTC + mode + connectors + planes + sub-rect), creates a surface for each, and builds one `new_tiled` compositor wrapped in a `DrmOutput` keyed by the primary tile's CRTC. The output's render/queue/frame_submitted then delegate to that compositor exactly as for a single-CRTC output. Unlike `initialize_output` it does not yet retry under bandwidth pressure (overlay-plane disabling / implicit modifiers); a tiled output is one dedicated monitor, so the simple path is used for now. AI-Assisted
2c01971 to
7c0432c
Compare
Drakulix
left a comment
There was a problem hiding this comment.
So on a first pass, this looks pretty good. I commented on a few bits, that stood out to me.
I am very sure we can proceed with this approach.
The whole PR is quite comment-heavy. This isn't inherently a bad thing, but a good amount of bits and pieces are quite repetitive or needlessly verbose. I assume this is due to the AI-assisted nature of the PR. (Code-wise this is pretty good though, good job cleaning it up.) I'd suggest to focus mostly on documenting reasoning and design decisions, but try to cut down on repetition and obvious explanations.
One thing I am wondering is, whether the code in DrmCompositor would already be able to direct-scanout a client buffer spanning multiple outputs (think a fullscreened window). I think it should be able to figure out the necessary parameters for CRTC_W/H and SRC_W/H properties, which is exciting. For compositing though I think going with multiple swapchains is a good approach (instead of going for a shared framebuffer covering the whole space), since it is likely more compatible.
Last but not least: It isn't 100% obvious to me, if a compositor could use this to make a "tiled output", that doesn't actually use the TILE-property, but simply wants to combine multiple outputs. (Assuming the outputs can each use atomic-commits and are connected to the same DrmDevice.) I don't see a logical reason, why that shouldn't work, so I just want to make sure the code allows it.
Ping @cmeissl. Since you wrote most of the code that is refactored here, I would really like a review from you, if you have the time.
Trim repetitive comments around the tiled compositor path and document new_tiled as a generic same-device multi-CRTC primitive, not something tied to the DRM TILE property. AI-Assisted
Replace the deprecated Point::from tuple conversion in tile-region validation. AI-Assisted
Mirror initialize_output's fallback sequence for tiled outputs. If the initial multi-CRTC compositor setup fails, first force existing outputs through composited frames to lower overlay-plane bandwidth, then retry with implicit modifiers before reporting the original initialization failure. AI-Assisted
Expose a typed wrapper for assembling one atomic commit from multiple DrmSurfaces. The primary surface begins the request and is included immediately, additional same-device surfaces can be appended, and commit consumes the builder while marking contributors submitted only after the shared commit succeeds. Use the new builder in the tiled compositor paths so callers no longer manually coordinate append_to_request, commit_request, and mark_request_submitted. AI-Assisted
Format the tiled-output DRM changes with cargo fmt. AI-Assisted
🎉
I have made a pass, making it terser and removed some redundant comments.
It may work on drivers that accept clipped/off-screen plane destinations, but the code does not explicitly compute per-tile SRC_X/Y/W/H crops. We could potentially add crops to make it more compatible.
The idea is that it should allow that. The only restriction on |
| /// # Ok(()) | ||
| /// # } | ||
| /// ``` | ||
| pub fn begin_atomic_commit<'a, 'b>( |
There was a problem hiding this comment.
It's worth noting that this API could be made more strict. We can't use "the most elegant Rust", of locking the lifetime of a &'a mut because inside DrmSurface we have Arc<Mutex<XXXInner>>, meaning a mut doesn't lock anything.
That means this API currently is not guaranteeing atomic commits across threads (or even re-entrant/interleaved calls on the same thread). This could be fixed by adding some Arc<Mutex<CommitLock>> that can be exclusively taken for the lifetime of the DrmAtomicCommit.
However I was unsure about introducing more internal Arc<Mutex - so didn't do it straight away.
Sure, I am going to review it today. |
Maybe we should close #2034 and move the comments here, since it takes the TILE property PR and builds on it? |
Yeah, sorry, I had both PRs open and picked the wrong one when answering.
I did not mean to add it to
What I though about is adding an abstraction for holding multiple We also have to initialize all planes states once at the start so that each test can incrementally extend the planes. Doing it per Tile and ignoring the atomic properties of all other tiles will lead to uncatched edge cases and might prevent using planes in some cases. I think we also have to provide atomic requests for all other functionality like enabling VRR, not sure doing it one-by-one will work. |
|
Thank you for that feedback! I will process it and fix the PR. A bit busy with work right now so it may take me until the weekend. |
Some high-res monitors can't fit their full resolution down a single cable, so they present as several DRM connectors, one per tile, each carrying a slice of the image. The connectors share a DisplayID Tile Topology block describing how the tiles fit together (the LG UltraFine 5K and Dell UP2715K are the common ones).
This drives such a monitor as a single logical output. The tiles sit on separate CRTCs but page-flip together in one atomic commit, so they stay frame-locked. The primary tile's vblank stands in for the whole group.
The pieces:
DrmDevice::tile_inforeads a connector'sTILEproperty into aTileInfo. Connectors sharing agroup_idbelong to the same monitor.DrmCompositor::new_tileddrives N CRTCs as one logical output. It's a general primitive (a single-GPU video wall qualifies too), so it takes geometry as aTilePlacementper CRTC rather thanTileInfodirectly. All surfaces must live on one device so they can share the atomic commit.DrmOutputManager::initialize_tiled_outputwraps that as oneDrmOutput, keyed by the primary tile's CRTC.Each tile renders its slice of the logical scene into its own framebuffer. The relocation offset is derived through the output transform, so it stays correct under rotation and flip. That part is pure geometry and unit-tested across all eight transforms.
Exercised on a real LG UltraFine 5K over its two DisplayPort streams (via cosmic-comp): display, live rotation, and screencopy all work.
Making use of this requires changes to
cosmic-compto read theTileInfoand usingnew_tiled. This is deliberate, sincecosmic-compneeds to present this correctly to the user in settings.I have prepared the
cosmic-compchanges, but will keep that PR away until we settled the smithay side. It needs tidying up before PR, but for reference, here it is: pop-os/cosmic-comp@master...algesten:cosmic-comp:tiled-outputClose #2034