Fix binfmt_misc handler loss on peer distro termination (rebase of #14443)#40617
Open
benhillis wants to merge 7 commits into
Open
Fix binfmt_misc handler loss on peer distro termination (rebase of #14443)#40617benhillis wants to merge 7 commits into
benhillis wants to merge 7 commits into
Conversation
Fixes the issue where terminating a WSL2 distro with systemd clears binfmt_misc handlers across all running distros. Changes: - Add mount unit override to prevent binfmt_misc unmount during shutdown - Fix FP vs P flag inconsistency in WSLInterop registration - Use declarative /run/binfmt.d/ approach for handler persistence Fixes #13885 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce BINFMT_INTEROP_REGISTRATION_STRING_VM with ':FP' flags for WSL2-only paths (mini_init, systemd generator), while keeping the original BINFMT_INTEROP_REGISTRATION_STRING with ':P' for WSL1 (lxcore) which does not support the 'F' (fix-binary) flag. - binfmt.h: Add _VM variant macro with ':FP', restore base macro to ':P' - main.cpp: Use _VM macro for mini_init registration (WSL2 only) - init.cpp: Use _VM macro for /run/binfmt.d/ config (systemd, WSL2 only) - config.cpp: Unchanged, continues using ':P' for WSL1 registration Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The pure declarative approach via /run/binfmt.d/ does not handle conflicting binfmt configs. When a distro installs its own binfmt handler with the same name (WSLInterop), systemd-binfmt processes files alphabetically, potentially letting the conflicting handler win. Add ExecStartPost to the systemd-binfmt service override that forcefully unregisters any conflicting handler and re-registers WSL's after systemd-binfmt's normal ExecStart completes. This combines declarative config for proper re-registration with imperative override for guaranteed priority. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When protectBinfmt or interopEnabled is turned off, remove the /run/binfmt.d/WSLInterop.conf file from any previous run to ensure config changes take effect without requiring a full VM restart. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tolerate ENOENT (file doesn't exist) but log unexpected errors when removing stale /run/binfmt.d/WSLInterop.conf, matching the pattern used in timezone.cpp and plan9.cpp. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Validates the core scenario this PR fixes: terminating one WSL2 distro with systemd doesn't break binfmt_misc interop in another running distro. Imports a second distro, enables systemd on both, terminates one, and verifies cmd.exe interop + binfmt handler registration still works in the remaining distro. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
4 tasks
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens WSL’s protection of the global binfmt_misc WSLInterop handler so Windows interop continues to function in surviving WSL2 distros when a peer systemd-enabled distro terminates (preventing the “Exec format error” regression described in #13885). It also adds/strengthens TAEF coverage to validate the generated systemd/binfmt configuration and the multi-distro termination scenario.
Changes:
- Generate
/run/binfmt.d/WSLInterop.confplus systemd generator drop-ins to prevent unregister-on-shutdown and to re-registerWSLInteropafter clears/restarts. - Split binfmt registration strings so VM/WSL2 uses
:FP(fix-binary) while retaining a non-Fvariant for non-VM scenarios. - Add a new WSL2 regression test that imports a peer distro, enables systemd, terminates it, and verifies interop +
Fflag persistence.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| test/windows/UnitTests.cpp | Adds validation of generated binfmt/systemd files and introduces the peer-distro termination regression test. |
| src/linux/init/main.cpp | Switches binfmt registration constant to use the VM-specific :FP macro. |
| src/linux/init/init.cpp | Writes binfmt config + systemd drop-ins for protecting/re-registering WSLInterop; cleans up stale config when disabled. |
| src/linux/init/binfmt.h | Introduces VM-specific registration macro (:FP) alongside the existing :P string. |
Comment on lines
+386
to
+411
| const auto serviceOverrideContent = std::format( | ||
| R"(# Note: This file is generated by WSL to prevent distributions from removing the WSL binfmt entry on shutdown. | ||
| # To disable this unit, add the following to /etc/wsl.conf: | ||
| # [boot] | ||
| # protectBinfmt=false | ||
|
|
||
| [Service] | ||
| ExecStop= | ||
| ExecStart=/bin/sh -c '(echo -1 > {}/{}) ; (echo "{}" > {})' )", | ||
| ExecStartPost=/bin/sh -c '(echo -1 > {}/{} 2>/dev/null || true) ; echo "{}" > {}' | ||
| )", | ||
| BINFMT_MISC_MOUNT_TARGET, | ||
| LX_INIT_BINFMT_NAME, | ||
| BINFMT_INTEROP_REGISTRATION_STRING(LX_INIT_BINFMT_NAME), | ||
| BINFMT_INTEROP_REGISTRATION_STRING_VM(LX_INIT_BINFMT_NAME), | ||
| BINFMT_MISC_REGISTER_FILE); | ||
| constexpr auto* mountOverrideContent = R"(# Note: This file is generated by WSL to keep binfmt_misc mounted during shutdown. | ||
| # To disable this unit, add the following to /etc/wsl.conf: | ||
| # [boot] | ||
| # protectBinfmt=false | ||
|
|
||
| [Unit] | ||
| DefaultDependencies=no | ||
| Before=umount.target | ||
|
|
||
| [Mount] | ||
| Options=nosuid,nodev,noexec | ||
| )"; |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the
binfmt_mischandler loss when peer WSL distributions terminate. Without this fix, running a second systemd-enabled distro and shutting it down clears the kernel's globalbinfmt_misctable, breaking interop in all surviving distros ("Exec format error" oncmd.exe,mono, etc.).This is a rebase of @yeelam-gordon's #14443 (221 commits behind master) plus a hardened test. The fix commits are unchanged — full credit to Gordon.
Fixes #13885.
PR Checklist
UnitTests::BinfmtSurvivesDistroTerminationbin\x64\Debug\test.bat /name:UnitTests::UnitTests::BinfmtSurvivesDistroTermination)Detailed Description
Root cause
binfmt_miscentries are keyed by user namespace, not by mount namespace. WSL peer distros clone withCLONE_NEWNS | CLONE_NEWPID | CLONE_NEWUTS | SIGCHLD— noCLONE_NEWUSER— so they share the root user namespace and therefore the single global binfmt entry table.When a peer distro shuts down, its
systemd-binfmt.serviceExecStop=runssystemd-binfmt --unregister, which writes-1to/proc/sys/fs/binfmt_misc/status. The kernel handler for that write (parse_command()) callsclear_entries()— wiping the globalWSLInteropentry, not just the peer's view of it.Fix (Gordon's commits, replayed)
fix(init): prevent binfmt_misc handler loss on distro termination— writes/run/binfmt.d/WSLInterop.confplus systemd drop-ins that neutralizesystemd-binfmt --unregisterand re-register WSLInterop after any clear.fix(init): split binfmt registration macros for WSL1/WSL2 compatibility— addsBINFMT_INTEROP_REGISTRATION_STRING_VMwith theFflag (:FP) so the kernel preloads/initand the interop path survivesclear_entries()-and-restore races.fix(init): add ExecStartPost to ensure WSL binfmt handler priorityfix(init): clean up stale binfmt config when protectBinfmt is disabledfix(init): log error on binfmt config cleanup failureTest
BinfmtSurvivesDistroTermination(TAEF,WSL2_TEST_METHOD):cmd.exe /c "echo alive"works in both primary and peer.wsl --terminate <peer>— triggers the unregister-on-shutdown path that previously wiped the global entry.cmd.exe /c "echo alive"still works in primary./proc/sys/fs/binfmt_misc/WSLInteropfor^flags:.*Fto catch a regression to:P-only registration.wsl --unregister <peer>.Why this approach over
MS_PRIVATEmount propagationI initially tried marking
/proc/sys/fs/binfmt_miscasMS_PRIVATEin peer mount namespaces (#40612, #40614). That approach cannot work because the data lives in the user namespace, not the mount namespace — see CI failures on those PRs. Closed in favor of this PR.Validation Steps
Local run:
Total=1, Passed=1, Failed=0.Supersedes #14443. Closes #40612, closes #40614.