feat(bridge): OpenFang tool surface v2 over MCP bridge (+ Stage 9 hardening)#1205
Open
benhoverter wants to merge 43 commits into
Open
feat(bridge): OpenFang tool surface v2 over MCP bridge (+ Stage 9 hardening)#1205benhoverter wants to merge 43 commits into
benhoverter wants to merge 43 commits into
Conversation
Standalone crate exposing OpenFang's tool surface to MCP clients (primarily
Claude Code subprocesses) over stdio. Per architectural decision in ANAI-22:
not folded into openfang-runtime — keeps the protocol adapter out of the
kernel/compactor blast radius and the dep graph clean.
This commit is scaffolding only:
* Cargo manifest with rmcp 1.x (server, transport-io, macros)
* lib.rs: ToolDispatcher seam trait (runtime-implements, bridge-consumes,
one-way dep), ToolDispatchError enum, Bridge struct wrapping an
Optional<Arc<dyn ToolDispatcher>>, single stub `ping` tool
* main.rs: stdio MCP server entrypoint, tracing -> stderr (stdout is the
transport), no dispatcher attached
* Workspace members updated
Identity is bound at Bridge construction time, not per-call — the security
invariant tracked by ANAI-31. Real tool surface mapping lands in ANAI-30.
cargo check -p openfang-mcp-bridge: clean.
cargo check --workspace: clean (pre-existing imap-proto future-incompat
warning unrelated).
Refs: ANAI-22, ANAI-29
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds the daemon-side foundation for the MCP bridge per the ANAI-30 plan
(topology 1b: daemon → CC → bridge → unix socket → daemon dispatcher).
- New `protocol` module in openfang-mcp-bridge: Frame/Hello/HelloAck/
CallRequest/CallResponse types with length-prefixed JSON framing
(1 MiB cap, 4-byte BE length prefix). Gated by `ipc-codec` feature
so type-only consumers can drop the tokio io traits.
- New `bridge_ipc` module in openfang-api: BridgeIpcServer binds
<home_dir>/run/bridge.sock (0600), accept loop with graceful
shutdown via Notify, per-connection Hello validation and CallRequest
→ CallResponse loop.
- run_daemon spawns the listener; failure is non-fatal (HTTP keeps
serving; bridge just unavailable). Socket file removed on shutdown.
Step 1 stub: the dispatcher returns CallResult::Error
("not yet wired"). Step 2 replaces this with a call into
openfang_runtime::tool_runner::execute_tool, scoped to the four-tool
allowlist (file_read, file_list, agent_list, channel_send). Identity
binding + token-table auth land in ANAI-31.
Tests: 3 protocol roundtrip tests + 4 IPC handler tests
(handshake/dispatch end-to-end via tempfile socket, version mismatch
rejection, empty-token rejection).
Refs ANAI-30, ANAI-22.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces the step-1 stub in `BridgeIpcServer` with a real call into
`openfang_runtime::tool_runner::execute_tool`, mirroring the argument
bundle used by the HTTP /mcp endpoint in routes.rs.
- Added ALLOWED_TOOLS allowlist: file_read, file_list, agent_list,
channel_send. Rejection happens at the protocol layer (CallResult::Error)
before any kernel touch.
- Added dispatch_call(): snapshots the skill registry, builds a
KernelHandle from Arc<OpenFangKernel>, and invokes execute_tool.
- ToolResult mapped to CallResult::Ok { content, is_error }, preserving
the Ok/Error distinction (Error = bridge couldn't dispatch; Ok with
is_error = tool ran but returned an error).
- Identity stub: caller_agent_id taken at face value from
CallRequest::agent_id. Real per-spawn token-bound identity lands in
ANAI-31.
Test: ipc_handshake_and_allowlist_gate verifies wire shape end-to-end:
disallowed tool gets allowlist Error, allowed tool gets Ok response. Real
execute_tool integration tests come once the daemon spawns the bridge
for real (ANAI-31).
…l surface
Replaces the stub `ping` tool with the four ANAI-30 tools (file_read,
file_list, agent_list, channel_send) and wires the bridge binary to forward
each `tools/call` over the daemon IPC socket established in step 1.
Library (lib.rs):
- ToolDispatcher::call now returns DispatchOk { content, is_error }
preserving the tool-error-vs-dispatch-error distinction across the seam
- built_in_tools() declares the four-tool slice; schemas mirror
runtime::tool_runner::builtin_tool_definitions() (kept in lockstep)
- Bridge: manual ServerHandler impl (drops the #[tool_router] macro). Filters
advertised tools by intersecting built_in_tools() with
ToolDispatcher::allowed_tools(); double-checks before dispatch
- Bridge::new now requires a dispatcher (was Option<_>)
Binary (main.rs):
- Reads OPENFANG_BRIDGE_SOCKET / TOKEN / AGENT_ID env vars (last is stub for
ANAI-30; ANAI-31 derives identity from token)
- Connects to daemon, performs Hello/HelloAck handshake, exits on rejection
- IpcDispatcher: bridge-side ToolDispatcher impl. Forwards each call via mpsc
to an actor task that owns the stream; correlation-by-request_id with a
PendingMap<u64, oneshot> so concurrent tools/call invocations don't
serialize at the dispatcher layer
- Reader task drains pending oneshots with an error on connection close so
in-flight calls don't hang; production path exits the process so CC
notices and tears down (gated behind cfg(not(test)))
Tests:
- lib: built_in_tools_has_anai30_slice, permitted_tools_intersects_with_dispatcher_allowed
- main: ipc_dispatcher_round_trip_and_correlation — fake daemon listener,
full handshake, two concurrent calls, verifies per-id correlation and the
NotPermitted gate
Workspace check clean. Daemon-side bridge_ipc tests still pass (4/4).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
End-to-end topology now exists at the type level:
daemon → claude (per-prompt) → openfang-mcp-bridge → IPC → daemon
- Add `caller_agent_id: Option<String>` to CompletionRequest. Plumbed
through all construction sites; agent_loop populates it with
session.agent_id, everywhere else passes None.
- Daemon (`server.rs::run_daemon`): after BridgeIpcServer starts,
publish OPENFANG_BRIDGE_SOCKET and OPENFANG_BRIDGE_BIN as process env
for subprocess drivers to discover. Bridge bin defaults to a sibling
of current_exe; operators can override with OPENFANG_BRIDGE_BIN. Both
set with `unsafe` (edition 2024) but only during single-threaded
daemon startup, before any subprocess spawns.
- BridgeIpcServer gains `socket_path()` accessor.
- ClaudeCodeDriver: per-spawn `try_build_bridge_mcp_config`. When
caller_agent_id is set AND both discovery env vars are present,
generate a UUID token, write `<home>/run/cc-mcp-<uuid>.json` (0600),
and add `--mcp-config <path> --strict-mcp-config` to the claude args.
RAII guard removes the file on drop so per-spawn token lifetime is
bounded by the CC subprocess.
- apply_env_filter extended to strip OPENFANG_BRIDGE_* from CC's child
env. Bridge gets these only via the explicit `env` map in the
mcp-config — CC inheriting them would risk a stray bridge picking up
the daemon socket without a fresh per-spawn token.
- Tests:
- test_build_bridge_mcp_config_shape — verifies wire shape claude
expects: mcpServers.openfang.{command,args,env} with exactly the
three discovery vars in env (no extras to leak state).
- test_apply_env_filter_strips_bridge_discovery_vars — confirms
filter removes all four bridge vars from CC's child env.
- test_bridge_mcp_config_drop_removes_file — RAII cleanup invariant.
Stub points still flagged: token validated as non-empty (ANAI-31
replaces with daemon-issued per-spawn token table); agent_id taken
in-band from CallRequest (ANAI-31 derives from token).
11 CC driver tests pass. bridge_ipc (4) and bridge crate (6) tests
unchanged. Workspace check clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…NABLED
Default-off kill switch so we can deploy the bridge code path without
inlining it into every CC invocation. When the gate is unset or not in
{1, true}, try_build_bridge_mcp_config returns None and CC is spawned
exactly as it was pre-step-4 — no --mcp-config, no temp file, no bridge
child. Validation flow: deploy with gate off (sanity), launchctl setenv
OPENFANG_BRIDGE_ENABLED 1, bounce daemon, observe; if anything regresses,
flip back to 0 and bounce for instant recovery.
Daemon still starts the IPC listener and publishes BRIDGE_SOCKET/BIN env
unconditionally — both are harmless without a bridge child connecting.
Pure additive switch; zero behavior change when off.
Test exercises the full truth table for bridge_enabled() (unset, truthy
variants, falsy/garbage variants) and confirms the gate suppresses
config generation regardless of other env. Single test owns the global
env var so no serial_test infra needed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bridge IPC handshake works standalone (bridge binary connects + Hello/HelloAck ok against the live socket), and the daemon-side `wired CC --mcp-config for OpenFang bridge` debug line confirms the flag is being passed to claude. But no `bridge IPC accepted connection` events ever fire — meaning claude is launched with `--mcp-config` but isn't spawning the MCP server subprocess. Without `--debug`, claude swallows MCP launch errors silently. And we drop CC's stderr on success spawns, so any silent rejection is invisible. Add (both spawn paths): - `--debug` flag when bridge config is wired, so MCP errors print to stderr. - Always log a 4 KB tail of CC stderr at info when bridge_wired, regardless of success/failure. Streaming path now drains stderr concurrently to avoid pipe deadlock under chatty --debug output. Existing 12/12 claude_code unit tests still pass; workspace check clean. Diagnostic only — once the cause is identified we'll pare back to bounded on-demand logging.
- bridge_ipc: promote handshake/dispatch events to INFO and add an `accepted connection` log on accept. Operators can now observe the full bridge lifecycle from daemon stderr without crawling through ~/.claude/debug/<uuid>.txt. - claude_code driver: gate --debug + the 4 KB CC-stderr-tail diagnostic behind a new OPENFANG_BRIDGE_DEBUG env var (off by default). With proper INFO logs daemon-side, the noisy --debug output and the per-spawn ~/.claude/debug/ files are no longer load-bearing. - server: validate operator-supplied OPENFANG_BRIDGE_BIN path at boot and log the resolution outcome (override vs. probe). Catches deploy ordering bugs where the env points at a binary that doesn't exist. Stderr is still drained concurrently in the streaming path — required whenever --debug might be on, cheap when it isn't.
The MCP bridge IPC is unix-domain-socket-only by construction (daemon
listens on a unix socket; bridge subprocess connects to it). The bridge
crate and the daemon-side `bridge_ipc` module unconditionally imported
`tokio::net::{UnixStream, UnixListener}`, which broke Windows CI with
E0432 unresolved-import errors in `openfang-mcp-bridge::main` and
`openfang-api::bridge_ipc`.
Gates:
- `openfang-mcp-bridge::main` — entire body cfg-gated to `unix`; on
non-unix the binary is a no-op stub that prints a clear message and
exits non-zero. Tests gated `cfg(all(test, unix))`.
- `openfang-api::lib` — `pub mod bridge_ipc` gated to `unix`.
- `openfang-api::server::run_daemon` — `BridgeIpcServer::start` call
gated to `unix`; non-unix logs a single info line and proceeds without
bridge IPC. The CC driver's existing missing-socket fallthrough means
CC subprocesses spawn without `--mcp-config` on Windows, matching the
bridge-disabled path.
No behavioral change on Linux/macOS. Windows users get a daemon that
boots without bridge support; MCP-routed tools are unavailable until a
Windows-native transport (named pipes / TCP loopback) lands as a
follow-up.
Verified: cargo check --workspace, cargo check --workspace --tests,
cargo test -p openfang-mcp-bridge -p openfang-api --lib, cargo fmt
--check, and cargo clippy all clean on macOS.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Introduce the security primitives that will gate bridge IPC handshake: - openfang-types::bridge_auth::Token — 32-byte CSPRNG-generated opaque token. No Debug impl (anti-leak); only an 8-hex-char fingerprint() for logs. Constant-time equality, hex round-trip. 8 unit tests. - openfang-runtime::bridge_auth — TokenIssuer trait (issue/revoke) and SpawnGuard. Guard holds Arc<dyn TokenIssuer>; Drop revokes. Trait lives here so the Claude Code driver (phase B) can hold Arc<dyn TokenIssuer> without a circular dep on openfang-api. 2 unit tests via stub issuer. - openfang-api::bridge_auth::BridgeAuthority — concrete TokenIssuer impl. Mutex<HashMap<Token, AgentId>> behind Arc::new_cyclic so issued guards carry a Weak<Self> back for self-revocation. Inherent resolve() and live_spawn_count() for the IPC dispatcher and Debug impl. Manual Debug redacts to spawn count (Token has none by design). 7 unit tests, including an Arc<dyn TokenIssuer> round-trip that exercises the exact abstraction the driver will hold. Dep direction respected: openfang-api -> openfang-runtime -> openfang-types. No wiring yet; phase B threads Arc<dyn TokenIssuer> into ClaudeCodeDriver, phase C constructs the Arc<BridgeAuthority> in the daemon and hands it to both BridgeIpcServer and the driver factory. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wire the spawn side of the bridge handshake. The driver can now mint
authenticated bridge tokens via the daemon-side authority, with the legacy
UUID path preserved so dev builds without an issuer don't regress.
- ClaudeCodeDriver gains token_issuer: Option<Arc<dyn TokenIssuer>>.
Builder with_token_issuer(...) for phase C wiring; new()/with_timeout()
default to None.
- BridgeMcpConfig gains _guard: Option<SpawnGuard>. Lifetime-only field
(underscore-prefixed) — when the spawned process exits and the config
drops, SpawnGuard::drop revokes the token from the authority's spawn
table.
- try_build_bridge_mcp_config is now a &self method with two branches:
* issuer present -> parse caller_agent_id as AgentId, refuse to wire
if it doesn't parse (no anonymous tokens), then issuer.issue(...) and
emit guard.token().to_hex() (64-char hex) on OPENFANG_BRIDGE_TOKEN.
Debug log carries token fingerprint.
* issuer absent -> legacy UUID path (renamed
generate_legacy_bridge_token) — preserves current ANAI-30 behavior
where any non-empty token is treated as authenticated.
- Both call sites in complete() and stream_complete() updated to call
through self. Env var name OPENFANG_BRIDGE_TOKEN unchanged.
- Two test fixtures updated to construct via the driver and to set
_guard: None on synthetic BridgeMcpConfig values. claude_code tests:
21/21 green.
Phase C will construct Arc<BridgeAuthority> in openfang-api::server and
hand it to both BridgeIpcServer::start and the driver factory.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase C1 of bridge auth wiring. create_driver now accepts an optional TokenIssuer; the Claude Code driver consumes it via with_token_issuer. OpenFangKernel gains a token_issuer slot + setter/getter; resolve_driver threads the issuer through to fresh driver builds. Boot-time and one-shot probe sites keep the legacy UUID path (None). Agent-loop fallback sites are marked for C2.
…loop Phase C2 of bridge auth wiring. KernelHandle gains a default-None token_issuer() accessor; OpenFangKernel overrides it to expose the daemon's BridgeAuthority. The agent loop reads the issuer via KernelHandle and threads it into call_with_retry / stream_with_retry, which forward it to create_driver on the ModelNotFound fallback path so fallback drivers also get hardened bridge tokens. server.rs constructs the BridgeAuthority at daemon startup (unix-gated) and hands it to the kernel before background agents start.
…r connection Phase D of bridge auth wiring. validate_hello is renamed to authenticate_hello and now resolves the presented token through the daemon's BridgeAuthority. Well-formed hex tokens that the authority issued resolve to an AgentId that overrides the bridge's claimed agent_id on every subsequent CallRequest (with a warn! on mismatch). Well-formed hex tokens the authority never issued are rejected. Non-hex tokens fall through to the ANAI-30 legacy path for back-compat with spawn sites that haven't yet been wired through the TokenIssuer (boot- time create_driver in particular). Handshake log line now includes the token fingerprint and authenticated agent id for audit correlation. BridgeIpcServer::start gains the authority argument; server.rs hoists the authority out of the C2 cfg block so it can be threaded in.
…gents Phase E of the bridge tool surface v2 work closes a boot-time loophole left by phases C1/C2/D: agents brought up during kernel boot (autostart + persisted) were instantiated before the post-boot `set_token_issuer` call, so their drivers were baked with legacy-UUID identity even though every later code path was already on the hardened BridgeAuthority/TokenIssuer track. Changes: - kernel: add `boot_with_config_and_issuer(config, Option<Arc<dyn TokenIssuer>>)`; the existing `boot_with_config` becomes a thin wrapper passing `None`. The issuer is stashed into the kernel's `token_issuer` slot before the boot-time driver chain is built, so the three in-boot `create_driver` sites see the hardened path. Adds two unit tests covering both entry points. - cli/main: construct `BridgeAuthority` before kernel boot and pass `Some(issuer)` into the new entrypoint; thread the same `Arc` into `run_daemon`. - server::run_daemon: gain `#[cfg(unix)] bridge_authority` param; remove the now-redundant post-boot `set_token_issuer` call. - bridge_auth: add `as_token_issuer()` ergonomic helper. - bridge_ipc: comment refresh - the legacy lane is now reserved for non-unix / non-daemon callers only. Tests: 307/307 green, including the two new boot_with_config tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…IDGE_ALLOWED The bridge subprocess has been deafer than the rest of the system: every CC spawn fell back to the static four-tool DEFAULT_ALLOWED set inside `openfang-mcp-bridge`, regardless of what the calling agent's `agent.toml` actually permits. Capability/RBAC decisions made by the kernel and agent loop didn't reach the bridge. This commit plumbs the existing source of truth — `agent.toml` → kernel-resolved `available_tools` → agent loop — into the bridge's env via `OPENFANG_BRIDGE_ALLOWED`: - Add `allowed_tools: Option<Vec<String>>` to `CompletionRequest`. - Populate at both agent_loop call sites from `available_tools`. - Other 9 construction sites (compactor, routing tests, driver tests, HTTP probe, kernel router probe, fallback test helper) pass `None` — they don't drive bridge subprocesses or don't have an agent context. - Thread the list through `try_build_bridge_mcp_config` → `build_bridge_mcp_config_value`, which now emits `OPENFANG_BRIDGE_ALLOWED` in the per-spawn `--mcp-config` env map. - Strip `OPENFANG_BRIDGE_ALLOWED` in `apply_env_filter` so a CC child never inherits a stale allowlist. - `None` → env absent → bridge falls back to its hard-coded default, matching today's behavior. Empty list is meaningful: it emits the env var as the empty string so the bridge produces an explicit zero-tool surface instead of silently defaulting. Tool surface change: zero. This is pure plumbing; the bridge still only knows about the ANAI-30 four-tool slice. Adding `agent_send` to that slice ships in the next commit. Tests: - New `test_build_bridge_mcp_config_emits_allowed_tools` asserts the comma-separated env var lands in the config. - New `test_build_bridge_mcp_config_emits_empty_allowed_tools_explicitly` pins the empty-list-vs-absent distinction. - Existing `test_build_bridge_mcp_config_shape` updated to assert `OPENFANG_BRIDGE_ALLOWED` is *absent* when no list is supplied. - Existing env-filter test extended to assert the new var is stripped. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
First tool added to the bridge surface past the original ANAI-30 slice. Now that the previous commit makes the bridge honor per-agent capability gates (via OPENFANG_BRIDGE_ALLOWED sourced from each agent's agent.toml), expanding the bridge's *capability* set is safe: agents whose toml does not list agent_send will not see or be able to dispatch it. Schema mirrors `openfang_runtime::tool_runner::builtin_tool_definitions` → agent_send entry. Kept in sync by hand; the bridge crate is runtime-free by design and can't import the source. - `built_in_tools()` in openfang-mcp-bridge gains the agent_send Tool. - `DEFAULT_ALLOWED` in openfang-mcp-bridge/src/main.rs gains agent_send so the legacy/dev path (env var unset) still advertises every tool the bridge can dispatch. Production daemon spawns always set the env. - `bridge_ipc::ALLOWED_TOOLS` (daemon-side ceiling) gains agent_send so the dispatch_call gate doesn't reject it. - `built_in_tools_has_anai30_slice` → `built_in_tools_surface`: now asserts the five-tool set and includes a drift-warning comment. - `permitted_tools_intersects_with_dispatcher_allowed` swaps `agent_send` for `shell_exec` as its "unknown to the bridge" example, since agent_send is no longer unknown. - Bridge `instructions` string updated: drops the ANAI-30 enumeration, explains the gating model instead. Tool surface effect: agents whose agent.toml grants `agent_send` (which already includes the coder agents that have been using it via the legacy path) can now call it through MCP. Agents that don't list it remain unable to. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The bridge IPC path (`bridge_ipc::dispatch_call`) and HTTP `/mcp` endpoint both passed `workspace_root: None` to `execute_tool`, which caused `resolve_file_path` to fall through to a `..`-only validation check. Absolute paths bypassed it entirely and bare-relative paths resolved against the daemon CWD (`~/.openfang`), allowing any agent advertised `mcp__openfang__file_read`/`file_list` to read every sibling workspace plus `secrets.env` and GCP service-account JSON sitting at the openfang root. Fix: - bridge_ipc: look up workspace via authenticated `AgentId` → registry manifest. Refuse FS tools when no workspace is registered rather than silently falling back to an unscoped view. - routes (`POST /mcp`): same gate; HTTP has no ambient agent identity, so callers must pass `_agent_id` in arguments to scope FS calls. Kernel-level tools (agent_list, channel_send, etc.) keep working without `_agent_id` since they don't touch the filesystem. - Both paths now thread `workspace_root` through to `execute_tool`, so `workspace_sandbox::resolve_sandbox_path` actually runs — canonicalize- then-prefix-check blocks absolute escapes and symlink traversal; the existing `..` denial covers relative traversal. Smoke (post-rebuild, from `coder-learn-rust` with file_read/list): - `secrets.env` (bare relative) → ENOENT inside sandboxed workspace - `/Users/rlyeh/.openfang/secrets.env` → "Access denied: resolves outside workspace" - `../assistant/IDENTITY.md` → "Path traversal denied" - `IDENTITY.md` → success (workspace-relative legitimate read works) - `file_list .` → returns the agent's own workspace contents only; zero openfang-root entries Deferred: defensive log on disallowed-tool invocation at execution time (thread C), and removing `tool_runner`'s `workspace_root: None` legacy fallback (used only by tests). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Defense-in-depth against a misbehaving subprocess crafting JSON-RPC calls to tools outside its advertised set. Restructures dispatch_call into three ordered gates: 1. Static bridge allowlist (unchanged) — reject unknown tools pre-identity. 2. Identity resolution (hardened) — authenticated AgentId wins; legacy lane requires claimed string to parse as AgentId AND have a registry entry. Closes the "trust the claimed string" loophole (ANAI-30 follow-up). 3. Per-agent permission gate (new) — calls the canonical resolver kernel.available_tools_with_registry + AgentMode::filter_tools, the exact pair agent_loop uses to build OPENFANG_BRIDGE_ALLOWED at spawn. Advertise-time and execute-time gates cannot drift. 4. Workspace sandbox (D-fix, unchanged) — FS tools require a workspace. Snapshot construction now matches agent_loop's bundled→global→workspace layering, so workspace skill overrides are visible bridge-side. Rejections log warn! with request_id, tool, agent, mode, permitted_count. Caller sees a generic "tool 'X' not permitted for this agent". Promotes available_tools_with_registry from fn to pub fn. Verified end-to-end via raw IPC frames: rejection on agent_list for coder-learn-rust (Gate 2), positive path on file_read, identity gate rejection for unregistered UUID and unparseable legacy claim, and Gate 1 rejection for shell_exec outside bridge surface. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add file_write and web_fetch to the bridge tool surface so CC-backed agents reach parity with API-backed agents for FS-write and HTTP-out. Both inherit the existing four-gate pipeline (static allowlist, identity, per-agent permission via available_tools_with_registry, FS sandbox). - openfang-mcp-bridge: hand-mirrored schemas in built_in_tools(); drift catcher test updated. Both added to DEFAULT_ALLOWED. - openfang-api/bridge_ipc: both added to ALLOWED_TOOLS. file_write added to FS_SANDBOXED_TOOLS; web_fetch deliberately not (no FS touch — SSRF guardrails live in the runtime impl, including the external-content delimiter wrapper). Smoke (raw-IPC harness): permitted positive + unpermitted negative for each tool, all four pass with matching daemon warn lines.
Adds the four remaining agent-lifecycle tools to the MCP bridge surface, bringing parity with OpenFang's native kernel tool set for agent control. - openfang-mcp-bridge/lib.rs: register all four in built_in_tools(); drift-catcher test bumped to 11 tools. - openfang-mcp-bridge/main.rs: add to DEFAULT_ALLOWED. - openfang-api/bridge_ipc.rs: add to ALLOWED_TOOLS. Not added to FS_SANDBOXED_TOOLS (kernel-only, no filesystem touch). Gate behavior is inherited from existing plumbing: - Gate 1 (static allowlist): all four now pass. - Gate 2 (per-agent capability via agent.toml): enforced through the canonical resolver, no extra hardcoded checks. - Gate 3 (FS sandbox): skipped by design. Smoke: 8/8 scenarios green (positives via coder-openfang against a throwaway coder-smoketest agent; negatives via code-reviewer hit Gate 2 cleanly with permitted_count=5 matching her agent.toml). Full spawn → activate → kill lifecycle verified end-to-end through the bridge. Smoketest workspace + registry entry cleaned up.
Brings the MCP bridge surface to 13 tools, leaving only shell_exec pending the Claude Code Bash-disable spike. Memory tools are gated per-agent via agent.toml; not sandboxed at the FS layer since memory writes route through kernel-managed workspace memory. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Material change: OpenFang now authoritatively blocks Claude Code's native filesystem, shell, and web tool surface for every CC subprocess it spawns, by injecting a per-spawn settings.json (written under the bridge socket dir, removed on drop) and passing it via `claude --settings <file>`. Why this works alongside `--dangerously-skip-permissions`: per Anthropic's settings documentation, `permissions.deny` rules are enforced even when the YOLO flag is set — that flag only bypasses allow/ask prompts, not security-critical denies. The daemon keeps non-interactive operation (no TTY to answer prompts) while still authoritatively blocking the native surface. Deny set: Bash, Read, Write, Edit, MultiEdit, NotebookEdit, WebFetch, WebSearch, Glob, Grep. The MCP namespace (`mcp__openfang__*`) lives in a separate pattern space and is untouched — that's the point: replace the bypassable native surface with the gated, RBAC-checked bridge surface. TodoWrite and Task (subagent) are deliberately left alone: agent-internal control flow with no escape to the host. Gated on `bridge_enabled()` so the deny set co-travels with the bridge wiring — either both or neither. Without the bridge wired the agent has no `mcp__openfang__*` fallback, so blanket-denying native tools would yield a useless agent. Also retroactively earns the docstring claim in `llm_driver.rs::skip_permissions`: before this commit the claim that "OpenFang's RBAC layer makes YOLO safe" was true at the bridge wire but false at CC's wide-open native surface. With native tools denied, the bridge is now the only path to host resources, and per-agent agent.toml capabilities become authoritative for CC subprocesses. Tests: - test_build_cc_settings_shape: wire shape, bare tool names only - test_cc_native_deny_includes_glob_grep: pins the deny set decisions - test_cc_settings_file_drop_removes_file: RAII cleanup Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commit 13a denies CC's native Bash via --settings; every CC-driven agent
now needs shell access through the OpenFang bridge instead. This adds
`shell_exec` to the bridge allowlist and threads the two pieces of
context the runtime needs to dispatch it safely:
* `exec_policy` resolved from the registered manifest (per-agent
override or kernel fallback, already populated by the kernel at
agent boot — kernel.rs:1440-1447). Drives the allowlist/full/deny
decision plus shell-metacharacter rejection in tool_runner.
* `allowed_env_vars` read from `manifest.metadata["hand_allowed_env"]`,
mirroring agent_loop.rs:320 — same lookup, same semantics, so the
bridge path and the in-proc path see identical hand-granted env.
`shell_exec` joins `FS_SANDBOXED_TOOLS`: tool_runner sets the command's
cwd to the workspace_root, so without a registered workspace the shell
would default to the daemon CWD (`~/.openfang`, where secrets.env and
the GCP service-account JSON live). Refusing the call when no workspace
is registered keeps that closed.
Tests:
* `allowlist_contains_shell_exec` — name-locked allowlist membership
* `shell_exec_is_workspace_sandboxed` — name-locked sandbox membership
* Updated `ipc_handshake_and_allowlist_gate` to use a synthetic
non-tool name as the "not on allowlist" example; shell_exec is now
permitted there.
Backcompat: agents that already declared `shell_exec` in their
capabilities get it via the bridge with the same gating they had
in-proc. Agents that didn't, can't — gate 2 still rejects.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commit 13a's deny set blocks CC's native `WebSearch`; researcher agents
(researcher, researcher-autumn-business, researcher-autumn-medical)
depend on this surface for primary research. Restoring it through the
bridge means the request flows through OpenFang's multi-provider chain
(Tavily → Brave → Perplexity → DDG) instead of CC's own provider, which
is also a sturdier behavior win — we control the retry/fallback policy.
Implementation is a one-line allowlist add. Every prerequisite was
already in place:
* `web_ctx` is already passed to `execute_tool` at the dispatch site.
* `web_search` is already a fully-implemented native tool with a
ToolDefinition (tool_runner.rs:645) and provider chain
(tool_runner.rs:233).
* Per-agent capability gating already honors `web_search` against the
agent.toml's resolved surface (gate 2 at bridge_ipc.rs:475).
No new context arg, no new FS sandbox concerns (pure-net tool). The
smallest bridge add to date.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commits 9bd3e48 (13b) and 64078d0 (13d) added shell_exec and web_search to the daemon-side ALLOWED_TOOLS allowlist in bridge_ipc, but missed the two corresponding files on the bridge subprocess side: - crates/openfang-mcp-bridge/src/lib.rs :: built_in_tools() The MCP tools/list surface advertised to CC subprocesses. Without an entry here, CC never sees the tool name or schema; permitted_tools() intersects this list with the dispatcher's allowed_tools, so the daemon's allowlist alone is not enough. - crates/openfang-mcp-bridge/src/main.rs :: DEFAULT_ALLOWED Fallback allowlist used when OPENFANG_BRIDGE_ALLOWED is unset (legacy / dev spawn path). Mirrors the daemon's bridge_ipc list. Net effect: with 13a deploying the deny of CC's native Bash/WebSearch tools, CC agents would have lost shell + web search entirely on deploy. This commit closes that gap so the bridge actually delivers what 13b/13d advertised. Surface drift test (built_in_tools_surface) updated with both names. The permitted_tools intersection test now uses a synthetic 'not_a_real_tool' name since shell_exec is no longer an outsider. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bridges OpenFang's native apply_patch tool (registered in tool_runner.rs at the ToolDefinition table, dispatched at line 209) end-to-end so CC subprocesses can perform surgical multi-hunk edits instead of paying the token + drift cost of whole-file file_write rewrites. This is the closest analogue to CC's native `Edit` tool that OpenFang ships today — slightly higher emit cost (unified diff vs string substitution) but the same core ergonomic: change only what changes. A native `string_edit` follow-up may replace this as the primary edit ergonomic for agents; for now apply_patch closes the worst of the "no Edit" gap left by 13a's CC native-tool deny set. Three files, mirroring every prior bridge-tool add since 908c39c: - crates/openfang-api/src/bridge_ipc.rs • apply_patch added to ALLOWED_TOOLS • apply_patch added to FS_SANDBOXED_TOOLS — tool_apply_patch resolves Add/Update/Delete paths embedded in the patch against workspace_root, so the no-workspace fail-closed gate is critical (same sibling-leak class as file_write) • Two name-locked tests: allowlist + sandbox membership - crates/openfang-mcp-bridge/src/lib.rs • Tool::new("apply_patch", ...) added to built_in_tools() (mirrors the runtime schema by hand, as the bridge crate is runtime-free by design) • built_in_tools_surface test updated - crates/openfang-mcp-bridge/src/main.rs • apply_patch added to DEFAULT_ALLOWED No new plumbing required — workspace_root + the kernel handle are already threaded into the bridge's execute_tool call site, and the native apply_patch implementation already enforces the sandbox via crate::apply_patch::apply_patch(&ops, root). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commit 18 closes the gaps surfaced by the deny-set audit (plan §10d). Adds 14 entries to CC_NATIVE_DENY, bringing it from 10 → 24 names, grouped into five categories: - Bash adjuncts / shell streamer: BashOutput, KillShell, KillBash, Monitor. Inert today (Bash is denied) but locked down as defense in depth — Monitor in particular is Bash-with-streaming and bypasses shell_exec's exec_policy gating entirely. - FS escape via worktree: EnterWorktree, ExitWorktree. Direct FS-escape primitive; native workspace-aware variant is on the follow-up backlog. - Notebook read: NotebookRead. Symmetry with NotebookEdit; forward-compat against future CC versions that ship it. - Skill substrate: SlashCommand. CC's parallel skill curation surface; OF's is canonical. - OpenFang-First scheduling / control plane: CronCreate, CronDelete, CronList, ScheduleWakeup, RemoteTrigger, plus PushNotification on the comms-routing axis. Deliberately NOT denied (pinned in tests): TodoWrite, Task, Skill, AskUserQuestion, EnterPlanMode, ExitPlanMode, TaskOutput, TaskStop, ToolSearch — agent-internal control flow / UX with no escape surface. Tests: existing pins extended; new test_cc_native_deny_covers_audit_gaps locks in each new addition with a category-rationale assertion. 20/20 in drivers::claude_code green; 1005/1005 openfang-runtime lib green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Locks in the invariant that the bridge tool surface is described by three co-equal lists, all of which must agree: 1. openfang_api::bridge_ipc::ALLOWED_TOOLS (daemon dispatch gate) 2. openfang_mcp_bridge::built_in_tools() (MCP advertise surface) 3. openfang_mcp_bridge::DEFAULT_ALLOWED (bridge process default) 13b/13d shipped with a tool dispatchable on the daemon side but absent from the MCP advertise surface — invisible to CC. The existing IPC smoke tests exercised the daemon path directly and never hit `tools/list`, so the gap shipped silently in two commits in a row. This commit closes that class of failure permanently. Changes: - Move `DEFAULT_ALLOWED` from `main.rs` (binary, untestable) into `lib.rs` as `pub const`, so the drift-catcher can reach all three sets from `openfang-api` tests. `main.rs` now imports it. - Add `allowlist_three_way_correspondence` in `bridge_ipc::tests` — asserts the three sets are equal, with diff output identifying which set is missing what when it fires. - Add `allowlist_count_is_sixteen` — pins cardinality on all three sets so an off-by-one add/remove on any single side fails loudly. Both tests are documented with the lesson from 13b/13d and the three-file pattern any future bridge tool add must follow. Test results: - openfang-api lib: 13/13 bridge_ipc tests pass (was 11; +2 new) - openfang-mcp-bridge lib: 5/5 - openfang-runtime lib: 1005/1005 - openfang-runtime drivers::claude_code: 20/20 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The exec_policy deny path in shell_exec and process_start formatted the
rejection message as "{reason}. Current exec_policy.mode = ...", but
validate_command_allowlist already terminates its reason string with a
period. Result was "safe_bins.. Current ..." in every deny payload.
Trim a trailing "." off the reason before interpolating so the rendered
message has a single, correctly-placed period.
Cosmetic only - no behavior change to the deny gate itself.
Closes S1-11 / S3-03 / S8-02 (FS-sandbox drift) and the FS-half of S3-01
(HTTP `/mcp` parity) by collapsing two drifted copies of the FS-sandbox
allowlist into one canonical `pub const FS_SANDBOXED_TOOLS` in
`openfang_runtime::tool_runner` (6-tool union).
Before: `bridge_ipc.rs` carried a 5-tool list (file_read, file_list,
file_write, shell_exec, apply_patch); `routes.rs` carried a 4-tool list
(file_read, file_list, file_write, create_directory). Both lists gate
whether a tool call's path args get rewritten against the caller's
`workspace_root` and whether the call fail-closes when no workspace is
registered. Drift meant:
- HTTP `/mcp` invoking `shell_exec` did **not** force cwd to
`workspace_root` — sibling-workspace / secrets.env leak.
- HTTP `/mcp` invoking `apply_patch` did **not** sandbox patch-embedded
paths — same leak surface.
- IPC invoking `create_directory` did **not** sandbox the target path
(when reachable) — same leak surface inverted.
The union closes all three holes in one place. Both surfaces now consult
the same constant; future FS-touching tools land in one file.
Side effect: `create_directory` is added to `ALLOWED_TOOLS`,
`DEFAULT_ALLOWED`, and `built_in_tools()`. The PR's stated objective is
full native-tool parity on the bridge; its absence from those three lists
was an oversight, not a deliberate gate. Audit of
`~/.openfang/agents/*/agent.toml` confirms zero references to
`create_directory` — no agent was relying on it being unreachable.
Tests:
- New `fs_sandboxed_tools_subset_of_allowed_tools` (belt-and-braces):
catches "added an FS tool to the sandbox set without putting it on
the bridge surface" or the inverse.
- Existing `allowlist_three_way_correspondence` continues to enforce
daemon-dispatch ≡ MCP-advertise ≡ bridge-default sets.
- Cardinality assertion bumped 16 → 17 in lockstep.
Test results: cargo test -p openfang-api -p openfang-mcp-bridge
-p openfang-runtime → 1170 passed, 0 failed.
… (S9-09) Tier B: PowerShell -EncodedCommand / -ec / -e payloads are base64(UTF-16LE) encoded scripts that bypass the existing inline-flag allowlist. Decode them and feed the inner script back through the same wrapper-extraction logic so they are validated against allowed_commands / safe_bins like any other inline script. Recursion is capped at MAX_SHELL_RECURSION_DEPTH = 2 (one outer wrapper plus one nested wrapper), which is more than any legitimate pwsh-in-pwsh use case needs and prevents algorithmic-DoS via deeply-nested payloads. Tier C: hard-deny load-from-disk and interactive flags on every known shell wrapper, regardless of allowlist contents — the validator cannot see what those flags will load. - pwsh / powershell: -File, -PSConsoleFile - bash / sh / zsh: --rcfile, --init-file, -i - bash / sh / zsh: -O extdebug (two-token form, handled separately) These checks run before any other parsing in validate_command_allowlist, so a flag like `bash -i -c "..."` is rejected outright rather than relying on the inline-script extractor. Internal API: extract_shell_wrapper_commands and extract_inner_script_commands now return Result<Vec<String>, String> so decode failures, depth violations, and disk-load flag violations propagate as clear errors. Public surface (validate_command_allowlist) signature is unchanged. Tests: 13 new unit tests covering both tiers, plus a roundtrip and odd-length test for the UTF-16LE decoder. Existing 50 subprocess_sandbox tests unchanged. cargo test -p openfang-runtime --lib subprocess_sandbox → 63 passed, 0 failed. Closes S9-09.
…rp tools (S9-08) Closes S9-08. Three new gates in Allowlist mode: 1. WRAPPER_BINARIES_RECURSE (env, sudo, nice, nohup, timeout). The validator now unwraps these and validates the inner binary against the allowlist — plugging the `env FOO=bar /bin/evil` and `sudo /bin/evil` bypass where only the outer wrapper was previously checked. Per-wrapper flag-parsers handle the common flag-with-value forms (`-u`, `-g`, `--unset=`, `-n`, `-s SIG`, `-k DUR`, `--`, `KEY=VALUE` env assignments, `timeout`'s DURATION positional). Fail-closed: unrecognized flag patterns or missing inner command → reject. 2. WRAPPER_BINARIES_DENY (xargs, find, strace, gdb, chroot, unshare, setsid, stdbuf, flock, time) — hard-denied regardless of allowlist contents. Sentinel-token execution (`xargs`, `find -exec`), process-tracing (`strace`, `gdb`), and namespace/buffer manipulation have no legitimate use through LLM-driven `shell_exec`. 3. INLINE_SCRIPT_INTERPRETERS — `python -c`, `node -e/--eval/-p/--print`, `perl -e/-E`, `ruby -e`. Inline scripts in these languages are not parseable for allowlist validation; hard-denied. Operators can still pass a script file path (a plain arg, not a command) or use Full mode. Recursion: wrapper-binary chains share the existing MAX_SHELL_RECURSION_DEPTH cap (2). Inner segments that are themselves shell wrappers route through `extract_shell_wrapper_inner`; nested wrapper binaries recurse through `extract_wrapper_binary_chain`. Gates also fire on segments extracted from shell-wrapper inline scripts, so `bash -c "sudo evil"` and `bash -c "xargs ls"` are caught. Tests: 25 new (88 total in subprocess_sandbox; 1044 in -runtime; 113 in -api; 33 in -mcp-bridge). All passing. No exposure change on any surface — this is allowlist-validation tightening only.
Closes S3-01 from the bridge-v2 audit. Before this commit, `shell_exec`
dispatched through the HTTP `/mcp` endpoint ran with
`exec_policy = None` and an empty hand-allowed-env list — degrading to
the daemon-global `ExecPolicy` (typically `Full` on developer setups)
and silently bypassing every per-agent manifest gate the operator had
authored. The bridge IPC path applied the gate correctly; HTTP did not.
This commit:
1. Extracts the per-agent resolution into a shared helper module
`openfang_runtime::agent_tool_context` exposing:
- `AgentExecContext::from_manifest(&AgentManifest)` — pure read-only
derivation of `exec_policy` and `hand_allowed_env` from the
manifest, matching the existing pattern at agent_loop.rs:320,
agent_loop.rs:1572, and bridge_ipc.rs:553.
- `EXEC_POLICY_REQUIRED_TOOLS` + `requires_exec_policy(tool)` —
the tight set of tools (currently `shell_exec`) whose dispatch
is unsafe without a manifest-bound policy.
2. Wires the helper into `bridge_ipc::dispatch_call`, replacing the
inline 16-line extraction with a 4-line shared call. Behavior
unchanged on the bridge IPC path.
3. Wires the helper into `routes::mcp_http`:
- Resolves the registry entry once so workspace lookup and
exec_context lookup share a manifest snapshot (no TOCTOU).
- Fail-loud (-32602) when `requires_exec_policy(tool_name)` and
no manifest-bound `exec_policy` is available — distinct error
messages for "no `_agent_id` supplied", "agent has no registry
entry", and "manifest has no `[exec_policy]`". Per Ben's call:
loud during the security-fix window so misconfigured callers
surface immediately rather than silently degrade.
- Threads `allowed_env_arg` and `effective_exec_policy` into
`execute_tool` from `AgentExecContext` instead of hard-coded
`None, None`.
Tests (7 new in `agent_tool_context::tests`):
- Default manifest → no policy, no env.
- Manifest with `exec_policy` propagates intact (mode + allowed_commands).
- `hand_allowed_env` array reads from metadata correctly.
- Malformed `hand_allowed_env` value (string instead of array) →
fail-closed to empty (NOT partial-bind).
- Empty `hand_allowed_env` array → `allowed_env()` returns `None`
(parity with bridge_ipc.rs:559's "fall through to runtime default"
semantics, not "explicitly grant nothing").
- `requires_exec_policy` flags `shell_exec` and nothing else.
- Drift-catcher: `EXEC_POLICY_REQUIRED_TOOLS` still contains
`shell_exec` (forces intentional change if someone removes it).
Verification:
- `cargo test -p openfang-runtime --lib agent_tool_context`
→ 7 passed, 0 failed.
- `cargo test -p openfang-api --lib` → 113 passed, 0 failed
(bridge_ipc tests unchanged after refactor).
- `cargo test -p openfang-runtime --lib subprocess_sandbox`
→ 88 passed (prior Stage 9 commits unaffected).
- `cargo build -p openfang-api` clean.
Files touched: 4 (1 new). +143/-19 LOC including the new helper
module and its tests.
Bridge-v2 audit: S3-01 (HIGH, block-this-PR). Stage 9 rollup
unaffected.
… routes (S6-04)
`/api/agents/{id}/message` and `/api/agents/{id}/message/stream` accepted
`sender_id` / `sender_name` from the request body and threaded them straight
into the kernel, where they land in the agent's prompt context as the
authoritative "this turn came from $sender". Neither route has a
provenance/auth path today (no `BridgeAuthority` resolve, no per-spawn
token, no signed envelope), so any local POST could forge any channel
identity — e.g. claim to be a specific WhatsApp number or Telegram user-id
and the LLM would treat the turn as that user's. This is the impersonation
primitive behind the S6-04 → S6-09 → S4-03 chain described in
`stage-6-findings.md`.
Tight fix: refuse the assertion entirely. Both HTTP routes now hard-fail
(400) with a stable error string when `sender_id` or `sender_name` is
present, and pass `None`/`None` to the kernel unconditionally. Empty
strings are treated as absence (defensive). Channel adapters (Telegram,
WhatsApp, Discord, ...) do **not** call these routes — they reach the
kernel directly via the channel router — so legitimate channel flows are
unaffected.
The follow-up that *restores* the legitimate identity-forwarding capability
under a proper trust model is filed as **FU-01** in
`_shared/openfang/findings/bridge-v2/followups.md` (Option B from the
design discussion): add a bridge-auth-style provenance path so verified
callers can supply identity, with hard-fail on mismatch, and tag inbound
messages with a remote-origin envelope (merges naturally with the S6-09 /
S5-01 family). Bundled together because that's a single design across
three patch points; cheap to do once, expensive to retrofit later.
Implementation:
- New `pub(crate) fn reject_unauthenticated_sender_identity(&MessageRequest)
-> Option<&'static str>` with stable error strings (test-pinned).
- `send_message` and `send_message_stream` both validate before kernel
dispatch; on rejection, log a structured warn with the agent id and the
reason, return 400 JSON `{"error": <reason>}`.
- Kernel calls now pass `None, None` for sender identity in both routes —
validator guarantees the fields are absent/empty, but we pass `None`
explicitly to be defensive (belt-and-suspenders).
Tests (6 new, all passing):
- `s604_no_sender_fields_is_allowed`
- `s604_empty_sender_fields_is_allowed` (empty-string == absence)
- `s604_sender_id_alone_is_rejected`
- `s604_sender_name_alone_is_rejected`
- `s604_both_fields_rejected_with_combined_message`
- `s604_reason_strings_are_stable` (pins exact error strings — breaking
change for callers if these are bumped)
`cargo test -p openfang-api --lib` → **119 passed** (113 prior + 6 new).
Closes S6-04 (HIGH, block-this-PR).
Refs: FU-01 in `_shared/openfang/findings/bridge-v2/followups.md`
… allowlist (S7-06/S4-02)
The bridge's `DEFAULT_ALLOWED` constant — used by `openfang-mcp-bridge`'s
`main.rs` when `OPENFANG_BRIDGE_ALLOWED` is unset (legacy/dev/test fallback) —
previously included `agent_spawn`, `agent_kill`, and `agent_activate`. In
the fallback path any agent could spawn/kill/activate sibling agents
regardless of its `agent.toml` capabilities.
The production runtime path is unaffected: `agent_loop.rs` already threads
the manifest-derived `available_tools` into `OPENFANG_BRIDGE_ALLOWED`, so
agents that opt in via manifest still reach these tools. The daemon-side
dispatch (`ALLOWED_TOOLS`) and MCP advertise surface (`built_in_tools()`)
retain all 17 tools so opted-in agents can still dispatch them.
Changes:
- `openfang-mcp-bridge::DEFAULT_ALLOWED` shrinks from 17 → 14 entries
(drops `agent_spawn`, `agent_kill`, `agent_activate`).
- New `openfang-mcp-bridge::PRIVILEGED_DEFAULT_DENY` lists the three
excluded tools with the rationale doc-pinned in source.
- Drift-catcher in `bridge_ipc::tests` evolves from three-way equality to:
- Invariant A (equality): `ALLOWED_TOOLS == built_in_tools()`.
- Invariant B (safe-by-default subset):
`DEFAULT_ALLOWED ⊂ ALLOWED_TOOLS`,
`PRIVILEGED_DEFAULT_DENY ⊂ ALLOWED_TOOLS`,
`DEFAULT_ALLOWED ∩ PRIVILEGED_DEFAULT_DENY = ∅`,
`DEFAULT_ALLOWED ∪ PRIVILEGED_DEFAULT_DENY = ALLOWED_TOOLS`
(so every new tool must be classified default-safe or privileged-deny).
- New name-level regression guard `privileged_lifecycle_tools_excluded_from_default`
hard-pins the three tool names so a grep for `agent_spawn` lands on the
guard.
Tests: `cargo test -p openfang-mcp-bridge --lib` → 5 passed.
`cargo test -p openfang-api --lib` → 120 passed (was 119; +1 new).
Closes S7-06 / S4-02 (bridge-side), the last block-this-PR HIGH on
`feat/bridge-tool-surface-v2`.
…andbox Defense-in-depth for the sandbox: in addition to confining agent file operations to `workspace_root`, `resolve_sandbox_path` now hard-denies a curated set of sensitive paths under the OpenFang home, regardless of the agent's workspace_root configuration. Motivation: the workspace-confinement check is sufficient only when each agent's `workspace_root` is correctly scoped to its own subdir under `workspaces/`. A misconfigured or privileged manifest (e.g. `workspace_root = "~/.openfang"` or `"~"`), or a future tool surface that bypasses workspace confinement, could otherwise read or overwrite credentials, runtime tokens, and daemon state. Generalizes jaberjaber's review feedback on the closed RightNow-AI#1162 to the right layer. Deny categories (returned as stable reason strings for audit): - config config.toml - config-backup config.toml.bak* - secrets secrets.env, .env - daemon-state daemon.json - credential-file gcp-key*, *.pem, *.key, *.p12, *.pfx - runtime-tokens run/ (bridge sockets, mcp-config-*.json) - credential-vault vault/ - paired-devices paired_devices.json - daemon-log daemon.stderr.log*, daemon.stdout.log* Workspaces, skills, bin, scripts, and other non-sensitive subdirs of the OpenFang home remain unaffected. Implementation notes: - New private `is_sensitive_openfang_path()` predicate, called after canonicalization (symlink escapes are resolved before the check). - `openfang_home()` honors `OPENFANG_HOME` env var, falls back to `$HOME/.openfang`. No new crate dependency (uses std::env only). - Denials emit `tracing::warn!` with target `openfang_runtime::sandbox`, the user-supplied path, the resolved path, and the reason string, for auditability. Tests: 8 new unit tests covering each deny category, non-sensitive passthroughs, the misconfigured-workspace_root case, and the absolute-path escape attempt. All 14 workspace_sandbox tests pass.
Clears RUSTSEC-2026-0141 (CVSS 9.1 — TLS hostname verification disabled with Boring TLS backend). Workspace uses tokio1-rustls-tls so the vuln is not exploitable in this config, but the advisory still gates cargo audit so we bump to clear CI.
Pre-flight for upstream CI on PR RightNow-AI#1205. Three clippy lints fixed alongside a workspace-wide cargo fmt --all run: doc_lazy_continuation in openfang-mcp-bridge/src/lib.rs (paragraph break before the 'Default tool allowlist' doc block), manual_inspect at four call sites in runtime/drivers/claude_code.rs (switch .map to .inspect for side-effect-only closures over Option of TempPath), and field_reassign_with_default in runtime/agent_tool_context.rs (collapsed mut policy assignment into struct-update-syntax init). No behavior changes.
Windows clippy CI was failing with -D unused-imports because `tokio::io::BufReader` and `tracing_subscriber::EnvFilter` are only referenced inside the `#[cfg(unix)] async fn main`. The items they support were already gated; the imports were not. Add `#[cfg(unix)]` to both `use` lines so the lint matches usage on non-unix targets.
Document the full local gate (fmt, clippy, test, audit) plus the cross-platform cfg-gating gotcha that bit us on RightNow-AI#1205. Anyone opening a PR should run all of this locally first so CI feedback loops stay short.
`tracing::warn!` is only invoked from `#[cfg(unix)]` blocks in this
file (signal-handler shutdown paths). On Windows, the bare `warn`
import was flagged unused under `-D warnings`.
Verified locally via cross-compile:
cargo clippy --target x86_64-pc-windows-gnu --workspace \
--all-targets -- -D warnings
Automates the Windows cross-compile sanity check we've been running manually for the last few rounds of cfg-gate fixes. Verifies the x86_64-pc-windows-gnu rustup target + mingw-w64 are present, sets linker/CC env vars, then runs clippy (default), check, or build against the Windows target. Referenced from the PR template pre-PR checklist.
This was referenced May 20, 2026
7 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Lands the full OpenFang tool surface (file / memory / agent / shell / web /
patch) over the MCP bridge so Claude Code subprocess agents act on
OpenFang's authoritative tool implementations instead of CC's native ones.
Stage 9 of the same branch closes every audit finding raised against that
new surface — shipping the surface without those closures would ship a
known-bad feature, so they land together as a single concern: land the
bridge surface safely. 37 commits ahead of
upstream/main(
8405c4a..421e5d9); near-single-file commits; workspace tests green atHEAD; live-smoked end-to-end.
Changes
Bridge tool surface (
openfang-mcp-bridge+ driver wiring):openfang-mcp-bridgecrate; per-spawn IPC listener on the daemonside, bridge-side
RuntimeDispatcherclient.--mcp-config, gated onOPENFANG_BRIDGE_ENABLED.BridgeAuthority+TokenIssuer: Hello frames authenticated,AgentIdbound per connection,
TokenIssuerconstructed before kernel boot soautostart agents are hardened from t=0.
CompletionRequest → OPENFANG_BRIDGE_ALLOWED;execute-time permission gate honors
agent.tomlcapabilities.agent_send,file_write,web_fetch,agent_spawn/kill/activate/find,memory_store/recall,shell_exec(workspace + exec_policy gated),web_search,apply_patch(workspace sandboxed).
--settings(
CC_NATIVE_DENYexpansion), so the only path for those ops is backthrough OpenFang.
Hardening (Stage 9 — audit closures):
ee7edd5521391aFS_SANDBOXED_TOOLSgate; bridgecreate_directorye59cdaf478ac5a48a7741-EncodedCommanddecode + recurse; load-from-disk / interactive hard-deny52e13abenv/sudo/nice/nohup/timeout); hard-denyxargs/find/strace/gdb/chroot/unshare/setsid/stdbuf/flock/time; inline-interpreter-c/-edeny8c74500/mcpexec_policy parity with bridge IPC (TOCTOU closed)f01700csender_id/sender_nameb421318421e5d9421e5d9generalizes the PR #1162 review feedback (config exfil via outboundattach) to the sandbox layer, so the protection is independent of any
single tool surface and survives
workspace_rootmisconfiguration.Categories:
config/config-backup/secrets/daemon-state/credential-file(gcp-key*/*.pem/*.key/*.p12/*.pfx) /runtime-tokens(run/) /credential-vault(vault/) /paired-devices/daemon-log(rotated variants included).workspaces/,skills/,bin/,scripts/,src/pass through untouched.Dependencies: no new workspace dependencies.
421e5d9usesstd::envrather than pulling
dirs.Deferred / out of scope (tracked in maintainer notes, do not block
merge):
shell_execsubprocess bypass ofresolve_sandbox_path(
cat/headunderexec_policy = Full). Layering property, not aregression introduced by this PR. Fix shape: argv-path inspection +
default
exec_policy = Nonefor non-developer agents.AgentManifest::workspacesilent-ignore. Field defined butno loader threads it into
SandboxConfig.workspace_root; no#[serde(deny_unknown_fields)]. Fix shape: thread it through, or removethe field and add
deny_unknown_fields.session
Messagedeserialization drift) — also deferred, documented.Testing
cargo clippy --workspace --all-targets -- -D warningspassescargo test --workspacepassesNotes:
cargo build -p openfang-runtimeclean at HEAD; clippy clean on the newcode. Pre-existing warnings in
agent_tool_context.rsanddrivers/claude_code.rsare unchanged by this PR.subprocess_sandboxmodule: 88 tests passing (25 new for S9-08, 13new for S9-09, plus the three-way drift-catcher).
workspace_sandboxmodule: 14 tests passing (6 original + 8 new forthe sensitive-path predicate, including misconfigured-
workspace_rootdefense-in-depth assertions).
file_readagainst<openfang_home>/config.tomlwas denied withreason="config"fromopenfang_runtime::sandbox, while aworkspace-internal
file_readsucceeded. A second probe via athrowaway agent exercised the structured deny error surfaced to the
caller and confirmed the predicate fired with the correct reason in the
daemon log.
Security
Notes:
numbered audit finding (table above); each ships with unit tests
asserting the deny path.
AgentIdbinding,
TokenIssuerconstructed pre-boot so autostart agents aregated from t=0.
and before workspace-confinement, with auditable
tracing::warn!(target = "openfang_runtime::sandbox", ...)on every denial(user_path, resolved path, reason).
sender_id/sender_name(S6-04).shell_execargv parsing recurses through wrapper binaries(
env/sudo/nice/nohup/timeout) and hard-deniesxargs/find/strace/gdb/chroot/unshare/setsid/stdbuf/flock/time;inline-interpreter
-c/-eflags denied; pwsh-EncodedCommanddecodedand recursed.
apply_patchconfined to agent workspace.agent_spawn/killand similar) excludedfrom the bridge default allowlist; explicit per-agent grant required.
cc @jaberjaber for security review of the Stage 9 hardening commits.