feat(fleet): add MCP channel server for real-time agent communication by Wirasm · Pull Request #642 · Wirasm/kild

Wirasm · 2026-03-24T11:57:16Z

Summary

Adds a kild-fleet MCP channel server that watches inbox files and pushes notifications into Claude Code sessions via the channels protocol. Reduces fleet communication latency from ~1s (Claude inbox polling) to ~100ms (fs.watch + stdio notification). Also exposes MCP tools so agents can write status/reports/messages without shelling out to CLI commands.

Changes

New: integrations/channel.rs — installs channel server, patches .mcp.json, cleanup on destroy
New: kild init-channels CLI command — installs server.ts + bun deps at ~/.kild/channels/fleet/
New: FleetConfig in kild-config with [fleet] channels toggle (default: false)
Modified: fleet_agent_flags() appends --dangerously-load-development-channels server:kild-fleet when channels enabled
Modified: daemon_spawn.rs calls setup_channel_integration() in the spawn sequence
Modified: destroy.rs cleans up .mcp.json from project root for --main sessions
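
The flag gating described for fleet_agent_flags() can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the signature and the shape of `FleetConfig` are assumptions, and only the flag string and the `false` default come from the description above.

```rust
/// Hypothetical mirror of the `[fleet]` config section; the real struct
/// lives in kild-config and may carry more fields.
#[derive(Debug, Clone, Copy, Default)]
pub struct FleetConfig {
    /// Off by default, matching `channels = false` in the PR description.
    pub channels: bool,
}

/// Append the channel flag to an agent's argument list when channels are on.
/// The function name matches the PR; this signature is illustrative only.
pub fn fleet_agent_flags(base: &[&str], fleet: FleetConfig) -> Vec<String> {
    let mut flags: Vec<String> = base.iter().map(|s| s.to_string()).collect();
    if fleet.channels {
        flags.push("--dangerously-load-development-channels".to_string());
        flags.push("server:kild-fleet".to_string());
    }
    flags
}
```

With channels disabled the argument list passes through unchanged, which is the graceful-degradation path listed under Testing.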

Architecture

Agent brain (Claude session)         Agent worker (Claude session)
  └─ kild-fleet MCP server            └─ kild-fleet MCP server
       │  fs.watch(KILD_FLEET_DIR)         │  fs.watch(KILD_INBOX)
       │                                   │
       └──── ~/.kild/inbox/<project>/ ─────┘
              (existing file protocol)

The channel server is a TypeScript/Bun MCP server embedded as a string constant in the Rust binary. It reads $KILD_INBOX and $KILD_FLEET_DIR from the PTY environment (already injected by the inbox module). The file-based inbox protocol remains the source of truth — the channel is a notification + tooling layer on top.

MCP Tools Exposed

| Tool | Role | Action |
| --- | --- | --- |
| `report_status` | worker | Write status + optional report.md |
| `send_to_brain` | worker | Write task.md to brain's inbox |
| `send_to_worker` | brain | Write task.md to worker's inbox |
| `list_fleet` | both | List fleet members with status |

Files Changed

15 files changed (+616, -23)

File list

crates/kild-core/src/sessions/integrations/channel.rs (new, 442 lines)
crates/kild/src/commands/init_channels.rs (new, 59 lines)
crates/kild-config/src/types.rs — FleetConfig struct
crates/kild-config/src/loading.rs — merge logic
crates/kild-config/src/lib.rs — re-export
crates/kild-core/src/lib.rs — re-export
crates/kild-core/src/sessions/daemon_spawn.rs — wire channel integration
crates/kild-core/src/sessions/fleet.rs — channels flag in fleet_agent_flags()
crates/kild-core/src/sessions/destroy.rs — .mcp.json cleanup
crates/kild-core/src/sessions/daemon_helpers.rs — re-export
crates/kild-core/src/sessions/integrations/mod.rs — register module
crates/kild-paths/src/lib.rs — channels_dir(), fleet_channel_dir()
crates/kild/src/app/misc.rs — init-channels clap def
crates/kild/src/app/mod.rs — subcommand registration
crates/kild/src/commands/mod.rs — dispatch

Testing

cargo fmt --check && cargo clippy --all -- -D warnings passes
cargo test --all passes (0 failures)
cargo build --all succeeds
E2E: kild init-channels installs server + deps
E2E: Brain create gets .mcp.json + channels flag
E2E: Worker create gets .mcp.json + channels flag
E2E: kild inject → worker receives task, writes status + report
E2E: Brain destroy cleans up .mcp.json from project root
E2E: Graceful degradation when [fleet] channels = false (no .mcp.json, no flag)

Configuration

[fleet]
channels = true  # default: false (research preview)

Requires Bun runtime. Run kild init-channels to install dependencies.

* investigate: mutex unwrap in get_process_metrics can panic (#333)

* fix: replace mutex unwrap with proper error handling in get_process_metrics (#333)

  `get_process_metrics()` used `.unwrap()` on `SYSTEM.lock()` which would panic if the mutex was poisoned. Replace with `.map_err()` to convert into `ProcessError::SystemError`, which the caller already handles gracefully via `Option`.

  Fixes #333
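
A minimal sketch of the poisoned-lock handling described in this fix. The `SystemState` type, the `read_cpu` function, and the `message` field are hypothetical stand-ins; only the `.map_err()` approach and the `ProcessError::SystemError` variant name come from the commit message.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the real error type; only the `SystemError`
/// variant name is taken from the commit message.
#[derive(Debug, PartialEq)]
pub enum ProcessError {
    SystemError { message: String },
}

/// Hypothetical shared state; the real code guards a system-info handle.
pub struct SystemState {
    pub cpu_percent: f32,
}

/// Read a metric without panicking if another thread poisoned the lock.
pub fn read_cpu(system: &Mutex<SystemState>) -> Result<f32, ProcessError> {
    // `lock()` returns Err(PoisonError) if a holder panicked; convert it
    // into a domain error instead of calling `.unwrap()`.
    let guard = system.lock().map_err(|e| ProcessError::SystemError {
        message: format!("system lock poisoned: {e}"),
    })?;
    Ok(guard.cpu_percent)
}
```

A caller that already treats the metric as optional can then use `read_cpu(&state).ok()`.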

* Investigate issue #289: Ghostty focus/hide after CG migration

  Root cause: focus_window doesn't unminimize; AXRaised + activate_app don't undo kAXMinimizedAttribute set by kild hide. Need to add UnminimizeAndRaise action that sets kAXMinimized=false before raising.

* fix: unminimize Ghostty window before raising on focus (#289)

  After the Core Graphics migration (#286), `kild focus` could not restore a window previously minimized via `kild hide` because the AX API focus path only raised the window without unsetting kAXMinimizedAttribute.

  Changes:
  - Add UnminimizeAndRaise variant to WindowAction enum
  - Add ax_unminimize_and_raise_window function mirroring existing pattern
  - Check is_minimized in focus_window and unminimize before raising
  - Add test for new WindowAction variant

  Fixes #289

* fix: add specific unminimize failure logging and exhaustive match test

  Add dedicated focus_unminimize_failed log event to distinguish unminimize failures from general AX raise failures, which helps debug Ghostty AX quirks. Replace array length test with exhaustive match test for WindowAction variants, providing stronger compile-time guarantees.

* Investigate issue #320: daemon PTY sessions exit immediately after open/resume

* fix: detect early PTY exit in daemon sessions (#320)

  When a daemon PTY process exits immediately after spawn (bad resume session, missing binary, env issue), kild now detects it within 200ms instead of letting the user discover it later via `kild attach`.

  Changes:
  - Add exit_code field to DaemonSession and SessionInfo wire type
  - Store exit_code in handle_pty_exit before transitioning to Stopped
  - Add get_session_info() and read_scrollback() daemon client functions
  - Add DaemonPtyExitedEarly error variant with exit code and scrollback
  - Add post-creation health check in both create_session and open_session daemon paths (200ms grace period + daemon status poll)
  - Clean up stopped daemon session on early exit detection

  Fixes #320

* fix: send SIGWINCH to PTY when terminal element resizes

  The embedded terminal was hardcoded to 80x24. GPUI computes correct cols/rows from element bounds in prepaint() but never sent them to the PTY. Store the PTY master handle in Terminal (via Arc<Mutex<>>), add ResizeHandle that bundles the refs needed for resize operations, and call resize_if_changed() from prepaint() when dimensions change. This sends SIGWINCH to the child process and reflows the terminal grid so programs like vim/htop respond to window resize correctly.

  Closes #310

* fix: replace panic with proper error handling in PTY resize

  - Replace .expect() on current_size lock with map_err + early return to prevent UI crash on lock poison
  - Replace silent if-let-Ok on pty_master lock with explicit error logging and propagation
  - Make resize_if_changed() return Result<(), TerminalError>, using the previously-unused PtyResize error variant
  - Handle Result in TerminalElement::prepaint() with tracing::error
  - Fix SIGWINCH comment accuracy (updates kernel winsize, not direct signal)
  - Document lock scope management in ordering comment

* investigate issue #332: DaemonClientError missing KildError trait

* fix: implement KildError trait for DaemonClientError (#332)

  DaemonClientError was the only error type in kild-core missing the KildError trait implementation, breaking the documented error handling contract.

  Changes:
  - Add KildError impl with DAEMON_* error codes matching existing convention
  - NotRunning is the only user error (user can fix by starting daemon)
  - Add tests covering all 5 variants for error_code() and is_user_error()

  Fixes #332

* Investigate issue #334: health module has zero test coverage

* test: add unit tests for health module (#334)

  The health module had 537 lines of code with zero test coverage. This adds 24 tests covering health status calculation, session enrichment, aggregation, snapshot storage, and history cleanup. Refactors storage functions to accept &Path parameter for testability with tempfile::TempDir, keeping public API unchanged.

  Fixes #334

* investigate: process module dependency on agents module (#326)

  Analyze the upward dependency from process to agents module and document implementation plan to decouple via caller-passed patterns.

* refactor: remove process module dependency on agents module (#326)

  The process module imported the agents module to look up agent-specific process name patterns during process detection, creating an upward dependency from a low-level utility to a domain-specific layer. Move agent pattern resolution to the caller (terminal/handler.rs) by adding an `additional_patterns` parameter to `find_process_by_name()` and `generate_search_patterns()`. Add `get_all_process_patterns()` to the agents module for bidirectional pattern resolution.

  Fixes #326

…inal (#349)

* feat: add scrollback buffer and scroll wheel support to embedded terminal

  Wire GPUI ScrollWheelEvent to alacritty_terminal's scroll_display() via a hitbox-based mouse event listener in TerminalElement. Supports both trackpad (pixel deltas) and mouse wheel (line deltas). Add a "Scrollback" badge at top-right when scrolled up from bottom so users know they're viewing history. The badge disappears when scrolled back to bottom. The 10,000-line scrollback buffer was already provided by TermConfig::default(); this change just adds the scroll UI.

  Closes #311

* fix: add visual fallback for scrollback badge paint failure

  Paint a thin accent bar when badge text rendering fails so the user still gets a visible "scrolled up" indicator.

* refactor: remove inverted dependency from errors module to agents

  Add `supported_agents` field to `ConfigError::InvalidAgent` so the error message is fully constructed at the call site. This removes the `use crate::agents::supported_agents_string` import from the base errors module, which should have no domain-module dependencies.

  Closes #325

* fix: use realistic supported_agents in dispatch error test

  Use supported_agents_string() instead of String::new() so the test validates with real data rather than producing a non-actionable error message pattern.

…ion agent (#344)

* investigate: open --no-agent inherits session agent name (#288)

* fix: open --no-agent sets agent to 'shell' instead of inheriting session agent (#288)

  The BareShell match arm in open_session used session.agent.clone() which inherited the original agent name (e.g. "claude") instead of "shell". This caused incorrect display in list/status output despite the actual shell command running correctly.

  Fixes #288

* investigate issue #321: sync_daemon_session_status field-dropping bug

* fix: use targeted JSON patching in sync_daemon_session_status (#321)

  sync_daemon_session_status() was using save_session_to_file() which round-trips through the Session struct, silently dropping fields from newer binary versions (e.g., task_list_id). This is the same class of bug fixed in agent_status.rs by PR #319.

  Changes:
  - Add patch_session_json_fields() for atomic multi-field JSON patching
  - Replace save_session_to_file() with field-level patches in sync_daemon_session_status
  - Add test verifying unknown fields survive multi-field patching

  Fixes #321

* refactor: decouple kild CLI from kild-daemon server embedding

  Make kild-daemon a standalone binary instead of embedding it as a library dependency in the CLI. The CLI now spawns kild-daemon as a subprocess for both foreground and background modes, and auto-start discovers the binary as a sibling executable. This removes tokio and kild-daemon dependencies from the CLI crate, resulting in a smaller binary and cleaner dependency separation.

  Closes #324

* docs: update architecture docs for standalone kild-daemon binary

  Clarify that kild-daemon is now a standalone binary spawned as a subprocess by the CLI, rather than being embedded as a library dependency. Auto-start discovers the binary as a sibling executable.

* refactor: extract shared find_sibling_binary utility, improve daemon startup

  - Extract binary discovery logic into `daemon::find_sibling_binary()` in kild-core, eliminating duplication across autostart.rs, daemon.rs (CLI), and handler.rs (shim)
  - Add child process crash detection to CLI background daemon start loop (matching autostart.rs behavior)
  - Add debug logging to CLI daemon socket readiness loop
  - Add structured error logging to kild-daemon main.rs for config load, runtime init, and server failures

Remove unused PtyRead and ChannelSend variants, the unused impl block (error_code, is_user_error), and #[allow(dead_code)] attributes. These were left over from before the terminal rendering was wired up.

* feat: add mouse selection and copy/paste to embedded terminal

  Wire GPUI mouse events to alacritty_terminal's Selection API for click-drag text selection with visual highlight. Cmd+C copies selection to clipboard (falls back to SIGINT with no selection), Cmd+V pastes clipboard to PTY. Supports single-click, double-click (word), and triple-click (line) selection.

* fix: address review feedback for terminal mouse selection

  - Surface PTY write failures to user via error banner (Cmd+C/Cmd+V)
  - Add bounds clamping in pixel_to_grid to prevent overflow on cast
  - Add debug logging when selection coordinates are clamped
  - Fix inaccurate comments (mouse down handler, Cmd+C behavior)

* refactor: move MergeReadiness and compute_merge_readiness to kild-core

  Move merge readiness business logic from CLI layer (stats.rs) into kild-core so both CLI and kild-ui can import it without duplication.

  - Add MergeReadiness enum to kild-core/src/git/types.rs
  - Add compute_merge_readiness() to kild-core/src/git/operations.rs
  - Export via git/mod.rs and lib.rs
  - Update CLI to import from kild-core
  - Move 11 unit tests to kild-core

  Closes #330

* refactor: make compute an inherent method on MergeReadiness

  Move computation logic from free function to MergeReadiness::compute() for better type encapsulation. Remove backward-compat wrapper per project guidelines. Add edge-case tests for CI Pending/Unknown states and Draft PR handling.

* refactor: extract shared kild-protocol crate for IPC message types

  Extract ClientMessage, DaemonMessage, and SessionInfo wire types from kild-daemon into a new kild-protocol crate with only serde/serde_json dependencies. All three consumers (kild-daemon, kild-core, kild-tmux-shim) now import typed enums from kild-protocol instead of hand-crafting JSON, eliminating protocol drift across three independent implementations.

  - kild-protocol: new crate with typed enums and serde roundtrip tests
  - kild-daemon: re-exports types from kild-protocol (zero behavior change)
  - kild-core: replace json!() + .get() chains with typed construction/matching
  - kild-tmux-shim: same typed rewrite for all IPC functions

* refactor: add typed SessionStatus and ErrorCode enums to kild-protocol

  Replace stringly-typed fields with compile-checked enums:
  - SessionInfo.status: String → SessionStatus enum (Creating, Running, Stopped)
  - DaemonMessage::Error.code: String → ErrorCode enum with #[serde(other)] fallback
  - DaemonClientError::DaemonError now carries ErrorCode for typed error matching

  Fix silent failures in daemon client (kild-core):
  - read_scrollback: base64 decode errors now surface instead of unwrap_or_default()
  - read_scrollback: unexpected response types return ProtocolError instead of empty vec
  - get_session_info: unexpected responses return ProtocolError instead of Ok(None)
  - get_session_status: unexpected responses warn + return ProtocolError instead of silently returning Ok(None) with misleading "completed" event name

  Wire format unchanged: serde rename_all ensures backward compatibility.

Split the 2,280-line operations.rs into 5 files by responsibility:
- naming.rs: path sanitization, branch names, project ID generation
- validation.rs: branch/arg validation, git directory checks
- status.rs: diff stats, worktree status, git stats aggregation
- health.rs: branch health metrics with pub(super) shared helpers
- overlaps.rs: file overlap detection across kilds

Updated mod.rs with re-exports so callers use cleaner paths (e.g. crate::git::kild_branch_name instead of crate::git::operations::kild_branch_name). Tests moved with their code, no behavioral changes.

* feat: auto-detect runtime mode on `kild open`

  When a daemon-created kild is stopped and reopened with `kild open`, it now automatically reopens in daemon mode without requiring the `--daemon` flag.

  - Add `runtime_mode: Option<RuntimeMode>` to Session struct (persists across stop/start via `#[serde(default)]`)
  - Set runtime_mode during `create_session` based on resolved mode
  - Change `open_session` to accept `Option<RuntimeMode>` where `None` triggers auto-detection: explicit flag > session stored mode > config > Terminal default
  - Add `resolve_explicit_runtime_mode` CLI helper that returns `None` when no flags passed (vs `resolve_runtime_mode` which always resolves)
  - UI actions pass `None` for auto-detect instead of hardcoded Terminal

  Closes #297

* docs: update kild open runtime mode resolution

  Document auto-detection behavior where kild open now uses the session's stored runtime_mode by default, only using config or flags when explicit overrides are provided. Resolution chain: --daemon/--no-daemon flag > session's stored mode > config > Terminal default

* refactor: extract runtime mode resolution to dedicated function

  Extract the nested unwrap_or_else cascade into resolve_effective_runtime_mode() which returns (RuntimeMode, source) for clearer logging. The source now distinguishes "config" from "default" instead of lumping both as "default". Add unit tests for all 4 resolution branches (explicit, session, config, default) and a persistence test verifying runtime_mode survives stop + reload.

* test: add auto-detect (None) cases to store and serialization tests

  Clarify doc comment on Command::OpenKild.runtime_mode to distinguish CLI-level semantics (no flag passed) from Session-level semantics (legacy session). Add runtime_mode: None variants to store contract test and both serialization round-trip tests.
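
The resolution chain in these commits (explicit flag > session's stored mode > config > Terminal default) might look something like the sketch below. The function name and the `(RuntimeMode, source)` return shape come from the commit message; the parameter list, the enum definition, and the "explicit"/"session" source labels are assumptions for illustration.

```rust
/// Assumed two-variant enum; the real type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeMode {
    Terminal,
    Daemon,
}

/// Resolve the effective mode and report which source supplied it.
/// Precedence: explicit flag > session's stored mode > config > Terminal default.
pub fn resolve_effective_runtime_mode(
    explicit: Option<RuntimeMode>,
    session: Option<RuntimeMode>,
    config: Option<RuntimeMode>,
) -> (RuntimeMode, &'static str) {
    if let Some(mode) = explicit {
        return (mode, "explicit");
    }
    if let Some(mode) = session {
        return (mode, "session");
    }
    if let Some(mode) = config {
        return (mode, "config");
    }
    (RuntimeMode::Terminal, "default")
}
```

Returning the source alongside the mode keeps "config" and "default" distinguishable in logs, which is the motivation given in the refactor commit.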

Move applescript_escape from terminal::common::escape to a top-level escape module in kild-core. This eliminates the cross-domain dependency where notify reached into terminal internals for a general-purpose string escaping function. Also fix pre-existing unused imports in git/health.rs exposed by cargo fmt (moved test-only imports into #[cfg(test)] module). Closes #327

* feat: add cursor blink animation to embedded terminal

  Add a 530ms on/off cursor blink to the kild-ui terminal. The blink timer lives in TerminalView using an epoch-based invalidation pattern: each keystroke resets the blink so the cursor stays solid while typing. Unfocused cursors remain static (no blinking). Also fix pre-existing unused imports in git/health.rs (moved test-only imports into #[cfg(test)] module).

  Closes #314

* refactor: extract shared blink timer, fix comment, add debug logging

  - Extract duplicated blink timer loop into `spawn_blink_timer()` method
  - Split catch-all `_ => break` into explicit `Ok(false)` (stale epoch) and `Err(e)` (view dropped) arms with debug-level tracing
  - Fix misleading "thin bar" comment to accurately describe cursor_visible vs has_focus responsibility split

* feat: add `kild completions` subcommand for shell tab-completion

  Add `kild completions <shell>` that generates shell completion scripts for bash, zsh, fish, powershell, and elvish using clap_complete. Completions are generated dynamically from the clap CLI definition so they stay in sync as commands change.

  Closes #89

* docs: add completions command to development reference

* refactor: decompose sessions/handler.rs into focused modules

  Extract the remaining functions from handler.rs (2,943 lines) into focused, single-responsibility modules:
  - create.rs: create_session + helpers
  - open.rs: open_session, restart_session + helpers
  - stop.rs: stop_session
  - list.rs: list_sessions, get_session, sync_daemon_session_status
  - daemon_helpers.rs: build_daemon_create_request, ensure_shim_binary, create_zdotdir_wrapper, compute_spawn_id

  handler.rs becomes a pure re-export facade (~18 lines) preserving the session_ops::* API used by lib.rs, dispatch.rs, and health/handler.rs. All tests migrated to their respective modules. No API changes, no behavior changes; pure code movement following the established pattern from destroy.rs, complete.rs, and agent_status.rs extractions.

  Closes #329

* docs: update module structure for decomposed sessions handlers

* refactor: rename and relocate misplaced tests in sessions modules

  - Rename persistence lifecycle tests in create.rs to reflect what they actually test (save/load/remove cycles, not destroy behavior)
  - Move destroy_session tests from create.rs to destroy.rs where they belong

* Investigate issue #369: --json commands must return valid JSON for all states

* fix: --json commands return valid JSON for all states (#369)

  Several CLI commands violated the JSON contract by printing plain text for empty or error states when --json was set. Fix pr, stats --all, and overlaps commands to always output valid JSON regardless of state.

  Changes:
  - pr: output JSON for no-remote and no-PR-found paths
  - stats --all: output [] for empty sessions
  - overlaps: output JSON object for empty/insufficient kild states
  - Add integration tests for stats --all --json and overlaps --json
  - Fix pre-existing duplicate imports in health.rs and trailing newline in create.rs

  Fixes #369

* archive investigation artifact for #369

* fix: add reason field to no-PR-found JSON response for consistency

  Both pr.rs empty-state JSON responses now include a "reason" field, matching the pattern used in overlaps.rs.

* fix: gate macOS-only imports behind cfg(target_os = "macos")

  The applescript and escape imports in iterm.rs and terminal_app.rs were missing the cfg gate, causing unused import errors on Linux CI.

…nal (#383)

* fix: render wide characters (CJK/emoji) at double cell width in terminal

  Break text runs at wide character boundaries so each wide char is positioned individually at its exact grid column. This prevents width errors from accumulating across mixed normal/wide character runs. Also extend the focused cursor to span 2 cell widths when sitting on a wide character.

  Closes #315

* fix: clarify wide char comment to explain text shaper mismatch

Root causes identified: - remove_worktree_force() never calls branch deletion - detect_orphaned_branches() only matches legacy "worktree-" prefix Fix: explicit branch deletion in destroy_session() using session metadata, and update cleanup detection to match kild/ prefix.

Root cause: window resolution fails before run_assertion() is called, error propagates to main() where it's dropped without printing. Secondary issue: process::exit(1) doesn't flush stdout buffers. Fix: catch window resolution errors in assert handler and format as assertion failure output. Flush stdout before exit.

When `kild-peek assert --app "NonExistentApp" --exists` failed, it exited with code 1 but printed nothing. Window resolution errors were propagated as infrastructure errors instead of formatted as assertion failures.

Changes:
- Catch window resolution errors in assert handler and format as "Assertion: FAIL" with diagnostic details (or JSON when --json)
- Flush stdout before process::exit(1) to prevent lost output in pipes
- Add integration tests for assert failure output (plain, JSON, clean)

Fixes #354
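
The "flush before exit" point can be illustrated with a small helper. This is a sketch under assumptions: `report_assertion_failure` is a hypothetical name, and only the "Assertion: FAIL" heading comes from the commit message.

```rust
use std::io::{self, Write};

/// Write a failed-assertion report and flush it.
/// `process::exit` does not run destructors, so buffered output that has not
/// been flushed can be lost when stdout is a pipe.
pub fn report_assertion_failure<W: Write>(out: &mut W, detail: &str) -> io::Result<()> {
    writeln!(out, "Assertion: FAIL")?;
    writeln!(out, "  {detail}")?;
    out.flush()
}
```

A caller would pass `&mut std::io::stdout().lock()` and only then call `std::process::exit(1)`.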

`kild complete` was succeeding even when no PR existed because `is_pr_merged()` conflated "no PR" with "not merged". Now uses `check_pr_exists()` for early detection and returns NoPrFound error.

Changes:
- Add NoPrFound error variant to SessionError
- Add PR existence check before merge check in complete_session
- Simplify CLI handler (remove redundant pre-check, core is source of truth)
- Add test for new error variant

Fixes #358

* refactor(ipc): consolidate thread-local connection pools into kild-protocol

  Both kild-core and kild-tmux-shim maintained nearly identical thread-local IpcConnection pools (~30 lines each). Extract the common take/release pattern into kild_protocol::pool so both crates delegate to a single implementation. No new dependencies added to kild-protocol; the pool is pure RefCell + thread_local logic with no tracing needed.

  Closes #583

* docs: update CLAUDE.md for kild_protocol::pool consolidation

  Three entries referenced the old per-crate thread-local pools. Update them to reflect that both kild-core and kild-tmux-shim now delegate to the shared kild_protocol::pool module.

* fix(ipc): address review feedback on pool consolidation

  - Fix take() doc to describe all 3 behavioral paths (reuse, evict stale, fresh) instead of collapsing eviction into "otherwise"
  - Document single-socket-path invariant in module doc
  - Return bool from take()/release() so callers can emit their own tracing events; restores connection_reused, connection_created, connection_cached, and connection_dropped_on_return debug events in both kild-core and shim
  - Add test_take_evicts_stale_cached_connection test covering the "released alive, server closes, next take reconnects" path
  - Align wording: "take" instead of "reuse a cached one" in doc summaries
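
A rough sketch of the take/release pattern these commits describe, using only `RefCell` and `thread_local!`. The `Conn` type, its `alive` field, and the exact signatures are hypothetical; the real pool caches an `IpcConnection`, and its liveness check is not shown in the commit messages.

```rust
use std::cell::RefCell;

/// Hypothetical connection type standing in for `IpcConnection`.
pub struct Conn {
    pub alive: bool,
}

thread_local! {
    // One cached connection per thread (single-socket-path invariant).
    static CACHED: RefCell<Option<Conn>> = RefCell::new(None);
}

/// Take a connection for this thread. Returns `(conn, reused)`:
/// a live cached connection is reused; a stale one is evicted and replaced;
/// with nothing cached, a fresh connection is created.
pub fn take(connect: impl FnOnce() -> Conn) -> (Conn, bool) {
    let cached = CACHED.with(|slot| slot.borrow_mut().take());
    match cached {
        Some(conn) if conn.alive => (conn, true),
        _ => (connect(), false),
    }
}

/// Hand a connection back. Returns true if it was cached for reuse,
/// false if it was dropped because it is no longer alive.
pub fn release(conn: Conn) -> bool {
    if !conn.alive {
        return false;
    }
    CACHED.with(|slot| *slot.borrow_mut() = Some(conn));
    true
}
```

Returning the booleans instead of logging inside the pool lets each caller emit its own tracing events, so the shared crate needs no tracing dependency.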

…sions (#609)

PR #584 fixed PTY timing issues (#540) by skipping PTY delivery for fleet claude sessions, intending to rely on "dropbox task.md + Claude inbox". However, only the dropbox write was implemented; the Claude Code inbox write was never added. Since Claude Code polls its inbox file (not the dropbox), the initial prompt was silently dropped.

Changes:
- Move write_to_inbox from CLI inject.rs to kild-core fleet.rs
- Call write_to_inbox from deliver_initial_prompt_for_session when skip_pty is true (fleet claude sessions)
- Update inject.rs to use the kild-core version
- Add tests for write_to_inbox in fleet.rs

When --initial-prompt is used with fleet claude sessions, the CLI now:
1. Prints a deprecation warning telling the agent to use kild inject
2. Delivers the prompt via the reliable inbox path as fallback

Also updates kild-brain.md to instruct the brain to always use kild create + kild inject (two-step), never --initial-prompt. The --initial-prompt delivery path has been unreliable for fleet sessions across PRs #540, #584, #609. The inject path is battle-tested.

Changes:
- create.rs/open.rs: detect fleet claude sessions with --initial-prompt, print warning, deliver via fleet::write_to_inbox as fallback
- fleet.rs: make is_claude_fleet_agent and fleet_mode_active pub
- kild-brain.md: replace --initial-prompt usage with create + inject

* fix(fleet): store augmented command with fleet flags in session file

  The session file stored the base agent command before fleet flags were appended, so `kild list` showed a command missing --agent-id, --agent-name, and --team-name flags. Use the already-computed fleet_command instead so the stored command matches what actually runs.

  Closes #607

* docs: clarify fleet-augmented command in spawn_daemon_agent doc comment

* refactor(tests): fix naming convention in agent, terminal, editor tests

  Update `define_agent_backend!` macro to generate test names that follow the `<subject>_<expected_behavior>` convention using `paste` for ident concatenation (e.g. `amp_backend_returns_correct_name`).

  Rename trait test functions:
  - `test_agent_backend_basic_methods` → `mock_agent_backend_delegates_name_display_and_yolo_correctly`
  - `test_terminal_backend_basic_methods` → `mock_terminal_backend_name_and_availability_are_accessible`
  - `test_editor_backend_basic_methods` → `mock_editor_backend_is_available_and_not_a_terminal_editor`

  Closes #523

* fix(tests): move paste to dev section, rename remaining test_ prefixes

  Move `paste` from production to dev/test section in workspace Cargo.toml to match `temp-env` placement convention.

  Rename remaining sibling tests that were missed in the initial pass:
  - test_terminal_backend_execute_spawn → mock_terminal_backend_execute_spawn_returns_window_title
  - test_terminal_backend_close_window → mock_terminal_backend_close_window_does_not_panic
  - test_editor_backend_open → mock_editor_backend_open_succeeds

* refactor(tests): extract shared macro tests, fix stale doc comment

  Address review agent findings:
  - Extract 4 shared test functions into define_agent_backend_tests! helper macro, eliminating ~30 lines of duplication between the two arms
  - Update module doc comment to reflect the new test_prefix parameter
  - Trim redundant return-type clause from terminal close_window test comment

* fix(daemon): detect and restart stale daemons after binary upgrade

  After cargo install, old daemon processes kept running from the previous binary. Auto-start saw the daemon alive via ping and returned immediately, never comparing binaries. Now the daemon records its binary path + mtime at startup in ~/.kild/daemon.bin. On auto-start, if the running daemon's recorded mtime differs from the current binary on disk, it is gracefully restarted.

  - kild daemon status warns when the daemon is stale
  - kild daemon restart added as a convenience command (stop + start)
  - ensure_daemon_running auto-restarts stale daemons transparently

  Closes #608

* fix: address code review issues for stale daemon detection

  - Fix autostart logic: stale restart now bypasses auto_start guard, preventing Disabled error after stopping a stale daemon
  - Fix mtime fallback: return false (not stale) when mtime is unreadable, preventing false-positive restart loops
  - Fix bin_path derivation: use KildPaths-based bin_file_path() in both daemon and client for consistent path resolution
  - Add proper error logging to kild-core read_bin_file (match daemon version)
  - Extract spawn_daemon_background() helper to deduplicate CLI start/restart

* fix: address findings from silent failure, simplifier, and comment agents

  Silent failure fixes:
  - Log find_sibling_binary error in is_daemon_stale instead of swallowing
  - Return Option<u32> from spawn_daemon_background to avoid printing PID 0
  - Log mtime read failure in write_bin_file instead of silent fallback
  - Wait for socket removal (not just PID file) in stop_stale_daemon
  - Add error! log on restart stop timeout path
  - Make handle_daemon_start PID read failure soft (was hard error)
  - Log mtime parse failures in both read_bin_file implementations

  Simplification:
  - Replace needs_spawn flag with early return spawn_daemon() in ensure_daemon_running for clearer control flow

  Comment fixes:
  - Update run_server docstring to include bin file write step
  - Match bin_file_path doc to daemon version with path literal
  - Add actionable hint when stale daemon stop fails
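
The staleness rule from these commits (restart only when the recorded mtime differs from the binary on disk, and treat an unreadable mtime as "not stale") reduces to a small pure function. The signature below is an assumption; the real `is_daemon_stale` reads ~/.kild/daemon.bin and the binary's metadata itself. Treating a missing recorded value as "not stale" is also an assumption of this sketch.

```rust
use std::time::SystemTime;

/// Decide whether a running daemon was started from an older binary.
/// `recorded` is the mtime the daemon wrote at startup; `on_disk` is the
/// current binary's mtime. Either may be None if it could not be read.
pub fn is_daemon_stale(recorded: Option<SystemTime>, on_disk: Option<SystemTime>) -> bool {
    match (recorded, on_disk) {
        // Any difference means the binary was replaced after startup.
        (Some(recorded), Some(on_disk)) => recorded != on_disk,
        // Unreadable mtime: report "not stale" to avoid restart loops.
        _ => false,
    }
}
```

Defaulting to `false` trades a possibly missed restart for the guarantee that a transient metadata error never triggers repeated restarts.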

* refactor(git): consolidate direct git2 usages behind git/ module API

  Add git/query.rs with high-level query functions (is_git_repo, get_origin_url, has_any_remote, has_uncommitted_changes, etc.) and git/test_support.rs for test helpers.

  Replace all direct git2 usages outside git/ module:
  - forge/registry.rs: use git::get_origin_url instead of Repository::open
  - sessions/destroy.rs: delegate has_remote_configured to git::has_any_remote
  - sessions/info.rs: use git::has_uncommitted_changes for status check
  - projects/types.rs: delegate is_git_repo to git::is_git_repo
  - cleanup/operations.rs: rewrite to use git::list_local_branch_names, worktree_active_branches, list_worktree_entries, is_worktree_valid
  - cleanup/handler.rs: use git::delete_local_branch, remove Repository usage

  Rename ProjectError::Git2CheckFailed to GitCheckFailed with GitError source type instead of raw git2::Error.

  Closes #593

* fix(git): restore error logging in query functions, improve error handling

  - has_uncommitted_changes: log repo-open and status-check failures instead of silently discarding via .ok()?
  - is_worktree_valid: log repo-open and HEAD failures at debug level instead of silently returning false
  - ensure_in_repo: distinguish NotFound from unexpected errors, consistent with is_git_repo
  - list_local_branch_names: log unreadable branch names at debug level
  - validate_cleanup_request: properly map NotInRepository vs other GitErrors instead of discarding error context
  - Improve doc comments for WorktreeEntry, list_worktree_entries, worktree_active_branches, ensure_in_repo, is_worktree_valid

* refactor(cleanup): flatten nested match blocks for clarity

  - Flatten three-level nested match in worktree_active_branches into sequential match-with-continue, removing one level of indentation
  - Replace match-on-result-then-propagate in scan_for_orphans and cleanup_orphaned_resources with map_err+? to reduce boilerplate
  - Flatten nested match chain in collect_session_worktree_paths into sequential match-with-continue
  - Replace assert_eq!(..., true/false) with assert!/assert!(!...) in tests

* fix(hooks): filter and tag claude-status hook events by source (#611)

The claude-status hook was forwarding all Claude Code lifecycle events (SubagentStop, TeammateIdle, TaskCompleted) to the honryu brain session, making subagent noise indistinguishable from the primary agent finishing.

Replace per-event message formats ([DONE], [WAITING], [IDLE], [ERROR]) with a unified tagged format:

[EVENT] <branch> <tag>: <summary>

Tags: agent.stop, subagent.stop, teammate.idle, task.completed, agent.waiting, agent.idle

By default only primary agent events (Stop, Notification) are forwarded. SubagentStop, TeammateIdle, and TaskCompleted are dropped unless KILD_HOOK_VERBOSE=1 is set in the session environment.

Closes #611

* fix(hooks): prevent verbose events from writing dedup gate

Review agents found a bug: in verbose mode, TeammateIdle can write the .idle_sent gate file before Stop fires, silently suppressing the primary completion signal to the brain.

Fix: introduce WRITE_GATE flag so only primary events (Stop, idle_prompt) write the gate. Verbose-only events (SubagentStop, TeammateIdle, TaskCompleted) forward but never touch the gate.

Also:
- Update doc comment to describe event tagging and forwarding behavior
- Strengthen test assertions: verify KILD_HOOK_VERBOSE conditional expression (not just name), include TaskCompleted in forward block check, assert Stop is not gated on KILD_HOOK_VERBOSE, assert WRITE_GATE presence
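The filtering and gate rules described in these two hook commits can be modelled in a few lines. The real hook is a shell script, so the Rust below is only an illustrative sketch: `HookEvent`, `Forwarded`, and `forward_event` are hypothetical names, and splitting `Notification` on an `idle_prompt` flag is an assumption about how the `agent.waiting` and `agent.idle` tags are chosen.

```rust
/// Claude Code lifecycle events named in the commit message.
/// The `idle_prompt` field is an assumption made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HookEvent {
    Stop,
    Notification { idle_prompt: bool },
    SubagentStop,
    TeammateIdle,
    TaskCompleted,
}

/// What the hook would send to the brain, plus whether it may write the dedup gate.
#[derive(Debug, PartialEq, Eq)]
pub struct Forwarded {
    pub message: String,
    pub write_gate: bool,
}

/// Returns `None` when the event is dropped (non-primary event without verbose mode).
pub fn forward_event(
    event: HookEvent,
    branch: &str,
    summary: &str,
    verbose: bool,
) -> Option<Forwarded> {
    // Map each event to its tag and to whether it comes from the primary agent.
    let (tag, primary) = match event {
        HookEvent::Stop => ("agent.stop", true),
        HookEvent::Notification { idle_prompt: true } => ("agent.idle", true),
        HookEvent::Notification { idle_prompt: false } => ("agent.waiting", true),
        HookEvent::SubagentStop => ("subagent.stop", false),
        HookEvent::TeammateIdle => ("teammate.idle", false),
        HookEvent::TaskCompleted => ("task.completed", false),
    };
    if !primary && !verbose {
        return None;
    }
    // Only Stop and idle_prompt write the gate; verbose-only events never do.
    let write_gate = matches!(
        event,
        HookEvent::Stop | HookEvent::Notification { idle_prompt: true }
    );
    Some(Forwarded {
        message: format!("[EVENT] {branch} {tag}: {summary}"),
        write_gate,
    })
}
```

Keeping the gate decision separate from the forwarding decision is what prevents a verbose-only event from suppressing the later primary completion signal.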

* fix(core): decouple process/ from sessions/ module

process/cleanup.rs imported Session from sessions/, creating a bidirectional dependency between what should be a lower-level utility module and the session domain.

Invert the dependency: rename cleanup_session_pid_files to cleanup_pid_files accepting &[String] (PID keys) instead of &Session. Add Session::pid_keys() method so the session-aware key extraction stays in sessions/types where it belongs.

After this change, process/ has zero imports from sessions/.

Closes #589

* fix(core): restore warn for no-agents fallback, improve doc comments

Address review findings:
- Restore warn! log in pid_keys() when session has no tracked agents (diagnostic signal was accidentally dropped during refactor)
- Clarify pid_keys() doc to mention empty-spawn-id edge case and potential duplicate keys for legacy agents
- Reference get_pid_file_path in cleanup_pid_files doc for path format

* refactor(terminal): extract duplicated AppleScript patterns

Add high-level helpers (spawn_via_applescript, close_via_applescript, focus_via_applescript, hide_via_applescript) that combine template substitution with osascript execution, reducing boilerplate in iTerm and Terminal.app backends.

Move window ID validation into a default close_window trait method that delegates to a new close_window_by_id required method, removing require_window_id duplication from all four backends.

Closes #436

* fix(terminal): address review findings from automated agents

- Fix stale doc in platform_unsupported! macro (close_window → close_window_by_id)
- Add non-macOS stubs for high-level AppleScript helpers to match low-level pattern
- Update close_window_by_id doc to include Alacritty alongside Ghostty
- Update CLAUDE.md trait snippet to reflect new close_window_by_id and defaults

* refactor(git): extract kild-git crate from kild-core (#587)

Move the self-contained git/ module into its own workspace crate, following the pattern of kild-config and kild-paths extractions.

- Create crates/kild-git/ with all git operations (worktree management, branch health, status, queries, remote ops, naming, validation)
- Extract detect_project/detect_project_at into kild-git::project
- kild-core depends on kild-git and re-exports all public types/functions for backward compatibility
- handler.rs (create_worktree) and overlaps.rs stay in kild-core since they depend on kild-core internals (files module, session types, config)
- Move KildError impl for GitError to kild-core::errors (avoids circular dep)
- Update all consumers (sessions, cleanup, CLI sync) to use git:: re-exports

No behavior changes — pure structural extraction.

Closes #587

* fix: gate test_support behind feature flag, clean up imports

Address review feedback:
- Gate kild-git test_support module behind `testing` feature flag so test helpers don't ship in production binaries (restores #[cfg(test)] behavior from before extraction)
- kild-core uses the feature via dev-dependencies for its tests
- Replace types::* glob import with explicit {ProjectInfo, WorktreeInfo} in handler.rs
- Remove misleading inline module-source comments from re-export block (items are alphabetically sorted, labels were inaccurate)

* feat(ui): add cursor blink support with BlinkManager

Add epoch-based cursor blink timer to TerminalView. The BlinkManager toggles cursor visibility every 500ms via cx.spawn(), with epoch tracking to cancel stale timers when a new cycle starts.

Cursor stays visible during typing — pause() on every keystroke resets the blink timer. Unfocused terminals show a static hollow block cursor (no blinking).

Closes #471

* fix(ui): address review findings in cursor blink

- Stop blink timer on focus loss, restart on focus gain — no more wasteful repaints on unfocused terminals, and cursor state is always reset when focus returns (even via mouse click)
- Replace .unwrap_or(false) with explicit match on view update — view-released teardown is now a distinct code path, not silently collapsed with stale-epoch exits
- Encapsulate toggle logic behind toggle_if_current() method — blink field stays pub(super) for the spawn closure but mutation is self-contained
- Rename pause() to reset() — matches actual semantics (restarts the blink cycle, not a pause)
- Remove two-step new() + enable() — blink starts inert, render() drives the lifecycle based on focus state
- Tighten module visibility: pub mod blink → mod blink
- Fix doc comments for accuracy
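The epoch-tracking idea can be shown without any UI framework. The sketch below is an assumption-laden model, not the kild-ui code: the method names `reset()` and `toggle_if_current()` come from the commit message, but the field layout and return types are hypothetical, and the 500ms timer that would call `toggle_if_current()` is left out.

```rust
/// Minimal model of an epoch-based blink state (field layout is hypothetical).
pub struct BlinkManager {
    visible: bool,
    epoch: u64,
}

impl BlinkManager {
    pub fn new() -> Self {
        Self { visible: true, epoch: 0 }
    }

    /// Restart the blink cycle (for example on a keystroke or focus gain).
    /// Returns the new epoch; a timer task must present it when it fires.
    pub fn reset(&mut self) -> u64 {
        self.visible = true;
        self.epoch = self.epoch.wrapping_add(1);
        self.epoch
    }

    /// Toggle visibility only if the calling timer belongs to the current cycle.
    /// Returns false for a stale timer, which should then stop rescheduling itself.
    pub fn toggle_if_current(&mut self, timer_epoch: u64) -> bool {
        if timer_epoch != self.epoch {
            return false;
        }
        self.visible = !self.visible;
        true
    }

    pub fn visible(&self) -> bool {
        self.visible
    }
}
```

Because a stale timer only compares one integer and exits, no explicit cancellation handle is needed when a new cycle starts.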

* refactor(notify): extract NotificationBackend trait with platform backends

Add trait-based extensibility for the notification subsystem, following the existing TerminalBackend and ForgeBackend patterns:
- NotificationBackend trait with name/is_available/send interface
- MacOsNotificationBackend (osascript) and LinuxNotificationBackend (notify-send)
- Registry with platform-based auto-detection
- NotifyError type with KildError implementation

Also add standard From/TryFrom trait implementations:
- From<AgentMode> for OpenMode (kild-protocol)
- From<kild_protocol::SessionStatus> for core SessionStatus
- From<&ProcessInfo> for ProcessMetadata

Closes #435

* fix(notify): address review findings from 4 review agents

- Remove dead NotifyError::IoError variant (YAGNI — no caller produces it)
- Remove redundant which::which check in LinuxNotificationBackend::send (is_available already gates dispatch; Command::new handles missing binary)
- Return bool from send_via_backend to distinguish sent vs skipped, fixing misleading send_completed log when no backend is available
- Remove premature lib.rs re-exports of NotificationBackend/NotifyError (no callers outside kild-core)
- Simplify registry detect() with find+map instead of verbose find_map
- Inline detect_backend() into send_via_backend (single caller)
- Add warn log on wildcard arm in From<SessionStatus> for unknown variants
- Add doc note on format_notification_message re "needs input" for Error
- Remove redundant test comments
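A rough sketch of the trait shape and the sent-versus-skipped distinction follows. Only the method names `name`, `is_available`, and `send` and the function names `detect` and `send_via_backend` come from the commit messages; the parameter lists, the `String` error type, and `StubBackend` (a test double standing in for the osascript and notify-send backends) are assumptions.

```rust
use std::cell::Cell;

/// Trait with the name/is_available/send interface; signatures are assumed.
pub trait NotificationBackend {
    fn name(&self) -> &'static str;
    fn is_available(&self) -> bool;
    fn send(&self, title: &str, message: &str) -> Result<(), String>;
}

/// Hypothetical stand-in backend that just counts deliveries.
pub struct StubBackend {
    pub label: &'static str,
    pub available: bool,
    pub sent: Cell<u32>,
}

impl NotificationBackend for StubBackend {
    fn name(&self) -> &'static str {
        self.label
    }
    fn is_available(&self) -> bool {
        self.available
    }
    fn send(&self, _title: &str, _message: &str) -> Result<(), String> {
        self.sent.set(self.sent.get() + 1);
        Ok(())
    }
}

/// Pick the first backend that reports itself available.
pub fn detect<'a>(
    backends: &[&'a dyn NotificationBackend],
) -> Option<&'a dyn NotificationBackend> {
    backends.iter().copied().find(|b| b.is_available())
}

/// Returns true only when a notification was actually sent,
/// so callers can log "sent" and "skipped" differently.
pub fn send_via_backend(
    backends: &[&dyn NotificationBackend],
    title: &str,
    message: &str,
) -> bool {
    match detect(backends) {
        Some(backend) => backend.send(title, message).is_ok(),
        None => false,
    }
}
```

Returning `bool` rather than `()` is the detail the review commit calls out: with no backend available, the caller should not emit a completion log.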

* refactor: rename *Info types to domain-role names

Rename generic *Info types to more descriptive domain-role names:
- WorktreeInfo → WorktreeState (kild-git)
- ProjectInfo → GitProjectState (kild-git)
- BranchInfo → BranchState (kild-git)
- PrInfo → PullRequest (kild-core forge)
- NativeWindowInfo → NativeWindow (kild-core terminal)

Pure rename, no behavior changes. Updates all usages across kild-git, kild-core, kild CLI, and documentation.

Closes #522

* refactor: rename test functions to match new type names

Align test function names with the renamed types:
- test_worktree_info → test_worktree_state
- test_worktree_info_preserves_original_branch_name → test_worktree_state_preserves_original_branch_name
- test_project_info → test_git_project_state
- test_branch_info → test_branch_state
- test_pr_info_serde_roundtrip → test_pull_request_serde_roundtrip
- test_pr_info_with_none_summaries → test_pull_request_with_none_summaries
- test_native_window_info_fields → test_native_window_fields

* refactor: rename *Info/*Data types to domain-role names

Rename ambiguous types across sessions and process modules:
- SessionInfo (protocol) → DaemonSessionStatus
- SessionInfo (core) → SessionSnapshot
- DestroySafetyInfo → DestroySafety
- AgentStatusInfo → AgentStatusRecord
- AgentProcessData → AgentProcessDto
- ProcessInfo → ProcessSnapshot

Also renames the DaemonSession::to_session_info() method to to_daemon_session_status() for consistency.

Pure rename — no behavior changes.

Closes #521

* fix: update stale string references after type renames

Fix error messages, test names, and CLAUDE.md that still referenced old type names (SessionInfo, AgentStatusInfo) after the rename refactor.

* fix: update ProcessInfo → ProcessSnapshot in From impl and tests

* remove accidental research file

#624)

* fix(daemon): use real terminal size for PTY instead of hardcoded 24x80

Daemon PTYs were created with hardcoded 24x80 dimensions regardless of the actual terminal size. This caused agents like Claude Code to render incorrectly until the attach window connected and sent a resize.

Add a resolution chain for initial PTY dimensions:
1. --rows/--cols CLI flags (explicit override)
2. [daemon] default_rows/default_cols config (set-and-forget)
3. ioctl(TIOCGWINSZ) on stdout (real terminal detection)
4. 80x24 fallback (no TTY available)

This fixes the common case where `kild create --daemon` runs from a real terminal, and enables non-TTY callers (Claude Code, scripts) to specify dimensions via flags or config.

* fix(daemon): resolve PTY dimensions independently + introduce OpenSessionRequest

- Fix resolve_pty_size to resolve each dimension independently via .or() chaining instead of requiring both --rows AND --cols. Previously passing only --cols 220 silently discarded the flag. Same fix for config default_rows/default_cols.
- Add debug logging to PTY size resolution and ioctl fallback paths.
- Introduce OpenSessionRequest struct matching CreateSessionRequest pattern, removing #[allow(clippy::too_many_arguments)] from open_session.
- Merge with_rows/with_cols into with_pty_size on both request types.
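The per-dimension resolution chain maps directly onto `Option::or`. The following is a minimal sketch under stated assumptions: `resolve_pty_size` is named in the commit, but its real signature is not shown, so `PtySize`, `SizeSource`, and the `detected` parameter (standing in for the ioctl(TIOCGWINSZ) result) are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PtySize {
    pub rows: u16,
    pub cols: u16,
}

/// One layer of the chain (CLI flags or config); either dimension may be absent.
#[derive(Debug, Clone, Copy, Default)]
pub struct SizeSource {
    pub rows: Option<u16>,
    pub cols: Option<u16>,
}

/// Resolve rows and cols independently: flags, then config, then the detected
/// terminal size, then the 24-row by 80-column fallback.
pub fn resolve_pty_size(
    flags: SizeSource,
    config: SizeSource,
    detected: Option<PtySize>,
) -> PtySize {
    let rows = flags
        .rows
        .or(config.rows)
        .or(detected.map(|d| d.rows))
        .unwrap_or(24);
    let cols = flags
        .cols
        .or(config.cols)
        .or(detected.map(|d| d.cols))
        .unwrap_or(80);
    PtySize { rows, cols }
}
```

With this shape, passing only a column override keeps that value and lets the row count fall through to the next available source, which is the behavior the second commit restores.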

* Investigate issue #520: rename *Manager types to domain-role names

* feat(hooks): add HTTP hook endpoint and Claude Code hook enhancements (#629-#634)

Replace the fragile shell-script-based hook pipeline with typed Rust in the daemon for events that support HTTP hooks.

- Add hyper HTTP listener to daemon on localhost:19222 for Stop and SubagentStop events with in-memory IdleGate replacing .idle_sent file
- Rewrite claude.rs settings patching: HTTP hooks for Stop/SubagentStop, command hooks for TeammateIdle/TaskCompleted/Notification
- Add prompt hook on Stop for task verification before stopping (#630)
- Add SessionStart hook for auto-priming via kild prime --self --raw (#631)
- Add kild report command for TaskCompleted structured reporting (#632)
- Add kild check-queue command and inject --queue for TeammateIdle auto-reassign (#633)
- Add --self and --raw flags to kild prime command
- Add hooks_port to DaemonConfig and DaemonRuntimeConfig (default 19222)
- Remove file-based idle gate from dropbox.rs and daemon_request.rs

* fix(hooks): address review findings — error handling, type safety, tests, docs

- Propagate write_task error in check_queue instead of silently dropping tasks
- Replace unwrap_or_default() with proper JSON parse error handling in report
- Separate prompt-hook detection from has_our_hook to prevent over-broad matching
- Add 1 MiB body size limit to HTTP hook endpoint
- DRY DEFAULT_HOOKS_PORT constant in kild-protocol, imported by 3 crates
- Log config load failure in resolve_hooks_port with fallback warning
- Log actionable message on HTTP hook bind failure with port number
- Refactor IdleGate to HashSet, HookResult to BrainForward + typed AgentStatus
- Add HookDecision enum replacing Option<String> for hook responses
- Add 7 queue/report unit tests (FIFO ordering, peek idempotency, overwrite)
- Fix test_ensure_claude_status_hook_always_overwrites to test actual overwrite
- Update CLAUDE.md: hook architecture, new commands, hooks_port config, queue protocol

* perf: process kill retry, PID jitter, daemon runtime

- Add process.wait() after kill() to verify termination and prevent zombie processes
- Add +/-20% random jitter to PID file polling interval to prevent thundering herd when multiple kild create commands run simultaneously
- Switch daemon from multi-threaded tokio runtime to current_thread since it's I/O-bound with low concurrency (PTY reads + IPC)

Closes #478

* fix: bounded kill wait, PID-based jitter, revert daemon runtime

- Replace process.wait() with bounded 500ms poll loop to avoid blocking indefinitely on uninterruptible sleep
- Use PID-based jitter instead of SystemTime nanos which are correlated across simultaneous launches
- Revert current_thread runtime: daemon multiplexes PTYs + IPC across multiple sessions, single-threaded would stall all sessions on any slow operation

* fix: address review feedback on kill wait loop and PID jitter

- Reuse existing `system` in kill wait loop instead of allocating a new System::new() on each of up to 50 poll iterations
- Add debug log when kill wait times out (process didn't exit within 500ms after SIGKILL)
- Hoist constant jitter calculation above the polling loop
- Align docstring to use "decorrelate" instead of "thundering herd"

* refactor: simplify kill wait loop and jitter arithmetic

- Replace exited flag with early return on process exit
- Use elapsed() pattern consistent with read_pid_file_with_retry
- Remove unnecessary signed-integer casts in jitter computation — stays in u64 since BASE_INTERVAL_MS (100) > JITTER_RANGE_MS (20)
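One way to derive a +/-20% jitter from the PID while staying in unsigned arithmetic is sketched below. The constants `BASE_INTERVAL_MS` (100) and `JITTER_RANGE_MS` (20) are taken from the commit message; the function name `jittered_interval_ms` and the modulo mapping from PID to offset are assumptions, since the actual formula is not shown.

```rust
const BASE_INTERVAL_MS: u64 = 100;
const JITTER_RANGE_MS: u64 = 20;

/// Map a process ID to a polling interval in 80..=120 ms.
/// The subtraction cannot underflow because BASE_INTERVAL_MS > JITTER_RANGE_MS,
/// so no signed casts are needed.
pub fn jittered_interval_ms(pid: u32) -> u64 {
    // Offset in 0..=2*JITTER_RANGE_MS, derived from the PID (assumed mapping).
    let offset = u64::from(pid) % (2 * JITTER_RANGE_MS + 1);
    BASE_INTERVAL_MS - JITTER_RANGE_MS + offset
}
```

A caller would compute this once with `std::process::id()` before entering the polling loop. Processes launched at the same instant get distinct PIDs, so their intervals differ even when their clocks read the same nanosecond.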

* refactor: rename *Manager types to domain-role names

Rename four types that used the vague `Manager` suffix to descriptive domain-role names per the Code Naming Contract:
- SessionManager → DaemonSessionStore (kild-daemon)
- PtyManager → PtyStore (kild-daemon)
- ProjectManager → ProjectRegistry (kild-core)
- TeamManager → TeamStore (kild-ui)

No behavior changes — pure mechanical rename across all usages, docs, and CLAUDE.md/AGENTS.md references.

Closes #520

* fix: address review feedback on *Manager rename

- Rename team_manager field → team_store across all UI call sites
- Rename 13 test functions test_project_manager_* → test_project_registry_*
- Fix 2 doc comments still saying "project manager"

…e error display (#628)

* refactor: extract magic strings, decompose long functions, standardize error display

- Add WORKTREE_ADMIN_PREFIX constant to naming.rs alongside KILD_BRANCH_PREFIX
- Use kild_branch_name() instead of raw format!("kild/{}") in pr.rs, detail_view.rs
- Use KILD_BRANCH_PREFIX in kild_branch_name() function body
- Add SHIM_VERSION constant for tmux version string in shim commands
- Extract kill_tracked_agents() from destroy_session() (127 lines → helper)
- Extract sweep_ui_daemon_sessions() from destroy_session() (55 lines → helper)
- Extract resolve_resume_args() from open_session() (46 lines → helper)
- Add display_operation_error() helper for consistent CLI error formatting
- Standardize error display across open, hide, focus, diff, health, stats, sync, commits, and teammates commands to use color::error() consistently

Closes #438

* fix: address PR review — restore comment, tighten visibility, fix test helpers

- Restore non-fatal comment on daemon cleanup in kill_tracked_agents()
- Change WORKTREE_ADMIN_PREFIX to pub(crate) — no external callers
- Replace raw format!("kild/...") in cleanup/handler.rs and overlaps.rs test helpers with kild_branch_name() / kild_worktree_admin_name()

* fix: address review feedback from PR review agents

- Fix kill_tracked_agents docstring: clarify daemon errors are always non-fatal, not gated on force flag
- Fix resolve_resume_args docstring: document is_bare_shell parameter and error return paths
- Remove .unwrap() in kill_tracked_agents, use indexing instead
- Change display_operation_error to use impl Display over &dyn Display
- Inline WORKTREE_ADMIN_PREFIX constant (single use site)
- Reuse kild_branch variable in pr.rs no_pr_found path
- Use imported color module instead of crate::color in teammates.rs
- Add kild-git crate to CLAUDE.md workspace structure

…635)

* Investigate issue #520: rename *Manager types to domain-role names

* feat(brain): add memory, hooks, maxTurns to kild-brain agent; deprecate --initial-prompt

- Add `memory: user`, `maxTurns: 200`, and agent-scoped `hooks:` to kild-brain frontmatter (PreToolUse bash guard + Stop fleet snapshot)
- Create `.claude/hooks/brain-bash-guard.sh` to enforce "no source code access" constraint at the hook level
- Replace ~40 lines of manual memory management with auto-memory note
- Fix router skill to use create-then-inject instead of --initial-prompt
- Deprecate --initial-prompt in CLI help text with runtime warnings
- Update CLAUDE.md with deprecation notes and inject-based brain setup

* fix: address review findings on brain hooks and deprecation warnings

- Guard: switch from fragile grep+sed JSON parsing to jq, fail closed on parse failure, add ERR trap, block subshell invocations (bash -c, sh -c), remove overly broad src/ pattern, document advisory nature
- Deprecation: eliminate contradictory double-warning for fleet sessions by branching into fleet-specific vs general paths (not both)
- Deprecation: use color::warning/color::hint consistently in open.rs (was bare eprintln, now matches create.rs)
- Deprecation: remove redundant initial_prompt_for_warning clone in create.rs, clone at use site instead
- Logging: add structured error!() events for inbox fallback failures in both create.rs and open.rs
- Stop hook: log stderr to file instead of /dev/null, ensure dir exists
- SKILL.md: add session-active check after sleep 5 before injecting, warn user if session not ready instead of silently losing the message

* refactor(fleet): replace dropbox protocol with universal inbox

Remove the complex dropbox messaging system (task IDs, history.jsonl, flock locking, ack files) in favor of a simpler file-based inbox protocol at ~/.kild/inbox/<project_id>/<branch>/.

Key changes:
- Delete dropbox.rs (2,100 lines) and its CLI commands (check-queue, report)
- Add inbox.rs with streamlined read_inbox_state() and generate_prime_context()
- Extract fleet instruction generation into fleet_instructions.rs
- Simplify Claude hooks: remove Stop prompt hook, SessionStart auto-prime, check-queue from TeammateIdle, and report from TaskCompleted
- Rewrite inbox/inject/prime CLI commands for the new protocol
- Add inbox path helpers to kild-paths

* fix(fleet): address review findings from PR #637

Critical fixes:
- Remove write_task("honryu",...) from forward_to_brain that clobbered the brain's task.md on every worker Stop event
- Populate fleet entries in kild prime --json (was always empty vec)
- Surface errors in handle_all_prime (was silently returning Ok)

Error handling fixes:
- Narrow status file read to only swallow NotFound, warn on other I/O errors
- Await spawn_blocking JoinHandle in forward_to_brain to catch panics
- Add eprintln for inbox status init failure (was warn-only)

Simplification:
- Remove unused _is_brain parameter from ensure_inbox
- Flatten write_task return from Result<Option<()>> to Result<bool>
- Extract build_fleet_entries() and render_fleet_table() helpers
- Merge _resolved wrappers into primary read_inbox_state/generate_prime_context
- Extract write_fleet_instructions_to() shared helper
- Warn on corrupt fleet instruction markers (begin without end)

Docs:
- Fix README.md stale dropbox references and removed --task/--report/--status flags
- Fix wave planner skill referencing deleted dropbox.rs

…erations (#638)

* fix(daemon): restore pooled connection timeout after short-timeout operations

Four functions (ping_daemon, get_session_status, get_session_info, read_scrollback) set a 2s read timeout on pooled IPC connections but never restored the default 30s before returning to the pool. The next caller on the same thread inherited the corrupted 2s timeout, causing spurious timeouts on slower operations.

Save the original timeout before overriding, and restore it before returning the connection to the pool. If the restore fails, the connection is dropped rather than poisoning the pool.

Add IpcConnection::get_read_timeout() to support the save/restore pattern.

* refactor(daemon): extract RAII timeout guard for pooled connection operations

Replace 8 manual save/set/restore cycles across 4 functions with IpcConnection::with_read_timeout() — a closure-based helper that saves the original timeout, sets a short one, runs the caller's closure, and restores the original on return.
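The save/set/run/restore cycle can be sketched over a plain `UnixStream`. The method names `get_read_timeout()` and `with_read_timeout()` come from the commit messages; everything else here (the struct layout, the closure signature, the use of `std::os::unix::net::UnixStream` as the transport) is an assumption made so the example is self-contained on Unix.

```rust
use std::io;
use std::os::unix::net::UnixStream;
use std::time::Duration;

/// Hypothetical wrapper around a pooled connection's stream.
pub struct IpcConnection {
    stream: UnixStream,
}

impl IpcConnection {
    pub fn new(stream: UnixStream) -> Self {
        Self { stream }
    }

    pub fn get_read_timeout(&self) -> io::Result<Option<Duration>> {
        self.stream.read_timeout()
    }

    /// Run `op` with a temporary read timeout, then restore the original one.
    /// If the restore itself fails, the error is returned so the caller can
    /// drop the connection instead of handing it back to the pool.
    pub fn with_read_timeout<T>(
        &mut self,
        timeout: Duration,
        op: impl FnOnce(&mut UnixStream) -> io::Result<T>,
    ) -> io::Result<T> {
        let original = self.stream.read_timeout()?;
        self.stream.set_read_timeout(Some(timeout))?;
        let result = op(&mut self.stream);
        self.stream.set_read_timeout(original)?;
        result
    }
}
```

A closure-based helper keeps the restore on every normal return path, including early `?` exits inside `op`. It does not restore on panic; a guard type with a `Drop` impl would be needed for that, and the source does not say which variant the project uses.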

)

* fix(session): mark session Stopped even when daemon is unreachable

When the daemon isn't running, `kild stop` would fail with DaemonError and leave the session stuck Active on disk. The early return at stop.rs:124 fired before the session status was updated.

Consolidate the `is_daemon_unreachable` check from list.rs into a proper `is_unreachable()` method on DaemonClientError. Use it in both stop_session() and stop_teammate() to treat unreachable daemon errors (NotRunning, ConnectionFailed, ProtocolError, Io) as "PTY already dead" instead of blocking the stop flow.

* fix(session): use is_unreachable() in destroy path for consistency
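The classification described above is a natural fit for `matches!`. In this sketch the four unreachable variant names come from the commit message, while their payload types, the extra `Other` variant, and the `tolerate_unreachable` helper are hypothetical.

```rust
use std::io;

/// Variant names follow the commit message; payloads are assumed.
#[derive(Debug)]
pub enum DaemonClientError {
    NotRunning,
    ConnectionFailed(String),
    ProtocolError(String),
    Io(io::Error),
    /// Hypothetical catch-all for errors reported by a live daemon.
    Other(String),
}

impl DaemonClientError {
    /// True when the error means the daemon could not be reached at all,
    /// in which case the PTY can be treated as already dead.
    pub fn is_unreachable(&self) -> bool {
        matches!(
            self,
            Self::NotRunning | Self::ConnectionFailed(_) | Self::ProtocolError(_) | Self::Io(_)
        )
    }
}

/// Hypothetical helper: swallow unreachable errors so the stop flow can
/// continue and mark the session Stopped; propagate everything else.
pub fn tolerate_unreachable(
    result: Result<(), DaemonClientError>,
) -> Result<(), DaemonClientError> {
    match result {
        Err(e) if e.is_unreachable() => Ok(()),
        other => other,
    }
}
```

Putting the check on the error type gives stop, teammate stop, and destroy a single definition of "unreachable" instead of one copy per call site.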

… ID races (#640)

* fix(shim): hold registry lock across load-modify-save to prevent pane ID races

state::load() and state::save() each acquired independent flocks. Between the two calls, a concurrent split-window could read the same next_pane_id, causing duplicate pane IDs and orphaned daemon PTYs.

Add LockedRegistry guard type that holds both the Flock and PaneRegistry. save(self) writes while the lock is still held. Update all 8 callers in commands.rs to use load_and_lock() + locked.save(). Add thread-level concurrency test verifying unique pane ID allocation.

* fix(shim): gate standalone load() behind #[cfg(test)], keep save() for init_registry

* fix(shim): address review findings from PR #640

Update module doc to reflect that load() is now test-only and save() is init-only. Add read/write phase comments to handle_new_session. Restructure handle_new_window to do all reads before writes, eliminating interleaved registry()/registry_mut() calls.
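The guard pattern can be illustrated in-process. This sketch swaps the file lock for a `std::sync::Mutex` and the registry file for an in-memory value, so it is an analogue of the technique rather than the shim's implementation: `LockedRegistry`, `PaneRegistry`, `load_and_lock()`, `save(self)`, `registry_mut()`, and `next_pane_id` are names from the commit message, while `RegistryStore` and `allocate_pane_id` are hypothetical.

```rust
use std::sync::{Mutex, MutexGuard};

#[derive(Debug, Clone, Default)]
pub struct PaneRegistry {
    pub next_pane_id: u32,
}

/// Stand-in for the registry file plus its flock (assumption: a Mutex here).
#[derive(Default)]
pub struct RegistryStore {
    disk: Mutex<PaneRegistry>,
}

/// Holds the lock and the loaded registry together, so nothing else can
/// load between this caller's read and its write.
pub struct LockedRegistry<'a> {
    guard: MutexGuard<'a, PaneRegistry>,
    registry: PaneRegistry,
}

impl RegistryStore {
    pub fn load_and_lock(&self) -> LockedRegistry<'_> {
        let guard = self.disk.lock().expect("registry lock poisoned");
        let registry = guard.clone();
        LockedRegistry { guard, registry }
    }
}

impl LockedRegistry<'_> {
    pub fn registry_mut(&mut self) -> &mut PaneRegistry {
        &mut self.registry
    }

    /// Consumes the guard: the write happens while the lock is still held,
    /// and the lock is released only when `guard` is dropped at the end.
    pub fn save(self) {
        let LockedRegistry { mut guard, registry } = self;
        *guard = registry;
    }
}

/// Load-modify-save under a single lock acquisition.
pub fn allocate_pane_id(store: &RegistryStore) -> u32 {
    let mut locked = store.load_and_lock();
    let id = locked.registry_mut().next_pane_id;
    locked.registry_mut().next_pane_id += 1;
    locked.save();
    id
}
```

Taking `self` by value in `save` means a caller cannot keep using the registry after the lock is gone, which makes the load-modify-save window explicit in the type system.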

* feat(ui): add terminal reconnection on daemon disconnect

When the daemon reader connection drops, the terminal view now shows "Press R to reconnect" instead of a dead-end error. The Reconnect action spawns an async task that calls connect_for_attach(), builds a new Terminal::from_daemon(), and replaces the terminal + event task atomically.

Handles edge cases: daemon gone, session destroyed, multiple rapid presses, local terminals unaffected.

* fix(ui): address review findings from PR #641

- Use actual terminal dimensions on reconnect instead of hardcoded 24x80
- Log error on reconnect_state lock poison instead of silently swallowing
- Accept uppercase R for reconnect key (CapsLock resilience)
- Fix theme::surface_1() -> theme::surface() compilation error

Add a kild-fleet MCP channel server that watches inbox files and pushes notifications into Claude Code sessions via the channels protocol. This reduces fleet communication latency from ~1s (Claude inbox polling) to ~100ms (fs.watch + stdio notification).

The channel server (TypeScript/Bun) is embedded in the Rust binary and installed to ~/.kild/channels/fleet/ via `kild init-channels`. It exposes MCP tools (report_status, send_to_worker, send_to_brain, list_fleet) so agents can communicate without shelling out to CLI commands.

Gated behind [fleet] channels = true config flag (default: false). Requires Bun runtime. Graceful degradation when unavailable.

New command: `kild init-channels` — installs server + bun deps.
New config: `[fleet] channels` — enables channel server for fleet sessions.
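The config gate can be pictured as a small flag-building function. The PR summary states that `fleet_agent_flags()` appends `--dangerously-load-development-channels server:kild-fleet` when channels are enabled and that `FleetConfig` defaults to `channels = false`; the signature below, the `base` parameter, and the `bun_available` check are assumptions made for illustration.

```rust
/// Mirrors the `[fleet] channels` toggle; `Default` gives `channels: false`.
#[derive(Debug, Clone, Default)]
pub struct FleetConfig {
    pub channels: bool,
}

/// Flag pair quoted in the PR summary.
const CHANNEL_FLAG: &str = "--dangerously-load-development-channels";
const CHANNEL_SERVER: &str = "server:kild-fleet";

/// Hypothetical signature: extend the base agent flags with the channel flag
/// only when the toggle is on and the Bun runtime was found.
pub fn fleet_agent_flags(base: &[&str], config: &FleetConfig, bun_available: bool) -> Vec<String> {
    let mut flags: Vec<String> = base.iter().map(|s| s.to_string()).collect();
    if config.channels && bun_available {
        flags.push(CHANNEL_FLAG.to_string());
        flags.push(CHANNEL_SERVER.to_string());
    }
    flags
}
```

When either condition is false the agent launches with its usual flags and falls back to the file-based inbox, which matches the graceful degradation the description mentions.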

…tall

Channel server writes a .channel breadcrumb to the inbox dir after MCP handshake completes. `kild inbox` surfaces this as [channel] next to the status line, so you can confirm the channel server is actually connected.

Stale breadcrumbs are cleaned on session create/open (ensure_inbox). Also skip `bun install` in `kild init-channels` when node_modules exists.

Wirasm and others added 30 commits February 11, 2026 09:43

refactor: remove dead code from TerminalError enum in kild-ui (#372)

fb10de9

Remove unused PtyRead and ChannelSend variants, the unused impl block (error_code, is_user_error), and #[allow(dead_code)] attributes. These were left over from before the terminal rendering was wired up.

Investigate issue #358: complete should fail on branches with no PR

9eeaa7b

Investigate issue #363: kild list should not truncate column values

250c07d

Wirasm and others added 28 commits February 26, 2026 23:09

Wirasm closed this Jun 5, 2026

Wirasm force-pushed the main branch from 369c35b to 9c874c2 Compare June 5, 2026 11:45


feat(fleet): add MCP channel server for real-time agent communication#642

Wirasm wants to merge 602 commits into main from feature/fleet-channels

Wirasm commented Mar 24, 2026


