Skip to content

fix(session): reject kild open on active sessions#603

Merged
Wirasm merged 3 commits into
mainfrom
fix/issue-599-open-active-guard
Feb 26, 2026
Merged

fix(session): reject kild open on active sessions#603
Wirasm merged 3 commits into
mainfrom
fix/issue-599-open-active-guard

Conversation

@Wirasm

@Wirasm Wirasm commented Feb 26, 2026

Copy link
Copy Markdown
Owner

Summary

kild open on an already-active session unconditionally spawned a second agent process and overwrote agent_session_id with a new UUID. This created competing agents in the same worktree and permanently broke --resume for the original conversation.

Root Cause

open_session() had no guard checking session status before spawning — unlike create_session() which checks for AlreadyExists. The agents vec got a duplicate push and rotate_agent_session_id() silently discarded the live session ID.

Changes

File Change
crates/kild-core/src/sessions/errors.rs Add AlreadyActive error variant with actionable alternatives (attach, inject, stop+reopen)
crates/kild-core/src/sessions/open.rs Add active-session guard before agent spawn with daemon liveness check
crates/kild-core/src/sessions/errors.rs Add test for AlreadyActive error
crates/kild-core/src/sessions/open.rs Add tests for guard logic (active+agents blocked, stopped allowed)

Design Decisions

  • Daemon liveness check: For daemon sessions, sync_daemon_session_status() verifies the PTY is truly running. Stale-active sessions (daemon PTY died without kild stop) get synced to Stopped and can reopen normally.
  • Terminal sessions trust stored status: No reliable PID-based liveness check for external terminal processes. Users can kild stop then kild open.
  • --all unaffected: The open --all CLI path already filters for Stopped sessions only.

Testing

  • cargo fmt --check passes
  • cargo clippy --all -- -D warnings passes
  • cargo test --all passes (all existing + 3 new tests)

Validation

cargo fmt --check && cargo clippy --all -- -D warnings && cargo test --all

Issue

Fixes #599

…agent

Root cause: open_session() has no guard checking session status before
spawning. Fix: add AlreadyActive error + active-session guard with daemon
liveness check, mirroring the AlreadyExists guard in create_session().
`kild open` on an already-active session unconditionally spawned a second
agent process and overwrote `agent_session_id`, breaking `--resume` for
the original conversation with no recovery path.

Changes:
- Add `AlreadyActive` error variant with actionable alternatives
  (attach, inject, stop+reopen)
- Add active-session guard in `open_session()` before agent spawn
- Daemon sessions get a liveness check via `sync_daemon_session_status()`
  so stale-active sessions (dead PTY) can still be reopened
- Non-daemon sessions trust stored status (no reliable liveness check)

Fixes #599
@Wirasm

Wirasm commented Feb 26, 2026

Copy link
Copy Markdown
Owner Author

PR Review Summary

Reviewed by 5 specialized agents: code-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, and self-reviewer.


Critical Issues (3 found)

# Agent(s) Issue Location
1 code-reviewer, self-reviewer, silent-failure-hunter, comment-analyzer Redundant save_session_to_file in stale-active path. sync_daemon_session_status() already persists the status change via patch_session_json_fields() (which preserves unknown fields from newer binary versions). The follow-up save_session_to_file() is a full JSON rewrite that can silently drop unknown fields — directly contradicting the safety guarantee established in list.rs:117-119. Additionally, the info! fires before the save attempt with no _failed counterpart on error. Fix: remove the redundant save entirely. open.rs:125-132
2 self-reviewer Guard blocks BareShell opens on active sessions. The guard fires for all OpenMode variants including BareShell. A bare shell doesn't spawn an agent and can't corrupt agent_session_id — the exact scenario the guard exists to prevent. Users running kild open my-branch --no-agent on an active session will get a false AlreadyActive rejection. Fix: gate guard on !matches!(mode, OpenMode::BareShell). open.rs:95-98
3 comment-analyzer open_session docstring contradicts new behavior. The docstring says "additive - doesn't close existing terminals" and "multiple agents can run in the same kild" — the guard now prevents exactly that. Fix: update docstring to reflect the new guard. open.rs:35-41

Important Issues (3 found)

# Agent(s) Issue Location
4 self-reviewer, pr-test-analyzer, code-reviewer Tests validate boolean condition copy, not open_session(). Both guard tests manually evaluate session.status == Active && session.has_agents() without calling open_session(). If the guard is removed or inverted, these tests still pass. The stale-active daemon path (the subtlest logic) is entirely untested. Fix: add end-to-end test calling open_session() on an Active terminal session asserting Err(AlreadyActive). open.rs:1136-1187
5 silent-failure-hunter No log event for terminal liveness-skip branch. When a non-daemon session hits the guard, there's no log indicating the terminal path was taken (trusting stored status without liveness check). Makes debugging "why was my open blocked?" harder. Fix: add info! with core.session.open_terminal_liveness_skipped. open.rs:108-110
6 code-reviewer Test names don't follow <subject>_<expected_behavior> convention. Per CLAUDE.md. open.rs, errors.rs

Suggestions (5 found)

# Agent(s) Suggestion Location
7 pr-test-analyzer Add kild stop assertion to test_already_active_error (one-liner) errors.rs:230
8 comment-analyzer Add comment explaining !sync_daemon_session_status() negation (returns true when changed to Stopped, so ! = still running) open.rs:106
9 silent-failure-hunter Log resume field in open_started event open.rs:51-56
10 silent-failure-hunter Log session_id in open_stale_active_synced event open.rs:127-131
11 pr-test-analyzer, comment-analyzer test_stopped_session_not_blocked claims "even if agents non-empty" but doesn't add agents — add one to prove the claim open.rs:1176-1187

Strengths

  • Error message is actionable — three concrete recovery paths (attach, inject, stop && open)
  • Guard placement is correct — after worktree check, before any agent resolution or spawn
  • Daemon liveness check reuses existing sync_daemon_session_status() infrastructure
  • --all path correctly unaffected (already filters for Stopped)
  • Structured logging follows {layer}.{domain}.{action}_{state} naming convention
  • AlreadyActive error type is well-formed: error code, is_user_error(), actionable display

Verdict

NEEDS FIXES — 3 critical issues require changes before merge. The core approach is sound; the guard logic, error variant, and daemon liveness integration are correct. The critical fixes are:

  1. Remove the redundant save_session_to_file (5-line deletion)
  2. Gate the guard on !BareShell (1-line addition)
  3. Update the docstring (text edit)

Recommended Action Order

  1. Fix Critical Implement file-based persistence system #1 — remove redundant save
  2. Fix Critical Add Ghostty terminal support (blocked by upstream -e parsing) #2 — gate on !BareShell
  3. Fix Critical feat: implement cleanup tracking system #3 — update docstring
  4. Fix Important feat: Hierarchical TOML Configuration System #4 — add end-to-end regression test
  5. Fix Important Implement missing session management commands (destroy, cleanup, info) #5 — add terminal liveness-skip log
  6. Apply suggestions as time permits

Critical fixes:
- Remove redundant save_session_to_file in stale-active path —
  sync_daemon_session_status already persists via patch_session_json_fields
  which preserves unknown fields from newer binary versions
- Gate guard on !BareShell — bare shell opens don't spawn agents and
  should not be blocked by the active-session guard
- Update open_session docstring to reflect the new guard behavior

Important fixes:
- Add terminal liveness-skip log event for debuggability
- Rename tests to follow <subject>_<expected_behavior> convention
- Add BareShell bypass test to verify bare shells are not blocked
- Stopped session test now adds an agent to prove status check, not
  just empty agents vec
- Add kild stop assertion to error message test
- Add comment explaining sync_daemon_session_status negation
- Log resume field in open_started event
- Log session_id in open_stale_active_synced event
@Wirasm Wirasm merged commit 01fedb1 into main Feb 26, 2026
6 checks passed
@Wirasm Wirasm deleted the fix/issue-599-open-active-guard branch February 26, 2026 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: kild open on active session spawns duplicate agent, overwrites agent_session_id

1 participant