feat: Support long-running local inference with configurable timeouts and busy-agent queueing #1209
Open
Coder666 wants to merge 2 commits into
fix: align queued channel routing with bridge router
fix: validate channel queue timing config
This was referenced May 22, 2026
Summary
This PR makes OpenFang much more reliable for long-running agent turns, especially when using local or self-hosted inference backends that can be significantly slower than hosted APIs.
It introduces configurable HTTP/tool/runtime timeouts for inter-agent work, persistent queueing for messages sent while an agent is busy, and safer agent state cleanup so agents do not get stuck in a permanent busy state.
Motivation
The existing runtime assumed relatively short hosted-model response times. That breaks down for local inference, large models, slow GPUs, CPU fallback, long-context generation, or agent-to-agent workflows where one agent may trigger a full downstream turn.
The symptoms this branch addresses include:
Major Changes
Applies to:
This allows operators to extend request timeouts for local servers, proxy layers, or long-form generation workloads.
Runtime Configuration
Adds a new runtime configuration surface for agent execution behavior.
New runtime settings include:
The agent loop now receives runtime config and uses it for loop limits, tool timeout behavior, continuation limits, and context/tool-result budgeting.
Agent Tool Timeouts
Inter-agent tools such as agent_send and agent_spawn can have a much longer default timeout, because they may represent complete agent turns rather than simple tool calls.
Timeouts can be overridden through environment variables, and setting timeout values to 0 disables that timeout for operators who explicitly want unbounded waits.
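As a rough illustration of the "0 disables the timeout" convention, the sketch below maps a raw setting to an optional deadline. The helper name `resolve_timeout` and its fallback rules are assumptions made for this example, not code from this PR.

```rust
use std::time::Duration;

/// Hypothetical helper: turn a raw setting (for example the value of an
/// environment variable) into an optional timeout.
/// `None` means "wait without a deadline".
fn resolve_timeout(raw: Option<&str>, default_secs: u64) -> Option<Duration> {
    let secs = match raw.map(str::trim) {
        // Unset or empty: fall back to the configured default.
        None | Some("") => default_secs,
        // Assumption: unparseable values also fall back to the default.
        Some(text) => text.parse::<u64>().unwrap_or(default_secs),
    };
    // A value of 0 explicitly disables the timeout.
    if secs == 0 {
        None
    } else {
        Some(Duration::from_secs(secs))
    }
}
```

A caller could feed it an environment override with something like `resolve_timeout(std::env::var("EXAMPLE_TIMEOUT_SECS").ok().as_deref(), 600)`, where `EXAMPLE_TIMEOUT_SECS` is a placeholder name rather than a variable defined by this PR.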
Busy-Agent Handling
The kernel now detects when an agent is already processing a turn and returns a structured AgentBusy error instead of letting concurrent messages pile into the same session path.
It also marks active turns as Thinking, allowing channels and status surfaces to distinguish an active long-running turn from a normal idle agent.
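The busy check can be pictured roughly as follows. This is a minimal sketch, assuming a single state flag per agent; the `Agent`, `AgentState`, and `TurnError` types and the `try_begin_turn` method are illustrative names, and the kernel's real types may differ.

```rust
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Idle,
    Thinking,
}

#[derive(Debug, PartialEq, Eq)]
enum TurnError {
    /// Structured error returned when a turn is already in progress.
    AgentBusy { agent_id: String },
}

struct Agent {
    id: String,
    state: Mutex<AgentState>,
}

impl Agent {
    fn new(id: &str) -> Self {
        Agent { id: id.to_string(), state: Mutex::new(AgentState::Idle) }
    }

    /// Claim the agent for one turn, or report that it is busy.
    /// The check and the state change happen under the same lock,
    /// so two callers cannot both start a turn.
    fn try_begin_turn(&self) -> Result<(), TurnError> {
        let mut state = self.state.lock().unwrap();
        if *state == AgentState::Thinking {
            return Err(TurnError::AgentBusy { agent_id: self.id.clone() });
        }
        *state = AgentState::Thinking;
        Ok(())
    }

    /// Mark the turn as finished so the next message can be accepted.
    fn end_turn(&self) {
        *self.state.lock().unwrap() = AgentState::Idle;
    }
}
```

Returning an error value rather than blocking lets the caller decide whether to retry, queue, or report the busy state.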
Safe Thinking-State Cleanup
Adds guarded cleanup around Thinking state so an interrupted, aborted, or dropped turn does not leave the agent permanently busy.
This is important for integrations that check busy state before dispatching, because a stale Thinking state can otherwise make an agent appear unavailable forever.
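One common way to get this kind of guarded cleanup in Rust is an RAII guard whose `Drop` implementation resets the state. The sketch below shows that general pattern under assumed names (`ThinkingGuard`, `AgentState`); it is not necessarily how this PR implements it.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Idle,
    Thinking,
}

/// Hypothetical guard: sets the state to Thinking on creation and
/// restores Idle when dropped, including on early return or panic unwind.
struct ThinkingGuard {
    state: Arc<Mutex<AgentState>>,
}

impl ThinkingGuard {
    fn enter(state: Arc<Mutex<AgentState>>) -> Self {
        *state.lock().unwrap_or_else(|e| e.into_inner()) = AgentState::Thinking;
        ThinkingGuard { state }
    }
}

impl Drop for ThinkingGuard {
    fn drop(&mut self) {
        // Recover the lock even if it was poisoned, so cleanup still runs.
        *self.state.lock().unwrap_or_else(|e| e.into_inner()) = AgentState::Idle;
    }
}
```

Because the reset lives in `Drop`, every exit path from the turn clears the flag without needing an explicit cleanup call at each return site.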
Persistent Channel Queueing
Adds queue handling for messages sent while an agent is busy.
New channel queue settings include:
Queued messages are persisted through the kernel memory substrate and retried by a background queue processor. This means channel messages can be accepted while an agent is busy and delivered when the agent becomes available.
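The accept-now, deliver-later flow can be sketched as below. This is a simplified, in-memory stand-in: the real queue is persisted through the kernel memory substrate, and the `ChannelQueue` type, the `max_attempts` limit, and the `deliver` callback are all assumptions made for this example.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq, Eq)]
struct QueuedMessage {
    agent_id: String,
    text: String,
    attempts: u32,
}

/// Hypothetical in-memory queue; persistence is omitted in this sketch.
struct ChannelQueue {
    pending: VecDeque<QueuedMessage>,
    max_attempts: u32,
}

impl ChannelQueue {
    fn new(max_attempts: u32) -> Self {
        ChannelQueue { pending: VecDeque::new(), max_attempts }
    }

    /// Accept a message for an agent that is currently busy.
    fn enqueue(&mut self, agent_id: &str, text: &str) {
        self.pending.push_back(QueuedMessage {
            agent_id: agent_id.to_string(),
            text: text.to_string(),
            attempts: 0,
        });
    }

    /// One pass of the background processor. `deliver` returns true when
    /// the agent accepted the message. Messages that fail are re-queued
    /// until they reach `max_attempts`; those are returned as dropped.
    fn process_once<F>(&mut self, mut deliver: F) -> Vec<QueuedMessage>
    where
        F: FnMut(&QueuedMessage) -> bool,
    {
        let mut dropped = Vec::new();
        // Visit each message that was pending at the start of the pass once.
        for _ in 0..self.pending.len() {
            let mut msg = match self.pending.pop_front() {
                Some(m) => m,
                None => break,
            };
            if deliver(&msg) {
                continue; // delivered, nothing left to do
            }
            msg.attempts += 1;
            if msg.attempts >= self.max_attempts {
                dropped.push(msg);
            } else {
                self.pending.push_back(msg);
            }
        }
        dropped
    }
}
```

A background task would call `process_once` on an interval, passing a closure that attempts delivery and returns false while the agent is still busy.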
The queueing path also:
Channel Bridge Retry Behavior
Channel dispatch now retries busy-agent responses instead of immediately surfacing an error.
This applies to:
Context and Compaction Configuration
Adds config plumbing for session compaction behavior and context-window fallback behavior.
This includes:
Documentation
Updates configuration docs for:
Why Should You Accept This PR?
Compatibility
Validation
Adds and updates tests around:
Testing
cargo clippy --workspace --all-targets -- -D warnings passes
cargo test --workspace passes
(Tested in a Docker container with local inference)
Security