feat: Support long-running local inference with configurable timeouts and busy-agent queueing #1209

Open
Coder666 wants to merge 2 commits into RightNow-AI:main from Coder666:feat/agentic-timeouts

Coder666 commented May 20, 2026

Summary

This PR makes OpenFang much more reliable for long-running agent turns, especially when using local or self-hosted inference backends that can be significantly slower than hosted APIs.

It introduces configurable HTTP/tool/runtime timeouts for inter-agent work, persistent queueing for messages sent while an agent is busy, and safer agent state cleanup so agents do not get stuck in a permanent busy state.

Motivation

The existing runtime assumed relatively short hosted-model response times. That breaks down for local inference, large models, slow GPUs, CPU fallback, long-context generation, or agent-to-agent workflows where one agent may trigger a full downstream turn.

The symptoms this branch addresses include:

  • Long local inference requests timing out too early.
  • agent_send / agent_spawn timing out before downstream agents complete.
  • Telegram and other channel integrations receiving “agent busy” when a message arrives during an active turn.
  • Agents remaining in Thinking after interrupted or dropped work.
  • Operators needing to tune timeout behavior without recompiling.

Major Changes

Configurable HTTP Timeouts

Adds http_timeout_secs to model configuration so OpenAI-compatible providers can be tuned for slow inference.

Applies to:

  • Default model
  • Fallback providers
  • Provider driver creation
  • OpenAI-compatible/local providers
  • Azure/OpenAI-compatible variants where applicable

This allows operators to extend request timeouts for local servers, proxy layers, or long-form generation workloads.
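As a rough illustration of how an optional timeout field can stay backward compatible, here is a minimal sketch. Only the field name http_timeout_secs comes from this PR; the `ModelConfig` struct, the `http_timeout` helper, and the 120-second default are assumptions made for the example, not the actual OpenFang types or values.

```rust
use std::time::Duration;

/// Hypothetical model config; only `http_timeout_secs` is named in this PR.
#[derive(Debug, Clone, Default)]
pub struct ModelConfig {
    /// Request timeout in seconds. `None` means "use the built-in default",
    /// which is what an existing config without the field would produce.
    pub http_timeout_secs: Option<u64>,
}

/// Assumed built-in default for this sketch, not the real value.
pub const DEFAULT_HTTP_TIMEOUT_SECS: u64 = 120;

/// Resolve the effective request timeout for a provider.
pub fn http_timeout(cfg: &ModelConfig) -> Duration {
    Duration::from_secs(cfg.http_timeout_secs.unwrap_or(DEFAULT_HTTP_TIMEOUT_SECS))
}
```

With this shape, a config that omits the field resolves to the default, while an operator running a slow local server could set something like `http_timeout_secs: Some(1800)` to allow 30-minute requests.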

Runtime Configuration

Adds a new runtime configuration surface for agent execution behavior.

New runtime settings include:

  • max_iterations
  • max_retries
  • base_retry_delay_ms
  • tool_timeout_secs
  • agent_tool_timeout_secs
  • max_continuations
  • max_agent_call_depth
  • browser/MCP timeout fields
  • tool_result_budget_ratio

The agent loop now receives runtime config and uses it for loop limits, tool timeout behavior, continuation limits, and context/tool-result budgeting.
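To make the shape of this configuration surface concrete, the sketch below groups the named settings into one struct with defaults. The field names are taken from the list above; every default value, the field types, and the two helper methods (including the choice of exponential backoff) are assumptions for illustration only.

```rust
use std::time::Duration;

/// Sketch of a runtime config. Field names follow the PR description;
/// types and defaults are placeholders, not the project's real values.
#[derive(Debug, Clone)]
pub struct RuntimeConfig {
    pub max_iterations: u32,
    pub max_retries: u32,
    pub base_retry_delay_ms: u64,
    pub tool_timeout_secs: u64,
    pub agent_tool_timeout_secs: u64,
    pub max_continuations: u32,
    pub max_agent_call_depth: u32,
    pub tool_result_budget_ratio: f64,
}

impl Default for RuntimeConfig {
    fn default() -> Self {
        // Placeholder values for illustration only.
        Self {
            max_iterations: 50,
            max_retries: 3,
            base_retry_delay_ms: 1_000,
            tool_timeout_secs: 120,
            agent_tool_timeout_secs: 3_600,
            max_continuations: 5,
            max_agent_call_depth: 5,
            tool_result_budget_ratio: 0.3,
        }
    }
}

impl RuntimeConfig {
    /// Delay before retry number `attempt` (0-based), doubling each time.
    /// Exponential backoff is an assumption of this sketch.
    pub fn retry_delay(&self, attempt: u32) -> Duration {
        let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
        Duration::from_millis(self.base_retry_delay_ms.saturating_mul(factor))
    }

    /// Whether an agent at call depth `depth` may call another agent.
    pub fn may_nest(&self, depth: u32) -> bool {
        depth < self.max_agent_call_depth
    }
}
```

Passing one such value into the agent loop keeps the limits in a single place instead of scattering constants through the loop body.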

Agent Tool Timeouts

Inter-agent tools such as agent_send and agent_spawn get a much longer default timeout than ordinary tool calls, because they may represent complete agent turns rather than simple tool calls.

Timeouts can be overridden through environment variables. Setting a timeout value to 0 disables that timeout, for operators who explicitly want unbounded waits.
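The override and "0 disables" rules could be resolved along the following lines. This is a sketch under stated assumptions: the function names are invented, the environment variable name `EXAMPLE_AGENT_TOOL_TIMEOUT_SECS` is a placeholder (the PR description does not list the real variable names), and falling back to the configured value when the variable is unparsable is a design choice of the example.

```rust
use std::time::Duration;

/// Resolve a timeout from a configured value and an optional raw
/// environment override. The override wins when it parses as an integer;
/// a final value of 0 means "no timeout" and yields `None`.
pub fn resolve_timeout(config_secs: u64, env_value: Option<&str>) -> Option<Duration> {
    let secs = env_value
        .and_then(|raw| raw.trim().parse::<u64>().ok())
        .unwrap_or(config_secs);
    if secs == 0 {
        None
    } else {
        Some(Duration::from_secs(secs))
    }
}

/// Convenience wrapper that reads the override from the process environment.
pub fn agent_tool_timeout(config_secs: u64) -> Option<Duration> {
    // Hypothetical variable name, used here only for illustration.
    let raw = std::env::var("EXAMPLE_AGENT_TOOL_TIMEOUT_SECS").ok();
    resolve_timeout(config_secs, raw.as_deref())
}
```

Returning `Option<Duration>` makes the unbounded case explicit at the call site, so a caller has to decide what "no timeout" means rather than accidentally waiting zero seconds.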

Busy-Agent Handling

The kernel now detects when an agent is already processing a turn and returns a structured AgentBusy error instead of letting concurrent messages pile into the same session path.

It also marks active turns as Thinking, allowing channels and status surfaces to distinguish an active long-running turn from a normal idle agent.
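One way to picture the busy check is an atomic flag that a turn must claim before it starts. The sketch below is not the kernel's implementation: `AgentSlot`, its methods, and the layout of the `AgentBusy` error are assumptions; only the error name and the Thinking concept come from the description above.

```rust
use std::fmt;
use std::sync::atomic::{AtomicBool, Ordering};

/// Structured error returned when an agent is already processing a turn.
#[derive(Debug, PartialEq, Eq)]
pub struct AgentBusy {
    pub agent_id: String,
}

impl fmt::Display for AgentBusy {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "agent {} is busy with another turn", self.agent_id)
    }
}

impl std::error::Error for AgentBusy {}

/// Hypothetical per-agent slot that tracks whether a turn is active.
pub struct AgentSlot {
    id: String,
    thinking: AtomicBool,
}

impl AgentSlot {
    pub fn new(id: &str) -> Self {
        Self { id: id.to_string(), thinking: AtomicBool::new(false) }
    }

    /// Claim the agent for a turn. Only one caller can flip the flag from
    /// false to true; every concurrent caller gets `AgentBusy` instead.
    pub fn begin_turn(&self) -> Result<(), AgentBusy> {
        self.thinking
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .map(|_| ())
            .map_err(|_| AgentBusy { agent_id: self.id.clone() })
    }

    /// Release the agent once the turn is finished.
    pub fn end_turn(&self) {
        self.thinking.store(false, Ordering::Release);
    }

    /// Status surfaces can use this to show an active long-running turn.
    pub fn is_thinking(&self) -> bool {
        self.thinking.load(Ordering::Acquire)
    }
}
```

Because the claim is a single compare-and-exchange, two messages arriving at the same moment cannot both enter the same session path.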

Safe Thinking-State Cleanup

Adds guarded cleanup around Thinking state so an interrupted, aborted, or dropped turn does not leave the agent permanently busy.

This is important for integrations that check busy state before dispatching, because a stale Thinking state can otherwise make an agent appear unavailable forever.
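A common Rust pattern for this kind of guarded cleanup is an RAII guard whose `Drop` implementation restores the idle state, so the reset runs on early return, error, or panic unwinding alike. The sketch below shows the pattern in isolation; `AgentState`, `ThinkingGuard`, and the use of `Arc<Mutex<_>>` are assumptions, and the PR may implement its cleanup differently.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState {
    Idle,
    Thinking,
}

/// Marks the agent as Thinking on creation and restores Idle when dropped.
pub struct ThinkingGuard {
    state: Arc<Mutex<AgentState>>,
}

impl ThinkingGuard {
    pub fn enter(state: Arc<Mutex<AgentState>>) -> Self {
        // Recover the inner value even if a previous holder panicked.
        *state.lock().unwrap_or_else(|e| e.into_inner()) = AgentState::Thinking;
        Self { state }
    }
}

impl Drop for ThinkingGuard {
    fn drop(&mut self) {
        let mut s = self.state.lock().unwrap_or_else(|e| e.into_inner());
        // Only undo our own marker; leave any other state untouched.
        if *s == AgentState::Thinking {
            *s = AgentState::Idle;
        }
    }
}
```

A turn handler would hold the guard for the duration of the turn, for example `let _guard = ThinkingGuard::enter(state.clone());`, and never needs an explicit reset call on its exit paths.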

Persistent Channel Queueing

Adds queue handling for messages sent while an agent is busy.

New channel queue settings include:

  • queue_enabled
  • queue_max_retries
  • queue_sleep_secs
  • queue_poll_secs

Queued messages are persisted through the kernel memory substrate and retried by a background queue processor. This means channel messages can be accepted while an agent is busy and delivered when the agent becomes available.

The queueing path also:

  • avoids duplicate queued messages by platform message id
  • reports queue position to the user
  • preserves lifecycle reactions
  • keeps typing indicators from leaking
  • records delivery success/failure
  • retains busy-race messages instead of dropping them
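The deduplication, queue-position, and busy-race behaviors listed above can be sketched with an in-memory queue. This example deliberately leaves out persistence (the PR stores queued messages through the kernel memory substrate) and the background processor; `ChannelQueue`, `QueuedMessage`, and all method names are hypothetical.

```rust
use std::collections::{HashSet, VecDeque};

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct QueuedMessage {
    pub platform_message_id: String,
    pub text: String,
    pub attempts: u32,
}

/// In-memory stand-in for the persistent channel queue.
#[derive(Default)]
pub struct ChannelQueue {
    pending: VecDeque<QueuedMessage>,
    seen: HashSet<String>,
}

impl ChannelQueue {
    /// Enqueue a message. Returns its 1-based queue position, or `None`
    /// if a message with the same platform id was already accepted.
    pub fn enqueue(&mut self, id: &str, text: &str) -> Option<usize> {
        if !self.seen.insert(id.to_string()) {
            return None;
        }
        self.pending.push_back(QueuedMessage {
            platform_message_id: id.to_string(),
            text: text.to_string(),
            attempts: 0,
        });
        Some(self.pending.len())
    }

    /// Take the oldest message for a delivery attempt.
    pub fn pop_next(&mut self) -> Option<QueuedMessage> {
        self.pending.pop_front()
    }

    /// After a busy race, put the message back at the front instead of
    /// dropping it. Returns false once `max_retries` is exhausted.
    pub fn retain_after_busy(&mut self, mut msg: QueuedMessage, max_retries: u32) -> bool {
        msg.attempts += 1;
        if msg.attempts > max_retries {
            return false;
        }
        self.pending.push_front(msg);
        true
    }
}
```

The position returned by `enqueue` is what a channel could report back to the user, and `max_retries` plays the role that queue_max_retries plays in the settings above.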

Channel Bridge Retry Behavior

Channel dispatch now retries busy-agent responses instead of immediately surfacing an error.

This applies to:

  • normal text dispatch
  • multimodal/content-block dispatch
  • re-resolution retry paths
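The retry behavior amounts to a bounded loop that treats a busy response as retryable and everything else as final. The following is a synchronous sketch using `std::thread::sleep` so that it stays self-contained; the real bridge is presumably asynchronous, and `DispatchError` and `dispatch_with_retry` are names invented for this example.

```rust
use std::thread::sleep;
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
pub enum DispatchError {
    /// The agent is processing another turn; worth retrying.
    AgentBusy,
    /// Any other failure; surfaced immediately.
    Other(String),
}

/// Call `dispatch` until it stops reporting `AgentBusy` or until
/// `max_retries` retries have been used, sleeping `delay` between attempts.
pub fn dispatch_with_retry<T>(
    max_retries: u32,
    delay: Duration,
    mut dispatch: impl FnMut() -> Result<T, DispatchError>,
) -> Result<T, DispatchError> {
    let mut attempt = 0;
    loop {
        match dispatch() {
            Err(DispatchError::AgentBusy) if attempt < max_retries => {
                attempt += 1;
                sleep(delay);
            }
            // Success, a non-busy error, or busy with no retries left.
            other => return other,
        }
    }
}
```

Taking the dispatch step as a closure lets the same loop wrap text dispatch, content-block dispatch, and re-resolution paths without duplicating the retry logic.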

Context and Compaction Configuration

Adds config plumbing for session compaction behavior and context-window fallback behavior.

This includes:

  • configurable compaction thresholds
  • configurable recent-message retention
  • configurable summary/token budgeting
  • configurable tool-result budget ratio
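As an illustration of how ratio-style settings are typically applied, the sketch below derives a token budget and a compaction decision from a context window size. The PR description does not define the exact semantics of tool_result_budget_ratio or the compaction thresholds, so both functions, their names, and the interpretation of each value as a fraction of the context window are assumptions.

```rust
/// Maximum number of tokens tool results may occupy, assuming the ratio is
/// a fraction of the context window. Out-of-range ratios are clamped to
/// [0, 1]; non-finite ratios are treated as 0.
pub fn tool_result_budget(context_window_tokens: usize, ratio: f64) -> usize {
    let ratio = if ratio.is_finite() { ratio.clamp(0.0, 1.0) } else { 0.0 };
    (context_window_tokens as f64 * ratio).floor() as usize
}

/// Whether a session should be compacted, assuming the threshold is the
/// fraction of the context window that may fill up before compaction.
pub fn should_compact(used_tokens: usize, context_window_tokens: usize, threshold: f64) -> bool {
    used_tokens as f64 >= context_window_tokens as f64 * threshold
}
```

Under these assumptions, an 8192-token window with a ratio of 0.25 leaves 2048 tokens for tool results, and a threshold of 0.8 triggers compaction once roughly 6554 tokens are in use.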

Documentation

Updates configuration docs for:

  • HTTP timeout settings
  • runtime limits
  • channel queue behavior
  • local inference tuning
  • local model cost/power notes

Why Should You Accept This PR?

  • Makes OpenFang more practical on local/self-hosted inference.
  • Avoids premature failures for slow but valid model responses.
  • Prevents channel integrations from treating a long-running agent as broken.
  • Preserves user messages sent during busy turns.
  • Reduces session corruption risk from concurrent sends.
  • Gives operators config knobs instead of hard-coded runtime assumptions.

Compatibility

  • Existing configs continue to work because new fields have defaults.
  • Timeout environment variables are still supported.
  • Queueing defaults are configurable.
  • Hosted-provider users should see no behavior change unless they opt into new timeout settings.

Validation

Adds and updates tests around:

  • busy-agent retry behavior
  • persistent queue helpers
  • runtime timeout defaults
  • driver config construction
  • HTTP timeout config compatibility
  • compaction/runtime defaults

Testing

  • cargo clippy --workspace --all-targets -- -D warnings passes
  • cargo test --workspace passes
  • Live integration tested (if applicable)

(Tested in a Docker container with local inference.)

Security

  • No new unsafe code
  • No secrets or API keys in diff
  • User input validated at boundaries

Coder666 force-pushed the feat/agentic-timeouts branch from c0bdd4c to 51756cd on May 20, 2026 21:33
…usy-agent queueing

Introduces configurable HTTP/tool/runtime timeouts for inter-agent work, persistent queueing for messages sent while an agent is busy, and safer agent state cleanup so agents do not get stuck in a permanent busy state.
Coder666 force-pushed the feat/agentic-timeouts branch from 51756cd to 9a958d2 on May 20, 2026 21:39
Coder666 changed the title from "Support long-running local inference with configurable timeouts and busy-agent queueing" to "feat: Support long-running local inference with configurable timeouts and busy-agent queueing" on May 20, 2026
fix: align queued channel routing with bridge router
fix: validate channel queue timing config