Fix Codex Minimal Reasoning Effort Execution #2020

Open
QwertyCoolMT wants to merge 1 commit into coleam00:dev from QwertyCoolMT:archon/task-fix-archon-minimal-setting-retry2

QwertyCoolMT commented Jun 19, 2026

Summary

Describe this PR in 2-5 bullets:

  • Problem: workflow-level Codex options such as modelReasoningEffort: minimal were parsed but not forwarded into DAG node assistantConfig.
  • Why it matters: users selecting Codex minimal reasoning could get a different runtime effort than configured, especially when tier defaults or aliases were involved.
  • What changed: DAG execution now applies Codex workflow-level modelReasoningEffort, webSearchMode, and additionalDirectories to assistantConfig, with explicit workflow reasoning taking precedence over preset effort.
  • What did not change (scope boundary): Claude node/workflow effort, Copilot/OpenCode behavior, Settings UI semantics, docs, generated API types, and unrelated workflow YAML/defaults drift.
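
The forwarding described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual helper: the function name `forwardCodexWorkflowOptions`, the `AssistantConfig` shape, and the literal members of the two union types are assumptions made for the example.

```typescript
// Illustrative types; the real definitions in packages/workflows may differ.
type ModelReasoningEffort = "minimal" | "low" | "medium" | "high";
type WebSearchMode = "disabled" | "cached" | "live";

interface WorkflowLevelOptions {
  modelReasoningEffort?: ModelReasoningEffort;
  webSearchMode?: WebSearchMode;
  additionalDirectories?: string[];
}

type AssistantConfig = Record<string, unknown>;

// Hypothetical stand-in for the helper this PR adds; not the actual implementation.
function forwardCodexWorkflowOptions(
  provider: string,
  workflow: WorkflowLevelOptions,
  assistantConfig: AssistantConfig,
): AssistantConfig {
  // Codex-only fields must not leak into other providers' configs.
  if (provider !== "codex") return assistantConfig;

  // Copy so the caller's object is not mutated.
  const next: AssistantConfig = { ...assistantConfig };
  if (workflow.modelReasoningEffort !== undefined) {
    // An explicit workflow-level value replaces whatever was set earlier.
    next.modelReasoningEffort = workflow.modelReasoningEffort;
  }
  if (workflow.webSearchMode !== undefined) {
    next.webSearchMode = workflow.webSearchMode;
  }
  if (workflow.additionalDirectories !== undefined) {
    next.additionalDirectories = [...workflow.additionalDirectories];
  }
  return next;
}

const codexConfig = forwardCodexWorkflowOptions(
  "codex",
  { modelReasoningEffort: "minimal", webSearchMode: "disabled" },
  { model: "example-model" },
);
console.log(codexConfig);
```

Only fields that are actually set on the workflow are copied, so a workflow that omits them leaves the node's existing config untouched.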

UX Journey

Before

User                    Archon workflow engine              Codex provider
----                    ----------------------              --------------
sets minimal effort --> loader accepts workflow field
                        DAG builds node options
                        workflow Codex fields dropped ----> receives no minimal workflow effort
user observes run <---- response streams back                 SDK uses default/tier behavior

After

User                    Archon workflow engine                    Codex provider
----                    ----------------------                    --------------
sets minimal effort --> loader accepts workflow field
                        DAG builds node options
                        [applies Codex workflow assistant config] ==> receives minimal/webSearch/directories
user observes run <---- response streams back                       SDK uses configured workflow effort

Architecture Diagram

Before

workflow schema -> loader -> dag-executor -> provider registry -> CodexProvider -> Codex SDK
                         \-> preset tier routing -> assistantConfig

After

workflow schema -> loader -> [~] dag-executor ===> assistantConfig ===> CodexProvider -> Codex SDK
                         \-> [~] preset tier routing respects explicit workflow modelReasoningEffort

Connection inventory (list every module-to-module edge, mark changes):

| From | To | Status | Notes |
| --- | --- | --- | --- |
| workflowSchema | loader | unchanged | Existing schema already parsed Codex workflow-level options. |
| loader | dag-executor | unchanged | Workflow object already carried parsed fields. |
| dag-executor | assistantConfig | modified | Codex workflow-level options are now copied into assistant config. |
| dag-executor | preset tier routing | modified | Explicit workflow modelReasoningEffort blocks preset effort overwrite for Codex. |
| assistantConfig | CodexProvider | unchanged | Provider already forwards assistant config to SDK thread options. |
| route/CLI validation | config writes | modified | Tests now lock acceptance of Codex effort: minimal. |

Label Snapshot

  • Risk: risk: low
  • Size: size: M
  • Scope: workflows|providers|server|cli|adapters|core|tests
  • Module: workflows:dag-executor, providers:codex, server:ai-config, cli:ai, tests:validation

Change Metadata

  • Change type: bug
  • Primary scope: multi

Linked Issue

  • Closes: None provided by workflow artifacts.
  • Related: Workflow 1d7c2db9d26534d28e79af44b0d707a6
  • Depends on: None
  • Supersedes: None

Validation Evidence (required)

Commands and result summary:

bun run type-check
bun run lint --max-warnings 0
bun run format:check
bun run test
bun run build
bun run validate
  • Evidence provided (test/log/trace/screenshot): validation artifact reports type-check pass, lint pass with 0 warnings, format pass, bun run test pass with 5135 passed / 0 failed / 17 skipped, build pass, and aggregate bun run validate pass.
  • If any command is intentionally skipped, explain why: none skipped in final validation artifact.

Security Impact (required)

  • New permissions/capabilities? (Yes/No): No
  • New external network calls? (Yes/No): No
  • Secrets/tokens handling changed? (Yes/No): No
  • File system access scope changed? (Yes/No): No
  • If any Yes, describe risk and mitigation: N/A

Compatibility / Migration

  • Backward compatible? (Yes/No): Yes
  • Config/env changes? (Yes/No): No
  • Database migration needed? (Yes/No): No
  • If yes, exact upgrade steps: N/A

Human Verification (required)

What was personally validated beyond CI:

  • Verified scenarios: Codex DAG workflow-level modelReasoningEffort: minimal reaches assistantConfig; webSearchMode and additionalDirectories are forwarded for Codex; built-in Codex small tier still routes minimal effort; explicit workflow reasoning wins over preset effort.
  • Edge cases checked: server install-scope tier/alias writes accept Codex effort: minimal; per-user tier/alias writes accept Codex effort: minimal; CLI --effort minimal validation accepts Codex.
  • What was not verified: browser UI flow, because no UI behavior changed; generated API and docs were verified as already accurate and left unchanged.
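
As a rough illustration of what "validation accepts Codex effort: minimal" means in the edge cases above, a type guard along these lines would admit `minimal` and reject unknown values. The accepted list and the name `isCodexEffort` are assumptions for this sketch; the repository's real validation may be structured differently.

```typescript
// Assumed set of accepted Codex effort values; the real list may differ.
const CODEX_EFFORTS = ["minimal", "low", "medium", "high"] as const;
type CodexEffort = (typeof CODEX_EFFORTS)[number];

// Hypothetical guard: narrows an unknown input to a known effort value.
function isCodexEffort(value: unknown): value is CodexEffort {
  return (
    typeof value === "string" &&
    (CODEX_EFFORTS as readonly string[]).indexOf(value) !== -1
  );
}

console.log(isCodexEffort("minimal"));
console.log(isCodexEffort("extreme"));
```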

Side Effects / Blast Radius (required)

  • Affected subsystems/workflows: DAG workflow execution for Codex nodes, Codex provider forwarding tests, AI preference validation tests, CLI AI config tests, deterministic test isolation in affected suites.
  • Potential unintended effects: provider option precedence regressions if future providers reuse Codex-specific fields incorrectly.
  • Guardrails/monitoring for early detection: focused DAG regression tests, Codex provider thread option tests, server/CLI validation tests, and aggregate validation.

Rollback Plan (required)

  • Fast rollback command/path: revert commit 6853b056.
  • Feature flags or config toggles (if any): none.
  • Observable failure symptoms: Codex workflows ignore configured reasoning effort again, or tests around Codex minimal effort fail.

Risks and Mitigations

  • Risk: Codex preset effort could override an explicit workflow-level reasoning setting.
    • Mitigation: DAG tests cover explicit workflow precedence over tier preset effort.
  • Risk: non-Codex providers could accidentally receive Codex-only workflow fields.
    • Mitigation: the new helper applies only when the resolved provider is codex.
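
The precedence rule behind the first mitigation can be expressed as a small pure function. This is a sketch under stated assumptions: `resolveReasoningEffort` and `EffortSources` are hypothetical names, and the real logic lives inside the DAG executor's preset handling rather than in a standalone function.

```typescript
type Effort = "minimal" | "low" | "medium" | "high";

interface EffortSources {
  provider: string;
  workflowEffort?: Effort; // explicit workflow-level modelReasoningEffort
  presetEffort?: Effort; // effort suggested by a tier preset
}

// Hypothetical resolver: for Codex, an explicit workflow value wins and the
// preset is only a fallback. Other providers are out of scope for this sketch.
function resolveReasoningEffort(sources: EffortSources): Effort | undefined {
  if (sources.provider !== "codex") return undefined;
  return sources.workflowEffort ?? sources.presetEffort;
}

console.log(
  resolveReasoningEffort({
    provider: "codex",
    workflowEffort: "minimal",
    presetEffort: "high",
  }),
);
```

Keeping the rule in one place makes it straightforward to cover both directions in a regression test: explicit value present, and explicit value absent.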

Summary by CodeRabbit

  • New Features

    • Added support for "minimal" effort tier setting for AI model configuration
    • Introduced workflow-level AI options for reasoning effort, web search mode, and additional directories
  • Tests

    • Enhanced test isolation and environment variable restoration across test suites
    • Expanded test coverage for AI tier settings and workflow execution behavior

coderabbitai Bot commented Jun 19, 2026

Walkthrough

Extends dag-executor.ts to propagate Codex-specific workflow-level fields (modelReasoningEffort, webSearchMode, additionalDirectories) through WorkflowLevelOptions into assistantConfig via a new applyCodexWorkflowAssistantOptions helper. Adds "minimal" effort support across tier tests in the CLI, server routes, and Codex provider. Fixes env var leakage in three test suites.

Changes

Codex Workflow-Level Options and Minimal Effort

| Layer / File(s) | Summary |
| --- | --- |
| WorkflowLevelOptions extension and imports (packages/workflows/src/dag-executor.ts) | Imports ModelReasoningEffort and WebSearchMode, adds optional modelReasoningEffort, webSearchMode, and additionalDirectories to WorkflowLevelOptions, and populates them from the workflow definition in executeDagWorkflow. |
| applyPresetOptions adjustment and applyCodexWorkflowAssistantOptions helper (packages/workflows/src/dag-executor.ts) | Adjusts applyPresetOptions early-return to respect workflow-level modelReasoningEffort for Codex, adds applyCodexWorkflowAssistantOptions helper copying workflow-level fields into assistantConfig, and wires the helper into resolveNodeProviderAndModel. |
| DAG executor Codex options routing and tier effort tests (packages/workflows/src/dag-executor.test.ts) | Adds executor tests verifying workflow-level Codex options reach assistantConfig, and tests tier effort routing with workflow reasoning taking precedence over tier preset effort. |
| DAG executor bash execution test determinism (packages/workflows/src/dag-executor.test.ts) | Refactors bash-node tests to mock git.execFileAsync with explicit stdout/stderr, wraps executions in try/finally for spy restoration, and tightens assertions on node_failed error text, node_completed node_output, cancellation messaging, status derivation, and typed artifact output. |
| Minimal effort tier acceptance across CLI, server, and provider (packages/providers/src/codex/provider.test.ts, packages/cli/src/commands/ai.test.ts, packages/server/src/routes/api.providers.test.ts, packages/server/src/routes/api.user-ai-prefs.test.ts) | Updates Codex provider test expectations from medium to minimal for modelReasoningEffort, and adds new minimal effort tier tests in CLI aiTierSetCommand, PATCH /api/config/tiers, and PATCH /api/auth/me/ai-prefs/tiers. |

Test Environment Variable Isolation

| Layer / File(s) | Summary |
| --- | --- |
| Env var isolation in adapter, codebases, and workflow route tests (packages/adapters/src/forge/github/adapter.test.ts, packages/core/src/db/codebases.test.ts, packages/server/src/routes/api.workflows.test.ts) | Adds beforeEach/afterEach lifecycle hooks to capture and restore GITHUB_ALLOWED_USERS, DEFAULT_AI_ASSISTANT, and ARCHON_HOME, preventing env var leakage between test cases. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • coleam00/Archon#1403: Directly overlaps with dag-executor.test.ts status derivation tests (anyFailed/trigger rules) that are also modified in this PR.
  • coleam00/Archon#1873: This PR's Codex effort/minimal tier routing builds directly on the tier/effort resolution work introduced there.

Poem

🐇 Hoppity hop, the options now flow,
Through workflow levels, a minimal glow.
Env vars restored, no leakage in sight,
Each test cleans up after its night.
The executor's wiser, the Codex is set —
The cleanest of rabbits you ever have met! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Fix Codex Minimal Reasoning Effort Execution' is specific and directly related to the main change: fixing workflow-level Codex options (specifically modelReasoningEffort) not being forwarded to DAG node assistantConfig during execution. |
| Description check | ✅ Passed | The description comprehensively covers all required template sections: Summary (problem, why it matters, what changed, scope boundaries), UX Journey (before/after flows), Architecture Diagram, Connection inventory, Label Snapshot, Change Metadata, Linked Issues, Validation Evidence (all tests passing), Security Impact (no security concerns), Compatibility/Migration (backward compatible, no migrations needed), Human Verification (scenarios and edge cases validated), Side Effects/Blast Radius, and Rollback Plan with detailed risk mitigations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

- Forward workflow-level Codex options into DAG assistant config
- Preserve explicit reasoning effort precedence over preset tiers
- Add regression and validation coverage for minimal effort
QwertyCoolMT force-pushed the archon/task-fix-archon-minimal-setting-retry2 branch from 9b7d934 to 6853b05 on June 19, 2026 20:51
QwertyCoolMT changed the title from "[codex] fix minimal Archon workflow bundle" to "Fix Codex Minimal Reasoning Effort Execution" on Jun 19, 2026
QwertyCoolMT marked this pull request as ready for review on June 19, 2026 20:54

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/server/src/routes/api.workflows.test.ts`:
- Around line 16-31: The `originalArchonHome` variable is captured once at
module load time, which means `afterEach` restores a potentially stale value
instead of the test's actual incoming value. Move the `originalArchonHome`
assignment from module scope into the `beforeEach` hook as a `let` variable
instead of `const`, so that each test run captures its own incoming state of
`process.env.ARCHON_HOME` before modifying it. This ensures that `afterEach`
restores the environment to the correct pre-test state for each individual test,
maintaining proper test isolation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3e08bde2-9b16-4405-a3e4-90a97c1af1da

📥 Commits

Reviewing files that changed from the base of the PR and between e77a338 and 6853b05.

📒 Files selected for processing (9)
  • packages/adapters/src/forge/github/adapter.test.ts
  • packages/cli/src/commands/ai.test.ts
  • packages/core/src/db/codebases.test.ts
  • packages/providers/src/codex/provider.test.ts
  • packages/server/src/routes/api.providers.test.ts
  • packages/server/src/routes/api.user-ai-prefs.test.ts
  • packages/server/src/routes/api.workflows.test.ts
  • packages/workflows/src/dag-executor.test.ts
  • packages/workflows/src/dag-executor.ts

Comment on lines +16 to +31
const originalArchonHome = process.env.ARCHON_HOME;
const isolatedArchonHome = join(tmpdir(), 'archon-api-workflows-test-home');

beforeEach(async () => {
  await rm(isolatedArchonHome, { recursive: true, force: true });
  process.env.ARCHON_HOME = isolatedArchonHome;
});

afterEach(async () => {
  await rm(isolatedArchonHome, { recursive: true, force: true });
  if (originalArchonHome === undefined) {
    delete process.env.ARCHON_HOME;
  } else {
    process.env.ARCHON_HOME = originalArchonHome;
  }
});

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make ARCHON_HOME snapshot per-test, not module-scoped.

originalArchonHome is captured once at module load (Line 16), so afterEach restores a potentially stale value instead of the test’s incoming value. Capture it inside beforeEach via a mutable let to keep restoration symmetric per test run.

Suggested patch
-const originalArchonHome = process.env.ARCHON_HOME;
+let originalArchonHome: string | undefined;
 const isolatedArchonHome = join(tmpdir(), 'archon-api-workflows-test-home');

 beforeEach(async () => {
+  originalArchonHome = process.env.ARCHON_HOME;
   await rm(isolatedArchonHome, { recursive: true, force: true });
   process.env.ARCHON_HOME = isolatedArchonHome;
 });

As per coding guidelines, “Keep tests deterministic … [and ensure] test isolation.”
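
One way to make the per-test snapshot hard to get wrong is to capture the value and return a restore function in a single step. The sketch below is an illustration only: `snapshotEnv` and the `Env` type are hypothetical names, and the helper takes the environment object as a parameter so it can be exercised without touching the real `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: records the current value of `key` at call time and
// returns a function that puts the environment back exactly as it was.
function snapshotEnv(env: Env, key: string): () => void {
  const previous = env[key];
  return () => {
    if (previous === undefined) {
      // The variable was unset before the test, so unset it again.
      delete env[key];
    } else {
      env[key] = previous;
    }
  };
}

// Simulated test run against a plain object standing in for process.env.
const env: Env = { ARCHON_HOME: "/home/user/.archon" };
const restore = snapshotEnv(env, "ARCHON_HOME"); // would run in beforeEach
env.ARCHON_HOME = "/tmp/isolated";
restore(); // would run in afterEach
console.log(env.ARCHON_HOME);
```

In a test file the same helper could be called with `process.env` inside `beforeEach`, with the returned function invoked in `afterEach`, so each test restores the value it actually observed on entry.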

Source: Coding guidelines
