Detect unknown_model_ai_credits failure in conclusion job by Copilot · Pull Request #38610 · github/gh-aw

Copilot · 2026-06-11T13:03:30Z

When max-ai-credits is active and the requested model isn't in the built-in pricing table, the AWF API proxy returns HTTP 400 with type unknown_model_ai_credits. Previously this produced only a generic fallback message in the conclusion comment. Reference: run 27346208956

Detection pipeline

Follows the same pattern as ai_credits_rate_limit_error:

ai_credits_context.cjs — parseUnknownModelAICreditsFromAuditLog walks firewall audit JSONL for "type": "unknown_model_ai_credits"
parse_mcp_gateway_log.cjs — detects across all log paths (gateway.jsonl, rpc-messages.jsonl, gateway.md, legacy logs), emits unknown_model_ai_credits step output
compiler_main_job.go — exposes it as an agent job output from parse-mcp-gateway
notify_comment.go — passes GH_AW_UNKNOWN_MODEL_AI_CREDITS env var to the conclusion job

Failure message

New template unknown_model_ai_credits.md explains the error is a non-transient config failure (not worth retrying) and offers three remediation paths:

# Option 1: alias to a known model via `models`
model: my-custom-model
max-ai-credits: 500
models:
  my-custom-model:
    model: gpt-4.1

# Option 2: use a model already in the pricing table

# Option 3: supply a fallback price
models:
  my-custom-model:
    defaultAiCreditsPricing: 3.0

handle_agent_failure.cjs — reads env var, adds unknown_model_ai_credits failure category, new buildUnknownModelAICreditsContext wired into both comment and issue template contexts
agent_failure_comment.md / agent_failure_issue.md — insert {unknown_model_ai_credits_context} after {ai_credits_rate_limit_error_context}

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot

Pull request overview

This PR improves failure reporting when max-ai-credits is enabled and the requested model is not present in the built-in AI credits pricing table by surfacing an unknown_model_ai_credits signal to the conclusion job and ensuring it can be rendered in the failure comment/issue templates.

Changes:

Expose unknown_model_ai_credits as an agent job output sourced from parse-mcp-gateway, and forward it to the conclusion job via GH_AW_UNKNOWN_MODEL_AI_CREDITS.
Update agent failure templates to include {unknown_model_ai_credits_context} alongside existing AI-credits failure contexts.
Add a focused compiler/lockfile assertion test and refresh golden/lockfile fixtures to include the new output wiring.

Show a summary per file

File	Description
pkg/workflow/compiler_main_job.go	Adds `unknown_model_ai_credits` to the agent job outputs map, sourced from the `parse-mcp-gateway` step.
pkg/workflow/notify_comment.go	Passes `GH_AW_UNKNOWN_MODEL_AI_CREDITS` into the conclusion job environment from agent outputs.
pkg/workflow/unknown_model_ai_credits_test.go	Adds a regression test asserting the output/env-var plumbing appears in compiled lock output.
actions/setup/md/agent_failure_comment.md	Includes `{unknown_model_ai_credits_context}` in the failure comment template insertion point.
actions/setup/md/agent_failure_issue.md	Includes `{unknown_model_ai_credits_context}` in the failure issue template insertion point.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/basic-copilot.golden	Updates golden to include the new agent output.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/with-imports.golden	Updates golden to include the new agent output.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/smoke-copilot.golden	Updates golden to include the new agent output.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/playwright-cli-mode.golden	Updates golden to include the new agent output.
pkg/workflow/testdata/TestWasmGolden_AllEngines/pi.golden	Updates golden to include the new agent output.
pkg/workflow/testdata/TestWasmGolden_AllEngines/gemini.golden	Updates golden to include the new agent output.
pkg/workflow/testdata/TestWasmGolden_AllEngines/copilot.golden	Updates golden to include the new agent output.
pkg/workflow/testdata/TestWasmGolden_AllEngines/codex.golden	Updates golden to include the new agent output.
pkg/workflow/testdata/TestWasmGolden_AllEngines/claude.golden	Updates golden to include the new agent output.
.github/workflows/*.lock.yml	Regenerates workflow lockfiles to propagate the new agent output and conclusion env var wiring across locked workflows.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Files reviewed: 265/265 changed files
Comments generated: 0

github-actions · 2026-06-11T13:15:44Z

✅ PR Code Quality Reviewer completed the code quality review.

github-actions · 2026-06-11T13:15:46Z

✅ Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #38610 does not have the "implementation" label (has_implementation_label=false) and has only 40 new lines of code in business logic directories, below the 100-line threshold (requires_adr_by_default_volume=false).

github-actions · 2026-06-11T13:16:06Z

🧠 Matt Pocock Skills Reviewer was skipped during the skills-based review.

github-actions · 2026-06-11T13:16:55Z

🧪 Test Quality Sentinel completed test quality analysis.

github-actions · 2026-06-11T13:25:35Z

🧪 Test Quality Sentinel Report

⚠️ Test Quality Score: 72/100 — Acceptable

Analyzed 11 test(s): 11 design, 0 implementation, 0 guideline violation(s). Marginal test inflation detected in 2 JS files (just over 2:1 ratio).

📊 Metrics & Test Classification (11 tests analyzed)

Metric	Value
New/modified tests analyzed	11
✅ Design tests (behavioral contracts)	11 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	8 (73%)
Duplicate test clusters	0
Test inflation detected	⚠️ Yes — marginal (ai_credits_context: 2.03:1, handle_agent_failure: 2.09:1)
🚨 Coding-guideline violations	0

Test Classification Details

Test	File	Classification	Issues Detected
`detects unknown_model_ai_credits type in audit log`	`ai_credits_context.test.cjs`	✅ Design	None
`detects unknown_model_ai_credits using audit.jsonl filename`	`ai_credits_context.test.cjs`	✅ Design	None
`detects unknown_model_ai_credits in a multi-entry log`	`ai_credits_context.test.cjs`	✅ Design	None
`returns false when no unknown_model_ai_credits signal is present`	`ai_credits_context.test.cjs`	✅ Design	None
`returns false for missing audit log`	`ai_credits_context.test.cjs`	✅ Design	None
`returns false for empty audit log`	`ai_credits_context.test.cjs`	✅ Design	None
`does not detect other error types as unknown_model_ai_credits`	`ai_credits_context.test.cjs`	✅ Design	None
`returns empty string when no unknown_model_ai_credits error`	`handle_agent_failure.test.cjs`	✅ Design	None
`returns template content when error is true and template exists`	`handle_agent_failure.test.cjs`	✅ Design	None
`throws when template is missing`	`handle_agent_failure.test.cjs`	✅ Design	None
`TestUnknownModelAICreditsOutput`	`pkg/workflow/unknown_model_ai_credits_test.go`	✅ Design	Happy-path only

Language Support

Tests analyzed:

🐹 Go (*_test.go): 1 test — unit (//go:build !integration) ✅
🟨 JavaScript (*.test.cjs): 10 tests (vitest)

⚠️ Notes — Marginal Observations (1 item)

⚠️ Test Inflation — Marginal (`ai_credits_context`, `handle_agent_failure`)

Classification: Metric note (not a guideline violation)
Observation: Both JS test files added slightly more than twice the lines of their corresponding production files:

ai_credits_context.test.cjs added 71 lines vs 35 in ai_credits_context.cjs → 2.03:1
handle_agent_failure.test.cjs added 46 lines vs 22 in handle_agent_failure.cjs → 2.09:1

The ratios are only marginally above the 2:1 threshold and are attributable to the fact that thorough edge-case coverage (7 and 3 tests respectively) naturally requires more test scaffolding than production code. The tests themselves are high quality — the inflation ratio metric is flagged for completeness, not as a concern.

📝 `TestUnknownModelAICreditsOutput` — Happy-path only

Classification: Design test (no deduction)
Observation: The Go test verifies the compiled lock file contains both unknown_model_ai_credits output wiring strings, but does not include a negative case (e.g., verifying the string is absent when the feature is off). Consider adding an error-path or absent-feature row to the table-driven pattern used elsewhere in the package.

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 11 tests enforce behavioral contracts. No coding-guideline violations detected.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

Assert on observable outputs, return values, or state changes
Cover error paths and boundary conditions
Would catch a behavioral regression if deleted
Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

Assert on internal function calls (mocking internals)
Only test the happy path with typical inputs
Break during legitimate refactoring even when behavior is correct
Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §27348833330

🧪 Test quality analysis by Test Quality Sentinel · 300.9 AIC · ⌖ 20.8 AIC · ◷

github-actions

✅ Test Quality Sentinel: 72/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). All 11 tests enforce behavioral contracts with solid edge-case coverage. No coding-guideline violations detected.

github-actions

REQUEST_CHANGES — One critical correctness bug must be fixed before merge.

Blocking issues

Critical: `writeStepSummaryWithTokenUsage` deleted from main path (`parse_mcp_gateway_log.cjs` ~line 994)

The new setUnknownModelAICreditsOutput line replaced writeStepSummaryWithTokenUsage instead of being inserted alongside it. On every normal workflow run (the main code path through main()), this means:

core.summary.write() is never called → the step summary is silently discarded
token-usage.jsonl is never read → GH_AW_AIC, aic, ambient_context outputs are never set
The unified event timeline is never appended

All three early-return paths (~lines 903, 934, 973) were updated correctly; only the terminal path at the bottom of main() was missed.

Medium: Redundant disk reads in `hasUnknownModelAICreditsError` (`parse_mcp_gateway_log.cjs` ~line 269)

The function unconditionally falls through to parseUnknownModelAICreditsFromAuditLog() when the in-memory pattern does not match, causing up to 5 sequential disk reads of the same audit log file (once per content source). This is inconsistent with hasAICreditsRateLimitError, which is content-only. See inline comment for the recommended refactor.

🔎 Code quality review by PR Code Quality Reviewer · ⌖ 21.9 AIC

github-actions · 2026-06-11T13:28:35Z

    }
    setAICreditsRateLimitOutput(core, aiCreditsRateLimitError);
-    writeStepSummaryWithTokenUsage(core);
+    setUnknownModelAICreditsOutput(core, unknownModelAICredits);


Dropped writeStepSummaryWithTokenUsage call breaks the main execution path: the step summary is never flushed, token usage (GH_AW_AIC, aic, ambient_context) is never exported, and the unified event timeline is never appended — silently, on every normal workflow run.

💡 Suggested fix

The new line was intended to be added alongside the existing call, not instead of it. All three early-return paths (lines ~903, ~934, ~973) already do this correctly. The final path was missed:

setAICreditsRateLimitOutput(core, aiCreditsRateLimitError); setUnknownModelAICreditsOutput(core, unknownModelAICredits); + writeStepSummaryWithTokenUsage(core); // <-- must be re-added } catch (error) {

Without this call, core.summary.write() is never invoked, so any content queued with core.summary.addRaw() in the main path is silently discarded. Token-usage.jsonl is never read, so downstream jobs that depend on aic / ambient_context outputs will receive empty values.

github-actions · 2026-06-11T13:28:35Z

+function hasUnknownModelAICreditsError(contents) {
+  const joined = contents.filter(Boolean).join("\n");
+  if (joined && UNKNOWN_MODEL_AI_CREDITS_PATTERNS.some(pattern => pattern.test(joined))) return true;
+  return parseUnknownModelAICreditsFromAuditLog();


hasUnknownModelAICreditsError reads the audit log from disk on every call: it unconditionally falls through to parseUnknownModelAICreditsFromAuditLog() when the in-memory pattern check fails. Because the function is called for each content source (gateway.jsonl, rpc-messages.jsonl, gateway.md, gateway.log, stderr.log), this means up to 5 sequential disk reads of the audit log when no error is detected — inconsistent with hasAICreditsRateLimitError, which is content-only.

💡 Suggested fix

Read the audit log once outside the per-source loop, using ||= to latch the result:

// Before the content-source loop, read audit log once unknownModelAICredits ||= parseUnknownModelAICreditsFromAuditLog();

Then simplify hasUnknownModelAICreditsError to only check in-memory content (matching the shape of hasAICreditsRateLimitError):

function hasUnknownModelAICreditsError(contents) { const joined = contents.filter(Boolean).join("\n"); return joined ? UNKNOWN_MODEL_AI_CREDITS_PATTERNS.some(p => p.test(joined)) : false; }

This mirrors the existing architecture where audit-log parsing is centralised and done once, not embedded inside a per-content-block helper.

Done

Copilot AI and others added 2 commits June 11, 2026 12:50

Add unknown_model_ai_credits detection and specialized failure message

d1810e4

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Update golden files after unknown_model_ai_credits changes

a96101e

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI assigned Copilot and pelikhan Jun 11, 2026

Copilot created this pull request from a session on behalf of pelikhan June 11, 2026 13:04 View session

Copilot AI requested a review from pelikhan June 11, 2026 13:04

Copilot finished work on behalf of pelikhan June 11, 2026 13:04

pelikhan marked this pull request as ready for review June 11, 2026 13:04

Copilot AI review requested due to automatic review settings June 11, 2026 13:04

Copilot started reviewing on behalf of pelikhan June 11, 2026 13:05 View session

Copilot AI reviewed Jun 11, 2026

View reviewed changes

github-actions Bot mentioned this pull request Jun 11, 2026

[aw] No-Op Runs #38496

Open

github-actions Bot approved these changes Jun 11, 2026

View reviewed changes

github-actions Bot previously requested changes Jun 11, 2026

View reviewed changes

github-actions Bot mentioned this pull request Jun 11, 2026

[PR Triage Report] PR Triage Report — 2026-06-11 (Run #27350912646) #38619

Open

pelikhan merged commit 1a95104 into main Jun 11, 2026
110 checks passed

pelikhan deleted the copilot/detect-unknown-model-ai-credits-failure branch June 11, 2026 13:52

github-actions Bot mentioned this pull request Jun 11, 2026

[aw-failures] [fix] Gemini router selects unpriced TTS model (gemini-3.1-flash-tts-preview) -> Smoke Gemini 400 unknown_model_ai_credits #38558

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Detect unknown_model_ai_credits failure in conclusion job#38610

Detect unknown_model_ai_credits failure in conclusion job#38610
pelikhan merged 2 commits into
mainfrom
copilot/detect-unknown-model-ai-credits-failure

Copilot AI commented Jun 11, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

github-actions Bot commented Jun 11, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 11, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 11, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 11, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 11, 2026

Test Classification Details

Language Support

⚠️ Test Inflation — Marginal (`ai_credits_context`, `handle_agent_failure`)

📝 `TestUnknownModelAICreditsOutput` — Happy-path only

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot Jun 11, 2026

Uh oh!

github-actions Bot Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Copilot AI commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Detection pipeline

Failure message

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

github-actions Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 11, 2026

🧪 Test Quality Sentinel Report

Test Classification Details

Language Support

⚠️ Test Inflation — Marginal (ai_credits_context, handle_agent_failure)

📝 TestUnknownModelAICreditsOutput — Happy-path only

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Critical: writeStepSummaryWithTokenUsage deleted from main path (parse_mcp_gateway_log.cjs ~line 994)

Medium: Redundant disk reads in hasUnknownModelAICreditsError (parse_mcp_gateway_log.cjs ~line 269)

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Jun 11, 2026 •

edited

Loading

github-actions Bot commented Jun 11, 2026 •

edited

Loading

github-actions Bot commented Jun 11, 2026 •

edited

Loading

github-actions Bot commented Jun 11, 2026 •

edited

Loading

github-actions Bot commented Jun 11, 2026 •

edited

Loading

⚠️ Test Inflation — Marginal (`ai_credits_context`, `handle_agent_failure`)

📝 `TestUnknownModelAICreditsOutput` — Happy-path only

Critical: `writeStepSummaryWithTokenUsage` deleted from main path (`parse_mcp_gateway_log.cjs` ~line 994)

Medium: Redundant disk reads in `hasUnknownModelAICreditsError` (`parse_mcp_gateway_log.cjs` ~line 269)