fix(adapters): drop duplicate webhook deliveries at ingest#1987
fix(adapters): drop duplicate webhook deliveries at ingest#1987truffle-dev wants to merge 3 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds webhook delivery deduplication to GitHub handling, threads delivery IDs from the server into the adapter, extends the webhook comment type, and adds utility and adapter tests for duplicate and edited comment deliveries. ChangesWebhook Delivery Deduplication
Estimated code review effort: 3 (Moderate) | ~25 minutes Sequence Diagram(s)sequenceDiagram
participant GitHub
participant ServerWebhook
participant GitHubAdapter
participant DeliveryDeduplicator
GitHub->>ServerWebhook: POST webhook + x-github-delivery
ServerWebhook->>GitHubAdapter: handleWebhook(payload, signature, deliveryId)
GitHubAdapter->>DeliveryDeduplicator: seen(commentKey or deliveryKey)
alt duplicate delivery
DeliveryDeduplicator-->>GitHubAdapter: true
GitHubAdapter-->>ServerWebhook: return early
else new delivery
DeliveryDeduplicator-->>GitHubAdapter: false
GitHubAdapter->>GitHubAdapter: continue webhook processing
GitHubAdapter-->>ServerWebhook: processed
end
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/adapters/src/forge/github/adapter.test.ts`:
- Around line 470-527: The tests currently swallow errors from unmocked Octokit
calls in deliver(), making dedup tests non-deterministic; update
createDedupAdapter() to attach a mocked octokit client on the GitHubAdapter
instance that stubs the minimal downstream API methods used by handleWebhook
(e.g., the issues/comments and any repo/pull methods your webhook flow calls) so
calls resolve successfully, and then change deliver() to await
adapter.handleWebhook(...) without catching/ignoring errors so the test
deterministically fails on unexpected behavior; reference createDedupAdapter,
GitHubAdapter, and deliver when locating where to add the stubbed octokit
methods and remove the try/catch.
In `@packages/adapters/src/forge/github/adapter.ts`:
- Around line 1005-1010: The dedup key construction (dedupKey) currently treats
a comment identity as present when event.comment?.id exists even if
event.comment.updated_at is missing; change the condition that builds the
`comment:` key to require both `event.comment.id` and `event.comment.updated_at`
(non-null/undefined) before using them, otherwise fall back to the `deliveryId`
branch or undefined; update the ternary/condition around `dedupKey` (the
expression referencing `event.comment?.id` and `event.comment.updated_at`) so
edited-comment deduping only occurs when both values exist.
In `@packages/core/src/utils/delivery-dedup.test.ts`:
- Around line 3-79: Tests rely on real sleep/timers causing flakiness; switch to
Jest fake timers in the tests that use sleep so expiry is deterministic: in the
"key expires after TTL and may run again", "expired entries are pruned on
insert", and "re-seeing a key after expiry refreshes its eviction position"
tests, replace async await sleep(30) calls with jest.useFakeTimers() at test
start, call jest.advanceTimersByTime(30) (or
jest.runOnlyPendingTimers()/jest.runAllTimers() as needed) instead of awaiting
sleep, then call jest.useRealTimers() at the end; keep references to
DeliveryDeduplicator, seen, size, and remove reliance on the sleep helper (or
stub it to call advanceTimersByTime) so TTL behavior is driven by the fake clock
rather than wall-clock waits.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5448b689-2aee-409a-9ecd-b9cc1dbdfaff
📒 Files selected for processing (8)
packages/adapters/src/forge/github/adapter.test.tspackages/adapters/src/forge/github/adapter.tspackages/adapters/src/forge/github/types.tspackages/core/package.jsonpackages/core/src/index.tspackages/core/src/utils/delivery-dedup.test.tspackages/core/src/utils/delivery-dedup.tspackages/server/src/index.ts
| function createDedupAdapter(): GitHubAdapter { | ||
| const adapter = new GitHubAdapter( | ||
| { kind: 'pat', token: 'fake-token-for-testing' }, | ||
| 'fake-webhook-secret', | ||
| mockLockManager, | ||
| 'archon' | ||
| ); | ||
| // @ts-expect-error - accessing private method for testing | ||
| adapter.verifySignature = mock(() => true); | ||
| return adapter; | ||
| } | ||
|
|
||
| /** | ||
| * Comment payload carrying GitHub's comment identity (id + updated_at), | ||
| * as real issue_comment deliveries do. | ||
| */ | ||
| function createIdentifiedCommentPayload( | ||
| commentBody: string, | ||
| commentId: number | undefined, | ||
| updatedAt: string | undefined | ||
| ): string { | ||
| const comment: { | ||
| id?: number; | ||
| body: string; | ||
| user: { login: string }; | ||
| updated_at?: string; | ||
| } = { body: commentBody, user: { login: 'user123' } }; | ||
| if (commentId !== undefined) comment.id = commentId; | ||
| if (updatedAt !== undefined) comment.updated_at = updatedAt; | ||
| return JSON.stringify({ | ||
| action: 'created', | ||
| issue: { | ||
| number: 42, | ||
| title: 'Test Issue', | ||
| body: 'Description', | ||
| user: { login: 'user123' }, | ||
| labels: [], | ||
| state: 'open', | ||
| }, | ||
| comment, | ||
| repository: { | ||
| owner: { login: 'testuser' }, | ||
| name: 'testrepo', | ||
| full_name: 'testuser/testrepo', | ||
| html_url: 'https://github.com/testuser/testrepo', | ||
| default_branch: 'main', | ||
| }, | ||
| sender: { login: 'user123' }, | ||
| }); | ||
| } | ||
|
|
||
| async function deliver(adapter: GitHubAdapter, payload: string, deliveryId?: string) { | ||
| try { | ||
| await adapter.handleWebhook(payload, 'mock-signature', deliveryId); | ||
| } catch { | ||
| // Expected - Octokit API not mocked for the downstream message path. | ||
| } | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Dedup tests depend on unmocked downstream Octokit failures.
Line 521-Line 526 swallows expected errors from an unmocked webhook path, which makes these tests depend on external call behavior instead of only dedup logic. Mock the minimal Octokit methods in createDedupAdapter() and make deliver() await success deterministically.
Suggested fix
function createDedupAdapter(): GitHubAdapter {
const adapter = new GitHubAdapter(
{ kind: 'pat', token: 'fake-token-for-testing' },
'fake-webhook-secret',
mockLockManager,
'archon'
);
// `@ts-expect-error` - accessing private method for testing
adapter.verifySignature = mock(() => true);
+ // `@ts-expect-error` - accessing private property for testing
+ adapter.octokit = {
+ rest: {
+ repos: {
+ get: mock(async () => ({ data: { default_branch: 'main' } })),
+ },
+ issues: {
+ listComments: mock(async () => ({ data: [] })),
+ createComment: mock(async () => ({ data: { id: 1 } })),
+ },
+ },
+ };
return adapter;
}
async function deliver(adapter: GitHubAdapter, payload: string, deliveryId?: string) {
- try {
- await adapter.handleWebhook(payload, 'mock-signature', deliveryId);
- } catch {
- // Expected - Octokit API not mocked for the downstream message path.
- }
+ await adapter.handleWebhook(payload, 'mock-signature', deliveryId);
}As per coding guidelines, “Keep tests deterministic — no flaky timing or network dependence without guardrails” and “mock external dependencies (database, AI SDKs, platform APIs).”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/adapters/src/forge/github/adapter.test.ts` around lines 470 - 527,
The tests currently swallow errors from unmocked Octokit calls in deliver(),
making dedup tests non-deterministic; update createDedupAdapter() to attach a
mocked octokit client on the GitHubAdapter instance that stubs the minimal
downstream API methods used by handleWebhook (e.g., the issues/comments and any
repo/pull methods your webhook flow calls) so calls resolve successfully, and
then change deliver() to await adapter.handleWebhook(...) without
catching/ignoring errors so the test deterministically fails on unexpected
behavior; reference createDedupAdapter, GitHubAdapter, and deliver when locating
where to add the stubbed octokit methods and remove the try/catch.
Source: Coding guidelines
…1951) A duplicate delivery of the same comment passed every guard and queued a byte-identical second workflow run behind the first: the lock manager orders per-conversation messages but never dedups them. Dual repo+App webhook subscriptions (different delivery GUIDs for one comment), LB double-forwards, and redeliveries all hit this. Adds a bounded TTL first-seen cache in core and gates GitHub webhook processing on a logical idempotency key (comment id + updated_at, so edited comments still re-trigger), falling back to the X-GitHub-Delivery GUID when the payload lacks comment identity. Fails open when neither is available.
…stic TTL tests Keying on comment id alone (updated_at absent) would dedup an edit against the original within the TTL window — require both fields and use the delivery-GUID fallback otherwise. Dedup TTL tests now use an injected clock instead of wall-clock sleeps.
fae21fc to
3400495
Compare
|
Rebased onto current |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/adapters/src/forge/github/adapter.ts`:
- Around line 999-1018: The dedup key in github/adapter.ts is being marked as
seen too early in the webhook flow, before clone/resolution/orchestration
succeeds. Update the deliveryDedup usage around the duplicate check in the
handler so the key is only committed after downstream processing has succeeded,
or remove/evict it when the run fails, to preserve redelivery retries for failed
comment triggers. Refer to the dedupKey / this.deliveryDedup.seen path in the
GitHub adapter handler.
In `@packages/core/src/utils/delivery-dedup.ts`:
- Line 102: The debug log event name in delivery-dedup should follow the
structured `{domain}.{action}_{state}` convention instead of the current generic
name. Update the `getLog().debug(...)` call in `delivery-dedup` (around the
eviction logic in the `DeliveryDedup` implementation) to use a domain-prefixed
event such as `delivery_dedup.entries_evicted`, keeping the existing payload
fields intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7ed72cba-c5bd-4181-a9bb-11d6242704b1
📒 Files selected for processing (8)
packages/adapters/src/forge/github/adapter.test.tspackages/adapters/src/forge/github/adapter.tspackages/adapters/src/forge/github/types.tspackages/core/package.jsonpackages/core/src/index.tspackages/core/src/utils/delivery-dedup.test.tspackages/core/src/utils/delivery-dedup.tspackages/server/src/index.ts
✅ Files skipped from review due to trivial changes (2)
- packages/core/src/index.ts
- packages/server/src/index.ts
🚧 Files skipped from review as they are similar to previous changes (4)
- packages/adapters/src/forge/github/types.ts
- packages/core/src/utils/delivery-dedup.test.ts
- packages/adapters/src/forge/github/adapter.test.ts
- packages/core/package.json
| // 5a. Ingest idempotency. Key on the comment's identity (id + updated_at) | ||
| // rather than the delivery GUID: dual subscriptions (repo webhook + App | ||
| // webhook) deliver the same comment under different GUIDs, so GUID-only | ||
| // dedup misses the common duplicate source. An edited comment gets a new | ||
| // updated_at and forms a new key, so edit re-triggers still run. Both | ||
| // fields are required — keying on id alone would dedup an edit against | ||
| // the original. Falls back to the delivery GUID when either is absent. | ||
| const dedupKey = | ||
| event.comment?.id !== undefined && event.comment.updated_at | ||
| ? `comment:${owner}/${repo}#${String(number)}:${String(event.comment.id)}:${event.comment.updated_at}` | ||
| : deliveryId | ||
| ? `delivery:${deliveryId}` | ||
| : undefined; | ||
| if (dedupKey && this.deliveryDedup.seen(dedupKey)) { | ||
| getLog().info( | ||
| { eventType, owner, repo, number, deliveryId }, | ||
| 'github.duplicate_delivery_dropped' | ||
| ); | ||
| return; | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Dedup marks the key as seen before downstream processing succeeds.
seen(dedupKey) records the key on first sight, then the handler proceeds through repo clone, branch resolution, and orchestration — any of which can throw. Because the webhook is fired-and-forget at the route (errors are only logged), a GitHub at-least-once redelivery (or LB retry) of the same comment within the 10-minute TTL will now be dropped, so a trigger that failed mid-processing is never re-attempted. This narrows the resilience the redelivery mechanism previously provided.
If recovering failed triggers via redelivery matters, consider recording the key only after the run is successfully enqueued/handled, or evicting the key on downstream failure.
The updated_at-required fix from the prior review is correctly applied here.
🧰 Tools
🪛 ESLint
[error] 1013-1013: Unsafe call of a type that could not be resolved.
(@typescript-eslint/no-unsafe-call)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/adapters/src/forge/github/adapter.ts` around lines 999 - 1018, The
dedup key in github/adapter.ts is being marked as seen too early in the webhook
flow, before clone/resolution/orchestration succeeds. Update the deliveryDedup
usage around the duplicate check in the handler so the key is only committed
after downstream processing has succeeded, or remove/evict it when the run
fails, to preserve redelivery retries for failed comment triggers. Refer to the
dedupKey / this.deliveryDedup.seen path in the GitHub adapter handler.
…seen() ordering
The eviction debug log used 'evicted_oldest_entries', which does not
follow the {domain}.{action}_{state} event-name convention; rename it to
'delivery_dedup.entries_evicted'.
Also document why the ingest dedup marks the key before downstream
processing: the duplicates it guards against are dual subscriptions
delivering the same comment near-simultaneously, so the key must be
claimed before either delivery finishes or both would double-process.
Summary
Fixes #1951 — @sbiitmc's diagnosis is exactly right: a duplicate delivery of the same comment passes every guard in
handleWebhook, reachesConversationLockManager.acquireLock, and gets queued behind the in-flight run asqueued-conversation. The lock manager orders per-conversation messages but never dedups them, so when run #1 completes,.finally → processQueuereplays the byte-identical message as a full second workflow run (full token cost, can push/open a second PR).The incident's duplicate source — dual repo-webhook + App-webhook subscriptions — delivers the same comment under different delivery GUIDs, so deduping on
X-GitHub-Deliveryalone would miss it. The trigger path always carries a comment (close events return early inparseEvent), so the fix keys on the comment's identity instead:id+updated_at. Dual-subscription duplicates and LB double-forwards share that identity and get dropped; an edited comment gets a newupdated_atand still re-triggers.Changes
DeliveryDeduplicatorinpackages/core/src/utils/delivery-dedup.ts(exported from core, reusable by the gitea/gitlab adapters later): a bounded first-seen cache — 10-minute TTL so deliberate manual redeliveries hours later still run, 10k-entry cap with oldest-first eviction so memory stays bounded.comment:{owner}/{repo}#{number}:{comment.id}:{comment.updated_at}after the @mention check and before the expensive path (user resolution, conversation/codebase creation, clone/sync, comment-history fetch). Drops log asgithub.duplicate_delivery_droppedat info.delivery:{deliveryId}when the payload lacks comment identity, and fails open when neither is available — never drops a webhook for want of a key. The dedup check sits after signature verification so unauthenticated junk can't poison the cache.X-GitHub-Deliverywas already extracted in the webhook route but only used in error logs; it's now threaded tohandleWebhookas an optional third parameter (per the issue's scope note). Gitea/gitlab adapters untouched.WebhookEvent.commenttype gains optionalid/updated_at(both present on real GitHub deliveries).Tests
delivery-dedup.test.ts(new, wired into core's test chain): first-seen/repeat, key independence, edit-forms-new-key, TTL expiry, prune-on-insert, max-size eviction, post-expiry refresh — 8 cases.adapter.test.ts, newwebhook delivery dedupblock: same-GUID repeat dropped, dual-subscription duplicate (same comment, different GUIDs) dropped — the Duplicate webhook delivery runs a workflow twice (no ingest idempotency) #1951 incident shape, edited comment re-processed, distinct comments independent, GUID fallback, fail-open with no key — 6 cases asserting on conversation-creation call counts.idempoten|dedup|deliveryId|x-github-delivery), and no test pinned the duplicate-processing behavior.Validation
bun run lint --max-warnings 0,format:checkpass repo-wide. Full adapters suite green (70-test github adapter lane + all chained lanes). Core chain green through the utils lanes including the new one (the long-tail orchestrator/credentials lanes are untouched by this diff and left to CI).tsc --noEmitclean on core/adapters/server apart from pre-existingpackages/providerscopilot-client drift unrelated to this change.Closes #1951
Summary by CodeRabbit