Deflake e2e run 28392564611#3711
Conversation
Bugbot couldn't run - usage limit reachedBugbot is counted against Cursor usage for this user or team, and this run hit a usage or spend limit. A user or team admin can review and increase usage limits in the Cursor dashboard. (requestId: serverGenReqId_17c534bd-be72-4352-84ce-7fa4a49a73b3) |
|
@BugBot run |
Bugbot couldn't run - usage limit reachedBugbot is counted against Cursor usage for this user or team, and this run hit a usage or spend limit. A user or team admin can review and increase usage limits in the Cursor dashboard. (requestId: serverGenReqId_d35fcdda-f937-47fc-aadc-2a46f7b663b0) |
There was a problem hiding this comment.
Code Review
This pull request updates E2E tests and documentation to handle legacy local-agent code search and destructive Supabase migrations. Specifically, it disables the code explorer in the local-agent code search spec, adds proposal approval steps in the Supabase migration tests, updates the corresponding E2E testing rules documentation, and sorts file paths in a codebase test to ensure consistency. The reviewer recommends replacing a ternary operator with an explicit if/else block in package_manager.spec.ts to improve readability when setting the maximum attempts for CI environments.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
|
|
||
| function warmSocketFirewallCache() { | ||
| const maxAttempts = 5; | ||
| const maxAttempts = process.env.CI ? 8 : 5; |
There was a problem hiding this comment.
According to the project's general rules, we should prioritize explicit if/else blocks over ternary operators for clarity when handling different execution paths (such as CI vs. local builds), even if it results in minor code duplication. Readability is favored over conciseness in these scenarios.
let maxAttempts = 5;
if (process.env.CI) {
maxAttempts = 8;
}References
- Prioritize explicit if/else blocks over ternary operators for clarity when handling different execution paths (e.g., test vs. production builds), even if it results in minor code duplication. Readability is favored over conciseness in these scenarios.
There was a problem hiding this comment.
Pull request overview
This PR addresses flakiness across E2E and unit tests by making assertions deterministic and ensuring tests explicitly perform required UI/actions before checking filesystem outputs.
Changes:
- Make
extractCodebaseunit test order-insensitive by sorting extracted paths before asserting. - Stabilize E2E behavior by explicitly approving destructive Supabase SQL proposals before asserting migration file creation.
- Reduce CI-only flakiness by increasing Socket Firewall warmup retry attempts on CI and pinning local-agent
code_searchE2E behavior viaenableCodeExplorer: falseplus persisted-settings polling; update the ARIA snapshot and document both patterns.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/tests/codebase.test.ts | Sort extracted file paths before equality assertion to avoid filesystem iteration order flakes. |
| rules/e2e-testing.md | Document patterns for stabilizing code_search E2E and handling destructive Supabase SQL proposals. |
| e2e-tests/supabase_migrations.spec.ts | Call po.approveProposal() after destructive SQL prompts before asserting migration files. |
| e2e-tests/snapshots/local_agent_code_search.spec.ts_local-agent---code-search-1.aria.yml | Update ARIA baseline to reflect stable code_search UI rendering when code explorer is disabled. |
| e2e-tests/package_manager.spec.ts | Increase Socket Firewall warmup retry attempts on CI to mitigate transient download failures. |
| e2e-tests/local_agent_code_search.spec.ts | Disable code explorer via persisted settings and poll settings before snapshot to prevent UI variance. |
π‘ Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
π Dyadbot Code Review SummaryVerdict: β
YES - Ready to merge This is a well-scoped test-deflaking PR that addresses four independent flakiness sources. All fixes follow established patterns in the test suite. Change-by-change assessment:
β No issues found by persona-based review. Generated by Dyadbot persona-based code review |
π Playwright Test Resultsβ Some tests failed
Summary: 345 passed, 1 failed, 7 skipped Failed Testsπ macOS
π Re-run Failing Tests (macOS)Copy and paste to re-run all failing spec files locally: npm run e2e \
e2e-tests/local_agent_auto.spec.tsπ View full report |
Summary
Failing run: https://github.com/dyad-sh/dyad/actions/runs/28392564611
Note
Low Risk
Test-only and documentation changes; no production behavior modified.
Overview
E2E reliability fixes for Supabase migrations, local-agent
code_search, Socket Firewall warmup, and a flaky unit assertion.Supabase migration specs now call
po.approveProposal()after destructive SQL prompts (DROP TABLE) because auto-approve does not apply those changesβmigration file assertions were racing ahead of approval.The local-agent
code_searchspec forcesenableCodeExplorer: falseand polls persisted settings before the fixture prompt, so the chat snapshot consistently shows the Code Search tool UI instead of varying with code-explorer readiness. The ARIA baseline was updated to match.Socket Firewall warmup in
package_manager.spec.tsuses 8 retries on CI (vs 5 locally) for transientsfwnpx download failures.extractCodebasetest compares file paths with.sort()so filesystem iteration order does not fail the assertion.rules/e2e-testing.mddocuments the code explorer and destructive SQL approval patterns.Reviewed by Cursor Bugbot for commit aa87587. Bugbot is set up for automated code reviews on this repo. Configure here.