benchmark: memory (codspeed) #7623
Merged
26 commits
eab7570 benchmark: memory (codspeed) (Sheraff)
3d31b79 fix codspeed github action (Sheraff)
4c6d3af unified codspeed job (Sheraff)
025d4b4 im dumb (Sheraff)
537776c cleaner run command in github action (Sheraff)
97d1801 better response stream draining (Sheraff)
b01f4de cleanup streaming-peak scenario: we only care about server memory, no… (Sheraff)
fa3cb8a use platformatic/flame for local memory bench (Sheraff)
9bd39c0 direct pprof calls for cleaner output (Sheraff)
3e63fd2 review (Sheraff)
073a036 increase iteration count for low benches (Sheraff)
d8a4e76 flame run splits by benches, like codspeed (Sheraff)
d1ef03d ci: apply automated fixes (autofix-ci[bot])
ca04026 Merge branch 'main' into bench-codspeed-memory (Sheraff)
2ff1a0d QA (Sheraff)
92c2688 build outside of codspeed instrumentation (Sheraff)
fab18e6 ci: apply automated fixes (autofix-ci[bot])
6a4c1f6 nitpick (Sheraff)
a30c37e fix workspace deps (Sheraff)
ed51981 Merge branch 'main' into bench-codspeed-memory (Sheraff)
cbfde6d solid & vue (Sheraff)
ab7fb18 ci: apply automated fixes (autofix-ci[bot])
8d25db9 simplify dual architecture (Sheraff)
b15de72 cleanup (Sheraff)
c922c2a ci: apply automated fixes (autofix-ci[bot])
e4fbeb1 fix vue benchmark (Sheraff)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,186 @@ | ||
| # Memory Benchmarks | ||
|
|
||
| Dedicated memory benchmarks for TanStack Router / Start, measured with the | ||
| CodSpeed **memory instrument** (`mode: memory` in | ||
| `.github/workflows/client-nav-benchmarks.yml`). Two separate benchmarks: | ||
|
|
||
| - `server/` (`@benchmarks/memory-server`) — React/Solid/Vue Start apps, requests against | ||
| the built server handler (`handler.fetch`), Node environment. | ||
| - `client/` (`@benchmarks/memory-client`) — router-only React/Solid/Vue apps in jsdom. | ||
|
|
||
| These deliberately do **not** reuse the CPU scenarios in `benchmarks/ssr` and | ||
| `benchmarks/client-nav`: memory benches need their own iteration counts, | ||
| payload sizes, and route shapes, and tuning those must never shift the CPU | ||
| baselines. Each scenario keeps a framework level (`react/`, `solid/`, `vue/`) | ||
| so framework ports can be added without renames. | ||
|
|
||
| ## Layout | ||
|
|
||
| ```text | ||
| benchmarks/memory/<server|client>/ | ||
| package.json Nx targets: build:<framework>, test:perf:<framework>, test:flame:<framework>, test:types | ||
| bench-utils.ts memoryBenchOptions, seeded LCG (+ sequential request loop on the server side) | ||
| vitest.<framework>.config.ts aggregates scenarios/*/<framework>/vite.config.ts | ||
| scenarios/<scenario>/<framework>/ | ||
| one isolated app per scenario + setup.ts + memory.bench.ts + memory.flame.ts | ||
| ``` | ||
|
|
||
| One app per scenario; apps and bench names are stable once landed (CodSpeed | ||
| continuity). Never grow an existing scenario for a new case — add a scenario. | ||
| `setup.ts` imports the built app and exports the concrete workload; | ||
| `memory.bench.ts` registers `bench(...)` directly, and `memory.flame.ts` runs the | ||
| same workload through the Flame profiler. | ||
|
|
||
| ## How the memory instrument executes a bench | ||
|
|
||
| - The bench function is warmed up, then **measured exactly once**, starting | ||
| after a forced GC. Under plain `vitest bench` the suites only smoke-test: | ||
| timing output is meaningless; real numbers come from CodSpeed. | ||
| - Under CodSpeed the bench fn runs several warmup invocations plus the | ||
| measured one **on the same mount**, so bench fns must be idempotent and | ||
| module-level counters/LCGs are used where ids must never repeat across | ||
| invocations. | ||
| - Plain `vitest bench` never runs suite hooks (`beforeAll`/`afterAll`) and | ||
| only honors tinybench's `setup`/`teardown` options; the CodSpeed runner | ||
| does the exact opposite. Client benches therefore register **both** — in | ||
| any given mode exactly one pair runs. | ||
| - The process runs with V8 determinism flags (predictable GC schedule, | ||
| `--no-opt`). Never call `global.gc()` manually. Because of `--no-opt`, | ||
| allocation counts overstate production; numbers are for regression | ||
| tracking, not absolute claims. | ||
| - Keep each bench under **~1.5M allocations** (instrument overhead grows past | ||
| 2M); this is the main constraint when tuning iteration counts. | ||
|
|
||
| ## Bench shapes and signals | ||
|
|
||
| - **Churn (leak detector):** N sequential iterations at steady state. If one | ||
| iteration leaks L bytes, peak grows by ~N·L; healthy builds show a flat | ||
| timeline floor independent of N. Tuning check: doubling N must leave peak | ||
| roughly unchanged. | ||
| - **Peak (footprint):** one (or very few) large operations; peak memory | ||
| scaling with the workload is the signal. | ||
|
|
||
| ## Scenarios | ||
|
|
||
| ### Server | ||
|
|
||
| | Scenario | Shape | Guards against | | ||
| | ----------------------- | ----- | --------------------------------------------------------- | | ||
| | `request-churn` | churn | cross-request retention in document SSR (unique URLs) | | ||
| | `server-fn-churn` | churn | retention in the server-function RPC path | | ||
| | `error-paths` | churn | redirect/notFound/error/unmatched paths pinning contexts | | ||
| | `aborted-requests` | churn | dangling streams/listeners after mid-stream client aborts | | ||
| | `peak-large-page` | peak | per-request peak scaling with page size | | ||
| | `streaming-peak` | peak | streaming buffering O(document) instead of O(chunk) | | ||
| | `serialization-payload` | peak | double-buffering / string-copy blowups in dehydration | | ||
|
|
||
| ### Client | ||
|
|
||
| | Scenario | Shape | Guards against | | ||
| | ------------------------- | ----- | -------------------------------------------------------- | | ||
| | `navigation-churn` | churn | per-navigation retention at steady state | | ||
| | `unique-location-churn` | churn | unbounded href/search-keyed caches (never-repeated URLs) | | ||
| | `preload-churn` | churn | preload-cache eviction not releasing memory | | ||
| | `loader-data-retention` | churn | departed routes' loader data staying pinned (gcTime 0) | | ||
| | `mount-unmount` | churn | router instances not collectable after dispose | | ||
| | `interrupted-navigations` | churn | superseded navigations retaining closures/contexts | | ||
|
|
||
| ## Conventions | ||
|
|
||
| - Strictly sequential work: at most one request/navigation in flight; each | ||
| server response is fully consumed before the next request. Pairing a single | ||
| navigation with its render signal via `Promise.all([navigate, rendered])` | ||
| is fine — never overlap distinct work items. | ||
| - Randomness only via the seeded LCG in `bench-utils.ts`; no `Math.random`, | ||
| `Date.now`, or timers — with one documented exception: `streaming-peak`'s | ||
| deferred sections use small `setTimeout` delays so deferred stream chunks are | ||
| observable across framework renderers. | ||
| - Sanity assertions run once at module load and throw on wrong | ||
| status/markers, so a bench can never silently measure the wrong thing. | ||
| - Server requests follow `benchmarks/ssr` conventions: document GETs send | ||
| `accept: text/html`, server-fn requests send `sec-fetch-site: same-origin` | ||
| with bodies precomputed at module level. | ||
| - Client apps export `mountTestApp` from `app.tsx`; benches import the built | ||
| `dist/app.js`; navigations use `replace: true`; unmount does full teardown | ||
| (framework root, `__TSR_ROUTER__`, `history.destroy()`); large loader payloads | ||
| are never rendered into the DOM. | ||
| - `NODE_ENV=production` everywhere (the Nx targets set it). | ||
|
|
||
| ## Run | ||
|
|
||
| Smoke-test the CodSpeed/Vitest benchmark entrypoints and typecheck the | ||
| scenarios: | ||
|
|
||
| ```bash | ||
| pnpm nx run @benchmarks/memory-server:test:perf:react --outputStyle=stream --skipRemoteCache | ||
| pnpm nx run @benchmarks/memory-server:test:perf:solid --outputStyle=stream --skipRemoteCache | ||
| pnpm nx run @benchmarks/memory-server:test:perf:vue --outputStyle=stream --skipRemoteCache | ||
| pnpm nx run @benchmarks/memory-client:test:perf:react --outputStyle=stream --skipRemoteCache | ||
| pnpm nx run @benchmarks/memory-client:test:perf:solid --outputStyle=stream --skipRemoteCache | ||
| pnpm nx run @benchmarks/memory-client:test:perf:vue --outputStyle=stream --skipRemoteCache | ||
| pnpm nx run @benchmarks/memory-server:test:types --outputStyle=stream --skipRemoteCache | ||
| pnpm nx run @benchmarks/memory-client:test:types --outputStyle=stream --skipRemoteCache | ||
| ``` | ||
|
|
||
| Local attribution profiling, without CodSpeed CLI/login/sudo/upload, uses | ||
| `@datadog/pprof` heap sampling and `@platformatic/flame` only to render the | ||
| captured pprof files as HTML/Markdown. These targets rebuild the scenarios with | ||
| `--sourcemap true` so the generated profile reports can point back to source; | ||
| the normal CodSpeed benchmark builds are unchanged. Local aggregate scripts run | ||
| with `--parallel=1`, and scenario `test:flame` targets opt out of Nx parallelism | ||
| so profiling workloads do not overlap and bias each other. The Vitest aggregate | ||
| configs also set `fileParallelism: false` so benchmark files run sequentially | ||
| inside each `test:perf:<framework>` target. | ||
|
|
||
| ```bash | ||
| pnpm benchmark:memory:server:flame | ||
| pnpm benchmark:memory:client:flame | ||
| pnpm benchmark:memory:server:flame:solid | ||
| pnpm benchmark:memory:client:flame:solid | ||
| pnpm benchmark:memory:server:flame:vue | ||
| pnpm benchmark:memory:client:flame:vue | ||
| ``` | ||
|
|
||
| To profile one scenario, run its `test:flame` target directly: | ||
|
|
||
| ```bash | ||
| pnpm nx run @benchmarks/memory-server-request-churn-react:test:flame --outputStyle=stream --skipRemoteCache | ||
| pnpm nx run @benchmarks/memory-client-navigation-churn-react:test:flame --outputStyle=stream --skipRemoteCache | ||
| ``` | ||
|
|
||
| Flame writes reports under the scenario's ignored `.profiles/<timestamp>/` | ||
| directory, including `heap-profile-*.html` and `heap-profile-*.md`. The | ||
| `memory.flame.ts` entrypoints run the same workload shape as `memory.bench.ts` | ||
| but manually start profiling after sanity/setup work and stop it after the | ||
| measured workload. Treat these profiles as diagnostic heap-sampling attribution; | ||
| they are not CodSpeed memory metrics such as peak memory, allocated bytes, or | ||
| allocation counts. The heap sampler is stopped before profile conversion and | ||
| Flame report generation, so Flame/pprof report-generation work should not appear | ||
| as part of the captured workload. Flame runs do not force GC before profiling; | ||
| doing so would perturb the workload and still would not make heap sampling | ||
| equivalent to CodSpeed memory metrics. | ||
|
|
||
| Clean local Flame profile output with: | ||
|
|
||
| ```bash | ||
| pnpm --filter @benchmarks/memory-server clean:profiles | ||
| pnpm --filter @benchmarks/memory-client clean:profiles | ||
| ``` | ||
|
|
||
| Client memory benches are useful for regression tracking of router/React/jsdom | ||
| integration behavior, especially retained route/cache data. They are not pure | ||
| browser-memory measurements, and local Flame attribution can include jsdom, | ||
| React DOM, and profiler shutdown frames. | ||
|
|
||
| Real memory measurement, locally (requires the CodSpeed CLI, `codspeed setup` | ||
| once to install the memory executor, and sudo; **uploads results to the | ||
| CodSpeed dashboard** — local runs do not affect PR baselines): | ||
|
|
||
| ```bash | ||
| WITH_INSTRUMENTATION=1 codspeed run --mode memory -- pnpm nx run @benchmarks/memory-server:test:perf:react | ||
| WITH_INSTRUMENTATION=1 codspeed run --mode memory -- pnpm nx run @benchmarks/memory-server:test:perf:solid | ||
| WITH_INSTRUMENTATION=1 codspeed run --mode memory -- pnpm nx run @benchmarks/memory-server:test:perf:vue | ||
| WITH_INSTRUMENTATION=1 codspeed run --mode memory -- pnpm nx run @benchmarks/memory-client:test:perf:react | ||
| WITH_INSTRUMENTATION=1 codspeed run --mode memory -- pnpm nx run @benchmarks/memory-client:test:perf:solid | ||
| WITH_INSTRUMENTATION=1 codspeed run --mode memory -- pnpm nx run @benchmarks/memory-client:test:perf:vue | ||
| ``` |
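The churn tuning check from the README's "Bench shapes and signals" section (peak grows by about N·L when one iteration leaks L bytes, so doubling N must leave a healthy peak roughly unchanged) can be sketched as a small arithmetic model. Everything here is illustrative: `modelPeakBytes`, `looksFlat`, the byte values, and the 10% tolerance are assumptions, not part of this PR, and real peaks come from CodSpeed rather than from a formula.

```typescript
// Hypothetical model: peak = steady-state floor + N * bytes leaked per iteration.
function modelPeakBytes(
  floorBytes: number,
  iterations: number,
  leakPerIteration: number,
): number {
  return floorBytes + iterations * leakPerIteration
}

// Tuning check: doubling N must leave peak roughly unchanged.
// The 10% tolerance is an assumption chosen for this sketch.
function looksFlat(peakAtN: number, peakAt2N: number, tolerance = 0.1): boolean {
  return Math.abs(peakAt2N - peakAtN) / peakAtN <= tolerance
}

const floor = 50_000_000 // illustrative 50 MB steady-state floor

// Healthy build: nothing retained per iteration, so N and 2N give the same peak.
const healthy = looksFlat(
  modelPeakBytes(floor, 1_000, 0),
  modelPeakBytes(floor, 2_000, 0),
)

// Leaking build: 20 kB retained per iteration, so peak moves from 70 MB to 90 MB.
const leaking = looksFlat(
  modelPeakBytes(floor, 1_000, 20_000),
  modelPeakBytes(floor, 2_000, 20_000),
)

console.log({ healthy, leaking })
```

In this model the healthy case passes the check and the leaking case fails it, which is the signal a churn bench is meant to surface.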
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| export const memoryBenchOptions = { | ||
| iterations: 1, | ||
| warmupIterations: 1, | ||
| time: 0, | ||
| warmupTime: 0, | ||
| throws: true, | ||
| } | ||
|
|
||
| export function createDeterministicRandom(seed: number) { | ||
| let state = seed >>> 0 | ||
|
|
||
| return () => { | ||
| state = (state * 1664525 + 1013904223) >>> 0 | ||
| return state / 0x100000000 | ||
| } | ||
| } | ||
|
|
||
| export function randomSegment(random: () => number) { | ||
| return Math.floor(random() * 1_000_000_000).toString(36) | ||
| } |
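A minimal sketch of how the helpers above might be used with module-level state, as the README describes for ids that must not restart across warmup and measured invocations. The two helper functions are copied from the diff; `nextUrls`, the `/items/` path, and the seed `42` are hypothetical and only illustrate the pattern.

```typescript
// Copied from bench-utils.ts in this diff.
function createDeterministicRandom(seed: number) {
  let state = seed >>> 0

  return () => {
    state = (state * 1664525 + 1013904223) >>> 0
    return state / 0x100000000
  }
}

function randomSegment(random: () => number) {
  return Math.floor(random() * 1_000_000_000).toString(36)
}

// Module-level generator: its state survives across bench invocations,
// so each call continues the sequence instead of restarting it.
const random = createDeterministicRandom(42) // arbitrary seed for this sketch

// Hypothetical workload helper that builds URLs from the shared generator.
function nextUrls(count: number): string[] {
  return Array.from({ length: count }, () => `/items/${randomSegment(random)}`)
}

const warmupUrls = nextUrls(3) // e.g. a warmup invocation
const measuredUrls = nextUrls(3) // e.g. the measured invocation

// A fresh generator with the same seed replays the exact same sequence,
// which is what keeps runs comparable.
const replay = createDeterministicRandom(42)
const replayedUrls = Array.from(
  { length: 6 },
  () => `/items/${randomSegment(replay)}`,
)

console.log(warmupUrls, measuredUrls, replayedUrls)
```

The six replayed URLs match the warmup URLs followed by the measured URLs, since the module-level generator carried its state from one call to the next.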
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| export interface ClientMemoryWorkload { | ||
| name: string | ||
| before?: () => Promise<void> | void | ||
| run: () => Promise<void> | void | ||
| sanity: () => Promise<void> | void | ||
| after?: () => Promise<void> | void | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| import { profileFlameWorkload } from '../flame-control.ts' | ||
| import { window } from './jsdom.ts' | ||
| import type { ClientMemoryWorkload } from './benchmark.ts' | ||
|
|
||
| export async function runClientFlameBenchmark(workload: ClientMemoryWorkload) { | ||
| try { | ||
| await workload.sanity() | ||
| await workload.before?.() | ||
| await profileFlameWorkload(workload.run, workload.name) | ||
| } finally { | ||
| await workload.after?.() | ||
| window.close() | ||
| } | ||
| } |
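The ordering that `runClientFlameBenchmark` enforces can be illustrated with a self-contained sketch. `profileStub` is a hypothetical stand-in for `profileFlameWorkload` (whose implementation is not shown in this excerpt), `runWithLifecycle` mirrors the try/finally above without the jsdom `window.close()` call, and `makeWorkload` is invented for the demo.

```typescript
interface ClientMemoryWorkload {
  name: string
  before?: () => Promise<void> | void
  run: () => Promise<void> | void
  sanity: () => Promise<void> | void
  after?: () => Promise<void> | void
}

// Hypothetical stand-in: only records where the profiled window starts and stops.
async function profileStub(
  run: () => Promise<void> | void,
  label: string,
  log: string[],
) {
  log.push(`profile:start:${label}`)
  await run()
  log.push(`profile:stop:${label}`)
}

// Mirrors the try/finally shape of runClientFlameBenchmark.
async function runWithLifecycle(workload: ClientMemoryWorkload, log: string[]) {
  try {
    await workload.sanity()
    await workload.before?.()
    await profileStub(workload.run, workload.name, log)
  } finally {
    await workload.after?.()
  }
}

// Hypothetical workload whose hooks only log their own names.
function makeWorkload(log: string[], fail: boolean): ClientMemoryWorkload {
  return {
    name: 'demo',
    sanity: () => { log.push('sanity') },
    before: () => { log.push('before') },
    run: () => {
      log.push('run')
      if (fail) throw new Error('workload failed')
    },
    after: () => { log.push('after') },
  }
}

async function demoLifecycle() {
  const okLog: string[] = []
  await runWithLifecycle(makeWorkload(okLog, false), okLog)

  const failLog: string[] = []
  // Swallow the error here so the demo can inspect the log afterwards.
  await runWithLifecycle(makeWorkload(failLog, true), failLog).catch(() => {})

  return { okLog, failLog }
}

const lifecycle = demoLifecycle()
lifecycle.then(({ okLog, failLog }) => console.log(okLog, failLog))
```

In both logs `sanity` and `before` land outside the profiled window, and `after` still runs when the workload throws, which is the point of the `finally` block.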
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| import { JSDOM } from 'jsdom' | ||
|
|
||
| const dom = new JSDOM('<!doctype html><html><body></body></html>', { | ||
| url: 'http://localhost/', | ||
| }) | ||
|
|
||
| const { window } = dom | ||
|
|
||
| function setGlobal(name: string, value: unknown) { | ||
| Object.defineProperty(globalThis, name, { | ||
| value, | ||
| configurable: true, | ||
| writable: true, | ||
| }) | ||
| } | ||
|
|
||
| setGlobal('window', window) | ||
| setGlobal('document', window.document) | ||
| setGlobal('self', window) | ||
| setGlobal('navigator', window.navigator) | ||
| setGlobal('location', window.location) | ||
| setGlobal('history', window.history) | ||
| setGlobal('HTMLElement', window.HTMLElement) | ||
| setGlobal('Element', window.Element) | ||
| setGlobal('SVGElement', window.SVGElement) | ||
| setGlobal('DocumentFragment', window.DocumentFragment) | ||
| setGlobal('Node', window.Node) | ||
| setGlobal('MouseEvent', window.MouseEvent) | ||
| setGlobal('MutationObserver', window.MutationObserver) | ||
| setGlobal('sessionStorage', window.sessionStorage) | ||
| setGlobal('localStorage', window.localStorage) | ||
| setGlobal('getComputedStyle', window.getComputedStyle.bind(window)) | ||
|
|
||
| setGlobal( | ||
| 'requestAnimationFrame', | ||
| window.requestAnimationFrame?.bind(window) ?? | ||
| ((callback: (time: number) => void) => | ||
| setTimeout(() => callback(performance.now()), 16)), | ||
| ) | ||
|
|
||
| setGlobal( | ||
| 'cancelAnimationFrame', | ||
| window.cancelAnimationFrame?.bind(window) ?? | ||
| ((handle: number) => clearTimeout(handle)), | ||
| ) | ||
|
|
||
| const scrollTo = () => {} | ||
| window.scrollTo = scrollTo | ||
| setGlobal('scrollTo', scrollTo) | ||
|
|
||
| export { window } |
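One plausible reason for installing globals through `Object.defineProperty` rather than plain assignment is that some properties may already exist as getter-only accessors, in which case assignment does not replace them. The sketch below demonstrates that behaviour on a plain object instead of `globalThis`; `target`, `setOn`, and the `'builtin'`/`'jsdom'` values are hypothetical, and the motivation itself is an assumption rather than something stated in the PR.

```typescript
// Same property descriptor shape as setGlobal above, applied to an arbitrary object.
function setOn(target: object, key: string, value: unknown) {
  Object.defineProperty(target, key, {
    value,
    configurable: true,
    writable: true,
  })
}

// A stand-in object with a getter-only, configurable property.
const target: Record<string, unknown> = {}
Object.defineProperty(target, 'navigator', {
  get: () => 'builtin',
  configurable: true,
})

// Plain assignment cannot replace a getter-only accessor:
// it throws a TypeError in strict mode and is silently ignored otherwise.
try {
  target.navigator = 'jsdom'
} catch {
  // strict mode path
}
const afterAssignment = target.navigator

// Redefining the property works because it is configurable.
setOn(target, 'navigator', 'jsdom')
const afterDefine = target.navigator

console.log({ afterAssignment, afterDefine })
```

After the assignment attempt the property still reads `'builtin'`; after `setOn` it reads `'jsdom'`.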
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| export type Framework = 'react' | 'solid' | 'vue' | ||
|
|
||
| export type MountedApp = { | ||
| router: unknown | ||
| unmount: () => void | ||
| } | ||
|
|
||
| export type MountTestApp = (container: HTMLDivElement) => MountedApp | ||
|
|
||
| const frameworkNames = { | ||
| react: 'React', | ||
| solid: 'Solid', | ||
| vue: 'Vue', | ||
| } satisfies Record<Framework, string> | ||
|
|
||
| export function noop() {} | ||
|
|
||
| export function warnClientMemoryDevMode(framework: Framework) { | ||
| if (process.env.NODE_ENV !== 'production') { | ||
| console.warn( | ||
| `memory client benchmark is running without NODE_ENV=production; ${frameworkNames[framework]} dev overhead will dominate results.`, | ||
| ) | ||
| } | ||
| } | ||
|
|
||
| export function createBenchContainer() { | ||
| const container = document.createElement('div') | ||
| document.body.append(container) | ||
|
|
||
| return container | ||
| } | ||
|
|
||
| export function removeBenchContainer(container: HTMLDivElement | undefined) { | ||
| container?.remove() | ||
| } | ||
|
|
||
| export function nextAnimationFrame() { | ||
| return new Promise<void>((resolve) => { | ||
| requestAnimationFrame(() => resolve()) | ||
| }) | ||
| } | ||
|
|
||
| export async function drainMicrotasks() { | ||
| await Promise.resolve() | ||
| await Promise.resolve() | ||
| } |
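As a rough illustration of what `drainMicrotasks` does and does not wait for, the sketch below queues one promise callback and one zero-delay timer before draining. `drainMicrotasks` is copied from the diff; `demoDrain` and its flags are hypothetical.

```typescript
// Copied from the diff above.
async function drainMicrotasks() {
  await Promise.resolve()
  await Promise.resolve()
}

async function demoDrain() {
  let microtaskRan = false
  let timerFired = false

  // Queued as a microtask: runs before drainMicrotasks resumes.
  Promise.resolve().then(() => {
    microtaskRan = true
  })

  // Queued as a timer (macrotask): cannot run while only microtasks are drained.
  const timer = setTimeout(() => {
    timerFired = true
  }, 0)

  await drainMicrotasks()
  const snapshot = { microtaskRan, timerFired }

  clearTimeout(timer) // keep the demo from leaving a pending timer behind
  return snapshot
}

const drained = demoDrain()
drained.then((snapshot) => console.log(snapshot))
```

The snapshot shows `microtaskRan` as `true` and `timerFired` as `false`. Work scheduled through timers or the `requestAnimationFrame` fallback therefore needs a helper such as `nextAnimationFrame`, not a microtask drain.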