Description
Following the introduction of runtime scaling and flamegraph benchmarks in #117, I suggest expanding the diagnostic suite to cover two critical areas: memory usage and regression prevention.
Proposed Enhancements
1. Memory Profiling Integration
- Goal: Track peak memory consumption alongside runtime.
- Implementation: Integrate a tool like memory-profiler or tracemalloc into the existing benchmark CLI.
- Value: Helps identify if high-resolution grids ($10^7$+ nodes) require a transition from networkx to more memory-efficient data structures (e.g., adjacency matrices or sparse tensors).
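As a rough illustration of the tracemalloc option, the sketch below records peak traced memory next to wall-clock time for a single call. The build_graph function is a hypothetical stand-in for the real graph construction routine, and measure is an assumed helper name, not part of the existing benchmark CLI.

```python
import time
import tracemalloc


def build_graph(n_nodes):
    """Hypothetical stand-in for the real graph construction routine."""
    # A plain dict-of-lists path graph; the real benchmark would build a networkx graph.
    return {i: [i + 1] for i in range(n_nodes - 1)}


def measure(func, *args):
    """Return (result, elapsed_seconds, peak_bytes) for one call to func."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes since start().
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        _, elapsed, peak = measure(build_graph, n)
        print(f"{n:>8} nodes: {elapsed:.4f} s, peak {peak / 1e6:.2f} MB")
```

Note that tracemalloc only sees allocations made through Python's memory allocator and adds overhead to every allocation, so timings taken while tracing are inflated. It may be cleaner to run the memory pass separately from the runtime pass.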
2. Automated Performance Regression
- Goal: Prevent performance "drift", where new features accidentally slow down graph construction.
- Implementation:
  - Add a --check flag to the scaling script.
  - Define a "performance budget" for specific node counts (e.g., $10^5$ nodes must be processed in under 10 s).
  - Integrate this check into the CI pipeline (GitHub Actions) to flag PRs that significantly exceed the runtime budget.
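One possible shape for the --check flag is sketched below, assuming the scaling script uses argparse. BUDGETS, build_graph, time_build, and find_violations are hypothetical names for illustration; only the $10^5$ nodes / 10 s budget comes from the example above.

```python
import argparse
import sys
import time

# Hypothetical budgets: node count -> maximum allowed seconds.
# The 10**5 -> 10 s entry mirrors the example in this proposal; tune per CI runner.
BUDGETS = {10**5: 10.0}


def build_graph(n_nodes):
    """Hypothetical stand-in for the real graph construction routine."""
    return {i: [i + 1] for i in range(n_nodes - 1)}


def time_build(n_nodes):
    """Time a single graph construction in seconds."""
    start = time.perf_counter()
    build_graph(n_nodes)
    return time.perf_counter() - start


def find_violations(timings, budgets):
    """Return (n_nodes, elapsed, limit) for every timing over its budget."""
    return [
        (n, timings[n], limit)
        for n, limit in sorted(budgets.items())
        if n in timings and timings[n] > limit
    ]


def main(argv=None):
    parser = argparse.ArgumentParser(description="Graph construction scaling benchmark")
    parser.add_argument("--check", action="store_true",
                        help="exit non-zero if any runtime budget is exceeded")
    args = parser.parse_args(argv)

    timings = {n: time_build(n) for n in BUDGETS}
    for n, elapsed in timings.items():
        print(f"{n} nodes: {elapsed:.3f} s")

    if args.check:
        violations = find_violations(timings, BUDGETS)
        for n, elapsed, limit in violations:
            print(f"BUDGET EXCEEDED: {n} nodes took {elapsed:.3f} s (limit {limit} s)",
                  file=sys.stderr)
        return 1 if violations else 0
    return 0
```

Calling sys.exit(main()) from the script's entry point would turn a budget violation into a non-zero exit code, which fails a GitHub Actions step without any extra wiring. Because CI runners vary in speed, budgets probably need generous headroom to avoid flaky failures.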
3. Machine-Readable Metrics (optional)
- Goal: Enable long-term tracking of performance trends.
- Implementation: Add an --output-json or --csv flag to the benchmark scripts so results can be stored and compared across different versions of the codebase.
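A minimal sketch of the two output formats follows, assuming each benchmark run yields one record per node count. The field names (n_nodes, seconds, peak_bytes) and the helper names write_json and write_csv are assumptions for illustration, not an existing schema.

```python
import csv
import json

# Assumed record layout; the real benchmark may track different fields.
FIELDNAMES = ["n_nodes", "seconds", "peak_bytes"]


def write_json(results, path):
    """Write a list of result records to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(results, fh, indent=2)


def write_csv(results, path):
    """Write a list of result records to a CSV file with a header row."""
    # newline="" lets the csv module control line endings on all platforms.
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(results)


if __name__ == "__main__":
    # Placeholder values to show the file shape only; these are not measurements.
    example = [{"n_nodes": 1000, "seconds": 0.0, "peak_bytes": 0}]
    write_json(example, "benchmark_results.json")
    write_csv(example, "benchmark_results.csv")
```

Storing the commit hash or package version alongside each record would make cross-version comparison easier, though that is a design choice left open here.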
Context & Links
Task List
graph_creation_scaling.main.