Configuration Reference

This document describes all configuration options for benchmarkoor. The config.example.yaml file is a useful companion reference.

Overview
Environment Variables
- Config-local variables (global.env)
- Environment Variable Overrides
Configuration Merging
Global Settings
Runner Settings
Resource Limits
Post-Test RPC Calls
Builder
API Server
Examples

Overview

Benchmarkoor uses YAML configuration files to define benchmark settings, client configurations, and test sources. Configuration is loaded from one or more files specified via the --config flag.

benchmarkoor run --config config.yaml

Environment Variables

Environment variables can be used anywhere in the configuration using shell-style syntax:

Syntax	Description
`${VAR}`	Substitute the value of `VAR`
`$VAR`	Substitute the value of `VAR`
`${VAR:-default}`	Use `default` if `VAR` is unset or empty

Example:

global:
  log_level: ${LOG_LEVEL:-info}
runner:
  benchmark:
    results_dir: ${RESULTS_DIR:-./results}

Config-local variables (`global.env`)

global.env declares variables inside the config itself, available to the same ${VAR} / ${VAR:-default} substitution everywhere in the file. This keeps a config self-contained — no need to export a value before running — while preserving the single-point-of-edit indirection:

global:
  env:
    STATE_DIR: /tmp/benchmarkoor/state-actor/simple-amsterdam-compute
builder:
  state_actor:
    targets:
      - client: geth
        output_dir: ${STATE_DIR}/geth   # → /tmp/benchmarkoor/state-actor/simple-amsterdam-compute/geth

Resolution order for any ${VAR} is shell environment → global.env → inline :-default. A real environment variable of the same name therefore still wins, so global.env acts as a per-config default that CI or an ad-hoc VAR=… benchmarkoor … invocation can override. A global.env value may itself reference the shell environment (e.g. ${BASE:-/tmp}/state-actor); values do not reference one another.
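
The resolution order can be illustrated with a minimal Python sketch. This is not benchmarkoor's implementation: the `resolve` helper and its regular expression are hypothetical, only the `${VAR}` and `${VAR:-default}` forms are handled, an empty value is assumed to count as unset at every level, and a variable with no value and no default is assumed to expand to an empty string.

```python
import re

# Matches ${VAR} and ${VAR:-default}; the bare $VAR form is left out for brevity.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def resolve(text, shell_env, config_env):
    """Substitute variables: shell environment, then global.env, then :-default."""

    def lookup(match):
        name, default = match.group(1), match.group(2)
        if shell_env.get(name):      # a non-empty shell value always wins
            return shell_env[name]
        if config_env.get(name):     # then the config-local global.env value
            return config_env[name]
        # Assumption: nothing set and no inline default expands to "".
        return default if default is not None else ""

    return _PATTERN.sub(lookup, text)


config_env = {"STATE_DIR": "/tmp/benchmarkoor/state-actor/simple-amsterdam-compute"}

print(resolve("${STATE_DIR}/geth", {}, config_env))
print(resolve("${STATE_DIR}/geth", {"STATE_DIR": "/mnt/ci"}, config_env))
print(resolve("${LOG_LEVEL:-info}", {}, config_env))
```

The second call shows a shell variable of the same name overriding the `global.env` value, and the third falls through to the inline default.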

Environment Variable Overrides

Configuration values can also be overridden via environment variables with the BENCHMARKOOR_ prefix. The variable name is the config path in upper case, with dots replaced by underscores:

Config Path	Environment Variable
`global.log_level`	`BENCHMARKOOR_GLOBAL_LOG_LEVEL`
`builder.run_timeout`	`BENCHMARKOOR_BUILDER_RUN_TIMEOUT`
`runner.run_timeout`	`BENCHMARKOOR_RUNNER_RUN_TIMEOUT`
`runner.benchmark.results_dir`	`BENCHMARKOOR_RUNNER_BENCHMARK_RESULTS_DIR`
`runner.client.config.jwt`	`BENCHMARKOOR_RUNNER_CLIENT_CONFIG_JWT`
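
The mapping in the table follows a simple rule, sketched below in Python. The `override_env_name` helper is hypothetical and exists only to illustrate the naming convention.

```python
def override_env_name(config_path, prefix="BENCHMARKOOR_"):
    """Upper-case the dotted config path and replace dots with underscores."""
    return prefix + config_path.upper().replace(".", "_")


for path in ("global.log_level", "runner.benchmark.results_dir"):
    print(path, "->", override_env_name(path))
```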

Configuration Merging

Multiple configuration files can be merged by specifying --config multiple times:

benchmarkoor run --config base.yaml --config overrides.yaml

Later files override values from earlier files. This is useful for:

Separating base configuration from environment-specific overrides
Keeping secrets in a separate file
Testing different configurations without modifying the base file
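
As a rough mental model of "later files override earlier files", the following Python sketch merges two already-parsed configs. It rests on assumptions the text above does not confirm: nested mappings are merged key by key, and any other value (including a list) is replaced wholesale. The `merge` helper is hypothetical.

```python
def merge(base, override):
    """Return a new dict in which values from `override` replace those in `base`."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Assumption: nested sections are merged recursively.
            result[key] = merge(result[key], value)
        else:
            # Scalars and lists from the later file win outright.
            result[key] = value
    return result


base = {
    "global": {"log_level": "info"},
    "runner": {"run_timeout": "4h", "container_runtime": "docker"},
}
overrides = {
    "global": {"log_level": "debug"},
    "runner": {"container_runtime": "podman"},
}

merged = merge(base, overrides)
print(merged)
```

Under these assumptions, `runner.run_timeout` survives from the base file because the override file does not mention it.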

Global Settings

The global section contains application-wide settings.

global:
  log_level: info
  env:
    STATE_DIR: /tmp/benchmarkoor/state-actor/my-config
  directories:
    cachedir: ~/.cache/benchmarkoor

Options

Option	Type	Default	Description
`log_level`	string	`info`	Logging level: `debug`, `info`, `warn`, `error`
`env`	map[string]string	-	Config-local variables for `${VAR}` substitution; a per-config default that a shell env var of the same name still overrides. See Config-local variables.
`directories.cachedir`	string	`~/.cache/benchmarkoor`	On-disk cache shared by both commands: executor git/archive clones (`run`) and the EEST repo clone (`build`).

Runner Settings

The runner section contains all run-specific settings including benchmark configuration, client settings, and instance definitions.

runner:
  container_runtime: docker
  client_logs_to_stdout: true
  container_network: benchmarkoor
  cleanup_on_start: false
  run_timeout: 4h
  directories:
    tmp_datadir: /tmp/benchmarkoor
  drop_caches_path: /proc/sys/vm/drop_caches
  cpu_sysfs_path: /sys/devices/system/cpu

Options

Option	Type	Default	Description
`container_runtime`	string	`docker`	Container runtime to use: `docker` or `podman`. See Container Runtime
`client_logs_to_stdout`	bool	`false`	Stream client container logs to stdout
`container_network`	string	`benchmarkoor`	Container network name
`cleanup_on_start`	bool	`false`	Remove leftover containers/networks on startup
`run_timeout`	string	-	Global timeout for the entire run covering all instances, setup, and teardown. Uses Go duration format (e.g., `4h`, `30m`). See Runner Run Timeout
`directories.tmp_datadir`	string	system temp	Directory for temporary datadir copies. (The shared cache dir is `global.directories.cachedir`.)
`drop_caches_path`	string	`/proc/sys/vm/drop_caches`	Path to Linux drop_caches file (for containerized environments)
`cpu_sysfs_path`	string	`/sys/devices/system/cpu`	Base path for CPU sysfs files (for containerized environments where `/sys` is read-only and the host path is bind-mounted elsewhere, e.g., `/host_sys_cpu`)
`metadata.labels`	map[string]string	-	Arbitrary key-value labels attached to the run (see Metadata Labels)
`github_token`	string	-	GitHub token for downloading Actions artifacts via REST API. Not needed if `gh` CLI is installed and authenticated. Requires `actions:read` scope. Can also be set via `BENCHMARKOOR_RUNNER_GITHUB_TOKEN` env var
`live_reporting`	object	-	Stream periodic run-status reports to a benchmarkoor API instance so the UI can display in-progress runs. See Live Reporting

Live Reporting

When live_reporting is enabled, every run posts a snapshot of its current state (status, test counts, metadata labels) to the configured benchmarkoor API at a jittered interval. The API stores these in a separate live_runs table and the UI merges them into the runs view as ephemeral rows. Once the on-disk indexer picks up the same run from storage, the live entry is removed automatically.

runner:
  live_reporting:
    enabled: true
    endpoint: https://benchmarkoor.example.com
    token: my-shared-secret
    discovery_path: my-host/benchmarks  # must match a discovery path on the API side
    interval: 1m
    jitter_fraction: 0.2
    timeout: 10s
    logs_enabled: true     # default true; set false to disable the log streamer
    logs_interval: 200ms   # file-tail push cadence while streaming

Option	Type	Default	Description
`enabled`	bool	`false`	Enable live reporting
`endpoint`	string	-	Base URL of the benchmarkoor API (no trailing path), e.g. `https://api.example.com`
`token`	string	-	Shared bearer token; must match `api.ingest.token` on the API side
`discovery_path`	string	-	Discovery path the runner's results will be published under. Used as part of the unique key on the API
`interval`	string	`1m`	Base reporting interval (Go duration). Each tick adds random jitter
`jitter_fraction`	float	`0.2`	Random jitter as a fraction of `interval`. `0` uses the default; negative disables jitter entirely
`timeout`	string	`10s`	Per-request HTTP timeout
`logs_enabled`	bool	`true`	Enable live log streaming. When enabled, the runner opens a WebSocket to the API for the lifetime of the run. Log bytes only flow while at least one UI client has the log panel open — zero traffic otherwise
`logs_interval`	string	`200ms`	How often the runner reads new bytes from `benchmarkoor.log` and pushes them over the WebSocket while streaming is active. Lower values feel smoother; the overhead is negligible since empty ticks don't send a message

Reports are best-effort: HTTP failures are logged at WARN and dropped. The next tick will retry with the latest snapshot. On Stop(), the runner sends one final synchronous report so the terminal status reaches the API.
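
The interaction between `interval` and `jitter_fraction` can be sketched as follows. The exact jitter distribution is not stated above, so this Python sketch assumes the jitter is drawn uniformly from zero up to `interval * jitter_fraction` and added to the base interval. The `next_delay` helper is hypothetical.

```python
import random

DEFAULT_JITTER_FRACTION = 0.2


def next_delay(interval_seconds, jitter_fraction, rng=random):
    """Return the delay before the next report, in seconds."""
    if jitter_fraction == 0:
        jitter_fraction = DEFAULT_JITTER_FRACTION  # 0 means "use the default"
    if jitter_fraction < 0:
        return interval_seconds                    # negative disables jitter
    # Assumption: uniform additive jitter in [0, interval * fraction].
    return interval_seconds + rng.uniform(0, interval_seconds * jitter_fraction)


print(next_delay(60, 0.2))   # somewhere between 60 and 72 under this assumption
print(next_delay(60, -1))    # exactly 60: jitter disabled
```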

Container Runtime

Benchmarkoor supports both Docker and Podman as container runtimes. The runtime is selected via the container_runtime field.

Value	Description
`docker`	Use Docker (default)
`podman`	Use Podman. Required for `container-checkpoint-restore` rollback strategy. Connects via `/run/podman/podman.sock`

When using Podman, ensure the Podman socket is active:

sudo systemctl start podman.socket

Metadata Labels

The runner.client.config.metadata.labels field attaches arbitrary key-value pairs to benchmark runs. Labels are included in each run's output config.json and can be used for filtering and organization (e.g., in the UI or CI pipelines).

Labels can be set at the client level (defaults for all instances) and overridden per instance. Instance-level labels are merged with client-level labels, with instance values taking precedence on conflict.

runner:
  client:
    config:
      metadata:
        labels:
          env: production
          team: platform
  instances:
    - id: geth-latest
      client: geth
      metadata:
        labels:
          env: staging      # overrides client-level "env"
          variant: snap-sync  # additional instance-specific label

In this example, geth-latest runs will have labels env=staging, team=platform, and variant=snap-sync.
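
The merge rule for the example above can be expressed in a few lines of Python. The `effective_labels` helper is hypothetical and covers only the client-level and instance-level labels, not the CLI flag.

```python
def effective_labels(client_labels, instance_labels):
    """Start from client-level labels, then let instance-level labels override."""
    merged = dict(client_labels)
    merged.update(instance_labels)
    return merged


client = {"env": "production", "team": "platform"}
instance = {"env": "staging", "variant": "snap-sync"}

print(effective_labels(client, instance))
# {'env': 'staging', 'team': 'platform', 'variant': 'snap-sync'}
```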

Labels can also be set (or overridden) at the client level via the CLI flag --metadata.label:

benchmarkoor run --config config.yaml \
  --metadata.label=env=production \
  --metadata.label=team=platform

When the same key is set in both the config file and the CLI, the CLI value wins.

Runner Run Timeout

The runner.run_timeout option sets a global timeout for the entire benchmark run. Unlike the per-instance runner.client.config.run_timeout, which applies only to individual instance execution, this timeout caps everything (all instances, setup, and teardown), starting from when the run begins.

runner:
  run_timeout: 4h

When the timeout is reached, the run context is cancelled and no further instances will be started. Per-instance S3 uploads use an independent context and will still complete. Results collected before the timeout are preserved on disk.

Benchmark Settings

The runner.benchmark section configures test execution and results output.

runner:
  benchmark:
    results_dir: ./results
    results_owner: "1000:1000"
    system_resource_collection_enabled: true
    generate_results_index: true
    generate_suite_stats: true
    tests:
      filter: "erc20"
      source:
        git:
          repo: https://github.com/example/benchmarks.git
          version: main

Options

Option	Type	Default	Description
`results_dir`	string	`./results`	Directory for benchmark results
`results_owner`	string	-	Set ownership (user:group) for results files. Useful when running as root
`skip_test_run`	bool	`false`	Skip test execution; only run post-run operations (index/stats generation)
`system_resource_collection_enabled`	bool	`true`	Enable CPU/memory/disk metrics collection via cgroups/Docker Stats API
`generate_results_index`	bool	`false`	Generate `index.json` aggregating all run metadata
`generate_results_index_method`	string	`local`	Method for index generation: `local` (filesystem) or `s3` (read runs from S3, upload index back). Requires `results_upload.s3` when set to `s3`
`generate_suite_stats`	bool	`false`	Generate `stats.json` per suite for UI heatmaps
`generate_suite_stats_method`	string	`local`	Method for suite stats generation: `local` (filesystem) or `s3` (read runs from S3, upload stats back). Requires `results_upload.s3` when set to `s3`
`tests.filter`	string	-	Run only tests whose name (or file path) matches this pattern. Plain values match by substring; values prefixed with `regex:` match the trailing expression as a Go regular expression. See Test Filter
`tests.metadata.labels`	map[string]string	-	Arbitrary key-value labels for the test suite (see Suite Metadata Labels)
`tests.source`	object	-	Test source configuration (see below)

Suite Metadata Labels

The runner.benchmark.tests.metadata.labels field attaches arbitrary key-value pairs to a test suite. Labels are written to the suite's summary.json and displayed in the UI.

The special name label is used as the display name for the suite throughout the UI (breadcrumbs, tables, detail pages) instead of the suite hash.

runner:
  benchmark:
    tests:
      metadata:
        labels:
          name: "EIP-7934 BN128 Benchmarks"
          category: precompile
      source:
        # ...

Note: Labels do not affect the suite hash. The hash is computed from test file contents only, so changing labels does not create a new suite.

Test Sources

Tests can be loaded from a local directory, a git repository, an archive file, or EEST (Ethereum Execution Spec Tests) fixtures. Only one source type can be configured.

Local Source

tests:
  source:
    local:
      base_dir: ./benchmark-tests
      pre_run_steps:
        - "warmup/*.txt"
      steps:
        setup:
          - "tests/setup/*.txt"
        test:
          - "tests/test/*.txt"
        cleanup:
          - "tests/cleanup/*.txt"

Option	Type	Required	Description
`base_dir`	string	Yes	Path to the local test directory
`pre_run_steps`	[]string	No	Glob patterns for steps executed once before all tests
`steps.setup`	[]string	No	Glob patterns for setup phase files
`steps.test`	[]string	No	Glob patterns for test phase files
`steps.cleanup`	[]string	No	Glob patterns for cleanup phase files

Git Source

tests:
  source:
    git:
      repo: https://github.com/example/gas-benchmarks.git
      version: main
      pre_run_steps:
        - "funding/*.txt"
      steps:
        setup:
          - "tests/setup/*.txt"
        test:
          - "tests/test/*.txt"
        cleanup:
          - "tests/cleanup/*.txt"

Option	Type	Required	Description
`repo`	string	Yes	Git repository URL
`version`	string	Yes	Branch name, tag, or commit hash
`pre_run_steps`	[]string	No	Glob patterns for steps executed once before all tests
`steps.setup`	[]string	No	Glob patterns for setup phase files
`steps.test`	[]string	No	Glob patterns for test phase files
`steps.cleanup`	[]string	No	Glob patterns for cleanup phase files

Archive Source

Tests can be loaded from a ZIP or tar.gz archive file, either from a local path or a URL (including GitHub Actions artifacts).

tests:
  source:
    archive:
      file: https://github.com/NethermindEth/gas-benchmarks/actions/runs/23847558369/artifacts/6222084759
      pre_run_steps:
        - "perf-devnet-3/gas-bump.txt"
        - "perf-devnet-3/funding.txt"
      steps:
        setup:
          - "perf-devnet-3/setup/*.txt"
        test:
          - "perf-devnet-3/testing/*.txt"
  # Optional: External opcode metadata for the test suite.
  # A JSON file mapping test names to opcode counts.
  # Can be a local path or URL.
  opcode_source:
    file: opcodes_tracing.json

Option	Type	Required	Description
`file`	string	One of `file`/`parts`	Local path or URL to a ZIP or tar.gz archive. GitHub Actions artifact URLs are auto-converted to API endpoints
`parts`	[]string	One of `file`/`parts`	Ordered list of local paths or URLs to concatenate into the final archive. Useful when the archive is split because of per-asset size limits. Mutually exclusive with `file`
`pre_run_steps`	[]string	No	Glob patterns for steps executed once before all tests
`steps.setup`	[]string	No	Glob patterns for setup phase files
`steps.test`	[]string	No	Glob patterns for test phase files
`steps.cleanup`	[]string	No	Glob patterns for cleanup phase files

Multi-part archives: when an archive is too large for a single asset upload, parts accepts an ordered list of URLs or local paths. All parts are downloaded (with caching) and concatenated into a single file before extraction:

tests:
  source:
    archive:
      parts:
        - https://github.com/org/repo/releases/download/v1.0.0/tests.tar.gz.00.part
        - https://github.com/org/repo/releases/download/v1.0.0/tests.tar.gz.01.part
      steps:
        test:
          - "testing/*.txt"
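
The concatenation step described above amounts to appending the parts byte for byte in the listed order. A minimal Python sketch, assuming the parts are already on disk (downloading and caching are out of scope here, and `concatenate_parts` is a hypothetical helper):

```python
import shutil


def concatenate_parts(part_paths, output_path):
    """Append each part, in the given order, to a single output file."""
    with open(output_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # streams, so large parts are fine
    return output_path
```

Order matters: swapping two entries in `parts` would produce a corrupt archive, which is why the option is documented as an ordered list.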

Opcode Source

Optional external opcode metadata can be configured alongside the test source. Two modes are supported.

Direct JSON file — file is a local path or URL to the JSON file:

runner:
  benchmark:
    tests:
      opcode_source:
        file: opcodes_tracing.json  # Local path or URL to a JSON file

Archive mode — archive is a .zip / .tar.gz (or a GitHub Actions artifact URL) that contains the JSON file; file is the filename to look up inside the extracted archive:

runner:
  benchmark:
    tests:
      opcode_source:
        archive: https://github.com/NethermindEth/gas-benchmarks/actions/runs/24460911828/artifacts/6456466898
        file: opcodes_tracing.json  # Filename inside the archive

archive can also be a plain URL to a .zip / .tar.gz, or a local path to one. When archive is set, file is interpreted as a filename inside the extracted tree (matched by basename, so nested folders are walked automatically).

Option	Type	Required	Description
`file`	string	Yes	When `archive` is unset: local path or URL to the JSON file. When `archive` is set: filename to look up inside the extracted archive
`archive`	string	No	Optional local path or URL to a `.zip` / `.tar.gz` / GitHub Actions artifact containing the opcode JSON file. When set, `file` names the entry inside the archive
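
Matching `file` by basename inside the extracted tree can be pictured as a directory walk. The Python sketch below is illustrative only: `find_in_tree` is a hypothetical helper, and returning the first match encountered is an assumption, since the behavior with duplicate filenames is not described here.

```python
import os


def find_in_tree(root, filename):
    """Return the path of the first file under `root` whose basename is `filename`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            return os.path.join(dirpath, filename)
    return None  # nothing in the extracted tree has that name
```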

GitHub Actions artifacts: Browser URLs like https://github.com/{owner}/{repo}/actions/runs/{run_id}/artifacts/{artifact_id} are automatically converted to the GitHub API download endpoint. A GitHub token is required for artifact downloads (set via runner.github_token or BENCHMARKOOR_RUNNER_GITHUB_TOKEN).
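
To make the URL conversion concrete, here is a Python sketch that maps a browser artifact URL to GitHub's REST "download an artifact" endpoint (`/repos/{owner}/{repo}/actions/artifacts/{artifact_id}/zip`). Whether benchmarkoor targets exactly this endpoint is an assumption, and `to_api_download_url` is a hypothetical helper.

```python
import re

_ARTIFACT_URL = re.compile(
    r"^https://github\.com/([^/]+)/([^/]+)/actions/runs/(\d+)/artifacts/(\d+)/?$"
)


def to_api_download_url(url):
    """Convert a browser artifact URL to the REST download URL; pass others through."""
    match = _ARTIFACT_URL.match(url)
    if match is None:
        return url  # not an Actions artifact URL, leave it untouched
    owner, repo, _run_id, artifact_id = match.groups()
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/artifacts/{artifact_id}/zip"
    )


print(to_api_download_url(
    "https://github.com/NethermindEth/gas-benchmarks/actions/runs/24460911828/artifacts/6456466898"
))
```

The run ID is not part of the REST path, which is why the sketch discards it.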

Archive extraction: ZIP archives are extracted and any inner tarballs (common in GitHub Actions artifacts) are automatically extracted as well. Both direct-file and archive downloads are cache-validated on each run via HTTP ETag / Last-Modified — the archive (and its extraction) is refreshed automatically when the origin changes.

EEST Fixtures Source

EEST (Ethereum Execution Spec Tests) fixtures can be loaded from GitHub releases or GitHub Actions artifacts. This source type downloads fixtures from ethereum/execution-spec-tests and converts them to Engine API calls automatically.

From GitHub Releases

tests:
  source:
    eest_fixtures:
      github_repo: ethereum/execution-specs
      github_release: tests-benchmark@v0.0.9
      fixtures_subdir: fixtures/blockchain_tests_engine_x

Option	Type	Required	Default	Description
`github_repo`	string	Yes	-	GitHub repository (e.g., `ethereum/execution-specs`)
`github_release`	string	Yes*	-	Release tag (e.g., `tests-benchmark@v0.0.9`)
`fixtures_subdir`	string	No	`fixtures/blockchain_tests_engine_x`	Subdirectory within the fixtures tarball to search
`fixtures_url`	string	No	Auto-generated	Override URL for fixtures tarball
`genesis_url`	string	No	Auto-generated	Override URL for genesis tarball

*Either github_release or fixtures_artifact_name is required.

From GitHub Actions Artifacts

As an alternative to releases, you can download fixtures directly from GitHub Actions workflow artifacts. This is useful for testing with fixtures from CI builds before they're released.

Requirements: Either the gh CLI must be installed and authenticated with GitHub, or runner.github_token must be set (a token with actions:read scope).

tests:
  source:
    eest_fixtures:
      github_repo: ethereum/execution-spec-tests
      fixtures_artifact_name: fixtures_benchmark_fast
      genesis_artifact_name: benchmark_genesis
      # Optional: specify a specific workflow run ID (uses latest if not specified)
      # fixtures_artifact_run_id: "12345678901"
      # genesis_artifact_run_id: "12345678901"

Option	Type	Required	Default	Description
`github_repo`	string	Yes	-	GitHub repository (e.g., `ethereum/execution-spec-tests`)
`fixtures_artifact_name`	string	Yes*	-	Name of the fixtures artifact to download
`genesis_artifact_name`	string	No	`benchmark_genesis`	Name of the genesis artifact to download
`fixtures_artifact_run_id`	string	No	Latest	Specific workflow run ID for fixtures artifact
`genesis_artifact_run_id`	string	No	Latest	Specific workflow run ID for genesis artifact
`fixtures_subdir`	string	No	`fixtures/blockchain_tests_engine_x`	Subdirectory within the fixtures to search

*Either github_release, fixtures_artifact_name, local_fixtures_dir/local_genesis_dir, or local_fixtures_tarball/local_genesis_tarball is required. Only one mode can be used at a time.

From Local Directories

For local development with already-extracted EEST fixtures (e.g., built locally from the execution-spec-tests repository), you can point directly at the directories. No downloading or caching is performed.

tests:
  source:
    eest_fixtures:
      local_fixtures_dir: /home/user/eest-output/fixtures
      local_genesis_dir: /home/user/eest-output/genesis
      # Optional: Override the subdirectory within fixtures to search.
      # fixtures_subdir: fixtures/blockchain_tests_engine_x  # default

Option	Type	Required	Default	Description
`local_fixtures_dir`	string	Yes*	-	Path to extracted fixtures directory
`local_genesis_dir`	string	Yes*	-	Path to extracted genesis directory
`fixtures_subdir`	string	No	`fixtures/blockchain_tests_engine_x`	Subdirectory within the fixtures directory to search

*Both local_fixtures_dir and local_genesis_dir must be set together. Both paths must exist and be directories.

github_repo is not required for local modes.

From Local Tarballs

If you have locally-built .tar.gz tarballs (e.g., fixtures_benchmark.tar.gz and benchmark_genesis.tar.gz), you can use them directly. The tarballs are extracted to a cache directory keyed by a hash of the tarball paths, so re-extraction is skipped on subsequent runs.

tests:
  source:
    eest_fixtures:
      local_fixtures_tarball: /home/user/eest-output/fixtures_benchmark.tar.gz
      local_genesis_tarball: /home/user/eest-output/benchmark_genesis.tar.gz
      # Optional: Override the subdirectory within fixtures to search.
      # fixtures_subdir: fixtures/blockchain_tests_engine_x  # default

Option	Type	Required	Default	Description
`local_fixtures_tarball`	string	Yes*	-	Path to fixtures `.tar.gz` file
`local_genesis_tarball`	string	Yes*	-	Path to genesis `.tar.gz` file
`fixtures_subdir`	string	No	`fixtures/blockchain_tests_engine_x`	Subdirectory within the extracted fixtures to search

*Both local_fixtures_tarball and local_genesis_tarball must be set together. Both paths must exist and be regular files.

github_repo is not required for local modes.

Key features:

Automatically downloads and caches fixtures from GitHub releases or artifacts
Supports local directories and local .tar.gz tarballs for offline/development use
Converts EEST fixture format to engine_newPayloadV{1-4} + engine_forkchoiceUpdatedV{1,3} calls
Only includes fixtures with fixture-format: blockchain_test_engine_x
Auto-resolves genesis files per client type from the release/artifact/local source

Genesis file resolution:

When using EEST fixtures, genesis files are automatically resolved based on client type. You don't need to configure runner.client.config.genesis unless you want to override the defaults.

Client	Genesis Path
geth, erigon, reth, nimbus	`go-ethereum/genesis.json`
nethermind	`nethermind/chainspec.json`
besu	`besu/genesis.json`
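
The table can be read as a lookup keyed by client type, with `runner.client.config.genesis` taking precedence when set. A small Python sketch of that reading follows. `resolve_genesis` is a hypothetical helper, and treating the override as a simple per-client replacement is an assumption.

```python
DEFAULT_GENESIS_PATHS = {
    "geth": "go-ethereum/genesis.json",
    "erigon": "go-ethereum/genesis.json",
    "reth": "go-ethereum/genesis.json",
    "nimbus": "go-ethereum/genesis.json",
    "nethermind": "nethermind/chainspec.json",
    "besu": "besu/genesis.json",
}


def resolve_genesis(client, overrides=None):
    """Prefer an explicit per-client override, else the default path from the table."""
    if overrides and client in overrides:
        return overrides[client]
    return DEFAULT_GENESIS_PATHS[client]


print(resolve_genesis("reth"))
print(resolve_genesis("geth", {"geth": "https://example.com/genesis/geth.json"}))
```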

Example with filter:

runner:
  benchmark:
    tests:
      filter: "bn128"  # Only run tests matching "bn128"
      source:
        eest_fixtures:
          github_repo: ethereum/execution-specs
          github_release: tests-benchmark@v0.0.9

Test Filter

The runner.benchmark.tests.filter selects which tests run. Two modes:

Mode	Syntax	Behavior
Substring (default)	`filter: "bn128"`	Test/file path must contain the literal string. Regex metacharacters are matched literally (e.g. `filter: "test.name"` only matches paths that contain the nine-character string `test.name`)
Regex	`filter: "regex:<expr>"`	The trailing expression is compiled as a Go regular expression and tested with `MatchString`. Anchor with `^` / `$` if you need full-string matches; flags like `(?i)` for case-insensitive are supported

Examples:

# Substring — matches any test path containing "keccak"
filter: "keccak"

# Regex — matches "test_sstore_bloated…benchmark_300M" anywhere in the path
filter: "regex:test_sstore_bloated.*benchmark_300M"

# Regex with case-insensitive flag
filter: "regex:(?i)KECCAK|sha256"

# Regex anchored to end of path
filter: "regex:bn128_pairing\\.txt$"

The filter is applied to:

file paths returned from glob expansion (substring match against the absolute path),
EEST fixture test names,
opcode source entries.

A bad regex (e.g. unclosed character class) is rejected at config-load time with a runner.benchmark.tests.filter: invalid regex … error.
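
The two filter modes can be sketched in Python as below. This is an approximation: benchmarkoor compiles Go regular expressions, whose syntax differs from Python's `re` in places, and `build_filter` is a hypothetical helper. Treating an empty filter as "match everything" is also an assumption.

```python
import re

REGEX_PREFIX = "regex:"


def build_filter(value):
    """Return a predicate over test names or paths for the given filter value."""
    if not value:
        return lambda name: True  # assumption: no filter means every test runs
    if value.startswith(REGEX_PREFIX):
        # Compiling up front surfaces a bad expression immediately (re.error).
        pattern = re.compile(value[len(REGEX_PREFIX):])
        # search() is unanchored, like Go's MatchString.
        return lambda name: pattern.search(name) is not None
    return lambda name: value in name  # plain substring, metacharacters literal


substring = build_filter("test.name")
print(substring("suite/test.name/case.txt"))   # True
print(substring("suite/testXname/case.txt"))   # False: "." is not a wildcard here

anchored = build_filter(r"regex:bn128_pairing\.txt$")
print(anchored("/abs/tests/bn128_pairing.txt"))      # True
print(anchored("/abs/tests/bn128_pairing.txt.bak"))  # False
```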

Results Upload

The runner.benchmark.results_upload section configures automatic uploading of results to remote storage after each instance run. Currently only S3-compatible storage is supported.

runner:
  benchmark:
    results_upload:
      s3:
        enabled: true
        endpoint_url: https://s3.amazonaws.com
        region: us-east-1
        bucket: my-benchmark-results
        access_key_id: ${AWS_ACCESS_KEY_ID}
        secret_access_key: ${AWS_SECRET_ACCESS_KEY}
        prefix: results
        # storage_class: STANDARD
        # acl: private
        force_path_style: false

Option	Type	Required	Default	Description
`enabled`	bool	Yes	`false`	Enable S3 upload
`bucket`	string	Yes	-	S3 bucket name
`endpoint_url`	string	No	AWS default	S3 endpoint URL — scheme and host only, no path (e.g., `https://<id>.r2.cloudflarestorage.com`)
`region`	string	No	`us-east-1`	AWS region
`access_key_id`	string	No	-	Static AWS access key ID
`secret_access_key`	string	No	-	Static AWS secret access key
`prefix`	string	No	`results`	Base key prefix. Runs are stored under `prefix/runs/`, suites under `prefix/suites/`
`storage_class`	string	No	Bucket default	S3 storage class (e.g., `STANDARD`, `STANDARD_IA`)
`acl`	string	No	-	Canned ACL (e.g., `private`, `public-read`)
`force_path_style`	bool	No	`false`	Use path-style addressing (required for MinIO and Cloudflare R2)
`parallel_uploads`	int	No	`50`	Number of concurrent file uploads

Important: The endpoint_url must be the base URL without any path component. Do not include the bucket name in the URL — the SDK handles that separately via the bucket field. For example, use https://<account_id>.r2.cloudflarestorage.com, not https://<account_id>.r2.cloudflarestorage.com/my-bucket.

When enabled, a preflight check runs before any benchmarks to verify S3 connectivity. Each instance's results directory is uploaded after the run completes (including on failure, for partial results).

Results can also be uploaded manually using the upload-results subcommand:

benchmarkoor upload-results --method=s3 --config config.yaml --result-dir=./results/runs/<run_dir>

The generate-index-file command also supports reading directly from S3. This is useful for regenerating index.json from remote data without having all results locally:

benchmarkoor generate-index-file --method=s3 --config config.yaml

When using --method=s3, the command reads config.json and result.json from each run directory in the bucket, builds the index in memory, and uploads index.json to prefix/index.json (e.g. with prefix demo/results, index.json is placed at demo/results/index.json).

The generate-suite-stats-file command also supports reading directly from S3:

benchmarkoor generate-suite-stats-file --method=s3 --config config.yaml

When using --method=s3, the command reads config.json and result.json from each run, groups them by suite hash, builds per-suite stats in memory, and uploads stats.json to prefix/suites/{hash}/stats.json.
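
The object-key layout implied by `prefix` can be summarized with a short Python sketch. The helper names are hypothetical, and trimming stray slashes from the prefix is an assumption made to keep the example robust.

```python
def _join(prefix, *parts):
    """Join key segments with single slashes, ignoring empty segments."""
    segments = [prefix.strip("/"), *parts]
    return "/".join(segment for segment in segments if segment)


def index_key(prefix):
    return _join(prefix, "index.json")


def suite_stats_key(prefix, suite_hash):
    return _join(prefix, "suites", suite_hash, "stats.json")


def runs_prefix(prefix):
    return _join(prefix, "runs") + "/"


print(index_key("demo/results"))              # demo/results/index.json
print(suite_stats_key("results", "abc123"))   # results/suites/abc123/stats.json
print(runs_prefix("results"))                 # results/runs/
```

Here `abc123` stands in for a real suite hash.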

Client Settings

The runner.client section configures Ethereum execution clients.

Supported Clients

Client	Type	Default Image
Geth	`geth`	`ethpandaops/geth:performance`
Nethermind	`nethermind`	`ethpandaops/nethermind:performance`
Besu	`besu`	`ethpandaops/besu:performance`
Erigon	`erigon`	`ethpandaops/erigon:performance`
Nimbus	`nimbus`	`statusim/nimbus-eth1:performance`
Reth	`reth`	`ethpandaops/reth:performance`

Client Defaults

The runner.client.config section sets defaults applied to all client instances.

runner:
  client:
    config:
      jwt: "5a64f13bfb41a147711492237995b437433bcbec80a7eb2daae11132098d7bae"
      drop_memory_caches: "disabled"
      rollback_strategy: "rpc-debug-setHead"  # or "none"
      resource_limits:
        cpuset_count: 4
        memory: "16g"
        swap_disabled: true
      genesis:
        geth: https://example.com/genesis/geth.json
        nethermind: https://example.com/genesis/nethermind.json

Option	Type	Default	Description
`jwt`	string	`5a64f1...`	JWT secret for Engine API authentication
`drop_memory_caches`	string	`disabled`	When to drop Linux memory caches (see below)
`rollback_strategy`	string	`rpc-debug-setHead`	Rollback strategy after each test (see below)
`checkpoint_restore_strategy_options`	object	-	Options for the checkpoint-restore rollback strategy (see Checkpoint Restore Strategy Options)
`wait_after_rpc_ready`	string	-	Duration to wait after RPC becomes ready (see below)
`run_timeout`	string	-	Maximum duration for test execution before the run is timed out (see below)
`retry_new_payloads_syncing_state`	object	-	Retry config for SYNCING responses (see below)
`retry_new_payloads_failed_state`	object	-	Retry config for any non-SYNCING `engine_newPayload*` failure (see below)
`resource_limits`	object	-	Container resource constraints (see Resource Limits)
`post_test_rpc_calls`	[]object	-	Arbitrary RPC calls to execute after each test step (see Post-Test RPC Calls)
`post_test_sleep_duration`	string	-	Sleep duration after each test, e.g. `200ms`, `1s` (see below)
`bootstrap_fcu`	bool/object	-	Send an `engine_forkchoiceUpdatedV3` after RPC is ready to confirm the client is fully synced (see Bootstrap FCU)
`opcode_extraction`	object	-	Extract per-test opcode counts via `debug_traceBlockByNumber` after each test step (see Opcode Extraction)
`genesis`	map	-	Genesis file URLs keyed by client type

Drop Memory Caches

This Linux-only feature (requires root) drops page cache, dentries, and inodes between benchmark phases for more consistent results.

Value	Description
`disabled`	Do not drop caches (default)
`tests`	Drop caches between tests
`steps`	Drop caches between all steps (setup, test, cleanup)

Rollback Strategy

Controls whether and how the client state is rolled back after each test. This is useful for stateful benchmarks where tests modify chain state and you want each test to start from the same block.

Value	Description
`none`	Do not roll back
`rpc-debug-setHead`	Capture block info before each test, then roll back via a client-specific debug RPC after the test completes (default)
`container-recreate`	Stop and remove the container after each test, then create and start a fresh one
`container-checkpoint-restore`	Use Podman's CRIU-based checkpoint/restore to snapshot container memory state and the data directory, then instantly restore both per-test. Requires `container_runtime: "podman"`. When `datadir.method: "zfs"` is configured, uses ZFS snapshots for rollback. Without a datadir, uses copy-based rollback (`cp -a` snapshot, `rsync --delete` restore). Other `datadir.method` values are not supported.

`rpc-debug-setHead`

When rpc-debug-setHead is enabled, the following happens for each test:

Before the test, eth_getBlockByNumber("latest", false) is called to capture the current block number and hash.
The test (including setup and cleanup steps) runs normally.
After the test, a client-specific rollback RPC call is made.
The rollback is verified by calling eth_getBlockByNumber("latest", false) again and comparing the block number.

If the rollback fails or the block number doesn't match, a warning is logged but the test is not marked as failed.

Client-specific RPC calls

Each client uses a different RPC method and parameter format for rollback:

Client	RPC Method	Parameter	Example payload
Geth	`debug_setHead`	Hex block number	`{"method":"debug_setHead","params":["0x5"]}`
Besu	`debug_setHead`	Hex block number	`{"method":"debug_setHead","params":["0x5"]}`
Reth	`debug_setHead`	Integer block number	`{"method":"debug_setHead","params":[5]}`
Nethermind	`debug_resetHead`	Block hash	`{"method":"debug_resetHead","params":["0xabc..."]}`
Erigon	N/A	N/A	Not supported
Nimbus	N/A	N/A	Not supported

For clients that don't support rollback (Erigon, Nimbus), a warning is logged and the rollback step is skipped.
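
The table above can be read as a small dispatch function. The following Python sketch is illustrative only and is not benchmarkoor's implementation; the function name `build_rollback_request` and the `ROLLBACK_METHODS` mapping are hypothetical names introduced here.

```python
# Mapping copied from the table above. Clients without an entry
# (Erigon, Nimbus) have no rollback RPC.
ROLLBACK_METHODS = {
    "geth": ("debug_setHead", "hex_number"),
    "besu": ("debug_setHead", "hex_number"),
    "reth": ("debug_setHead", "int_number"),
    "nethermind": ("debug_resetHead", "hash"),
}


def build_rollback_request(client, block_number, block_hash, request_id=1):
    """Return a JSON-RPC request dict, or None if the client has no rollback RPC."""
    entry = ROLLBACK_METHODS.get(client)
    if entry is None:
        return None  # caller would log a warning and skip the rollback step
    method, kind = entry
    if kind == "hex_number":
        param = hex(block_number)   # e.g. 5 -> "0x5"
    elif kind == "int_number":
        param = block_number        # plain JSON integer
    else:
        param = block_hash          # Nethermind resets by block hash
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": [param]}
```

The block number and hash passed in are the values captured by the eth_getBlockByNumber("latest", false) call made before the test.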

`container-recreate`

When container-recreate is enabled, the runner manages the per-test loop:

The first test runs against the original container.
After each test, the container is stopped and removed.
A new container is created and started with the same configuration. The data volume/datadir persists.
The runner waits for the RPC endpoint to become ready and the configured wait period before running the next test.

This strategy works with all clients since it doesn't require any client-specific RPC support.

`container-checkpoint-restore`

When container-checkpoint-restore is enabled, the runner uses Podman's native CRIU-based checkpoint/restore to eliminate per-test container lifecycle overhead. This is significantly faster than container-recreate for large test suites because the client process resumes mid-execution without restart or RPC polling.

Two data-directory rollback modes are supported:

ZFS snapshots (when datadir.method: "zfs" is configured): instant copy-on-write rollback.
Copy-based (when no datadir is configured, e.g., EEST tests): cp -a snapshot, rsync --delete restore. The data directory is bind-mounted from a host temp directory.

Requirements:

container_runtime: "podman" must be set
CRIU must be installed on the host
Podman must be running as root (rootful mode)
If a datadir is configured, it must use method: "zfs"
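
The two requirements that are visible in the configuration itself can be expressed as a simple check. This is a minimal Python sketch under the assumption that validation only looks at `container_runtime` and the datadir method; `checkpoint_restore_errors` is a hypothetical helper, and the CRIU and rootful Podman requirements are host properties that it does not inspect.

```python
def checkpoint_restore_errors(container_runtime, datadir_method=None):
    """Collect config-level problems for container-checkpoint-restore.

    datadir_method is None when no datadir is configured (copy-based rollback).
    An empty list means the combination is accepted by this sketch.
    """
    errors = []
    if container_runtime != "podman":
        errors.append('container_runtime must be "podman"')
    if datadir_method is not None and datadir_method != "zfs":
        errors.append('a configured datadir must use method "zfs"')
    return errors
```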

Flow:

The container starts and the runner waits for the RPC endpoint to become ready.
After RPC is ready (and any configured wait period), the data directory is snapshotted (ZFS snapshot or file copy) and the container is checkpointed (memory state exported to a file). The container stops.
For each test:
- The data directory is rolled back to the snapshot (ZFS rollback or rsync restore).
- The container is restored from the checkpoint. The client process resumes at the exact point it was checkpointed — no startup, no RPC polling.
- The test executes.
- The restored container is stopped and removed.
After all tests, the snapshot and checkpoint export file are cleaned up.

With ZFS datadir:

runner:
  container_runtime: podman
  client:
    config:
      rollback_strategy: container-checkpoint-restore
    datadirs:
      geth:
        source_dir: /tank/data/geth
        method: zfs
  instances:
    - id: geth
      client: geth

Without datadir (e.g., EEST tests):

runner:
  container_runtime: podman
  client:
    config:
      rollback_strategy: container-checkpoint-restore
  instances:
    - id: geth
      client: geth

Checkpoint Restore Strategy Options

Options for the container-checkpoint-restore rollback strategy, nested under checkpoint_restore_strategy_options:

Sub-option	Type	Default	Description
`tmpfs_threshold`	string	-	Store checkpoint on tmpfs (RAM) when container memory is under this threshold. Uses the same format as `resource_limits.memory` (Docker go-units): e.g., `"8g"`, `"512m"`, `"1024k"`, or raw bytes. If not set, checkpoints are always stored on disk.
`tmpfs_max_size`	string	2× `tmpfs_threshold`	Maximum size of the tmpfs mount for checkpoint storage. Same format as `tmpfs_threshold` (e.g., `"16g"`, `"1024m"`). When not set, defaults to twice the `tmpfs_threshold` value.
`wait_after_tcp_drop_connections`	string	`10s`	How long to wait after dropping TCP connections before checkpointing, giving the process time to close file descriptors (Go duration string).
`restart_container`	bool	`false`	Whether to restart the container before taking a CRIU checkpoint. Restarting ensures a clean process state (cold caches, clean DB shutdown).

runner:
  client:
    config:
      rollback_strategy: container-checkpoint-restore
      checkpoint_restore_strategy_options:
        tmpfs_threshold: "8g"
        tmpfs_max_size: "16g"
        wait_after_tcp_drop_connections: "10s"
        restart_container: false
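
To illustrate how tmpfs_threshold and tmpfs_max_size interact, here is a hedged Python sketch. It assumes binary multiples for the size suffixes (as Docker's memory flags use) and takes the container memory as a byte count supplied by the caller; `parse_size` and `checkpoint_storage` are hypothetical names, not benchmarkoor functions.

```python
import re

# Assumption: k/m/g are binary multiples (1024-based).
_MULTIPLIERS = {"": 1, "b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(text):
    """Parse "8g", "512m", "1024k" or raw bytes into a byte count."""
    match = re.fullmatch(r"(\d+)([bkmg]?)", text.strip().lower())
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * _MULTIPLIERS[match.group(2)]


def checkpoint_storage(container_memory_bytes, tmpfs_threshold=None, tmpfs_max_size=None):
    """Decide where the checkpoint goes and how large the tmpfs mount may be."""
    if tmpfs_threshold is None:
        return {"location": "disk"}  # no threshold: always on disk
    threshold = parse_size(tmpfs_threshold)
    if container_memory_bytes >= threshold:
        return {"location": "disk"}  # not under the threshold
    # tmpfs_max_size defaults to twice the threshold when unset.
    max_size = parse_size(tmpfs_max_size) if tmpfs_max_size else 2 * threshold
    return {"location": "tmpfs", "max_size_bytes": max_size}
```

With `tmpfs_threshold: "8g"` and no explicit tmpfs_max_size, a container using 4 GiB would get a tmpfs mount capped at 16 GiB in this sketch.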

Wait After RPC Ready

Some clients (e.g., Erigon) have internal sync pipelines that continue running after their RPC endpoint becomes available. The wait_after_rpc_ready option adds a configurable delay after the RPC health check passes, giving the client time to complete internal initialization before test execution begins.

runner:
  client:
    config:
      wait_after_rpc_ready: 30s

The value is a Go duration string (e.g., 30s, 1m, 500ms). If not set, no additional wait is performed.
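
For readers unfamiliar with Go duration strings, the following Python sketch parses the common subset used in this document (for example 30s, 1m, 500ms, 2h30m). It is an illustration only: `parse_go_duration` is a hypothetical helper, it returns nanoseconds, and it does not handle signs or every form that Go's own parser accepts.

```python
import re
from decimal import Decimal

# Nanoseconds per unit, matching the units Go durations use.
_UNIT_NS = {
    "ns": 1,
    "us": 1_000,
    "µs": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3600 * 1_000_000_000,
}
# Longer unit names come first so "ms" is not read as "m" followed by "s".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_go_duration(text):
    """Return the duration in nanoseconds for strings such as "2h30m"."""
    if text == "0":
        return 0
    if not text:
        raise ValueError("empty duration")
    total = Decimal(0)
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += Decimal(match.group(1)) * _UNIT_NS[match.group(2)]
        pos = match.end()
    return int(total)
```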

When to use:

When running benchmarks against clients with staged sync pipelines (Erigon)
When you observe SYNCING responses from Engine API calls despite the RPC being available
When starting from pre-populated data directories where clients may need time to validate state

Run Timeout

The run_timeout option sets a maximum duration for the test execution phase of a run. If the timeout is exceeded, the run is cancelled with a timed_out status. Partial results collected before the timeout are still written and published.

runner:
  client:
    config:
      run_timeout: 2h

The value is a Go duration string (e.g., 30m, 1h, 2h30m). If not set, no timeout is applied.

The timeout covers only the test execution phase — container setup, image pulling, and RPC readiness checks are not included.

Note: This is a per-instance timeout. For a global timeout that caps the entire run (all instances, setup, and teardown), use runner.run_timeout.

When to use:

When running large test suites that may hang or take unexpectedly long
When you want to enforce a maximum wall-clock time per instance
When running in CI/CD environments with time constraints

Post-Test Sleep Duration

The post_test_sleep_duration option adds a configurable pause after each test completes (after rollback and post-test RPC calls, but before the next test begins). This is useful for clients that need time to complete internal cleanup between tests.

runner:
  client:
    config:
      post_test_sleep_duration: 200ms

Uses Go duration format (e.g., 200ms, 1s, 5s). Default is 0 (disabled).

When to use:

When a client needs time for internal cleanup between tests
When you observe flaky results due to rapid successive test execution

Retry New Payloads Syncing State

When engine_newPayload returns a SYNCING status, it indicates the client hasn't fully processed the parent block yet. The retry_new_payloads_syncing_state option configures automatic retries with exponential backoff.

runner:
  client:
    config:
      retry_new_payloads_syncing_state:
        enabled: true
        max_retries: 10
        backoff: 1s

Option	Type	Required	Description
`enabled`	bool	Yes	Enable retry behavior
`max_retries`	int	Yes	Maximum number of retry attempts (must be ≥ 1)
`backoff`	string	Yes	Delay between retries (Go duration string)

When to use:

When benchmarking clients that return SYNCING during normal operation (Erigon)
When using pre-populated data directories where clients may need time to validate chain state
Combined with wait_after_rpc_ready for clients with complex initialization

Both this and retry_new_payloads_failed_state (below) apply to all engine_newPayload* calls — pre-run steps, setup steps, and test steps alike.

Retry New Payloads Failed State

The retry_new_payloads_failed_state option is a catch-all retry for engine_newPayload* calls that fail for any reason other than SYNCING: RPC/network errors, JSON-RPC errors (e.g. -32603 Server error), INVALID / INVALID_BLOCK_HASH payload statuses, or unparsable responses. It is useful for transient client-side flakiness, where a single retry usually succeeds.

runner:
  client:
    config:
      retry_new_payloads_failed_state:
        enabled: true
        max_retries: 3
        backoff: 500ms

Option	Type	Required	Description
`enabled`	bool	Yes	Enable retry behavior
`max_retries`	int	Yes	Maximum number of retry attempts (must be ≥ 1)
`backoff`	string	Yes	Delay between retries (Go duration string)

When both retry_new_payloads_syncing_state and retry_new_payloads_failed_state are enabled, SYNCING responses take the syncing-state retry path and every other failure takes the failed-state retry path.
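
The routing between the two retry paths can be sketched as follows. This Python example is a simplified model under stated assumptions, not the runner's code: each path keeps its own retry budget, the delay between attempts is the configured backoff held constant, and `submit_with_retries`, `PayloadError`, and the `send` callable are hypothetical.

```python
import time


class PayloadError(Exception):
    """Raised when the call does not reach VALID within the retry budget."""


def submit_with_retries(send, syncing_cfg=None, failed_cfg=None, sleep=time.sleep):
    """Call send() until it returns "VALID" or the matching budget is spent.

    send() returns a payload status string, or raises on RPC/network errors.
    Each cfg is a dict with enabled, max_retries and backoff (in seconds).
    Returns how many retries each path consumed.
    """
    used = {"syncing": 0, "failed": 0}
    while True:
        try:
            status = send()
        except Exception as exc:  # transport and JSON-RPC errors
            status = f"ERROR: {exc}"
        if status == "VALID":
            return used
        # SYNCING goes to the syncing-state path, everything else to failed-state.
        path = "syncing" if status == "SYNCING" else "failed"
        cfg = syncing_cfg if path == "syncing" else failed_cfg
        if not cfg or not cfg.get("enabled") or used[path] >= cfg["max_retries"]:
            raise PayloadError(f"giving up after status {status!r}")
        used[path] += 1
        sleep(cfg["backoff"])
```

Injecting `sleep` keeps the sketch testable without real delays; a caller could pass `time.sleep` unchanged for actual use.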

When to use:

Recovering from transient JSON-RPC errors during long pre-run replays
Suppressing one-off failures when clients are momentarily under load (e.g. cache warm-up)

Bootstrap FCU

Some clients (e.g., Erigon) may still be performing internal initialization or syncing after their RPC endpoint becomes available. The bootstrap_fcu option sends an engine_forkchoiceUpdatedV3 call in a retry loop after RPC is ready, using the latest block hash from eth_getBlockByNumber("latest"). The client accepting the FCU with VALID status confirms it has finished syncing and is ready for test execution.

Besu accepts the bootstrap FCU on an isolated snapshot node only when started with --p2p-enabled=true: its synchronizer must run to register the post-merge head as in-sync, otherwise Besu answers SYNCING to every FCU. Set extra_args: [--p2p-enabled=true] on the Besu instance; adding --max-peers=0 and --discovery-enabled=false keeps it isolated, with zero real peers.

Shorthand (uses defaults: max_retries: 30, backoff: 1s):

runner:
  client:
    config:
      bootstrap_fcu: true

Full configuration:

runner:
  client:
    config:
      bootstrap_fcu:
        enabled: true
        max_retries: 30
        backoff: 1s

Option	Type	Required	Default	Description
`enabled`	bool	Yes	-	Enable bootstrap FCU
`max_retries`	int	Yes	`30` (shorthand)	Maximum number of retry attempts (must be ≥ 1)
`backoff`	string	Yes	`1s` (shorthand)	Delay between retries (Go duration string)

The FCU call sets headBlockHash to the latest block, with safeBlockHash and finalizedBlockHash set to the zero hash and no payload attributes. The response must have VALID status. If the call fails, it is retried up to max_retries times with backoff between attempts. If all attempts fail, the run is aborted.
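
As an illustration of the request described above, the following Python sketch builds the JSON-RPC body and checks the response status. The field names follow the Engine API forkchoice state and payload status objects; `build_bootstrap_fcu` and `fcu_accepted` are hypothetical helper names, and the retry loop itself is omitted.

```python
ZERO_HASH = "0x" + "00" * 32  # 32-byte zero hash


def build_bootstrap_fcu(latest_block_hash, request_id=1):
    """Build an engine_forkchoiceUpdatedV3 request with no payload attributes."""
    forkchoice_state = {
        "headBlockHash": latest_block_hash,
        "safeBlockHash": ZERO_HASH,
        "finalizedBlockHash": ZERO_HASH,
    }
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "engine_forkchoiceUpdatedV3",
        "params": [forkchoice_state, None],  # None serialises to JSON null
    }


def fcu_accepted(response):
    """True only when the response carries payloadStatus.status == "VALID"."""
    result = response.get("result") or {}
    payload_status = result.get("payloadStatus") or {}
    return payload_status.get("status") == "VALID"
```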

When using the container-recreate rollback strategy, the bootstrap FCU is sent after each container recreate. When using container-checkpoint-restore, the bootstrap FCU is sent once before the checkpoint is taken.

When to use:

When clients may still be performing internal initialization or syncing after RPC becomes available (e.g., Erigon's staged sync)
When starting from pre-populated data directories where the client needs time to validate state before processing Engine API requests
When you observe test failures due to the client returning errors or SYNCING responses on the first Engine API calls

Opcode Extraction

The opcode_extraction option captures per-test opcode counts as a side effect of running tests. After each test step, the runner walks the test's engine_newPayload* calls and runs debug_traceBlockByNumber against each block with a JS opcode-counting tracer. Per-tx counts are summed (and uppercased) into one map per newPayload, then appended to a per-test array. At the end of the run, all the data is written to a single test-opcodes.json in the run results directory, in the same shape that runner.benchmark.tests.opcode_source expects.

runner:
  client:
    config:
      opcode_extraction:
        enabled: true
        timeout: 2m   # per-block trace timeout; default 2m

Option	Type	Required	Default	Description
`enabled`	bool	Yes	`false`	Enable the post-test extraction step
`timeout`	string	No	`2m`	Per-block `debug_traceBlockByNumber` timeout (Go duration). Long traces on fat blocks may need a higher value

opcode_extraction can be set globally under runner.client.config and/or per-instance under runner.instances[]. Instance-level config (when non-nil) fully replaces the global default. The output file shape is:

{
  "test-name.txt": [
    { "PUSH1": 23432, "DUP1": 11231, "SSTORE": 3321 }
  ]
}

(One entry per engine_newPayload* in the test step, summed across all txs in that block.)
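
The aggregation that produces this shape can be sketched in a few lines of Python. This is an illustration of the summing and uppercasing described above, assuming the tracer yields one opcode-count map per transaction; `sum_block_opcodes` and `build_test_opcodes` are hypothetical names.

```python
from collections import Counter


def sum_block_opcodes(per_tx_counts):
    """Sum per-transaction opcode counts into one uppercased map for a block."""
    total = Counter()
    for tx_counts in per_tx_counts:
        for opcode, count in tx_counts.items():
            total[opcode.upper()] += count
    return dict(total)


def build_test_opcodes(tests):
    """tests maps test name -> list of blocks; each block is a list of per-tx maps.

    Returns the test-name -> [one map per newPayload] structure shown above.
    """
    return {
        name: [sum_block_opcodes(block) for block in blocks]
        for name, blocks in tests.items()
    }
```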

Requirements:

The client must accept JS tracers via debug_traceBlockByNumber. Geth, Erigon, and Nethermind support them; coverage on Reth/Besu/Nimbus/ethrex varies — check your client docs.
The trace runs against the EL state right after the test step, before rollback, so the client must still have the block.

When to use:

When you want a ground-truth opcode profile of every benchmarked test (instead of relying on opcode_source JSON shipped from a separate pipeline)
When investigating client-vs-client divergence in EVM execution paths

Data Directories

The runner.client.datadirs section configures pre-populated data directories per client type. When configured, the init container is skipped and data is mounted directly.

runner:
  client:
    datadirs:
      geth:
        source_dir: ./data/snapshots/geth
        # container_dir defaults to /data (geth's data directory)
        method: copy
      reth:
        source_dir: ./data/snapshots/reth
        # container_dir defaults to /var/lib/reth (reth's data directory)
        method: overlayfs

Option	Type	Default	Description
`source_dir`	string	Required	Path to the source data directory
`container_dir`	string	Client default	Mount path inside the container. If not specified, uses the client's default data directory (e.g., `/var/lib/reth` for reth, `/data` for geth)
`method`	string	`copy`	Method for preparing the data directory

Data Directory Methods

Method	Description	Requirements
`copy`	Parallel Go copy with progress display	None (default, works everywhere)
`overlayfs`	Linux overlayfs for near-instant setup	Root access
`fuse-overlayfs`	FUSE-based overlayfs	`fuse-overlayfs` package; `user_allow_other` in `/etc/fuse.conf` if Docker runs as root. Warning: ~3x slower than native overlayfs
`zfs`	ZFS snapshots and clones for copy-on-write setup	Source directory on ZFS filesystem; root access or ZFS delegations configured
`direct`	Bind-mount `source_dir` directly into the container with no copy/snapshot. Changes persist after the run. Intended for inspection / resume workflows, not normal benchmarking	None
`schelk`	Use a schelk-managed scratch volume restored from a virgin baseline between iterations	`schelk` binary on PATH (or `BENCHMARKOOR_SCHELK_BIN`); schelk initialised via `schelk init-new` / `init-from`; root access

ZFS Setup

To use the zfs method without root access, delegate the required permissions to the user:

zfs allow -u <user> clone,create,destroy,mount,snapshot <dataset>

The dataset is auto-detected from the source directory mount point.

Schelk Setup

schelk keeps a pristine virgin block device and a scratch block device, using dm-era on a ramdisk to track which blocks changed during a run so the scratch can be surgically restored from virgin between iterations. It pairs well with rollback_strategy: container-recreate, which gives schelk a clean baseline at the start of every test.

Before configuring benchmarkoor, initialise schelk on the host (see schelk's SKILL.md):

sudo schelk init-from \
  --virgin /dev/<virgin>  --scratch /dev/<scratch> \
  --ramdisk /dev/ram0 --mount-point /schelk --fstype ext4

Then point source_dir at the path inside the schelk mount that holds your client's datadir:

runner:
  client:
    config:
      rollback_strategy: container-recreate
  instances:
    - id: reth-schelk
      client: reth
      datadir:
        method: schelk
        source_dir: /schelk/eth/reth   # subpath under the schelk mount

What benchmarkoor does at runtime:

Config validation verifies schelk is on PATH, reads /var/lib/schelk/state.json to learn the mount point, and runs schelk mount if the scratch isn't currently mounted. If state says mounted but /proc/mounts disagrees (a crash artefact), benchmarkoor surfaces a clear error pointing at schelk full-recover.
Per container lifecycle, Prepare runs schelk restore (recover + mount) so each iteration starts from the virgin baseline. Cleanup runs schelk recover to unmount and restore baseline.
Graceful shutdown: schelk commands run in their own process group, so a SIGTERM to benchmarkoor does not propagate to schelk. An in-flight schelk command is given up to 60 seconds to finish before being killed, so a recover mid-flight is not interrupted.

Building onto a schelk mount: when a builder.state_actor target's output_dir is under the schelk mount, benchmarkoor build mounts the scratch first (the same schelk mount preflight as above), materialises the datadir onto it, and — only when a build actually ran (fresh / --force / a --rebuild-on-diff change) — runs schelk promote to persist the new datadir as the virgin baseline. A skipped, unchanged target is not promoted. This is how the built datadir becomes the baseline the runner's per-iteration schelk restore resets to. No configuration is needed — it is detected from the output_dir path.

Notes:

rollback_strategy: container-checkpoint-restore is not compatible with method: schelk (with that strategy, a configured datadir must use method: zfs).
All operational schelk commands require root.
source_dir must be the schelk mount point or a subdirectory of it.
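
The last note is a path-containment rule. A minimal Python sketch of such a check follows; it is purely lexical (symlinks are not resolved), uses POSIX path semantics, and `is_under_mount` is a hypothetical helper rather than benchmarkoor's own validation.

```python
import posixpath


def is_under_mount(path, mount_point):
    """True if path equals mount_point or lies beneath it (absolute paths only)."""
    if not (posixpath.isabs(path) and posixpath.isabs(mount_point)):
        raise ValueError("both paths must be absolute")
    path = posixpath.normpath(path)
    mount_point = posixpath.normpath(mount_point)
    # commonpath compares whole components, so /schelk2 is not under /schelk.
    return posixpath.commonpath([path, mount_point]) == mount_point
```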

Environment Variable	Description
`BENCHMARKOOR_SCHELK_BIN`	Override the schelk executable path. Useful when running under `sudo` with a sanitised PATH that does not include `~/.cargo/bin`. Accepts a bare name (resolved via PATH) or an absolute/relative path. Default: `schelk`
`SCHELK_STATE`	Override the schelk state-file path. Honoured by both schelk itself and benchmarkoor's preflight. Default: `/var/lib/schelk/state.json`

Default Container Directories

When container_dir is not specified, the client's default data directory is used:

Client	Default Data Directory
geth	`/data`
nethermind	`/data`
besu	`/data`
erigon	`/data`
nimbus	`/data`
reth	`/var/lib/reth`

Client Instances

The runner.instances array defines which client configurations to benchmark.

runner:
  instances:
    - id: geth-latest
      client: geth
      image: ethpandaops/geth:performance
      pull_policy: always
      entrypoint: []
      command: []
      extra_args:
        - --verbosity=5
      restart: never
      environment:
        GOMEMLIMIT: "14GiB"
      genesis: https://example.com/custom-genesis.json
      datadir:
        source_dir: ./snapshots/geth
        # container_dir defaults to client's data directory
        method: overlayfs
      drop_memory_caches: "steps"
      resource_limits:
        cpuset_count: 2
        memory: "8g"

Option	Type	Required	Default	Description
`id`	string	Yes	-	Unique identifier for this instance
`client`	string	Yes	-	Client type (see Supported Clients)
`image`	string	No	Per-client default	Docker image to use
`pull_policy`	string	No	`always`	Image pull policy: `always`, `never`, `missing`
`entrypoint`	[]string	No	Client default	Override container entrypoint
`command`	[]string	No	Client default	Override container command
`extra_args`	[]string	No	-	Additional arguments appended to command
`restart`	string	No	-	Container restart policy
`environment`	map	No	-	Additional environment variables
`genesis`	string	No	From `runner.client.config.genesis`	Override genesis file URL
`genesis_fork_override`	map	No	-	Activate forks at given timestamps by patching a geth-format genesis at boot. See Genesis Fork & EIP Overrides
`genesis_eip_override`	object	No	-	Activate EIPs at a timestamp by patching a parity/nethermind chainspec at boot. See Genesis Fork & EIP Overrides
`datadir`	object	No	From `runner.client.datadirs`	Instance-specific data directory config
`drop_memory_caches`	string	No	From `runner.client.config`	Instance-specific cache drop setting
`rollback_strategy`	string	No	From `runner.client.config`	Instance-specific rollback strategy
`checkpoint_restore_strategy_options`	object	No	From `runner.client.config`	Instance-specific checkpoint-restore strategy options (replaces global)
`wait_after_rpc_ready`	string	No	From `runner.client.config`	Instance-specific RPC ready wait duration
`run_timeout`	string	No	From `runner.client.config`	Instance-specific run timeout duration
`retry_new_payloads_syncing_state`	object	No	From `runner.client.config`	Instance-specific retry config for SYNCING responses
`retry_new_payloads_failed_state`	object	No	From `runner.client.config`	Instance-specific retry config for non-SYNCING failures
`resource_limits`	object	No	From `runner.client.config`	Instance-specific resource limits
`post_test_rpc_calls`	[]object	No	From `runner.client.config`	Instance-specific post-test RPC calls (replaces global)
`post_test_sleep_duration`	string	No	From `runner.client.config`	Instance-specific post-test sleep duration
`bootstrap_fcu`	bool/object	No	From `runner.client.config`	Instance-specific bootstrap FCU setting
`opcode_extraction`	object	No	From `runner.client.config`	Instance-specific opcode extraction setting (replaces global)

Genesis Fork & EIP Overrides

These options let an instance activate a fork that is not scheduled in the genesis it boots from — for example, running Amsterdam payloads against an Osaka snapshot. benchmarkoor patches the genesis file in-memory at boot, before mounting it; the source genesis on disk is never modified, and untouched fields (including large integers) round-trip verbatim, so the genesis block hash is unchanged.

Use these only for clients that read their fork schedule from the genesis file. geth and erigon do not — they read the fork schedule from the datadir, so a patched genesis is ignored. For those, use the client's own fork-override flag instead (e.g. --override.amsterdam=<timestamp> in extra_args).

genesis_fork_override — for geth-format genesis files (besu, reth, ethrex). A map of fork name to activation timestamp. For each entry it sets config.<fork>Time, and if the genesis has a blobSchedule that lacks the fork, it inherits the schedule of the latest preceding fork (so the new fork carries a blob schedule, as geth-family clients require).

runner:
  instances:
    - id: besu
      client: besu
      genesis: /path/to/osaka-chainspec.json   # used as-is
      genesis_fork_override:
        amsterdam: 1   # sets config.amsterdamTime=1, inherits blobSchedule.amsterdam

genesis_eip_override — for parity/nethermind-format chainspecs, which schedule forks per-EIP rather than by fork name. It sets params.eip<N>TransitionTimestamp for each listed EIP to the given (hex-encoded) timestamp. The EIP list is devnet-specific, so it lives in config.

runner:
  instances:
    - id: nethermind
      client: nethermind
      genesis: /path/to/osaka-parity-chainspec.json   # used as-is
      genesis_eip_override:
        timestamp: 1
        eips: [7708, 7778, 7843, 7928, 7954, 7976, 7981, 8024, 8037]

Option	Type	Description
`genesis_fork_override`	map[string]uint	Fork name → activation timestamp (unix seconds). geth-format genesis only.
`genesis_eip_override.timestamp`	uint	Activation timestamp (unix seconds) applied to every listed EIP.
`genesis_eip_override.eips`	[]uint	EIP numbers to activate, e.g. `[7928, 8037]`. parity/nethermind chainspec only.

Applying an override to the wrong genesis format is an error (a geth-format override needs a top-level config object; an EIP override needs a top-level params object).
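
The two patching rules can be sketched on parsed JSON as follows. This Python example is a simplified illustration: the fork order list is an assumption used only to find the latest preceding fork, `apply_fork_override` and `apply_eip_override` are hypothetical names, and working on parsed dicts does not reproduce the verbatim round-trip of untouched fields described earlier in this section.

```python
import copy

# Assumed fork order, used only to pick the "latest preceding fork".
FORK_ORDER = ["cancun", "prague", "osaka", "amsterdam"]


def apply_fork_override(genesis, overrides):
    """Patch a geth-format genesis: set config.<fork>Time, inherit blobSchedule."""
    if not isinstance(genesis.get("config"), dict):
        raise ValueError("geth-format override needs a top-level config object")
    patched = copy.deepcopy(genesis)  # the source genesis is never modified
    config = patched["config"]
    schedule = config.get("blobSchedule")
    for fork, timestamp in overrides.items():
        config[f"{fork}Time"] = timestamp
        if schedule is not None and fork not in schedule:
            preceding = FORK_ORDER[:FORK_ORDER.index(fork)] if fork in FORK_ORDER else []
            known = [name for name in preceding if name in schedule]
            if known:
                schedule[fork] = copy.deepcopy(schedule[known[-1]])
    return patched


def apply_eip_override(chainspec, timestamp, eips):
    """Patch a parity/nethermind chainspec with hex-encoded EIP timestamps."""
    if not isinstance(chainspec.get("params"), dict):
        raise ValueError("EIP override needs a top-level params object")
    patched = copy.deepcopy(chainspec)
    for eip in eips:
        patched["params"][f"eip{eip}TransitionTimestamp"] = hex(timestamp)
    return patched
```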

Resource Limits

Resource limits can be configured globally (runner.client.config.resource_limits) or per-instance (runner.instances[].resource_limits). Instance-level settings override global defaults.

resource_limits:
  cpuset_count: 4
  # OR
  cpuset: [0, 1, 2, 3]
  memory: "16g"
  swap_disabled: true
  blkio_config:
    device_read_bps:
      - path: /dev/sdb
        rate: '12mb'
    device_write_bps:
      - path: /dev/sdb
        rate: '1024k'
    device_read_iops:
      - path: /dev/sdb
        rate: '120'
    device_write_iops:
      - path: /dev/sdb
        rate: '30'

Option	Type	Description
`cpuset_count`	int	Number of random CPUs to pin to (new selection each run)
`cpuset`	[]int	Specific CPU IDs to pin to
`cpu_freq`	string	Fixed CPU frequency. Supports: `"2000MHz"`, `"2.4GHz"`, `"MAX"` (use system maximum)
`cpu_turboboost`	bool	Enable (`true`) or disable (`false`) turbo boost. Omit to leave unchanged
`cpu_freq_governor`	string	CPU frequency governor. Common values: `performance`, `powersave`, `schedutil`. Defaults to `performance` when `cpu_freq` is set
`memory`	string	Memory limit with unit: `b`, `k`, `m`, `g` (e.g., `"16g"`, `"4096m"`)
`swap_disabled`	bool	Disable swap (sets memory-swap equal to memory, swappiness to 0)
`blkio_config`	object	Block I/O throttling configuration (see below)

Note: cpuset_count and cpuset are mutually exclusive. Use one or the other.
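
The mutual-exclusion rule and the random selection behind cpuset_count can be illustrated with a short Python sketch. It assumes the caller supplies the list of online CPU IDs; `resolve_cpuset` is a hypothetical helper and not the runner's actual selection logic.

```python
import random


def resolve_cpuset(limits, online_cpus, rng=random):
    """Return the CPU IDs to pin to, or None when no pinning is requested."""
    count = limits.get("cpuset_count")
    explicit = limits.get("cpuset")
    if count is not None and explicit is not None:
        raise ValueError("cpuset_count and cpuset are mutually exclusive")
    if explicit is not None:
        return sorted(explicit)
    if count is not None:
        # A fresh random selection is made every time this runs.
        return sorted(rng.sample(list(online_cpus), count))
    return None
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which can be handy when experimenting with the sketch.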

Block I/O Configuration

The blkio_config option allows throttling container disk I/O:

Option	Type	Description
`device_read_bps`	[]object	Device read bandwidth limits
`device_read_iops`	[]object	Device read IOPS limits
`device_write_bps`	[]object	Device write bandwidth limits
`device_write_iops`	[]object	Device write IOPS limits

Each device entry has:

Field	Type	Description
`path`	string	Device path (e.g., `/dev/sdb`)
`rate`	string	Rate limit. For `_bps`: string with unit (`b`, `k`, `m`, `g`). For `_iops`: integer string

CPU Frequency Management

CPU frequency settings allow you to lock CPUs to a specific frequency, control turbo boost, and set the CPU frequency governor. This is useful for achieving more consistent benchmark results by eliminating CPU frequency variations.

Requirements:

Linux only
Root access (requires write access to /sys/devices/system/cpu/*/cpufreq/)
cpufreq subsystem must be available
When running in Docker, bind-mount /sys/devices/system/cpu into the container and set runner.cpu_sysfs_path to the mount point (e.g., /host_sys_cpu)

resource_limits:
  cpuset_count: 4
  cpu_freq: "2000MHz"
  cpu_turboboost: false
  cpu_freq_governor: performance

Notes:

CPU frequency settings are applied to the CPUs specified by cpuset or cpuset_count. If neither is specified, settings are applied to all online CPUs.
Original CPU frequency settings are automatically restored when the benchmark completes or is interrupted.
If the process is killed, the benchmarkoor cleanup command will restore CPU frequency settings from saved state files.

Turbo Boost:

Intel systems: Controls /sys/devices/system/cpu/intel_pstate/no_turbo
AMD systems: Controls /sys/devices/system/cpu/cpufreq/boost

Available Governors:

Common governors (availability depends on kernel configuration):

Governor	Description
`performance`	Always run at max frequency (best for benchmarks)
`powersave`	Always run at min frequency
`schedutil`	Scale frequency based on CPU utilization (default on modern kernels)
`ondemand`	Scale frequency based on load
`conservative`	Like ondemand but more gradual changes

Example: Consistent Benchmark Configuration

For the most consistent benchmark results, lock the CPU frequency and disable turbo boost:

runner:
  client:
    config:
      resource_limits:
        cpuset_count: 4
        cpu_freq: "2000MHz"
        cpu_turboboost: false
        cpu_freq_governor: performance
        memory: "16g"
        swap_disabled: true

Post-Test RPC Calls

Post-test RPC calls allow you to execute arbitrary JSON-RPC calls after each test step completes. These calls are not timed and do not affect test results. They are useful for collecting debug traces, state snapshots, or other diagnostic data from the client after each test.

Calls are made to the client's regular RPC endpoint (no JWT authentication). If a call fails, a warning is logged and the remaining calls still run.

runner:
  client:
    config:
      post_test_rpc_calls:
        - method: debug_traceBlockByNumber
          params: ["{{.BlockNumberHex}}", {"tracer": "callTracer"}]
          dump:
            enabled: true
            filename: debug_traceBlockByNumber
        - method: debug_traceBlockByHash
          params: ["{{.BlockHash}}"]
          timeout: 2m  # Override default 30s timeout for slow methods
          dump:
            enabled: true
            filename: debug_traceBlockByHash

Call Options

Option	Type	Required	Description
`method`	string	Yes	JSON-RPC method name
`params`	[]any	No	Method parameters (supports template variables)
`timeout`	string	No	Per-call timeout as a Go duration string (e.g., `30s`, `2m`). Default: `30s`
`dump`	object	No	Response dump configuration
`dump.enabled`	bool	No	Enable writing the response to a file
`dump.filename`	string	When dump enabled	Base filename for the dump (`.json` extension is added automatically)

Template Variables

Go text/template syntax is supported in all string values within params. Templates are applied recursively to strings inside arrays and objects.

Variable	Description	Example
`{{.BlockHash}}`	Hash of the latest block	`"0xabc..."`
`{{.BlockNumber}}`	Block number as decimal string	`"1234"`
`{{.BlockNumberHex}}`	Block number as hex with `0x` prefix	`"0x4d2"`

Non-string values (booleans, numbers) pass through unchanged.
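
The recursive substitution can be modelled with a short Python sketch. It is a simplified stand-in for Go's text/template that only handles the three variables listed above via plain string replacement; `render_params` is a hypothetical name.

```python
def render_params(value, block_number, block_hash):
    """Substitute template variables in strings nested inside lists and dicts."""
    variables = {
        "{{.BlockHash}}": block_hash,
        "{{.BlockNumber}}": str(block_number),
        "{{.BlockNumberHex}}": hex(block_number),
    }
    if isinstance(value, str):
        for placeholder, replacement in variables.items():
            value = value.replace(placeholder, replacement)
        return value
    if isinstance(value, list):
        return [render_params(item, block_number, block_hash) for item in value]
    if isinstance(value, dict):
        return {key: render_params(item, block_number, block_hash) for key, item in value.items()}
    return value  # booleans, numbers and None pass through unchanged
```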

Dump Output

When dump.enabled is true, the raw JSON-RPC response is written to:

{resultsDir}/{testName}/post_test_rpc_calls/{dump.filename}.json

The response is pretty-printed if it is valid JSON. File ownership respects the results_owner configuration.

Execution Flow

Post-test RPC calls run after the test step and before the cleanup step:

1. Setup step (if present)
2. Test step (timed, results written)
3. Post-test RPC calls              ← runs here
4. Cleanup step (if present)
5. Rollback (if configured)

Instance-Level Override

Instance-level post_test_rpc_calls completely replace global defaults (not merged):

runner:
  client:
    config:
      post_test_rpc_calls:
        - method: debug_traceBlockByNumber
          params: ["{{.BlockNumberHex}}"]
          dump:
            enabled: true
            filename: trace_by_number
  instances:
    - id: geth-latest
      client: geth
      # This replaces the global calls entirely:
      post_test_rpc_calls:
        - method: debug_traceBlockByHash
          params: ["{{.BlockHash}}"]
          dump:
            enabled: true
            filename: trace_by_hash

Builder

The builder section configures tools that pre-populate benchmark inputs on disk. There are two builders:

state_actor (https://github.com/ethereum/state-actor) writes per-client genesis state directly in each EL's native on-disk format — geth Pebble, reth MDBX, besu/nethermind RocksDB — bypassing the client's normal genesis-replay path.
eest_payloads generates stateful EEST benchmark fixtures by running fill-stateful against a filler client booted on a pre-populated snapshot (typically one produced by state_actor). The fixtures are replayed by benchmarkoor run.

Builds are decoupled from benchmarkoor run: invoke benchmarkoor build to materialise the artifacts, then run benchmarks against them via the regular datadir.method: copy|zfs|schelk|… providers and test-source config. A missing datadir at run time is an error — it is never auto-built. When both builders are configured, they run in declaration order (state_actor before eest_payloads) so a fixture build can consume a datadir produced earlier in the same benchmarkoor build invocation.

Option	Type	Default	Description
`run_timeout`	string	–	Global timeout capping the entire `benchmarkoor build` (all builders and targets), as a Go duration (e.g. `2h`, `90m`). Empty means no timeout. Overridable via `BENCHMARKOOR_BUILDER_RUN_TIMEOUT`. The analogue of `runner.run_timeout` for builds.

`builder.state_actor` options

builder:
  state_actor:
    # Per-client images. State-actor needs cgo to write reth/besu/nethermind
    # datadirs, so each client has its own image. Every active target's
    # client must have an entry here.
    images:
      geth: ghcr.io/ethereum/state-actor:latest
      reth: ghcr.io/ethereum/state-actor-reth:latest
      besu: ghcr.io/ethereum/state-actor-besu:latest
      nethermind: ghcr.io/ethereum/state-actor-nethermind:latest
    pull_policy: always                           # always | if-not-present | never (default: always)
    container_runtime: docker                     # docker | podman (default: inherits runner.container_runtime, then docker)
    # spec source — top-level, shared across every target.
    # Pick at most one of:
    #   spec:         # structured YAML (or a `|` block scalar); written to a temp file before invoking state-actor
    #     entities: [ ... ]
    #   spec_file: /etc/benchmarkoor/state-spec.yaml   # absolute host path
    config:                                       # shared per-target defaults; targets override when set
      seed: 1
      fork: prague
      chain_id: 1337
    targets:
      - name: geth-5g                             # optional, defaults to `client`; used by --target filter
        client: geth
        output_dir: /srv/state/geth-5g
        target_size: 5GB

Option	Type	Default	Description
`images`	map[string]string	–	Per-client docker images for state-actor. Every active target's client must have an entry; state-actor needs a different cgo build per client (reth → MDBX, besu → RocksDB JNI, nethermind → .NET RocksDB).
`pull_policy`	string	`always`	One of `always`, `if-not-present`, `never`.
`container_runtime`	string	runner's runtime, then `docker`	Container runtime for the build container.
`spec`	YAML mapping or string	–	Inline state spec body (see state-actor SPEC.md). Write it as structured YAML (a mapping — your editor highlights it) or as a `\|` block scalar; both materialise to the same temp spec file at build time. Mutually exclusive with `spec_file`.
`spec_file`	string	–	Absolute host path to a state spec YAML. Bind-mounted read-only into the build container. Mutually exclusive with `spec`.
`config`	object	–	Shared defaults for the per-target build parameters. See below.
`targets`	[]object	–	Required when invoking `benchmarkoor build`. See below.

The top-level spec/spec_file applies to every target. A target with no target_size (neither its own nor one inherited from config) runs with just the spec; a target with both a spec and a target_size runs with both (state-actor uses the spec and treats target_size as a headroom budget for any further auto-fill). A target with neither target_size nor any spec source is a validation error.
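
The rules above can be sketched as a small check. This is an illustration only: `check_spec_sources` and its return labels are hypothetical names, and `target_size` here stands for the effective value after any default from `config` has been applied.

```python
def check_spec_sources(spec=None, spec_file=None, target_size=None):
    """Return how state-actor would be driven for one target, or raise."""
    if spec is not None and spec_file is not None:
        raise ValueError("spec and spec_file are mutually exclusive")
    has_spec = spec is not None or spec_file is not None
    if has_spec and target_size:
        return "spec+headroom"  # spec drives the build, size is a headroom budget
    if has_spec:
        return "spec-only"
    if target_size:
        return "size-only"
    raise ValueError("target needs target_size or a spec source")
```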

`builder.state_actor.config` options

Every field is also available per-target; a non-nil/non-empty value on a target overrides the corresponding default from config. Use this block to avoid repeating the same seed, fork, chain_id, etc. across every target.

Option	Type	Default	Description
`target_size`	string	–	Default size budget for every target (e.g. `5GB`). Targets without their own `target_size` inherit this value. Complements `spec`/`spec_file` — state-actor uses both at once, treating `target_size` as a headroom budget on top of the spec.
`seed`	int64	–	RNG seed for auto-fill. `0` = wall-clock (non-reproducible).
`fork`	string	–	Hard fork at genesis, e.g. `prague`, `osaka`.
`chain_id`	int64	–	Genesis chain ID.
`gas_limit`	uint64	–	Genesis gas limit.
`timestamp`	uint64	–	Unix seconds at genesis.
`extra_data`	string	–	Hex `extraData` for the genesis block.
`archive`	bool	–	Archive-mode metadata. Only valid when the effective target's client is geth or reth, regardless of where the value was set.
`binary_trie`	bool	–	EIP-7864 binary trie. Only valid when the effective target's client is geth.
`group_depth`	int (1..8)	–	Binary-trie serialisation unit. Requires effective `binary_trie=true`.

Applicability validation runs on the effective target (config defaults + per-target overrides), so archive: true set globally is rejected if any active besu/nethermind target inherits it. To opt a target out of a global archive: true, set archive: false on that target.
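
A minimal sketch of that resolve-then-validate order, assuming an unset field is represented as `None` (the real implementation also treats empty values as unset, which this sketch does not model). The names `effective_target` and `validate` are hypothetical.

```python
ARCHIVE_CLIENTS = {"geth", "reth"}
BINARY_TRIE_CLIENTS = {"geth"}


def effective_target(defaults, target):
    """Overlay a target on the shared defaults; None means 'not set'."""
    merged = dict(defaults)
    # An explicit False is kept, which is how a target opts out of a default.
    merged.update({k: v for k, v in target.items() if v is not None})
    return merged


def validate(target):
    """Check applicability on the already-merged target."""
    client = target["client"]
    if target.get("archive") and client not in ARCHIVE_CLIENTS:
        raise ValueError(f"{client}: archive is only valid for geth/reth")
    if target.get("binary_trie") and client not in BINARY_TRIE_CLIENTS:
        raise ValueError(f"{client}: binary_trie is only valid for geth")
    depth = target.get("group_depth")
    if depth is not None:
        if not target.get("binary_trie"):
            raise ValueError("group_depth requires binary_trie=true")
        if not 1 <= depth <= 8:
            raise ValueError("group_depth must be in 1..8")
    return target
```

Because validation sees only the merged result, a besu target that sets `archive: False` passes even when the defaults carry `archive: True`, while a besu target that leaves `archive` unset is rejected.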

`builder.state_actor.targets[]` options

The fields below mirror builder.state_actor.config; any field set here overrides the corresponding default from config. The name, client, output_dir, and force fields are target-only; target_size can also be defaulted from config.

Option	Type	Default	Applies to	Description
`name`	string	`client`	all	Human-readable name. Used by `--target` to filter. Must be unique across targets; defaults to the `client` field when omitted.
`client`	string	–	all	One of `geth`, `reth`, `besu`, `nethermind`, `ethrex`. State-actor does not support `erigon` or `nimbus`.
`output_dir`	string	–	all	Absolute host path. If the directory already contains entries, that target is skipped (no error) — pass `--force` (CLI) or set `force: true` here to wipe and rebuild. For geth, state-actor writes into `<output_dir>/geth/chaindata`.
`target_size`	string	from `config`	all	Advisory size budget for auto-generated state, e.g. `5GB`, `500MB` (base-1024). Required for the target when no spec is configured; when a spec is configured (top-level or default), `target_size` is optional and acts as a headroom budget that state-actor fills past the spec's projected cost.
`force`	bool	`false`	all	Per-target override of the CLI `--force` flag: wipes `output_dir` before building so state-actor sees a clean directory. Useful when most targets should skip-if-built but specific ones should always rebuild.
`seed`	int64	from `config`, then `1` (state-actor)	all	RNG seed for auto-fill. `0` = wall-clock (non-reproducible).
`fork`	string	from `config`, then latest supported by state-actor	all	Hard fork at genesis, e.g. `prague`, `osaka`. Run `state-actor --list-forks` for the current list.
`chain_id`	int64	from `config`, then `1337` (state-actor)	all	Genesis chain ID.
`gas_limit`	uint64	from `config`, then `30000000` (state-actor)	all	Genesis gas limit.
`timestamp`	uint64	from `config`, then `0` (state-actor)	all	Unix seconds at genesis.
`extra_data`	string	from `config`, then `""`	all	Hex `extraData` for the genesis block.
`archive`	bool	from `config`, then `false`	geth, reth	Archive-mode metadata. Set `false` to opt out of a global `archive: true`. Rejected (after resolution) for besu/nethermind.
`binary_trie`	bool	from `config`, then `false`	geth	EIP-7864 binary trie. Set `false` to opt out of a global default. Rejected (after resolution) for non-geth.
`group_depth`	int	from `config`, then `8` (state-actor)	geth + binary_trie	Binary-trie serialisation unit. Range 1..8. Requires effective `binary_trie=true`.

State-actor itself only writes the genesis block; subsequent blocks come from running a client against the produced datadir. See state-actor RUNBOOK.md for the per-client boot recipes (e.g. geth needs --db.engine=pebble; reth needs --debug.skip-genesis-validation; besu needs --data-storage-format=BONSAI; ethrex needs --skip-genesis-validation and ≥ v16.0.0).

Running

# Build every target declared under builder.state_actor.targets / builder.eest_payloads.targets
benchmarkoor build --config build.yaml

# Build only specific targets by name, across all builders
benchmarkoor build --config build.yaml --target geth-5g --target reth-spec

# Limit a single builder's targets (the other builder is unrestricted)
benchmarkoor build --config build.yaml --limit-state-actor-target nethermind
benchmarkoor build --config build.yaml --limit-eest-payload-target payload-generator-nethermind

# Build just one client end-to-end: its snapshot, then its fill
benchmarkoor build --config build.yaml \
  --limit-state-actor-target nethermind \
  --limit-eest-payload-target payload-generator-nethermind

# Overwrite existing output_dir contents
benchmarkoor build --config build.yaml --force

Flag	Description
`--target`	Filter by target `name` across all builders (comma-separated or repeated).
`--limit-state-actor-target`	Filter only `builder.state_actor` targets; `eest_payloads` is left unrestricted.
`--limit-eest-payload-target`	Filter only `builder.eest_payloads` targets; `state_actor` is left unrestricted.
`--skip-state-actor-build`	Skip the `builder.state_actor` builder entirely (only `eest_payloads` runs).
`--skip-eest-payload-build`	Skip the `builder.eest_payloads` builder entirely (only `state_actor` runs).
`--force`	Wipe each selected target's `output_dir` before building (bypasses the skip-if-populated behaviour).
`--rebuild-on-diff`	Rebuild a populated `output_dir` when its config changed since the last build, instead of skipping (see below).

A target is built when it passes the global --target filter and the per-builder limit for the builder that owns it; an unset filter imposes no restriction. Any filter value that names no existing target is a hard error (typos surface immediately — the per-builder limits are checked against only that builder's target names). --skip-*-build removes a whole builder; a skipped builder's --limit-*-target is then ignored. Skipping every configured builder is an error.
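
The selection rules in the paragraph above can be sketched as follows. This is an illustration under stated assumptions, not benchmarkoor's implementation: `select_targets` and its parameters are hypothetical, and the global `--target` values are assumed to be checked against the target names of every configured builder.

```python
def select_targets(builders, target=(), limits=None, skip=()):
    """builders: builder name -> list of target names.

    target: values of --target; limits: builder name -> --limit-*-target
    values; skip: builders removed by --skip-*-build.
    """
    limits = limits or {}
    active = {b: names for b, names in builders.items() if b not in skip}
    if builders and not active:
        raise ValueError("every configured builder was skipped")

    # Assumption: --target is validated against all configured target names.
    all_names = {n for names in builders.values() for n in names}
    for name in target:
        if name not in all_names:
            raise ValueError(f"--target: unknown target {name!r}")

    selected = {}
    for builder, names in active.items():
        # Limits of skipped builders are never looked at, so they are ignored.
        limit = limits.get(builder, ())
        for name in limit:
            if name not in names:
                raise ValueError(f"{builder}: unknown target {name!r}")
        selected[builder] = [
            n for n in names
            if (not target or n in target) and (not limit or n in limit)
        ]
    return selected
```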

Rebuild on config change (`--rebuild-on-diff`)

By default a populated output_dir is skipped regardless of whether the config that produced it has changed; you must pass --force to pick up a new fork, seed, spec, filter, etc. After every successful build, benchmarkoor records a .benchmarkoor-build.json sidecar in the output_dir holding a fingerprint of the output-affecting config. With --rebuild-on-diff, a populated target is rebuilt only when that fingerprint differs from the current config (the changed keys are logged). An unchanged config still skips, and a directory with no sidecar (built before sidecars were recorded) is skipped until the next --force build records a baseline.

The fingerprint covers the inputs that actually change the output:

state_actor: client, image, target_size, seed, fork, chain_id, gas_limit, timestamp, extra_data, archive, binary_trie, group_depth, and the spec content.
eest_payloads: filler client + image, fork, tests, filter, marker, gas/opcode values, max gas, rpc seed key, filler extra args, datadir method, the content of the genesis / address-stubs / fill Dockerfile, the fork/EIP genesis overrides, and the execution-specs checkout resolved to a commit SHA (via git ls-remote, so a moving eest_ref that advanced is detected). It also folds in the source snapshot's fingerprint, so rebuilding a state_actor datadir cascades into rebuilding the fixtures generated from it.
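
To make the skip/rebuild decision concrete, here is a minimal sketch. The sidecar layout (`fingerprint` plus the recorded `inputs`), the SHA-256 over canonical JSON, and the function names are all assumptions for illustration; the actual contents of .benchmarkoor-build.json are not specified in this document.

```python
import hashlib
import json


def fingerprint(inputs):
    """Hash the output-affecting inputs in a key-order-independent way."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_keys(old, new):
    """Keys whose values differ between two input sets (for logging)."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


def decide(populated, current, sidecar=None, force=False, rebuild_on_diff=False):
    """Return 'build' or 'skip' for one target."""
    if force or not populated:
        return "build"
    if not rebuild_on_diff or sidecar is None:
        return "skip"  # no sidecar: wait for a --force build to record a baseline
    if sidecar["fingerprint"] == fingerprint(current):
        return "skip"
    return "build"
```

Sorting the keys before hashing keeps the fingerprint stable when the same config is written in a different key order.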

The command exits non-zero if any target fails; successful targets are still left in place on partial failure. A final summary lists each target with OK (built), SKIP (output_dir already populated), or ERR (failed).

Examples

Minimal — one geth datadir sized at 5 GB:

builder:
  state_actor:
    images:
      geth: ghcr.io/ethereum/state-actor:latest
    targets:
      - client: geth
        output_dir: /srv/state/geth-5g
        target_size: 5GB

Full: three clients sharing a top-level spec file and global defaults. geth adds a target_size headroom budget on top of the spec, and besu opts out of the global archive setting:

builder:
  state_actor:
    images:
      geth: ghcr.io/ethereum/state-actor:latest
      reth: ghcr.io/ethereum/state-actor-reth:latest
      besu: ghcr.io/ethereum/state-actor-besu:latest
    pull_policy: if-not-present
    spec_file: /etc/benchmarkoor/state-spec.yaml
    config:
      # Applies to every target that doesn't override it.
      seed: 42
      fork: prague
      chain_id: 1337
      archive: true       # geth + reth inherit this; besu sets archive: false below
    targets:
      - name: geth-archive
        client: geth
        output_dir: /srv/state/geth-archive
        # target_size + top-level spec: spec drives the build, target_size sets the headroom budget
        target_size: 5GB
        binary_trie: true
        group_depth: 4
      - client: reth
        output_dir: /srv/state/reth-spec
      - client: besu
        output_dir: /srv/state/besu-spec
        archive: false    # overrides config.archive=true (besu doesn't support archive)

Inline spec — write the YAML directly in the config (structured, so editors highlight it; a | block scalar works too):

builder:
  state_actor:
    images:
      geth: ghcr.io/ethereum/state-actor:latest
    spec:
      entities:
        - kind: eoa
          name: bloated-eoa
          approximate_size_bytes: 2_000_000_000
      # … rest of the state spec
    targets:
      - client: geth
        output_dir: /srv/state/geth-spec

`builder.eest_payloads` options

eest_payloads generates stateful EEST benchmark fixtures: it boots a filler EL client on a writable copy of a pre-populated snapshot datadir, runs fill-stateful against the live client (recording engine-API payloads anchored to the snapshot's head block), and writes the fixtures to each target's output_dir. fill-stateful itself does not manage datadirs — benchmarkoor boots the filler and snapshots it.

Filler client: geth (ethpandaops/geth:master) is the production-ready filler. nethermind (nethermindeth/nethermind:master) also works — it implements testing_buildBlockV1 with correct EIP-7928 block-access-lists, and fill-stateful's per-test rewind falls back to debug_resetHead for it (nethermind has no debug_setHead). besu works too with an image carrying the merged TestingBuildBlockV1 coinbase fix (e.g. ethpandaops/besu:bal-devnet-7); benchmarkoor auto-pins its session priority fee.

Fill image: by default benchmarkoor builds the fill image (the uv/python toolchain that runs fill-stateful) from a Dockerfile embedded in the binary — nothing to publish or pass. To pull a pre-built image instead, set fill_image; to build from a custom Dockerfile, set fill_dockerfile. The embedded Dockerfile lives at pkg/builder/Dockerfile.eest-filler; to build it by hand:
docker build -f pkg/builder/Dockerfile.eest-filler -t ghcr.io/your-org/eest-fill-stateful:latest .

builder:
  eest_payloads:
    # Fill image defaults to a Dockerfile embedded in the binary. Optionally:
    # fill_image: ghcr.io/your-org/eest-fill-stateful:latest   # pull a pre-built image instead
    # fill_dockerfile: pkg/builder/Dockerfile.eest-filler      # or build from a custom Dockerfile
    pull_policy: always                  # always | if-not-present | never (default: always)
    container_runtime: docker            # docker | podman (default: inherits runner.container_runtime, then docker)
    # jwt: <hex>                         # Engine API secret, shared with the filler (default: benchmarkoor's DefaultJWT)
    # fill_command: [uv, run, fill-stateful]   # argv prefix inside fill_image (this is the default)
    # eest_repo: https://github.com/ethereum/execution-specs.git   # cloned + mounted at /eest (default)
    # eest_ref: forks/amsterdam          # branch, tag, or commit to check out (default: forks/amsterdam)
    config:                              # shared per-target defaults; targets override when set
      filler_image: ethpandaops/geth:master
      fork: Osaka
      gas_benchmark_values: [10, 30]     # millions of gas to parametrise against
      # fixed_opcode_count: [0.5, 1, 2]  # thousands of opcodes; mutually exclusive with gas_benchmark_values
      datadir_method: copy               # copy | overlayfs | fuse-overlayfs | zfs | direct | schelk
    targets:
      - name: compute-geth
        filler_client: geth
        source_dir: /srv/state/geth-archive     # PRISTINE snapshot (never mutated; a writable copy is filled)
        # geth boots from the datadir; to fill a fork that activates after the
        # snapshot, pass --override.<fork> here (besu/nethermind use `genesis` +
        # genesis_fork_override / genesis_eip_override instead):
        # filler_extra_args: [--override.amsterdam=1]
        output_dir: /srv/fixtures/compute
        tests:
          - tests/benchmark/compute              # pytest paths inside the fill image
        filter: bn128                            # optional pytest -k expression

Option	Type	Default	Description
`fill_image`	string	–	Pre-built container image carrying the uv/python toolchain that runs `fill-stateful`. Optional: when neither this nor `fill_dockerfile` is set, benchmarkoor builds the fill image from a Dockerfile embedded in the binary.
`fill_dockerfile`	string	–	Path to a custom Dockerfile that benchmarkoor builds with the container runtime at build time, instead of pulling a pre-built image or using the embedded default. Tagged `fill_image` when set, else `benchmarkoor-eest-fill:local`. Requires the runtime's `build` CLI (docker/podman) on the host.
`pull_policy`	string	`always`	One of `always`, `if-not-present`, `never`. Applies to both the fill image and the filler image (ignored for a locally built fill image).
`container_runtime`	string	runner's runtime, then `docker`	Container runtime for the filler + fill containers.
`jwt`	string	benchmarkoor's `DefaultJWT`	Engine API JWT secret; shared between the filler client and `fill-stateful`.
`fill_command`	[]string	`[uv, run, fill-stateful]`	argv prefix invoked inside `fill_image` before the `fill-stateful` flags. Override if your image exposes the command differently.
`eest_repo`	string	`https://github.com/ethereum/execution-specs.git`	execution-specs repo cloned for filling.
`eest_ref`	string	`forks/amsterdam`	Branch, tag, or commit of `eest_repo`. benchmarkoor always clones the repo at this ref into an on-disk cache at build time and mounts the checkout into the fill container at `/eest` (the `fill_image` carries only the uv/python toolchain, not the repo), so the EEST version is config-driven and changeable without rebuilding the image. The clone is cached and re-fetched only when the ref changes; `uv` builds the venv into the mounted checkout on first use (cached across runs).
`config`	object	–	Shared defaults for the per-target parameters. See below.
`targets`	[]object	–	Required when invoking `benchmarkoor build`. See below.

`builder.eest_payloads.config` options

Every field below is also available per-target; a non-nil/non-empty value on a target overrides the default. Use this block to avoid repeating shared knobs (fork, tests, filter, address_stubs, …) across targets that build the same suite. (Only the identity/locator fields — name, filler_client, source_dir, output_dir, genesis, genesis_fork_override, genesis_eip_override — are target-only and never hoisted.)

Option	Type	Default	Description
`filler_image`	string	–	Docker image for the filler client (e.g. `ethpandaops/geth:master`).
`fork`	string	–	Fork to fill against, e.g. `Osaka` (passed to `fill-stateful --fork`).
`tests`	string[]	–	pytest paths inside the fill image, e.g. `tests/benchmark/compute`. Required after resolution — set here or per-target.
`filter`	string	–	pytest `-k` expression (substring/node-id selection).
`marker`	string	–	pytest `-m` marker expression, orthogonal to `filter`'s `-k`, e.g. `repricing` / `not repricing`.
`address_stubs`	map	–	Inline `--address-stubs` map: stub name → arbitrary string fields (e.g. `addr`, `pkey`). Materialised to a temp JSON file at build time. Mutually exclusive with `address_stubs_file`.
`address_stubs_file`	string	–	Absolute host path to a `--address-stubs` JSON map. Mutually exclusive with `address_stubs`.
`gas_benchmark_values`	int[]	–	Gas budgets in millions, e.g. `[10, 30]`; joined into `--gas-benchmark-values`. Mutually exclusive with `fixed_opcode_count`.
`fixed_opcode_count`	float[]	–	Opcode counts in thousands, e.g. `[0.5, 1, 2]`; joined into `--fixed-opcode-count`. An empty list (`[]`) passes the flag bare, using the fill image's `.fixed_opcode_counts.json` default. Mutually exclusive with `gas_benchmark_values`.
`datadir_method`	string	`copy`	How the filler's writable copy of `source_dir` is prepared: `copy`, `overlayfs`, `fuse-overlayfs`, `zfs`, `direct`, `schelk`. Use `zfs`/`overlayfs` to avoid a full copy of a large snapshot.
`max_gas_per_test`	uint64	–	Overrides the fork's transaction gas-limit cap (`--max-gas-per-test`).
`rpc_seed_key`	string	–	Pin the seed EOA for reproducible fills (`--rpc-seed-key`); otherwise one is generated and funded via CL withdrawal.
`filler_extra_args`	[]string	–	Extra argv appended to the filler client command.

Address-stubs hoisting: address_stubs / address_stubs_file hoist as a unit — a target that sets either form inherits neither from config, so their mutual exclusion is preserved. An inline address_stubs example:
address_stubs:
  bloated_eoa_10GB:
    addr: "0x87a6314da5ac8832f6e7a176c8fb133b19f5be04"
    pkey: "0x4da32d29f6dcffa26e09dc4e102033f2d105de1444fb893493ae703289275e0e"
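
The "hoist as a unit" rule can be sketched as below: if a target sets either form, neither default from `config` is inherited. `resolve_address_stubs` is a hypothetical name used only for this illustration.

```python
def resolve_address_stubs(defaults, target):
    """Return (address_stubs, address_stubs_file) for the effective target."""
    keys = ("address_stubs", "address_stubs_file")
    # If the target sets either form, the config defaults for both are dropped.
    source = target if any(target.get(k) for k in keys) else defaults
    stubs, stubs_file = (source.get(k) for k in keys)
    if stubs and stubs_file:
        raise ValueError("address_stubs and address_stubs_file are mutually exclusive")
    return stubs, stubs_file
```

Resolving both keys from a single source is what keeps the mutual exclusion intact: an inline map on the target can never be combined with a file path inherited from `config`.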

`builder.eest_payloads.targets[]` options

Identity/locator fields are target-only; the rest mirror config and are resolved with per-target precedence.

Option	Type	Default	Description
`name`	string	`filler_client`	Used by `--target` to filter. Must be unique across targets.
`filler_client`	string	–	Client booted as the filler: `geth`, `nethermind`, or `besu` (all implement `testing_buildBlockV1`).
`source_dir`	string	–	Absolute host path to the pristine snapshot datadir (e.g. a `state_actor` `output_dir`). Never mutated — a writable copy is filled. Existence is checked at build time.
`genesis`	string	–	Absolute host path to the genesis/chainspec the filler boots with (besu/nethermind read their fork schedule from it; passed via the client's genesis flag). Must match the chain config used to produce `source_dir`. geth/erigon boot from the datadir instead and need no `genesis`.
`genesis_fork_override`	map	–	Patch the geth-format `genesis` at filler boot to activate forks at given timestamps (`{amsterdam: 1}` → `config.amsterdamTime`, inheriting the blob schedule). For besu/reth/ethrex fillers. Same mechanism as the runner. Requires `genesis`.
`genesis_eip_override`	object	–	Patch a parity/nethermind `genesis` at filler boot, setting `params.eip<N>TransitionTimestamp` for each listed EIP. Fields: `timestamp` (uint), `eips` ([]uint). For the nethermind filler. Requires `genesis`; mutually exclusive with `genesis_fork_override`.
`output_dir`	string	–	Absolute host path for the generated fixtures. Skipped if already populated unless `--force` / `force: true`. Written under `<output_dir>/blockchain_tests_stateful_engine/`.
`force`	bool	`false`	Per-target override of `--force`: wipe `output_dir` before filling.
`filler_image`, `fork`, `tests`, `filter`, `marker`, `address_stubs`, `address_stubs_file`, `gas_benchmark_values`, `fixed_opcode_count`, `datadir_method`, `max_gas_per_test`, `rpc_seed_key`, `filler_extra_args`	—	from `config`	Mirror `config` with per-target precedence — see the `config` table above. `tests`, `fork`, and `filler_image` are required after resolution (set on the target or in `config`).

Replaying generated fixtures

Point benchmarkoor run at the pristine snapshot (never the copy the filler mutated) and at the fixture output:

runner:
  client:
    datadirs:
      geth:
        source_dir: /srv/state/geth-archive       # the pristine snapshot
        method: zfs                                # or copy/overlayfs/…
  benchmark:
    tests:
      source:
        eest_fixtures:
          local_fixtures_dir: /srv/fixtures/compute
          fixtures_subdir: blockchain_tests_stateful_engine

Stateful replay needs the new fixture format support — see benchmarkoor #182.

As a sanity check, each fixture's recorded benchmarkGasUsed should match benchmarkoor's measured gas_used_total for that test.
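
One way to run that comparison is sketched below. It assumes you have already extracted two mappings keyed by test name, one holding each fixture's benchmarkGasUsed and one holding the measured gas_used_total; how those values are read from the fixture and result files is not covered here, and `gas_mismatches` is a hypothetical helper.

```python
def gas_mismatches(recorded, measured):
    """Map test name -> (benchmarkGasUsed, gas_used_total) where they differ.

    A test missing from `measured` is reported with None as its measured value.
    """
    return {
        name: (gas, measured.get(name))
        for name, gas in recorded.items()
        if measured.get(name) != gas
    }
```

An empty result means every recorded fixture matched its measured run.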

API Server

See API Server documentation for the full reference on the api config section, including server settings, authentication, database, storage, endpoints, and UI integration.

Examples

Running stateless tests across all clients:

global:
  log_level: info

runner:
  client_logs_to_stdout: true
  cleanup_on_start: false

  benchmark:
    results_dir: ./results
    generate_results_index: true
    generate_suite_stats: true
    tests:
      filter: "bn128"
      source:
        git:
          repo: https://github.com/NethermindEth/gas-benchmarks.git
          version: main
          pre_run_steps: []
          steps:
            setup:
              - eest_tests/setup/*/*
            test:
              - eest_tests/testing/*/*
            cleanup: []

  client:
    config:
      resource_limits:
        cpuset_count: 4
        memory: "16g"
        swap_disabled: true
      genesis:
        besu: https://github.com/nethermindeth/gas-benchmarks/raw/refs/heads/main/scripts/genesisfiles/besu/zkevmgenesis.json
        erigon: https://github.com/nethermindeth/gas-benchmarks/raw/refs/heads/main/scripts/genesisfiles/geth/zkevmgenesis.json
        ethrex: https://github.com/nethermindeth/gas-benchmarks/raw/refs/heads/main/scripts/genesisfiles/geth/zkevmgenesis.json
        geth: https://github.com/nethermindeth/gas-benchmarks/raw/refs/heads/main/scripts/genesisfiles/geth/zkevmgenesis.json
        nethermind: https://github.com/nethermindeth/gas-benchmarks/raw/refs/heads/main/scripts/genesisfiles/nethermind/zkevmgenesis.json
        nimbus: https://github.com/nethermindeth/gas-benchmarks/raw/refs/heads/main/scripts/genesisfiles/geth/zkevmgenesis.json
        reth: https://github.com/nethermindeth/gas-benchmarks/raw/refs/heads/main/scripts/genesisfiles/geth/zkevmgenesis.json

  instances:
    - id: nethermind
      client: nethermind
    - id: geth
      client: geth
    - id: reth
      client: reth
    - id: erigon
      client: erigon
    - id: besu
      client: besu

Running EEST fixtures across multiple clients:

global:
  log_level: info

runner:
  client_logs_to_stdout: true
  cleanup_on_start: true

  benchmark:
    results_dir: ./results
    generate_results_index: true
    generate_suite_stats: true
    tests:
      filter: "bn128"  # Optional: filter tests by name
      source:
        eest_fixtures:
          github_repo: ethereum/execution-specs
          github_release: tests-benchmark@v0.0.9

  client:
    config:
      resource_limits:
        cpuset_count: 4
        memory: "16g"
        swap_disabled: true
      # Genesis files are auto-resolved from the EEST release.
      # No need to configure genesis URLs unless you want to override.

  instances:
    - id: geth
      client: geth
    - id: nethermind
      client: nethermind
    - id: reth
      client: reth
    - id: besu
      client: besu
    - id: erigon
      client: erigon

Running EEST fixtures from a local directory (no GitHub required):

global:
  log_level: info

runner:
  client_logs_to_stdout: true
  cleanup_on_start: true

  benchmark:
    results_dir: ./results
    generate_results_index: true
    generate_suite_stats: true
    tests:
      source:
        eest_fixtures:
          local_fixtures_dir: /home/user/execution-spec-tests/output/fixtures
          local_genesis_dir: /home/user/execution-spec-tests/output/genesis

  client:
    config:
      resource_limits:
        cpuset_count: 4
        memory: "16g"
        swap_disabled: true

  instances:
    - id: geth
      client: geth
    - id: reth
      client: reth

Running stateful tests on a geth container with an existing data directory:

global:
  log_level: info

runner:
  client_logs_to_stdout: true
  cleanup_on_start: false

  benchmark:
    results_dir: ./results
    results_owner: "${UID}:${GID}"
    generate_results_index: true
    generate_suite_stats: true
    tests:
      source:
        git:
          repo: https://github.com/skylenet/gas-benchmarks.git
          version: order-stateful-tests-subdirs
          pre_run_steps:
            - stateful_tests/gas-bump.txt
            - stateful_tests/funding.txt
          steps:
            setup:
              - stateful_tests/setup/*/*
            test:
              - stateful_tests/testing/*/*
            cleanup:
              - stateful_tests/cleanup/*/*

  client:
    config:
      drop_memory_caches: "steps"
    datadirs:
      geth:
        source_dir: ${HOME}/data/clients/perf-devnet-2/23861500/geth
        method: overlayfs

  instances:
    - id: geth
      client: geth
      image: ethpandaops/geth:master
      extra_args:
        - --miner.gaslimit=1000000000
        - --txpool.globalqueue=10000
        - --txpool.globalslots=10000
        - --networkid=12159
        - --override.osaka=1864841831
        - --override.bpo1=1864841831
        - --override.bpo2=1864841831

For API server examples, see the API Server documentation.
