test: per-variant BlockOp reject coverage + predictor mutation gate (PRED-04) #194

Merged
AbdelStark merged 1 commit into AbdelStark:main from adrienlacombe:test/181-blockop-reject-coverage
Jun 10, 2026

Conversation

@adrienlacombe

Contributor

Closes #181. Part of #177.

What this does

Three coverage gaps from the issue, each now closed:

1. Every BlockOp variant has a failing-on-tamper test. Inventory found 11 of 12 variants already covered in pwm-verifier/tests/block.rs; only Requant had no tampered-output reject test (it only had the InvalidShift boundary test). Added accept_and_reject_tampered_requant_output, which also pins the hand-computed accept outputs ([10, -7] at shift 1 → [5, -4], exercising the ties-to-even rounding case) and documents where each of the 12 variants' tamper tests live.

2. Predictor accept tests assert exact outputs. small_dims_one_block_verifies and small_dims_six_blocks_verify_within_float_tolerance previously checked only is_ok plus a float tolerance. They now pin the exact 24-element integer output vector (SMALL_DIMS_OUTPUT), locking the quantized semantics (kernels, table generation, scheme derivation, deterministic synthetic weights) end-to-end. Notable finding, documented in the test: depth 1 and depth 6 pin the same vector. At the synthetic weights, the AdaLN-gated block contributions shift the float reference by only ~3e-3, which is below one f_ln quantization step, so blocks 2–6 don't move the final per-position LayerNorm off the depth-1 grid points. The float references do differ and both stay within tolerance, as verified by the existing float-faithfulness gate, which still runs.

3. Mutation campaign covers the predictor path. Added the range_check.modp_predictor mutant to SoundnessCampaign: the mod-p accumulator-aliasing forgery (out[0] += p) on a BatchedLinear accumulator, the named-buffer analog of range_check.modp_accumulator. The kill condition asserts the typed AccumulatorRange { op_id } — not just any rejection — so it pins the block-path Freivalds range guard specifically. Campaign floor raised from 5 to 7 mutants.
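For readers unfamiliar with the mod-p aliasing forgery in item 3, here is a minimal sketch of why a Freivalds-style check alone cannot reject it and a range guard can. Everything in it is an assumption made for illustration: the modulus `P`, the bound `ACC_BOUND`, the function names, the error fields, and the fixed challenge vector `r` (a real check would sample it randomly). It is not the pwm-verifier implementation.

```rust
// Illustrative sketch only: the modulus, the bound, and all names below are
// assumptions for this example, not the pwm-verifier API.

const P: i128 = 2_147_483_647; // 2^31 - 1, an assumed prime modulus
const ACC_BOUND: i128 = 1 << 24; // assumed accumulator bound, far below P

#[derive(Debug, PartialEq)]
enum VerifyError {
    AccumulatorRange { op_id: usize },
    FreivaldsCheckFailed { op_id: usize },
}

/// Dot product reduced into [0, P).
fn dot_mod_p(a: &[i128], b: &[i128]) -> i128 {
    a.iter()
        .zip(b)
        .fold(0i128, |acc, (x, y)| (acc + x * y).rem_euclid(P))
}

/// Freivalds-style identity for a claimed `out = W * x`:
/// r^T * out == (r^T * W) * x  (mod P).
fn freivalds_holds(w: &[Vec<i128>], x: &[i128], out: &[i128], r: &[i128]) -> bool {
    let lhs = dot_mod_p(r, out);
    // r^T * W, computed column by column.
    let rt_w: Vec<i128> = (0..x.len())
        .map(|j| {
            w.iter()
                .zip(r)
                .fold(0i128, |acc, (row, ri)| (acc + ri * row[j]).rem_euclid(P))
        })
        .collect();
    lhs == dot_mod_p(&rt_w, x)
}

/// Range guard first, then the mod-P identity.
fn verify_linear(
    op_id: usize,
    w: &[Vec<i128>],
    x: &[i128],
    out: &[i128],
    r: &[i128],
) -> Result<(), VerifyError> {
    // Adding P to an entry leaves every mod-P equation unchanged, so only
    // a bound on the integer value itself can detect the forgery.
    if out.iter().any(|v| v.abs() > ACC_BOUND) {
        return Err(VerifyError::AccumulatorRange { op_id });
    }
    if !freivalds_holds(w, x, out, r) {
        return Err(VerifyError::FreivaldsCheckFailed { op_id });
    }
    Ok(())
}
```

With `W = [[1, 2], [3, 4]]`, `x = [5, 6]`, and `r = [7, 11]`, the honest output `[17, 39]` verifies. The forged output `[17 + P, 39]` still satisfies `freivalds_holds`, and `verify_linear` rejects it with `AccumulatorRange`. Asserting that specific variant, rather than any error, is what ties the kill condition to the range guard.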

Acceptance (from #181)

  • Every BlockOp variant has a failing-on-tamper test (12/12, asserting the exact BlockOpMismatch / FreivaldsCheckFailed with the right op_id).
  • Predictor accept tests assert exact outputs (not just is_ok).
  • Mutation campaign includes a predictor-path soundness mutant.
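As a rough illustration of the first acceptance item, the sketch below shows an exact-recompute check for a requant step with ties-to-even rounding, reproducing the pinned case ([10, -7] at shift 1 → [5, -4]) and rejecting a bumped output with a typed error. The function names, the error fields, and the shift limit of 62 are hypothetical; only the variant names `BlockOpMismatch` and `InvalidShift` and the `op_id` field come from the description above.

```rust
// Illustrative sketch only: signatures, error fields, and the shift limit are
// assumptions for this example, not the pwm-verifier API.

#[derive(Debug, PartialEq)]
enum BlockError {
    InvalidShift { op_id: usize },
    BlockOpMismatch { op_id: usize },
}

/// Divide by 2^shift, rounding to nearest with ties going to the even result.
/// Callers must ensure 1 <= shift <= 62.
fn requant_one(x: i64, shift: u32) -> i64 {
    let floor = x >> shift; // arithmetic shift: floor division
    let rem = x & ((1i64 << shift) - 1); // remainder in [0, 2^shift)
    let half = 1i64 << (shift - 1);
    // Round up when above the midpoint, or exactly on it with an odd floor.
    if rem > half || (rem == half && (floor & 1) == 1) {
        floor + 1
    } else {
        floor
    }
}

/// Recompute the requant exactly and compare against the claimed output.
fn verify_requant(
    op_id: usize,
    input: &[i64],
    shift: u32,
    claimed: &[i64],
) -> Result<(), BlockError> {
    // Assumed boundary for this sketch: shifts outside 1..=62 are invalid.
    if shift == 0 || shift > 62 {
        return Err(BlockError::InvalidShift { op_id });
    }
    let expected: Vec<i64> = input.iter().map(|&x| requant_one(x, shift)).collect();
    if expected.as_slice() != claimed {
        return Err(BlockError::BlockOpMismatch { op_id });
    }
    Ok(())
}
```

Here 10 / 2 is exactly 5, while -7 / 2 = -3.5 sits on a tie and rounds to the even neighbour -4. A claimed output of `[5, -3]` for the same input is rejected with `BlockOpMismatch` carrying the op's `op_id`.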

Full workspace suite green (200 tests), real-dims predictor release test green, cargo fmt + clippy --workspace --all-targets clean. Independent of #193 (branched from main; the only shared file, block.rs, is touched in non-overlapping hunks).

🤖 Generated with Claude Code

Closes AbdelStark#181.

- block.rs: add the missing Requant tampered-output reject test — with it,
  every one of the 12 BlockOp variants rejects a bumped claimed output with
  the exact typed error and op_id (Linear/BatchedLinear via Freivalds, the
  exact-recompute ops via BlockOpMismatch). The accept half pins the
  hand-computed requant outputs (incl. the ties-to-even case).
- lewm_predictor.rs: the small-dims predictor accept tests now pin the exact
  24-element integer output vector instead of just is_ok + float tolerance,
  locking the quantized semantics (kernels, tables, scheme derivation,
  synthetic weights) end-to-end. Depth 1 and depth 6 pin the same vector:
  the AdaLN-gated block contributions shift the float reference by less
  than one f_ln quantization step (documented in the test).
- soundness_campaign.rs: add the range_check.modp_predictor mutant — the
  mod-p accumulator-aliasing forgery (out[0] += p) on a BatchedLinear
  accumulator, killed only by the block-path Freivalds range guard
  (asserts the typed AccumulatorRange, not just any rejection) — so the
  predictor/block path is merge-gated like the flat demo path. Campaign
  floor raised to 7 mutants.

Full workspace suite green (200 tests); real-dims predictor release test
unaffected and green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@AbdelStark AbdelStark merged commit b9ecf01 into AbdelStark:main Jun 10, 2026
9 checks passed
Development

Successfully merging this pull request may close these issues.

PRED-04 test: per-variant BlockOp reject tests + predictor mutation coverage
