test: per-variant BlockOp reject coverage + predictor mutation gate (PRED-04) by adrienlacombe · Pull Request #194 · AbdelStark/ProvableWorldModel

adrienlacombe · 2026-06-10T11:56:05Z

Closes #181. Part of #177.

What this does

Three coverage gaps from the issue, each now closed:

1. Every BlockOp variant has a failing-on-tamper test. Inventory found 11 of 12 variants already covered in pwm-verifier/tests/block.rs; only Requant had no tampered-output reject test (it had only the InvalidShift boundary test). Added accept_and_reject_tampered_requant_output, which also pins the hand-computed accept outputs ([10, -7] at shift 1 → [5, -4], exercising the ties-to-even rounding case) and documents where the tamper test for each of the 12 variants lives.

2. Predictor accept tests assert exact outputs. small_dims_one_block_verifies and small_dims_six_blocks_verify_within_float_tolerance previously only checked is_ok plus a float tolerance. They now pin the exact 24-element integer output vector (SMALL_DIMS_OUTPUT), locking the quantized semantics — kernels, table generation, scheme derivation, deterministic synthetic weights — end-to-end. Notable finding documented in the test: depth 1 and depth 6 pin the same vector, because the AdaLN-gated block contributions at the synthetic weights shift the float reference by ~3e-3, below one f_ln quantization step, so blocks 2–6 don't move the final per-position LayerNorm off the depth-1 grid points (the float references do differ, and both stay within tolerance — verified by the existing float-faithfulness gate that still runs).

3. Mutation campaign covers the predictor path. Added the range_check.modp_predictor mutant to SoundnessCampaign: the mod-p accumulator-aliasing forgery (out[0] += p) on a BatchedLinear accumulator, the named-buffer analog of range_check.modp_accumulator. The kill condition asserts the typed AccumulatorRange { op_id } — not just any rejection — so it pins the block-path Freivalds range guard specifically. Campaign floor raised from 5 to 7 mutants.
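The pinned Requant values in item 1 can be checked by hand. The sketch below is a minimal illustration of a right shift with round-to-nearest, ties-to-even, plus an exact-recompute comparison against a claimed output. It is not the pwm-verifier kernel: the names requant_ties_even, requant_vec and claimed_matches, the i64 element type, and the shift limit are assumptions made for this example only.

```rust
/// Hypothetical helper (not the pwm-verifier kernel): divide `x` by 2^shift,
/// rounding to nearest with ties going to the even neighbour.
/// Assumes 0 <= shift < 63; the real InvalidShift bounds are not stated in this PR.
fn requant_ties_even(x: i64, shift: u32) -> i64 {
    assert!(shift < 63, "shift out of range for this sketch");
    if shift == 0 {
        return x;
    }
    let floor = x >> shift; // arithmetic shift: floor(x / 2^shift), also for negatives
    let rem = x & ((1i64 << shift) - 1); // remainder in [0, 2^shift)
    let half = 1i64 << (shift - 1);
    if rem > half || (rem == half && (floor & 1) == 1) {
        floor + 1 // above the midpoint, or an exact tie whose floor is odd
    } else {
        floor // below the midpoint, or an exact tie whose floor is already even
    }
}

/// Requantize a whole vector with one shared shift.
fn requant_vec(xs: &[i64], shift: u32) -> Vec<i64> {
    xs.iter().map(|&x| requant_ties_even(x, shift)).collect()
}

/// Exact-recompute check: a claimed output is accepted only if every element
/// matches the recomputed value. Returns a bool here; the real verifier
/// reports a typed error instead.
fn claimed_matches(xs: &[i64], shift: u32, claimed: &[i64]) -> bool {
    requant_vec(xs, shift).as_slice() == claimed
}
```

With these definitions, requant_vec(&[10, -7], 1) gives [5, -4]: 10 / 2 is exactly 5, and -7 / 2 = -3.5 is a tie that rounds to the even neighbour -4. A claimed output bumped by one in either slot, such as [5, -3], fails claimed_matches.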

Acceptance (from #181)

Every BlockOp variant has a failing-on-tamper test (12/12, asserting the exact BlockOpMismatch / FreivaldsCheckFailed with the right op_id).
Predictor accept tests assert exact outputs (not just is_ok).
Mutation campaign includes a predictor-path soundness mutant.
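Why the predictor-path mutant needs a range guard, and not only a Freivalds check, can be illustrated with a toy example. The sketch below is an assumption-laden illustration rather than the project's code: the prime P, the i128 element type, the bound passed to in_range, and all function names are hypothetical, and the real verifier returns the typed AccumulatorRange { op_id } error instead of a bool.

```rust
// Illustrative prime (2^31 - 1). The field used by the project is not stated in this PR.
const P: i128 = 2_147_483_647;

/// Reduce into [0, P).
fn reduce(x: i128) -> i128 {
    x.rem_euclid(P)
}

/// Freivalds-style check that `claimed` equals `w * x` modulo P, using the
/// vector `r` (random in practice, fixed here): r . claimed == (r^T w) . x (mod P).
fn freivalds_mod_p(w: &[Vec<i128>], x: &[i128], claimed: &[i128], r: &[i128]) -> bool {
    let lhs = r
        .iter()
        .zip(claimed)
        .fold(0i128, |acc, (ri, ci)| reduce(acc + reduce(*ri) * reduce(*ci)));
    let mut rhs = 0i128;
    for (j, xj) in x.iter().enumerate() {
        // j-th entry of r^T w
        let col = r
            .iter()
            .zip(w)
            .fold(0i128, |acc, (ri, row)| reduce(acc + reduce(*ri) * reduce(row[j])));
        rhs = reduce(rhs + col * reduce(*xj));
    }
    lhs == rhs
}

/// Range guard: every accumulator must lie in [-bound, bound] as an integer.
/// The bound is an assumption; it must stay well below P / 2 to rule out aliasing.
fn in_range(claimed: &[i128], bound: i128) -> bool {
    claimed.iter().all(|c| c.abs() <= bound)
}

/// Accept only if both the range guard and the mod-P identity hold.
fn verify(w: &[Vec<i128>], x: &[i128], claimed: &[i128], r: &[i128], bound: i128) -> bool {
    in_range(claimed, bound) && freivalds_mod_p(w, x, claimed, r)
}
```

For w = [[1, 2], [3, 4]] and x = [5, 6] the honest output is [17, 39]. The forged output [17 + P, 39] has the same residues, so freivalds_mod_p alone accepts it; only the in_range guard rejects it. This is the behaviour the out[0] += p mutant is meant to pin.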

Full workspace suite green (200 tests), real-dims predictor release test green, cargo fmt + clippy --workspace --all-targets clean. Independent of #193 (branched from main; the only shared file, block.rs, is touched in non-overlapping hunks).

🤖 Generated with Claude Code

Closes AbdelStark#181.

- block.rs: add the missing Requant tampered-output reject test. With it, every one of the 12 BlockOp variants rejects a bumped claimed output with the exact typed error and op_id (Linear/BatchedLinear via Freivalds, the exact-recompute ops via BlockOpMismatch). The accept half pins the hand-computed requant outputs (incl. the ties-to-even case).
- lewm_predictor.rs: the small-dims predictor accept tests now pin the exact 24-element integer output vector instead of just is_ok + float tolerance, locking the quantized semantics (kernels, tables, scheme derivation, synthetic weights) end-to-end. Depth 1 and depth 6 pin the same vector: the AdaLN-gated block contributions shift the float reference by less than one f_ln quantization step (documented in the test).
- soundness_campaign.rs: add the range_check.modp_predictor mutant, the mod-p accumulator-aliasing forgery (out[0] += p) on a BatchedLinear accumulator, killed only by the block-path Freivalds range guard (asserts the typed AccumulatorRange, not just any rejection), so the predictor/block path is merge-gated like the flat demo path. Campaign floor raised to 7 mutants.

Full workspace suite green (200 tests); real-dims predictor release test unaffected and green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

AbdelStark merged commit b9ecf01 into AbdelStark:main Jun 10, 2026
9 checks passed

AbdelStark merged 1 commit into AbdelStark:main from adrienlacombe:test/181-blockop-reject-coverage
