4 changes: 2 additions & 2 deletions .github/workflows/release.yml
@@ -143,11 +143,11 @@ jobs:
lake exe cache get
lake build

-# Package the publishable crates (pwm-testkit is publish=false). The whole
+# Package the publishable crates (pwm-testkit and pwm-tui are publish=false). The whole
# workspace is packaged together so inter-crate deps resolve among the set.
- name: Package crates
run: |
-cargo package --workspace --exclude pwm-testkit --no-verify --locked
+cargo package --workspace --exclude pwm-testkit --exclude pwm-tui --no-verify --locked
ls -1 target/package/*.crate

- name: Extract release notes
76 changes: 76 additions & 0 deletions .github/workflows/tui-smoke.yml
@@ -0,0 +1,76 @@
# SPDX-License-Identifier: Apache-2.0
#
# Headless smoke for the pwm-tui real-checkpoint demo (no Docker). Separate from
# the merge gate because it downloads external Hugging Face assets and a torch
# environment. Runs on demand and weekly on main.
name: TUI Smoke

on:
  workflow_dispatch:
  schedule:
    - cron: "41 5 * * 1"

permissions:
  contents: read

jobs:
  tui-headless:
    name: pwm-tui headless real demo
    runs-on: ubuntu-latest
    timeout-minutes: 60
    env:
      NO_COLOR: "1"
      PWM_CACHE_DIR: ${{ github.workspace }}/.pwm-cache
    steps:
      - uses: actions/checkout@v6

      - name: Install uv
        uses: astral-sh/setup-uv@v8

      - name: Build pwm-tui
        run: cargo build -p pwm-tui --release --locked

      - name: Cold run (downloads pinned assets, exports, proves)
        run: |
          set -euo pipefail
          ./target/release/pwm-tui --headless --json | tee tui-cold.log
          tail -n 1 tui-cold.log > tui-summary.json
          python3 - <<'PY'
          import json
          s = json.load(open("tui-summary.json"))
          assert s["ok"] is True, s
          r = s["report"]
          assert r["accepted"] is True
          assert r["weights_root_source"] == "export-bundle"
          assert float(r["float_error"]) <= float(r["float_tolerance"])
          assert r["tamper"]["rejected_with"], "tamper must be rejected"
          print("cold run ok:", r["weights_root"][:16], r["z_out_head"])
          PY

      - name: Warm run (offline, bundle cache hit, pure Rust)
        run: |
          set -euo pipefail
          ./target/release/pwm-tui --headless --offline --json | tee tui-warm.log
          grep -q "skipped: bundle cache hit" tui-warm.log
          tail -n 1 tui-warm.log | python3 -c "import json,sys; s=json.load(sys.stdin); assert s['ok'] is True"

      - name: Cache management commands
        run: |
          set -euo pipefail
          ./target/release/pwm-tui cache path
          ./target/release/pwm-tui cache ls
          ./target/release/pwm-tui cache clear --yes
          ./target/release/pwm-tui --headless --offline > offline-cold.log 2>&1 && exit 1 || true
          grep -q "offline" offline-cold.log

      - name: Upload logs
        if: always()
        uses: actions/upload-artifact@v7
        with:
          name: tui-smoke-${{ github.run_id }}
          path: |
            tui-cold.log
            tui-warm.log
            offline-cold.log
          if-no-files-found: warn
          retention-days: 14
18 changes: 16 additions & 2 deletions AGENTS.md
@@ -41,7 +41,7 @@ non-reproducible attention. There is no proving circuit and no arithmetization.

## Layout

-Five small crates, one trust anchor. The verifier depends on neither the exporter
+Six small crates, one trust anchor. The verifier depends on neither the exporter
nor any Python or float runtime.

- `pwm-core`: fields (M31 value, Fp61 audit), fixed point, tensors, Merkle
@@ -55,7 +55,12 @@ nor any Python or float runtime.
- `pwm-verifier`: CPU, `no_std`, float-free. Freivalds-check the linears,
recompute the rest, check rollout, cost, argmin.
- `pwm-testkit`: golden vectors, accept and reject suites, the mutation harness,
-the `pwm` demo CLI (`crates/pwm-testkit/src/bin/pwm.rs`).
+the `pwm` demo CLI (`crates/pwm-testkit/src/bin/pwm.rs`), and the shared
+`report` module: the single load -> prove -> verify -> tamper computation that
+every demo front end renders.
+- `pwm-tui`: the ratatui demo front end. Proves the real checkpoint locally with
+no Docker, with sha256-pinned checkpoint/dataset/bundle caching and a
+`--headless --json` mode for CI. Demo only; not published.
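
The sha256 pinning mentioned for `pwm-tui` can be pictured with a short Python sketch. This is an illustration under assumptions, not the crate's Rust implementation: `sha256_of`, `verify_pinned`, and the demo file are hypothetical names, and the real pinned digests live in the project, not here.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so a large checkpoint is never held in memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(path: Path, expected_hex: str) -> bool:
    """Return True only when the file's digest equals the pinned value."""
    return sha256_of(path) == expected_hex.lower()


# Demo with a throwaway file (hypothetical stand-in for a cached asset).
with tempfile.TemporaryDirectory() as tmp:
    asset = Path(tmp) / "asset.bin"
    asset.write_bytes(b"demo bytes")
    pin = hashlib.sha256(b"demo bytes").hexdigest()
    pin_ok = verify_pinned(asset, pin)
    asset.write_bytes(b"tampered bytes")  # any change must break the pin
    tampered_ok = verify_pinned(asset, pin)

print(pin_ok, tampered_ok)
```

A cache built this way would presumably refuse to use an asset whose digest no longer matches its pin, which is the property the cached demo relies on.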

## Run the demo

@@ -68,13 +73,22 @@ docker compose --profile compact up --build prover verifier # tiny two-part
Without Docker:

```bash
cargo run -p pwm-tui --release # REAL checkpoint, TUI, cached
cargo run -p pwm-testkit --bin pwm --release -- prove-predictor # synthetic
cargo run -p pwm-testkit --bin pwm --release -- prove-predictor <bundle> # real weights
cargo test --workspace # accept + reject suites
bash ci/check-local.sh quick # fast local quality loop
bash ci/check-local.sh full # local merge-gate mirror
```

`pwm-tui` mirrors `./demo/run-real.sh` natively: it downloads and pin-verifies
the checkpoint plus a real episode into a cache (`--cache-dir`, then
`PWM_CACHE_DIR`, then the platform default), runs the exporter via `uv`
(network-free, NDJSON progress through `LEWM_JSON_EVENTS`), and proves the real
weights in-process. A warm bundle cache skips Python entirely. `--offline`,
`--refresh`, `--re-export`, and `cache path|ls|clear` manage the cache; the
Docker paths are unchanged.
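
The cache-directory precedence described above (`--cache-dir`, then `PWM_CACHE_DIR`, then the platform default) can be sketched in Python as follows. The function name, the treatment of an empty value as unset, and the `~/.cache/pwm` fallback are all assumptions made for illustration; the actual platform default is not stated here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical stand-in: the real platform default is not given in this document.
ASSUMED_PLATFORM_DEFAULT = Path.home() / ".cache" / "pwm"


def resolve_cache_dir(
    cli_cache_dir: Optional[str],
    env: Mapping[str, str] = os.environ,
    platform_default: Path = ASSUMED_PLATFORM_DEFAULT,
) -> Path:
    """Pick the cache directory: CLI flag first, then env var, then default."""
    if cli_cache_dir:  # value passed via --cache-dir
        return Path(cli_cache_dir)
    env_dir = env.get("PWM_CACHE_DIR")
    if env_dir:  # assumption: an empty variable counts as unset
        return Path(env_dir)
    return platform_default


print(resolve_cache_dir("/tmp/cli", {"PWM_CACHE_DIR": "/tmp/env"}))
print(resolve_cache_dir(None, {"PWM_CACHE_DIR": "/tmp/env"}))
print(resolve_cache_dir(None, {}))
```

Passing the environment as a parameter keeps the lookup easy to test without mutating `os.environ`.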

The `pwm` CLI prints a five-stage pipeline (LOAD, INFER, COMMIT, VERIFY, TAMPER); it
honors `NO_COLOR` and `--json`.
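
As a rough illustration of what honoring `NO_COLOR` usually means, the Python sketch below follows the common convention that the variable disables color whenever it is present and non-empty. The function name and the choices to also disable color in JSON mode and on non-terminals are assumptions, not a description of the `pwm` CLI's actual logic.

```python
import os
from typing import Mapping


def use_color(
    env: Mapping[str, str] = os.environ,
    json_mode: bool = False,
    is_tty: bool = True,
) -> bool:
    """Decide whether to emit ANSI color codes."""
    if json_mode:  # assumption: machine-readable output is never colored
        return False
    if env.get("NO_COLOR"):  # present and non-empty, regardless of its value
        return False
    return is_tty  # assumption: color only when writing to a terminal


print(use_color({"NO_COLOR": "1"}))
print(use_color({}))
```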

21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -13,6 +13,27 @@ entry.

## [Unreleased]

### Added

- **`pwm-tui`: a ratatui demo that proves the REAL pretrained checkpoint end to
end locally, with no Docker** (#200): downloads and sha256-pin-verifies the
`quentinll/lewm-pusht` checkpoint and a real `lerobot/pusht` episode into a
configurable cache (`--cache-dir` > `PWM_CACHE_DIR` > platform default), runs
the PyTorch exporter via `uv` network-free, then proves, verifies, and
tamper-checks the real weights in-process with live per-stage progress, the
op histogram, the commitments, and the verdicts. A warm bundle cache skips
Python entirely; `--offline`, `--refresh`, `--re-export`, `--headless --json`,
and `cache path|ls|clear` manage the cache. Exit code 0 means the honest proof
was accepted and the forged one rejected.
- The Python exporter emits NDJSON progress events when `LEWM_JSON_EVENTS` is
set (pretty log moves to stderr; the Docker path is unchanged) and accepts
`LEWM_PUSHT_PARQUET` / `LEWM_PUSHT_MP4` local file overrides so a caching
front end can make the export stage network-free.
- The `pwm prove-predictor` demo computation moved into the shared
`pwm_testkit::report` module (`build_predictor_report`, `PredictorReport`,
with phase callbacks); the CLI and the TUI render the same single computation,
so their numbers cannot drift.
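
For readers consuming `--headless --json` output, a summary check along the lines of the smoke workflow's inline assertions might look like the sketch below. The field names (`ok`, `report`, `accepted`, `float_error`, `float_tolerance`, `tamper.rejected_with`) come from that workflow; the function name and the sample values are hypothetical, and the full schema is not documented here.

```python
import json
from typing import List


def check_summary(line: str) -> List[str]:
    """Return a list of problems; an empty list means the summary passes."""
    summary = json.loads(line)
    report = summary.get("report") or {}
    problems = []
    if summary.get("ok") is not True:
        problems.append("ok is not true")
    if report.get("accepted") is not True:
        problems.append("honest proof was not accepted")
    try:
        within = float(report["float_error"]) <= float(report["float_tolerance"])
    except (KeyError, TypeError, ValueError):
        within = False
    if not within:
        problems.append("float error missing or above tolerance")
    if not (report.get("tamper") or {}).get("rejected_with"):
        problems.append("tampered proof was not rejected")
    return problems


# Hypothetical sample values, for illustration only.
sample = {
    "ok": True,
    "report": {
        "accepted": True,
        "float_error": 0.001,
        "float_tolerance": 0.01,
        "tamper": {"rejected_with": "example-reason"},
    },
}
good = check_summary(json.dumps(sample))
sample["report"]["tamper"]["rejected_with"] = ""
bad = check_summary(json.dumps(sample))
print(good, bad)
```

Collecting problems instead of asserting on the first failure makes a CI log show every violated condition at once.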

## [0.1.0] - 2026-06-11

### Security