feat: AGT integration example + aarch64 refuse-to-run + community files

tomjwxf · tomjwxf · commit 44f924dcb984 · 2026-04-17T07:47:45.000-04:00
Two substantive changes for design-partner readiness:

1. examples/agt-integration/: Python drop-in shim (SbRuntimeSkill) matching
   openshell_agentmesh.skill.GovernanceSkill field-for-field. Directly answers
   microsoft/agent-governance-toolkit#748. Includes a Cedar policy, smoke test,
   and DESIGN.md-style open-question section inviting upstream feedback on the
   exact provider contract.

2. aarch64 honesty: apply_linux() now refuses-to-run with a clear error on
   non-x86_64 Linux rather than silently falling back to a permissive filter.
   Silent weakening of "strict" mode is strictly worse than a hard stop.
   Tracked at issue #1; lands in v0.1.1.

Community files:
- SECURITY.md: private-disclosure email, scope, defence-in-depth assumptions.
- CONTRIBUTING.md: design-partner programme, ranked PR opportunities, dev workflow.
- CHANGELOG.md: Keep-a-Changelog format; v0.1.0-alpha.1 entry.
- README: honest per-platform matrix; AGT integration pointer.

GitHub topics set for discoverability. Open issues #1–#4 enumerate concrete
bounded tasks (aarch64 table, --dry-run flag, macOS backend, multi-issuer
chains), each a self-contained opportunity for a contributor / design partner.

Tests: still 15 + 1 doctest green. cargo clippy --all-features clean (CI
baseline). cargo fmt clean.
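
As a reading aid for the aarch64 change, the fail-closed rule and the README's platform matrix can be summarised in a few lines. This is a minimal Python sketch, not code from the repository: `select_backend` and `UnsupportedPlatform` are hypothetical names, and it assumes that `--allow-unsandboxed` only matters on platforms that have no sandbox backend.

```python
"""Fail-closed backend selection, mirroring the platform matrix in this commit.

Hypothetical illustration only; the real logic lives in the Rust crates.
"""


class UnsupportedPlatform(RuntimeError):
    """Raised instead of silently weakening the sandbox."""


def select_backend(system: str, machine: str, allow_unsandboxed: bool = False) -> str:
    # Only Linux x86_64 has a real sandbox backend in v0.1.
    # Assumption: the sandbox is applied there regardless of the opt-out flag.
    if system == "Linux" and machine == "x86_64":
        return "landlock+seccomp"

    # Everywhere else, run only if the caller explicitly opted out of
    # OS isolation (Cedar + receipts still apply in that mode).
    if allow_unsandboxed:
        return "unsandboxed"

    # Hard stop: never fall back to a weaker filter without being asked.
    raise UnsupportedPlatform(
        f"no sandbox backend for {system}/{machine}; "
        "pass --allow-unsandboxed to run with Cedar + receipts only"
    )
```

The point of the shape is that the permissive path is reachable only through an explicit flag, so a caller who asked for strict mode on aarch64 gets an error rather than a quieter, weaker sandbox.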
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,67 @@
+# Changelog
+
+All notable changes to `sb-runtime` are documented here. Format follows
+[Keep a Changelog](https://keepachangelog.com/en/1.1.0/); the project will follow
+[Semantic Versioning](https://semver.org/spec/v2.0.0.html) once it reaches v1.0.
+
+## [Unreleased]
+
+### Added
+- Nothing yet; PRs welcome.
+
+## [0.1.0-alpha.1] — 2026-04-17
+
+First public preview. Design-partner release.
+
+### Added
+
+- **`sb-cli`** — the `sb` binary with three subcommands: `exec`, `verify`,
+  `keys generate`.
+- **`sb-sandbox`** crate — OS-native sandbox primitives.
+  - Linux x86_64 backend: Landlock ABI V2 (filesystem read / write / exec) +
+    seccomp-BPF (strict allowlist of ~70 syscalls by default; permissive
+    deny-list mode also available).
+  - Linux aarch64: refuses to run, with a clear error, rather than silently
+    degrading. See issue #1.
+  - macOS + Windows: stubs. `--allow-unsandboxed` lets the other layers
+    (Cedar + receipts) fire without OS isolation on those platforms.
+- **`sb-policy`** crate — Cedar-backed policy evaluator.
+- **`sb-receipt`** crate — Ed25519-signed, JCS-canonical, hash-chained
+  receipts. Zero-I/O, pure-crypto. Compatible with
+  [`@veritasacta/verify`](https://www.npmjs.com/package/@veritasacta/verify)
+  and the
+  [IETF draft-farley-acta-signed-receipts](https://datatracker.ietf.org/doc/draft-farley-acta-signed-receipts/).
+- **`examples/basic/`** — minimal "allow-list of commands" Cedar policy with
+  smoke-test instructions.
+- **`examples/agt-integration/`** — Python shim (`SbRuntimeSkill`) that
+  drops into Microsoft's Agent Governance Toolkit in place of
+  `openshell_agentmesh.skill.GovernanceSkill`. Same public interface; swap
+  via config. Addresses [AGT issue #748](https://github.com/microsoft/agent-governance-toolkit/issues/748).
+- **CI** — `cargo fmt`, `cargo clippy`, `cargo test`, and an end-to-end Linux
+  smoke run on push / PR. Cross-compile to x86_64-linux / x86_64-macos /
+  aarch64-macos on tagged releases.
+- **Community files** — `CONTRIBUTING.md`, `SECURITY.md`, `DESIGN.md` with
+  roadmap + open questions.
+
+### Known limitations
+
+See `DESIGN.md#known-limitations-v01` for the full list. Headlines:
+
+- Linux x86_64 only in this release.
+- Syscall allowlist is hand-curated; some programs will hit missing
+  syscalls (particularly `statx`, `ioctl`, newer `getrandom` variants).
+- Network policy is coarse: loopback-or-nothing via seccomp. Landlock
+  network rules (kernel ≥ 6.7) land in v0.2.
+- Receipt chains are single-issuer. Multi-issuer chains are issue #4.
+- `sb exec` runs the command it evaluates; a pure-evaluation `--dry-run`
+  mode is issue #2.
+
+### Contributors
+
+@tomjwxf — core scaffold, Linux backend, Cedar integration, receipt format,
+AGT shim.
+
+You? We're actively looking for design partners — see CONTRIBUTING.md.
+
+[Unreleased]: https://github.com/ScopeBlind/sb-runtime/compare/v0.1.0-alpha.1...HEAD
+[0.1.0-alpha.1]: https://github.com/ScopeBlind/sb-runtime/releases/tag/v0.1.0-alpha.1
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,79 @@
+# Contributing to sb-runtime
+
+Thank you. `sb-runtime` is a small project and every review, test, and PR
+genuinely moves it forward. A few things to read before you start.
+
+## The design-partner programme
+
+If you're building in adjacent territory (agent governance, policy-as-code,
+OS sandboxing, cryptographic supply chain, transparency logs) we'd love your
+input before v0.1 stabilises. Design partners get:
+
+- An early look at the v0.2 roadmap before it lands in `DESIGN.md`.
+- Early reviewer access on API surface — the kind of feedback that's still
+  cheap to act on.
+- Named credit in release notes (if you want it; anonymity is fine too).
+
+Open an issue titled "design-partner interest" and tell us a sentence about
+what you're building. Or email **tommy@scopeblind.com**.
+
+Current design partners (or conversations in flight): TBD — you could be #1.
+
+## PRs we'd especially welcome
+
+Ranked by leverage:
+
+1. **[aarch64 syscall table](https://github.com/ScopeBlind/sb-runtime/issues/1)** — unblocks Linux ARM.
+2. **[--dry-run flag for sb exec](https://github.com/ScopeBlind/sb-runtime/issues/2)** — simplifies the AGT shim.
+3. **[macOS sandbox backend](https://github.com/ScopeBlind/sb-runtime/issues/3)** — sandbox_init + SBPL.
+4. **[Multi-issuer receipt chains](https://github.com/ScopeBlind/sb-runtime/issues/4)** — v0.2 receipt model.
+
+See `DESIGN.md` for the v0.2/v0.3 roadmap.
+
+## Development workflow
+
+```bash
+# Build & run the full test matrix
+cargo test --workspace
+
+# Lint (matches CI)
+cargo clippy --workspace --all-targets --all-features
+
+# Format (matches CI)
+cargo fmt --all --check
+
+# End-to-end smoke test (Linux-like OS)
+cargo build -p sb-cli
+./target/debug/sb exec \
+  --policy examples/basic/policy.cedar \
+  --receipts /tmp/sb-smoke \
+  --allow-unsandboxed \
+  -- /bin/echo hello
+./target/debug/sb verify /tmp/sb-smoke
+```
+
+## PR conventions
+
+- **One concern per PR.** Sandbox changes, policy changes, and receipt
+  changes should not share a commit.
+- **New public API requires a test.** We don't expose anything we can't point at.
+- **Breaking changes need a CHANGELOG entry.** Minor doc tweaks don't.
+- **`cargo fmt` + `cargo clippy` clean** before pushing. CI will block otherwise.
+
+## Code review
+
+One maintainer approval is enough during v0.1-alpha. Once we hit v0.1.0 stable
+we'll move to a stricter review model.
+
+## Licence
+
+By submitting a PR you agree that your contribution is licensed under the MIT
+licence, as set out in the project LICENCE file. If your employer claims rights
+to your code, please resolve that before opening a PR.
+
+## Be kind
+
+The project is young and evolving. Good-faith questions, architectural
+disagreements, and "have you considered…" are all welcome. Rudeness isn't.
+
+Thanks. Looking forward to building this with you.
diff --git a/README.md b/README.md
@@ -15,7 +15,22 @@
 
 `sb-runtime` answers a question several AGT / OpenShell users have been asking: *can we get the "walls + brain + receipts" pattern without Docker/OCI/k3s/gateway infrastructure?* This is a single 8 MB binary that runs on dev laptops, CI, and edge.
 
-**Status: v0.1.0-alpha.1 — design-partner preview.** Linux sandbox backend works (Landlock + seccomp). macOS / Windows backends are stubs; use `--allow-unsandboxed` on those platforms to run with Cedar + receipts only. We're actively looking for design-partner input on the AGT provider interface and the Cedar schema for agent actions — open an issue or reply to [microsoft/agent-governance-toolkit#748](https://github.com/microsoft/agent-governance-toolkit/issues/748).
+**Status: v0.1.0-alpha.1 — design-partner preview.** Honest platform matrix:
+
+| Platform      | Sandbox                                     | Cedar + receipts             |
+|---------------|---------------------------------------------|------------------------------|
+| Linux x86_64  | Landlock + seccomp                          | ✓                            |
+| Linux aarch64 | Refuses (see [issue #1][1])                 | ✓ (w/ `--allow-unsandboxed`) |
+| macOS         | Stub (`--allow-unsandboxed`), [issue #3][3] | ✓                            |
+| Windows       | Stub (`--allow-unsandboxed`)                | ✓                            |
+
+We're actively looking for design-partner input on the AGT provider interface,
+the Cedar schema for agent actions, and the macOS/Windows backend priorities —
+see `CONTRIBUTING.md` or reply to
+[microsoft/agent-governance-toolkit#748](https://github.com/microsoft/agent-governance-toolkit/issues/748).
+
+[1]: https://github.com/ScopeBlind/sb-runtime/issues/1
+[3]: https://github.com/ScopeBlind/sb-runtime/issues/3
 
 ## Quick start
 
@@ -60,6 +75,13 @@ Each sub-crate is usable independently. `sb-receipt` is deliberately minimal (ze
 - **…use firejail / bubblewrap?** Those are filesystem sandboxes. They don't evaluate Cedar policy before the exec, and they don't emit signed receipts. Combine them with `sb-runtime` if you want — `sb` does Cedar + receipts + Landlock+seccomp, they do extra fs isolation layers.
 - **…just use Cedar?** Cedar decides. It doesn't enforce or audit. `sb-runtime` is the enforcement layer.
 
+## Integrating with Microsoft's Agent Governance Toolkit
+
+See [`examples/agt-integration/`](examples/agt-integration/) for a Python
+drop-in shim (`SbRuntimeSkill`) that replaces
+`openshell_agentmesh.skill.GovernanceSkill` field-for-field. Swap via config,
+no agent code changes required.
+
 ## Licensing
 
 MIT. No runtime dependencies on ScopeBlind services; no telemetry. The optional managed tier (hosted receipt archival, team dashboards, compliance exports) is available at [scopeblind.com/pricing](https://scopeblind.com/pricing) but the sandbox runs local-only forever with the free binary.
diff --git a/SECURITY.md b/SECURITY.md
@@ -0,0 +1,51 @@
+# Security Policy
+
+## Supported versions
+
+Only the latest tagged release on `main` receives security fixes during the
+v0.1-alpha cycle. Once v1.0 ships, we'll support the latest two minor lines.
+
+## Reporting a vulnerability
+
+Please report security issues privately rather than via a public GitHub issue:
+
+- Email: **security@scopeblind.com**
+- PGP (optional): fingerprint published at https://scopeblind.com/.well-known/security.txt
+
+We aim to acknowledge reports within **24 hours** and to ship a fix with coordinated
+disclosure within **14 days** for high-severity issues. Issues that require upstream
+Cedar / Landlock / seccomp changes may take longer.
+
+## Scope
+
+In scope:
+
+- Anything in `crates/sb-*` — the Rust code.
+- Anything in `examples/` — if an example would leak a key, mis-apply a policy,
+  or otherwise teach a wrong pattern.
+- Any documented CLI flag behaviour.
+
+Out of scope:
+
+- Bugs in the Cedar policy engine itself — please report to
+  https://github.com/cedar-policy/cedar.
+- Bugs in Landlock or the Linux kernel — report to
+  https://landlock.io or the kernel mailing list.
+- Denial-of-service attacks against the hosted receipt archival service —
+  report to security@scopeblind.com (separate from the open-source repo).
+
+## Defence-in-depth assumptions we rely on
+
+A sandbox built from Landlock + seccomp is *best-effort*, not a complete jail.
+We assume:
+
+- The kernel is patched against public CVEs.
+- The binary is not setuid. Callers drop privileges before invoking `sb`.
+- A determined attacker with a kernel 0-day can escape. For higher-assurance
+  workloads, layer `sb` inside a VM, a container, or a hardware sandbox —
+  `sb` is *complementary* to those, not a replacement.
+
+## Credit
+
+Researchers who privately report valid issues are credited in release notes
+unless they request anonymity.
diff --git a/crates/sb-sandbox/src/linux.rs b/crates/sb-sandbox/src/linux.rs
@@ -26,10 +26,32 @@ use landlock::{
 };
 
 /// Apply the profile to the current thread+process on Linux.
+///
+/// v0.1 supports x86_64 only. Other Linux architectures (notably aarch64,
+/// which is increasingly common on cloud instances) refuse-to-run with a
+/// clear error rather than silently falling back to a permissive filter —
+/// silently weakening enforcement for users who asked for strict mode is
+/// strictly worse than a hard stop. The aarch64 syscall table lands in
+/// v0.1.1; tracked at https://github.com/ScopeBlind/sb-runtime/issues/1
 pub(crate) fn apply_linux(profile: &Profile) -> Result<(), SandboxError> {
-    apply_landlock(profile)?;
-    apply_seccomp(profile)?;
-    Ok(())
+    #[cfg(not(target_arch = "x86_64"))]
+    {
+        let _ = profile;
+        return Err(SandboxError::Unsupported(
+            "sb-runtime v0.1 Linux backend supports x86_64 only. \
+             aarch64 syscall table is tracked at \
+             https://github.com/ScopeBlind/sb-runtime/issues/1 \
+             (lands in v0.1.1). Use --allow-unsandboxed to run with \
+             Cedar + receipts only on unsupported architectures.",
+        ));
+    }
+
+    #[cfg(target_arch = "x86_64")]
+    {
+        apply_landlock(profile)?;
+        apply_seccomp(profile)?;
+        Ok(())
+    }
 }
 
 fn apply_landlock(profile: &Profile) -> Result<(), SandboxError> {
diff --git a/examples/agt-integration/README.md b/examples/agt-integration/README.md
@@ -0,0 +1,109 @@
+# AGT integration — `SbRuntimeSkill`
+
+Drop-in replacement for `openshell_agentmesh.skill.GovernanceSkill` that delegates policy
+evaluation + enforcement + receipt emission to the `sb` binary. Same public interface; swap
+via configuration. Zero agent-code changes.
+
+## Why
+
+[microsoft/agent-governance-toolkit#748](https://github.com/microsoft/agent-governance-toolkit/issues/748)
+asks for a lightweight sandbox alternative to OpenShell — something without Docker, k3s, or
+a gateway, suitable for CI, edge, and dev machines. `sb-runtime` answers that (single ~8 MB
+Rust binary; Cedar policy + Ed25519 receipts + Landlock + seccomp on Linux).
+`SbRuntimeSkill` is the Python shim that lets AGT consume it.
+
+## Drop-in swap
+
+Before (OpenShell):
+
+```python
+from openshell_agentmesh.skill import GovernanceSkill
+skill = GovernanceSkill(policy_dir=Path("policies/"))
+decision = skill.check_policy("shell", {"command": "rm -rf /tmp"})
+```
+
+After (`sb-runtime`):
+
+```python
+from sb_runtime_skill import SbRuntimeSkill
+skill = SbRuntimeSkill(
+    policy_path=Path("policies/dev-safe.cedar"),
+    receipts_dir=Path(".receipts"),
+)
+decision = skill.check_policy(
+    "exec",
+    {"command": "/usr/bin/rm", "args": ["-rf", "/tmp"]},
+)
+```
+
+`PolicyDecision` is field-for-field compatible (`allowed`, `action`, `reason`, `policy_name`,
+`trust_score`) plus one sb-specific extension: `receipt_path` pointing at the signed receipt
+the `sb` binary just wrote. Your existing trust-score loops, audit exports, and dashboards
+keep working.
+
+## What you get that OpenShell doesn't
+
+- **Tamper-evident audit trail.** Every decision — allow *and* deny — is an Ed25519-signed
+  receipt, hash-chained by `prev_hash`. A regulator can verify the chain offline with
+  `npx @veritasacta/verify` or `sb verify .receipts/` and needs to trust nothing except the
+  issuer public key.
+- **No infrastructure.** No Docker daemon, no k3s control plane, no network proxy. Single
+  binary. Fits in CI, edge, and dev-laptop workflows OpenShell finds awkward.
+- **Offline verification property.** Receipts keep verifying after the `sb` binary and the
+  Skill and ScopeBlind are all gone.
+
+## What you give up vs. OpenShell
+
+- No network proxy interception layer (LoopbackOnly seccomp rule only, for now).
+- No multi-tenant k3s isolation.
+- Linux sandbox backend is x86_64-only in v0.1 (aarch64 refuses to run rather than
+  silently falling back; see [issue #1](https://github.com/ScopeBlind/sb-runtime/issues/1)).
+- macOS and Windows run in `--allow-unsandboxed` mode (Cedar + receipts only) until
+  v0.2 lands `sandbox_init` and AppContainer backends.
+
+**Use OpenShell when you need full network-proxy enforcement or multi-tenant k3s.
+Use `sb-runtime` when those are overkill.**
+
+## Open design questions (v0.1.0-alpha.1)
+
+These are the questions I most want feedback on from AGT maintainers and @lukehinds:
+
+1. **Separation of check and enforce.** `check_policy` in OpenShell's `GovernanceSkill` is
+   pure evaluation — it does not run the action. `sb exec` currently evaluates *and* runs.
+   We're planning a `sb exec --dry-run` in v0.1.1 so the Skill can check without running;
+   the current shim uses a receipts-scan workaround. Is "check then separately enforce" the
+   right mental model for this interface, or should `check_policy` also enforce?
+2. **Async or sync?** The current shim is sync (matching OpenShell's `check_policy`). If
+   AGT is moving async in general, say the word and we'll ship async equivalents.
+3. **Receipt storage.** Should `SbRuntimeSkill` expose receipts via the existing audit log
+   methods, or is the `receipt_path` on `PolicyDecision` plus `verify_chain()` enough for
+   downstream consumers?
+4. **Trust score feedback loop.** The Skill currently reads trust score but doesn't write
+   it back from receipt outcomes. Should receipt decisions auto-decay trust the way
+   OpenShell's demo does?
+
+## Run the smoke test
+
+```bash
+# Build + install the sb binary from source (one-time)
+cargo install --path ../../crates/sb-cli   # from sb-runtime root
+
+# Run the Python demo
+python sb_runtime_skill.py --smoke
+```
+
+Expected output:
+
+```
+allow case: True  allowed by policy smoke
+deny case:  False denied by policy smoke:  deny policies: policy0
+chain ok:   True
+```
+
+## Status
+
+Design-partner preview. Aligned to AGT `openshell-skill` interface as of 2026-04-17.
+If upstream changes, this shim follows; file an issue on sb-runtime if you spot drift.
+
+Feedback especially welcome from the AGT + Sigstore + Red Hat alumni orbit — receipt/log
+design here tries to be compatible with the patterns you've already set.
diff --git a/examples/agt-integration/policies/dev-safe.cedar b/examples/agt-integration/policies/dev-safe.cedar
diff --git a/examples/agt-integration/sb_runtime_skill.py b/examples/agt-integration/sb_runtime_skill.py