# AGT integration: `SbRuntimeSkill`

Drop-in replacement for `openshell_agentmesh.skill.GovernanceSkill` that delegates policy
evaluation + enforcement + receipt emission to the `sb` binary. Same public interface; swap
via configuration. Zero agent-code changes.

## Why

[microsoft/agent-governance-toolkit#748](https://github.com/microsoft/agent-governance-toolkit/issues/748)
asks for a lightweight sandbox alternative to OpenShell: something without Docker, k3s, or
a gateway, suitable for CI, edge, and dev machines. `sb-runtime` answers that (single ~8 MB
Rust binary; Cedar policy + Ed25519 receipts + Landlock + seccomp on Linux).
`SbRuntimeSkill` is the Python shim that lets AGT consume it.

## Drop-in swap

Before (OpenShell):

```python
from pathlib import Path

from openshell_agentmesh.skill import GovernanceSkill

skill = GovernanceSkill(policy_dir=Path("policies/"))
decision = skill.check_policy("shell", {"command": "rm -rf /tmp"})
```

After (`sb-runtime`):

```python
from pathlib import Path

from sb_runtime_skill import SbRuntimeSkill

skill = SbRuntimeSkill(
    policy_path=Path("policies/dev-safe.cedar"),
    receipts_dir=Path(".receipts"),
)
decision = skill.check_policy(
    "exec",
    {"command": "/usr/bin/rm", "args": ["-rf", "/tmp"]},
)
```

`PolicyDecision` is field-for-field compatible (`allowed`, `action`, `reason`, `policy_name`,
`trust_score`) plus one sb-specific extension: `receipt_path` pointing at the signed receipt
the `sb` binary just wrote. Your existing trust-score loops, audit exports, and dashboards
keep working.
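
To make the compatibility claim concrete, here is a minimal sketch of what such a decision object could look like. The class name `PolicyDecisionSketch`, the field types, the `audit_line` helper, and the receipt file name are all assumptions for illustration; only the field names come from the text above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class PolicyDecisionSketch:
    """Illustrative stand-in for PolicyDecision; field types are assumptions."""

    allowed: bool
    action: str
    reason: str
    policy_name: str
    trust_score: float
    # sb-specific extension: where the signed receipt was written, if any.
    receipt_path: Optional[Path] = None


def audit_line(decision: PolicyDecisionSketch) -> str:
    """Format a decision using only the shared fields, as existing consumers would."""
    verdict = "ALLOW" if decision.allowed else "DENY"
    return f"{verdict} {decision.action} [{decision.policy_name}] {decision.reason}"


decision = PolicyDecisionSketch(
    allowed=False,
    action="exec",
    reason="denied by policy",
    policy_name="dev-safe",
    trust_score=0.5,
    receipt_path=Path(".receipts/example.json"),  # hypothetical file name
)
print(audit_line(decision))
```

A consumer that reads only the five shared fields never touches `receipt_path`, which is why an additive, defaulted field should not disturb existing audit code.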

## What you get that OpenShell doesn't

- **Tamper-evident audit trail.** Every decision, allow *and* deny, is an Ed25519-signed
  receipt, hash-chained by `prev_hash`. A regulator can verify the chain offline with
  `npx @veritasacta/verify` or `sb verify .receipts/` and needs to trust nothing except the
  issuer public key.
- **No infrastructure.** No Docker daemon, no k3s control plane, no network proxy. Single
  binary. Fits in CI, edge, and dev-laptop workflows where OpenShell is awkward.
- **Offline verification property.** Receipts keep verifying after the `sb` binary, the
  Skill, and ScopeBlind are all gone.
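
The `prev_hash` chaining can be illustrated with a small sketch. The receipt layout below (a dict with `action`, `decision`, and `prev_hash`), the SHA-256 choice, and the JSON canonicalization are assumptions, not the actual `sb` receipt format, and Ed25519 signature checking is deliberately left out to keep the example stdlib-only.

```python
import hashlib
import json
from typing import Optional


def receipt_hash(receipt: dict) -> str:
    """Hash a receipt over a canonical JSON encoding (assumed scheme)."""
    canonical = json.dumps(receipt, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_receipt(chain: list, action: str, decision: str) -> None:
    """Append a receipt whose prev_hash commits to the previous receipt."""
    prev: Optional[str] = receipt_hash(chain[-1]) if chain else None
    chain.append({"action": action, "decision": decision, "prev_hash": prev})


def chain_is_intact(chain: list) -> bool:
    """Return True if every receipt points at the hash of its predecessor."""
    expected_prev: Optional[str] = None
    for receipt in chain:
        if receipt.get("prev_hash") != expected_prev:
            return False
        expected_prev = receipt_hash(receipt)
    return True


chain: list = []
append_receipt(chain, "exec /usr/bin/ls", "allow")
append_receipt(chain, "exec /usr/bin/rm", "deny")
append_receipt(chain, "exec /usr/bin/cat", "allow")
print(chain_is_intact(chain))

# Rewriting an earlier decision breaks the link held by its successor.
chain[1]["decision"] = "allow"
print(chain_is_intact(chain))
```

Chaining alone cannot detect a change to the most recent receipt, since nothing points at it yet; in the design described above, that gap is covered by the per-receipt signature.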

## What you give up vs. OpenShell

- No network proxy interception layer (LoopbackOnly seccomp rule only, for now).
- No multi-tenant k3s isolation.
- Linux sandbox backend is x86_64-only in v0.1 (aarch64 refuses to run rather than
  silently falling back; see [issue #1](https://github.com/ScopeBlind/sb-runtime/issues/1)).
- macOS and Windows run in `--allow-unsandboxed` mode (Cedar + receipts only) until
  v0.2 lands `sandbox_init` and AppContainer backends.

**Use OpenShell when you need full network-proxy enforcement or multi-tenant k3s.
Use `sb-runtime` when those are overkill.**

## Open design questions (v0.1.0-alpha.1)

These are the questions I most want feedback on from AGT maintainers + @lukehinds:

1. **Separation of check and enforce.** `check_policy` in OpenShell's `GovernanceSkill` is
   pure evaluation: it does not run the action. `sb exec` currently evaluates *and* runs.
   We're planning an `sb exec --dry-run` in v0.1.1 so the Skill can check without running;
   the current shim uses a receipts-scan workaround. Is "check then separately enforce" the
   right mental model for this interface, or should `check_policy` also enforce?
2. **Async or sync?** The current shim is sync (matching OpenShell's `check_policy`). If
   AGT is moving to async in general, say the word and we'll ship async equivalents.
3. **Receipt storage.** Should `SbRuntimeSkill` expose receipts via the existing audit log
   methods, or is the `receipt_path` on `PolicyDecision` plus `verify_chain()` enough for
   downstream consumers?
4. **Trust score feedback loop.** The Skill currently reads the trust score but doesn't write
   it back from receipt outcomes. Should receipt decisions auto-decay trust the way
   OpenShell's demo does?

## Run the smoke test

```bash
# Build + install the sb binary from source (one-time)
cargo install --path ../../crates/sb-cli  # from sb-runtime root

# Run the Python demo
python sb_runtime_skill.py --smoke
```

Expected output:

```
allow case: True allowed by policy smoke
deny case: False denied by policy smoke: deny policies: policy0
chain ok: True
```

## Status

Design-partner preview. Aligned to the AGT `openshell-skill` interface as of 2026-04-17.
If upstream changes, this shim follows; file an issue on sb-runtime if you spot drift.

Feedback especially welcome from the AGT + Sigstore + Red Hat alumni orbit: the receipt/log
design here tries to be compatible with the patterns you've already set.