Commit 44f924d

author
tomjwxf
committed
feat: AGT integration example + aarch64 refuse-to-run + community files
Two substantive changes for design-partner readiness:

1. examples/agt-integration/: Python drop-in shim (SbRuntimeSkill) matching
   openshell_agentmesh.skill.GovernanceSkill field-for-field. Directly answers
   microsoft/agent-governance-toolkit#748. Includes a Cedar policy, smoke test,
   and DESIGN.md-style open-question section inviting upstream feedback on the
   exact provider contract.
2. aarch64 honesty: apply_linux() now refuses to run with a clear error on
   non-x86_64 Linux rather than silently falling back to a permissive filter.
   Silent weakening of "strict" mode is strictly worse than a hard stop.
   Tracked at issue #1; lands in v0.1.1.

Community files:
- SECURITY.md: private-disclosure email, scope, defence-in-depth assumptions.
- CONTRIBUTING.md: design-partner programme, ranked PR opportunities, dev workflow.
- CHANGELOG.md: Keep-a-Changelog format; v0.1.0-alpha.1 entry.
- README: honest per-platform matrix; AGT integration pointer.

GitHub topics set for discoverability. Open issues #1 through #4 enumerate
concrete bounded tasks (aarch64 table, --dry-run flag, macOS backend,
multi-issuer chains), each a self-contained opportunity for a contributor /
design partner.

Tests: still 15 + 1 doctest green. cargo clippy --all-features clean (CI
baseline). cargo fmt clean.
1 parent f64bd06 commit 44f924d

8 files changed

Lines changed: 760 additions & 4 deletions

CHANGELOG.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
1+
# Changelog
2+
3+
All notable changes to `sb-runtime` are documented here. Format follows
4+
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/); the project follows
5+
[Semantic Versioning](https://semver.org/spec/v2.0.0.html) once it hits v1.0.
6+
7+
## [Unreleased]
8+
9+
### Added
10+
- Nothing yet; PRs welcome.
11+
12+
## [0.1.0-alpha.1] — 2026-04-17
13+
14+
First public preview. Design-partner release.
15+
16+
### Added
17+
18+
- **`sb-cli`** — the `sb` binary with three subcommands: `exec`, `verify`,
19+
`keys generate`.
20+
- **`sb-sandbox`** crate — OS-native sandbox primitives.
21+
- Linux x86_64 backend: Landlock ABI V2 (filesystem read / write / exec) +
22+
seccomp-BPF (strict allowlist of ~70 syscalls by default; permissive
23+
deny-list mode also available).
24+
- Linux aarch64: refuses-to-run with a clear error rather than silently
25+
degrading. See issue #1.
26+
- macOS + Windows: stubs. `--allow-unsandboxed` lets the other layers
27+
(Cedar + receipts) fire without OS isolation on those platforms.
28+
- **`sb-policy`** crate — Cedar-backed policy evaluator.
29+
- **`sb-receipt`** crate — Ed25519-signed, JCS-canonical, hash-chained
30+
receipts. Zero-I/O, pure-crypto. Compatible with
31+
[`@veritasacta/verify`](https://www.npmjs.com/package/@veritasacta/verify)
32+
and the
33+
[IETF draft-farley-acta-signed-receipts](https://datatracker.ietf.org/doc/draft-farley-acta-signed-receipts/).
34+
- **`examples/basic/`** — minimal "allow-list of commands" Cedar policy with
35+
smoke-test instructions.
36+
- **`examples/agt-integration/`** — Python shim (`SbRuntimeSkill`) that
37+
drops into Microsoft's Agent Governance Toolkit in place of
38+
`openshell_agentmesh.skill.GovernanceSkill`. Same public interface; swap
39+
via config. Addresses [AGT issue #748](https://github.com/microsoft/agent-governance-toolkit/issues/748).
40+
- **CI**: `cargo fmt`, `cargo clippy`, `cargo test`, and an end-to-end Linux
41+
smoke run on push / PR. Cross-compile to x86_64-linux / x86_64-macos /
42+
aarch64-macos on tagged releases.
43+
- **Community files**: `CONTRIBUTING.md`, `SECURITY.md`, `DESIGN.md` with
44+
roadmap + open questions.
45+
46+
### Known limitations
47+
48+
See `DESIGN.md#known-limitations-v01` for the full list. Headlines:
49+
50+
- Linux x86_64 only in this release.
51+
- Syscall allowlist is hand-curated; some programs will hit missing
52+
syscalls (particularly `statx`, `ioctl`, newer `getrandom` variants).
53+
- Network policy is coarse: loopback-or-nothing via seccomp. Landlock
54+
network rules (kernel ≥ 6.7) land in v0.2.
55+
- Receipt chains are single-issuer. Multi-issuer chains are issue #4.
56+
- `sb exec` runs the command it evaluates; a pure-evaluation `--dry-run`
57+
mode is issue #2.
58+
59+
### Contributors
60+
61+
@tomjwxf — core scaffold, Linux backend, Cedar integration, receipt format,
62+
AGT shim.
63+
64+
You? We're actively looking for design partners — see CONTRIBUTING.md.
65+
66+
[Unreleased]: https://github.com/ScopeBlind/sb-runtime/compare/v0.1.0-alpha.1...HEAD
67+
[0.1.0-alpha.1]: https://github.com/ScopeBlind/sb-runtime/releases/tag/v0.1.0-alpha.1

CONTRIBUTING.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
1+
# Contributing to sb-runtime
2+
3+
Thank you — `sb-runtime` is a small project and every review, test, and PR
4+
genuinely moves it forward. Three things to read before you start.
5+
6+
## The design-partner programme
7+
8+
If you're building in adjacent territory (agent governance, policy-as-code,
9+
OS sandboxing, cryptographic supply chain, transparency logs) we'd love your
10+
input before v0.1 stabilises. Design partners get:
11+
12+
- Direct read on the v0.2 roadmap before it lands in `DESIGN.md`.
13+
- Early reviewer access on API surface — the kind of feedback that's still
14+
cheap to act on.
15+
- Named credit in release notes (if you want it; anonymity is fine too).
16+
17+
Open an issue titled "design-partner interest" and tell us a sentence about
18+
what you're building. Or email **tommy@scopeblind.com**.
19+
20+
Current design partners (or conversations in flight): TBD — you could be #1.
21+
22+
## PRs we'd especially welcome
23+
24+
Ranked by leverage:
25+
26+
1. **[aarch64 syscall table](https://github.com/ScopeBlind/sb-runtime/issues/1)** — unblocks Linux ARM.
27+
2. **[--dry-run flag for sb exec](https://github.com/ScopeBlind/sb-runtime/issues/2)** — simplifies the AGT shim.
28+
3. **[macOS sandbox backend](https://github.com/ScopeBlind/sb-runtime/issues/3)** — sandbox_init + SBPL.
29+
4. **[Multi-issuer receipt chains](https://github.com/ScopeBlind/sb-runtime/issues/4)** — v0.2 receipt model.
30+
31+
See `DESIGN.md` for the v0.2/v0.3 roadmap.
32+
33+
## Development workflow
34+
35+
```bash
# Build & run the full test matrix
cargo test --workspace

# Lint (matches CI)
cargo clippy --workspace --all-targets --all-features

# Format (matches CI)
cargo fmt --all --check

# End-to-end smoke test (Linux-like OS)
cargo build -p sb-cli
./target/debug/sb exec \
  --policy examples/basic/policy.cedar \
  --receipts /tmp/sb-smoke \
  --allow-unsandboxed \
  -- /bin/echo hello
./target/debug/sb verify /tmp/sb-smoke
```
54+
55+
## PR conventions
56+
57+
- **One concern per PR.** Sandbox changes, policy changes, and receipt
58+
changes should not share a commit.
59+
- **New public API requires a test.** We don't expose anything we can't point at.
60+
- **Breaking changes need a CHANGELOG entry.** Minor doc tweaks don't.
61+
- **`cargo fmt` + `cargo clippy` clean** before pushing. CI will block otherwise.
62+
63+
## Code review
64+
65+
One maintainer approval is enough during v0.1-alpha. Once we hit v0.1.0 stable
66+
we'll move to a stricter review model.
67+
68+
## Licence
69+
70+
By submitting a PR you agree your contribution is licensed MIT under the
71+
project LICENCE. If you work at an employer that claims rights to your code,
72+
please sort that out before PRing.
73+
74+
## Be kind
75+
76+
The project is young and evolving. Good-faith questions, architecture disagreements,
77+
and "have you considered…" are all welcome. Rudeness isn't.
78+
79+
Thanks. Looking forward to building this with you.

README.md

Lines changed: 23 additions & 1 deletion
@@ -15,7 +15,22 @@
1515

1616
`sb-runtime` answers a question several AGT / OpenShell users have been asking: *can we get the "walls + brain + receipts" pattern without Docker/OCI/k3s/gateway infrastructure?* This is a single 8 MB binary that runs on dev laptops, CI, and edge.
1717

18-
**Status: v0.1.0-alpha.1 — design-partner preview.** Linux sandbox backend works (Landlock + seccomp). macOS / Windows backends are stubs; use `--allow-unsandboxed` on those platforms to run with Cedar + receipts only. We're actively looking for design-partner input on the AGT provider interface and the Cedar schema for agent actions — open an issue or reply to [microsoft/agent-governance-toolkit#748](https://github.com/microsoft/agent-governance-toolkit/issues/748).
18+
**Status: v0.1.0-alpha.1 — design-partner preview.** Honest platform matrix:
19+
| Platform      | Sandbox                                     | Cedar + receipts             |
|---------------|---------------------------------------------|------------------------------|
| Linux x86_64  | Landlock + seccomp                          | ✓                            |
| Linux aarch64 | Refuses (see [issue #1][1])                 | ✓ (w/ `--allow-unsandboxed`) |
| macOS         | Stub (`--allow-unsandboxed`), [issue #3][3] | ✓                            |
| Windows       | Stub (`--allow-unsandboxed`)                | ✓                            |
26+
27+
We're actively looking for design-partner input on the AGT provider interface,
28+
the Cedar schema for agent actions, and the macOS/Windows backend priorities —
29+
see `CONTRIBUTING.md` or reply to
30+
[microsoft/agent-governance-toolkit#748](https://github.com/microsoft/agent-governance-toolkit/issues/748).
31+
32+
[1]: https://github.com/ScopeBlind/sb-runtime/issues/1
33+
[3]: https://github.com/ScopeBlind/sb-runtime/issues/3
1934

2035
## Quick start
2136

@@ -60,6 +75,13 @@ Each sub-crate is usable independently. `sb-receipt` is deliberately minimal (ze
6075
- **…use firejail / bubblewrap?** Those are filesystem sandboxes. They don't evaluate Cedar policy before the exec, and they don't emit signed receipts. Combine them with `sb-runtime` if you want — `sb` does Cedar + receipts + Landlock+seccomp, they do extra fs isolation layers.
6176
- **…just use Cedar?** Cedar decides. It doesn't enforce or audit. `sb-runtime` is the enforcement layer.
6277

78+
## Integrating with Microsoft's Agent Governance Toolkit
79+
80+
See [`examples/agt-integration/`](examples/agt-integration/) for a Python
81+
drop-in shim (`SbRuntimeSkill`) that replaces
82+
`openshell_agentmesh.skill.GovernanceSkill` field-for-field. Swap via config,
83+
no agent code changes required.
84+
6385
## Licensing
6486

6587
MIT. No runtime dependencies on ScopeBlind services; no telemetry. The optional managed tier (hosted receipt archival, team dashboards, compliance exports) is available at [scopeblind.com/pricing](https://scopeblind.com/pricing) but the sandbox runs local-only forever with the free binary.

SECURITY.md

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
# Security Policy
2+
3+
## Supported versions
4+
5+
Only the latest tagged release on `main` receives security fixes during the
6+
v0.1-alpha cycle. Once v1.0 ships, we'll support the latest two minor lines.
7+
8+
## Reporting a vulnerability
9+
10+
Please report security issues privately rather than via a public GitHub issue:
11+
12+
- Email: **security@scopeblind.com**
13+
- PGP (optional): fingerprint published at https://scopeblind.com/.well-known/security.txt
14+
15+
We aim to acknowledge within **24 hours** and ship a fix + coordinated disclosure
16+
within **14 days** for high-severity issues, longer for issues requiring upstream
17+
Cedar / Landlock / seccomp changes.
18+
19+
## Scope
20+
21+
In scope:
22+
23+
- Anything in `crates/sb-*` — the Rust code.
24+
- Anything in `examples/` — if an example would leak a key, mis-apply a policy,
25+
or otherwise teach a wrong pattern.
26+
- Any documented CLI flag behaviour.
27+
28+
Out of scope:
29+
30+
- Bugs in the Cedar policy engine itself — please report to
31+
https://github.com/cedar-policy/cedar.
32+
- Bugs in Landlock or the Linux kernel — report to
33+
https://landlock.io or the kernel mailing list.
34+
- Denial-of-service attacks against the hosted receipt archival service —
35+
report to security@scopeblind.com (separate from the open-source repo).
36+
37+
## Defence-in-depth assumptions we rely on
38+
39+
A sandbox built from Landlock + seccomp is *best-effort*, not a complete jail.
40+
We assume:
41+
42+
- The kernel is patched against public CVEs.
43+
- The binary is not setuid. Callers drop privileges before invoking `sb`.
44+
- A determined attacker with a kernel 0-day can escape. For higher-assurance
45+
workloads, layer `sb` inside a VM, a container, or a hardware sandbox —
46+
`sb` is *complementary* to those, not a replacement.
47+
48+
## Credit
49+
50+
Researchers who privately report valid issues are credited in release notes
51+
unless they request anonymity.

crates/sb-sandbox/src/linux.rs

Lines changed: 25 additions & 3 deletions
@@ -26,10 +26,32 @@ use landlock::{
 };

 /// Apply the profile to the current thread+process on Linux.
+///
+/// v0.1 supports x86_64 only. Other Linux architectures (notably aarch64,
+/// which is increasingly common on cloud instances) refuse-to-run with a
+/// clear error rather than silently falling back to a permissive filter —
+/// silently weakening enforcement for users who asked for strict mode is
+/// strictly worse than a hard stop. The aarch64 syscall table lands in
+/// v0.1.1; tracked at https://github.com/ScopeBlind/sb-runtime/issues/1
 pub(crate) fn apply_linux(profile: &Profile) -> Result<(), SandboxError> {
-    apply_landlock(profile)?;
-    apply_seccomp(profile)?;
-    Ok(())
+    #[cfg(not(target_arch = "x86_64"))]
+    {
+        let _ = profile;
+        return Err(SandboxError::Unsupported(
+            "sb-runtime v0.1 Linux backend supports x86_64 only. \
+             aarch64 syscall table is tracked at \
+             https://github.com/ScopeBlind/sb-runtime/issues/1 \
+             (lands in v0.1.1). Use --allow-unsandboxed to run with \
+             Cedar + receipts only on unsupported architectures.",
+        ));
+    }
+
+    #[cfg(target_arch = "x86_64")]
+    {
+        apply_landlock(profile)?;
+        apply_seccomp(profile)?;
+        Ok(())
+    }
 }

 fn apply_landlock(profile: &Profile) -> Result<(), SandboxError> {
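
The fail-closed rule in this hunk can be restated as a small Python sketch, in the language of the AGT shim rather than Rust. It is an illustration only: `select_backend`, `UnsupportedArchitecture`, and the returned labels are hypothetical names, and folding the `--allow-unsandboxed` opt-in into the same function is an assumption about where that decision is made.

```python
class UnsupportedArchitecture(RuntimeError):
    """Raised instead of silently applying a weaker filter."""


# Assumption: mirrors the v0.1 statement that only x86_64 is supported on Linux.
SUPPORTED_LINUX_ARCHES = frozenset({"x86_64"})


def select_backend(arch: str, allow_unsandboxed: bool = False) -> str:
    """Pick an enforcement mode, refusing rather than degrading."""
    if arch in SUPPORTED_LINUX_ARCHES:
        return "landlock+seccomp"
    if allow_unsandboxed:
        # Explicit opt-in by the caller: Cedar + receipts only, no OS isolation.
        return "unsandboxed"
    # No silent fallback to a permissive filter: stop with a clear error.
    raise UnsupportedArchitecture(
        f"Linux backend supports x86_64 only, got {arch!r}"
    )


print(select_backend("x86_64"))
print(select_backend("aarch64", allow_unsandboxed=True))
try:
    select_backend("aarch64")
except UnsupportedArchitecture as err:
    print("refused:", err)
```

The design point is that a weaker mode is reachable only through an explicit flag, never as a default.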

examples/agt-integration/README.md

Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
1+
# AGT integration — `SbRuntimeSkill`
2+
3+
Drop-in replacement for `openshell_agentmesh.skill.GovernanceSkill` that delegates policy
4+
evaluation + enforcement + receipt emission to the `sb` binary. Same public interface; swap
5+
via configuration. Zero agent-code changes.
6+
7+
## Why
8+
9+
[microsoft/agent-governance-toolkit#748](https://github.com/microsoft/agent-governance-toolkit/issues/748)
10+
asks for a lightweight sandbox alternative to OpenShell — something without Docker, k3s, or
11+
a gateway, suitable for CI, edge, and dev machines. `sb-runtime` answers that (single ~8 MB
12+
Rust binary; Cedar policy + Ed25519 receipts + Landlock + seccomp on Linux).
13+
`SbRuntimeSkill` is the Python shim that lets AGT consume it.
14+
15+
## Drop-in swap
16+
17+
Before (OpenShell):
18+
19+
```python
from pathlib import Path

from openshell_agentmesh.skill import GovernanceSkill

skill = GovernanceSkill(policy_dir=Path("policies/"))
decision = skill.check_policy("shell", {"command": "rm -rf /tmp"})
```
24+
25+
After (`sb-runtime`):
26+
27+
```python
from pathlib import Path

from sb_runtime_skill import SbRuntimeSkill

skill = SbRuntimeSkill(
    policy_path=Path("policies/dev-safe.cedar"),
    receipts_dir=Path(".receipts"),
)
decision = skill.check_policy(
    "exec",
    {"command": "/usr/bin/rm", "args": ["-rf", "/tmp"]},
)
```
38+
39+
`PolicyDecision` is field-for-field compatible (`allowed`, `action`, `reason`, `policy_name`,
40+
`trust_score`) plus one sb-specific extension: `receipt_path` pointing at the signed receipt
41+
the `sb` binary just wrote. Your existing trust-score loops, audit exports, and dashboards
42+
keep working.
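
As a rough illustration of that compatibility claim, the decision object could be modelled as below. This is a sketch, not the shim's actual definition: the field names come from the paragraph above, while the field types, the `legacy_view` helper, and the receipt file name are assumptions made for the example.

```python
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class PolicyDecision:
    # The five shared fields; their types are assumptions.
    allowed: bool
    action: str
    reason: str
    policy_name: str
    trust_score: float
    # sb-specific extension: where the signed receipt was written.
    receipt_path: Optional[Path] = None


def legacy_view(decision: PolicyDecision) -> dict:
    """Hypothetical helper: only the fields an existing consumer expects."""
    data = asdict(decision)
    data.pop("receipt_path")
    return data


decision = PolicyDecision(
    allowed=False,
    action="exec",
    reason="denied by policy",
    policy_name="dev-safe",
    trust_score=0.5,
    receipt_path=Path(".receipts/0001.json"),  # hypothetical file name
)
print(sorted(legacy_view(decision)))
```

Because the extension is an optional trailing field, code that reads only the five shared fields does not need to change.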
43+
44+
## What you get that OpenShell doesn't
45+
46+
- **Tamper-evident audit trail.** Every decision — allow *and* deny — is an Ed25519-signed
47+
receipt, hash-chained by `prev_hash`. A regulator can verify the chain offline with
48+
`npx @veritasacta/verify` or `sb verify .receipts/` and needs to trust nothing except the
49+
issuer public key.
50+
- **No infrastructure.** No Docker daemon, no k3s control plane, no network proxy. Single
51+
binary. Fits in CI, edge, and dev-laptop workflows OpenShell finds awkward.
52+
- **Offline verification property.** Receipts keep verifying after the `sb` binary and the
53+
Skill and ScopeBlind are all gone.
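
The hash-chaining idea behind the first bullet can be sketched with the standard library alone. This is not the `sb-receipt` format: only `prev_hash` is taken from the text above, while SHA-256, the other field names, and the sorted-key JSON stand-in for JCS are assumptions. Ed25519 signature checks are left out.

```python
import hashlib
import json


def canonical_bytes(receipt: dict) -> bytes:
    # Stand-in for JCS (RFC 8785): sorted keys, no insignificant whitespace.
    # Adequate for this ASCII-only demo; not a full JCS implementation.
    return json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")


def receipt_hash(receipt: dict) -> str:
    return hashlib.sha256(canonical_bytes(receipt)).hexdigest()


def append_receipt(chain: list, decision: str, command: str) -> None:
    # Each receipt commits to the hash of the one before it.
    prev = receipt_hash(chain[-1]) if chain else None
    chain.append({"decision": decision, "command": command, "prev_hash": prev})


def chain_is_intact(chain: list) -> bool:
    expected = None
    for receipt in chain:
        if receipt["prev_hash"] != expected:
            return False
        expected = receipt_hash(receipt)
    return True


chain = []
append_receipt(chain, "allow", "/bin/echo hello")
append_receipt(chain, "deny", "/usr/bin/rm -rf /tmp")
print(chain_is_intact(chain))  # True

chain[0]["decision"] = "deny"  # tamper with an earlier receipt
print(chain_is_intact(chain))  # False
```

Chaining alone detects edits to any receipt that has a successor. Detecting a forged final receipt is what the per-receipt signature is for.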
54+
55+
## What you give up vs. OpenShell
56+
57+
- No network proxy interception layer (LoopbackOnly seccomp rule only, for now).
58+
- No multi-tenant k3s isolation.
59+
- Linux sandbox backend is x86_64-only in v0.1 (aarch64 refuses-to-run rather than
60+
silently falling back — see [issue #1](https://github.com/ScopeBlind/sb-runtime/issues/1)).
61+
- macOS and Windows run in `--allow-unsandboxed` mode (Cedar + receipts only) until
62+
v0.2 lands `sandbox_init` and AppContainer backends.
63+
64+
**Use OpenShell when you need full network-proxy enforcement or multi-tenant k3s.
65+
Use `sb-runtime` when those are overkill.**
66+
67+
## Open design questions (v0.1.0-alpha.1)
68+
69+
These are what I most want to hear from AGT maintainers + @lukehinds on:
70+
71+
1. **Separation of check and enforce.** `check_policy` in OpenShell's `GovernanceSkill` is
72+
pure evaluation — it does not run the action. `sb exec` currently evaluates *and* runs.
73+
We're planning an `sb exec --dry-run` mode in v0.1.1 so the Skill can check without running;
74+
the current shim uses a receipts-scan workaround. Is "check then separately enforce" the
75+
right mental model for this interface, or should `check_policy` also enforce?
76+
2. **Async or sync?** The current shim is sync (matching OpenShell's `check_policy`). If
77+
AGT is moving async in general, say the word and we'll ship async equivalents.
78+
3. **Receipt storage.** Should `SbRuntimeSkill` expose receipts via the existing audit log
79+
methods, or is the `receipt_path` on `PolicyDecision` plus `verify_chain()` enough for
80+
downstream consumers?
81+
4. **Trust score feedback loop.** The Skill currently reads trust score but doesn't write
82+
it back from receipt outcomes. Should receipt decisions auto-decay trust the way
83+
OpenShell's demo does?
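
To make question 1 concrete, here is a minimal sketch of how a shim might assemble the `sb exec` command line. It uses only the flags that appear in the project's own smoke test (`--policy`, `--receipts`, `--allow-unsandboxed`, and the `--` separator). The function name `build_sb_exec_argv` and its parameters are hypothetical, and the real shim may be structured differently. The planned `--dry-run` flag is left out because it has not shipped.

```python
from pathlib import Path
from typing import List, Sequence


def build_sb_exec_argv(
    policy_path: Path,
    receipts_dir: Path,
    command: str,
    args: Sequence[str] = (),
    allow_unsandboxed: bool = False,
    sb_binary: str = "sb",
) -> List[str]:
    """Assemble an argv list for `sb exec` (hypothetical helper)."""
    argv = [
        sb_binary, "exec",
        "--policy", str(policy_path),
        "--receipts", str(receipts_dir),
    ]
    if allow_unsandboxed:
        argv.append("--allow-unsandboxed")
    # Everything after "--" is the command sb evaluates and then runs.
    argv.append("--")
    argv.append(command)
    argv.extend(args)
    return argv


argv = build_sb_exec_argv(
    Path("policies/dev-safe.cedar"),
    Path(".receipts"),
    "/bin/echo",
    ["hello"],
)
print(argv)
```

Passing this list to `subprocess.run` would both evaluate and execute the command, which is the coupling question 1 asks about. Keeping the command as a list avoids shell quoting issues.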
84+
85+
## Run the smoke test
86+
87+
```bash
# Build + install the sb binary from source (one-time).
# The path is relative to examples/agt-integration/ and points into the sb-runtime root.
cargo install --path ../../crates/sb-cli

# Run the Python demo
python sb_runtime_skill.py --smoke
```
94+
95+
Expected output:
96+
97+
```
allow case: True allowed by policy smoke
deny case: False denied by policy smoke: deny policies: policy0
chain ok: True
```
102+
103+
## Status
104+
105+
Design-partner preview. Aligned to AGT `openshell-skill` interface as of 2026-04-17.
106+
If upstream changes, this shim follows; file an issue on sb-runtime if you spot drift.
107+
108+
Feedback especially welcome from the AGT + Sigstore + Red Hat alumni orbit — receipt/log
109+
design here tries to be compatible with the patterns you've already set.
