Architecture

Deep dive into the M87 Governed Swarm system design.

Governing Laws

These are the system's inviolable constraints. Each law maps directly to enforcement code and regression tests.

#	Law	Enforcement	Test File
1	Agents cannot execute tools	Runner is the only component that calls subprocess to execute tools	`proof-test.sh` (no tool calls outside runner)
2	No approval → no job	`govern_proposal()` gates all job minting	`test_governance_invariants.py`
3	Unknown state → DENY	All governance paths default to rejection	`test_governance_invariants.py::TestFailSafeInvariants`
4	DEH mismatch → reject	Runner recomputes envelope hash independently	`services/runner/tests/test_runner.py::test_deh_*`
5	Manifest drift → refuse	Runner compares job.manifest_hash to loaded hash	`services/runner/tests/test_runner.py::test_manifest_*`
6	Budget exhaustion → halt	Preemptive `try_*` gates in AutonomyBudgetTracker	`test_reversibility_gate_invariants.py::TestRunnerBudget*`
7	No artifacts → no completion	Runner requires verifiable completion_artifacts	`services/runner/tests/test_runner.py::test_artifact_*`
8	IRREVERSIBLE → human approval	Reversibility gate blocks without explicit approval	`test_reversibility_gate_invariants.py::TestReversibilityGate*`
9	READ_SECRETS → always DENY	Hardcoded rejection in govern_proposal()	`test_governance_invariants.py::test_read_secrets_*`
10	Toxic topology → escalate	SessionRiskTracker detects effect sequences	`test_governance_redteam_invariants.py::TestSessionRiskTracker`

Audit trail: Every law violation emits an event to m87:events with the specific law code and rejection reason.

No relaxation: These laws cannot be relaxed by configuration, environment variables, or runtime flags. The only exceptions are the documented emergency kill-switches, which log loudly when used.

Design Philosophy

M87 is built on one principle: autonomy requires governance.

Traditional agent systems give agents freedom to act. M87 inverts this: agents can only propose, and a separate governance layer decides what actually happens.

This creates:

Auditability: Every action has a traceable decision
Control: Humans can intervene at any point
Safety: Agents cannot escalate their own permissions

Core Flow

┌──────────┐     ┌──────────┐     ┌────────────┐     ┌─────────┐     ┌───────────┐
│  Intent  │ ──▶ │ Proposal │ ──▶ │ Governance │ ──▶ │   Job   │ ──▶ │ Execution │
└──────────┘     └──────────┘     └────────────┘     └─────────┘     └───────────┘
     │                │                  │                │                │
     │                │                  │                │                │
  Created by      Created by         Decides:         Minted only      Performed by
  user/system     adapters          ALLOW/DENY/       if approved       Runner
                                   REQUIRE_HUMAN

1. Intent

An intent is a request for something to happen. It can come from:

A user via the API
An external system
A scheduled trigger

{
  "intent_id": "i-abc123",
  "from": "user",
  "mode": "fix",
  "goal": "Fix the authentication bug in login.py"
}

2. Proposal

Adapters watch for intents and create proposals. A proposal specifies:

What effects are needed (READ_REPO, WRITE_PATCH, etc.)
A truth account (observations and claims supporting the proposal)
A risk score

{
  "proposal_id": "p-def456",
  "intent_id": "i-abc123",
  "agent": "Casey",
  "summary": "Fix null check in login.py:45",
  "effects": ["READ_REPO", "WRITE_PATCH", "RUN_TESTS"],
  "truth_account": {
    "observations": ["login.py:45 has uncaught exception"],
    "claims": [{"claim": "Fix is low risk", "confidence": 0.8}]
  },
  "risk_score": 0.3
}

3. Governance Decision

The governance engine evaluates the proposal against policy rules:

READ_SECRETS → Always DENY
Agent scope violation → DENY (agent proposing outside their effects)
Risk threshold exceeded → REQUIRE_HUMAN
DEPLOY → REQUIRE_HUMAN
Otherwise → ALLOW
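
The rule list above can be read as an ordered check where the first matching rule wins. The following is a minimal sketch under that assumption, not the actual `govern_proposal()` code: `AgentProfile` and `decide` are hypothetical names, and the missing-profile branch illustrates the "unknown state → DENY" law.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical container for an agent's effect scope and risk ceiling."""
    allowed_effects: frozenset
    max_risk: float


def decide(effects: Iterable[str], risk_score: float,
           profile: Optional[AgentProfile]) -> str:
    """Return "DENY", "REQUIRE_HUMAN", or "ALLOW", checking rules in listed order."""
    requested = set(effects)
    if profile is None:
        return "DENY"              # unknown agent: fail closed
    if "READ_SECRETS" in requested:
        return "DENY"              # always denied, regardless of scope
    if not requested <= profile.allowed_effects:
        return "DENY"              # agent proposing outside its effects
    if risk_score > profile.max_risk:
        return "REQUIRE_HUMAN"     # risk threshold exceeded
    if "DEPLOY" in requested:
        return "REQUIRE_HUMAN"
    return "ALLOW"                 # final fallback


casey = AgentProfile(frozenset({"READ_REPO", "WRITE_PATCH", "RUN_TESTS"}), 0.6)
print(decide(["READ_REPO", "WRITE_PATCH", "RUN_TESTS"], 0.3, casey))
```

Putting the DENY rules ahead of the REQUIRE_HUMAN rules means a proposal that is both out of scope and high risk is rejected outright rather than sent to a human.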

4. Job

If the decision is ALLOW (or REQUIRE_HUMAN after approval), a JobSpec is minted:

{
  "job_id": "j-ghi789",
  "proposal_id": "p-def456",
  "tool": "pytest",
  "args": ["tests/"],
  "timeout_seconds": 120
}

Jobs go to the m87:jobs Redis stream.

5. Execution

The Runner:

Consumes from m87:jobs stream only
Validates tool against allowlist
Executes with timeout
Reports result back to API
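
The validate and execute steps can be sketched as follows. This is an illustration only: `run_job` and the shape of the returned dict are assumptions, and stream consumption and result reporting are left out.

```python
import subprocess

# Allowlist values as given in the Tool Allowlist section of this document.
TOOL_ALLOWLIST = {"echo", "pytest", "git", "build"}


def run_job(job: dict) -> dict:
    """Validate a job against the allowlist, then execute it with a timeout."""
    job_id = job.get("job_id")
    tool = job.get("tool")
    timeout = job.get("timeout_seconds")

    # Fail closed: unknown tools and missing or invalid timeouts never execute.
    if tool not in TOOL_ALLOWLIST:
        return {"job_id": job_id, "status": "rejected", "reason": "tool not in allowlist"}
    if not isinstance(timeout, (int, float)) or timeout <= 0:
        return {"job_id": job_id, "status": "rejected", "reason": "invalid timeout"}

    try:
        # An argument list with no shell avoids shell interpretation of args.
        completed = subprocess.run(
            [tool, *job.get("args", [])],
            capture_output=True, text=True, timeout=timeout, check=False,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return {"job_id": job_id, "status": "failed", "reason": "timeout exceeded"}
    except OSError as exc:
        return {"job_id": job_id, "status": "failed", "reason": f"could not start tool: {exc}"}

    status = "completed" if completed.returncode == 0 else "failed"
    return {"job_id": job_id, "status": status, "returncode": completed.returncode}
```

Both rejection branches run before any process is started, so a malformed job cannot reach `subprocess.run`.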

Service Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                              External                                │
│                                                                     │
│   Users ──────▶ Dashboard (UI) ──────▶ Governance API               │
│                 :3000                   :8000                       │
└─────────────────────────────────────────────────────────────────────┘
                                            │
                                            │
┌─────────────────────────────────────────────────────────────────────┐
│                              Internal                                │
│                                                                     │
│   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐            │
│   │   Casey     │    │   Jordan    │    │   Riley     │            │
│   │  Adapter    │    │  Adapter    │    │  Adapter    │            │
│   │             │    │             │    │             │            │
│   │ (code)      │    │ (delivery)  │    │ (analysis)  │            │
│   └──────┬──────┘    └──────┬──────┘    └──────┬──────┘            │
│          │                  │                  │                    │
│          └──────────────────┼──────────────────┘                    │
│                             │                                       │
│                             ▼                                       │
│                    ┌─────────────────┐                             │
│                    │  Governance API │                             │
│                    │    (FastAPI)    │                             │
│                    └────────┬────────┘                             │
│                             │                                       │
│          ┌──────────────────┼──────────────────┐                    │
│          │                  │                  │                    │
│          ▼                  ▼                  ▼                    │
│   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐            │
│   │   Runner    │    │  Notifier   │    │    Redis    │            │
│   │             │    │             │    │             │            │
│   │ (executes   │    │ (observes   │    │ m87:events  │            │
│   │  jobs)      │    │  events)    │    │ m87:jobs    │            │
│   └─────────────┘    └─────────────┘    └─────────────┘            │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Service Responsibilities

Service	Responsibility	Can Propose	Can Execute	Can Approve
API	Governance decisions	No	No	No (routes to humans)
Runner	Job execution	No	Yes	No
Notifier	Event observation	No	No	No
Casey	Code proposals	Yes	No	No
Jordan	Delivery proposals	Yes	No	No
Riley	Analysis proposals	Yes	No	No
Dashboard	Human interface	No	No	Yes (via API)

Redis Streams

M87 uses Redis Streams for event sourcing and job queuing.

m87:events

The audit log. Every significant event is recorded here:

intent.created      - New intent received
proposal.allowed    - Proposal approved automatically
proposal.denied     - Proposal rejected
proposal.needs_approval - Proposal awaiting human approval
proposal.approved   - Human approved a proposal
job.created         - Job minted
job.completed       - Job finished successfully
job.failed          - Job failed

m87:jobs

The work queue. Only approved jobs appear here:

{
  "job_id": "j-xxx",
  "proposal_id": "p-xxx",
  "tool": "pytest",
  "args": ["tests/"],
  "timeout_seconds": 120
}

Consumer Groups

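runner-group: Each job is delivered to a single Runner consumer in the group (Redis consumer groups provide at-least-once delivery, so a job that is never acknowledged stays pending and can be redelivered)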
notifier-group: Notifier observes events for alerting
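
As a rough sketch of how a consumer in runner-group might read and acknowledge one job, assuming a redis-py style client is passed in: `consume_once`, the consumer name, and the `handle` callback are hypothetical, and the group is assumed to exist already.

```python
STREAM = "m87:jobs"
GROUP = "runner-group"


def consume_once(client, consumer_name, handle, block_ms=5000):
    """Read at most one new job for this consumer, handle it, then acknowledge it.

    `client` is expected to expose redis-py's xreadgroup() and xack() methods.
    Returns the number of jobs handled (0 or 1).
    """
    # ">" asks for entries never delivered to any consumer in this group.
    response = client.xreadgroup(GROUP, consumer_name, {STREAM: ">"},
                                 count=1, block=block_ms)
    handled = 0
    for _stream, entries in response or []:
        for entry_id, fields in entries:
            handle(fields)
            # Acknowledge only after the handler returns; if it raises,
            # the entry stays in the group's pending list.
            client.xack(STREAM, GROUP, entry_id)
            handled += 1
    return handled
```

With a real client this would be called in a loop, for example `consume_once(redis.Redis(), "runner-1", run_job)`, where `run_job` stands for whatever function executes the job.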

Agent Effect Scopes

Each agent has a defined scope of effects they can propose:

Casey:  {READ_REPO, WRITE_PATCH, RUN_TESTS}      max_risk: 0.6
Jordan: {SEND_NOTIFICATION, BUILD_ARTIFACT,      max_risk: 0.5
         CREATE_PR, READ_REPO}
Riley:  {READ_REPO, BUILD_ARTIFACT,              max_risk: 0.4
         SEND_NOTIFICATION}
Human:  {all effects}                            max_risk: 1.0

If an agent proposes an effect outside their scope, the proposal is denied.

If an agent proposes with a risk score above their threshold, the proposal requires human approval.

Security Model

Authentication

All mutating endpoints require the X-M87-Key header:

curl -X POST http://localhost:8000/v1/govern/proposal \
  -H "X-M87-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '...'
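
The same request can be built from Python with only the standard library. This is a small sketch: `build_proposal_request` is a hypothetical helper, and the base URL and key are placeholders taken from the curl example.

```python
import json
import urllib.request


def build_proposal_request(base_url: str, api_key: str, proposal: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST to the proposal endpoint."""
    body = json.dumps(proposal).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/govern/proposal",
        data=body,
        headers={"X-M87-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_proposal_request(
    "http://localhost:8000", "your-api-key",
    {"proposal_id": "p-def456", "agent": "Casey", "effects": ["READ_REPO"]},
)
# To send it: urllib.request.urlopen(request, timeout=10)
```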

Protected Endpoints

Endpoint	Requires Auth
POST /v1/govern/proposal	Yes
POST /v1/approve/{id}	Yes
POST /v1/deny/{id}	Yes
POST /v1/runner/result	Yes
GET /v1/events	No
GET /v1/agents	No

Network Isolation

Redis has no public port (internal only)
Adapters can only reach the API, not the runner
Runner executes in isolation with a fixed tool allowlist

Tool Allowlist

The runner only executes these tools:

TOOL_ALLOWLIST = {"echo", "pytest", "git", "build"}

Any other tool is rejected.

Failure Modes

M87 is designed to fail closed, and each failure stays local to the request or job that triggered it:

Failure	Behavior
Unknown effect	DENY
Agent scope violation	DENY
Unknown tool	Job rejected
Timeout exceeded	Job killed, marked failed
Redis unavailable	API returns 503
Missing API key	401 Unauthorized

Data Flow Example

1. User creates intent via API
2. Intent emitted to m87:events
3. Casey adapter sees intent.created
4. Casey builds proposal with effects [READ_REPO, WRITE_PATCH]
5. Casey submits proposal to /v1/govern/proposal
6. Governance checks:
   - Casey can propose READ_REPO, WRITE_PATCH ✓
   - Risk 0.3 < Casey's max 0.6 ✓
   - No READ_SECRETS ✓
   - Not DEPLOY ✓
7. Decision: ALLOW
8. Job minted to m87:jobs
9. Runner consumes job
10. Runner validates tool in allowlist
11. Runner executes with timeout
12. Runner reports result to /v1/runner/result
13. Result emitted to m87:events
14. Notifier sees job.completed, can alert

Claude Code Integration

The .claude/ directory teaches Claude Code to see this as a governed system:

.claude/
├── rules/           # Path-scoped governance rules
│   ├── governance.md
│   ├── adapters.md
│   ├── runner.md
│   ├── contracts.md
│   └── infra.md
├── settings.json    # Hooks and permissions
└── models/          # Model routing
    ├── explore.yaml
    └── implement.yaml

This ensures Claude Code:

Respects governance boundaries when editing
Understands what each service can/cannot do
Doesn't accidentally break invariants

Runner Governance Stack

The Runner enforces defense-in-depth governance for all JobSpecs pulled from m87:jobs.

┌─────────────────────────────────────────────────────────────┐
│                    RUNNER GOVERNANCE STACK                   │
├─────────────────────────────────────────────────────────────┤
│  (1) Capability Declaration                                  │
│      └─ DeploymentEnvelope + DEH verification                │
│                                                              │
│  (2) Rate & Blast-Radius Control                             │
│      └─ AutonomyBudget + preemptive try_* gates              │
│      └─ Write scope gating (scope_rank)                      │
│                                                              │
│  (3) Egress Hard-Stop                                        │
│      └─ governed_request() — single choke point              │
└─────────────────────────────────────────────────────────────┘

Job Lifecycle (Governed)

1. API receives a Proposal and mints a JobSpec only after governance decisions.
2. API computes and pins:
   - manifest_hash
   - deployment_envelope
   - envelope_hash (DEH)
3. Runner consumes the JobSpec and enforces:
   - Manifest drift refusal (manifest_hash must match runner manifest)
   - DEH verification (recompute and compare)
   - Autonomy Budget gates (preemptive)
   - Artifact-backed completion enforcement
4. Runner reports bounded, sanitized results including governance evidence.
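
A preemptive gate checks the budget before the action happens, so the limit is never exceeded and later detected. The sketch below shows that pattern only. `SketchBudgetTracker`, its two counters, and the `try_step` and `try_tool_call` names are hypothetical stand-ins, not the real AutonomyBudgetTracker interface.

```python
from dataclasses import dataclass


@dataclass
class SketchBudgetTracker:
    """Hypothetical budget with two counters, each guarded by a try_* gate."""
    max_steps: int
    max_tool_calls: int
    steps_used: int = 0
    tool_calls_used: int = 0

    def try_step(self) -> bool:
        """Reserve one step; return False (and change nothing) if exhausted."""
        if self.steps_used >= self.max_steps:
            return False
        self.steps_used += 1
        return True

    def try_tool_call(self) -> bool:
        """Reserve one tool call; return False (and change nothing) if exhausted."""
        if self.tool_calls_used >= self.max_tool_calls:
            return False
        self.tool_calls_used += 1
        return True


budget = SketchBudgetTracker(max_steps=2, max_tool_calls=1)
if budget.try_tool_call():
    pass  # the tool would run here, only after the gate has passed
```

A caller that receives False is expected to halt the job instead of retrying, which matches the "budget exhaustion → halt" law.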

Machine-Verifiable Evidence

Runner results include:

deh_evidence:
- envelope_hash_verified (bool)
- deh_claimed
- deh_recomputed
autonomy_budget + autonomy_usage
completion_artifacts (verifiable hashes)
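
The deh_evidence fields could be produced along these lines. The hashing scheme here (SHA-256 over canonical JSON) is an assumption made for illustration, as are the function names and the example envelope contents. Only the three evidence field names come from this document.

```python
import hashlib
import hmac
import json


def compute_envelope_hash(envelope: dict) -> str:
    """Hash a canonical JSON encoding so key order cannot change the result."""
    canonical = json.dumps(envelope, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_deh(job: dict) -> dict:
    """Recompute the envelope hash independently and compare it to the claim."""
    claimed = job.get("envelope_hash")
    envelope = job.get("deployment_envelope")
    if not isinstance(claimed, str) or not isinstance(envelope, dict):
        # Missing or malformed inputs are treated as a mismatch (fail closed).
        return {"envelope_hash_verified": False,
                "deh_claimed": claimed, "deh_recomputed": None}
    recomputed = compute_envelope_hash(envelope)
    verified = hmac.compare_digest(claimed.encode("utf-8"),
                                   recomputed.encode("utf-8"))
    return {"envelope_hash_verified": verified,
            "deh_claimed": claimed, "deh_recomputed": recomputed}


envelope = {"allowed_tools": ["pytest"], "max_runtime_seconds": 120}  # illustrative
job = {"deployment_envelope": envelope,
       "envelope_hash": compute_envelope_hash(envelope)}
evidence = verify_deh(job)
```

Reporting both the claimed and the recomputed hash lets a reviewer see why a job was refused without rerunning it.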

Trust Boundary

All of the enforcement described in this section happens in the Runner, the only component authorized to execute tools, so policy can't be bypassed by upstream orchestration.

API Governance Stack (Phase 3-6)

The API enforces additional governance before jobs are minted.

┌─────────────────────────────────────────────────────────────┐
│                    API GOVERNANCE STACK                      │
├─────────────────────────────────────────────────────────────┤
│  Phase 3: Session Risk Tracking                              │
│      └─ SessionRiskTracker (Redis-backed)                   │
│      └─ Toxic topology detection (salami-slicing)           │
│      └─ Fail-closed on sensor blindness                     │
│                                                              │
│  Phase 5: Code Artifact Inspection                           │
│      └─ Tripwire scan (subprocess-based, async-safe)        │
│      └─ Detects: socket, requests, subprocess, eval, etc.   │
│                                                              │
│  Phase 6: Human Override Protection                          │
│      └─ Challenge-response for REQUIRE_HUMAN                │
│      └─ Proposal hash binding (prevents replay)             │
└─────────────────────────────────────────────────────────────┘
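
To give a feel for what a Phase 5 tripwire might flag, here is a minimal sketch that walks a Python AST looking for the modules and calls named in the diagram. It is not the project's scanner: the AST approach, the `tripwire_scan` name, and the finding format are assumptions, and this version runs in-process while the diagram describes a subprocess-based scan.

```python
import ast

# Names taken from the "Detects" line in the diagram; the real list is longer ("etc.").
FLAGGED_MODULES = {"socket", "requests", "subprocess"}
FLAGGED_CALLS = {"eval"}


def tripwire_scan(source: str) -> list:
    """Return a sorted list of findings; an empty list means nothing was flagged."""
    try:
        tree = ast.parse(source)
    except (SyntaxError, ValueError):
        return ["unparseable"]  # fail closed: code that cannot be inspected is flagged

    findings = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in FLAGGED_MODULES:
                    findings.add(f"import:{root}")
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or "").split(".")[0]
            if root in FLAGGED_MODULES:
                findings.add(f"import:{root}")
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Name)
              and node.func.id in FLAGGED_CALLS):
            findings.add(f"call:{node.func.id}")
    return sorted(findings)
```

A static walk like this misses dynamic tricks such as `__import__` or `getattr`, so it works as a tripwire and not as a sandbox.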

Toxic Topologies Detected

Topology	Effects	Decision
`repo_read_then_network`	READ_REPO → NETWORK_CALL	REQUIRE_HUMAN
`secrets_then_network`	READ_SECRETS → NETWORK_CALL	DENY
`write_then_deploy`	WRITE_PATCH → DEPLOY	REQUIRE_HUMAN
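
One way to read the table is as ordered effect patterns matched against a session's history. The sketch below assumes the effects may be separated by other effects (the salami-slicing case) and that the strictest matching decision wins. `session_decision` and the in-memory history list are hypothetical; the document describes the real tracker as Redis-backed.

```python
# (name, ordered effect pattern, decision), copied from the table above.
TOXIC_TOPOLOGIES = [
    ("repo_read_then_network", ("READ_REPO", "NETWORK_CALL"), "REQUIRE_HUMAN"),
    ("secrets_then_network", ("READ_SECRETS", "NETWORK_CALL"), "DENY"),
    ("write_then_deploy", ("WRITE_PATCH", "DEPLOY"), "REQUIRE_HUMAN"),
]
SEVERITY = {"ALLOW": 0, "REQUIRE_HUMAN": 1, "DENY": 2}


def contains_in_order(history, pattern) -> bool:
    """True if pattern occurs in history in order, gaps allowed."""
    remaining = iter(history)
    # Each `in` test advances the iterator past its match, preserving order.
    return all(effect in remaining for effect in pattern)


def session_decision(history):
    """Return (strictest decision, names of matched topologies) for a session."""
    decision, matched = "ALLOW", []
    for name, pattern, verdict in TOXIC_TOPOLOGIES:
        if contains_in_order(history, pattern):
            matched.append(name)
            if SEVERITY[verdict] > SEVERITY[decision]:
                decision = verdict
    return decision, matched


print(session_decision(["READ_REPO", "RUN_TESTS", "NETWORK_CALL"]))
```

Each effect in that example may be harmless on its own; the escalation comes from the sequence across the session.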

No Bypass Guarantee

Both /v1 and /v2 governance endpoints delegate to the same Phase 3-6 enforcement:

/v1/govern/proposal → evaluate_governance_proposal() → Phase 3-6
/v2/govern/proposal → evaluate_governance_proposal() → Phase 3-6

Unless the kill-switch below is set, no execution path can enqueue jobs without passing Phase 3-6 governance.

Kill-switch: M87_DISABLE_PHASE36_GOVERNANCE=1 (emergency only, logs loudly)

Extending the System

Adding a New Agent

Define the effect scope in AGENT_PROFILES in apps/api/app/main.py
Create adapter in services/{name}-adapter/
Add to docker-compose.yml
Document in .claude/rules/adapters.md

Adding a New Effect

Add to the TypeScript contracts in packages/contracts/src/effects.ts
Add to API effect validation
Update agent scopes if needed
Add to runner tool allowlist if executable

Adding a New Policy Rule

Edit govern_proposal() in apps/api/app/main.py
Add rule before the final ALLOW fallback
Document in .claude/rules/governance.md
