oh-my-fable

Fable 5's way of working a long task — plan first, self-correct every step, never lose the thread — as a model-agnostic agent harness.

The fable is Fable 5's way of thinking; the oh-my- is because, like oh-my-zsh, you just want the good defaults. The mindset is the model's; the engine is any provider.

npm i oh-my-fable

The demos are magical. Then you point an agent at a real multi-hour task and it loops on the same step, loses the plan somewhere in a 40-message chat history, and — when your process restarts — forgets everything and starts over.

oh-my-fable encodes the way a strong reasoning model works a long task — the mindset, not the model — into a harness: plan first, self-correct every step, keep the thread, and finish. It's built around two mechanisms and one rule:

The whole run lives in a single RunContext — the only source of truth, and always serializable. It's checkpointed after every step.

From that one rule you get the thing most agent frameworks don't give you: a crash is a pause.

The name is about the thinking, not a model lock-in: the mindset is Fable 5's, the engine is whatever Provider you hand it (Anthropic, OpenAI-compatible, local, …).

── run run_mqf… ──
  📋 planned 3 steps: outline → draft → edit
  ▶  outline
     → outlined
     💾 checkpoint saved
  ▶  draft
  💥 the process just died (power outage, OOM, deploy, whatever)

── resuming from the last checkpoint ──
  ▶  draft                ← picks up exactly where it died
     💾 checkpoint saved
  ▶  edit
  ✅ done

  steps: outline [done], draft [done], edit [done]

const result = await run(goal, { provider, store });   // crashes at step 2
// ...process restarts...
await resume(result.runId, { provider, store });        // finishes from step 2

That's examples/scripted-run.mjs — run it with npm run example, no API key needed.

The three things it does that most frameworks don't

1. It survives crashes (resumable by construction)

State doesn't live in memory or in a chat transcript — it lives in RunContext, saved to disk after every step. Kill the process at step 47 of 60 and resume() continues from step 47, plan and progress intact. Swap the FileStore for SQLite/Redis by implementing one interface.
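
The checkpoint-after-every-step rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not oh-my-fable's implementation: `Ctx`, `InMemoryStore`, and `runSteps` are hypothetical names, and the store keeps JSON strings only to show that nothing but serializable state survives a crash.

```typescript
// Hypothetical shapes for illustration; not the library's real RunContext.
type StepStatus = "pending" | "done";
interface Step { id: string; status: StepStatus }
interface Ctx { runId: string; steps: Step[] }

// Stand-in store: keeps JSON strings, so only serializable state survives.
class InMemoryStore {
  private readonly data = new Map<string, string>();
  save(ctx: Ctx): void {
    this.data.set(ctx.runId, JSON.stringify(ctx));
  }
  load(runId: string): Ctx | undefined {
    const raw = this.data.get(runId);
    return raw === undefined ? undefined : (JSON.parse(raw) as Ctx);
  }
}

// Runs pending steps in order, checkpointing after each one.
// `crashAt` simulates the process dying while that step is in flight.
function runSteps(ctx: Ctx, store: InMemoryStore, crashAt?: string): Ctx {
  for (const step of ctx.steps) {
    if (step.status === "done") continue; // finished work is never redone
    if (step.id === crashAt) throw new Error(`crashed during ${step.id}`);
    step.status = "done";
    store.save(ctx); // checkpoint after every step
  }
  return ctx;
}

const store = new InMemoryStore();
const fresh: Ctx = {
  runId: "run_demo",
  steps: ["outline", "draft", "edit"].map((id): Step => ({ id, status: "pending" })),
};
store.save(fresh);

try {
  runSteps(fresh, store, "draft"); // dies at step 2
} catch {
  // After a real crash the in-memory object is gone; only the checkpoint remains.
}

const restored = store.load("run_demo");
if (restored === undefined) throw new Error("no checkpoint found");
const nextUp = restored.steps.find((s) => s.status === "pending");
console.log(`resuming at ${nextUp?.id}`);

const finished = runSteps(restored, store);
console.log(finished.steps.map((s) => `${s.id} [${s.status}]`).join(", "));
```

Because the loop skips steps already marked done, resuming is just running the same loop against the loaded checkpoint.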

2. It plans first, then self-corrects (plan ≠ history)

The plan is structured data that lives outside the conversation, so the model never loses track of "where am I" in a wall of text. After every step a reflector checks the result against the goal and routes:

| verdict | meaning | what happens |
| --- | --- | --- |
| `on_track` | normal progress | next step |
| `needs_replan` | the result changed the plan's assumptions | replan |
| `blocked` | same obstacle keeps recurring | replan around it / escalate |
| `goal_met` | success criteria satisfied | stop (even with steps left, no busywork) |

And replanning accumulates: finished steps are preserved verbatim; only the remaining work is regenerated. Long tasks move forward instead of restarting.
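
The routing and the accumulating replan can be sketched as two small pure functions. The verdict names come from the table above; the `Action` names, the `Step` shape, and the `replan` signature are assumptions made for this sketch and may not match the library's types.

```typescript
type Verdict = "on_track" | "needs_replan" | "blocked" | "goal_met";
type Action = "next_step" | "replan" | "stop";

// Maps a reflector verdict to the loop's next move.
function route(verdict: Verdict): Action {
  switch (verdict) {
    case "on_track":
      return "next_step";
    case "needs_replan":
      return "replan";
    case "blocked":
      return "replan"; // a real policy might escalate instead
    case "goal_met":
      return "stop"; // stop even if steps remain
  }
}

// Hypothetical step shape for illustration.
interface Step { id: string; intent: string; status: "pending" | "done" }

// Accumulating replan: finished steps are kept verbatim,
// only the remaining work is replaced.
function replan(steps: Step[], remaining: Array<{ id: string; intent: string }>): Step[] {
  const done = steps.filter((s) => s.status === "done");
  const next = remaining.map((r): Step => ({ ...r, status: "pending" }));
  return [...done, ...next];
}

const before: Step[] = [
  { id: "s1", intent: "outline", status: "done" },
  { id: "s2", intent: "draft", status: "pending" },
];
const after = replan(before, [
  { id: "s2b", intent: "draft from the new sources" },
  { id: "s3", intent: "edit" },
]);
console.log(route("needs_replan"), after.map((s) => `${s.id}:${s.status}`));
```

Keeping `replan` free of side effects means the old plan stays available for comparison or logging until the new one is checkpointed.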

3. It's deterministically testable (genuinely rare for an agent framework)

Because every model call is stateless, you can script the model and assert the loop's behavior — no network, no flakiness:

import { run, ScriptedProvider, reply, MemoryStore } from "oh-my-fable";

const provider = new ScriptedProvider([
  reply.plan([{ id: "s1", intent: "do the thing" }]),
  reply.text("did it"),
  reply.reflection("goal_met"),
]);

const { status } = await run("do the thing", { provider, store: new MemoryStore() });
expect(status).toBe("done"); // fully deterministic (expect comes from your test runner, e.g. Vitest or Jest)

The whole harness is tested this way — crash-recovery, replan-accumulation, budget halts, the tool loop — all without a single API call.

Quick start

import { run, AnthropicProvider } from "oh-my-fable";

const result = await run(
  {
    description: "Research the top 3 Rust web frameworks and write a comparison table",
    successCriteria: ["a markdown table comparing 3 frameworks exists"],
    constraints: ["only use information you can verify"],
  },
  { provider: new AnthropicProvider() }, // reads ANTHROPIC_API_KEY
);

console.log(result.status); // "done" | "halted" | "failed"
console.log(result.ctx.plan.steps);

npm i oh-my-fable        # zero runtime dependencies

Node ≥ 18. Ships with AnthropicProvider and OpenAICompatProvider (works with OpenAI, Ollama, LM Studio, OpenRouter, Groq… — ollama("llama3.1") for a local model with no key), both over fetch, no SDK. Or bring any model by implementing the Provider interface (three methods).

AnthropicProvider works with the current flagship models (claude-opus-4-8, claude-fable-5) out of the box: it drops the temperature parameter they reject, and it prompt-caches the system+tools prefix by default, so a long durable run pays ~10× less for the context it replays every step. Opt into { thinking: "adaptive", effort: "high" } for harder planning. The separate claude provider, which drives your Claude Code login, can report real cost and usage from --output-format json and run Claude's own tools ({ tools: true, permissionMode: "acceptEdits" }).

Or use it from the terminal

Don't want to write code? It ships a CLI (zero extra deps):

npx oh-my-fable demo                       # watch crash → resume, no API key

# ⭐ already pay for Claude Code? drive it as a DURABLE, TOOL-USING agent — your
#    login, no separate API key, $0 per token. Claude edits files & runs commands:
npx oh-my-fable run "refactor utils.ts and run the tests" --provider claude --cli-tools

# pure-reasoning over the same login (no tools):
npx oh-my-fable run "outline a talk on durable agents" --provider claude

# or a LOCAL model (Ollama / LM Studio), also no key:
npx oh-my-fable run "outline a talk on durable agents" --provider ollama --model llama3.1

# or any hosted model:
export ANTHROPIC_API_KEY=sk-...
npx oh-my-fable run "summarize README.md into SUMMARY.md" --tools fs

npx oh-my-fable list                       # your saved runs
npx oh-my-fable show  run_abc123           # the run's plan, steps & budget as a timeline
npx oh-my-fable resume run_abc123          # continue one from its checkpoint

You don't need an Anthropic API key. Pick how it talks to a model:

| `--provider` | uses | key? | tools? |
| --- | --- | --- | --- |
| `claude` | your Claude Code login | none | `--cli-tools` → Claude runs Read/Write/Edit/Bash itself |
| `codex` | your Codex CLI login | none | `--cli-tools` → workspace-write |
| `ollama` | a local Ollama model | none | `--tools fs` (harness-run) |
| `--base-url <url>` | LM Studio / OpenRouter / Groq / any OpenAI-compatible | per that server | `--tools fs` |
| `openai` | OpenAI | `OPENAI_API_KEY` | `--tools fs` |
| (default) | Anthropic | `ANTHROPIC_API_KEY` | `--tools fs` |

Two ways to give an agent hands:

--cli-tools (claude/codex): the CLI runs its own tools (file edits, shell) on your subscription. oh-my-fable stays the durable planner/reflector around it: it plans, checkpoints every step, and reflects, while the CLI does the work. Tune with --permission-mode acceptEdits|dontAsk|plan and --allow "Read,Edit,Bash(npm test)".
--tools fs (API providers): the harness gives the agent a sandboxed read_file/write_file/list_dir, confined to the working directory.
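
Confining file tools to the working directory usually comes down to one path check before every read or write. The helper below is a sketch of that idea, not the library's sandbox: `resolveInside` is a hypothetical name, and it does not follow symlinks, which a production sandbox would also need to consider.

```typescript
import * as path from "node:path";

// Resolves `requested` against `root` and rejects anything that escapes it.
// Hypothetical helper; the library's own sandbox may work differently.
function resolveInside(root: string, requested: string): string {
  const base = path.resolve(root);
  const target = path.resolve(base, requested);
  const rel = path.relative(base, target);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`path escapes the working directory: ${requested}`);
  }
  return target;
}

const root = path.resolve("workspace");
console.log(resolveInside(root, "notes/summary.md")); // inside the root: allowed

try {
  resolveInside(root, "../secrets.env");
} catch (err) {
  console.log((err as Error).message); // rejected before any file is touched
}
```

Checking the relative path rather than comparing string prefixes avoids treating a sibling such as `workspace-old` as if it were inside `workspace`.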

You watch the plan form and each step get reflected on, live. Every run is checkpointed, so resume <runId> always works — and show <runId> prints the whole run (plan, steps, budget) from its serialized RunContext.

Tools

import { run, defineTool, AnthropicProvider } from "oh-my-fable";

const search = defineTool(
  "web_search",
  "Search the web and return results.",
  { type: "object", properties: { query: { type: "string" } }, required: ["query"] },
  async ({ query }) => ({ ok: true, output: await fetchResults(query) }),
);

await run(goal, { provider: new AnthropicProvider(), tools: [search] });

A tool that throws becomes an Observation, not a crash — the reflector decides what to do about it.
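
One way to get that behavior is to wrap every tool call so an exception is converted into a failed result. The sketch below assumes an `Observation` shape that extends the `{ ok: true, output }` result shown above with a hypothetical `{ ok: false, error }` variant; `callTool` is likewise an illustrative name, not a library export.

```typescript
// Hypothetical Observation shape; only { ok: true, output } appears in the example above.
type Observation =
  | { ok: true; output: unknown }
  | { ok: false; error: string };

type Handler = (input: unknown) => Promise<Observation>;

// Runs a tool handler and converts any thrown error into a failed Observation.
async function callTool(handler: Handler, input: unknown): Promise<Observation> {
  try {
    return await handler(input);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}

// A tool that always throws, standing in for a network failure.
const flaky: Handler = async () => {
  throw new Error("rate limited");
};

callTool(flaky, { query: "rust web frameworks" }).then((obs) => {
  console.log(obs); // a failed Observation, not an unhandled exception
});
```

The loop then sees the failure as ordinary data, so the decision to retry, replan, or give up stays with the reflector.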

Watch it work

await run(goal, {
  provider,
  onEvent: (e) => console.log(e.type, e),
  // plan_created · step_start · step_done · reflection · replan · compaction · checkpoint · done · halted
});

It can't run away

Three hard ceilings, checked at the top of every loop turn, plus two recovery caps — exceed any and it halts cleanly, preserving all work:

await run(goal, {
  provider,
  maxSteps: 50,            // total step budget
  maxTokens: 2_000_000,    // cumulative token budget
  maxWallClockMs: 1_800_000, // wall-clock budget (30 minutes)
  maxStepAttempts: 3,      // a single step retried this many times → blocked
  maxReplans: 12,          // replan storm → halted
});
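
The three hard ceilings can be pictured as one check at the top of each loop turn. This is a sketch under assumptions: the option names match the example above, but the `Usage` shape, the `checkBudget` name, and the choice of `>=` as the comparison are illustrative rather than taken from the library.

```typescript
interface Limits { maxSteps: number; maxTokens: number; maxWallClockMs: number }

// Hypothetical running totals kept alongside the run's state.
interface Usage { steps: number; tokens: number; startedAtMs: number }

// Returns the name of the exceeded ceiling, or null if the run may take another turn.
function checkBudget(limits: Limits, usage: Usage, nowMs: number): string | null {
  if (usage.steps >= limits.maxSteps) return "maxSteps";
  if (usage.tokens >= limits.maxTokens) return "maxTokens";
  if (nowMs - usage.startedAtMs >= limits.maxWallClockMs) return "maxWallClockMs";
  return null;
}

const limits: Limits = { maxSteps: 50, maxTokens: 2_000_000, maxWallClockMs: 1_800_000 };
const usage: Usage = { steps: 12, tokens: 2_100_000, startedAtMs: 0 };

console.log(checkBudget(limits, usage, 60_000)); // logs: maxTokens
```

Passing the current time in as an argument keeps the check deterministic, which fits the scripted-testing approach described earlier.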

How it's built

A planner ↔ executor ↔ reflector loop over a serializable RunContext:

plan → [ budget? → next step → compact? → execute → reflect → checkpoint → route ] → done

planner — goal → ordered steps; replan accumulates instead of resetting.
executor — runs one step, including a provider-agnostic tool mini-loop.
reflector — heuristics first (cheap, certain), then the model, with JSON self-repair and a conservative fallback (a wrong early exit is worse than one more loop).
contextManager — folds old turns into digests so long runs stay inside the window; the plan is never compacted.
store / budget — checkpoint after every step; guard against runaways.
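
The reflector's conservative fallback can be illustrated with a small parser. Everything specific here is an assumption for the sketch: the `{"verdict": ...}` reply format, the `parseVerdict` name, and the choice of `on_track` as the default when the reply cannot be read. The source only says the fallback prefers one more loop over a wrong early exit.

```typescript
type Verdict = "on_track" | "needs_replan" | "blocked" | "goal_met";
const VERDICTS: readonly Verdict[] = ["on_track", "needs_replan", "blocked", "goal_met"];

// Pulls a verdict out of a model reply that may wrap its JSON in extra prose.
function parseVerdict(reply: string): Verdict {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start !== -1 && end > start) {
    try {
      // Simple self-repair: ignore anything outside the outermost braces.
      const parsed: unknown = JSON.parse(reply.slice(start, end + 1));
      if (typeof parsed === "object" && parsed !== null && "verdict" in parsed) {
        const candidate = (parsed as { verdict: unknown }).verdict;
        const match = VERDICTS.find((v) => v === candidate);
        if (match !== undefined) return match;
      }
    } catch {
      // Malformed JSON falls through to the conservative default below.
    }
  }
  // Conservative fallback: keep going rather than risk a wrong early exit.
  return "on_track";
}

console.log(parseVerdict('Looks complete. {"verdict": "goal_met"}')); // logs: goal_met
console.log(parseVerdict("I think we are done?")); // logs: on_track
```

An unreadable reply never stops the run in this sketch; the budget ceilings remain the backstop if the model keeps producing replies that cannot be parsed.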

Every piece is an interface you can replace without touching the core. The full architecture writeup is in ARCHITECTURE.md.

Roadmap

A web dashboard that tails a run's events and lets you resume from any checkpoint (show <runId> is the CLI version of this today).
More providers in-repo (OpenAI-compatible, local) — though it's a 3-method interface.
Parallel step execution for independent branches of the plan DAG.
Human-in-the-loop: pause for approval as a first-class step status.

💖 Sponsor

Free, MIT, zero-dependency, built in spare time. If it saved your agent from starting over:

⭐ Star the repo — it's how the next person building an agent finds it.
🍋 Sponsor via Lemon Squeezy — one-time or recurring.
