The context engineering layer for AI agents. Selects only the tools and skills relevant to each turn, recovering accuracy lost to tool overload and cutting what you pay per call. No vector DB, no embeddings.
- Cost: Every tool schema sent to the model is tokens you pay for. Fewer tools in context means lower spend on every call.
- Accuracy: Models get worse as tool lists grow. Some drop from 77% to 8% accuracy just from having too many options.
- Ratel fixes both: it indexes your full catalog and injects only the tools that match the current task, keeping the rest out of context entirely.
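To make the cost point concrete, here is a rough sketch of how schema size scales with the number of tools in context. It does not use Ratel. The 50-tool catalog is made up for illustration, and the four-characters-per-token estimate is only a crude assumption; real tokenizers will give different counts.

```python
import json

def estimate_tokens(text):
    # Crude heuristic (an assumption, not a real tokenizer): ~4 characters per token.
    return max(1, len(text) // 4)

def schema_tokens(tools):
    # Total estimated tokens for the JSON-serialized schemas sent with one call.
    return sum(estimate_tokens(json.dumps(tool)) for tool in tools)

# Hypothetical catalog of 50 similar tool schemas, for illustration only.
catalog = [
    {
        "name": f"tool_{i}",
        "description": "Example tool used only for illustration.",
        "input_schema": {"properties": {"path": {"type": "string"}}},
    }
    for i in range(50)
]
selected = catalog[:5]  # stand-in for the few tools relevant to the current turn

full = schema_tokens(catalog)
trimmed = schema_tokens(selected)
print(f"full catalog:   ~{full} tokens per call")
print(f"selected tools: ~{trimmed} tokens per call")
print(f"saved:          ~{full - trimmed} tokens per call")
```

Because the schemas are resent on every call, the difference is paid again on each turn of the agent loop.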
Across local, open-source, and frontier model setups, Ratel cuts token usage and recovers accuracy lost to tool overload — without embeddings or a vector DB. Full results: benchmark.ratel.sh
Building an agent in TypeScript or Python? Add the SDK:
pnpm add @ratel-ai/sdk
pip install ratel-ai
TypeScript example
import * as fs from "node:fs/promises";
import { ToolCatalog, searchCapabilitiesTool, invokeToolTool } from "@ratel-ai/sdk";

const catalog = new ToolCatalog();

// Register each tool once; the catalog indexes its name and description.
catalog.register({
  id: "read_file",
  name: "read_file",
  description: "Read a file from local disk.",
  inputSchema: { properties: { path: { type: "string" } } },
  execute: async ({ path }) => ({ contents: await fs.readFile(path, "utf8") }),
});

// Expose these two tools to the model instead of the full catalog.
const search = searchCapabilitiesTool(catalog);
const invoke = invokeToolTool(catalog);
Python example
from ratel_ai import ToolCatalog, ExecutableTool, search_capabilities_tool, invoke_tool_tool

catalog = ToolCatalog()
catalog.register(ExecutableTool(
    id="read_file",
    name="read_file",
    description="Read a file from local disk.",
    input_schema={"properties": {"path": {"type": "string"}}},
    execute=lambda args: {"contents": open(args["path"]).read()},
))

search = search_capabilities_tool(catalog)
invoke = invoke_tool_tool(catalog)
Examples: Vercel AI SDK · Pydantic AI
Using Claude Code, Cursor, or ChatGPT with MCP servers? Drop Ratel in front of your existing setup with no code changes:
npx -y @ratel-ai/mcp-server mcp import
Full docs: ratel-ai/ratel-mcp
When your agent needs to act, it calls search_capabilities. Ratel searches its internal index and returns only the most relevant tools. The model sees a short, focused list and picks correctly far more often.
The index uses BM25, the ranking function behind many full-text search engines, applied to each tool's name and description. It is fast, deterministic, and adds no latency to your agent loop.
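For readers unfamiliar with BM25, the following is a minimal pure-Python sketch of ranking tools by name and description. It is not Ratel's implementation (the real engine is the Rust core). The `TOOLS` list, the tokenizer, the `search` helper, and the `k1`/`b` values are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical catalog of (name, description) pairs; not taken from Ratel.
TOOLS = [
    ("read_file", "Read a file from local disk."),
    ("write_file", "Write text to a file on local disk."),
    ("send_email", "Send an email message to a recipient."),
    ("query_database", "Run a SQL query against a database."),
]

def tokenize(text):
    # Lowercase and split on non-alphanumerics, so "read_file" -> ["read", "file"].
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(tools):
    # One token list per tool, built from its name and description together.
    docs = [tokenize(f"{name} {desc}") for name, desc in tools]
    # Document frequency: in how many tools each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    avg_len = sum(len(doc) for doc in docs) / len(docs)
    return docs, doc_freq, avg_len

def bm25_scores(query, docs, doc_freq, avg_len, k1=1.2, b=0.75):
    n = len(docs)
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms weigh more; the +1 inside the log keeps idf positive.
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def search(query, tools, top_k=2):
    # Return the names of the top_k tools with a non-zero score, best first.
    docs, doc_freq, avg_len = build_index(tools)
    scores = bm25_scores(query, docs, doc_freq, avg_len)
    names = [name for name, _ in tools]
    ranked = sorted(zip(scores, names), key=lambda pair: -pair[0])
    return [name for score, name in ranked if score > 0][:top_k]

print(search("read the config file", TOOLS))
print(search("send an email", TOOLS))
```

The same query against the same catalog always yields the same ranking, which is the determinism the paragraph above refers to. No embedding model or external service is involved in this sketch.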
Ratel scales from an in-process library to a hosted service — one kernel, one protocol, all the way up. Four products share it:
| Product | Repo | What it is |
|---|---|---|
| Kernel + platform | ratel-ai/ratel (this one) | The ratel-ai-core engine plus TS/Python SDKs, CLI, and the coming server. Embed it today; run it standalone once the server lands. |
| ratel-local | ratel-ai/ratel-mcp | The local distribution — Ratel in front of your MCP setup, today shipped as ratel-mcp / @ratel-ai/mcp-server. |
| ratel-cloud | coming | Managed, hosted Ratel. Same protocol; SDKs reach it over the wire via RATEL_URL. |
| ratel-bench | ratel-ai/ratel-bench | The benchmark harness behind benchmark.ratel.sh. |
The server and hosted cloud are decided direction (ADR-0014), not yet public.
src/
├── core/ # ratel-ai-core — Rust BM25 engine
├── sdk/ts/ # @ratel-ai/sdk — TypeScript SDK (NAPI-bound)
└── sdk/python/ # ratel-ai — Python SDK (PyO3-bound)
examples/ # End-to-end SDK examples
docs/ # ADRs
Prerequisites: Rust stable, Node 24+, pnpm 10.28+. Python SDK: Python 3.9+ and uv.
cargo build --workspace && cargo test --workspace # Rust
pnpm install && pnpm -r build && pnpm -r test # TypeScript
# Python: see src/sdk/python/README.md
- CONTRIBUTING.md
- AGENTS.md — for coding agents working in this repo
The ratel-ai-core kernel is licensed under Apache-2.0 — an explicit patent grant for the engine others embed. Everything else (SDKs, CLI, examples) is MIT. See ADR-0017 for the rationale.
