diff --git a/.antigravity-plugin/README.md b/.antigravity-plugin/README.md new file mode 100644 index 000000000..3110cbbc5 --- /dev/null +++ b/.antigravity-plugin/README.md @@ -0,0 +1,77 @@ +# MemPalace — Antigravity plugin + +In-repo packaging for the MemPalace integration with Google's [Antigravity IDE](https://antigravity.google/). + +This directory is the source of truth for what gets installed at +`~/.gemini/config/plugins/mempalace/` when the user runs the installer. + +## Layout + +``` +.antigravity-plugin/ +├── plugin.json # marker manifest (verified minimal schema) +├── mcp_config.json # auto-registers the mempalace-mcp stdio server +├── hooks.json.tmpl # template — installer renders to hooks.json +├── skills/ +│ ├── mempalace/ +│ │ └── SKILL.md # ops skill: setup, mine, status, CLI delegation +│ └── mempalace-recall/ +│ └── SKILL.md # recall-only skill: search-before-answer protocol +├── rules/ +│ └── mempalace-recall.md # optional recall rule (complements the skill) +└── README.md # this file +``` + +The hook scripts themselves live at `hooks/antigravity/`. The installer +copies them into `/hooks/` and renders `hooks.json.tmpl` +into a `hooks.json` whose `command` paths point at the absolute install +location. + +## Three recall layers + +MemPalace can store everything, but it only helps if the agent actually +*reads* the palace before answering. Three layers wire that in, from +eager to on-demand: + +1. **Wake hook** (`hooks/antigravity/mempal_wake_hook_antigravity.sh`, + `PreInvocation` event, gated to `invocationNum == 1`). On the first + model call of a conversation it runs `mempalace wake-up` and injects + the **actual palace content verbatim** via Antigravity's + `injectSteps[].ephemeralMessage` output. This is Antigravity's native + equivalent of Cursor's `sessionStart` `additional_context`, except it + delivers the memory itself rather than a directive to go fetch it. +2. **Recall skill** (`skills/mempalace-recall/SKILL.md`). 
The + search-before-answer protocol the agent follows when a turn is + recall-relevant — tool selection, unhappy paths, anti-patterns. It + covers recall only; the `mempalace` skill covers setup / mine / + status. +3. **Optional recall rule** (`rules/mempalace-recall.md`). A lightweight + markdown rule that nudges the agent to search before answering when + Antigravity's matcher decides the turn is recall-relevant. It is + deliberately recall-scoped (not an always-on global rule) so it never + adds latency to greenfield work, honouring MemPalace's "memory should + feel instant" budget. + +All three point to the single canonical protocol in +[`integrations/shared/recall-protocol.md`](../integrations/shared/recall-protocol.md) +so the skill and rule never drift. + +## Install + +```bash +bash hooks/antigravity/install.sh +``` + +The installer is idempotent and the uninstaller matches by basename, so +re-runs and partial installs are safe. + +See [website/guide/antigravity.md](../website/guide/antigravity.md) for +the full user-facing guide and [hooks/antigravity/README.md](../hooks/antigravity/README.md) +for the hooks-specific documentation. + +## Verified surfaces + +Every file in this directory maps to a surface verified against +[Google's Antigravity docs](https://antigravity.google/docs/). See +[hooks/antigravity/INVESTIGATION.md](../hooks/antigravity/INVESTIGATION.md) +for the full audit, including the surfaces deliberately not shipped. 
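As an illustration of the rendering step described under Layout, here is a minimal Python sketch of substituting the install location into the hook template. It assumes the template marks the install directory with a `__PLUGIN_DIR__` token and nests entries as `{hook name: {event: [entries]}}`, as the `hooks.json.tmpl` in this change does. `render_hooks` is a hypothetical helper; the real installer is a bash script and may work differently.

```python
import json
import os

PLACEHOLDER = "__PLUGIN_DIR__"  # token assumed from hooks.json.tmpl


def render_hooks(template_text: str, plugin_dir: str) -> dict:
    """Return the hooks config with every command path made absolute."""
    abs_dir = os.path.abspath(os.path.expanduser(plugin_dir))
    # Parse first, then substitute inside the parsed strings, so a path
    # containing quotes or backslashes cannot produce invalid JSON.
    config = json.loads(template_text)
    for events in config.values():          # hook name -> events
        for entries in events.values():     # event name -> list of entries
            for entry in entries:
                entry["command"] = entry["command"].replace(PLACEHOLDER, abs_dir)
    return config


template = """
{
  "mempalace-wake": {
    "PreInvocation": [
      {"type": "command",
       "command": "__PLUGIN_DIR__/hooks/mempal_wake_hook_antigravity.sh",
       "timeout": 5}
    ]
  }
}
"""

rendered = render_hooks(template, "/opt/plugins/mempalace")
print(json.dumps(rendered, indent=2))
```

Substituting after parsing rather than on the raw text is a design choice of this sketch, not something the source states about the installer.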
diff --git a/.antigravity-plugin/hooks.json.tmpl b/.antigravity-plugin/hooks.json.tmpl new file mode 100644 index 000000000..7aab4140f --- /dev/null +++ b/.antigravity-plugin/hooks.json.tmpl @@ -0,0 +1,20 @@ +{ + "mempalace-save": { + "Stop": [ + { + "type": "command", + "command": "__PLUGIN_DIR__/hooks/mempal_save_hook_antigravity.sh", + "timeout": 30 + } + ] + }, + "mempalace-wake": { + "PreInvocation": [ + { + "type": "command", + "command": "__PLUGIN_DIR__/hooks/mempal_wake_hook_antigravity.sh", + "timeout": 5 + } + ] + } +} diff --git a/.antigravity-plugin/mcp_config.json b/.antigravity-plugin/mcp_config.json new file mode 100644 index 000000000..ca633f5f5 --- /dev/null +++ b/.antigravity-plugin/mcp_config.json @@ -0,0 +1,7 @@ +{ + "mcpServers": { + "mempalace": { + "command": "mempalace-mcp" + } + } +} diff --git a/.antigravity-plugin/plugin.json b/.antigravity-plugin/plugin.json new file mode 100644 index 000000000..14768cbc1 --- /dev/null +++ b/.antigravity-plugin/plugin.json @@ -0,0 +1,3 @@ +{ + "name": "mempalace" +} diff --git a/.antigravity-plugin/rules/mempalace-recall.md b/.antigravity-plugin/rules/mempalace-recall.md new file mode 100644 index 000000000..f5edf3b64 --- /dev/null +++ b/.antigravity-plugin/rules/mempalace-recall.md @@ -0,0 +1,18 @@ +# MemPalace recall + +Before answering anything that may already be in the user's memory +palace — past work, prior decisions, a person, a project, or "what did +we do / decide / discuss last time?" — search the palace first: + +1. Call `mempalace_search` with a short keyword query. Use + `mempalace_kg_query` for relational or time-bound facts. +2. Quote the drawer's **verbatim** text. Never summarize or paraphrase + stored content. +3. If results are empty, say so — do not invent an answer. If the MCP + server is unavailable, surface the error; do not fall back to guessing. + +Skip recall for pure greenfield work with no memory relevance (renaming +a variable, fixing a typo). 
Recall is question-driven, not reflexive. + +Full protocol: . Deeper guidance: +the `mempalace-recall` skill. diff --git a/.antigravity-plugin/skills/mempalace-recall/SKILL.md b/.antigravity-plugin/skills/mempalace-recall/SKILL.md new file mode 100644 index 000000000..b4ba5fba2 --- /dev/null +++ b/.antigravity-plugin/skills/mempalace-recall/SKILL.md @@ -0,0 +1,119 @@ +--- +name: mempalace-recall +description: "Recall protocol for MemPalace — search the palace before answering about past work, people, projects, or prior decisions. Apply when the user asks what was decided, what happened before, who someone is, what was discussed last time, or anything that may already be filed in their memory palace; or when mempalace-recall is invoked. Complements the mempalace setup skill and requires the mempalace-mcp server." +--- + +# MemPalace Recall + +Search-before-answer protocol for MemPalace. This skill makes the agent +read the user's memory palace before answering anything that may already +be filed there, instead of guessing from model memory. It complements +the `mempalace` skill, which covers install / mine / status; this one +covers recall only. + +## Step 0 — Verify MemPalace is available + +Before relying on recall, confirm MemPalace is installed and reachable: + +- Official release page: +- Check installed: `mempalace --version` +- Do not assume a version — the MCP tool set is the source of truth for + what this installed build supports. + +If the `mempalace_*` MCP tools are not available, tell the user the +server is not connected and point them at the `mempalace` skill to set +it up. Do not silently fall back to answering from model memory. + +## Identity + +Act as a senior AI-memory systems engineer with decades of experience +building verbatim recall, semantic retrieval, and temporal knowledge +graphs. Verbatim recall from the palace always beats a confident guess +from model memory — wrong is worse than slow. 
+ +## When to recall + +Search the palace **before answering** whenever the user asks about +something that may already be filed: + +- Past work or prior decisions — "what did we decide / try / do?" +- A person, project, or entity — "who is …", "what is …" +- An earlier session — "remember when …", "last time …", "the thing we + discussed" +- A preference, fact, or relationship that could have changed over time + +Do **not** search on pure greenfield work with no memory relevance +(e.g. "rename this variable", "fix this typo"). Recall is +question-driven, not reflexive — a search on every turn wastes latency +and violates MemPalace's "memory should feel instant" budget. + +## Protocol + +1. On wake-up, the MemPalace PreInvocation hook injects verbatim palace + content via `injectSteps[].ephemeralMessage` on the first model call + of a conversation. If memory was injected, start from it before + searching further. +2. **Before responding** about people, projects, past events, or prior + decisions: call `mempalace_search` first. For relational or temporal + facts ("who reported to whom in March", "what was true then"), call + `mempalace_kg_query` instead or as well. +3. **If unsure** about a fact (name, age, relationship, preference): say + "let me check the palace" and query. Wrong is worse than slow. +4. **Return verbatim.** Quote the drawer's exact stored words. Never + summarize, paraphrase, or lossy-compress what the palace returns — + that is the whole point of the system. +5. **After a substantive session**, record continuity with + `mempalace_diary_write` (background hooks may already do this — do not + double-file). +6. **When a fact changes**, call `mempalace_kg_invalidate` on the old + fact, then `mempalace_kg_add` for the new one. 
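The invalidate-then-add rule in step 6 can be pictured with a toy time-valid fact store. This is a sketch under stated assumptions, not MemPalace's knowledge graph: `ToyKnowledgeGraph`, its method names, and the integer timestamps are all hypothetical, and the real `mempalace_kg_*` tools may take different arguments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: int                  # hypothetical integer timestamp
    valid_to: Optional[int] = None   # None means the fact is still current


class ToyKnowledgeGraph:
    """In-memory stand-in that keeps old facts instead of overwriting them."""

    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def add(self, subject: str, predicate: str, obj: str, at: int) -> None:
        self.facts.append(Fact(subject, predicate, obj, valid_from=at))

    def invalidate(self, subject: str, predicate: str, at: int) -> None:
        # Close the validity window of every still-current matching fact.
        for fact in self.facts:
            if (fact.subject == subject and fact.predicate == predicate
                    and fact.valid_to is None):
                fact.valid_to = at

    def query(self, subject: str, predicate: str, at: int) -> Optional[str]:
        # Return the object whose validity window contains `at`, if any.
        for fact in self.facts:
            if (fact.subject == subject and fact.predicate == predicate
                    and fact.valid_from <= at
                    and (fact.valid_to is None or at < fact.valid_to)):
                return fact.obj
        return None


kg = ToyKnowledgeGraph()
kg.add("alice", "reports_to", "bob", at=1)
kg.invalidate("alice", "reports_to", at=5)   # the fact changed at t=5
kg.add("alice", "reports_to", "carol", at=5)

print(kg.query("alice", "reports_to", at=3))  # bob
print(kg.query("alice", "reports_to", at=7))  # carol
```

Because the old fact is closed rather than deleted, a question about "what was true then" still has an answer, which is the behaviour the protocol relies on.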
+ +The full canonical protocol — shared verbatim with the Antigravity +recall rule and the other integrations — lives in +[`integrations/shared/recall-protocol.md`](https://github.com/MemPalace/mempalace/blob/main/integrations/shared/recall-protocol.md). + +## Tool selection + +| You need | Tool | +|---|---| +| Find any memory by meaning | `mempalace_search` (start here) | +| Relational / time-bound facts about an entity | `mempalace_kg_query` | +| The chronological story of an entity | `mempalace_kg_timeline` | +| Recent session continuity | `mempalace_diary_read` | +| Which wings / rooms exist (scope unknown) | `mempalace_list_wings`, `mempalace_list_rooms` | +| Record this session | `mempalace_diary_write` | + +`mempalace_search` takes a short natural-language `query` (keywords or a +question — not a system prompt or pasted conversation) plus optional +`wing` / `room` filters and `limit` (default 5). + +## Unhappy paths + +- **Empty results.** Say the palace has nothing on this; do not invent an + answer to fill the gap. Offer to widen the search (drop the wing + filter) or to file the new information. +- **MCP unavailable / tool error.** Surface the error plainly and suggest + the user verify the server (`mempalace status`, or re-run the + installer `hooks/antigravity/install.sh`). Do not silently fall back + to guessing from model memory. +- **Stale or conflicting facts.** Prefer the knowledge graph's + time-valid answer; if a fact has changed, invalidate the old one and + add the new one rather than overwriting context silently. + +## Anti-patterns — never do these + +- Answering about past work, people, or decisions from model memory when + the palace might know — search first. +- Paraphrasing or summarizing stored content instead of quoting it + verbatim. +- Searching reflexively on every turn, including pure greenfield coding + with no memory relevance. 
+- Pasting the full conversation or a system prompt into the `query` + argument — keep queries short and keyword-driven. + +## Official References + +- MemPalace: +- MemPalace releases: +- Antigravity documentation: +- Agent Skills specification: diff --git a/.antigravity-plugin/skills/mempalace/SKILL.md b/.antigravity-plugin/skills/mempalace/SKILL.md new file mode 100644 index 000000000..5f94ab6d9 --- /dev/null +++ b/.antigravity-plugin/skills/mempalace/SKILL.md @@ -0,0 +1,92 @@ +--- +name: mempalace +description: MemPalace — mine projects and conversations into a searchable memory palace. Use when the user asks about MemPalace, memory palace, mining memories, searching memories, palace setup, wings, rooms, or drawers; or when they want to recall past work that may already be filed in their palace. +--- + +# MemPalace + +A searchable memory palace for AI — mine projects and conversations, then search them semantically. Verbatim storage, local-first, zero external API by default. + +## Prerequisites + +Ensure `mempalace` is installed: + +```bash +mempalace --version +``` + +If not installed (uv recommended): + +```bash +uv tool install mempalace # or: pip install mempalace +``` + +## Dynamic, version-correct instructions + +MemPalace exposes operation-specific instructions through the CLI so this skill stays accurate as MemPalace evolves. To get instructions for any operation: + +```bash +mempalace instructions +``` + +Always prefer the CLI output over what is written here when the two disagree — the CLI is the single source of truth for the installed version. + +## Common operations + +These are the five operations users ask for most often. Each one wraps a single MemPalace CLI subcommand. The `mempalace instructions ` form returns the full, version-correct guidance. + +### `help` — discover what MemPalace can do + +```bash +mempalace instructions help +``` + +Use when the user is new, unsure what's possible, or asks "what can you do". 
+ +### `init` — first-run setup of the palace + +```bash +mempalace instructions init +``` + +Use when the user has just installed MemPalace, no palace exists yet, or the user explicitly asks to set up / configure / re-initialize their palace. + +### `mine` — ingest a project or conversation directory + +```bash +mempalace instructions mine +``` + +Use when the user wants to fold a project's files into their palace, or to ingest exported conversation transcripts into the palace as searchable memory. + +### `search` — find verbatim memories by semantic query + +```bash +mempalace instructions search +``` + +Use when the user wants to recall something from the past, find a previous decision, or rediscover code/notes/conversations they already wrote. + +### `status` — what's in the palace right now + +```bash +mempalace instructions status +``` + +Use when the user asks "what's in my palace", "how big is my palace", or wants a summary of wings, rooms, and drawer counts. + +## MCP tools (preferred over CLI) + +Inside Antigravity, the MemPalace MCP server registers a rich set of tools. Use these instead of shelling out to the CLI for live operations (search, diary writes, drawer adds, knowledge graph queries, palace status). The MCP tools always reflect the current palace state without spawning a subprocess. + +The MCP server is auto-registered when this plugin is installed at `~/.gemini/config/plugins/mempalace/`. If the server does not appear in Antigravity's MCP store, run `mempalace-mcp --version` to verify the binary is on PATH, then restart Antigravity. + +## Design principles (verbatim from the project) + +- **Verbatim always** — never summarize, paraphrase, or lossy-compress user data. +- **Local-first, zero external API by default** — extraction, embedding, and LLM-assisted refinement happen on the user's machine. +- **Privacy by architecture** — the system never calls out to external services for core operations. 
+- **Performance budgets** — hooks under 500ms; startup injection under 100ms. +- **Background everything** — filing, indexing, and timestamps happen via hooks in the background; zero tokens spent on bookkeeping in the chat window. + +If a request would violate any of these principles, refuse and explain — even if it would be technically convenient. diff --git a/.claude-plugin/README.md b/.claude-plugin/README.md index b6708bb2b..e9e6468e9 100644 --- a/.claude-plugin/README.md +++ b/.claude-plugin/README.md @@ -1,6 +1,6 @@ # MemPalace Claude Code Plugin -A Claude Code plugin that gives your AI a persistent memory system. Mine projects and conversations into a searchable palace backed by ChromaDB, with 19 MCP tools, auto-save hooks, and 5 guided skills. +A Claude Code plugin that gives your AI a persistent memory system. Mine projects and conversations into a searchable palace backed by ChromaDB, with 33 MCP tools, auto-save hooks, and 5 guided skills. ## Prerequisites @@ -50,7 +50,7 @@ Set the `MEMPAL_DIR` environment variable to a directory path to automatically r ## MCP Server -The plugin automatically configures a local MCP server with 19 tools for storing, searching, and managing memories. No manual MCP setup is required -- `/mempalace:init` handles everything. +The plugin automatically configures a local MCP server with 33 tools for storing, searching, and managing memories. No manual MCP setup is required -- `/mempalace:init` handles everything. ## Full Documentation diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index 8f4b0ba71..52226cb36 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -8,8 +8,8 @@ { "name": "mempalace", "source": "./.claude-plugin", - "description": "AI memory system — mine projects and conversations into a searchable palace. 
19 MCP tools, auto-save hooks, guided setup.", - "version": "3.4.0", + "description": "AI memory system — mine projects and conversations into a searchable palace. 33 MCP tools, auto-save hooks, guided setup.", + "version": "3.4.1", "author": { "name": "milla-jovovich" } diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index 639ede169..aa0cba686 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "mempalace", - "version": "3.4.0", - "description": "Give your AI a memory — mine projects and conversations into a searchable palace. 19 MCP tools, auto-save hooks, and guided setup.", + "version": "3.4.1", + "description": "Give your AI a memory — mine projects and conversations into a searchable palace. 33 MCP tools, auto-save hooks, and guided setup.", "author": { "name": "milla-jovovich" }, diff --git a/.claude-plugin/skills/mempalace-recall/SKILL.md b/.claude-plugin/skills/mempalace-recall/SKILL.md new file mode 100644 index 000000000..749994f89 --- /dev/null +++ b/.claude-plugin/skills/mempalace-recall/SKILL.md @@ -0,0 +1,60 @@ +--- +name: mempalace-recall +description: Recall protocol for MemPalace — search the palace before answering about past work, prior decisions, people, or projects. Use when the user asks what was decided, what happened before, who someone is, what was discussed last time, or anything that may already be filed in their memory palace. +allowed-tools: Bash +--- + +# MemPalace Recall + +Search-before-answer protocol for MemPalace. Read the user's memory +palace before answering anything that may already be filed there, +instead of guessing from model memory. This complements the `mempalace` +skill (install / mine / status); this one covers recall only. + +## Step 0 — Verify MemPalace is available + +```bash +mempalace --version +``` + +If the `mempalace_*` MCP tools are not available, tell the user the +server is not connected and point them at the `mempalace` skill or +`/mempalace:init`. 
Do not silently fall back to answering from model memory. + +## When to recall + +Search the palace **before answering** whenever the user asks about +something that may be filed: + +- Past work or prior decisions — "what did we decide / try / do?" +- A person, project, or entity — "who is …", "what is …" +- An earlier session — "remember when …", "last time …" +- A preference, fact, or relationship that could have changed over time + +Skip recall for pure greenfield work with no memory relevance (renaming +a variable, fixing a typo). Recall is question-driven, not reflexive. + +## Protocol + +1. Before responding about people / projects / past events / prior + decisions: call `mempalace_search` first. Use `mempalace_kg_query` + for relational or time-bound facts. +2. If unsure about a fact: say "let me check the palace" and query. +3. Return the drawer's **verbatim** text — never summarize or paraphrase + stored content. +4. After a substantive session, record continuity with + `mempalace_diary_write` (skip if a background hook already saved). +5. When a fact changes: `mempalace_kg_invalidate` the old fact, then + `mempalace_kg_add` the new one. + +## Unhappy paths + +- **Empty results** — say the palace has nothing on this; do not invent + an answer. Offer to widen the search or file the new information. +- **MCP error / server down** — surface the error, suggest `mempalace + status` or re-running `/mempalace:init`; never fall back to guessing. +- **Conflicting facts** — trust the knowledge graph's time-valid answer; + invalidate-then-add rather than overwriting silently. + +The canonical protocol, shared across all MemPalace integrations, lives +in `integrations/shared/recall-protocol.md`. 
diff --git a/.codex-plugin/README.md b/.codex-plugin/README.md index 2af714c36..2d2478bb3 100644 --- a/.codex-plugin/README.md +++ b/.codex-plugin/README.md @@ -1,6 +1,6 @@ # MemPalace - Codex CLI Plugin -Give your AI a persistent memory -- mine projects and conversations into a searchable palace backed by ChromaDB, with 19 MCP tools, auto-save hooks, and guided skills. +Give your AI a persistent memory -- mine projects and conversations into a searchable palace backed by ChromaDB, with 33 MCP tools, auto-save hooks, and guided skills. ## Prerequisites diff --git a/.codex-plugin/plugin.json b/.codex-plugin/plugin.json index 97d84ab13..462f401b3 100644 --- a/.codex-plugin/plugin.json +++ b/.codex-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "mempalace", - "version": "3.4.0", - "description": "Give your AI a memory — mine projects and conversations into a searchable palace. 19 MCP tools, auto-save hooks, and guided setup.", + "version": "3.4.1", + "description": "Give your AI a memory — mine projects and conversations into a searchable palace. 33 MCP tools, auto-save hooks, and guided setup.", "author": { "name": "milla-jovovich" }, @@ -27,7 +27,7 @@ "interface": { "displayName": "MemPalace", "shortDescription": "AI memory system for Codex", - "longDescription": "Give your AI a persistent memory — mine projects and conversations into a searchable palace backed by ChromaDB, with 19 MCP tools, auto-save hooks, and guided skills.", + "longDescription": "Give your AI a persistent memory — mine projects and conversations into a searchable palace backed by ChromaDB, with 33 MCP tools, auto-save hooks, and guided skills.", "developerName": "milla-jovovich", "category": "Coding", "capabilities": [ diff --git a/.cursor-plugin/README.md b/.cursor-plugin/README.md new file mode 100644 index 000000000..6ba9ba48e --- /dev/null +++ b/.cursor-plugin/README.md @@ -0,0 +1,144 @@ +# MemPalace Cursor Plugin + +A Cursor IDE plugin that gives your agent a persistent memory system. 
Auto-registers the `mempalace-mcp` server (33 MCP tools), ships 5 slash commands, two model-invocable skills (setup/mining/search and a recall protocol), and an optional recall rule. + +> Hooks (auto-save + session-start memory recall) are shipped separately under `hooks/cursor/` so the plugin is safe to install in any Cursor workspace without touching the agent loop. See [Hooks](#hooks-optional) below. + +## Prerequisites + +- Python 3.9+ +- Cursor 1.7+ (plugin manifest schema requires it) + +## Installation + +### Local clone (recommended while not in the marketplace yet) + +Symlink (or copy) this repository into Cursor's local plugins folder: + +```bash +ln -s /path/to/mempalace ~/.cursor/plugins/local/mempalace +``` + +Then in Cursor: Cmd-Shift-P → **Developer: Reload Window**. + +### Marketplace + +Once published, install via the Cursor marketplace panel and select `mempalace`. Required-plugin distribution from a team marketplace is also supported. + +## Post-Install Setup + +After installing the plugin, run the `init` command in a Cursor chat: + +``` +/mempalace-init +``` + +(Or just say "use the mempalace skill" — Cursor will model-invoke the bundled skill.) + +This installs the `mempalace` package via `uv tool` or `pip`, initializes a palace under `~/.mempalace/`, and verifies the MCP server is reachable. 
+ +## Available Slash Commands + +| Command | Description | +|---------------------|-----------------------------------------------------------------------------------| +| `/mempalace-help` | Show available tools, skills, CLI commands, hooks, and architecture | +| `/mempalace-init` | Set up MemPalace — install, configure, onboard | +| `/mempalace-search` | Search your memories across the palace using semantic search | +| `/mempalace-mine` | Mine projects and conversations into the palace | +| `/mempalace-status` | Show palace overview — wings, rooms, drawer counts | + +> Cursor commands are global, not plugin-namespaced — that's why each slug is prefixed with `mempalace-` rather than appearing as `/help`, `/init`, etc. This keeps them collision-free with built-in or other-plugin commands. + +## Skills + +Two model-invocable skills ship at the plugin root under `skills/`: + +| Skill | What it does | +|-------|--------------| +| `mempalace` | Setup, mining, status, and the dynamic `mempalace instructions` CLI. | +| `mempalace-recall` | Search-before-answer protocol — makes the agent read the palace before answering about past work, people, projects, or prior decisions instead of guessing. | + +Cursor surfaces these automatically when a request matches their description, or you can attach them explicitly. + +## Recall rule (optional) + +The plugin also ships a Cursor rule at the plugin root under `rules/mempalace-recall.mdc`: + +```yaml +description: When the user asks about past work, prior decisions, people, ... call mempalace_search before answering ... +alwaysApply: false +``` + +It is `alwaysApply: false` on purpose — Cursor loads it only when its matcher judges the turn recall-relevant, so it never fires on unrelated coding work and never adds MCP latency to greenfield tasks. 
The rule, the `mempalace-recall` skill, and the `sessionStart` hook all reference the same canonical protocol in [`integrations/shared/recall-protocol.md`](../integrations/shared/recall-protocol.md). + +Want recall forced into **every** conversation regardless of context? Copy the aggressive `alwaysApply: true` variant from [`examples/cursor/rules/`](../examples/cursor/rules/README.md) into `~/.cursor/rules/`. That is a deliberate, heavier opt-in, not a default. + +## MCP Server + +This plugin ships `mcp.json` at the plugin root, so Cursor auto-loads the `mempalace-mcp` server on plugin install: + +```json +{ + "mcpServers": { + "mempalace": { + "command": "mempalace-mcp" + } + } +} +``` + +All 33 MemPalace MCP tools (`mempalace_search`, `mempalace_add_drawer`, `mempalace_diary_write`, `mempalace_check_duplicate`, `mempalace_diary_read`, …) become available to the agent immediately. No manual `~/.cursor/mcp.json` edit required. + +If the server doesn't appear, confirm `mempalace-mcp` is on the user's `$PATH`: + +```bash +command -v mempalace-mcp +``` + +If it isn't, run `/mempalace-init` (or `mempalace install` from a terminal) — `mempalace-mcp` is installed alongside the `mempalace` package. + +## Hooks (optional) + +Cursor's hooks system is configured separately from plugins (in `~/.cursor/hooks.json` or `.cursor/hooks.json`), so this plugin does **not** wire hooks itself. The MemPalace repository ships three Cursor-native hooks under [`hooks/cursor/`](../hooks/cursor/) that you install with one command. 
+ +User scope — writes `~/.cursor/hooks.json`, applies to every Cursor workspace (recommended): + +```bash +hooks/cursor/install.sh --scope user --variant full +``` + +Project scope — writes `.cursor/hooks.json` under the current project only: + +```bash +hooks/cursor/install.sh --scope project --variant full +``` + +What you get: + +| Hook event | What it does | +|----------------|-------------------------------------------------------------------------------------------------------| +| `sessionStart` | Injects an `additional_context` recap of relevant memories scoped to the workspace wing | +| `stop` | Counts agent turns; every N turns, emits a `followup_message` instructing a memory checkpoint | +| `preCompact` | Synchronously mines the transcript before compaction, drops a marker so the next `stop` saves a diary | + +Full details: [`website/guide/cursor-hooks.md`](../website/guide/cursor-hooks.md) and [`hooks/cursor/README.md`](../hooks/cursor/README.md). + +## Uninstall + +Remove the local plugin symlink: + +```bash +rm ~/.cursor/plugins/local/mempalace +``` + +Then in Cursor: Cmd-Shift-P → **Developer: Reload Window**. + +If you also installed the hooks, remove them (leaves any unrelated hooks in `hooks.json` untouched): + +```bash +hooks/cursor/install.sh --scope user --uninstall +``` + +## Full Documentation + +See the main [README](../README.md) for complete documentation, architecture details, and advanced usage. diff --git a/.cursor-plugin/marketplace.json b/.cursor-plugin/marketplace.json new file mode 100644 index 000000000..bd3ed05e2 --- /dev/null +++ b/.cursor-plugin/marketplace.json @@ -0,0 +1,17 @@ +{ + "name": "mempalace", + "owner": { + "name": "milla-jovovich", + "url": "https://github.com/MemPalace" + }, + "plugins": [ + { + "name": "mempalace", + "source": ".", + "description": "AI memory system — mine projects and conversations into a searchable palace. 
33 MCP tools, slash commands, and a guided skill for Cursor.", + "author": { + "name": "milla-jovovich" + } + } + ] +} diff --git a/.cursor-plugin/mcp.json b/.cursor-plugin/mcp.json new file mode 100644 index 000000000..ca633f5f5 --- /dev/null +++ b/.cursor-plugin/mcp.json @@ -0,0 +1,7 @@ +{ + "mcpServers": { + "mempalace": { + "command": "mempalace-mcp" + } + } +} diff --git a/.cursor-plugin/plugin.json b/.cursor-plugin/plugin.json new file mode 100644 index 000000000..b3be76b3d --- /dev/null +++ b/.cursor-plugin/plugin.json @@ -0,0 +1,19 @@ +{ + "name": "mempalace", + "description": "Give your AI a memory — mine projects and conversations into a searchable palace. 33 MCP tools, slash commands, and a guided skill for Cursor.", + "author": { + "name": "milla-jovovich" + }, + "homepage": "https://github.com/MemPalace/mempalace", + "repository": "https://github.com/MemPalace/mempalace", + "license": "MIT", + "keywords": [ + "memory", + "ai", + "rag", + "mcp", + "chromadb", + "palace", + "search" + ] +} diff --git a/CHANGELOG.md b/CHANGELOG.md index 64d66f2c9..4f10ef293 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,28 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), --- +## [Unreleased] + +--- + +## [3.4.1] — 2026-06-14 + +### Features + +- **Cursor IDE plugin (`.cursor-plugin/`).** Drops into `~/.cursor/plugins/local/mempalace` (or installs from the Cursor marketplace once published) and auto-registers the `mempalace-mcp` server, five slash commands (`/mempalace-help`, `/mempalace-init`, `/mempalace-mine`, `/mempalace-search`, `/mempalace-status`), and the model-invocable [`mempalace` skill](skills/mempalace/SKILL.md) — no manual `~/.cursor/mcp.json` edit required. The plugin manifest deliberately omits a hardcoded `version` field — `mempalace/version.py` is the single source of truth, so there is nothing to drift on the next release (a contract test enforces the field stays absent). 
The canonical plugin components (`commands/`, `skills/`, `mcp.json`) are real files at the plugin root; no symlinks are committed (committed symlinks materialise as broken text files on Windows clones with `core.symlinks=false`). Mirrors the surface of [`.claude-plugin/`](.claude-plugin/) and [`.codex-plugin/`](.codex-plugin/) without duplicating their hook scripts: the Cursor hook scripts under [`hooks/cursor/`](hooks/cursor/) (shipped in the same release) remain the canonical install path for `stop` / `preCompact` / `sessionStart`, wired separately by [`hooks/cursor/install.sh`](hooks/cursor/install.sh). Contract tests in [`tests/test_cursor_plugin_manifest.py`](tests/test_cursor_plugin_manifest.py) cover manifest JSON validity, kebab-case naming, `..`-free relative paths, on-disk path resolution, marketplace alignment, MCP config shape (`mcpServers` wrapper required by Cursor, unlike Claude's flat `.mcp.json`), the version-field-absent guard, the no-symlink guard, and every skill/command frontmatter — all pure file inspection so they run on any CI platform without Cursor itself. + +- **Cursor IDE hook support (`stop` / `preCompact` / `sessionStart`).** Three new bash hooks live under [`hooks/cursor/`](hooks/cursor/) and share a `lib/common.sh` helpers module. The save hook counts `stop` invocations per Cursor `conversation_id` and emits a `followup_message` every `MEMPAL_SAVE_INTERVAL` (default 15) so the agent files the session into MemPalace and writes a diary entry. Unlike the silent-by-default Claude Code hook, the Cursor followup fires **on by default**: Cursor's transcript format is undocumented and `normalize.py` has no Cursor parser yet, so the background `mempalace mine --mode convos` is best-effort only and the `followup_message` is the load-bearing verbatim-capture path. 
Users who want the Claude-style "zero tokens in the chat window" behaviour can suppress it with `MEMPAL_CURSOR_SILENT=1` (or `MEMPAL_VERBOSE=false`); the default flips to silent once a Cursor transcript parser lands. The precompact hook synchronously mines the transcript before Cursor's compaction summarises it and drops a marker so the next `stop` forces a save nudge (Cursor's `preCompact` is observational-only — it cannot block or emit a `followup_message`, unlike Claude Code's `PreCompact`); the synchronous mine is bounded by Cursor's per-hook timeout, and because `mempalace mine` is incremental/append-only a killed mine resumes cleanly on the next run rather than corrupting the palace. The wake hook is Cursor-only: `sessionStart` returns `additional_context` telling the agent to recall scoped to the wing inferred from the workspace root. Honours the same `MEMPALACE_HOOKS_AUTO_SAVE=false` kill switch as the Claude Code hooks, plus a new `MEMPAL_DISABLE_HOOK=1` alias and a `MEMPAL_STATE_DIR` env override. Per-conversation state files are garbage-collected by a daily-throttled, Cursor-namespaced TTL sweep (`MEMPAL_STATE_TTL_DAYS`, default 30) so `cursor_*.count` / `cursor_*.pending` cannot grow unbounded — shared logs and other editors' state are never touched. Includes an opt-in installer at [`hooks/cursor/install.sh`](hooks/cursor/install.sh) with `--scope user|project`, `--variant full|minimal`, `--dry-run`, and `--uninstall` (idempotent, preserves unrelated hooks via `python3`-based JSON merge — no `jq` dependency). Example wirings live at [`examples/cursor/hooks.json`](examples/cursor/hooks.json) and [`examples/cursor/hooks.minimal.json`](examples/cursor/hooks.minimal.json); they are intentionally not placed at the repo root because Cursor auto-loads project hooks from any trusted workspace and we do not arm hooks on contributor checkout. Per-event stdin/stdout schema documented at [`hooks/cursor/STDIN_SHAPE.md`](hooks/cursor/STDIN_SHAPE.md). 
Walkthrough at [`website/guide/cursor-hooks.md`](website/guide/cursor-hooks.md). Coverage added in [`tests/test_cursor_hooks_shell.py`](tests/test_cursor_hooks_shell.py) and [`tests/test_cursor_hooks_install.py`](tests/test_cursor_hooks_install.py). + +- **First-class Antigravity IDE support.** New `.antigravity-plugin/` package + idempotent installer at `hooks/antigravity/install.sh` that registers MemPalace as a Google Antigravity plugin (MCP server, skill, two lifecycle hooks) at `~/.gemini/config/plugins/mempalace/`. The Stop hook background-mines the active conversation transcript every Nth fire (default 15, configurable via `MEMPAL_SAVE_INTERVAL`); the PreInvocation hook injects verbatim memory on the first model call only via Antigravity's `injectSteps[].ephemeralMessage` output, gated by `invocationNum == 1`. Both hooks are bash 3.2.57 compatible (macOS default), use the same `~/.mempalace/hook_state/` directory as the Claude Code / Codex / Cursor hooks (`antigravity_*`-namespaced state files), and respect every existing kill switch (`MEMPAL_DISABLE_HOOK`, `MEMPALACE_HOOKS_AUTO_SAVE`, `~/.mempalace/config.json` `hooks.auto_save`). Installer is `cmp`-gated (re-run produces a byte-identical install), uninstall is basename-guarded (refuses to wipe a directory whose basename isn't `mempalace`), and `--dry-run` is side-effect free. Full audit of which Antigravity surfaces we ship and which we deliberately don't is in [`hooks/antigravity/INVESTIGATION.md`](hooks/antigravity/INVESTIGATION.md). User-facing guide: [`website/guide/antigravity.md`](website/guide/antigravity.md). Standalone examples in [`examples/antigravity/`](examples/antigravity/). + - **Zero-config interpreter resolution.** `mempal_resolve_python` now derives the Python interpreter from the `mempalace-mcp` / `mempalace` console-script shebang on `$PATH` before falling back to `python3`. 
The common `uv tool install mempalace` / `pipx install` layout installs the console scripts into an isolated environment whose interpreter is **not** system `python3`, so the previous `command -v python3` resolution landed on a Python that couldn't import `mempalace`, the `-m mempalace` probe failed, and mining silently never fired. Resolution is pure shebang parsing + `stat` (no Python subprocess at source time, preserving the hook performance budget). `MEMPAL_PYTHON` remains the explicit override. Documented under *How the hooks find your `mempalace` install* in the guide. + +### Bug Fixes + +- **`embeddinggemma` no longer OOM-kills bulk re-embeds.** `EmbeddinggemmaONNX.__call__` ran a single `session.run` over its entire input, so a repair-scale batch (5000 docs) allocated attention buffers far beyond available RAM and the kernel killed the process silently: `mempalace repair --yes` on an `embedding_model: embeddinggemma` palace died right after `Building temporary collection:` with no traceback and no crash report (#1770). Embedding now runs in sub-batches of 32 docs (constructor-tunable `batch_size`), matching the internal batching of ChromaDB's bundled MiniLM embedder. Per-document vectors are unchanged: the model's pooled output is attention-masked, so sub-batch padding does not affect values. The embedder is also hardened for shared use: the process-wide EF cache and the lazy model load are thread-safe (concurrent first calls build exactly one ONNX session), and `__call__` handles a bare string, `None`, and empty input without triggering the model download. +- **Backup retention to prevent unbounded disk usage.** `mempalace migrate` (full-palace `.pre-migrate.` copies) and `mempalace repair max-seq-id` (`chroma.sqlite3.max-seq-id-backup-` copies) each wrote a fresh, full-size, timestamped backup every run and never deleted the old ones. 
On a machine that mines or repairs on a schedule, those copies could silently accumulate until they filled the disk — one palace was found with hundreds of GB of stale backups beside a few hundred MB of live data, hidden from a normal `du` of the home directory. A new `max_backups` setting (default `10`, env `MEMPALACE_MAX_BACKUPS`, or `config.json`) now prunes the oldest backups after each new one is written. Set it to `0` to keep every backup. Pruning is keyed by filesystem mtime, scoped strictly to each backup's own naming pattern (live data is never touched), and best-effort so a deletion failure can never abort a migration or repair that already succeeded. + +--- + ## [3.3.6] — 2026-05-24 ### Features @@ -494,7 +516,16 @@ Initial public release. --- -[Unreleased]: https://github.com/MemPalace/mempalace/compare/v3.2.0...HEAD +[Unreleased]: https://github.com/MemPalace/mempalace/compare/v3.4.1...HEAD +[3.4.1]: https://github.com/MemPalace/mempalace/compare/v3.4.0...v3.4.1 +[3.4.0]: https://github.com/MemPalace/mempalace/compare/v3.3.6...v3.4.0 +[3.3.6]: https://github.com/MemPalace/mempalace/compare/v3.3.5...v3.3.6 +[3.3.5]: https://github.com/MemPalace/mempalace/compare/v3.3.4...v3.3.5 +[3.3.4]: https://github.com/MemPalace/mempalace/compare/v3.3.3...v3.3.4 +[3.3.3]: https://github.com/MemPalace/mempalace/compare/v3.3.2...v3.3.3 +[3.3.2]: https://github.com/MemPalace/mempalace/compare/v3.3.1...v3.3.2 +[3.3.1]: https://github.com/MemPalace/mempalace/compare/v3.3.0...v3.3.1 +[3.3.0]: https://github.com/MemPalace/mempalace/compare/v3.2.0...v3.3.0 [3.2.0]: https://github.com/MemPalace/mempalace/compare/v3.1.0...v3.2.0 [3.1.0]: https://github.com/MemPalace/mempalace/compare/v3.0.0...v3.1.0 [3.0.0]: https://github.com/MemPalace/mempalace/releases/tag/v3.0.0 diff --git a/README.md b/README.md index 441190ab9..6f74c5b7f 100644 --- a/README.md +++ b/README.md @@ -156,7 +156,8 @@ mempalace search "why did we switch to GraphQL" mempalace wake-up ``` -For Claude Code, 
Gemini CLI, MCP-compatible tools, and local models, see +For Claude Code, Gemini CLI, [Antigravity](https://mempalaceofficial.com/guide/antigravity.html), +MCP-compatible tools, and local models, see [mempalaceofficial.com/guide/getting-started](https://mempalaceofficial.com/guide/getting-started.html). --- @@ -224,7 +225,7 @@ Usage and tool reference: ## MCP server -29 MCP tools cover palace reads/writes, knowledge-graph operations, +33 MCP tools cover palace reads/writes, knowledge-graph operations, cross-wing navigation, drawer management, and agent diaries. Installation and the full tool list: [mempalaceofficial.com/reference/mcp-tools](https://mempalaceofficial.com/reference/mcp-tools.html). @@ -238,8 +239,14 @@ system prompt: ## Auto-save hooks -Two Claude Code hooks save periodically and before context compression: -[mempalaceofficial.com/guide/hooks](https://mempalaceofficial.com/guide/hooks.html). +Auto-save hooks for **Claude Code, Codex CLI, and Cursor IDE** save +periodically and before context compression: + +- Claude Code + Codex → + [mempalaceofficial.com/guide/hooks](https://mempalaceofficial.com/guide/hooks.html) +- Cursor IDE (adds session-start recall and a transcript snapshot before + compaction) → + [mempalaceofficial.com/guide/cursor-hooks](https://mempalaceofficial.com/guide/cursor-hooks.html) If you are installing under time pressure, start with the [Claude Code retention setup checklist](https://mempalaceofficial.com/guide/claude-code-retention.html): @@ -278,7 +285,7 @@ PRs welcome. See [CONTRIBUTING.md](CONTRIBUTING.md). MIT — see [LICENSE](LICENSE). 
-[version-shield]: https://img.shields.io/badge/version-3.4.0-4dc9f6?style=flat-square&labelColor=0a0e14 +[version-shield]: https://img.shields.io/badge/version-3.4.1-4dc9f6?style=flat-square&labelColor=0a0e14 [release-link]: https://github.com/MemPalace/mempalace/releases [python-shield]: https://img.shields.io/badge/python-3.9+-7dd8f8?style=flat-square&labelColor=0a0e14&logo=python&logoColor=7dd8f8 [python-link]: https://www.python.org/ diff --git a/commands/mempalace-help.md b/commands/mempalace-help.md new file mode 100644 index 000000000..51b5a6118 --- /dev/null +++ b/commands/mempalace-help.md @@ -0,0 +1,7 @@ +--- +description: Show comprehensive MemPalace help — available skills, MCP tools, CLI commands, hooks, and architecture. +--- + +Invoke the `mempalace` skill from this plugin and run the `help` instructions, then follow them. + +Concretely: run `mempalace instructions help` in a terminal, then carry out the steps it prints. diff --git a/commands/mempalace-init.md b/commands/mempalace-init.md new file mode 100644 index 000000000..eaf02946b --- /dev/null +++ b/commands/mempalace-init.md @@ -0,0 +1,12 @@ +--- +description: Set up MemPalace — install the package, initialize a palace, register the MCP server with Cursor, and verify everything works. +--- + +Invoke the `mempalace` skill from this plugin and run the `init` instructions, then follow them. + +Concretely: run `mempalace instructions init` in a terminal, then carry out the steps it prints. + +Cursor-specific extras after init: + +1. The `mempalace-mcp` server is already auto-registered by this plugin — no manual `mcp.json` edit needed. +2. For automatic background saves and session-start memory recall, also run `hooks/cursor/install.sh --scope user` from a cloned MemPalace repo. See `website/guide/cursor-hooks.md` for the walkthrough. 
diff --git a/commands/mempalace-mine.md b/commands/mempalace-mine.md new file mode 100644 index 000000000..a15a76ff8 --- /dev/null +++ b/commands/mempalace-mine.md @@ -0,0 +1,7 @@ +--- +description: Mine projects and conversations into the MemPalace. Supports project files, conversation exports, and auto-classification. +--- + +Invoke the `mempalace` skill from this plugin and run the `mine` instructions, then follow them. + +Concretely: run `mempalace instructions mine` in a terminal, then carry out the steps it prints. diff --git a/commands/mempalace-search.md b/commands/mempalace-search.md new file mode 100644 index 000000000..20b2462d8 --- /dev/null +++ b/commands/mempalace-search.md @@ -0,0 +1,7 @@ +--- +description: Search your memories across the MemPalace using semantic search with wing/room filtering. +--- + +Invoke the `mempalace` skill from this plugin and run the `search` instructions, then follow them. + +Concretely: run `mempalace instructions search` in a terminal, then carry out the steps it prints. The MCP tool `mempalace_search` is also available directly from this Cursor session. diff --git a/commands/mempalace-status.md b/commands/mempalace-status.md new file mode 100644 index 000000000..6ab82a856 --- /dev/null +++ b/commands/mempalace-status.md @@ -0,0 +1,7 @@ +--- +description: Show the current state of your memory palace — wings, rooms, drawer counts, and suggestions. +--- + +Invoke the `mempalace` skill from this plugin and run the `status` instructions, then follow them. + +Concretely: run `mempalace instructions status` in a terminal, then carry out the steps it prints. diff --git a/examples/antigravity/README.md b/examples/antigravity/README.md new file mode 100644 index 000000000..33575a262 --- /dev/null +++ b/examples/antigravity/README.md @@ -0,0 +1,74 @@ +# MemPalace — Antigravity examples + +Two standalone configs for users who don't want to use the +`hooks/antigravity/install.sh` installer. 
+ +## Files + +| File | Purpose | +|-------------------|-----------------------------------------------------------------------------------| +| `hooks.json` | Standalone `hooks.json` registering the Stop and PreInvocation hooks. | +| `mcp_config.json` | Standalone MCP entry registering the `mempalace-mcp` stdio server. | + +## Wire up `hooks.json` + +The example uses placeholder absolute paths (`/ABSOLUTE/PATH/TO/mempalace/...`). +You must rewrite both `command` fields to the actual absolute paths to +the hook scripts in your cloned repo, or to whichever location holds +them. Antigravity will not resolve relative paths reliably. + +Then drop the file at one of: + +- `~/.gemini/config/hooks.json` (global, applies to every workspace) +- `/.agents/hooks.json` (workspace-scoped) + +Restart Antigravity to pick the file up. + +If you'd rather have the paths absolutized automatically, run the +installer: + +```bash +bash hooks/antigravity/install.sh +``` + +That writes a fully rendered `hooks.json` to +`~/.gemini/config/plugins/mempalace/hooks.json`. + +## Wire up `mcp_config.json` + +The example registers the `mempalace-mcp` stdio server. Two options: + +### Option A — merge into the user-level Antigravity MCP config + +Antigravity's user-level MCP config lives at +`~/.gemini/antigravity/mcp_config.json`. Merge the `mcpServers.mempalace` +entry from this example into that file, then restart Antigravity. + +### Option B — drop into a plugin directory + +If you've created a custom plugin folder (per the [Antigravity plugins docs](https://antigravity.google/docs/plugins)), +copy this `mcp_config.json` directly into the plugin root: + +``` +/mcp_config.json +``` + +Antigravity merges plugin-level MCP entries with the user-level config +on launch. + +## Verify + +After wiring up either or both: + +```bash +mempalace-mcp --version # confirm binary is on PATH +ls ~/.mempalace/ # confirm palace exists (run `mempalace init` if not) +``` + +Restart Antigravity. 
The `mempalace` MCP server should appear in the +MCP store; the Stop and PreInvocation hooks fire automatically. + +See [`hooks/antigravity/STDIN_SHAPE.md`](../../hooks/antigravity/STDIN_SHAPE.md) +for the exact wire format Antigravity uses, and +[`website/guide/antigravity.md`](../../website/guide/antigravity.md) +for the full user-facing guide. diff --git a/examples/antigravity/hooks.json b/examples/antigravity/hooks.json new file mode 100644 index 000000000..4eaec3d59 --- /dev/null +++ b/examples/antigravity/hooks.json @@ -0,0 +1,21 @@ +{ + "mempalace-save": { + "Stop": [ + { + "type": "command", + "command": "/ABSOLUTE/PATH/TO/mempalace/hooks/antigravity/mempal_save_hook_antigravity.sh", + "timeout": 30 + } + ] + }, + + "mempalace-wake": { + "PreInvocation": [ + { + "type": "command", + "command": "/ABSOLUTE/PATH/TO/mempalace/hooks/antigravity/mempal_wake_hook_antigravity.sh", + "timeout": 5 + } + ] + } +} diff --git a/examples/antigravity/mcp_config.json b/examples/antigravity/mcp_config.json new file mode 100644 index 000000000..ca633f5f5 --- /dev/null +++ b/examples/antigravity/mcp_config.json @@ -0,0 +1,7 @@ +{ + "mcpServers": { + "mempalace": { + "command": "mempalace-mcp" + } + } +} diff --git a/examples/cursor/README.md b/examples/cursor/README.md new file mode 100644 index 000000000..7a94270a5 --- /dev/null +++ b/examples/cursor/README.md @@ -0,0 +1,110 @@ +# Cursor IDE Hooks — Example `hooks.json` Files + +Sample configurations for wiring the MemPalace Cursor hooks into the +Cursor IDE. These are **examples only** — they are intentionally not +placed at the repo root (`/.cursor/hooks.json`) because Cursor +auto-loads project-level hooks from any trusted workspace, and the +repo is regularly opened by contributors. We do not auto-arm hooks on +contributor checkout. 
+ +## Variants + +### `hooks.json` — full (recommended) + +Three hooks wired: + +- **`sessionStart`** — calls `mempal_wake_hook_cursor.sh`, which returns + `additional_context` telling the agent to recall scoped to the wing + inferred from the workspace root. Cursor-only — Claude Code has no + equivalent. +- **`stop`** — calls `mempal_save_hook_cursor.sh`. Counts stop + invocations per conversation and emits a `followup_message` every + `MEMPAL_SAVE_INTERVAL` (default 15) telling the agent to file the + session into the palace and write a diary entry. `loop_limit: 1` is + defense-in-depth on top of our own loop-count check. +- **`preCompact`** — calls `mempal_precompact_hook_cursor.sh`. Runs + `mempalace mine` synchronously on the transcript before compaction + summarises it, then drops a marker so the next `stop` forces a save + followup. + +### `hooks.minimal.json` — `stop` only + +Lightest install. Wires just the save hook. Use this if you don't +want the sessionStart recall context or the preCompact transcript +snapshot. + +## How to use + +The `$HOME` placeholder is **not** expanded by Cursor — you must +substitute the absolute path before saving the file. Pick one: + +### Option A — let `install.sh` do it + +Project scope — writes `/.cursor/hooks.json`: + +```bash +hooks/cursor/install.sh --scope project --target /path/to/your/repo +``` + +User scope — writes `~/.cursor/hooks.json`, applies to every Cursor workspace: + +```bash +hooks/cursor/install.sh --scope user +``` + +The installer copies the hook scripts to `~/.mempalace/hooks/cursor/`, +substitutes the absolute paths, and merges the entries into your +existing `hooks.json` without clobbering unrelated hooks. See +`install.sh --help` for `--dry-run`, `--uninstall`, and `--variant`. + +### Option B — copy + edit manually + +1. Copy the chosen example to the target location: + - User scope: `~/.cursor/hooks.json` + - Project scope: `/.cursor/hooks.json` +2. 
Replace every `$HOME` with the absolute path to your home + directory (e.g., `/Users/you` or `/home/you`). +3. Make sure each hook script is executable + (`chmod +x ~/.mempalace/hooks/cursor/mempal_*_hook_cursor.sh`). +4. Restart Cursor, or wait for it to auto-reload the file. + +## Why aren't these files at the repo root? + +Cursor automatically loads `.cursor/hooks.json` from any trusted +workspace. Placing a real `hooks.json` at the repo root would arm +MemPalace's hooks on every contributor's machine the moment they open +the repo in Cursor — which would modify their conversation behaviour +without consent and write to `~/.mempalace/hook_state/` without +asking. Editor configuration is sacred; opt-in only. + +If you actually want MemPalace's hooks armed when working on the +MemPalace repo itself, run: + +```bash +hooks/cursor/install.sh --scope project --target . +``` + +That will write `./.cursor/hooks.json` for the repo workspace +specifically — but it is your decision, not ours, and the file is +listed in `.gitignore` paths Cursor users typically already exclude. + +## Related: the Cursor plugin + +The hooks here are **only one half** of MemPalace's Cursor integration. The other half is the [`.cursor-plugin/`](../../.cursor-plugin/) folder at the repo root, which packages MemPalace's MCP server, five slash commands, and the model-invocable `mempalace` skill as a regular Cursor plugin you can drop into `~/.cursor/plugins/local/mempalace`. 
+ +The two install paths are orthogonal — install whichever you want, in any order: + +| You want | Install | +|---------------------------------------------------------------------------|----------------------------------------------------------| +| MCP tools (`mempalace_search`, `mempalace_add_drawer`, …) + slash commands | The plugin — see [`.cursor-plugin/README.md`](../../.cursor-plugin/README.md) | +| Auto-save every N turns + sessionStart memory recall | The hooks here — see Option A above | +| Both | Install the plugin AND run `hooks/cursor/install.sh` | + +Hooks are deliberately **not** bundled into the plugin because Cursor's hooks system is configured per-user/per-project (in `~/.cursor/hooks.json` or `.cursor/hooks.json`), not per-plugin — so the installer here owns that file with idempotent merge semantics, while the plugin owns the MCP+commands+skill side. + +## See also + +- [`hooks/cursor/README.md`](../../hooks/cursor/README.md) — full reference for hooks +- [`hooks/cursor/STDIN_SHAPE.md`](../../hooks/cursor/STDIN_SHAPE.md) — per-event schema with citations +- [`website/guide/cursor-hooks.md`](../../website/guide/cursor-hooks.md) — rendered docs +- [`.cursor-plugin/README.md`](../../.cursor-plugin/README.md) — Cursor plugin (MCP + commands + skill) diff --git a/examples/cursor/hooks.json b/examples/cursor/hooks.json new file mode 100644 index 000000000..93c142088 --- /dev/null +++ b/examples/cursor/hooks.json @@ -0,0 +1,21 @@ +{ + "version": 1, + "hooks": { + "sessionStart": [ + { + "command": "$HOME/.mempalace/hooks/cursor/mempal_wake_hook_cursor.sh" + } + ], + "stop": [ + { + "command": "$HOME/.mempalace/hooks/cursor/mempal_save_hook_cursor.sh", + "loop_limit": 1 + } + ], + "preCompact": [ + { + "command": "$HOME/.mempalace/hooks/cursor/mempal_precompact_hook_cursor.sh" + } + ] + } +} diff --git a/examples/cursor/hooks.minimal.json b/examples/cursor/hooks.minimal.json new file mode 100644 index 000000000..4dfd597b7 --- /dev/null +++ 
b/examples/cursor/hooks.minimal.json @@ -0,0 +1,11 @@ +{ + "version": 1, + "hooks": { + "stop": [ + { + "command": "$HOME/.mempalace/hooks/cursor/mempal_save_hook_cursor.sh", + "loop_limit": 1 + } + ] + } +} diff --git a/examples/cursor/rules/README.md b/examples/cursor/rules/README.md new file mode 100644 index 000000000..31a11bae1 --- /dev/null +++ b/examples/cursor/rules/README.md @@ -0,0 +1,58 @@ +# Cursor Rules — MemPalace recall + +Optional [Cursor rules](https://cursor.com/docs/rules) that make the +agent search MemPalace before answering questions about past work, +people, projects, or prior decisions. + +These are for users who install MemPalace **without** the Cursor plugin +(or who want recall behaviour in a specific project). If you installed +the [Cursor plugin](../../../.cursor-plugin/README.md), it already ships +the `alwaysApply: false` rule at the plugin root — you do not need to +copy anything. + +## Which file to use + +| File | `alwaysApply` | Fires when | Use when | +|------|---------------|------------|----------| +| [`mempalace-recall.mdc`](mempalace-recall.mdc) | `false` | Cursor's matcher decides the turn is recall-relevant (from the rule `description`) | **Recommended.** Recall without paying for the rule on unrelated work. | +| [`mempalace-recall-always.mdc`](mempalace-recall-always.mdc) | `true` | Every conversation in scope, every turn | You want recall guaranteed in context and accept the cost. | + +The always-on variant is heavier: it sits in context on every turn and +makes the agent more eager to call `mempalace_search`, which adds MCP +latency and works against MemPalace's "memory should feel instant" +budget. Prefer the `false` variant unless you specifically want recall +forced into every conversation. Pick **one** of the two — do not install +both. 
+ +## Install + +User scope (every workspace) — copy into `~/.cursor/rules/`: + +```bash +mkdir -p ~/.cursor/rules +cp examples/cursor/rules/mempalace-recall.mdc ~/.cursor/rules/ +``` + +Project scope (this repo only) — copy into `.cursor/rules/`: + +```bash +mkdir -p .cursor/rules +cp examples/cursor/rules/mempalace-recall.mdc .cursor/rules/ +``` + +For the aggressive variant, copy `mempalace-recall-always.mdc` instead +(only one of the two). Then reload Cursor: +Cmd-Shift-P → **Developer: Reload Window**. + +## How recall is delivered + +Recall ships in three orthogonal layers — install any combination: + +| Layer | What it does | Where | +|-------|--------------|-------| +| `sessionStart` hook | Injects wing-scoped recall context once per new chat | [`hooks/cursor/`](../../../hooks/cursor/) | +| `mempalace-recall` skill | Full search-before-answer protocol, model-invoked or attached | [`skills/mempalace-recall/`](../../../skills/mempalace-recall/) | +| Recall rule (these files) | Nudges search-before-answer on recall-relevant turns | here, or the plugin root `rules/` | + +All three reference the same canonical protocol in +[`integrations/shared/recall-protocol.md`](../../../integrations/shared/recall-protocol.md). diff --git a/examples/cursor/rules/mempalace-recall-always.mdc b/examples/cursor/rules/mempalace-recall-always.mdc new file mode 100644 index 000000000..097eea627 --- /dev/null +++ b/examples/cursor/rules/mempalace-recall-always.mdc @@ -0,0 +1,25 @@ +--- +description: Always-on MemPalace recall — search the palace before answering about past work, people, projects, or prior decisions. +alwaysApply: true +--- + +# MemPalace recall (always on) + +This is the aggressive variant of the recall rule: `alwaysApply: true` +loads it into every conversation in scope, on every turn, regardless of +whether Cursor's matcher thinks recall is relevant. 
+ +Before answering anything that may already be in the user's memory +palace — past work, prior decisions, a person, a project, or "what did +we do / decide / discuss last time?" — search the palace first: + +1. Call `mempalace_search` with a short keyword query. Use + `mempalace_kg_query` for relational or time-bound facts. +2. Quote the drawer's **verbatim** text. Never summarize or paraphrase + stored content. +3. If results are empty, say so — do not invent an answer. If the MCP + server is unavailable, surface the error; do not fall back to guessing. + +Even with this rule always loaded, only actually call the tools when the +question touches memory. Do not search on pure greenfield work +(renaming a variable, fixing a typo). diff --git a/examples/cursor/rules/mempalace-recall.mdc b/examples/cursor/rules/mempalace-recall.mdc new file mode 100644 index 000000000..9c95c8c13 --- /dev/null +++ b/examples/cursor/rules/mempalace-recall.mdc @@ -0,0 +1,20 @@ +--- +description: When the user asks about past work, prior decisions, people, projects, or events that may be filed in MemPalace, call mempalace_search (or mempalace_kg_query for relational or time-bound facts) before answering from model memory. Return stored content verbatim; never guess when the palace might know. +alwaysApply: false +--- + +# MemPalace recall + +Before answering anything that may already be in the user's memory +palace — past work, prior decisions, a person, a project, or "what did +we do / decide / discuss last time?" — search the palace first: + +1. Call `mempalace_search` with a short keyword query. Use + `mempalace_kg_query` for relational or time-bound facts. +2. Quote the drawer's **verbatim** text. Never summarize or paraphrase + stored content. +3. If results are empty, say so — do not invent an answer. If the MCP + server is unavailable, surface the error; do not fall back to guessing. 
+ +Skip recall for pure greenfield work with no memory relevance (renaming +a variable, fixing a typo). Recall is question-driven, not reflexive. diff --git a/hooks/README.md b/hooks/README.md index 7722d2aeb..05a895e3e 100644 --- a/hooks/README.md +++ b/hooks/README.md @@ -2,6 +2,13 @@ These hook scripts make MemPalace save automatically. No manual "save" commands needed. +This file covers the **Claude Code** and **Codex CLI** hooks that live +flat under `hooks/`. For the **Cursor IDE** hooks, see +[`hooks/cursor/README.md`](cursor/README.md) or the rendered docs at +[`website/guide/cursor-hooks.md`](../website/guide/cursor-hooks.md). The +two are additive and share the same `~/.mempalace/hook_state/` +directory. + If you are trying to protect existing Claude Code transcripts immediately, use the short checklist first: [`website/guide/claude-code-retention.md`](../website/guide/claude-code-retention.md). It covers hook wiring, JSONL backup, and one-time backfill. @@ -46,6 +53,24 @@ Make them executable: chmod +x hooks/mempal_save_hook.sh hooks/mempal_precompact_hook.sh ``` +## Install — Antigravity (Google) + +The Antigravity integration lives in its own subdirectory because the +wire format (camelCase JSON, `injectSteps[]` output) and event names +(`Stop`, `PreInvocation`) are Antigravity-specific. Use the dedicated +installer: + +```bash +bash hooks/antigravity/install.sh +``` + +This installs to `~/.gemini/config/plugins/mempalace/`, registers the +MCP server, ships the `mempalace` skill, and wires the Stop + +PreInvocation hooks. See [`hooks/antigravity/README.md`](antigravity/README.md) +for the full guide and [`hooks/antigravity/INVESTIGATION.md`](antigravity/INVESTIGATION.md) +for the source-of-truth audit of which Antigravity surfaces the +integration uses. 
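To make the Antigravity wire format concrete, here is a minimal Python sketch of the control flow a Stop handler follows. It is illustrative only. The shipped hooks are bash scripts; `handle_stop` and the in-memory `counts` dict are hypothetical stand-ins for the per-conversation state files under `~/.mempalace/hook_state/`. The `conversationId` and `transcriptPath` field names and the empty-object reply are taken from the stdin/stdout contract recorded in `hooks/antigravity/INVESTIGATION.md`.

```python
import json


def handle_stop(stdin_text: str, counts: dict, interval: int = 15) -> tuple:
    """Return (stdout_json, should_mine) for one Stop event.

    Hypothetical helper: the real hook is a bash script and keeps its
    counters in state files, not in an in-memory dict.
    """
    event = json.loads(stdin_text)
    conv_id = event.get("conversationId", "unknown")  # camelCase field name
    counts[conv_id] = counts.get(conv_id, 0) + 1
    # Mine only on every Nth fire for this conversation.
    should_mine = counts[conv_id] % interval == 0
    # Always reply with an empty object: a "continue" decision would force
    # the agent to keep running, which a save hook must never do.
    return "{}", should_mine


counts = {}
payload = json.dumps({"conversationId": "abc", "transcriptPath": "/tmp/t.jsonl"})
results = [handle_stop(payload, counts, interval=3) for _ in range(3)]
print(results[-1])
```

The sketch separates the decision (`should_mine`) from the reply so that no code path can emit anything other than `{}`.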
+ ## Install — Codex CLI (OpenAI) Add to `.codex/hooks.json`: diff --git a/hooks/antigravity/INVESTIGATION.md b/hooks/antigravity/INVESTIGATION.md new file mode 100644 index 000000000..61e20c82c --- /dev/null +++ b/hooks/antigravity/INVESTIGATION.md @@ -0,0 +1,332 @@ +# Antigravity IDE — Integration Surface Investigation + +**Investigated**: 2026-05-27 +**Author**: undeadindustries +**Scope**: What MemPalace can integrate with in Google's Antigravity IDE, +what we shipped, and what we deliberately did not ship. + +This document is the source of truth for design decisions in the +`feat/antigravity-support` branch. It exists so a future maintainer can +re-derive every choice without re-reading the docs cold. + +--- + +## 1. Surfaces verified against Google's official Antigravity docs + +All quotes are pulled verbatim from Google's official Antigravity +documentation on 2026-05-27. URLs are the authoritative source; the +mirrored excerpts here are for reviewer convenience. + +### 1.1. MCP — `https://antigravity.google/docs/mcp` + +> The configuration file is located at `~/.gemini/antigravity/mcp_config.json`. +> +> The configuration file has a single `mcpServers` object where you +> define each server you want to connect to. + +Schema (verified): + +```json +{ + "mcpServers": { + "": { + "command": "...", // stdio + "args": [...], + "env": {...}, + "cwd": "...", + "serverUrl": "...", // remote + "headers": {...}, + "authProviderType": "google_credentials", + "oauth": {"clientId": "...", "clientSecret": "..."}, + "disabled": false, + "disabledTools": [...] + } + } +} +``` + +Per-plugin form: `mcp_config.json` at the plugin root, same shape. +Antigravity merges plugin entries with the user's +`~/.gemini/antigravity/mcp_config.json` rather than clobbering. 
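The non-clobbering merge described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Antigravity's or the installer's actual code: `merge_mcp_servers` is a hypothetical helper, and keeping the user's entry on a name collision is an assumption, since the quoted docs do not state the precedence.

```python
import json


def merge_mcp_servers(user_cfg: dict, plugin_cfg: dict) -> dict:
    """Return a new config containing the servers from both inputs.

    Assumption: on a name collision the user's entry is kept. The
    Antigravity docs quoted above do not state the precedence.
    """
    merged = dict(user_cfg)  # shallow copy keeps unrelated top-level keys
    servers = dict(plugin_cfg.get("mcpServers", {}))
    servers.update(user_cfg.get("mcpServers", {}))  # user entries win
    merged["mcpServers"] = servers
    return merged


user_cfg = {"mcpServers": {"other-server": {"command": "other-mcp"}}}
plugin_cfg = {"mcpServers": {"mempalace": {"command": "mempalace-mcp"}}}

merged = merge_mcp_servers(user_cfg, plugin_cfg)
print(json.dumps(merged, indent=2, sort_keys=True))
```

Both inputs are left unmodified, so the same helper could back a dry-run preview before anything is written to disk.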
+ +**Cross-checked locally**: the user's existing +`~/.gemini/antigravity/mcp_config.json` already contains a working +`"mempalace": {"command": "/Users/robs/.local/bin/mempalace-mcp"}` +entry, confirming that the shape matches and that the binary is installed at that absolute path. + +### 1.2. Plugins — `https://antigravity.google/docs/plugins` + +> A plugin is a directory containing a `plugin.json` file and optional +> subdirectories for different customization types: +> +> ``` +> plugins// +> ├── plugin.json # Required marker file +> ├── mcp_config.json # Optional MCP server definitions +> ├── hooks.json # Optional hooks definition +> ├── skills/ # Optional skills +> │ └── / +> │ └── SKILL.md +> └── rules/ # Optional rules +> └── .md +> ``` + +Manifest schema (verified): + +```json +{ + "name": "my-custom-plugin" +} +``` + +> The name field is optional and defaults to the directory name if omitted. + +Install locations (verified): + +> - Workspace Level: Place your plugin folder inside a +> `.agents/plugins/` or `_agents/plugins/` directory at the root of +> your opened workspace. +> - Global Level: Place your plugin folder inside +> `~/.gemini/config/plugins/` in your user home directory. + +We ship the global location as the canonical install path. + +### 1.3. Skills — `https://antigravity.google/docs/skills` + +> A skill is a folder containing a `SKILL.md` file with instructions +> that the agent can follow when working on specific tasks. + +Frontmatter (verified): + +| Field | Required | Description | +|---------------|----------|---------------------------------------------------------------------------------------------------| +| `name` | No | A unique identifier (lowercase, hyphens). Defaults to the folder name. | +| `description` | Yes | A clear description of what the skill does and when to use it. 
| + +Standalone discovery paths (verified): + +| Location | Scope | +|-------------------------------------|------------------------| +| `/.agents/skills/` | Workspace-specific | +| `~/.gemini/antigravity/skills/` | Global, all workspaces | + +In-plugin discovery: `/skills//SKILL.md`. + +We ship the in-plugin form so a single install registers MCP, skill, and +hooks together. + +### 1.4. Hooks — `https://antigravity.google/docs/hooks?app=antigravity` + +> Hooks allow you to run custom scripts or shell commands at specific +> points during Antigravity's execution loop. + +`hooks.json` lives at one of: +- `~/.gemini/config/hooks.json` (global) +- `/.agents/hooks.json` (workspace) +- `/hooks.json` (per-plugin) — what we ship + +Top-level schema (verified): + +```json +{ + "": { + "enabled": true, + "PreToolUse": [{ "matcher": "...", "hooks": [{...}] }], + "PostToolUse": [{ "matcher": "...", "hooks": [{...}] }], + "PreInvocation": [{ ... }], + "PostInvocation": [{ ... }], + "Stop": [{ ... }] + } +} +``` + +For `PreInvocation` / `PostInvocation` / `Stop`, items are flat handler +objects; the `matcher` wrapper is only used for `PreToolUse` / +`PostToolUse`. + +Handler object (verified): + +| Field | Required | Description | +|-----------|----------|---------------------------------------------------| +| `type` | No | Currently only `"command"` is supported. Default. | +| `command` | Yes | The shell command to execute. | +| `timeout` | No | Timeout in seconds. Defaults to 30. | + +#### STDIN/STDOUT contract (verified) + +> Hooks receive input via stdin as JSON and should return output via +> stdout as JSON. Field names use camelCase. + +Common stdin fields (every event): + +| Field | Type | Description | +|-------------------------|-----------------|-------------------------------------------------------| +| `conversationId` | string | The unique UUID of the active agent conversation. | +| `workspacePaths` | array | Absolute directory paths of the user's workspaces. 
| +| `transcriptPath` | string | Absolute path to the persistent `transcript.jsonl`. | +| `artifactDirectoryPath` | string | Path to conversation artifacts and screenshots. | + +`Stop` event additional fields: + +| Field | Type | Description | +|---------------------|---------|----------------------------------------------------------------------------| +| `executionNum` | integer | Sequence number of the execution attempt. | +| `terminationReason` | string | `"model_stop"`, `"max_steps_exceeded"`, `"error"`, etc. | +| `error` | string | Optional error message. | +| `fullyIdle` | boolean | **Required.** True iff all background commands and async tasks are done. | + +`Stop` event stdout (verified): + +| Field | Type | Description | +|------------|--------|-------------------------------------------------------------------------------------------------------| +| `decision` | string | **Required.** Set to `"continue"` to FORCE the agent to keep running. Anything else allows the stop. | +| `reason` | string | Optional. If `decision == "continue"`, injected as a system message. | + +**CRITICAL**: emitting `{"decision": "continue"}` from a save hook would +turn it into an infinite agent-loop trigger. The MemPalace save hook +MUST emit `{}` on every code path. There is an explicit refusal in +`mempal_save_hook_antigravity.sh` to ever print the literal word +`"continue"` from a decision field. + +`PreInvocation` event additional fields: + +| Field | Type | Description | +|-------------------|---------|----------------------------------------------------------| +| `invocationNum` | integer | Sequence number of the current model invocation. | +| `initialNumSteps` | integer | Number of steps currently in the trajectory. | + +`PreInvocation` event stdout (verified): + +| Field | Type | Description | +|---------------|----------------|-----------------------------------------------------------------------------------------------| +| `injectSteps` | array | Optional. 
Steps injected before the model is called. Each step has one of: | +| | | `{ "toolCall": {...} }` / `{ "userMessage": "..." }` / `{ "ephemeralMessage": "..." }` | + +We use `ephemeralMessage` for the wake-up injection: the message is +visible to the model on this turn but does not persist to the +transcript, so we do not pollute future model calls with the same +injection. + +`PreInvocation` fires before EVERY model invocation, not only at session +start. We gate to `invocationNum == 1` to mimic Cursor's `sessionStart` +semantics — exactly one wake injection per conversation. + +### 1.5. Permissions — `https://antigravity.google/docs/permissions` + +Permissions are user-side, configured via Allow / Deny / Ask lists. +Plugins do **not** declare permissions in `plugin.json`. The +third-party "antigravity-plugins" community skill at +`~/.gemini/skills/antigravity-plugins/SKILL.md` documents a +`"permissions": [...]` field in `plugin.json`; that field is fabricated +and does not appear in any real Google-shipped plugin (`firebase`, +`google-antigravity-sdk`, `chrome-devtools-plugin`, +`modern-web-guidance-plugin`) inspected at +`~/.gemini/config/plugins/`. + +We ship a minimal `plugin.json` of `{"name": "mempalace"}`. + +--- + +## 2. 
Surfaces shipped + +| Surface | What we ship | +|---------------------------|---------------------------------------------------------------------------| +| Plugin manifest | `.antigravity-plugin/plugin.json` — minimal, verified shape | +| MCP auto-registration | `.antigravity-plugin/mcp_config.json` — registers `mempalace-mcp` stdio | +| Skill | `.antigravity-plugin/skills/mempalace/SKILL.md` — real file, frontmatter | +| `Stop` hook | `hooks/antigravity/mempal_save_hook_antigravity.sh` — counter + auto-mine | +| `PreInvocation` hook | `hooks/antigravity/mempal_wake_hook_antigravity.sh` — wake injection | +| Installer | `hooks/antigravity/install.sh` — idempotent, basename-match uninstall | +| User-facing docs | `website/guide/antigravity.md` + sidebar wiring | +| Examples | `examples/antigravity/{hooks.json,mcp_config.json,README.md}` | +| Tests | 3 test files mirroring the cursor blueprint | + +--- + +## 3. Surfaces deliberately not shipped + +### 3.1. `PreCompact` equivalent — NOT SHIPPED + +Antigravity's external `hooks.json` does **not** expose a context-compaction event. The Python SDK has an in-process `@hooks.on_compaction` +decorator (see `~/.gemini/config/plugins/google-antigravity-sdk/examples/getting_started/hooks.md`), +but that fires inside a Python `LocalAgentConfig`-built agent, not the +IDE itself. There is no way to subscribe to compaction from a +plugin's `hooks.json`. + +UX consequence: long conversations can auto-compact without a save +checkpoint. The `Stop` hook still catches the conversation when the +user actually ends the turn, so the worst case is that some mid-turn +state is lost on auto-compaction. Verbatim transcript ingestion via +the `Stop` path covers the long-term recall use case. + +### 3.2. Slash-commands / `commands/` — NOT SHIPPED + +Antigravity has no `commands/` plugin component. 
The Cursor and Codex +integrations both ship five quick-reference commands +(`mempalace-help`, `-init`, `-mine`, `-search`, `-status`) that point +at `mempalace instructions `. Those have been folded into the +`SKILL.md` `## Common operations` section so the agent gets the same +quick-reference content via Antigravity's progressive-disclosure skill +loading. No new files, no rule-noise, same discoverability. + +### 3.3. `rules/` — NOT SHIPPED + +`rules/.md` files are evaluated as constraints on the agent's +behavior. Shipping rules from MemPalace risks colliding with the +user's existing project rules (e.g. `.agents/rules/*.md` files the +user has already authored). Users who want strict MemPalace-related +rules can drop them into their own `/.agents/rules/` +directory; we do not impose them. + +### 3.4. Workspace-level `.agents/plugins/mempalace/` install — NOT SHIPPED BY DEFAULT + +The installer writes to the global location at +`~/.gemini/config/plugins/mempalace/`. Workspace-scoped installs are +documented in `hooks/antigravity/README.md` for users who want to +limit MemPalace to one workspace; they can `cp -r .antigravity-plugin +/.agents/plugins/mempalace`. We do not install there +automatically because the canonical UX is global. + +### 3.5. `permissions` field in `plugin.json` — NOT SHIPPED + +Antigravity permissions are user-side (`Allow` / `Deny` / `Ask` +lists). Plugin manifests do not declare permissions. The +`"permissions": [...]` field documented in the third-party +"antigravity-plugins" community skill is fabricated; no +Google-shipped plugin uses it. + +### 3.6. `PreToolUse` / `PostToolUse` hooks — NOT SHIPPED + +These would let MemPalace observe every tool call (e.g. +auto-extract entities after each `write_to_file`). Out of scope +for v1; the hook surface is real and could be added in a future +PR if there is demand. Documenting the omission here so a future +contributor doesn't conclude the hooks weren't supported. + +--- + +## 4. 
Cross-checks against the user's running Antigravity + +The user (`robs@`) has Antigravity 2.0 installed. The following live +artifacts on this machine corroborate the published docs: + +| Path | Confirms | +|------------------------------------------------------------------------|----------------------------------------------------------------| +| `~/.gemini/antigravity/mcp_config.json` | MCP config path + standard `mcpServers` shape | +| `~/.gemini/antigravity/skills//SKILL.md` (multiple) | Global skill discovery path | +| `~/.gemini/config/plugins/firebase/plugin.json` | Real `plugin.json` shape (no `permissions` field) | +| `~/.gemini/config/plugins/chrome-devtools-plugin/skills/.../SKILL.md` | In-plugin skill discovery `/skills//SKILL.md` | +| `~/.gemini/config/plugins/google-antigravity-sdk/examples/.../hooks.md` | SDK-side compaction hook is in-process Python only | +| Existing `mempalace` entry in `mcp_config.json` | `mempalace-mcp` already running and discoverable | + +--- + +## 5. Reference URLs (all 2026-05-27) + +- `https://antigravity.google/docs/plugins` +- `https://antigravity.google/docs/hooks?app=antigravity` +- `https://antigravity.google/docs/skills` +- `https://antigravity.google/docs/mcp` +- `https://antigravity.google/docs/permissions` +- `https://antigravity.google/docs/subagents` +- `https://antigravity.google/blog/introducing-google-antigravity-sdk` diff --git a/hooks/antigravity/README.md b/hooks/antigravity/README.md new file mode 100644 index 000000000..2b8d9022c --- /dev/null +++ b/hooks/antigravity/README.md @@ -0,0 +1,172 @@ +# MemPalace — Antigravity hook scripts + +Lifecycle hooks for the [Antigravity IDE](https://antigravity.google/). + +This is the third sibling of the Claude Code and Codex integrations +(see `hooks/mempal_save_hook.sh` and `.codex-plugin/hooks/`). 
The +overall shape is the same — a Stop event triggers a background save, +a startup-time event injects memory into the agent — but the wire +format and STDOUT contract are Antigravity-specific (see +[STDIN_SHAPE.md](STDIN_SHAPE.md)). + +## Quick start + +From the repo root: + +```bash +bash hooks/antigravity/install.sh +``` + +This installs the plugin to `~/.gemini/config/plugins/mempalace/`. +Restart Antigravity and the MCP server, skill, and hooks all register +automatically. + +To dry-run first: + +```bash +bash hooks/antigravity/install.sh --dry-run +``` + +To uninstall: + +```bash +bash hooks/antigravity/install.sh --uninstall +``` + +## What gets installed + +``` +~/.gemini/config/plugins/mempalace/ +├── plugin.json # marker manifest +├── mcp_config.json # registers mempalace-mcp +├── hooks.json # rendered from hooks.json.tmpl +├── README.md +├── skills/ +│ └── mempalace/ +│ └── SKILL.md +└── hooks/ + ├── lib/ + │ └── common.sh + ├── mempal_save_hook_antigravity.sh # Stop event handler + └── mempal_wake_hook_antigravity.sh # PreInvocation handler +``` + +`hooks.json` carries absolute paths to the two hook scripts (resolved +from `__PLUGIN_DIR__` at install time). + +## What the hooks do + +### `mempal_save_hook_antigravity.sh` (Stop event) + +Fires every time the agent's execution loop terminates. Increments a +per-conversation counter; every `MEMPAL_SAVE_INTERVAL` fires (default +15), spawns `mempalace mine --mode convos --wing +` in the background. The hook itself returns `{}` to stdout +in under a few milliseconds — the actual mining runs detached and +does not block the user. + +Defers when: + +- `fullyIdle == false` (background tasks still running) +- `terminationReason == "error"` (transcript may be corrupt) +- A previous save for this conversation is still running +- Any kill switch is set + +### `mempal_wake_hook_antigravity.sh` (PreInvocation event, gated) + +Fires before every model invocation. 
Gated to `invocationNum == 1` +(first invocation of the conversation only) — beyond that we'd be +re-injecting on every turn. Calls `mempalace wake-up --wing ` +with a 500ms hard timeout and emits the verbatim output as an +`ephemeralMessage` so the agent sees relevant memory on its first +response without polluting the persistent transcript. + +Skips when: + +- `invocationNum != 1` +- Already woke this conversation (atomic `mkdir` loop guard) +- `mempalace wake-up` exits non-zero, times out, or produces empty output +- Any kill switch is set + +## Kill switches + +Any one of these disables both hooks (silent passthrough, exit 0): + +| Knob | Value | +|------------------------------------------|--------------------------------| +| `MEMPAL_DISABLE_HOOK` | `1`, `true`, `yes` | +| `MEMPALACE_HOOKS_AUTO_SAVE` | `false`, `0`, `no` | +| `~/.mempalace/config.json` | `{"hooks": {"auto_save": false}}` | +| Removing `~/.mempalace/` entirely | (palace nuke) | + +## Workspace-scoped install (advanced) + +If you want MemPalace to load only inside a specific workspace, +manually copy the rendered plugin into your workspace's `.agents/plugins/`: + +```bash +bash hooks/antigravity/install.sh --install-dir /tmp/render-stage +mkdir -p /.agents/plugins/ +cp -r /tmp/render-stage /.agents/plugins/mempalace +rm -rf /tmp/render-stage +``` + +The global install at `~/.gemini/config/plugins/mempalace/` is the +canonical UX and what we recommend. + +## Troubleshooting + +### Hooks aren't firing + +1. Confirm Antigravity sees the plugin: open the IDE, navigate to the + Customizations page; `mempalace` should appear in the global plugins + list. +2. Check `~/.mempalace/hook_state/antigravity_hook.log` — every fire + logs a line. No log lines = the hook is not being invoked. +3. Verify `mempalace-mcp` is on `$PATH`: `mempalace-mcp --version`. +4. Inspect the rendered `hooks.json` paths point at executable files: + `bash -n ~/.gemini/config/plugins/mempalace/hooks/*.sh`. 
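Step 4 above can also be checked programmatically. The following is a rough Python sketch, not part of the shipped plugin, that walks a rendered `hooks.json` and reports every handler whose command does not start with an executable file. It assumes the layout described in INVESTIGATION.md (flat handler lists for `PreInvocation` / `PostInvocation` / `Stop`, a `matcher` wrapper with a nested `hooks` list for `PreToolUse` / `PostToolUse`). The helper names `iter_commands` and `non_executable` are hypothetical.

```python
import json
import os
import shlex
from pathlib import Path

TOOL_EVENTS = {"PreToolUse", "PostToolUse"}                # handlers nested under "hooks"
FLAT_EVENTS = {"PreInvocation", "PostInvocation", "Stop"}  # items are handlers


def iter_commands(hooks_config: dict):
    """Yield (event, command) for every handler in a hooks.json mapping."""
    for group in hooks_config.values():
        if not isinstance(group, dict):
            continue
        for event, items in group.items():
            if event in FLAT_EVENTS:
                handlers = items
            elif event in TOOL_EVENTS:
                handlers = [h for item in items for h in item.get("hooks", [])]
            else:
                continue  # e.g. the "enabled" flag
            for handler in handlers:
                if "command" in handler:
                    yield event, handler["command"]


def non_executable(hooks_path: Path) -> list:
    """Return (event, command) pairs whose first token is not an executable file."""
    config = json.loads(hooks_path.read_text())
    bad = []
    for event, command in iter_commands(config):
        tokens = shlex.split(command)
        program = tokens[0] if tokens else ""
        if not (os.path.isfile(program) and os.access(program, os.X_OK)):
            bad.append((event, command))
    return bad
```

Calling `non_executable(Path.home() / ".gemini/config/plugins/mempalace/hooks.json")` should return an empty list on a healthy install; any entry it returns names the event and the command to inspect.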
+ +### Save fires but no mining happens + +1. Look for the most recent `[event=stop]` lines in + `antigravity_hook.log` — `count` and `interval` should both be + visible. Mining only triggers when `count % interval == 0`. +2. Ensure a Python that can import `mempalace` is reachable. The hook + runs `"$MEMPAL_PYTHON_BIN" -m mempalace`, where `MEMPAL_PYTHON_BIN` + is resolved (in order) from `$MEMPAL_PYTHON`, the + `mempalace-mcp` / `mempalace` console-script shebang on `$PATH`, + then `python3`. A failed probe logs: + + ``` + ERROR: mempalace is not runnable via -m mempalace; install mempalace or set MEMPAL_PYTHON + ``` + + On a GUI-launched Antigravity the harness `PATH` may differ from + your shell `PATH`; if the shebang heuristic can't find the right + interpreter, export `MEMPAL_PYTHON=/abs/path/python` (e.g. + `"$(uv tool dir)/mempalace/bin/python"`) and restart. + +### Wake injection isn't appearing + +1. The wake hook only injects on `invocationNum == 1`. Subsequent + invocations are gated. +2. The atomic `mkdir` marker + `~/.mempalace/hook_state/antigravity_woke_` exists + after a successful injection. Remove it to re-inject (rare). +3. `mempalace wake-up --wing ` may be returning empty output + if the wing doesn't exist yet. Run `mempalace status` to verify + wing presence. + +## See also + +- [INVESTIGATION.md](INVESTIGATION.md) — every Antigravity surface we + investigated, with verbatim quotes and source URLs. +- [STDIN_SHAPE.md](STDIN_SHAPE.md) — the exact wire format + Antigravity uses, with worked examples. +- [../mempal_save_hook.sh](../mempal_save_hook.sh) — Claude Code + equivalent. +- [../../.codex-plugin/hooks/](../../.codex-plugin/hooks/) — Codex + equivalent. +- [../../website/guide/antigravity.md](../../website/guide/antigravity.md) + — full user-facing guide. 
diff --git a/hooks/antigravity/STDIN_SHAPE.md b/hooks/antigravity/STDIN_SHAPE.md new file mode 100644 index 000000000..f7e904eef --- /dev/null +++ b/hooks/antigravity/STDIN_SHAPE.md @@ -0,0 +1,206 @@ +# Antigravity hook STDIN / STDOUT contract + +This file documents the exact wire format the Antigravity IDE uses +when invoking the MemPalace hook scripts. All fields are verbatim +from Google's official Antigravity hooks documentation +(`https://antigravity.google/docs/hooks?app=antigravity`, accessed +2026-05-27). See [INVESTIGATION.md](INVESTIGATION.md) for the +provenance audit. + +## Wire format + +Hooks receive **JSON on stdin** and must emit **JSON on stdout**. +Field names are **camelCase**. + +Hook execution timeout defaults to 30 seconds. The MemPalace plugin +sets the Stop hook timeout to 30s and the PreInvocation hook timeout +to 5s in the rendered `hooks.json`. + +## Common stdin fields (every event) + +| Field | Type | Notes | +|-------------------------|-----------------|--------------------------------------------------------| +| `conversationId` | string | UUID of the active agent conversation. | +| `workspacePaths` | array | Absolute workspace dirs. **First element is canonical**. | +| `transcriptPath` | string | Absolute path to `transcript.jsonl`. | +| `artifactDirectoryPath` | string | Path to conversation artifacts and screenshots. | + +## Stop event + +### Stdin (additional fields) + +| Field | Type | Notes | +|---------------------|---------|--------------------------------------------------------------------------------------| +| `executionNum` | integer | Sequence number of the execution attempt for this conversation. | +| `terminationReason` | string | `"model_stop"`, `"max_steps_exceeded"`, `"error"`, etc. | +| `error` | string | Optional. Set when termination was caused by a system error. | +| `fullyIdle` | boolean | **Required.** True iff all background commands and async tasks have completed. 
| + +### Stdout + +| Field | Type | Notes | +|------------|--------|--------------------------------------------------------------------------------------------------------| +| `decision` | string | If `"continue"`, **forces** the agent to keep running. Anything else allows the stop. | +| `reason` | string | Optional. If `decision == "continue"`, injected as a system message into the conversation. | + +**MemPalace policy**: the save hook ALWAYS emits `{}` and exits 0. It +NEVER emits `{"decision": "continue"}` — that would force an infinite +agent loop. There is an explicit refusal in +`mempal_save_hook_antigravity.sh` to ever construct a stdout JSON +object containing the literal word `"continue"` in a decision field. + +### MemPalace gating + +The save hook short-circuits with `{}` (no save triggered) when ANY +of the following hold: + +1. `MEMPAL_DISABLE_HOOK=1` (or `true`/`yes`) is set. +2. `MEMPALACE_HOOKS_AUTO_SAVE=false` (or `0`/`no`) is set. +3. `~/.mempalace/config.json` has `hooks.auto_save: false`. +4. `~/.mempalace/` directory does not exist (user nuked the palace). +5. Stdin is malformed or empty (sentinel-guarded parse failure). +6. `fullyIdle == false` (background tasks still running; defer save). +7. `terminationReason == "error"` (transcript may be corrupt). +8. `transcriptPath` validation fails (not a `.json`/`.jsonl`, or `..` traversal). +9. The transcript file does not exist on disk. +10. The save counter has not yet hit `count % MEMPAL_SAVE_INTERVAL == 0`. +11. A pending save is still running for this conversation (less than 1 hour old). +12. The `mempalace` CLI is not on `$PATH`. + +When the modulo gate is hit and validation passes, the hook spawns +`mempalace mine --mode convos --wing ` in +the background and returns `{}` immediately. 
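The gate list above can be read as a pure decision function. The sketch below is a Python illustration of a subset of those gates (1, 2, 6, 7, 8, and 10); the real hook is a bash script and may order or implement them differently. The names `kill_switch_set`, `transcript_path_ok`, and `should_mine` are hypothetical, and the filesystem-dependent gates (config file, palace directory, pending marker, CLI lookup) are deliberately left out.

```python
from pathlib import PurePosixPath

TRUTHY = {"1", "true", "yes"}
FALSY = {"0", "false", "no"}


def kill_switch_set(env: dict) -> bool:
    """Gates 1 and 2: environment kill switches."""
    if env.get("MEMPAL_DISABLE_HOOK", "") in TRUTHY:
        return True
    return env.get("MEMPALACE_HOOKS_AUTO_SAVE", "") in FALSY


def transcript_path_ok(path: str) -> bool:
    """Gate 8: require a .json/.jsonl suffix and no '..' path component."""
    p = PurePosixPath(path)
    return p.suffix in {".json", ".jsonl"} and ".." not in p.parts


def should_mine(payload: dict, count: int, env: dict) -> bool:
    """True only when every gate modelled here passes for this Stop fire."""
    if kill_switch_set(env):
        return False
    if payload.get("fullyIdle") is False:             # gate 6
        return False
    if payload.get("terminationReason") == "error":   # gate 7
        return False
    if not transcript_path_ok(payload.get("transcriptPath", "")):
        return False
    try:
        # Interval is floored to 1 so the modulo can never divide by zero.
        interval = max(1, int(env.get("MEMPAL_SAVE_INTERVAL", "15")))
    except ValueError:
        interval = 15  # assumption: fall back to the documented default
    return count % interval == 0                      # gate 10
```

Whatever this function returns, the hook's stdout stays `{}`; the decision only controls whether the background `mempalace mine` process is spawned.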
+ +## PreInvocation event + +### Stdin (additional fields) + +| Field | Type | Notes | +|-------------------|---------|------------------------------------------------------------------| +| `invocationNum` | integer | Sequence number of the current model invocation (1-based). | +| `initialNumSteps` | integer | Number of steps currently in the trajectory. | + +### Stdout + +| Field | Type | Notes | +|---------------|----------------|--------------------------------------------------------------------------------------------------------| +| `injectSteps` | array | Optional. Steps to inject before the model is called. Each step has one of: `{"toolCall": {...}}`, `{"userMessage": "..."}`, `{"ephemeralMessage": "..."}` | + +The `ephemeralMessage` form is what the MemPalace wake hook emits — it +delivers the wake-up text to the model on this turn but does not +persist into the transcript, so subsequent invocations don't see a +duplicate. + +### MemPalace gating + +The wake hook short-circuits with `{}` (no injection) when ANY of: + +1. Any kill switch trips (same five conditions as the save hook). +2. `invocationNum != 1` — we only inject on the first model call of + each conversation, mimicking Cursor's `sessionStart` semantics. +3. The atomic `mkdir`-based loop guard is already taken (this + conversation already received a wake injection). +4. `mempalace wake-up --wing ` exits non-zero, times out + (500ms hard cap), or produces empty output. + +When the gates pass and `mempalace wake-up` returns text, the hook +emits: + +```json +{ + "injectSteps": [ + { "ephemeralMessage": "" } + ] +} +``` + +The wake hook NEVER emits a `decision` field — that field belongs to +the Stop event. There is a final guard against any stdout that +contains a `decision` key. 
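As an illustration of the wake-hook contract described above, here is a small Python sketch of the gating and the stdout shape. It is not the shipped bash hook. The helper names `claim_once` and `wake_output` are hypothetical, the marker directory is passed in by the caller rather than derived from the conversation, and checking the wake text before taking the marker is an ordering chosen for this sketch only.

```python
import json
import os
from pathlib import Path


def claim_once(marker_dir: Path) -> bool:
    """Atomic loop guard: True only for the first caller to create the dir."""
    try:
        os.mkdir(marker_dir)   # mkdir either creates the dir or fails, never both
        return True
    except FileExistsError:
        return False


def wake_output(payload: dict, wake_text: str, marker_dir: Path) -> str:
    """Return the JSON string the hook would print on stdout."""
    if payload.get("invocationNum") != 1:
        return "{}"            # only the first model call gets an injection
    if not wake_text.strip():
        return "{}"            # empty wake-up output means nothing to inject
    if not claim_once(marker_dir):
        return "{}"            # this conversation was already woken
    out = {"injectSteps": [{"ephemeralMessage": wake_text}]}
    # Final guard mirroring the text above: a `decision` key must never leave
    # the wake hook. It cannot trigger here; it is shown for illustration.
    if "decision" in out:
        return "{}"
    return json.dumps(out)
```

Because `os.mkdir` fails if the directory already exists, two hook processes racing on the same marker cannot both inject; the loser falls through to `{}`.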
+ +## Worked example: Stop event + +### Input + +```json +{ + "executionNum": 1, + "terminationReason": "model_stop", + "error": "", + "fullyIdle": true, + "conversationId": "ec33ebf9-0cba-4100-8142-c61503f6c587", + "workspacePaths": ["/home/me/projects/mempalace"], + "transcriptPath": "/home/me/projects/mempalace/.gemini/jetski/transcript.jsonl", + "artifactDirectoryPath": "/home/me/projects/mempalace/.gemini/jetski/artifacts" +} +``` + +### Output (always) + +```json +{} +``` + +(Side effects: counter `~/.mempalace/hook_state/antigravity_save_count_` is +incremented; if the modulo gate fires, a background `mempalace mine` +subprocess is spawned with the transcript directory and the inferred +wing `wing_mempalace`.) + +## Worked example: PreInvocation, first invocation + +### Input + +```json +{ + "invocationNum": 1, + "initialNumSteps": 0, + "conversationId": "ec33ebf9-0cba-4100-8142-c61503f6c587", + "workspacePaths": ["/home/me/projects/mempalace"], + "transcriptPath": "/home/me/projects/mempalace/.gemini/jetski/transcript.jsonl", + "artifactDirectoryPath": "/home/me/projects/mempalace/.gemini/jetski/artifacts" +} +``` + +### Output (when the palace has memory for `wing_mempalace`) + +```json +{ + "injectSteps": [ + { + "ephemeralMessage": "" + } + ] +} +``` + +### Output (when invocationNum != 1, or any gate trips) + +```json +{} +``` + +## State files + +All hook state lives under `~/.mempalace/hook_state/` (overridable +via `$MEMPAL_STATE_DIR`) and is namespaced with the `antigravity_` +prefix to coexist with Claude Code, Cursor, and Codex hook state in +the same directory. + +| File | Purpose | +|-----------------------------------------------|-----------------------------------------------| +| `antigravity_hook.log` | All hook activity, ISO8601Z timestamps. | +| `antigravity_save_count_` | Per-conversation Stop counter. | +| `antigravity_pending_` | Marker file for in-flight save subprocess. 
| +| `antigravity_woke_` (dir) | Atomic mkdir marker for wake injection. | +| `antigravity_last_input.log` | 4 KB cap, mode 0600, set on parse failure. | +| `antigravity_last_python_err.log` | Python stderr from the JSON parser, mode 0600.| + +## Environment variables + +| Variable | Default | Purpose | +|--------------------------------|----------------------|-----------------------------------------------------------| +| `MEMPAL_PYTHON` | `$(command -v python3)` | Override the Python interpreter used by the hooks. | +| `MEMPAL_STATE_DIR` | `~/.mempalace/hook_state` | Override the hook state directory. | +| `MEMPAL_SAVE_INTERVAL` | `15` | Save every Nth Stop fire. Floored to >= 1 (no /0). | +| `MEMPAL_DISABLE_HOOK` | unset | Set to `1` / `true` / `yes` to disable both hooks. | +| `MEMPALACE_HOOKS_AUTO_SAVE` | unset | Set to `false` / `0` / `no` to disable both hooks. | diff --git a/hooks/antigravity/install.sh b/hooks/antigravity/install.sh new file mode 100755 index 000000000..87eeb6e21 --- /dev/null +++ b/hooks/antigravity/install.sh @@ -0,0 +1,357 @@ +#!/bin/bash +# MEMPALACE ANTIGRAVITY INSTALLER +# +# Idempotent installer for the Antigravity plugin. Copies +# .antigravity-plugin/* and hooks/antigravity/{lib,*.sh} into the +# install directory (default ~/.gemini/config/plugins/mempalace/), +# renders hooks.json.tmpl into hooks.json with absolute paths, and +# leaves the result in a state Antigravity will discover on next +# launch. +# +# === Usage === +# +# bash hooks/antigravity/install.sh # install with defaults +# bash hooks/antigravity/install.sh --dry-run # show what would happen +# bash hooks/antigravity/install.sh --uninstall # remove plugin +# bash hooks/antigravity/install.sh --install-dir

# custom install dir +# bash hooks/antigravity/install.sh --log-level debug # noisier output +# +# === Idempotency === +# +# Re-running the installer produces a byte-identical install dir. +# Files are only written when their content differs from what is +# already on disk (cmp gate). The user's ~/.gemini/config/plugins/ +# directory is never touched outside the mempalace/ subdirectory. +# +# === Uninstall safety === +# +# Uninstall removes the install dir entirely IFF it is the +# mempalace/ plugin directory. We match by basename of the install +# dir, never by substring search, so a user who has a sibling plugin +# at ~/.gemini/config/plugins/mempalace-foo/ is unaffected. +# +# === set -e === +# +# Installer can use `set -e` (constraint #2 only forbids it in the +# hook scripts themselves). On any error we exit non-zero so a CI run +# fails loudly. + +set -e +set -u + +# ── Repo root resolution ───────────────────────────────────────────── +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)" +REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd -P)" +PLUGIN_SRC="$REPO_ROOT/.antigravity-plugin" +HOOKS_SRC="$REPO_ROOT/hooks/antigravity" + +# ── Defaults ───────────────────────────────────────────────────────── +INSTALL_DIR_DEFAULT="$HOME/.gemini/config/plugins/mempalace" +INSTALL_DIR="" +DRY_RUN=0 +UNINSTALL=0 +LOG_LEVEL="info" + +# ── Args ───────────────────────────────────────────────────────────── +print_usage() { + cat <<'USAGE' +Usage: install.sh [--install-dir DIR] [--dry-run] [--uninstall] [--log-level LEVEL] + +Options: + --install-dir DIR Plugin install directory. + Default: ~/.gemini/config/plugins/mempalace + --dry-run Show what would happen without writing anything. + --uninstall Remove the installed plugin. + --log-level LEVEL debug | info | warn | error. Default: info. + -h, --help Show this help. 
+USAGE +} + +while [ $# -gt 0 ]; do + case "$1" in + --install-dir) + INSTALL_DIR="${2:-}" + shift 2 + ;; + --install-dir=*) + INSTALL_DIR="${1#*=}" + shift + ;; + --dry-run) + DRY_RUN=1 + shift + ;; + --uninstall) + UNINSTALL=1 + shift + ;; + --log-level) + LOG_LEVEL="${2:-info}" + shift 2 + ;; + --log-level=*) + LOG_LEVEL="${1#*=}" + shift + ;; + -h|--help) + print_usage + exit 0 + ;; + *) + echo "ERROR: unknown argument: $1" >&2 + print_usage >&2 + exit 2 + ;; + esac +done + +if [ -z "$INSTALL_DIR" ]; then + INSTALL_DIR="$INSTALL_DIR_DEFAULT" +fi + +# ── Absolutize the install dir ─────────────────────────────────────── +# +# The cursor PR review caught that a relative --install-dir would get +# baked into hooks.json verbatim, leaving paths like +# `./plugins/.../mempal_save_hook_antigravity.sh` that Antigravity +# can't resolve at runtime. Absolutize before writing anything. +mempal_absolutize() { + local p="$1" + case "$p" in + /*) printf '%s' "$p" ;; + ~*) printf '%s' "${p/#\~/$HOME}" ;; + *) + # Resolve relative to the user's $PWD at invocation time. + # We never `cd` in the main shell of this installer, so + # $PWD is already the user's invocation directory — no + # subshell cd dance needed. + local base="${PWD}" + printf '%s/%s' "$base" "$p" + ;; + esac +} +INSTALL_DIR="$(mempal_absolutize "$INSTALL_DIR")" +# Squash any `//` or `./` or `name/..` artefacts using Python's +# os.path.normpath; falls back to the raw value if Python is missing +# (which would be very unusual on macOS / Linux). 
+if command -v python3 >/dev/null 2>&1; then + INSTALL_DIR="$(python3 -c 'import os,sys; print(os.path.normpath(sys.argv[1]))' "$INSTALL_DIR")" +fi + +# ── Logging ────────────────────────────────────────────────────────── +log() { + local lvl="$1"; shift + local msg="$*" + case "$lvl" in + debug) + [ "$LOG_LEVEL" = "debug" ] && echo "[install] DEBUG: $msg" + return 0 + ;; + info) + case "$LOG_LEVEL" in + debug|info) echo "[install] $msg" ;; + esac + ;; + warn) + case "$LOG_LEVEL" in + debug|info|warn) echo "[install] WARN: $msg" >&2 ;; + esac + ;; + error) + echo "[install] ERROR: $msg" >&2 + ;; + esac +} + +# ── Action helpers (dry-run aware) ────────────────────────────────── +run() { + if [ "$DRY_RUN" -eq 1 ]; then + echo "[install] DRY-RUN: $*" + return 0 + fi + "$@" +} + +# ── Render template ────────────────────────────────────────────────── +# +# Substitutes __PLUGIN_DIR__ in $src into $dst with $INSTALL_DIR. +# Emits the rendered file to a temp path first, then promotes it iff +# the content differs from what's already at $dst. The cmp gate is +# what makes the installer idempotent: a no-op re-run produces no +# disk writes (and the test suite asserts byte-equality). +render_template() { + local src="$1" + local dst="$2" + if [ ! -f "$src" ]; then + log error "template not found: $src" + return 1 + fi + local tmp + tmp="$(mktemp "${TMPDIR:-/tmp}/mempal_agy_render.XXXXXX")" + # Python over awk/sed — INSTALL_DIR may legitimately contain + # characters (spaces, colons) that would require careful escaping + # in a sed s/// replacement. Python read+replace handles all of + # them uniformly. 
+ python3 -c " +import sys +src, dst, install_dir = sys.argv[1:4] +with open(src, 'r') as f: + body = f.read() +body = body.replace('__PLUGIN_DIR__', install_dir) +with open(dst, 'w') as f: + f.write(body) +" "$src" "$tmp" "$INSTALL_DIR" + if [ -f "$dst" ] && cmp -s "$tmp" "$dst"; then + rm -f "$tmp" + log debug "unchanged: $dst" + return 0 + fi + if [ "$DRY_RUN" -eq 1 ]; then + echo "[install] DRY-RUN: would render $src -> $dst" + rm -f "$tmp" + return 0 + fi + mv "$tmp" "$dst" + log info "wrote: $dst" +} + +# ── copy_file: cmp-gated copy that preserves mode ──────────────────── +copy_file() { + local src="$1" + local dst="$2" + if [ ! -f "$src" ]; then + log error "missing source file: $src" + return 1 + fi + if [ -f "$dst" ] && cmp -s "$src" "$dst"; then + log debug "unchanged: $dst" + return 0 + fi + if [ "$DRY_RUN" -eq 1 ]; then + echo "[install] DRY-RUN: would copy $src -> $dst" + return 0 + fi + mkdir -p "$(dirname "$dst")" + cp "$src" "$dst" + log info "wrote: $dst" +} + +# ── Uninstall path ─────────────────────────────────────────────────── +# +# We DO NOT remove the install dir by string-substring match against +# the path. We require the install dir's basename to be exactly +# "mempalace" — that way an unrelated sibling like +# ~/.gemini/config/plugins/mempalace-foo/ is left alone, and a +# malformed --install-dir like ~ or / cannot wipe the user's home. +do_uninstall() { + local base + base="$(basename "$INSTALL_DIR")" + if [ "$base" != "mempalace" ]; then + log error "refusing to uninstall: install dir basename is '$base', expected 'mempalace'" + log error "(safety guard: prevents accidental wipe of unrelated directories)" + return 1 + fi + if [ ! -d "$INSTALL_DIR" ]; then + log info "nothing to uninstall: $INSTALL_DIR does not exist" + return 0 + fi + # Verify the dir LOOKS like our plugin before removing — a + # plugin.json file with our marker is the proof. + if [ ! 
-f "$INSTALL_DIR/plugin.json" ]; then + log error "refusing to uninstall: $INSTALL_DIR has no plugin.json" + return 1 + fi + if ! grep -q '"name"[[:space:]]*:[[:space:]]*"mempalace"' "$INSTALL_DIR/plugin.json" 2>/dev/null; then + log error "refusing to uninstall: $INSTALL_DIR/plugin.json is not a mempalace plugin" + return 1 + fi + if [ "$DRY_RUN" -eq 1 ]; then + echo "[install] DRY-RUN: would rm -rf $INSTALL_DIR" + return 0 + fi + rm -rf "$INSTALL_DIR" + log info "uninstalled: $INSTALL_DIR" +} + +if [ "$UNINSTALL" -eq 1 ]; then + do_uninstall + exit $? +fi + +# ── Pre-install sanity ─────────────────────────────────────────────── +if [ ! -d "$PLUGIN_SRC" ]; then + log error "missing source: $PLUGIN_SRC" + exit 1 +fi +if [ ! -d "$HOOKS_SRC" ]; then + log error "missing source: $HOOKS_SRC" + exit 1 +fi + +# Soft-check that mempalace-mcp is on PATH; warn but do not fail. +if ! command -v mempalace-mcp >/dev/null 2>&1; then + log warn "mempalace-mcp is not on PATH; the MCP server will fail to start until it is." + log warn " fix: 'uv tool install mempalace' or 'pip install mempalace'" +fi + +# Soft-check ~/.gemini exists; if missing, Antigravity isn't installed. +if [ ! -d "$HOME/.gemini" ]; then + log warn "$HOME/.gemini not found — Antigravity is probably not installed yet." + log warn " the install will still proceed; Antigravity will pick up the plugin on first launch."
+fi + +log info "install dir: $INSTALL_DIR" + +# ── Install: directories ───────────────────────────────────────────── +run mkdir -p "$INSTALL_DIR" \ + "$INSTALL_DIR/skills/mempalace" \ + "$INSTALL_DIR/skills/mempalace-recall" \ + "$INSTALL_DIR/rules" \ + "$INSTALL_DIR/hooks" \ + "$INSTALL_DIR/hooks/lib" + +# ── Install: plugin metadata ───────────────────────────────────────── +copy_file "$PLUGIN_SRC/plugin.json" "$INSTALL_DIR/plugin.json" +copy_file "$PLUGIN_SRC/mcp_config.json" "$INSTALL_DIR/mcp_config.json" +copy_file "$PLUGIN_SRC/README.md" "$INSTALL_DIR/README.md" + +# ── Install: skills (real files, no symlinks at the discovery path) ── +copy_file "$PLUGIN_SRC/skills/mempalace/SKILL.md" \ + "$INSTALL_DIR/skills/mempalace/SKILL.md" +copy_file "$PLUGIN_SRC/skills/mempalace-recall/SKILL.md" \ + "$INSTALL_DIR/skills/mempalace-recall/SKILL.md" + +# ── Install: optional recall rule ──────────────────────────────────── +# +# Antigravity discovers markdown rules under the plugin's rules/ +# directory. This one is recall-only and intentionally lightweight — +# it complements the mempalace-recall skill. Shipping it as a plugin +# rule (not an always-on global rule) keeps it scoped to recall- +# relevant turns, honouring MemPalace's "memory should feel instant" +# budget. +copy_file "$PLUGIN_SRC/rules/mempalace-recall.md" \ + "$INSTALL_DIR/rules/mempalace-recall.md" + +# ── Install: hooks ─────────────────────────────────────────────────── +copy_file "$HOOKS_SRC/lib/common.sh" "$INSTALL_DIR/hooks/lib/common.sh" +copy_file "$HOOKS_SRC/mempal_save_hook_antigravity.sh" "$INSTALL_DIR/hooks/mempal_save_hook_antigravity.sh" +copy_file "$HOOKS_SRC/mempal_wake_hook_antigravity.sh" "$INSTALL_DIR/hooks/mempal_wake_hook_antigravity.sh" + +# Ensure hook scripts are executable on the install side. cp preserves +# mode but a fresh git clone from a tarball might not — chmod is +# defensive, idempotent, and bash 3.2 safe. 
+if [ "$DRY_RUN" -ne 1 ]; then + chmod 755 "$INSTALL_DIR/hooks/mempal_save_hook_antigravity.sh" 2>/dev/null || true + chmod 755 "$INSTALL_DIR/hooks/mempal_wake_hook_antigravity.sh" 2>/dev/null || true +fi + +# ── Install: render hooks.json from template ───────────────────────── +render_template "$PLUGIN_SRC/hooks.json.tmpl" "$INSTALL_DIR/hooks.json" + +# ── Done ───────────────────────────────────────────────────────────── +if [ "$DRY_RUN" -eq 1 ]; then + log info "DRY-RUN complete; no files written." +else + log info "install complete: $INSTALL_DIR" + log info "restart Antigravity to load the plugin." +fi diff --git a/hooks/antigravity/lib/common.sh b/hooks/antigravity/lib/common.sh new file mode 100644 index 000000000..f900024b8 --- /dev/null +++ b/hooks/antigravity/lib/common.sh @@ -0,0 +1,498 @@ +# shellcheck shell=bash +# MEMPALACE ANTIGRAVITY HOOK — shared helpers +# +# Sourced by the two Antigravity hook scripts: +# * mempal_save_hook_antigravity.sh (Stop event) +# * mempal_wake_hook_antigravity.sh (PreInvocation event, gated to invocationNum==1) +# +# Mirrors the conventions of the existing Claude Code hook scripts +# (hooks/mempal_save_hook.sh, hooks/mempal_precompact_hook.sh): +# +# * STATE_DIR layout under ~/.mempalace/hook_state/ +# * MEMPAL_PYTHON resolution order (override -> $PATH -> bare python3) +# * MEMPALACE_HOOKS_AUTO_SAVE=false kill switch (config.json fallback) +# * sentinel-guarded Python parser via `sed -n 'Np'` (bash 3.2 safe) +# * fail-open on internal errors: emit valid JSON and log, never crash +# the hook host +# +# Antigravity-specific contract differences from Claude / Cursor: +# +# * Antigravity stdin uses camelCase (transcriptPath, conversationId, +# workspacePaths, executionNum, terminationReason, fullyIdle, +# invocationNum, initialNumSteps), not the snake_case Claude Code +# format (session_id, transcript_path, stop_hook_active). 
+# * Antigravity stdout for Stop event MUST be {} on every success path +# because { "decision": "continue" } would force the agent into an +# infinite re-execution loop. The save hook explicitly refuses to +# ever emit the "continue" decision. +# * Antigravity stdout for PreInvocation can carry an "injectSteps" +# array of { "ephemeralMessage": "..." } objects to inject memory +# into the agent's first turn. +# +# This file is sourced, not executed, so it intentionally has no +# shebang. The `# shellcheck shell=bash` directive above tells +# shellcheck to treat it as bash when run standalone. + +# bash 3.2.57 (the macOS default) is the lower bound. Do not use +# `mapfile`, `readarray`, `declare -A`, or `${var^^}` — none of those +# exist in 3.2. Use `sed -n 'Np'` for line extraction and case-folding +# via `tr` instead. + +# ── State directory + log path ──────────────────────────────────────── +# +# Honour MEMPAL_STATE_DIR while keeping the default identical to the +# Claude Code hooks so a user running both keeps a single state directory +# (constraint #7 in the integration brief). +MEMPAL_STATE_DIR="${MEMPAL_STATE_DIR:-$HOME/.mempalace/hook_state}" +mkdir -p "$MEMPAL_STATE_DIR" 2>/dev/null +MEMPAL_AGY_LOG="$MEMPAL_STATE_DIR/antigravity_hook.log" + +# ── Python interpreter resolution ───────────────────────────────────── +# +# The hooks run mempalace as `"$MEMPAL_PYTHON_BIN" -m mempalace`, so the +# resolved interpreter MUST be one that has the mempalace package +# importable. The single most common install path — +# `uv tool install mempalace` (and `pipx install`) — puts the +# `mempalace` / `mempalace-mcp` *console scripts* on PATH inside an +# ISOLATED environment whose interpreter is NOT the system `python3`. +# So naively resolving `command -v python3` lands on a system Python +# that can't import mempalace, the `-m mempalace` probe fails, and +# mining silently never fires. (This bit a real user on PR #1633.) +# +# Resolution order — first hit wins: +# 1. 
$MEMPAL_PYTHON — explicit operator override +# 2. shebang of the mempalace-mcp / mempalace console script on PATH +# — pip/uv write these with an absolute-path shebang pointing at +# the exact interpreter that owns the package. This is the SAME +# console script mcp_config.json launches, so if the MCP server +# can start, the hooks resolve a working interpreter too — no +# MEMPAL_PYTHON needed for the common install paths. +# 3. $(command -v python3) — an activated dev venv / +# editable install where python3 itself owns the package +# 4. bare "python3" — last-resort fallback +# +# Steps 2-4 are pure string parsing + stat (no Python subprocess), so +# resolution stays cheap enough to run at source time on every hook +# fire, including gated-out / kill-switched ones. We deliberately do +# NOT run an `import mempalace` probe here: building that import pays +# the chromadb/onnx cold-start cost, which a recent perf fix +# (df295bd) moved OFF the hook foreground on purpose. The downstream +# `-m mempalace --version` probe (backgrounded in the save hook, +# subprocessed in the wake hook) is the safety net that catches a +# shebang interpreter whose package is genuinely broken. +mempal_resolve_python() { + # 1. Explicit override always wins. + local p="${MEMPAL_PYTHON:-}" + if [ -n "$p" ] && [ -x "$p" ]; then + printf '%s' "$p" + return 0 + fi + + # 2. Derive the interpreter from a mempalace console-script shebang. + local script_path shebang interp + for script_path in mempalace-mcp mempalace; do + script_path="$(command -v "$script_path" 2>/dev/null || true)" + [ -n "$script_path" ] || continue + [ -r "$script_path" ] || continue + shebang="$(sed -n '1p' "$script_path" 2>/dev/null)" + case "$shebang" in + '#!'*) + interp="${shebang#\#!}" # drop the leading '#!' 
+ interp="${interp%$'\r'}" # strip a trailing CR (CRLF files) + interp="${interp# }" # drop one leading space + interp="${interp%% *}" # first whitespace-delimited token + # Guard against `#!/usr/bin/env python` wrappers: only + # accept a token whose basename looks like a Python + # interpreter and is executable. An `env`-style shebang + # yields `/usr/bin/env` here, which we skip. + case "${interp##*/}" in + python*) + if [ -x "$interp" ]; then + printf '%s' "$interp" + return 0 + fi + ;; + esac + ;; + esac + done + + # 3. First python3 on PATH. + p="$(command -v python3 2>/dev/null || true)" + if [ -n "$p" ]; then + printf '%s' "$p" + return 0 + fi + + # 4. Last-resort bare name. + printf '%s' "python3" +} +MEMPAL_PYTHON_BIN="$(mempal_resolve_python)" + +# ── Logging ─────────────────────────────────────────────────────────── +# +# ISO8601Z timestamps are greppable across timezones. +mempal_log() { + local event="${1:-?}" + local conv="${2:-unknown}" + local msg="${3:-}" + local ts + ts="$(date -u '+%Y-%m-%dT%H:%M:%SZ')" + printf '[%s] [event=%s] [conv=%s] %s\n' "$ts" "$event" "$conv" "$msg" \ + >> "$MEMPAL_AGY_LOG" 2>/dev/null +} + +# ── Kill switch ─────────────────────────────────────────────────────── +# +# Disabled if ANY of: +# * MEMPAL_DISABLE_HOOK is a truthy string +# * MEMPALACE_HOOKS_AUTO_SAVE is false/0/no +# * ~/.mempalace/config.json sets hooks.auto_save: false +# * ~/.mempalace/ directory does not exist (user nuked the palace) +# +# Returns 0 (kill switch tripped, hook should short-circuit) or non-zero +# (proceed normally). +mempal_kill_switch_tripped() { + # Palace nuke is the strongest signal: respect it before touching + # disk for state, logging, etc. + if [ ! 
-d "$HOME/.mempalace" ]; then + return 0 + fi + + case "${MEMPAL_DISABLE_HOOK:-}" in + 1|true|TRUE|yes|YES) return 0 ;; + esac + + case "${MEMPALACE_HOOKS_AUTO_SAVE:-}" in + false|FALSE|0|no|NO) return 0 ;; + esac + + local cfg="$HOME/.mempalace/config.json" + if [ -f "$cfg" ]; then + local auto + auto=$("$MEMPAL_PYTHON_BIN" -c " +import json, sys +try: + with open(sys.argv[1]) as f: + cfg = json.load(f) + print(str(cfg.get('hooks', {}).get('auto_save', True)).lower()) +except Exception: + print('true') +" "$cfg" 2>/dev/null) + if [ "$auto" = "false" ]; then + return 0 + fi + fi + + return 1 +} + +# ── camelCase JSON parser (Antigravity stdin) ──────────────────────── +# +# Reads JSON from stdin once and prints a sanitized, sentinel-bracketed +# block of fields the bash side can grab via `sed -n 'Np'`. Why a +# sentinel and per-line layout: bash 3.2 doesn't have `mapfile` or +# `readarray`, and `eval`-on-shell-var is the wrong shape (every value +# is user-controllable JSON). Sentinel + line offset is the same pattern +# the existing Claude Code hook (hooks/mempal_save_hook.sh) uses. 
+# +# Output layout (one field per line; line numbers are stable and the +# fields are documented in STDIN_SHAPE.md): +# +# line 1: __MEMPAL_PARSE_OK__ — sentinel (parse success marker) +# line 2: conversationId — sanitized to [A-Za-z0-9._-] +# line 3: transcriptPath — sanitized to a safe path charset +# line 4: workspacePath — workspacePaths[0], sanitized +# line 5: artifactDirectoryPath — sanitized +# line 6: executionNum — integer, default 0 +# line 7: terminationReason — sanitized to [a-z_] +# line 8: fullyIdle — "True" or "False" (string) +# line 9: invocationNum — integer, default 0 +# line 10: initialNumSteps — integer, default 0 +# +# The sanitizers are defense-in-depth: every field is also vetted by +# the Python json.load step, but we still strip shell-meaningful chars +# from any field a downstream bash variable might interpolate, so that +# a hostile / malformed harness payload cannot inject command tokens. +# +# Stderr from Python is captured to last_python_err.log at mode 0600 so +# operators can debug parse failures without re-firing the hook. The +# umask 077 on the inner subshell creates the file at 0600 atomically; +# the explicit chmod 600 below is a belt-and-suspenders guard if a +# future edit ever drops the umask. +mempal_parse_stdin() { + local input="$1" + ( + umask 077 + printf '%s' "$input" | "$MEMPAL_PYTHON_BIN" -c " +import sys, json, re + +# IMPORTANT: do NOT wrap json.load in a try/except. If the input is +# not valid JSON we want Python to exit non-zero BEFORE printing the +# __MEMPAL_PARSE_OK__ sentinel — the bash caller looks for the +# sentinel on line 1 to decide whether to engage its defense-in-depth +# 'failed to parse' branch. Catching the exception and falling back +# to data={} would let the sentinel print, masking parse failures +# from the bash side. The traceback lands in +# antigravity_last_python_err.log so operators can debug. 
+data = json.load(sys.stdin) + +def safe(s, allowed=r'[^a-zA-Z0-9_/.\-~]'): + return re.sub(allowed, '', str(s)) + +def safe_id(s): + return re.sub(r'[^a-zA-Z0-9._-]', '', str(s)) + +def safe_int(v, default=0): + try: + n = int(v) + return n if n >= 0 else default + except Exception: + return default + +def safe_lower_alpha_underscore(s): + return re.sub(r'[^a-z_]', '', str(s).lower()) + +conv_id = safe_id(data.get('conversationId', '')) +transcript = safe(data.get('transcriptPath', '')) +wp_arr = data.get('workspacePaths', []) +if isinstance(wp_arr, list) and wp_arr: + workspace = safe(wp_arr[0]) +else: + workspace = '' +artifact = safe(data.get('artifactDirectoryPath', '')) +execution_num = safe_int(data.get('executionNum', 0)) +termination_reason = safe_lower_alpha_underscore(data.get('terminationReason', '')) +fully_idle_raw = data.get('fullyIdle', None) +if fully_idle_raw is True or str(fully_idle_raw).lower() in ('true', '1', 'yes'): + fully_idle = 'True' +else: + fully_idle = 'False' +invocation_num = safe_int(data.get('invocationNum', 0)) +initial_num_steps = safe_int(data.get('initialNumSteps', 0)) + +print('__MEMPAL_PARSE_OK__') +print(conv_id) +print(transcript) +print(workspace) +print(artifact) +print(execution_num) +print(termination_reason) +print(fully_idle) +print(invocation_num) +print(initial_num_steps) +" 2>"$MEMPAL_STATE_DIR/antigravity_last_python_err.log" + ) + # Tidy up the err log: keep it iff non-empty (failure happened). + if [ -s "$MEMPAL_STATE_DIR/antigravity_last_python_err.log" ]; then + chmod 600 "$MEMPAL_STATE_DIR/antigravity_last_python_err.log" 2>/dev/null + else + rm -f "$MEMPAL_STATE_DIR/antigravity_last_python_err.log" 2>/dev/null + fi +} + +# ── Transcript path validator ───────────────────────────────────────── +# +# Mirrors mempalace.hooks_cli._validate_transcript_path: rejects empty, +# non-jsonl/json suffixes, and any `..` traversal segment. 
+mempal_is_valid_transcript_path() { + local path="$1" + [ -n "$path" ] || return 1 + case "$path" in + *.json|*.jsonl) ;; + *) return 1 ;; + esac + case "/$path/" in + */../*) return 1 ;; + esac + return 0 +} + +# ── Wing inference ──────────────────────────────────────────────────── +# +# Takes the first workspace path from workspacePaths[] (already +# extracted into $1) and derives a `wing_` name from its leaf +# directory. Hyphens become underscores; spaces become underscores. +# Empty input yields wing_sessions, matching mempalace.hooks_cli's +# fallback. +mempal_infer_wing() { + local workspace="$1" + if [ -z "$workspace" ]; then + printf 'wing_sessions' + return 0 + fi + # Strip trailing slashes + while [ "${workspace}" != "${workspace%/}" ]; do + workspace="${workspace%/}" + done + if [ -z "$workspace" ]; then + printf 'wing_sessions' + return 0 + fi + local leaf="${workspace##*/}" + if [ -z "$leaf" ]; then + printf 'wing_sessions' + return 0 + fi + # Lowercase, then map spaces and hyphens to underscores. tr is bash + # 3.2 safe; the ${var,,} case-folding expansion is bash 4+ only. + local slug + slug=$(printf '%s' "$leaf" | tr 'A-Z' 'a-z' | tr ' -' '__') + printf 'wing_%s' "$slug" +} + +# ── Save-interval floor ─────────────────────────────────────────────── +# +# Reads MEMPAL_SAVE_INTERVAL; empty, non-numeric, or zero values fall +# back to 15, so `count % interval` cannot divide by zero. We hit the +# divide-by-zero shape on the Cursor PR review; this guards explicitly. +mempal_save_interval() { + local raw="${MEMPAL_SAVE_INTERVAL:-15}" + case "$raw" in + ''|*[!0-9]*) printf '15'; return 0 ;; + esac + # Strip leading zeros. bash arithmetic ($((...))) parses any token + # starting with `0` as octal, so MEMPAL_SAVE_INTERVAL=08 would + # crash $((COUNT % INTERVAL)) with "value too great for base". + # Loop while the value still starts with 0 AND has length > 1, so + # the literal string "0" is preserved (then replaced by 15 below).
+ while [ "${raw}" != "${raw#0}" ] && [ "${#raw}" -gt 1 ]; do + raw="${raw#0}" + done + if [ "$raw" -lt 1 ] 2>/dev/null; then + printf '15' + return 0 + fi + printf '%s' "$raw" +} + +# ── Atomic counter write ────────────────────────────────────────────── +# +# Writes $value to $file via a same-directory temp file + `mv -f`. +# `mv` (rename) is atomic on a single filesystem, so a concurrent +# reader either sees the old contents or the new contents — never a +# half-written / truncated file. A plain `printf > file` truncates +# first and then writes, leaving a window where a concurrent Stop fire +# could read an empty / partial value. Concurrent fires for one +# conversation are unlikely (Antigravity serializes turns) but the +# previous "written atomically" comment was simply false; this makes +# it true. +# +# bash 3.2 safe. The temp lives in the same directory as the target so +# the rename stays on one filesystem (a cross-device mv would fall back +# to copy+unlink and lose atomicity). On any failure we degrade to a +# direct write rather than leaving the counter unwritten. +mempal_write_counter_atomic() { + local file="$1" + local value="$2" + local tmp + tmp="$(mktemp "${file}.XXXXXX" 2>/dev/null)" || { + printf '%s' "$value" > "$file" + return + } + printf '%s' "$value" > "$tmp" + mv -f "$tmp" "$file" 2>/dev/null || { + rm -f "$tmp" 2>/dev/null + printf '%s' "$value" > "$file" + } +} + +# ── State-file TTL ──────────────────────────────────────────────────── +# +# Per-conversation state artifacts (antigravity_save_count_, +# antigravity_pending_, antigravity_woke_/) accumulate one +# set per conversation and are never otherwise removed. Reads +# MEMPAL_STATE_TTL_DAYS (default 30), validated digits-only and +# leading-zero-stripped like mempal_save_interval so `find -mtime` +# never sees a bad token. A value of 0 means "sweep aggressively" +# (everything older than today); we floor empty/garbage to 30. 
+mempal_state_ttl_days() { + local raw="${MEMPAL_STATE_TTL_DAYS:-30}" + case "$raw" in + ''|*[!0-9]*) printf '30'; return 0 ;; + esac + while [ "${raw}" != "${raw#0}" ] && [ "${#raw}" -gt 1 ]; do + raw="${raw#0}" + done + printf '%s' "$raw" +} + +# ── Stale state GC ──────────────────────────────────────────────────── +# +# Opportunistic sweep of per-conversation state older than the TTL. +# Gated to run at most once per 24h via the antigravity_last_sweep +# marker so it costs nothing on the vast majority of fires (a single +# mtime comparison). When it does run, three `find` passes remove the +# stale counter files, pending markers, and woke marker directories. +# +# The name globs are specific (antigravity_save_count_*, _pending_*, +# _woke_*), so the shared log files (antigravity_hook.log, +# antigravity_last_input.log, antigravity_last_python_err.log) and the +# antigravity_last_sweep marker itself are never touched. BSD find +# (macOS default) and GNU find both accept `-maxdepth`, `-mtime +N`, +# and `-exec ... +`. +# +# Fail-open: every step is best-effort; a missing state dir, a find +# that errors, or a permission problem must never abort the caller. +mempal_gc_stale_state() { + [ -d "$MEMPAL_STATE_DIR" ] || return 0 + + local marker="$MEMPAL_STATE_DIR/antigravity_last_sweep" + if [ -f "$marker" ]; then + local mtime now + if mtime=$("$MEMPAL_PYTHON_BIN" -c 'import os, sys; print(int(os.path.getmtime(sys.argv[1])))' "$marker" 2>/dev/null) \ + && now=$(date '+%s' 2>/dev/null) \ + && [ -n "$mtime" ] \ + && [ "$((now - mtime))" -lt 86400 ]; then + return 0 + fi + fi + # Touch the marker first so a crash mid-sweep still throttles the + # next fire (better to skip a sweep than to hammer the disk). 
+ : > "$marker" 2>/dev/null + + local ttl + ttl=$(mempal_state_ttl_days) + + find "$MEMPAL_STATE_DIR" -maxdepth 1 -type f \ + -name 'antigravity_save_count_*' -mtime +"$ttl" \ + -exec rm -f {} + 2>/dev/null + find "$MEMPAL_STATE_DIR" -maxdepth 1 -type f \ + -name 'antigravity_pending_*' -mtime +"$ttl" \ + -exec rm -f {} + 2>/dev/null + find "$MEMPAL_STATE_DIR" -maxdepth 1 -type d \ + -name 'antigravity_woke_*' -mtime +"$ttl" \ + -exec rm -rf {} + 2>/dev/null + + return 0 +} + +# ── Fail-open emitters ──────────────────────────────────────────────── +# +# Every code path in both hooks must terminate by calling exactly one +# of these emitters. Stdout is JSON. Exit status is always 0 — the hook +# never blocks the user's IDE on its own failure (constraint #2). +# +# CRITICAL: mempal_emit_stop_pass MUST NEVER emit +# {"decision":"continue"} — that would force the agent to keep running +# instead of letting the turn end. Antigravity treats any value other +# than "continue" (including `{}`) as "allow the stop". We enforce this +# by hard-coding the empty object output here. +mempal_emit_stop_pass() { + printf '{}\n' +} + +mempal_emit_wake_inject() { + local message="$1" + if [ -z "$message" ]; then + printf '{}\n' + return 0 + fi + # Encode the message as JSON via Python so embedded quotes / newlines + # / control chars don't corrupt the output. + "$MEMPAL_PYTHON_BIN" -c " +import json, sys +msg = sys.argv[1] +print(json.dumps({'injectSteps': [{'ephemeralMessage': msg}]})) +" "$message" 2>/dev/null || printf '{}\n' +} diff --git a/hooks/antigravity/mempal_save_hook_antigravity.sh b/hooks/antigravity/mempal_save_hook_antigravity.sh new file mode 100755 index 000000000..47b99483f --- /dev/null +++ b/hooks/antigravity/mempal_save_hook_antigravity.sh @@ -0,0 +1,245 @@ +#!/bin/bash +# MEMPALACE ANTIGRAVITY SAVE HOOK — Stop event handler +# +# Antigravity fires the Stop event each time the agent's execution loop +# terminates. 
We use it to background-mine the active conversation +# transcript every Nth save into the user's MemPalace, and to write a +# diary checkpoint via `mempalace mine --mode convos`. +# +# Mirrors the Claude Code (hooks/mempal_save_hook.sh) and Codex +# (.codex-plugin/hooks/mempal-hook.sh) integrations as closely as the +# Antigravity stdin/stdout contract allows. Differences: +# +# * Antigravity stdin uses camelCase: conversationId, transcriptPath, +# workspacePaths, executionNum, terminationReason, fullyIdle. +# * Antigravity stdout MUST be `{}` on every code path. Emitting +# `{"decision":"continue"}` would force the agent to keep running +# and create an infinite loop. We never call mempal_emit_stop_pass +# with anything other than the literal empty object. +# * Counter file is namespaced antigravity_save_count_ +# to coexist with Claude Code / Cursor / Codex state in the same +# ~/.mempalace/hook_state/ directory. +# +# === STDIN (verified, camelCase) === +# { +# "executionNum": 1, +# "terminationReason": "model_stop", +# "error": "", +# "fullyIdle": true, +# "conversationId": "", +# "workspacePaths": ["/abs/path/..."], +# "transcriptPath": "/abs/path/transcript.jsonl", +# "artifactDirectoryPath": "/abs/path/artifacts/" +# } +# +# === STDOUT (always) === +# {} +# +# `set -e` is intentionally NOT enabled — a broken hook must not block +# the user's conversation (constraint #2 in the integration brief). + +# ── Locate this script + source common helpers ─────────────────────── +MEMPAL_AGY_HOOK_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)" +# shellcheck source=lib/common.sh +. 
"$MEMPAL_AGY_HOOK_DIR/lib/common.sh" + +# ── Read all of stdin once ─────────────────────────────────────────── +INPUT=$(cat) + +# ── Kill switch: short-circuit cleanly if disabled ─────────────────── +if mempal_kill_switch_tripped; then + mempal_emit_stop_pass + exit 0 +fi + +# ── Opportunistic GC of stale per-conversation state ───────────────── +# +# Self-throttled to at most once per 24h (see mempal_gc_stale_state), +# so this is a single mtime check on the overwhelming majority of +# fires. Runs after the kill switch so a disabled hook touches nothing. +mempal_gc_stale_state + +# ── Parse stdin (camelCase, sentinel-guarded) ──────────────────────── +_parsed=$(mempal_parse_stdin "$INPUT") +_marker=$(printf '%s\n' "$_parsed" | sed -n '1p') +CONVERSATION_ID=$(printf '%s\n' "$_parsed" | sed -n '2p') +TRANSCRIPT_PATH=$(printf '%s\n' "$_parsed" | sed -n '3p') +WORKSPACE_PATH=$(printf '%s\n' "$_parsed" | sed -n '4p') +# Line 5 (artifactDirectoryPath) is parsed but unused for save. Skip. +EXECUTION_NUM=$(printf '%s\n' "$_parsed" | sed -n '6p') +# Line 7 (terminationReason) is parsed but used only for logging. +TERMINATION_REASON=$(printf '%s\n' "$_parsed" | sed -n '7p') +FULLY_IDLE=$(printf '%s\n' "$_parsed" | sed -n '8p') + +# ── Defense-in-depth: surface raw input on parse failure ───────────── +# +# When the sentinel is missing, Python crashed before reaching its +# print() calls. Persist the offending payload (capped at 4 KB, mode +# 0600) so the next debugger doesn't lose a day to log lines that say +# "Session unknown". +if [ -n "$INPUT" ] && [ "$_marker" != "__MEMPAL_PARSE_OK__" ]; then + mempal_log "stop" "unknown" "input parse failed (sentinel missing); see antigravity_last_input.log + antigravity_last_python_err.log" + ( + umask 077 + printf '%s' "$INPUT" | head -c 4096 > "$MEMPAL_STATE_DIR/antigravity_last_input.log" + ) + chmod 600 "$MEMPAL_STATE_DIR/antigravity_last_input.log" 2>/dev/null + # Continue with empty fields; the validators below will reject. 
+fi + +CONVERSATION_ID="${CONVERSATION_ID:-unknown}" +TRANSCRIPT_PATH="${TRANSCRIPT_PATH:-}" +WORKSPACE_PATH="${WORKSPACE_PATH:-}" +EXECUTION_NUM="${EXECUTION_NUM:-0}" +TERMINATION_REASON="${TERMINATION_REASON:-}" +FULLY_IDLE="${FULLY_IDLE:-False}" + +# Expand ~ in the transcript path +TRANSCRIPT_PATH="${TRANSCRIPT_PATH/#\~/$HOME}" + +# ── Bail when fullyIdle is False ───────────────────────────────────── +# +# If background commands or async tasks are still running, the +# transcript is still in motion. Defer the save until the next Stop +# event when the agent is fully done — better to skip than to ingest a +# half-finished transcript and pollute the search index. +if [ "$FULLY_IDLE" != "True" ]; then + mempal_log "stop" "$CONVERSATION_ID" "deferring save: fullyIdle=False (executionNum=$EXECUTION_NUM, terminationReason=$TERMINATION_REASON)" + mempal_emit_stop_pass + exit 0 +fi + +# ── Skip when terminationReason is `error` ─────────────────────────── +# +# A model error termination usually means the transcript is corrupt or +# truncated. Don't ingest noise. +if [ "$TERMINATION_REASON" = "error" ]; then + mempal_log "stop" "$CONVERSATION_ID" "skipping save: terminationReason=error" + mempal_emit_stop_pass + exit 0 +fi + +# ── Increment counter (per conversation) ───────────────────────────── +# +# The counter is a single integer, written via mempal_write_counter_atomic +# (same-dir temp + `mv`, which is an atomic rename on one filesystem). +# Concurrent Stop fires for the same conversation are unlikely +# (Antigravity serializes turns), but the atomic write means a +# concurrent reader always sees a complete value rather than a +# half-written / truncated file. The integer-only validation on read +# is a second guard: any garbled value resets the count to 0. 
+COUNTER_FILE="$MEMPAL_STATE_DIR/antigravity_save_count_${CONVERSATION_ID}" +COUNT=0 +if [ -f "$COUNTER_FILE" ]; then + raw=$(cat "$COUNTER_FILE" 2>/dev/null) + case "$raw" in + ''|*[!0-9]*) COUNT=0 ;; + *) COUNT="$raw" ;; + esac +fi +COUNT=$((COUNT + 1)) +mempal_write_counter_atomic "$COUNTER_FILE" "$COUNT" + +INTERVAL=$(mempal_save_interval) +mempal_log "stop" "$CONVERSATION_ID" "count=$COUNT interval=$INTERVAL executionNum=$EXECUTION_NUM workspace=$WORKSPACE_PATH" + +# ── Modulo gate ────────────────────────────────────────────────────── +# +# `count % interval == 0` triggers a save. INTERVAL has been floored to +# >= 1 by mempal_save_interval, so the modulo cannot divide by zero +# even if the user explicitly set MEMPAL_SAVE_INTERVAL=0 or empty. +if [ $((COUNT % INTERVAL)) -ne 0 ]; then + mempal_emit_stop_pass + exit 0 +fi + +# ── Pending-marker guard ───────────────────────────────────────────── +# +# If a previous save is still running (the marker file exists), skip +# this fire. The mine subprocess removes the marker on exit, but a +# crashed mine could leave the marker forever — guard against that by +# treating markers older than 1 hour as stale and reclaiming them. +PENDING_FILE="$MEMPAL_STATE_DIR/antigravity_pending_${CONVERSATION_ID}" +if [ -f "$PENDING_FILE" ]; then + # mtime in epoch seconds (portable; BSD/macOS `date -r` takes epoch, not a path). + if mtime=$("$MEMPAL_PYTHON_BIN" -c 'import os, sys; print(int(os.path.getmtime(sys.argv[1])))' "$PENDING_FILE" 2>/dev/null) \ + && now=$(date '+%s') \ + && [ -n "$mtime" ] \ + && [ "$((now - mtime))" -lt 3600 ]; then + mempal_log "stop" "$CONVERSATION_ID" "pending save still in flight; skipping" + mempal_emit_stop_pass + exit 0 + fi + mempal_log "stop" "$CONVERSATION_ID" "stale pending marker reclaimed" + rm -f "$PENDING_FILE" 2>/dev/null +fi + +# ── Validate transcript path ───────────────────────────────────────── +if ! 
mempal_is_valid_transcript_path "$TRANSCRIPT_PATH"; then + mempal_log "stop" "$CONVERSATION_ID" "invalid transcriptPath rejected: $TRANSCRIPT_PATH" + mempal_emit_stop_pass + exit 0 +fi +if [ ! -f "$TRANSCRIPT_PATH" ]; then + mempal_log "stop" "$CONVERSATION_ID" "transcriptPath does not exist: $TRANSCRIPT_PATH" + mempal_emit_stop_pass + exit 0 +fi + +# ── Trigger save ───────────────────────────────────────────────────── +WING=$(mempal_infer_wing "$WORKSPACE_PATH") +TRANSCRIPT_DIR=$(dirname "$TRANSCRIPT_PATH") + +mempal_log "stop" "$CONVERSATION_ID" "TRIGGERING SAVE wing=$WING transcript_dir=$TRANSCRIPT_DIR" + +# Drop the pending marker BEFORE spawning so a near-simultaneous fire +# sees it. If the spawn fails, remove the marker so the next fire can +# retry. +: > "$PENDING_FILE" 2>/dev/null + +# Detach EVERYTHING heavy into a single background subshell: the +# runnability probe, the mine itself, and the pending-marker cleanup. +# The foreground returns immediately after spawning, so the hook's +# stdout (`{}`) reaches Antigravity within milliseconds. +# +# Why the probe must NOT run in the foreground: `mempalace --version` +# is NOT cheap. Building the `mine` argument parser imports +# `mempalace.miner` (-> palace -> backends -> chromadb/onnx) before +# argparse ever processes `--version`, so the probe pays the full +# cold-start import cost. Running it in the foreground would block the +# hook for that entire import and blow the <500ms save budget. Moving +# it inside the backgrounded subshell keeps the foreground instant. +# +# Folding the cleanup into this same subshell also removes the need for +# a separate process-liveness polling watcher: the `rm -f +# "$PENDING_FILE"` simply runs after the mine returns, in the same +# shell that owns the mine — no sibling-PID `wait` hazard, no polling +# loop. 
+# +# We invoke mempalace as `"$MEMPAL_PYTHON_BIN" -m mempalace` rather than +# the bare `mempalace` console script so a user with the package +# installed only inside a venv (and the venv's bin/ not on the hook's +# PATH, e.g. `uv tool install` in some distributions, or a manually +# managed virtualenv) still hits a working mine. Setting MEMPAL_PYTHON +# overrides the interpreter; `-m mempalace` runs ``mempalace/__main__.py``, +# which dispatches to ``mempalace.cli:main``, identical to the console script. +mempal_log "stop" "$CONVERSATION_ID" "spawning background mine wing=$WING transcript_dir=$TRANSCRIPT_DIR" +( + if "$MEMPAL_PYTHON_BIN" -m mempalace --version >/dev/null 2>&1; then + "$MEMPAL_PYTHON_BIN" -m mempalace mine "$TRANSCRIPT_DIR" \ + --mode convos \ + --wing "$WING" \ + >> "$MEMPAL_AGY_LOG" 2>&1 < /dev/null + mempal_log "stop" "$CONVERSATION_ID" "background mine finished wing=$WING" + else + mempal_log "stop" "$CONVERSATION_ID" "ERROR: mempalace is not runnable via $MEMPAL_PYTHON_BIN -m mempalace; install mempalace or set MEMPAL_PYTHON" + fi + rm -f "$PENDING_FILE" 2>/dev/null +) >/dev/null 2>&1 < /dev/null & + +# ── Always emit `{}` ───────────────────────────────────────────────── +# +# Never `{"decision":"continue"}`. That would force the agent into an +# infinite re-execution loop. mempal_emit_stop_pass hard-codes `{}`. +mempal_emit_stop_pass +exit 0 diff --git a/hooks/antigravity/mempal_wake_hook_antigravity.sh b/hooks/antigravity/mempal_wake_hook_antigravity.sh new file mode 100755 index 000000000..7dcc952da --- /dev/null +++ b/hooks/antigravity/mempal_wake_hook_antigravity.sh @@ -0,0 +1,166 @@ +#!/bin/bash +# MEMPALACE ANTIGRAVITY WAKE HOOK: PreInvocation event handler +# +# Antigravity fires the PreInvocation event before every model +# invocation, with `invocationNum` carrying the sequence number of the +# call. 
We use the first invocation (invocationNum == 1) as our +# session-start equivalent and inject a verbatim memory pointer into +# the agent's context via the `injectSteps[].ephemeralMessage` output +# field — the message lives for one turn and does not persist into the +# transcript, so it doesn't pollute future invocations of this same +# conversation. +# +# === STDIN (verified, camelCase) === +# { +# "invocationNum": 1, +# "initialNumSteps": 0, +# "conversationId": "", +# "workspacePaths": ["/abs/path/..."], +# "transcriptPath": "/abs/path/transcript.jsonl", +# "artifactDirectoryPath": "/abs/path/artifacts/" +# } +# +# === STDOUT === +# Either: +# {} — no injection +# Or: +# {"injectSteps":[{"ephemeralMessage":"..."}]} — verbatim memory pointer +# +# Verbatim guarantee: the ephemeralMessage carries the exact text +# emitted by `mempalace wake-up`, never paraphrased or summarized. +# +# Performance budget: the integration brief sets a 100ms ceiling for +# startup injection. We enforce a 500ms hard timeout on the +# `mempalace wake-up` subprocess (more generous than 100ms because +# cold ChromaDB connections can dominate, and missing the budget is +# strictly better than blocking the user) — if it doesn't return in +# time we emit `{}` and let the conversation start without injection. +# +# `set -e` is intentionally NOT enabled — fail-open is mandatory. + +# ── Locate this script + source common helpers ─────────────────────── +MEMPAL_AGY_HOOK_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)" +# shellcheck source=lib/common.sh +. 
"$MEMPAL_AGY_HOOK_DIR/lib/common.sh" + +# ── Read all of stdin once ─────────────────────────────────────────── +INPUT=$(cat) + +# ── Kill switch ────────────────────────────────────────────────────── +if mempal_kill_switch_tripped; then + mempal_emit_stop_pass + exit 0 +fi + +# ── Parse stdin ────────────────────────────────────────────────────── +_parsed=$(mempal_parse_stdin "$INPUT") +_marker=$(printf '%s\n' "$_parsed" | sed -n '1p') +CONVERSATION_ID=$(printf '%s\n' "$_parsed" | sed -n '2p') +# Lines 3-5 (transcriptPath, workspacePath, artifactDirectoryPath) are +# parsed; we use workspacePath for wing inference. transcriptPath and +# artifactDirectoryPath are unused by the wake flow. +# Line 4: workspacePath +WORKSPACE_PATH=$(printf '%s\n' "$_parsed" | sed -n '4p') +INVOCATION_NUM=$(printf '%s\n' "$_parsed" | sed -n '9p') + +# Defense-in-depth on parse failure +if [ -n "$INPUT" ] && [ "$_marker" != "__MEMPAL_PARSE_OK__" ]; then + mempal_log "preInvocation" "unknown" "input parse failed (sentinel missing)" + mempal_emit_stop_pass + exit 0 +fi + +CONVERSATION_ID="${CONVERSATION_ID:-unknown}" +WORKSPACE_PATH="${WORKSPACE_PATH:-}" +INVOCATION_NUM="${INVOCATION_NUM:-0}" + +# ── Gate: only inject on the FIRST invocation ──────────────────────── +# +# PreInvocation fires before every model call. Without this gate we'd +# inject memory on every single turn — both expensive and visually +# noisy. invocationNum == 1 means "first model call of this +# conversation", which is the closest thing Antigravity exposes to +# Cursor's `sessionStart`. +if [ "$INVOCATION_NUM" != "1" ]; then + mempal_emit_stop_pass + exit 0 +fi + +# ── Loop guard ─────────────────────────────────────────────────────── +# +# Defense in depth: even within the first invocation, we only ever +# want to inject once per conversation. mkdir is atomic and works on +# bash 3.2 / macOS / Linux without flock or other GNU coreutils +# extensions. 
+WOKE_MARKER="$MEMPAL_STATE_DIR/antigravity_woke_${CONVERSATION_ID}" +if ! mkdir "$WOKE_MARKER" 2>/dev/null; then + mempal_log "preInvocation" "$CONVERSATION_ID" "already woke this conversation; skipping" + mempal_emit_stop_pass + exit 0 +fi + +# ── Run wake-up with a hard timeout ────────────────────────────────── +# +# `timeout` is GNU coreutils — present on most Linux installs but +# missing from stock macOS. Wrap the subprocess in a Python timeout +# (subprocess.run(timeout=...)) which is cross-platform. The Python +# script also constructs the final JSON envelope for stdout, so the +# bash side just passes the result through. +WING=$(mempal_infer_wing "$WORKSPACE_PATH") +mempal_log "preInvocation" "$CONVERSATION_ID" "WAKE injection wing=$WING invocationNum=$INVOCATION_NUM" + +OUTPUT=$("$MEMPAL_PYTHON_BIN" -c " +import json, subprocess, sys + +wing = sys.argv[1] +timeout_s = 0.5 # 500 ms + +# Invoke as ``[sys.executable, '-m', 'mempalace', ...]`` rather than +# the bare ``mempalace`` console script. sys.executable is the same +# Python that resolved MEMPAL_PYTHON in lib/common.sh, so this binds +# the wake-up call to the correct interpreter (and its installed +# mempalace package) even when the venv's bin/ isn't on PATH. +try: + completed = subprocess.run( + [sys.executable, '-m', 'mempalace', 'wake-up', '--wing', wing], + capture_output=True, + text=True, + timeout=timeout_s, + ) + if completed.returncode != 0: + print('{}') + sys.exit(0) + body = (completed.stdout or '').strip() + if not body: + print('{}') + sys.exit(0) + # Verbatim — pass the wake-up text exactly as emitted, wrapped in + # the Antigravity injectSteps envelope. json.dumps escapes embedded + # control chars and quotes correctly. 
+ print(json.dumps({'injectSteps': [{'ephemeralMessage': body}]})) +except FileNotFoundError: + print('{}') +except subprocess.TimeoutExpired: + print('{}') +except Exception: + print('{}') +" "$WING" 2>/dev/null) + +if [ -z "$OUTPUT" ]; then + OUTPUT='{}' +fi + +# Sanity-check: never emit `decision` from a PreInvocation hook (that +# field belongs to the Stop event). The Python helper only ever +# constructs `{"injectSteps": [...]}` or `{}`, so this is belt-and- +# suspenders against a future edit ever leaking a Stop-shaped object. +case "$OUTPUT" in + *\"decision\"*) + mempal_log "preInvocation" "$CONVERSATION_ID" "ERROR: refused to emit decision field from wake hook" + mempal_emit_stop_pass + exit 0 + ;; +esac + +printf '%s\n' "$OUTPUT" +exit 0 diff --git a/hooks/cursor/README.md b/hooks/cursor/README.md new file mode 100644 index 000000000..fcf2a7d34 --- /dev/null +++ b/hooks/cursor/README.md @@ -0,0 +1,157 @@ +# MemPalace Cursor IDE Hooks + +Auto-save and session-recall hooks for the [Cursor](https://cursor.com) IDE, +matching the behaviour of the existing Claude Code + Codex hooks at the repo +root and adding two Cursor-only capabilities (`sessionStart` recall and a +preCompact transcript snapshot). + +For the rendered documentation see +[`website/guide/cursor-hooks.md`](../../website/guide/cursor-hooks.md) or +the published version at +[mempalaceofficial.com/guide/cursor-hooks](https://mempalaceofficial.com/guide/cursor-hooks.html). + +## What's here + +| File | Role | +|-------------------------------------|---------------------------------------------------------------------| +| `lib/common.sh` | Shared bash helpers (parse, log, counter, wing inference, kill switch). Sourced by all three hooks. | +| `mempal_save_hook_cursor.sh` | Cursor `stop` hook. Counts stop invocations per conversation, emits a `followup_message` every `SAVE_INTERVAL` (default 15) telling the agent to file the session into MemPalace. 
| +| `mempal_precompact_hook_cursor.sh` | Cursor `preCompact` hook. Runs `mempalace mine` synchronously on the transcript before compaction, then drops a `.pending` marker so the next stop forces a save nudge. | +| `mempal_wake_hook_cursor.sh` | Cursor `sessionStart` hook. Returns `additional_context` telling the agent to recall scoped to the wing inferred from the workspace root. Cursor-only — Claude Code has no equivalent. | +| `install.sh` | Optional installer. Copies the scripts to `~/.mempalace/hooks/cursor/` and merges entries into `~/.cursor/hooks.json` (or `.cursor/hooks.json` for project scope). Supports `--dry-run` and `--uninstall`. | +| `STDIN_SHAPE.md` | Reference. Per-event stdin / stdout schema with citations to the official Cursor docs. | + +## Quick install + +Preview first (writes nothing, prints the would-be JSON to stdout): + +```bash +hooks/cursor/install.sh --scope user --dry-run +``` + +Apply — writes `~/.cursor/hooks.json` and copies the scripts to `~/.mempalace/hooks/cursor/`: + +```bash +hooks/cursor/install.sh --scope user +``` + +Pass `--scope project --target ` to write `/.cursor/hooks.json` instead. +The installer never auto-runs — it is a documented opt-in step. We do not +modify your Cursor config on `pip install mempalace` because editor config +is sacred and should never be touched without explicit consent. + +## Manual install (no installer) + +The minimum wiring is `stop` only. Add to `~/.cursor/hooks.json`: + +```json +{ + "version": 1, + "hooks": { + "stop": [ + { + "command": "/absolute/path/to/hooks/cursor/mempal_save_hook_cursor.sh", + "loop_limit": 1 + } + ] + } +} +``` + +For the full triple (recommended), also wire `sessionStart` and `preCompact` +— see [`examples/cursor/hooks.json`](../../examples/cursor/hooks.json). + +After editing the file, Cursor watches `hooks.json` and reloads +automatically. If hooks still do not fire, restart Cursor and check the +Hooks panel in Settings. 
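
As a rough illustration of the full wiring, the sketch below builds the three-event config in Python, mirroring what the installer's merge helper writes (`stop` with `loop_limit: 1`, plus `preCompact` and `sessionStart`). The function name `build_hooks_config` and the `/opt/mempalace/hooks/cursor` path are assumptions made for illustration; the shipped `examples/cursor/hooks.json` may differ in detail.

```python
import json
import posixpath


def build_hooks_config(install_dir, variant="full"):
    """Return a Cursor hooks.json dict wired to the MemPalace hook scripts.

    Hypothetical helper for illustration only. install_dir must be absolute
    because the hook command is launched from Cursor's working directory.
    """
    if not install_dir.startswith("/"):
        raise ValueError("install_dir must be an absolute path")

    def script(name):
        # Join the install location with the script's basename.
        return posixpath.join(install_dir, name)

    # Minimal wiring: the stop hook only, with the loop limit from this README.
    hooks = {
        "stop": [
            {"command": script("mempal_save_hook_cursor.sh"), "loop_limit": 1}
        ]
    }
    # Full wiring adds the two Cursor-only events.
    if variant == "full":
        hooks["preCompact"] = [
            {"command": script("mempal_precompact_hook_cursor.sh")}
        ]
        hooks["sessionStart"] = [
            {"command": script("mempal_wake_hook_cursor.sh")}
        ]
    return {"version": 1, "hooks": hooks}


if __name__ == "__main__":
    print(json.dumps(build_hooks_config("/opt/mempalace/hooks/cursor"), indent=2))
```

Passing `variant="minimal"` yields the `stop`-only config shown above, which is handy for comparing the two shapes side by side.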
+ +## Configuration + +All knobs are env vars; defaults match the Claude Code hooks where +possible so a single hook-state directory works for both editors. + +| Variable | Default | Purpose | +|--------------------------------|------------------------------------|---------| +| `MEMPAL_SAVE_INTERVAL` | `15` | Number of `stop` events between save followups. | +| `MEMPAL_CURSOR_SILENT` | (unset) | Set to `1`/`true`/`yes` to suppress the `followup_message`. The hook still runs its best-effort background mine and keeps its counters — it just stays silent. `MEMPAL_VERBOSE=false`/`0`/`no` does the same. See note below on why the followup is on by default. | +| `MEMPAL_DIR` | (unset) | Optional project directory to also mine on each save. Additive — never replaces the transcript mine. | +| `MEMPAL_PYTHON` | auto-detected | Path to a Python 3 interpreter. Fallback order: `$MEMPAL_PYTHON` → `command -v python3` → bare `python3`. Useful when Cursor is launched from a GUI on macOS and the inherited PATH lacks your installed `python3`. | +| `MEMPAL_STATE_DIR` | `$HOME/.mempalace/hook_state` | Where the hook keeps its per-conversation counter files, pending-save markers, and `cursor_hook.log`. | +| `MEMPAL_STATE_TTL_DAYS` | `30` | Age (days) after which stale `cursor_*.count` / `cursor_*.pending` state files are garbage-collected. A daily-throttled sweep runs from the hooks; only Cursor state is touched (shared logs and other editors' state are left alone). | +| `MEMPAL_DISABLE_HOOK` | (unset) | Set to `1`/`true`/`yes` to disable all three hooks. Emergency kill switch. | +| `MEMPALACE_HOOKS_AUTO_SAVE` | (unset) | Set to `false`/`0`/`no` to disable. Same semantics as the Claude Code hooks. Also honoured via `~/.mempalace/config.json` → `{"hooks": {"auto_save": false}}`. 
| + +## Debugging + +Everything appends to: + +```bash +cat ~/.mempalace/hook_state/cursor_hook.log +``` + +Example log lines (ISO 8601 + event + conversation id): + +``` +[2026-05-27T02:16:01Z] [event=sessionStart] [conv=abc123] workspace=/Users/me/proj wing=proj +[2026-05-27T02:21:33Z] [event=stop] [conv=abc123] counter 0 -> 1 (interval=15) +[2026-05-27T02:42:09Z] [event=stop] [conv=abc123] counter 14 -> 15 (interval=15) +[2026-05-27T02:42:09Z] [event=stop] [conv=abc123] TRIGGERING SAVE at counter=15 +[2026-05-27T02:42:11Z] [event=stop] [conv=abc123] loop_count>0; letting agent stop +``` + +When a hook can't parse its stdin (corrupt payload, future Cursor schema +change), the raw input (capped at 4096 bytes, mode 0600) lands at: + +``` +~/.mempalace/hook_state/cursor_last_input.log +~/.mempalace/hook_state/cursor_last_python_err.log +``` + +These are overwritten on each failure, never appended, so a repeating +misconfiguration cannot grow disk usage. + +## What differs from the Claude Code hooks + +| Aspect | Claude Code hooks (`hooks/mempal_*.sh`) | Cursor hooks (`hooks/cursor/*.sh`) | +|-------------------------|------------------------------------------------|-----------------------------------------------------| +| Counter key | `session_id` | `conversation_id` (Cursor's stable per-conv id) | +| Loop guard | `stop_hook_active` flag in stdin | `loop_count` field in stdin | +| Counting method | Parses JSONL transcript for user messages | Counts `stop` invocations (transcript schema undoc) | +| Capture path | Background `mine --mode convos` (normalize.py has a Claude parser) | Background mine is best-effort (no Cursor parser); the `followup_message` carries verbatim capture | +| Save default | Silent — diary nudge opt-IN behind `MEMPAL_VERBOSE=true` | Followup ON by default; opt-OUT via `MEMPAL_CURSOR_SILENT=1` / `MEMPAL_VERBOSE=false` | +| PreCompact behaviour | `decision: block` forces save before compaction | Pre-mine + pending-save marker (Cursor preCompact 
is observational-only) | +| sessionStart | n/a (Claude Code has no equivalent) | `additional_context` injects recall guidance | +| State dir | `$HOME/.mempalace/hook_state` (hardcoded) | Same default, plus `MEMPAL_STATE_DIR` env override | +| Kill switch | `MEMPALACE_HOOKS_AUTO_SAVE=false` | Same, plus `MEMPAL_DISABLE_HOOK=1` alias | +| Log file | `hook.log` | `cursor_hook.log` (kept separate to avoid cross-tool log churn) | + +See [`STDIN_SHAPE.md`](STDIN_SHAPE.md) for the per-event schema and +[`website/guide/cursor-hooks.md`](../../website/guide/cursor-hooks.md) for +the full walkthrough with diagrams. + +## Why the followup is on by default (Cursor-specific) + +Unlike the Claude Code hook — which is silent by default because its +background `mempalace mine --mode convos` captures the verbatim transcript +on its own — Cursor's transcript format is **undocumented** and +`mempalace/normalize.py` has **no Cursor parser**. The background mine on +the Cursor `stop`/`preCompact` hooks is therefore **best-effort only**: it +does not yet yield clean verbatim conversation drawers. + +That makes the `followup_message` the **load-bearing verbatim-capture +path** for Cursor — it drives the agent to file its own in-context +verbatim quotes via `mempalace_add_drawer` / `mempalace_diary_write`. +Silencing it by default would leave a default Cursor install capturing +nothing, which is why it is on by default here. Set `MEMPAL_CURSOR_SILENT=1` +(or `MEMPAL_VERBOSE=false`) if you prefer the Claude-style silent +behaviour and accept the reduced capture. Once `normalize.py` learns to +read Cursor transcripts, this default will flip to silent to match Claude. + +## Cost + +Zero extra LLM tokens spent by the hooks themselves. The hooks are local +bash scripts that run on your machine. The followup message the save hook +emits is a normal user turn — it counts the same as any other user message +and does not invoke any extra LLM call beyond the one the user would +otherwise make. 
Suppress it with `MEMPAL_CURSOR_SILENT=1` if you want zero +followups in the chat window. diff --git a/hooks/cursor/STDIN_SHAPE.md b/hooks/cursor/STDIN_SHAPE.md new file mode 100644 index 000000000..2fab0f7de --- /dev/null +++ b/hooks/cursor/STDIN_SHAPE.md @@ -0,0 +1,184 @@ +# Cursor Hook Stdin Shape — Reference + +This file documents the JSON payloads the Cursor IDE sends to the +MemPalace hook scripts in `hooks/cursor/`. It exists so a future +contributor does not have to re-discover the schema by writing a +probe hook. + +**Source:** [`cursor.com/docs/hooks.md`](https://cursor.com/docs/hooks.md), +fetched 2026-05-27. Cursor's hook system is documented as a stable +v1 schema (`{"version": 1, ...}` at the top of `hooks.json`). + +If you suspect Cursor has changed the payload shape since that fetch +date, re-verify against the upstream docs and update both this file +and `hooks/cursor/lib/common.sh::mempal_parse_stdin`. The hook +scripts deliberately ignore fields they do not consume, so adding +new fields is non-breaking. + +## Common fields (all events) + +Every hook receives these on stdin in addition to its event-specific +fields. Source: docs section "Common schema → Input (all hooks)". + +```json +{ + "conversation_id": "string", + "generation_id": "string", + "model": "string", + "hook_event_name": "string", + "cursor_version": "string", + "workspace_roots": [""], + "user_email": "string | null", + "transcript_path": "string | null" +} +``` + +**Field notes (verified):** + +- `conversation_id` is the stable per-conversation ID. The Cursor + `stop` event does **not** carry a `session_id` — only + `conversation_id`. MemPalace keys its counter files on this. Cursor + `sessionStart` does carry a `session_id`, and the docs note it is + "same as `conversation_id`". +- `generation_id` changes every user message. We do not use it. +- `transcript_path` may be `null` if the user has disabled + transcripts in Cursor settings. 
The hooks degrade gracefully when + the value is empty. +- `workspace_roots` is normally a single-entry array but multi-root + workspaces are supported; MemPalace uses index `[0]`. + +## Event-specific fields + +### `stop` (consumed by `mempal_save_hook_cursor.sh`) + +```json +{ + "status": "completed" | "aborted" | "error", + "loop_count": 0 +} +``` + +- `loop_count` indicates how many times this stop hook has already + triggered an automatic followup for this conversation (starts at + 0). When `loop_count > 0` we know our own previous `followup_message` + is currently being processed — the save hook returns `{}` so the + agent can finish. Equivalent to Claude Code's `stop_hook_active`. +- The per-script `loop_limit` (default 5 for Cursor hooks, configurable + via the `loop_limit` field on the hook entry in `hooks.json`) is + defense-in-depth on top of our own check. The example `hooks.json` + in `examples/cursor/` sets `loop_limit: 1`. + +**Allowed output fields** (only): + +```json +{ "followup_message": "" } +``` + +### `preCompact` (consumed by `mempal_precompact_hook_cursor.sh`) + +```json +{ + "trigger": "auto" | "manual", + "context_usage_percent": 85, + "context_tokens": 120000, + "context_window_size": 128000, + "message_count": 45, + "messages_to_compact": 30, + "is_first_compaction": true +} +``` + +**Critical constraint:** preCompact is documented as **observational +only**. It cannot block compaction and its allowed output fields are +limited to: + +```json +{ "user_message": "" } +``` + +There is **no** `followup_message` and **no** `decision: block` on +this event — unlike Claude Code's `PreCompact`. MemPalace works +around this by: + +1. Running `mempalace mine` synchronously inside the hook so the + verbatim transcript lands in the palace before compaction + summarises it. +2. Dropping a `cursor_.pending` marker that the next + `stop` invocation reads and uses to force a save followup + regardless of its counter. 
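
The interplay between the pending marker, the stop counter, and `loop_count` can be modelled as a small decision function. This is an illustrative sketch, not the shipped hook: the function name, the argument names, and the order of the checks are assumptions, and the modulo gate with its floor of 1 is borrowed from the Antigravity stop hook rather than confirmed for the Cursor one.

```python
def should_emit_followup(count, interval, pending, loop_count):
    """Decide whether a stop hook should emit a save followup.

    count      -- stop invocations seen so far for this conversation
    interval   -- save interval (MEMPAL_SAVE_INTERVAL, default 15)
    pending    -- True if a preCompact run left a pending marker
    loop_count -- Cursor's loop_count field from the stop payload
    """
    # Our own previous followup is still being processed: let the agent stop.
    if loop_count > 0:
        return False
    # A compaction happened since the last save: force a followup
    # regardless of where the counter stands.
    if pending:
        return True
    # Otherwise fire on every interval-th stop. Flooring the interval
    # to 1 avoids a modulo by zero when the setting is 0 or empty.
    interval = max(1, interval)
    return count % interval == 0
```

With the default interval, `should_emit_followup(15, 15, False, 0)` fires while `should_emit_followup(14, 15, False, 0)` does not, and a pending marker turns the latter into a save.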
+ +### `sessionStart` (consumed by `mempal_wake_hook_cursor.sh`) + +```json +{ + "session_id": "", + "is_background_agent": true, + "composer_mode": "agent" | "ask" | "edit" +} +``` + +`session_id` equals `conversation_id` on this event (docs are +explicit about this). + +**Allowed output fields:** + +```json +{ + "env": { "": "" }, + "additional_context": "" +} +``` + +`additional_context` is the field MemPalace uses. The schema also +accepts `continue` and `user_message` but the docs explicitly note +"current callers do not enforce them; session creation is not +blocked even when continue is false". We do not emit either. + +## Environment variables (all hooks) + +Cursor sets these env vars on every hook execution; the hook scripts +fall back to them when JSON parsing fails for any reason. + +| Variable | Description | +|---------------------------|---------------------------------------------------| +| `CURSOR_PROJECT_DIR` | Workspace root (= `workspace_roots[0]`) | +| `CURSOR_VERSION` | Cursor version string | +| `CURSOR_USER_EMAIL` | Authenticated user email (if logged in) | +| `CURSOR_TRANSCRIPT_PATH` | Conversation transcript path (if transcripts on) | +| `CURSOR_CODE_REMOTE` | `"true"` if running in a remote workspace | +| `CLAUDE_PROJECT_DIR` | Alias for `CURSOR_PROJECT_DIR` (Claude compat) | + +## Exit code semantics + +Cursor interprets command-hook exit codes as follows +(docs "Hook Types → Command-Based Hooks → Exit code behavior"): + +- `0` — success, use the JSON output. +- `2` — block the action (equivalent to `permission: "deny"`). +- Other — hook failed; action proceeds (fail-open by default). + +MemPalace hooks always exit `0` and emit either `{}` (no-op) or a +valid JSON response. We never use exit code `2`; nothing MemPalace +does should ever block an agent action. + +## Working directory contract + +- **User hooks** (`~/.cursor/hooks.json`) run from `~/.cursor/`. +- **Project hooks** (`.cursor/hooks.json`) run from the project root. 
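
Because the two scopes start from different directories, a relative `command` would point at a different file depending on which `hooks.json` launched it. The sketch below resolves the same relative path against both working directories; `resolve_command` and the example paths are hypothetical and serve only to show why the `command` entry should be absolute.

```python
import posixpath


def resolve_command(command, cwd):
    """Resolve a hook command as a process started in cwd would see it."""
    if posixpath.isabs(command):
        # Absolute paths are unaffected by the working directory.
        return command
    return posixpath.normpath(posixpath.join(cwd, command))


relative = "hooks/cursor/mempal_save_hook_cursor.sh"

# User hooks run from ~/.cursor/, project hooks from the project root.
user_scope = resolve_command(relative, "/home/me/.cursor")
project_scope = resolve_command(relative, "/home/me/proj")

# An absolute command resolves identically from either directory.
absolute = resolve_command(
    "/home/me/.mempalace/hooks/cursor/mempal_save_hook_cursor.sh",
    "/home/me/.cursor",
)
```

Here `user_scope` and `project_scope` end up as two different paths, while `absolute` is returned unchanged.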
+ +The MemPalace hooks always resolve their sibling `lib/common.sh` via +`BASH_SOURCE[0]` so the working directory does not matter for the +script's own loading; only the `command` path in `hooks.json` needs +to point at the absolute location of the script. + +## Transcript file format (out of scope) + +The format of the file at `transcript_path` is **not documented by +Cursor** as of the fetch date above. MemPalace deliberately does not +parse it: the save hook counts `stop` invocations (each one +corresponds to one assistant turn) and hands the transcript to +`mempalace mine`, which has its own normaliser layer. + +If you need to consume the transcript directly, probe its shape with +a throw-away hook that copies the file (for example `cp "$CURSOR_TRANSCRIPT_PATH" /tmp/cursor-transcript-sample.txt`; a bare `cat >` would capture only the hook's stdin payload) +and inspect the copy. There is no shortcut. diff --git a/hooks/cursor/install.sh b/hooks/cursor/install.sh new file mode 100755 index 000000000..cdd4f8526 --- /dev/null +++ b/hooks/cursor/install.sh @@ -0,0 +1,399 @@ +#!/bin/bash +# MEMPALACE CURSOR HOOK INSTALLER +# +# Optional helper. Copies the three Cursor hook scripts to a +# stable install location and merges entries into a Cursor +# `hooks.json` config file, without clobbering unrelated hooks +# already in that file. +# +# This is NEVER auto-invoked. Editor config is sacred; we do not +# modify a user's hooks.json without explicit consent. The user runs +# this script (or wires the hooks manually) as a documented opt-in. +# +# === USAGE === +# +# hooks/cursor/install.sh [options] +# +# Options: +# --scope user|project Target scope. Default: user. +# - user: merges into ~/.cursor/hooks.json +# - project: merges into /.cursor/hooks.json +# --target Project root for --scope project (default: $PWD). +# Ignored for --scope user. +# --install-dir Where to copy the hook scripts. +# Default: ~/.mempalace/hooks/cursor +# --variant full|minimal Which hook set to wire. +# - full: stop + preCompact + sessionStart +# - minimal: stop only +# Default: full. 
+# --dry-run Print the would-be JSON to stdout, do not write +# and do not copy scripts. +# --uninstall Remove MemPalace entries from the target +# hooks.json (preserves unrelated hooks). +# Does NOT delete the installed scripts. +# -h, --help Show this help and exit. +# +# === PORTABILITY === +# +# Pure bash 3.2 + POSIX tools + python3 (which the hook scripts +# themselves already require). No `jq` dependency. +# +# Python helpers are materialised to temp files rather than piped via +# `$(... <<'PYEOF' ... PYEOF)` to dodge the bash 3.2.57 parser bug +# that trips on parens nested inside a heredoc body that lives inside +# a `$(...)` command substitution. + +set -e + +usage() { + sed -n '2,38p' "${BASH_SOURCE[0]}" | sed 's/^# \{0,1\}//' +} + +# ── Defaults ────────────────────────────────────────────────────── +SCOPE="user" +TARGET="" +INSTALL_DIR="$HOME/.mempalace/hooks/cursor" +VARIANT="full" +DRY_RUN=0 +UNINSTALL=0 + +# ── Parse args ──────────────────────────────────────────────────── +while [ $# -gt 0 ]; do + case "$1" in + --scope) + shift + SCOPE="${1:-}" + ;; + --target) + shift + TARGET="${1:-}" + ;; + --install-dir) + shift + INSTALL_DIR="${1:-}" + ;; + --variant) + shift + VARIANT="${1:-}" + ;; + --dry-run) DRY_RUN=1 ;; + --uninstall) UNINSTALL=1 ;; + -h|--help) + usage + exit 0 + ;; + *) + printf 'install.sh: unknown argument: %s\n' "$1" >&2 + usage >&2 + exit 64 + ;; + esac + shift || true +done + +case "$SCOPE" in + user|project) ;; + *) + printf 'install.sh: --scope must be "user" or "project" (got "%s")\n' \ + "$SCOPE" >&2 + exit 64 + ;; +esac + +case "$VARIANT" in + full|minimal) ;; + *) + printf 'install.sh: --variant must be "full" or "minimal" (got "%s")\n' \ + "$VARIANT" >&2 + exit 64 + ;; +esac + +# ── Resolve paths ───────────────────────────────────────────────── +_script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" 2>/dev/null && pwd)" +SOURCE_DIR="$_script_dir" + +# Resolve --install-dir to an absolute path before it gets baked into +# 
hooks.json. Cursor invokes hook commands from its own working +# directory (~/.cursor/ for user hooks, the project root for project +# hooks), so a relative path would silently fail to launch the hook. gh-PR review caught this. +case "$INSTALL_DIR" in + /*) ;; + *) INSTALL_DIR="$PWD/$INSTALL_DIR" ;; +esac + +# Resolve the Python interpreter the same way the hooks themselves do +# so a user with a non-default Python gets the same interpreter at +# install time and at runtime. +if [ -n "${MEMPAL_PYTHON:-}" ] && [ -x "$MEMPAL_PYTHON" ]; then + PYTHON_BIN="$MEMPAL_PYTHON" +else + PYTHON_BIN="$(command -v python3 2>/dev/null || true)" +fi +if [ -z "$PYTHON_BIN" ]; then + printf 'install.sh: python3 not found on PATH; cannot proceed.\n' >&2 + printf 'Set $MEMPAL_PYTHON to an interpreter path or install python3.\n' >&2 + exit 1 +fi + +# Determine the target hooks.json path. +case "$SCOPE" in + user) + if [ -n "$TARGET" ]; then + printf 'install.sh: --target is only meaningful with --scope project; ignoring.\n' >&2 + fi + TARGET_DIR="$HOME/.cursor" + ;; + project) + TARGET_DIR="${TARGET:-$PWD}/.cursor" + ;; +esac +TARGET_FILE="$TARGET_DIR/hooks.json" + +# Determine which commands the merge / uninstall logic should +# install or remove. Paths point at the install location, NOT the +# source repo, so once the user runs install.sh they can move / delete +# the cloned repo without breaking the wiring. 
+SAVE_CMD="$INSTALL_DIR/mempal_save_hook_cursor.sh" +PRECOMPACT_CMD="$INSTALL_DIR/mempal_precompact_hook_cursor.sh" +WAKE_CMD="$INSTALL_DIR/mempal_wake_hook_cursor.sh" + +# ── Step 1: copy scripts (skipped on --dry-run / --uninstall) ───── +if [ "$UNINSTALL" -eq 0 ] && [ "$DRY_RUN" -eq 0 ]; then + mkdir -p "$INSTALL_DIR/lib" + cp "$SOURCE_DIR/lib/common.sh" "$INSTALL_DIR/lib/common.sh" + cp "$SOURCE_DIR/mempal_save_hook_cursor.sh" "$INSTALL_DIR/" + cp "$SOURCE_DIR/mempal_precompact_hook_cursor.sh" "$INSTALL_DIR/" + cp "$SOURCE_DIR/mempal_wake_hook_cursor.sh" "$INSTALL_DIR/" + chmod +x "$INSTALL_DIR/mempal_save_hook_cursor.sh" \ + "$INSTALL_DIR/mempal_precompact_hook_cursor.sh" \ + "$INSTALL_DIR/mempal_wake_hook_cursor.sh" + printf 'install.sh: copied scripts to %s\n' "$INSTALL_DIR" >&2 +fi + +# ── Short-circuit: uninstall with no existing file is a no-op ───── +# +# Without this, the merge step would happily write an empty +# {"version": 1, "hooks": {}} to a brand new file that the user +# never asked us to create — surprising behaviour that the test +# suite explicitly guards against. +if [ "$UNINSTALL" -eq 1 ] && [ ! -f "$TARGET_FILE" ]; then + printf 'install.sh: nothing to uninstall (%s does not exist)\n' \ + "$TARGET_FILE" >&2 + exit 0 +fi + +# ── Step 2: merge / unmerge hooks.json via python3 ──────────────── +# +# Materialise the merge logic to a temp .py file (see bash 3.2 +# rationale at the top of this script), then invoke it. The Python +# script is responsible for: +# * tolerating a missing or empty hooks.json (starts from {}) +# * preserving unrelated hook entries on install +# * preserving unrelated hook entries on uninstall +# * recognising MemPalace entries by basename in the `command` field +# * idempotent install (re-running does not duplicate entries) + +# mktemp portability: pass an explicit absolute template so we sidestep +# the BSD vs GNU difference in `-t` semantics (BSD treats it as a +# prefix; GNU treats it as a template). 
Honour TMPDIR if set, fall +# back to /tmp. gh-PR review caught the previous `-t` form as +# non-portable. +MERGE_PY="$(mktemp "${TMPDIR:-/tmp}/mempal-install-merge.XXXXXX")" +trap 'rm -f "$MERGE_PY"' EXIT + +cat > "$MERGE_PY" <<'PYEOF' +"""hooks.json merge helper for hooks/cursor/install.sh. + +Argv: + sys.argv[1]: path to hooks.json (may not exist) + sys.argv[2]: variant ("full" or "minimal") + sys.argv[3]: uninstall flag ("1" or "0") + sys.argv[4]: save_cmd absolute path + sys.argv[5]: precompact_cmd absolute path + sys.argv[6]: wake_cmd absolute path + +Output: prints the merged JSON to stdout. Exits 2 on a malformed +existing config (refuses to overwrite a broken file). +""" +import json +import os +import sys + +target_file = sys.argv[1] +variant = sys.argv[2] +uninstall = sys.argv[3] == "1" +save_cmd = sys.argv[4] +precompact_cmd = sys.argv[5] +wake_cmd = sys.argv[6] + +# Recognise our entries by basename. The three filenames below are +# the unique product-of-our-naming convention; any entry whose command +# ends in one of them is treated as a MemPalace entry on +# install (so we replace rather than duplicate it) and on uninstall +# (so we remove it without touching unrelated entries). Matching on +# basename rather than a full-path substring lets users pick any +# --install-dir without breaking uninstall. +MEMPAL_BASENAMES = ( + "mempal_save_hook_cursor.sh", + "mempal_precompact_hook_cursor.sh", + "mempal_wake_hook_cursor.sh", +) + +if os.path.exists(target_file): + with open(target_file, "r", encoding="utf-8") as fh: + try: + cfg = json.load(fh) + except Exception as exc: + sys.stderr.write( + "install.sh: existing %s is not valid JSON: %s\n" + "Refusing to overwrite. 
Fix the file and retry.\n" + % (target_file, exc) + ) + sys.exit(2) +else: + cfg = {} + +if not isinstance(cfg, dict): + sys.stderr.write( + "install.sh: %s top level must be a JSON object; got %s\n" + % (target_file, type(cfg).__name__) + ) + sys.exit(2) + +cfg.setdefault("version", 1) +cfg.setdefault("hooks", {}) +if not isinstance(cfg["hooks"], dict): + sys.stderr.write( + "install.sh: %s 'hooks' must be a JSON object\n" % target_file + ) + sys.exit(2) + + +def is_mempal_entry(entry): + if not isinstance(entry, dict): + return False + cmd = entry.get("command", "") + if not isinstance(cmd, str): + return False + # Match on basename so a customised --install-dir (e.g. /opt/..., + # ~/.local/share/..., or anything with or without a leading dot) + # still round-trips through uninstall. + base = os.path.basename(cmd) + return base in MEMPAL_BASENAMES + + +def filter_mempal(entries): + if not isinstance(entries, list): + return entries + return [e for e in entries if not is_mempal_entry(e)] + + +def upsert(event, entry): + existing = cfg["hooks"].get(event, []) + if not isinstance(existing, list): + sys.stderr.write( + "install.sh: %s hooks[%s] must be a list\n" % (target_file, event) + ) + sys.exit(2) + cleaned = [e for e in existing if not is_mempal_entry(e)] + cleaned.append(entry) + cfg["hooks"][event] = cleaned + + +if uninstall: + for event in list(cfg["hooks"].keys()): + cfg["hooks"][event] = filter_mempal(cfg["hooks"][event]) + if not cfg["hooks"][event]: + del cfg["hooks"][event] +else: + upsert("stop", {"command": save_cmd, "loop_limit": 1}) + if variant == "full": + upsert("preCompact", {"command": precompact_cmd}) + upsert("sessionStart", {"command": wake_cmd}) + +# Stable key order for the events MemPalace touches, then preserve +# any unrelated event names in their original order so future Cursor +# events we don't know about yet still round-trip. 
+known_order = [ + "sessionStart", + "stop", + "preCompact", + "sessionEnd", + "preToolUse", + "postToolUse", + "postToolUseFailure", + "subagentStart", + "subagentStop", + "beforeShellExecution", + "afterShellExecution", + "beforeMCPExecution", + "afterMCPExecution", + "beforeReadFile", + "afterFileEdit", + "beforeSubmitPrompt", + "afterAgentResponse", + "afterAgentThought", + "beforeTabFileRead", + "afterTabFileEdit", + "workspaceOpen", +] + +ordered_hooks = {} +for event in known_order: + if event in cfg["hooks"]: + ordered_hooks[event] = cfg["hooks"][event] +for event, entries in cfg["hooks"].items(): + if event not in ordered_hooks: + ordered_hooks[event] = entries +cfg["hooks"] = ordered_hooks + +print(json.dumps(cfg, indent=2, sort_keys=False)) +PYEOF + +NEW_JSON="$("$PYTHON_BIN" "$MERGE_PY" \ + "$TARGET_FILE" "$VARIANT" "$UNINSTALL" \ + "$SAVE_CMD" "$PRECOMPACT_CMD" "$WAKE_CMD")" + +# ── Step 3: emit, write, or remove ──────────────────────────────── +if [ "$DRY_RUN" -eq 1 ]; then + printf 'install.sh: --dry-run; would write to %s\n' "$TARGET_FILE" >&2 + printf '%s\n' "$NEW_JSON" + exit 0 +fi + +mkdir -p "$TARGET_DIR" + +# If --uninstall left an empty hooks object AND no other top-level +# keys beyond version, remove the file entirely so the user's +# `.cursor/` directory does not accumulate orphan configs. +if [ "$UNINSTALL" -eq 1 ]; then + # Inline the emptiness check via `python -c '...'` rather than a + # temp .py file. The body is short enough that a tmpfile is pure + # overhead, and removing the tmpfile eliminates a small leak + # window if the script is interrupted between mktemp and rm -f. + # gh-PR review suggested this simplification. 
+ NON_EMPTY="$(printf '%s' "$NEW_JSON" | "$PYTHON_BIN" -c ' +import json, sys +cfg = json.load(sys.stdin) +hooks = cfg.get("hooks", {}) +extras = [k for k in cfg.keys() if k not in ("version", "hooks")] +print("1" if (hooks or extras) else "0") +')" + if [ "$NON_EMPTY" = "0" ] && [ -f "$TARGET_FILE" ]; then + rm -f "$TARGET_FILE" + printf 'install.sh: removed empty %s\n' "$TARGET_FILE" >&2 + exit 0 + fi +fi + +TMP_FILE="${TARGET_FILE}.tmp.$$" +printf '%s\n' "$NEW_JSON" > "$TMP_FILE" +mv "$TMP_FILE" "$TARGET_FILE" + +if [ "$UNINSTALL" -eq 1 ]; then + printf 'install.sh: removed MemPalace entries from %s\n' "$TARGET_FILE" >&2 +else + printf 'install.sh: wrote %s\n' "$TARGET_FILE" >&2 + printf 'install.sh: restart Cursor (or wait for it to reload hooks.json)\n' >&2 +fi diff --git a/hooks/cursor/lib/common.sh b/hooks/cursor/lib/common.sh new file mode 100644 index 000000000..f7d5dfad3 --- /dev/null +++ b/hooks/cursor/lib/common.sh @@ -0,0 +1,490 @@ +# shellcheck shell=bash +# MEMPALACE CURSOR HOOK — shared helpers +# +# Sourced by the three Cursor hooks (stop / preCompact / sessionStart). 
+# Mirrors the conventions of the existing Claude Code hook scripts +# (hooks/mempal_save_hook.sh, hooks/mempal_precompact_hook.sh) so a +# user who already debugs one knows how to debug the other: +# +# * STATE_DIR layout under ~/.mempalace/hook_state/ +# * MEMPAL_PYTHON resolution order (override → $PATH → bare python3) +# * MEMPALACE_HOOKS_AUTO_SAVE=false kill switch (config.json fallback) +# * sentinel-guarded Python parser via `sed -n 'Np'` (bash 3.2 safe) +# * fail-open on internal errors: emit `{}` and log, never crash the +# hook host +# +# Cursor-specific additions on top of that contract: +# +# * MEMPAL_DISABLE_HOOK=1 as an additional kill-switch alias +# * MEMPAL_STATE_DIR env override for the state directory +# * conversation_id (Cursor's stable per-conversation ID) replaces +# Claude Code's session_id in the counter file names — Cursor `stop` +# events do not carry a session_id, only conversation_id +# * loop_count is the loop-prevention signal in place of Claude Code's +# stop_hook_active flag (Cursor docs, "stop" event) +# +# This file is sourced, not executed, so it intentionally has no +# shebang. The `# shellcheck shell=bash` directive above tells +# shellcheck to treat it as bash when run standalone. + +# ── State directory + log path ──────────────────────────────────────── +# +# Honour MEMPAL_STATE_DIR (additive override introduced for Cursor) +# while keeping the default identical to the Claude Code hooks so a +# user running both keeps a single state directory. +MEMPAL_STATE_DIR="${MEMPAL_STATE_DIR:-$HOME/.mempalace/hook_state}" +mkdir -p "$MEMPAL_STATE_DIR" 2>/dev/null +MEMPAL_CURSOR_LOG="$MEMPAL_STATE_DIR/cursor_hook.log" + +# ── Python interpreter resolution ───────────────────────────────────── +# +# Same contract as the Claude Code hooks: +# 1. $MEMPAL_PYTHON — explicit user override (absolute path) +# 2. $(command -v python3) — first python3 on the hook's PATH +# 3. 
bare "python3" — last-resort fallback +mempal_resolve_python() { + local p="${MEMPAL_PYTHON:-}" + if [ -n "$p" ] && [ -x "$p" ]; then + printf '%s' "$p" + return 0 + fi + p="$(command -v python3 2>/dev/null || true)" + if [ -n "$p" ]; then + printf '%s' "$p" + return 0 + fi + printf '%s' "python3" +} +MEMPAL_PYTHON_BIN="$(mempal_resolve_python)" + +# ── Logging ─────────────────────────────────────────────────────────── +# +# Lines are `[ISO8601Z] [event=...] [conv=...] message`. ISO8601 keeps +# the format greppable across timezones (the Claude Code log uses +# %H:%M:%S which loses the date — we improve on that here without +# changing the existing log file). +mempal_log() { + local event="${1:-?}" + local conv="${2:-unknown}" + local msg="${3:-}" + local ts + ts="$(date -u '+%Y-%m-%dT%H:%M:%SZ')" + printf '[%s] [event=%s] [conv=%s] %s\n' "$ts" "$event" "$conv" "$msg" \ + >> "$MEMPAL_CURSOR_LOG" 2>/dev/null +} + +# ── Kill switch ─────────────────────────────────────────────────────── +# +# Disabled if ANY of: +# * MEMPAL_DISABLE_HOOK is a truthy string (Cursor-prompt addition) +# * MEMPALACE_HOOKS_AUTO_SAVE is false/0/no (Claude Code convention) +# * ~/.mempalace/config.json has hooks.auto_save == false +# +# Returns 0 (true in shell) when disabled, 1 when enabled. +mempal_is_disabled() { + case "${MEMPAL_DISABLE_HOOK:-}" in + 1|true|yes|on) return 0 ;; + esac + case "${MEMPALACE_HOOKS_AUTO_SAVE:-}" in + false|0|no|off) return 0 ;; + esac + local cfg="$HOME/.mempalace/config.json" + if [ -f "$cfg" ]; then + local result + # Use python -c '...' with the config path as argv[1] rather + # than a heredoc. A heredoc body that contains parens inside a + # $(...) command substitution trips the bash 3.2.57 parser + # bug (macOS /bin/bash default) — gh-PR review caught this. + # The -c form is also consistent with mempal_parse_stdin below. 
+ result="$("$MEMPAL_PYTHON_BIN" -c ' +import json, sys +try: + with open(sys.argv[1]) as f: + cfg = json.load(f) + print(str(cfg.get("hooks", {}).get("auto_save", True)).lower()) +except Exception: + print("true") +' "$cfg" 2>/dev/null)" + if [ "$result" = "false" ]; then + return 0 + fi + fi + return 1 +} + +# ── Stdin parser ────────────────────────────────────────────────────── +# +# Reads Cursor's hook JSON from $1 and sets (as plain shell variables, +# not exported): +# MEMPAL_CONV_ID — conversation_id (or session_id if absent), +# falls back to "unknown" +# MEMPAL_LOOP_COUNT — integer (0 if absent / non-numeric) +# MEMPAL_TRANSCRIPT — transcript_path, falls back to the +# CURSOR_TRANSCRIPT_PATH env var, may be empty +# MEMPAL_WORKSPACE — first workspace_roots entry, falls back to +# CURSOR_PROJECT_DIR, then CLAUDE_PROJECT_DIR, +# then $PWD +# MEMPAL_TRIGGER — preCompact trigger ("auto" | "manual"), empty otherwise +# MEMPAL_STATUS — stop status ("completed" | "aborted" | "error"), +# empty otherwise +# MEMPAL_PARSE_OK — "1" if parser ran cleanly, "0" otherwise +# +# Uses the same sentinel + `sed -n 'Np'` extraction as the Claude Code +# hooks for bash 3.2 compatibility (mapfile/readarray are unavailable +# on macOS /bin/bash 3.2.57; #1440 regression). Each line of output is +# pre-sanitised by the Python side to a shell-safe character set. +mempal_parse_stdin() { + local input="${1:-}" + local parsed + # We invoke Python via -c with a single-quoted multi-line string + # rather than ``python3 - <<'PYEOF'`` because the heredoc form + # would shadow Python's stdin with the heredoc body, leaving + # ``json.load(sys.stdin)`` to read nothing and silently fail. The + # parser body deliberately uses only double-quoted Python strings + # so the surrounding bash single-quote is safe verbatim, and + # restricts its output to the shell-safe character set + # (alphanumeric, underscore, dash, slash, dot, tilde) matching the + # Claude Code hook's sanitiser so a hostile transcript_path cannot + # splice metacharacters into the parsed output. 
+ parsed="$( + umask 077 + printf '%s' "$input" | "$MEMPAL_PYTHON_BIN" -c ' +import json, re, sys + +def safe_str(value): + return re.sub(r"[^a-zA-Z0-9_/.\-~]", "", str(value or "")) + +def safe_int(value): + try: + return str(int(value)) + except (TypeError, ValueError): + return "0" + +try: + data = json.load(sys.stdin) +except Exception: + sys.exit(1) +if not isinstance(data, dict): + sys.exit(1) + +conv = safe_str(data.get("conversation_id") or data.get("session_id")) +loop_count = safe_int(data.get("loop_count", 0)) +transcript = safe_str(data.get("transcript_path", "")) +trigger = safe_str(data.get("trigger", "")) +status = safe_str(data.get("status", "")) + +roots = data.get("workspace_roots") or [] +workspace = "" +if isinstance(roots, list) and roots: + workspace = safe_str(roots[0]) + +print("__MEMPAL_PARSE_OK__") +print(conv) +print(loop_count) +print(transcript) +print(workspace) +print(trigger) +print(status) +' 2>"$MEMPAL_STATE_DIR/cursor_last_python_err.log" + )" + + # Drop empty stderr capture on success; lock it to 0600 on failure + # (mirrors the privacy contract in the Claude Code hooks — the + # traceback can echo back transcript_path / home layout). 
+ if [ -s "$MEMPAL_STATE_DIR/cursor_last_python_err.log" ]; then + chmod 600 "$MEMPAL_STATE_DIR/cursor_last_python_err.log" 2>/dev/null + else + rm -f "$MEMPAL_STATE_DIR/cursor_last_python_err.log" 2>/dev/null + fi + + local marker + marker="$(printf '%s\n' "$parsed" | sed -n '1p')" + if [ "$marker" = "__MEMPAL_PARSE_OK__" ]; then + MEMPAL_PARSE_OK="1" + MEMPAL_CONV_ID="$(printf '%s\n' "$parsed" | sed -n '2p')" + MEMPAL_LOOP_COUNT="$(printf '%s\n' "$parsed" | sed -n '3p')" + MEMPAL_TRANSCRIPT="$(printf '%s\n' "$parsed" | sed -n '4p')" + MEMPAL_WORKSPACE="$(printf '%s\n' "$parsed" | sed -n '5p')" + MEMPAL_TRIGGER="$(printf '%s\n' "$parsed" | sed -n '6p')" + MEMPAL_STATUS="$(printf '%s\n' "$parsed" | sed -n '7p')" + else + MEMPAL_PARSE_OK="0" + MEMPAL_CONV_ID="" + MEMPAL_LOOP_COUNT="0" + MEMPAL_TRANSCRIPT="" + MEMPAL_WORKSPACE="" + MEMPAL_TRIGGER="" + MEMPAL_STATUS="" + fi + + # Defaults and environment fallbacks. The Cursor docs guarantee + # CURSOR_TRANSCRIPT_PATH and CURSOR_PROJECT_DIR env vars are set + # for every hook execution; if JSON parsing failed for any reason + # (sentinel missing, malformed payload, missing interpreter) we + # still have a usable workspace. + MEMPAL_CONV_ID="${MEMPAL_CONV_ID:-unknown}" + case "$MEMPAL_LOOP_COUNT" in + ''|*[!0-9]*) MEMPAL_LOOP_COUNT="0" ;; + esac + if [ -z "$MEMPAL_TRANSCRIPT" ] && [ -n "${CURSOR_TRANSCRIPT_PATH:-}" ]; then + MEMPAL_TRANSCRIPT="${CURSOR_TRANSCRIPT_PATH}" + fi + if [ -z "$MEMPAL_WORKSPACE" ]; then + if [ -n "${CURSOR_PROJECT_DIR:-}" ]; then + MEMPAL_WORKSPACE="${CURSOR_PROJECT_DIR}" + elif [ -n "${CLAUDE_PROJECT_DIR:-}" ]; then + MEMPAL_WORKSPACE="${CLAUDE_PROJECT_DIR}" + else + MEMPAL_WORKSPACE="${PWD:-/}" + fi + fi + + # Expand a leading ~ in the transcript path so downstream + # ``[ -f "$path" ]`` checks resolve correctly. 
+ case "$MEMPAL_TRANSCRIPT" in + '~/'*) MEMPAL_TRANSCRIPT="$HOME/${MEMPAL_TRANSCRIPT#~/}" ;; + esac +} + +# ── Defense-in-depth: dump unparseable stdin ────────────────────────── +# +# Same shape as the Claude Code hooks' last_input.log: bounded to 4096 +# bytes, overwritten (never appended) so a misconfiguration loop cannot +# grow disk usage, 0600 perms because the dump mirrors the raw hook +# payload (transcript_path reveals the user's home + project layout). +mempal_dump_bad_input() { + local input="${1:-}" + if [ -z "$input" ]; then + return 0 + fi + mempal_log "${2:-?}" "${MEMPAL_CONV_ID:-unknown}" \ + "WARN: input parse failed (sentinel missing); see $MEMPAL_STATE_DIR/cursor_last_input.log + cursor_last_python_err.log" + ( + umask 077 + printf '%s' "$input" | head -c 4096 > "$MEMPAL_STATE_DIR/cursor_last_input.log" + ) + chmod 600 "$MEMPAL_STATE_DIR/cursor_last_input.log" 2>/dev/null +} + +# ── Counter helpers ─────────────────────────────────────────────────── +# +# One counter file per conversation_id. Atomic write via temp file +# inside the same directory (rename is atomic on POSIX) so concurrent +# hook invocations cannot half-write the file. Read tolerates a +# corrupted or empty file by returning 0, never crashing. +_mempal_counter_path() { + local conv="${1:-unknown}" + # Sanitise the conv id one more time: it has already been through + # the Python sanitiser, but be defensive in case a caller passes a + # raw string. Strip any character outside [a-zA-Z0-9_.-]. + local safe_conv + safe_conv="$(printf '%s' "$conv" | tr -cd 'a-zA-Z0-9_.-')" + if [ -z "$safe_conv" ]; then + safe_conv="unknown" + fi + printf '%s/cursor_%s.count' "$MEMPAL_STATE_DIR" "$safe_conv" +} + +mempal_read_counter() { + local path="$1" + if [ ! 
-f "$path" ]; then + printf '0' + return 0 + fi + local raw + raw="$(cat "$path" 2>/dev/null)" + case "$raw" in + ''|*[!0-9]*) printf '0' ;; + *) printf '%s' "$raw" ;; + esac +} + +mempal_write_counter_atomic() { + local path="$1" + local value="$2" + case "$value" in + ''|*[!0-9]*) value="0" ;; + esac + local tmp="${path}.tmp.$$" + printf '%s' "$value" > "$tmp" 2>/dev/null || return 1 + mv "$tmp" "$path" 2>/dev/null || { + rm -f "$tmp" 2>/dev/null + return 1 + } +} + +# ── Pending-save marker ─────────────────────────────────────────────── +# +# Dropped by the preCompact hook (which cannot itself emit a +# followup_message — Cursor's preCompact is observational-only) and +# consumed by the next stop invocation so the LLM still gets a diary +# nudge after compaction. +_mempal_pending_path() { + local conv="${1:-unknown}" + local safe_conv + safe_conv="$(printf '%s' "$conv" | tr -cd 'a-zA-Z0-9_.-')" + if [ -z "$safe_conv" ]; then + safe_conv="unknown" + fi + printf '%s/cursor_%s.pending' "$MEMPAL_STATE_DIR" "$safe_conv" +} + +mempal_set_pending() { + local conv="${1:-unknown}" + local path + path="$(_mempal_pending_path "$conv")" + : > "$path" 2>/dev/null || return 1 + chmod 600 "$path" 2>/dev/null +} + +mempal_consume_pending() { + local conv="${1:-unknown}" + local path + path="$(_mempal_pending_path "$conv")" + if [ -f "$path" ]; then + rm -f "$path" 2>/dev/null + return 0 + fi + return 1 +} + +# ── State-file TTL ──────────────────────────────────────────────────── +# +# Per-conversation state artifacts (cursor_.count and +# cursor_.pending) accumulate one set per conversation and are +# never otherwise removed (igorls review, PR #1632 — unbounded state +# growth). Reads MEMPAL_STATE_TTL_DAYS (default 30), validated +# digits-only and leading-zero-stripped (mirrors the SAVE_INTERVAL +# sanitiser) so `find -mtime` never sees a bad or octal token. Empty or +# non-numeric floors to 30; a value of 0 means "sweep everything older +# than today". 
+mempal_state_ttl_days() { + local raw="${MEMPAL_STATE_TTL_DAYS:-30}" + case "$raw" in + ''|*[!0-9]*) printf '30'; return 0 ;; + esac + while [ "${raw}" != "${raw#0}" ] && [ "${#raw}" -gt 1 ]; do + raw="${raw#0}" + done + printf '%s' "$raw" +} + +# ── Stale state GC ──────────────────────────────────────────────────── +# +# Opportunistic sweep of per-conversation Cursor state older than the +# TTL. Throttled to at most once per 24h via the cursor_last_sweep +# marker, so it costs a single mtime comparison on the vast majority of +# fires. When it does run, two `find` passes remove the stale counter +# files and pending markers. +# +# The globs are Cursor-specific and suffix-anchored (cursor_*.count, +# cursor_*.pending), so the shared logs (cursor_hook.log, +# cursor_last_input.log, cursor_last_python_err.log), the +# cursor_last_sweep marker itself, and any antigravity_*/Claude state +# sharing the same directory are never touched. BSD find (macOS default) +# and GNU find both accept -maxdepth, -mtime +N, and -exec ... +. +# +# Fail-open: every step is best-effort; a missing state dir, a find that +# errors, or a permission problem must never abort the caller. +mempal_gc_stale_state() { + [ -d "$MEMPAL_STATE_DIR" ] || return 0 + + local marker="$MEMPAL_STATE_DIR/cursor_last_sweep" + if [ -f "$marker" ]; then + local mtime now + if mtime=$("$MEMPAL_PYTHON_BIN" -c 'import os, sys; print(int(os.path.getmtime(sys.argv[1])))' "$marker" 2>/dev/null) \ + && now=$(date '+%s' 2>/dev/null) \ + && [ -n "$mtime" ] \ + && [ "$((now - mtime))" -lt 86400 ]; then + return 0 + fi + fi + # Touch the marker first so a crash mid-sweep still throttles the + # next fire (better to skip a sweep than to hammer the disk). 
+ : > "$marker" 2>/dev/null + + local ttl + ttl=$(mempal_state_ttl_days) + + find "$MEMPAL_STATE_DIR" -maxdepth 1 -type f \ + -name 'cursor_*.count' -mtime +"$ttl" \ + -exec rm -f {} + 2>/dev/null + find "$MEMPAL_STATE_DIR" -maxdepth 1 -type f \ + -name 'cursor_*.pending' -mtime +"$ttl" \ + -exec rm -f {} + 2>/dev/null + + return 0 +} + +# ── Workspace → wing inference ──────────────────────────────────────── +# +# basename(workspace_root), normalised to [a-z0-9_-]. Edge cases: +# / → "root" +# /path/ → trailing slash stripped, then basename +# "/foo bar/" → "foo_bar" (spaces collapsed to underscores) +# "" → "cursor_session" +# "C:\\proj" → "proj" (Windows-style path; basename via tr fallback) +# +# We intentionally keep this in pure bash + POSIX tools so the +# inference is identical across the hook scripts and the test suite +# can target it as a function via `bash -c 'source ...; ...'`. +mempal_infer_wing() { + local raw="${1:-}" + if [ -z "$raw" ]; then + printf 'cursor_session' + return 0 + fi + # Strip trailing slashes (but preserve the lone "/" case). + while [ "$raw" != "/" ] && [ "${raw%/}" != "$raw" ]; do + raw="${raw%/}" + done + if [ "$raw" = "/" ]; then + printf 'root' + return 0 + fi + local base="${raw##*/}" + # On Windows-style paths with backslashes, fall back to splitting + # on backslash too so we don't return the whole path verbatim. + case "$base" in + *\\*) base="${base##*\\}" ;; + esac + # Lowercase + replace anything outside [a-z0-9_-] with underscore. + # Collapse runs of underscores so "foo bar" doesn't become + # "foo__bar". 
+ base="$(printf '%s' "$base" \ + | tr '[:upper:]' '[:lower:]' \ + | tr -c 'a-z0-9_-' '_' \ + | tr -s '_' \ + | sed 's/^_//; s/_$//')" + if [ -z "$base" ]; then + printf 'cursor_session' + return 0 + fi + printf '%s' "$base" +} + +# ── Transcript path validation ──────────────────────────────────────── +# +# Mirrors hooks/mempal_save_hook.sh::is_valid_transcript_path so the +# Cursor and Claude Code hooks reject the same shapes: +# * non-empty +# * .json or .jsonl suffix +# * no .. traversal segments +mempal_is_valid_transcript() { + local path="${1:-}" + [ -n "$path" ] || return 1 + case "$path" in + *.json|*.jsonl) ;; + *) return 1 ;; + esac + case "/$path/" in + */../*) return 1 ;; + esac + return 0 +} + +# ── JSON emit ───────────────────────────────────────────────────────── +# +# Final stdout write. Uses ``printf '%s'`` instead of ``echo`` because +# echo interprets ``-n``/``-e``/``-E`` as flags and varies in backslash +# handling between builtin and /bin/echo (xpg_echo shopt). Matches the +# Claude Code hook's documented rationale. +mempal_emit() { + printf '%s\n' "${1:-{\}}" +} diff --git a/hooks/cursor/mempal_precompact_hook_cursor.sh b/hooks/cursor/mempal_precompact_hook_cursor.sh new file mode 100755 index 000000000..dcb76e4be --- /dev/null +++ b/hooks/cursor/mempal_precompact_hook_cursor.sh @@ -0,0 +1,133 @@ +#!/bin/bash +# MEMPALACE CURSOR PRE-COMPACT HOOK — Snapshot transcript before compaction +# +# Cursor "preCompact" hook. Cursor's preCompact is documented as +# OBSERVATIONAL ONLY (cursor.com/docs/hooks.md fetched 2026-05-27): +# +# * It cannot block compaction. +# * Its only output field is `user_message` (no `followup_message`, +# no `decision: block`). +# +# So unlike the Claude Code PreCompact hook (which can block the AI +# and force a save before compaction proceeds), the Cursor preCompact +# hook can only do two useful things at this moment: +# +# 1. 
Run `mempalace mine` SYNCHRONOUSLY against the transcript file +# so whatever Cursor's transcript contains is ingested BEFORE +# Cursor summarises the conversation — zero LLM cost, no agent +# interaction needed. NOTE: this is BEST-EFFORT for Cursor. +# Cursor's transcript format is undocumented and normalize.py has +# no Cursor parser, so this does not yet produce clean verbatim +# drawers; it is a safety net, not the primary capture path. +# +# 2. Drop a `.pending` marker file keyed on conversation_id. The +# next `stop` hook reads that marker and forces a save followup +# regardless of its counter, so the AI still gets a "write a +# diary entry now" nudge on the very next turn. THIS followup is +# the load-bearing verbatim-capture path for Cursor (the agent +# files its own in-context verbatim quotes via the MCP tools). +# +# === INSTALL === +# +# Add to ~/.cursor/hooks.json (or .cursor/hooks.json for project +# scope) under "preCompact": +# +# { +# "version": 1, +# "hooks": { +# "preCompact": [ +# { "command": "/absolute/path/to/mempal_precompact_hook_cursor.sh" } +# ] +# } +# } +# +# No loop_limit is needed; preCompact is not a looping hook. + +_mempal_self="${BASH_SOURCE[0]:-$0}" +_mempal_dir="$(cd "$(dirname "$_mempal_self")" 2>/dev/null && pwd)" +# shellcheck source=lib/common.sh +. "$_mempal_dir/lib/common.sh" + +# Optional additional project directory to mine before compaction +# (parity with the Claude Code hook's MEMPAL_DIR knob). +MEMPAL_DIR="${MEMPAL_DIR:-}" + +if mempal_is_disabled; then + mempal_emit '{}' + exit 0 +fi + +# Opportunistic, daily-throttled GC of stale per-conversation state. +# Placed after the kill switch so a disabled hook touches nothing. 
+mempal_gc_stale_state + +INPUT="$(cat)" +mempal_parse_stdin "$INPUT" + +if [ "$MEMPAL_PARSE_OK" != "1" ]; then + mempal_dump_bad_input "$INPUT" "preCompact" + mempal_emit '{}' + exit 0 +fi + +mempal_log "preCompact" "$MEMPAL_CONV_ID" \ + "trigger=${MEMPAL_TRIGGER:-?} transcript=$MEMPAL_TRANSCRIPT" + +# ── Synchronous mine ────────────────────────────────────────────── +# +# This intentionally blocks the hook. Compaction is irreversible — +# once Cursor summarises the conversation we cannot get the verbatim +# text back — so we must finish ingesting before returning. Background +# mining would race the compaction and lose data. +# +# TIMEOUT TRADEOFF (igorls review, PR #1632): on a very large transcript +# this synchronous mine can exceed Cursor's per-hook timeout, in which +# case Cursor kills the process mid-mine. That is acceptable and safe +# here: `mempalace mine` is incremental and append-only (a crash mid- +# operation leaves the existing palace untouched — see CLAUDE.md +# "Incremental only"), so a killed mine simply resumes on the next mine +# invocation rather than corrupting the palace. We deliberately do NOT +# wrap this in a shorter timeout, because truncating the mine would +# trade a recoverable partial-ingest for guaranteed silent data loss +# right before the irreversible compaction. The pending-save marker +# below is the backstop: the next `stop` hook consumes it and nudges a +# verbatim save regardless of its counter and of whether this mine +# completed. 
+if command -v mempalace >/dev/null 2>&1; then + if mempal_is_valid_transcript "$MEMPAL_TRANSCRIPT" \ + && [ -f "$MEMPAL_TRANSCRIPT" ]; then + mempalace mine "$(dirname "$MEMPAL_TRANSCRIPT")" --mode convos \ + >> "$MEMPAL_CURSOR_LOG" 2>&1 || \ + mempal_log "preCompact" "$MEMPAL_CONV_ID" \ + "WARN: mempalace mine convos returned non-zero" + elif [ -n "$MEMPAL_TRANSCRIPT" ]; then + mempal_log "preCompact" "$MEMPAL_CONV_ID" \ + "skipping invalid transcript path: $MEMPAL_TRANSCRIPT" + fi + if [ -n "$MEMPAL_DIR" ] && [ -d "$MEMPAL_DIR" ]; then + mempalace mine "$MEMPAL_DIR" --mode projects \ + >> "$MEMPAL_CURSOR_LOG" 2>&1 || \ + mempal_log "preCompact" "$MEMPAL_CONV_ID" \ + "WARN: mempalace mine projects returned non-zero" + fi +else + mempal_log "preCompact" "$MEMPAL_CONV_ID" \ + "mempalace CLI not on PATH; skipping synchronous mine" +fi + +# ── Drop the pending-save marker ────────────────────────────────── +mempal_set_pending "$MEMPAL_CONV_ID" || \ + mempal_log "preCompact" "$MEMPAL_CONV_ID" \ + "WARN: could not write pending-save marker" + +# Surface a short user-visible note that compaction is about to +# happen and we've already captured the verbatim text. user_message +# is the only output field Cursor's preCompact accepts. +"$MEMPAL_PYTHON_BIN" -c ' +import json +print(json.dumps({ + "user_message": ( + "MemPalace: transcript snapshotted before compaction. " + "A diary nudge is queued for the next agent turn." + ) +})) +' diff --git a/hooks/cursor/mempal_save_hook_cursor.sh b/hooks/cursor/mempal_save_hook_cursor.sh new file mode 100755 index 000000000..8a3290dd6 --- /dev/null +++ b/hooks/cursor/mempal_save_hook_cursor.sh @@ -0,0 +1,269 @@ +#!/bin/bash +# MEMPALACE CURSOR SAVE HOOK — Auto-save every N stop events +# +# Cursor "stop" hook. After every agent loop ends, this hook: +# 1. Counts stop invocations per conversation_id (each stop ≈ one +# assistant turn ≈ roughly one user message — see plan rationale). +# 2. 
Every SAVE_INTERVAL stops, returns a followup_message telling +# the agent to file the session into MemPalace and write a diary +# entry. Cursor auto-submits that as the next user message. +# 3. On the next stop, loop_count > 0 so we let the agent finish +# without re-firing — Cursor's loop_count is the equivalent of +# Claude Code's stop_hook_active flag. +# 4. If the preCompact hook has left a `.pending` marker, force a +# save followup regardless of the counter and clear the marker. +# +# === WHY THE FOLLOWUP FIRES BY DEFAULT (differs from the Claude hook) === +# +# The Claude Code hook (hooks/mempal_save_hook.sh) is SILENT by default: +# its background `mempalace mine --mode convos` captures the verbatim +# transcript on its own, and the LLM-driven diary nudge is opt-IN behind +# MEMPAL_VERBOSE. That works because mempalace/normalize.py has a Claude +# Code JSONL parser. +# +# Cursor is different. Cursor's transcript format is undocumented (see +# STDIN_SHAPE.md) and normalize.py has NO Cursor parser, so the +# background mine below is BEST-EFFORT only — it does not yet yield clean +# verbatim drawers for Cursor. The followup_message is therefore the +# load-bearing verbatim-capture path: it drives the agent to call +# mempalace_add_drawer / mempalace_diary_write from its in-context memory. +# That is why it is ON by default here — silencing it by default would +# leave a default Cursor install capturing nothing. +# +# Users who want the Claude-style "zero tokens in the chat window" +# behaviour can silence the followup (see MEMPAL_CURSOR_SILENT / +# MEMPAL_VERBOSE below). Once normalize.py learns to read Cursor +# transcripts, this default should flip to silent to match Claude. 
+# +# Companion files in this directory: +# * lib/common.sh — shared helpers (sourced) +# * mempal_precompact_hook_cursor.sh — preCompact event +# * mempal_wake_hook_cursor.sh — sessionStart event +# +# === INSTALL === +# +# Recommended path: run `hooks/cursor/install.sh` from a cloned repo, +# which copies the scripts to ~/.mempalace/hooks/cursor/ and merges +# the wiring into your ~/.cursor/hooks.json. See hooks/cursor/README.md +# for the full walkthrough, or website/guide/cursor-hooks.md for the +# rendered version. +# +# Manual wiring (user scope: ~/.cursor/hooks.json): +# +# { +# "version": 1, +# "hooks": { +# "stop": [ +# { +# "command": "/absolute/path/to/mempal_save_hook_cursor.sh", +# "loop_limit": 1 +# } +# ] +# } +# } +# +# The `loop_limit: 1` cap is defense-in-depth — even if our own +# loop_count check below regresses, Cursor itself will stop emitting +# our followup after one auto-iteration. +# +# === KILL SWITCHES === +# +# MEMPAL_DISABLE_HOOK=1 — Cursor-prompt addition +# MEMPALACE_HOOKS_AUTO_SAVE=false — matches the Claude Code hooks +# ~/.mempalace/config.json "hooks.auto_save": false +# +# Any one of these short-circuits the hook to `{}` and exits 0. + +# Resolve the directory this script lives in so we can source the +# sibling lib/common.sh whether the user invoked us by absolute path, +# by relative path, or via a symlink. +_mempal_self="${BASH_SOURCE[0]:-$0}" +_mempal_dir="$(cd "$(dirname "$_mempal_self")" 2>/dev/null && pwd)" +# shellcheck source=lib/common.sh +. "$_mempal_dir/lib/common.sh" + +SAVE_INTERVAL="${MEMPAL_SAVE_INTERVAL:-15}" +# Coerce empty, non-numeric, AND zero to the default. SAVE_INTERVAL=0 +# would otherwise crash bash on the modulo check below ($((NEXT % 0)) +# is "division by 0"). gh-PR review caught this edge case. 
+case "$SAVE_INTERVAL" in + ''|*[!0-9]*|0) SAVE_INTERVAL=15 ;; +esac + +# Optional additional project directory to mine on save (parity with +# the Claude Code hook's MEMPAL_DIR knob — purely additive, never an +# override for the transcript mine). +MEMPAL_DIR="${MEMPAL_DIR:-}" + +# ── Followup opt-out ────────────────────────────────────────────── +# +# Returns 0 (true) when the user has asked to suppress the +# followup_message. See the header comment for why the followup is ON +# by default for Cursor. When silenced, the hook still runs the +# best-effort background mine and still maintains its counters/markers +# — it just emits `{}` instead of a followup. Two equivalent signals: +# * MEMPAL_CURSOR_SILENT=1|true|yes|on — dedicated Cursor opt-out +# * MEMPAL_VERBOSE=false|0|no|off — cross-hook silence signal +# (mirror-image of the Claude hook, where MEMPAL_VERBOSE=true is +# what turns its diary nudge ON) +mempal_followup_silenced() { + case "${MEMPAL_CURSOR_SILENT:-}" in + 1|true|yes|on) return 0 ;; + esac + case "${MEMPAL_VERBOSE:-}" in + false|0|no|off) return 0 ;; + esac + return 1 +} + +# Kill switch — emit `{}` so Cursor proceeds with normal stop. +if mempal_is_disabled; then + mempal_emit '{}' + exit 0 +fi + +# Opportunistic, daily-throttled GC of stale per-conversation state. +# Placed after the kill switch so a disabled hook touches nothing. +mempal_gc_stale_state + +INPUT="$(cat)" +mempal_parse_stdin "$INPUT" + +if [ "$MEMPAL_PARSE_OK" != "1" ]; then + mempal_dump_bad_input "$INPUT" "stop" + # Fail-open: don't block the host on a parse error. + mempal_emit '{}' + exit 0 +fi + +mempal_log "stop" "$MEMPAL_CONV_ID" \ + "loop_count=$MEMPAL_LOOP_COUNT status=${MEMPAL_STATUS:-?} workspace=$MEMPAL_WORKSPACE" + +# ── Loop-prevention ──────────────────────────────────────────────── +# +# Cursor's loop_count indicates how many times THIS stop hook has +# already triggered an automatic followup for this conversation +# (starts at 0). 
If it is > 0, our own previous followup is currently +# being consumed by the agent — let it finish without re-firing. +if [ "$MEMPAL_LOOP_COUNT" -gt 0 ] 2>/dev/null; then + mempal_log "stop" "$MEMPAL_CONV_ID" "loop_count>0; letting agent stop" + mempal_emit '{}' + exit 0 +fi + +WING="$(mempal_infer_wing "$MEMPAL_WORKSPACE")" + +# Build the followup message once; both the pending-marker branch and +# the threshold branch use it. Constructed via Python -c (rather than +# a heredoc) so we can pass the inferred wing as argv[1] and so the +# JSON encoding is correct even for wings whose name would otherwise +# need shell quoting. +_mempal_build_followup() { + "$MEMPAL_PYTHON_BIN" -c ' +import json, sys +wing = sys.argv[1] if len(sys.argv) > 1 else "cursor_session" +msg = ( + "MemPalace save checkpoint. " + "(1) Call mempalace_check_duplicate on the key topics, decisions, " + "and verbatim quotes from this session. " + "(2) For each non-duplicate, call mempalace_add_drawer (wing=" + + wing + ", room=, content=verbatim quote). " + "(3) Call mempalace_diary_write (agent_name=cursor-ide, wing=" + + wing + ", entry=AAAK-format summary). " + "Then stop." +) +print(json.dumps({"followup_message": msg})) +' "$WING" +} + +# ── Pending-save marker from preCompact ─────────────────────────── +# +# preCompact cannot itself emit a followup_message (Cursor docs: +# preCompact is observational-only, output supports only user_message), +# so it drops a marker file and we consume it here. Forces a save +# nudge regardless of the counter. 
+if mempal_consume_pending "$MEMPAL_CONV_ID"; then + mempal_log "stop" "$MEMPAL_CONV_ID" \ + "consumed pending-save marker (post-compaction)" + if mempal_followup_silenced; then + mempal_log "stop" "$MEMPAL_CONV_ID" \ + "followup silenced (MEMPAL_CURSOR_SILENT/MEMPAL_VERBOSE); emitting {}" + mempal_emit '{}' + exit 0 + fi + _mempal_build_followup + exit 0 +fi + +# ── Normal counter path ─────────────────────────────────────────── +COUNTER_FILE="$(_mempal_counter_path "$MEMPAL_CONV_ID")" +CURRENT="$(mempal_read_counter "$COUNTER_FILE")" +NEXT=$((CURRENT + 1)) +mempal_write_counter_atomic "$COUNTER_FILE" "$NEXT" || { + mempal_log "stop" "$MEMPAL_CONV_ID" \ + "WARN: counter write failed for $COUNTER_FILE; passing through" + mempal_emit '{}' + exit 0 +} + +mempal_log "stop" "$MEMPAL_CONV_ID" \ + "counter $CURRENT -> $NEXT (interval=$SAVE_INTERVAL)" + +# Trigger when we hit a multiple of SAVE_INTERVAL. Modulo arithmetic +# keeps the counter monotonically growing (no reset) so the log file +# is greppable for total turns across a conversation. +if [ "$((NEXT % SAVE_INTERVAL))" -ne 0 ]; then + mempal_emit '{}' + exit 0 +fi + +mempal_log "stop" "$MEMPAL_CONV_ID" "TRIGGERING SAVE at counter=$NEXT" + +# ── Background mine (best effort) ───────────────────────────────── +# +# Two independent targets — both run if both are set: +# 1. transcript_path → its parent directory, --mode convos +# 2. MEMPAL_DIR (user-configured project) → --mode projects +# +# IMPORTANT (Cursor caveat): the --mode convos mine is BEST-EFFORT for +# Cursor. Cursor's transcript format is undocumented and +# mempalace/normalize.py has no Cursor parser, so this call does not +# yet produce clean verbatim conversation drawers — at best it ingests +# raw bytes. The verbatim-capture guarantee for Cursor is carried by +# the followup_message below, which drives the agent to file its own +# in-context verbatim quotes. 
The --mode projects target (MEMPAL_DIR) +# is unaffected — normalize.py reads ordinary project files fine. +# +# Both run with stdout/stderr appended to the cursor log and are +# backgrounded so a slow mine cannot push the hook past its +# Cursor-configured timeout. `command -v mempalace` gates so a user +# without the CLI on PATH (e.g. a fresh GUI-launched session) does +# not see a noisy error. +if command -v mempalace >/dev/null 2>&1; then + if mempal_is_valid_transcript "$MEMPAL_TRANSCRIPT" \ + && [ -f "$MEMPAL_TRANSCRIPT" ]; then + ( mempalace mine "$(dirname "$MEMPAL_TRANSCRIPT")" --mode convos \ + >> "$MEMPAL_CURSOR_LOG" 2>&1 ) & + elif [ -n "$MEMPAL_TRANSCRIPT" ]; then + mempal_log "stop" "$MEMPAL_CONV_ID" \ + "skipping invalid transcript path: $MEMPAL_TRANSCRIPT" + fi + if [ -n "$MEMPAL_DIR" ] && [ -d "$MEMPAL_DIR" ]; then + ( mempalace mine "$MEMPAL_DIR" --mode projects \ + >> "$MEMPAL_CURSOR_LOG" 2>&1 ) & + fi +else + mempal_log "stop" "$MEMPAL_CONV_ID" \ + "mempalace CLI not on PATH; skipping background mine" +fi + +# The followup is the load-bearing verbatim path for Cursor (see header), +# so it fires by default. Honour the opt-out for users who want silence. +if mempal_followup_silenced; then + mempal_log "stop" "$MEMPAL_CONV_ID" \ + "followup silenced (MEMPAL_CURSOR_SILENT/MEMPAL_VERBOSE); background mine only" + mempal_emit '{}' + exit 0 +fi + +_mempal_build_followup diff --git a/hooks/cursor/mempal_wake_hook_cursor.sh b/hooks/cursor/mempal_wake_hook_cursor.sh new file mode 100755 index 000000000..d484739fd --- /dev/null +++ b/hooks/cursor/mempal_wake_hook_cursor.sh @@ -0,0 +1,82 @@ +#!/bin/bash +# MEMPALACE CURSOR WAKE HOOK — Session-start memory recall +# +# Cursor "sessionStart" hook. This is a Cursor-only capability — +# Claude Code's third-party-hooks compatibility layer does not have +# an equivalent event with the same "inject context into the agent's +# initial system message" semantics. +# +# Behaviour: +# 1. 
Parse Cursor's sessionStart payload (conversation_id / +# session_id, is_background_agent, composer_mode, plus the +# common workspace_roots field). +# 2. Infer the wing from basename(workspace_roots[0]). +# 3. Return {"additional_context": "..."} instructing the agent to +# scope its memory recall by calling mempalace_search + +# mempalace_diary_read with wing=. +# +# sessionStart is documented as fire-and-forget — Cursor does not +# enforce a blocking response and does not consume "continue" / +# "user_message" — but "additional_context" is honoured and added to +# the conversation's initial system context. Verified in +# cursor.com/docs/hooks.md (fetched 2026-05-27). +# +# === INSTALL === +# +# Add to ~/.cursor/hooks.json (or .cursor/hooks.json for project +# scope) under "sessionStart": +# +# { +# "version": 1, +# "hooks": { +# "sessionStart": [ +# { "command": "/absolute/path/to/mempal_wake_hook_cursor.sh" } +# ] +# } +# } + +_mempal_self="${BASH_SOURCE[0]:-$0}" +_mempal_dir="$(cd "$(dirname "$_mempal_self")" 2>/dev/null && pwd)" +# shellcheck source=lib/common.sh +. "$_mempal_dir/lib/common.sh" + +if mempal_is_disabled; then + mempal_emit '{}' + exit 0 +fi + +INPUT="$(cat)" +mempal_parse_stdin "$INPUT" + +if [ "$MEMPAL_PARSE_OK" != "1" ]; then + mempal_dump_bad_input "$INPUT" "sessionStart" + mempal_emit '{}' + exit 0 +fi + +WING="$(mempal_infer_wing "$MEMPAL_WORKSPACE")" + +mempal_log "sessionStart" "$MEMPAL_CONV_ID" \ + "workspace=$MEMPAL_WORKSPACE wing=$WING" + +# Emit the additional_context payload via Python -c (rather than a +# heredoc) so the JSON encoding survives wings whose name contains +# characters that would otherwise need shell escaping, and so the +# inferred wing arrives as an argv positional. The MCP tool names +# referenced here are verified against mempalace/mcp_server.py: +# mempalace_search and mempalace_diary_read both exist and accept +# the wing parameter. 
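The comment above gives the reason for building the payload in Python with the wing passed as an argv positional. A minimal sketch of that idea follows, assuming a hypothetical helper `build_context` and an illustrative wing name containing quotes and a backslash; neither appears in the hook itself, which inlines equivalent logic.

```python
import json


def build_context(wing: str) -> str:
    """Return a JSON object whose additional_context names the wing.

    Hypothetical helper for illustration only.
    """
    ctx = "MemPalace wake-up. The Cursor workspace maps to wing=" + wing + "."
    # json.dumps escapes quotes, backslashes and control characters, so the
    # output stays valid JSON whatever the workspace directory is called.
    return json.dumps({"additional_context": ctx})


if __name__ == "__main__":
    awkward = 'my "quoted" proj\\ect'  # illustrative wing name, not from the repo
    payload = build_context(awkward)
    # Round-tripping shows the wing survived encoding unchanged.
    assert json.loads(payload)["additional_context"].endswith("wing=" + awkward + ".")
    print(payload)
```

Because the wing never passes through shell string interpolation inside the Python source, no shell quoting of its contents is needed beyond the single `"$WING"` argument.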
+"$MEMPAL_PYTHON_BIN" -c ' +import json, sys +wing = sys.argv[1] if len(sys.argv) > 1 else "cursor_session" +ctx = ( + "MemPalace wake-up. The Cursor workspace maps to wing=" + wing + ". " + "Before answering anything that touches past work in this " + "project, call mempalace_search (wing=" + wing + ", " + "query=) and mempalace_diary_read " + "(agent_name=cursor-ide, wing=" + wing + ", last_n=10). " + "Use what you find verbatim where it answers the question; " + "never summarise the user'"'"'s own words." +) +print(json.dumps({"additional_context": ctx})) +' "$WING" diff --git a/hooks/mempal_precompact_hook.sh b/hooks/mempal_precompact_hook.sh index b9585876c..921260344 100755 --- a/hooks/mempal_precompact_hook.sh +++ b/hooks/mempal_precompact_hook.sh @@ -119,16 +119,8 @@ INPUT=$(cat) # backslashes are not mangled by echo flag parsing. _mempal_parsed=$( umask 077 - printf '%s' "$INPUT" | "$MEMPAL_PYTHON_BIN" -c " -import sys, json, re -data = json.load(sys.stdin) -sid = data.get('session_id', '') -tp = data.get('transcript_path', '') -safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s)) -print('__MEMPAL_PARSE_OK__') -print(safe(sid)) -print(safe(tp)) -" 2>"$STATE_DIR/last_python_err.log" + printf '%s' "$INPUT" | "$MEMPAL_PYTHON_BIN" -m mempalace.hook_shell parse-precompact \ + 2>"$STATE_DIR/last_python_err.log" ) # Drop the empty file on success; chmod 600 on failure to mirror # last_input.log's privacy contract. 
@@ -193,7 +185,7 @@ if is_valid_transcript_path "$TRANSCRIPT_PATH" && [ -f "$TRANSCRIPT_PATH" ]; the mempalace mine "$(dirname "$TRANSCRIPT_PATH")" --mode convos \ >> "$STATE_DIR/hook.log" 2>&1 elif [ -n "$TRANSCRIPT_PATH" ]; then - echo "[$(date '+%H:%M:%S')] Skipping invalid transcript path: $TRANSCRIPT_PATH" \ + echo "[$(date '+%H:%M:%S')] Skipping missing or invalid transcript path after normalization: $TRANSCRIPT_PATH" \ >> "$STATE_DIR/hook.log" fi if [ -n "$MEMPAL_DIR" ] && [ -d "$MEMPAL_DIR" ]; then diff --git a/hooks/mempal_save_hook.sh b/hooks/mempal_save_hook.sh index 58c61e998..5fbcc95b8 100755 --- a/hooks/mempal_save_hook.sh +++ b/hooks/mempal_save_hook.sh @@ -149,21 +149,8 @@ INPUT=$(cat) # ``printf '%s'`` removes the class of bug entirely. _mempal_parsed=$( umask 077 - printf '%s' "$INPUT" | "$MEMPAL_PYTHON_BIN" -c " -import sys, json, re -data = json.load(sys.stdin) -sid = data.get('session_id', '') -sha_raw = data.get('stop_hook_active', False) -tp = data.get('transcript_path', '') -# Shell-safe output: only allow alphanumeric, underscore, hyphen, slash, dot, tilde -safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s)) -# Coerce stop_hook_active to strict boolean string -sha = 'True' if sha_raw is True or str(sha_raw).lower() in ('true', '1', 'yes') else 'False' -print('__MEMPAL_PARSE_OK__') -print(safe(sid)) -print(sha) -print(safe(tp)) -" 2>"$STATE_DIR/last_python_err.log" + printf '%s' "$INPUT" | "$MEMPAL_PYTHON_BIN" -m mempalace.hook_shell parse-stop \ + 2>"$STATE_DIR/last_python_err.log" ) # The 2> redirect creates the file even when stderr is empty (success). 
# Remove the empty file so the state directory stays clean on the happy @@ -239,24 +226,10 @@ fi # Count human messages in the JSONL transcript # SECURITY: Pass transcript path as sys.argv to avoid shell injection via crafted paths if [ -f "$TRANSCRIPT_PATH" ]; then - EXCHANGE_COUNT=$("$MEMPAL_PYTHON_BIN" - "$TRANSCRIPT_PATH" <<'PYEOF' -import json, sys -count = 0 -with open(sys.argv[1]) as f: - for line in f: - try: - entry = json.loads(line) - msg = entry.get('message', {}) - if isinstance(msg, dict) and msg.get('role') == 'user': - content = msg.get('content', '') - if isinstance(content, str) and '' in content: - continue - count += 1 - except: - pass -print(count) -PYEOF -2>/dev/null) + EXCHANGE_COUNT=$("$MEMPAL_PYTHON_BIN" -m mempalace.hook_shell count-human-messages "$TRANSCRIPT_PATH" 2>/dev/null) +elif [ -n "$TRANSCRIPT_PATH" ]; then + echo "[$(date '+%H:%M:%S')] WARN: transcript_path not found after normalization: $TRANSCRIPT_PATH" >> "$STATE_DIR/hook.log" + EXCHANGE_COUNT=0 else EXCHANGE_COUNT=0 fi diff --git a/integrations/shared/recall-protocol.md b/integrations/shared/recall-protocol.md new file mode 100644 index 000000000..86e89e98b --- /dev/null +++ b/integrations/shared/recall-protocol.md @@ -0,0 +1,90 @@ +# MemPalace Recall Protocol + +The canonical "search before answering" protocol shared across every +MemPalace integration (Cursor, Antigravity, Claude Code, Codex, +OpenClaw). This file is the single source of truth — skills and rules +should link here rather than restating the protocol, so the rule never +drifts from the skill. + +The protocol exists to honour MemPalace's foundational promise: +**100% recall, verbatim, never guess.** When the palace might hold the +answer, the agent must read the palace before answering from model +memory. + +## When to recall + +Search the palace **before answering** whenever the user asks about +anything that may already be filed: + +- Past work, prior decisions, or "what did we do / decide / try?" 
+- A person, project, or entity ("who is …", "what is …") +- Something that happened in an earlier session ("remember when …", + "last time …", "the thing we discussed") +- A preference, fact, or relationship that could have changed over time + +If the question is pure greenfield work with no memory relevance (e.g. +"rename this variable", "fix this typo"), do not search — recall is +question-driven, not reflexive. + +## The protocol + +1. **On wake-up** (if a session-start hook injected context, honour its wing scoping / `additional_context`): scope recall to the wing inferred from the workspace, then continue. +2. **Before responding** about people, projects, past events, or prior + decisions: call `mempalace_search` first. For relational or temporal + facts ("who reported to whom in March", "what was true then"), call + `mempalace_kg_query` instead or as well. +3. **If unsure** about a fact (name, age, relationship, preference): say + "let me check the palace" and query. Wrong is worse than slow. +4. **Return verbatim.** Quote the drawer's exact stored words. Never + summarize, paraphrase, or lossy-compress what the palace returns — + that is the whole point of the system. +5. **After a substantive session**, record continuity with + `mempalace_diary_write` (background hooks may already do this — do not + double-file). +6. **When a fact changes**, call `mempalace_kg_invalidate` on the old + fact, then `mempalace_kg_add` for the new one. 
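Step 6 can be pictured with a toy in-memory model. This is a sketch under stated assumptions: `FactStore`, `add`, `invalidate`, and `query` are hypothetical names chosen for illustration, not the signatures of `mempalace_kg_add`, `mempalace_kg_invalidate`, or `mempalace_kg_query`, and the integer timestamps are invented. It shows why closing the old fact before adding the new one keeps earlier answers recoverable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: int                 # illustrative integer timestamps
    valid_to: Optional[int] = None  # None means "still true"


class FactStore:
    """Toy temporal store; hypothetical, not MemPalace's knowledge graph."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def add(self, subject, predicate, value, at):
        self.facts.append(Fact(subject, predicate, value, valid_from=at))

    def invalidate(self, subject, predicate, at):
        # Close every open fact instead of deleting it, so history survives.
        for fact in self.facts:
            if (fact.subject, fact.predicate) == (subject, predicate) and fact.valid_to is None:
                fact.valid_to = at

    def query(self, subject, predicate, at):
        # Return the value that was valid at time `at`, if any.
        for fact in self.facts:
            if (fact.subject, fact.predicate) != (subject, predicate):
                continue
            if fact.valid_from <= at and (fact.valid_to is None or at < fact.valid_to):
                return fact.value
        return None


store = FactStore()
store.add("alice", "reports_to", "bob", at=1)
# The fact changes at t=5: invalidate the old one, then add the new one.
store.invalidate("alice", "reports_to", at=5)
store.add("alice", "reports_to", "carol", at=5)
```

In this model a query at time 3 still returns the old value and a query at time 5 returns the new one, which is the time-valid behaviour that overwriting in place would lose.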
+ +## Tool selection + +| You need | Tool | +|---|---| +| Find any memory by meaning | `mempalace_search` (start here) | +| Relational / time-bound facts about an entity | `mempalace_kg_query` | +| The chronological story of an entity | `mempalace_kg_timeline` | +| Recent session continuity | `mempalace_diary_read` | +| Which wings / rooms exist (when scope unknown) | `mempalace_list_wings`, `mempalace_list_rooms` | +| Record this session | `mempalace_diary_write` | + +`mempalace_search` takes a short natural-language `query` (keywords or a +question — not a system prompt or pasted conversation) plus optional +`wing` / `room` filters and `limit` (default 5). + +## Unhappy paths + +- **Empty results.** Say the palace has nothing on this; do not invent an + answer to fill the gap. Offer to widen the search (drop the wing + filter) or to file the new information. +- **MCP unavailable / tool error.** Surface the error plainly and suggest + the user verify the server (`mempalace status`, or re-run install). + Do not silently fall back to guessing from model memory. +- **Stale or conflicting facts.** Prefer the knowledge graph's + time-valid answer; if a fact has changed, invalidate the old one and + add the new one rather than overwriting context silently. + +## Anti-patterns + +- Answering about past work, people, or decisions from model memory when + the palace might know — search first. +- Paraphrasing or summarizing stored content instead of quoting it + verbatim. +- Searching reflexively on every turn, including pure greenfield coding + with no memory relevance. +- Pasting the full conversation or a system prompt into the `query` + argument — keep queries short and keyword-driven. + +## See also + +- [`integrations/openclaw/SKILL.md`](../openclaw/SKILL.md) — the original + full-protocol skill this is distilled from. 
+- MemPalace design principles (verbatim, local-first, never summarize): + diff --git a/mcp.json b/mcp.json new file mode 100644 index 000000000..ca633f5f5 --- /dev/null +++ b/mcp.json @@ -0,0 +1,7 @@ +{ + "mcpServers": { + "mempalace": { + "command": "mempalace-mcp" + } + } +} diff --git a/mempalace/README.md b/mempalace/README.md index fdbbb6206..ddeef061b 100644 --- a/mempalace/README.md +++ b/mempalace/README.md @@ -16,7 +16,7 @@ The Python package that powers MemPalace. All modules, all logic. | `dialect.py` | AAAK compression — entity codes, emotion markers, 30x lossless ratio | | `knowledge_graph.py` | Temporal entity-relationship graph — SQLite, time-filtered queries, fact invalidation | | `palace_graph.py` | Room-based navigation graph — BFS traversal, tunnel detection across wings | -| `mcp_server.py` | MCP server — 19 tools, AAAK auto-teach, Palace Protocol, agent diary | +| `mcp_server.py` | MCP server — 33 tools, AAAK auto-teach, Palace Protocol, agent diary | | `onboarding.py` | Guided first-run setup — asks about people/projects, generates AAAK bootstrap + wing config | | `entity_registry.py` | Entity code registry — maps names to AAAK codes, handles ambiguous names | | `entity_detector.py` | Auto-detect people and projects from file content | diff --git a/mempalace/backends/__init__.py b/mempalace/backends/__init__.py index 1560f2fbe..7432cbe2f 100644 --- a/mempalace/backends/__init__.py +++ b/mempalace/backends/__init__.py @@ -27,11 +27,13 @@ HealthStatus, LexicalHit, LexicalResult, + MaintenanceResult, PalaceNotFoundError, PalaceRef, QueryResult, UnsupportedCapabilityError, UnsupportedFilterError, + UnsupportedMaintenanceKindError, ) from .chroma import ChromaBackend, ChromaCollection from .pgvector import PgVectorBackend, PgVectorCollection @@ -64,6 +66,7 @@ "HealthStatus", "LexicalHit", "LexicalResult", + "MaintenanceResult", "PalaceNotFoundError", "PalaceRef", "PgVectorBackend", @@ -75,6 +78,7 @@ "SQLiteExactCollection", 
"UnsupportedCapabilityError", "UnsupportedFilterError", + "UnsupportedMaintenanceKindError", "available_backends", "detect_backend_for_path", "detect_backends_for_path", diff --git a/mempalace/backends/_sidecar.py b/mempalace/backends/_sidecar.py new file mode 100644 index 000000000..a0485daf2 --- /dev/null +++ b/mempalace/backends/_sidecar.py @@ -0,0 +1,71 @@ +"""Shared embedder-identity sidecar (RFC 001). + +A small JSON file in the palace directory, keyed by collection name, recording +the embedder identity (``model_name`` / ``dimension``). It is deliberately +*separate* from a backend's mismatch marker: a marker's presence signals +"palace initialized" (reads raise ``CollectionNotInitializedError`` when the +marker exists but the store doesn't), so recording identity at first empty open +must not create one. The sidecar is unguarded, so a brand-new palace can record +identity immediately — the same approach the chroma backend uses. +""" + +import json +import os +from typing import Optional + +EMBEDDER_SIDECAR_FILENAME = "mempalace_embedder.json" + + +def read_embedder_sidecar(path: Optional[str], collection_name: Optional[str]): + """Return the recorded :class:`EmbedderIdentity` for ``collection_name``, or None. + + Robust to a missing, unreadable, or malformed (non-dict) sidecar — any of + those degrade to ``None`` (the ``unknown`` state) rather than raising. 
+ """ + from .base import EmbedderIdentity + + if not path or not collection_name or not os.path.isfile(path): + return None + try: + with open(path, encoding="utf-8") as f: + data = json.load(f) + except (OSError, json.JSONDecodeError): + return None + if not isinstance(data, dict): + return None + entry = data.get(collection_name) + if not isinstance(entry, dict) or not entry.get("model_name"): + return None + return EmbedderIdentity( + model_name=str(entry["model_name"]), + dimension=int(entry.get("dimension") or 0), + ) + + +def write_embedder_sidecar(path: Optional[str], collection_name: Optional[str], identity) -> None: + """Record ``identity`` for ``collection_name`` in the sidecar, creating it if needed. + + No-ops for a missing path, missing collection name, or a nameless identity. + Preserves other collections' entries; never raises on I/O failure. + """ + if not path or not collection_name or not identity or not getattr(identity, "model_name", ""): + return + data: dict = {} + if os.path.isfile(path): + try: + with open(path, encoding="utf-8") as f: + loaded = json.load(f) + if isinstance(loaded, dict): + data = loaded + except (OSError, json.JSONDecodeError): + data = {} + data[collection_name] = { + "model_name": str(identity.model_name), + "dimension": int(identity.dimension or 0), + } + try: + with open(path, "w", encoding="utf-8") as f: + json.dump(data, f, indent=2, ensure_ascii=False) + os.chmod(path, 0o600) + except (OSError, NotImplementedError): + pass diff --git a/mempalace/backends/base.py b/mempalace/backends/base.py index 21713d6eb..0c643b9c5 100644 --- a/mempalace/backends/base.py +++ b/mempalace/backends/base.py @@ -14,8 +14,8 @@ """ from abc import ABC, abstractmethod -from dataclasses import dataclass -from typing import ClassVar, Optional +from dataclasses import dataclass, field +from typing import ClassVar, Optional, Protocol, runtime_checkable # --------------------------------------------------------------------------- @@ -62,6 
+62,15 @@ class UnsupportedCapabilityError(BackendError): """Raised when a backend does not implement an optional capability.""" +class UnsupportedMaintenanceKindError(BackendError): + """Raised when ``run_maintenance(kind)`` is called with an unadvertised kind. + + A backend MUST advertise a kind in ``maintenance_kinds`` before it accepts + it (RFC 001). Advertising a kind it does not implement is a conformance + failure; a kind it has no analogue for MUST be omitted, not no-op'd. + """ + + class BackendMismatchError(BackendError): """Raised when a selected backend does not match existing palace artifacts.""" @@ -74,6 +83,15 @@ class EmbedderIdentityMismatchError(BackendError): """Raised when the stored embedder model name differs from the current one.""" +class EmbedderIdentityUnknownWarning(UserWarning): + """Emitted on first open of a collection with no recorded embedder identity. + + Legacy palaces created before identity tracking carry no model name. Per + RFC 001 the right behavior is warn-not-fail: the identity is recorded on + the next write and subsequent opens become strict. + """ + + # --------------------------------------------------------------------------- # Value objects # --------------------------------------------------------------------------- @@ -116,6 +134,114 @@ class PalaceRef: namespace: Optional[str] = None +@dataclass(frozen=True) +class EmbedderIdentity: + """Identity of the embedder that produced a collection's vectors (RFC 001). + + ``model_name`` is the stable identity persisted alongside a collection and + checked on subsequent opens. ``dimension`` is the vector width. A + ``dimension`` of ``0`` means *unknown / not probed* — comparisons treat it + as "no dimension signal" rather than a real zero-width vector, so a cheap + read-path check can compare model names without loading the model. 
+ """ + + model_name: str + dimension: int = 0 + + +@dataclass(frozen=True) +class MaintenanceResult: + """Observable outcome of ``run_maintenance(kind)`` (RFC 001). + + Maintenance is *not* fire-and-forget: a backend MUST serialize concurrent + same-kind runs and report the outcome so a caller can learn it must not + re-trigger. ``status`` is one of: + + * ``"ran"`` — this call performed the maintenance. + * ``"already_running"`` — another caller holds the work; this call did + nothing and the caller MUST NOT re-trigger (the production index-build + wedge: concurrent writers each issuing the build stacked exclusive locks). + * ``"noop"`` — nothing needed doing (e.g. the index already exists). + + ``stats`` is free-form per kind (rows analyzed, bytes reclaimed, index + build time) for benchmark/operator reporting. + """ + + kind: str + status: str + stats: dict = field(default_factory=dict) + + +@runtime_checkable +class Embedder(Protocol): + """Minimal embedder contract (RFC 001, normative for identity checking). + + The fuller embedder RFC (batching/async/pooling) is additive; identity + enforcement depends only on these three members. + """ + + model_name: str + dimension: int + + def embed(self, texts: list[str]) -> list[list[float]]: ... + + +def check_embedder_identity( + stored: Optional[EmbedderIdentity], + current: Optional[EmbedderIdentity], + *, + force_model_swap: bool = False, +) -> str: + """Three-state embedder-identity check (RFC 001). + + Returns the resolved state and raises on a hard, unforced conflict: + + * ``"unknown"`` — no identity recorded yet (legacy collection), or the + current embedder is nameless. The caller warns and records on write. + * ``"known_match"`` — stored name (and dimension, when both known) equal + the current embedder. Proceed normally. + * ``"known_mismatch"`` — names or dimensions differ. 
Without + ``force_model_swap`` this raises (:class:`EmbedderIdentityMismatchError` + for a model swap, :class:`DimensionMismatchError` for a width change, + which is checked first because mismatched vectors are physically + unusable). With ``force_model_swap`` it returns the state so the caller + can re-record the identity and log the swap. + + A ``dimension`` of ``0`` on either side means "unknown" and is skipped, so + a model-name-only check (cheap read path) still works. + """ + if current is None or not current.model_name: + return "unknown" + if stored is None: + return "unknown" + + dim_conflict = bool(stored.dimension and current.dimension) and ( + stored.dimension != current.dimension + ) + name_conflict = stored.model_name != current.model_name + + if not dim_conflict and not name_conflict: + return "known_match" + + if force_model_swap: + return "known_mismatch" + + if dim_conflict: + raise DimensionMismatchError( + f"collection was built with a {stored.dimension}-dim embedder " + f"({stored.model_name!r}) but the current embedder is " + f"{current.dimension}-dim ({current.model_name!r}); the stored " + "vectors are incompatible. Re-embed the palace to switch models." + ) + raise EmbedderIdentityMismatchError( + f"collection was built with embedder {stored.model_name!r} but the " + f"current embedder is {current.model_name!r}. Searching across a model " + "swap silently degrades recall. Re-embed the palace, or run " + "`mempalace palace set-embedder --model --force` to record the " + "new identity if you know the vectors are compatible." + ) + + @dataclass(frozen=True) class HealthStatus: ok: bool @@ -301,6 +427,68 @@ def close(self) -> None: def health(self) -> HealthStatus: return HealthStatus.healthy() + @property + def distance_metric(self) -> str: + """The space this collection's ``distances`` are reported in. + + Defaults to the owning backend's declared metric (cosine for all + in-tree backends). Collections that can vary per-collection — e.g. 
a + legacy Chroma palace built without ``hnsw:space=cosine`` — override + this to report their actual space so core ranking converts correctly. + """ + return "cosine" + + def get_stored_embedder_identity(self) -> Optional[EmbedderIdentity]: + """Return the embedder identity recorded for this collection, if any. + + Returns ``None`` when nothing is recorded — a legacy collection, or a + backend that does not yet persist identity. Core treats ``None`` as the + ``unknown`` state (warn, do not fail). Backends override this and + :meth:`set_embedder_identity` against their own metadata store. + """ + return None + + def set_embedder_identity(self, identity: EmbedderIdentity) -> None: + """Persist this collection's embedder identity. Default: no-op. + + A backend without an identity slot inherits the no-op default and so + stays permanently ``unknown`` (safe — it simply never enforces). The + enforcement choke point calls this when recording on first write or + on an explicit, forced model swap. + """ + return None + + def effective_embedder_identity(self) -> Optional[EmbedderIdentity]: + """The identity of the embedder this collection actually uses. + + For ``server_embedder`` backends that ignore the injected embedder, + this reports the server-side embedder so the same identity rules apply + (RFC 001). Defaults to ``None`` — the collection is embedded by the + injected/core embedder, and the caller supplies the current identity. + """ + return None + + def maintenance_state(self) -> dict: + """Return a structured snapshot of this collection's maintenance state. + + Free-form per backend (e.g. row count, whether a vector index exists, + last-analyze age). Used by benchmark harnesses to record state + alongside each latency/recall measurement so an un-analyzed store is + not compared against a settled one (RFC 001). Defaults to empty. 
+ """ + return {} + + def run_maintenance(self, kind: str) -> "MaintenanceResult": + """Run a maintenance ``kind`` and return an observable result (RFC 001). + + Backends advertise supported kinds in ``BaseBackend.maintenance_kinds`` + and override this. The default supports nothing, so every kind raises + :class:`UnsupportedMaintenanceKindError`. Implementations MUST serialize + concurrent same-kind runs and report ``already_running`` rather than + stacking the work. + """ + raise UnsupportedMaintenanceKindError(f"backend does not support maintenance kind {kind!r}") + def lexical_search( self, *, @@ -381,6 +569,20 @@ class BaseBackend(ABC): name: ClassVar[str] spec_version: ClassVar[str] = "1.0" capabilities: ClassVar[frozenset[str]] = frozenset() + #: The space ``query()`` reports ``distances`` in (RFC 001 §2.1). + #: One of ``"cosine"`` | ``"l2"`` | ``"ip"``. The contract for the + #: ``distances`` field is *lower = closer* regardless of metric; core + #: search converts distance→similarity off this declaration rather than + #: assuming cosine. All in-tree backends are cosine today. + distance_metric: ClassVar[str] = "cosine" + #: Maintenance kinds this backend implements (RFC 001). Reserved names: + #: ``"analyze"`` (refresh planner/query statistics), ``"compact"`` (reclaim + #: space, rewrite storage), ``"reindex"`` (build/rebuild secondary indexes). + #: A backend with no analogue for a kind MUST omit it rather than declare a + #: no-op, so a benchmark harness can trust the set. Backends MAY add their + #: own kinds. ``run_maintenance`` raises ``UnsupportedMaintenanceKindError`` + #: for anything not listed here. 
+ maintenance_kinds: ClassVar[frozenset[str]] = frozenset() @abstractmethod def get_collection( diff --git a/mempalace/backends/chroma.py b/mempalace/backends/chroma.py index b547f806f..15e074a4e 100644 --- a/mempalace/backends/chroma.py +++ b/mempalace/backends/chroma.py @@ -17,6 +17,7 @@ import chromadb from chromadb.errors import NotFoundError as _ChromaNotFoundError +from ._sidecar import EMBEDDER_SIDECAR_FILENAME, read_embedder_sidecar, write_embedder_sidecar from .base import ( BaseBackend, BaseCollection, @@ -925,7 +926,12 @@ def _missing_dimensionality_appears_recoverable( return False label_count = len(id_to_label) - if int(total) != label_count or len(label_to_id) != label_count: + # total_elements_added is monotonic across every add, while id_to_label and + # label_to_id hold only live elements, so a segment that has had deletions + # carries total_elements_added > label_count. Require >= (not ==), otherwise + # every post-deletion dim-None segment is wrongly quarantined (#1710); the + # label-map size and bijection checks still reject inconsistent label maps. + if int(total) < label_count or len(label_to_id) != label_count: return False try: return all(label_to_id.get(label) == item_id for item_id, label in id_to_label.items()) @@ -1067,7 +1073,7 @@ def _fix_blob_seq_ids(palace_path: str) -> None: if os.path.isfile(marker): return try: - with sqlite3.connect(db_path) as conn: + with contextlib.closing(sqlite3.connect(db_path)) as conn: try: rows = conn.execute( "SELECT rowid, seq_id FROM embeddings WHERE typeof(seq_id) = 'blob'" @@ -1724,6 +1730,46 @@ def metadata(self) -> dict: """ return self._collection.metadata or {} + @property + def distance_metric(self) -> str: + """Report this collection's actual space from ``hnsw:space``. + + MemPalace sets ``hnsw:space=cosine`` on every creation path, so a + healthy palace reports ``"cosine"``. 
When the key is absent, empty, or + an unrecognized value, the collection is genuinely using Chroma's HNSW + default — **L2** (Euclidean) — because cosine was never set on it. We + report ``"l2"`` in that case so core ranking maps the distances + correctly; reporting ``"cosine"`` here would reintroduce the + floor-every-result-to-zero misranking this property exists to fix. + """ + space = str(self.metadata.get("hnsw:space", "") or "").lower() + if space in ("cosine", "l2", "ip"): + return space + return "l2" + + # ------------------------------------------------------------------ + # Embedder identity (RFC 001) + # + # Stored in a small sidecar JSON in the palace dir rather than the Chroma + # collection metadata: ``collection.modify(metadata=...)`` replaces the + # whole dict and some Chroma versions reject re-passing the immutable + # ``hnsw:*`` construction keys, so mutating it on every open is fragile. + # The sidecar is keyed by collection name (a palace may hold several). + # This is complementary to Chroma's own embedding-function-name check — + # the core check runs at open time and yields the clean cross-backend + # error before Chroma's read-time rejection fires. 
+ # ------------------------------------------------------------------ + def _embedder_sidecar_path(self) -> Optional[str]: + if not self._palace_path: + return None + return os.path.join(self._palace_path, EMBEDDER_SIDECAR_FILENAME) + + def get_stored_embedder_identity(self): + return read_embedder_sidecar(self._embedder_sidecar_path(), self._collection_name()) + + def set_embedder_identity(self, identity) -> None: + write_embedder_sidecar(self._embedder_sidecar_path(), self._collection_name(), identity) + # --------------------------------------------------------------------------- # Backend diff --git a/mempalace/backends/embedding_wrapper.py b/mempalace/backends/embedding_wrapper.py index 1441deb2d..46921df75 100644 --- a/mempalace/backends/embedding_wrapper.py +++ b/mempalace/backends/embedding_wrapper.py @@ -49,6 +49,32 @@ def __init__(self, inner: BaseCollection): def __getattr__(self, name): return getattr(self._inner, name) + @property + def distance_metric(self) -> str: + # Explicit delegation: ``BaseCollection`` defines ``distance_metric`` + # as a property, so it resolves on this subclass and ``__getattr__`` + # never fires — without this override the wrapper would report the + # base "cosine" default and mask a wrapped non-cosine backend. + return self._inner.distance_metric + + # Same shadowing reason as ``distance_metric``: these are concrete methods + # on ``BaseCollection``, so ``__getattr__`` never delegates them. Forward + # explicitly to the wrapped backend collection's identity store. 
+    def get_stored_embedder_identity(self):
+        return self._inner.get_stored_embedder_identity()
+
+    def set_embedder_identity(self, identity) -> None:
+        return self._inner.set_embedder_identity(identity)
+
+    def effective_embedder_identity(self):
+        return self._inner.effective_embedder_identity()
+
+    def maintenance_state(self) -> dict:
+        return self._inner.maintenance_state()
+
+    def run_maintenance(self, kind: str):
+        return self._inner.run_maintenance(kind)
+
     def add(self, *, documents, ids, metadatas=None, embeddings=None):
         documents = _as_list(documents)
         ids = _as_list(ids)
diff --git a/mempalace/backends/pgvector.py b/mempalace/backends/pgvector.py
index 235184c74..a97f29a34 100644
--- a/mempalace/backends/pgvector.py
+++ b/mempalace/backends/pgvector.py
@@ -37,6 +37,7 @@

 import numpy as np

+from ._sidecar import EMBEDDER_SIDECAR_FILENAME, read_embedder_sidecar, write_embedder_sidecar
 from .base import (
     BackendClosedError,
     BackendError,
@@ -352,6 +353,30 @@ def _quote_identifier(name: str) -> str:
     return '"' + name.replace('"', '""') + '"'


+# Session-level advisory-lock namespace for serializing HNSW index builds
+# across daemon writers (RFC 001). classid is a fixed mempalace constant
+# ("MEMP" in ASCII); objid is a stable per-table key. Both must fit a signed
+# int4, which ``pg_advisory_lock(int4, int4)`` requires.
+_MAINTENANCE_LOCK_CLASSID = 0x4D454D50  # "MEMP" — a positive, valid int4
+
+
+def _advisory_objid(table: str) -> int:
+    """Stable signed-int4 advisory key derived from the table name."""
+    raw = int(sha256(table.encode("utf-8")).hexdigest()[:8], 16)  # 0 .. 2**32-1
+    return raw - 2**32 if raw >= 2**31 else raw
+
+
+def _hnsw_index_name(table: str) -> str:
+    """Deterministic, collision-safe index name for ``table``.
+
+    Routes through :func:`_pg_identifier`, which hashes the overflow when the
+    name exceeds Postgres' 63-byte limit.
A naive ``[:63]`` truncation could + return the table name verbatim (tables and indexes share the ``pg_class`` + namespace), which would fail with "relation already exists". + """ + return _pg_identifier(f"{table}_hnsw_idx") + + def _field_sql(field: str, expression: Any, params: list) -> str: """Translate one field predicate to a JSONB containment expression.""" if isinstance(expression, dict): @@ -440,11 +465,10 @@ class _PgVectorClient: def __init__(self, config: _PgVectorConfig): self._config = config self._conn = None + self._closed = False self._lock = threading.RLock() def _connect(self): - if self._conn is not None and not getattr(self._conn, "closed", False): - return self._conn try: import psycopg except ImportError as exc: # pragma: no cover - exercised only without the extra @@ -452,15 +476,27 @@ def _connect(self): "pgvector backend requires the optional 'psycopg' dependency; " "install mempalace[pgvector]" ) from exc - try: - self._conn = psycopg.connect(self._config.dsn) - except Exception as exc: # noqa: BLE001 - surface any driver failure uniformly - raise BackendError(f"pgvector connection failed: {exc}") from exc - return self._conn + # One client is shared across threads (PgVectorBackend caches a + # single instance per config), so the read-create-store on self._conn + # must hold the same lock _execute serializes on; unlocked, two + # first-connect threads each opened a connection and the loser leaked + # unclosed. The RLock makes the _execute -> _connect nesting safe. A + # stalled connect blocks peers under the lock the same way any + # in-flight query on this single shared connection already does. 
+ with self._lock: + if self._closed: + raise BackendError("pgvector client has been closed") + if self._conn is not None and not getattr(self._conn, "closed", False): + return self._conn + try: + self._conn = psycopg.connect(self._config.dsn) + except Exception as exc: # noqa: BLE001 - surface any driver failure uniformly + raise BackendError(f"pgvector connection failed: {exc}") from exc + return self._conn def _execute(self, sql: str, params=None, *, fetch: bool = False, many: bool = False): - conn = self._connect() with self._lock: + conn = self._connect() try: with conn.cursor() as cur: if many: @@ -625,8 +661,46 @@ def count_rows(self, table: str) -> int: def drop_table(self, table: str) -> None: self._execute(f"DROP TABLE IF EXISTS {_quote_identifier(table)}") + # ------------------------------------------------------------------ + # Maintenance (RFC 001) + # ------------------------------------------------------------------ + def has_vector_index(self, table: str) -> bool: + rows = self._execute( + "SELECT 1 FROM pg_indexes WHERE schemaname = current_schema() " + "AND tablename = %s AND indexdef ILIKE %s", + [table, "%using hnsw%"], + fetch=True, + ) + return bool(rows) + + def try_advisory_lock(self, classid: int, objid: int) -> bool: + rows = self._execute("SELECT pg_try_advisory_lock(%s, %s)", [classid, objid], fetch=True) + return bool(rows and rows[0] and rows[0][0]) + + def advisory_unlock(self, classid: int, objid: int) -> None: + self._execute("SELECT pg_advisory_unlock(%s, %s)", [classid, objid], fetch=True) + + def create_hnsw_index(self, table: str) -> None: + qi = _quote_identifier(table) + idx = _quote_identifier(_hnsw_index_name(table)) + # Non-concurrent build takes ACCESS EXCLUSIVE for the build duration; + # the advisory lock in the caller ensures only one session builds, so + # writes are blocked once rather than by every writer that crossed the + # threshold (the production wedge this serialization fixes). 
+ self._execute( + f"CREATE INDEX IF NOT EXISTS {idx} ON {qi} USING hnsw (embedding vector_cosine_ops)" + ) + + def analyze_table(self, table: str) -> None: + self._execute(f"ANALYZE {_quote_identifier(table)}") + def close(self) -> None: + # Terminal: the only caller is PgVectorBackend.close(), after which + # the backend refuses to hand the client out again. Without the flag a + # stale reference would silently reconnect and leak a session nobody + # can ever close. with self._lock: + self._closed = True if self._conn is not None: try: self._conn.close() @@ -686,6 +760,14 @@ def _table_exists(self) -> bool: def _marker_exists(self) -> bool: return self._backend._marker_exists(self._palace) + def get_stored_embedder_identity(self): + return self._backend._get_embedder_identity(self._palace, self._collection_name) + + def set_embedder_identity(self, identity) -> None: + # Sidecar-backed (see PgVectorBackend), so this records even on a + # brand-new palace whose mismatch marker doesn't exist yet. 
+ self._backend._set_embedder_identity(self._palace, self._collection_name, identity) + def _ensure_table(self, dimension: int) -> None: if dimension <= 0: raise ValueError("embedding dimension must be positive") @@ -1008,6 +1090,60 @@ def health(self) -> HealthStatus: return HealthStatus.unhealthy(str(exc)) return HealthStatus.healthy() + def maintenance_state(self) -> dict: + empty = {"row_count": 0, "vector_index": None, "index_build_complete": False} + self._ensure_open() + try: + if not self._table_exists(): + return empty + rows = self._client.count_rows(self._table) + has_index = self._client.has_vector_index(self._table) + except Exception: # noqa: BLE001 - state report must not raise + logger.debug("pgvector maintenance state probe failed", exc_info=True) + return empty + return { + "row_count": rows, + "vector_index": "hnsw" if has_index else None, + "index_build_complete": has_index, + } + + def run_maintenance(self, kind: str): + from .base import MaintenanceResult, UnsupportedMaintenanceKindError + + if kind not in PgVectorBackend.maintenance_kinds: + raise UnsupportedMaintenanceKindError( + f"pgvector does not support maintenance kind {kind!r}" + ) + self._ensure_open() + # Nothing to maintain on a not-yet-materialized table (collection opened + # create=True but never written) — return noop rather than letting a + # raw "relation does not exist" error escape. + if not self._table_exists(): + return MaintenanceResult(kind=kind, status="noop", stats={"reason": "no table"}) + if kind == "analyze": + self._client.analyze_table(self._table) + return MaintenanceResult(kind="analyze", status="ran") + + # reindex → build the optional HNSW index. Opt-in: it makes search + # approximate, trading the exact-scan 100%-recall default for scale. + # Serialized with a session advisory lock so concurrent daemon writers + # learn "already_running" instead of each stacking an ACCESS EXCLUSIVE + # index build. 
+ if self._client.has_vector_index(self._table): + return MaintenanceResult(kind="reindex", status="noop", stats={"vector_index": "hnsw"}) + classid, objid = _MAINTENANCE_LOCK_CLASSID, _advisory_objid(self._table) + if not self._client.try_advisory_lock(classid, objid): + return MaintenanceResult(kind="reindex", status="already_running") + try: + if self._client.has_vector_index(self._table): # re-check under lock + return MaintenanceResult( + kind="reindex", status="noop", stats={"vector_index": "hnsw"} + ) + self._client.create_hnsw_index(self._table) + return MaintenanceResult(kind="reindex", status="ran", stats={"vector_index": "hnsw"}) + finally: + self._client.advisory_unlock(classid, objid) + class PgVectorBackend(BaseBackend): name = "pgvector" @@ -1020,9 +1156,15 @@ class PgVectorBackend(BaseBackend): "supports_metadata_filters", "supports_lexical_search", "supports_namespace_isolation", + "supports_server_side_indexes", "server_mode", } ) + # "compact" is omitted: Postgres autovacuum reclaims space automatically, + # so a manual VACUUM kind would be redundant. "reindex" builds the optional + # HNSW index — an opt-in scale lever, NOT on by default, because it makes + # vector search approximate (the exact ``<=>`` scan is the 100%-recall path). + maintenance_kinds = frozenset({"analyze", "reindex"}) def __init__(self): self._clients: dict[_PgVectorConfig, _PgVectorClient] = {} @@ -1140,11 +1282,31 @@ def _write_marker(self, palace: PalaceRef, config: _PgVectorConfig) -> None: except (OSError, NotImplementedError): pass + # Embedder identity lives in a sidecar, NOT the backend marker: the marker's + # presence signals "palace initialized" (reads raise CollectionNotInitialized + # when the marker exists but the remote table doesn't), so recording identity + # at first empty open must not create it. The sidecar is unguarded — like the + # chroma sidecar — so a brand-new palace can record identity immediately. 
+    @staticmethod
+    def _embedder_sidecar_path(palace: PalaceRef) -> Optional[str]:
+        if not palace.local_path:
+            return None
+        return os.path.join(palace.local_path, EMBEDDER_SIDECAR_FILENAME)
+
+    def _get_embedder_identity(self, palace: PalaceRef, collection_name: str):
+        return read_embedder_sidecar(self._embedder_sidecar_path(palace), collection_name)
+
+    def _set_embedder_identity(self, palace: PalaceRef, collection_name: str, identity) -> None:
+        write_embedder_sidecar(self._embedder_sidecar_path(palace), collection_name, identity)
+
     # ------------------------------------------------------------------
     def _client(self, config: _PgVectorConfig) -> _PgVectorClient:
-        if self._closed:
-            raise BackendClosedError("PgVectorBackend has been closed")
         with self._lock:
+            # Checked under the lock so a client cannot be created and stored
+            # concurrently with close() clearing the registry (mirrors
+            # SQLiteExactBackend._connect).
+            if self._closed:
+                raise BackendClosedError("PgVectorBackend has been closed")
             client = self._clients.get(config)
             if client is None:
                 client = _PgVectorClient(config)
diff --git a/mempalace/backends/qdrant.py b/mempalace/backends/qdrant.py
index f2d348eef..bc516d578 100644
--- a/mempalace/backends/qdrant.py
+++ b/mempalace/backends/qdrant.py
@@ -24,6 +24,7 @@

 import numpy as np

+from ._sidecar import EMBEDDER_SIDECAR_FILENAME, read_embedder_sidecar, write_embedder_sidecar
 from .base import (
     BackendClosedError,
     BackendMismatchError,
@@ -661,6 +662,14 @@ def _remote_exists(self) -> bool:
     def _marker_exists(self) -> bool:
         return self._backend._marker_exists(self._palace)

+    def get_stored_embedder_identity(self):
+        return self._backend._get_embedder_identity(self._palace, self._collection_name)
+
+    def set_embedder_identity(self, identity) -> None:
+        # Sidecar-backed (see QdrantBackend), so this records even on a
+        # brand-new palace whose mismatch marker doesn't exist yet.
+ self._backend._set_embedder_identity(self._palace, self._collection_name, identity) + def _remote_dimension(self) -> Optional[int]: try: info = self._client.get_collection_info(self._remote_collection) @@ -1175,6 +1184,23 @@ def _write_marker(self, palace: PalaceRef, config: _QdrantConfig) -> None: except (OSError, NotImplementedError): pass + # Embedder identity lives in a sidecar, NOT the backend marker: the marker's + # presence signals "palace initialized" (reads raise CollectionNotInitialized + # when the marker exists but the remote collection doesn't), so recording + # identity at first empty open must not create it. The sidecar is unguarded, + # so a brand-new palace can record identity immediately. + @staticmethod + def _embedder_sidecar_path(palace: PalaceRef) -> Optional[str]: + if not palace.local_path: + return None + return os.path.join(palace.local_path, EMBEDDER_SIDECAR_FILENAME) + + def _get_embedder_identity(self, palace: PalaceRef, collection_name: str): + return read_embedder_sidecar(self._embedder_sidecar_path(palace), collection_name) + + def _set_embedder_identity(self, palace: PalaceRef, collection_name: str, identity) -> None: + write_embedder_sidecar(self._embedder_sidecar_path(palace), collection_name, identity) + def _client(self, config: _QdrantConfig) -> _QdrantRESTClient: if self._closed: raise BackendClosedError("QdrantBackend has been closed") diff --git a/mempalace/backends/sqlite_exact.py b/mempalace/backends/sqlite_exact.py index a11692d91..fff5444c7 100644 --- a/mempalace/backends/sqlite_exact.py +++ b/mempalace/backends/sqlite_exact.py @@ -322,6 +322,36 @@ def _fts_available(self, cur) -> bool: row = cur.execute("SELECT value FROM meta WHERE key = 'fts5_available'").fetchone() return bool(row and row[0] == "1") + def _embedder_meta_key(self) -> str: + return f"embedder_model:{self._collection_name}" + + def get_stored_embedder_identity(self): + from .base import EmbedderIdentity + + with self._cursor() as cur: + try: + cid = 
self._collection_id(cur) + except CollectionNotInitializedError: + return None + row = cur.execute( + "SELECT value FROM meta WHERE key = ?", + (self._embedder_meta_key(),), + ).fetchone() + if not row or not row[0]: + return None + dim = self._collection_dimension(cur, cid) or 0 + return EmbedderIdentity(model_name=str(row[0]), dimension=int(dim)) + + def set_embedder_identity(self, identity) -> None: + if not identity or not identity.model_name: + return + with self._cursor() as cur: + cur.execute( + "INSERT INTO meta(key, value) VALUES (?, ?) " + "ON CONFLICT(key) DO UPDATE SET value = excluded.value", + (self._embedder_meta_key(), str(identity.model_name)), + ) + def _replace_fts(self, cur, collection_id: int, doc_id: str, document: str) -> None: if not self._fts_available(cur): return @@ -710,6 +740,63 @@ def health(self) -> HealthStatus: return HealthStatus.unhealthy("collection closed") return HealthStatus.healthy() + def maintenance_state(self) -> dict: + try: + rows = self.count() + except Exception: + rows = 0 + # vector_index is null by design — exact cosine over every row, no ANN. + state = {"row_count": rows, "vector_index": None} + try: + with self._cursor() as cur: + page_count = cur.execute("PRAGMA page_count").fetchone() + freelist = cur.execute("PRAGMA freelist_count").fetchone() + state["page_count"] = int(page_count[0]) if page_count else 0 + state["freelist_pages"] = int(freelist[0]) if freelist else 0 + except Exception: + pass + return state + + def run_maintenance(self, kind: str): + from .base import MaintenanceResult, UnsupportedMaintenanceKindError + + if kind not in SQLiteExactBackend.maintenance_kinds: + raise UnsupportedMaintenanceKindError( + f"sqlite_exact does not support maintenance kind {kind!r}" + ) + if kind == "analyze": + # Refresh planner stats. Concurrent runs serialize on the handle lock. + with self._cursor() as cur: + cur.execute("ANALYZE") + return MaintenanceResult(kind="analyze", status="ran") + + # compact → VACUUM. 
It cannot run inside a transaction, so flip the + # connection to autocommit for the duration. The handle lock serializes + # concurrent runs in-process; SQLite's own write lock serializes across + # processes. + before = self.maintenance_state() + with self._handle.lock: + self._ensure_open() + conn = self._handle.conn + prev_isolation = conn.isolation_level + try: + conn.commit() + conn.isolation_level = None + conn.execute("VACUUM") + finally: + conn.isolation_level = prev_isolation + after = self.maintenance_state() + reclaimed = max(0, before.get("page_count", 0) - after.get("page_count", 0)) + return MaintenanceResult( + kind="compact", + status="ran", + stats={ + "pages_before": before.get("page_count", 0), + "pages_after": after.get("page_count", 0), + "pages_reclaimed": reclaimed, + }, + ) + class SQLiteExactBackend(BaseBackend): name = "sqlite_exact" @@ -724,6 +811,9 @@ class SQLiteExactBackend(BaseBackend): "local_mode", } ) + # "reindex" is intentionally omitted: sqlite_exact does exact cosine over + # every row (no ANN index to build), so it has no analogue for it. + maintenance_kinds = frozenset({"analyze", "compact"}) def __init__(self): self._clients: dict[str, _SQLiteExactHandle] = {} @@ -746,19 +836,31 @@ def _connect(self, palace_path: str, create: bool): os.chmod(palace_path, 0o700) except (OSError, NotImplementedError): pass + # Hold the registry lock across cache-check + connect + schema init: + # two threads first-opening the same palace must not each create a + # connection (the loser leaked unclosed and outlived close()) nor run + # _init_schema concurrently on a fresh file, which surfaces transient + # "database is locked" errors before WAL mode is established. Only + # first-open pays for the I/O under the lock; cache hits are a dict + # probe. 
with self._clients_lock: + if self._closed: + raise BackendClosedError("SQLiteExactBackend has been closed") cached = self._clients.get(palace_path) if cached is not None and not cached.closed: return cached - conn = sqlite3.connect(db_path, check_same_thread=False) - conn.row_factory = sqlite3.Row - lock = threading.RLock() - handle = _SQLiteExactHandle(conn, lock) - with handle.lock: - self._init_schema(conn) - with self._clients_lock: + conn = sqlite3.connect(db_path, check_same_thread=False) + try: + conn.row_factory = sqlite3.Row + lock = threading.RLock() + handle = _SQLiteExactHandle(conn, lock) + with handle.lock: + self._init_schema(conn) + except BaseException: + conn.close() + raise self._clients[palace_path] = handle - return handle + return handle def _init_schema(self, conn: sqlite3.Connection) -> None: conn.executescript( @@ -889,14 +991,19 @@ def close_palace(self, palace: PalaceRef | str) -> None: cached.conn.close() def close(self) -> None: + # Flip _closed under the registry lock so a concurrent _connect either + # sees the flag or finishes before the handle snapshot is taken; a + # connection can no longer slip into the registry after close(). + # Unlocked readers of _closed elsewhere are advisory fast-fails; the + # locked recheck in _connect is the authoritative gate. with self._clients_lock: handles = list(self._clients.values()) self._clients.clear() + self._closed = True for handle in handles: with handle.lock: handle.closed = True handle.conn.close() - self._closed = True def health(self, palace: Optional[PalaceRef] = None) -> HealthStatus: if self._closed: diff --git a/mempalace/backups.py b/mempalace/backups.py new file mode 100644 index 000000000..d9950815c --- /dev/null +++ b/mempalace/backups.py @@ -0,0 +1,74 @@ +"""Retention pruning for timestamped palace backups. + +``mempalace migrate`` and ``mempalace repair max-seq-id`` each write a fresh, +timestamped backup every time they run and historically never deleted the old +ones. 
On a machine that mines or repairs on a schedule those full-size copies
+accumulate silently — a real palace was found with hundreds of gigabytes of
+backups sitting beside only a few hundred megabytes of live data, nearly
+filling the disk. This module prunes the backup set down to a bounded count
+after each new backup is written.
+
+The retention count comes from ``MempalaceConfig.max_backups`` (default 10).
+"""
+
+import glob
+import os
+import shutil
+
+
+def prune_backups(pattern, max_backups, *, log=None):
+    """Delete the oldest backups matching ``pattern`` so at most ``max_backups`` remain.
+
+    Args:
+        pattern: A glob pattern matching the backup paths (files or
+            directories). The caller is responsible for ``glob.escape``-ing
+            any literal, non-wildcard portion that can contain glob
+            metacharacters — palace paths sometimes do (e.g. a ``[``).
+        max_backups: Number of most-recent backups to keep. ``None`` or any
+            value ``<= 0`` disables pruning and returns immediately, so a
+            backup set is never touched when the user has opted out.
+        log: Optional callable (e.g. ``print``) for human-readable progress.
+
+    Returns:
+        The list of paths that were successfully removed.
+
+    Recency is determined by filesystem mtime rather than by parsing the
+    timestamp out of the name, so it stays correct even when two backup
+    producers use different timestamp formats. Deletion failures are logged
+    and skipped: pruning is best-effort cleanup and must never abort the
+    migrate/repair operation that just completed successfully.
+    """
+    if max_backups is None or max_backups <= 0:
+        return []
+
+    scored = []
+    for path in glob.glob(pattern):
+        try:
+            scored.append((os.path.getmtime(path), path))
+        except OSError:
+            # Vanished between glob and stat (concurrent prune / cleanup);
+            # nothing for us to remove.
+            continue
+
+    if len(scored) <= max_backups:
+        return []
+
+    # Newest first; the path breaks mtime ties so ordering is deterministic.
+    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
+
+    removed = []
+    for _mtime, path in scored[max_backups:]:
+        try:
+            if os.path.isdir(path) and not os.path.islink(path):
+                shutil.rmtree(path)
+            else:
+                os.remove(path)
+        except OSError as exc:
+            if log:
+                log(f" Backup prune: could not remove {path}: {exc}")
+            continue
+        removed.append(path)
+        if log:
+            log(f" Backup prune: removed old backup {path}")
+
+    return removed
diff --git a/mempalace/cli.py b/mempalace/cli.py
index 9ae0ce160..7699610fa 100644
--- a/mempalace/cli.py
+++ b/mempalace/cli.py
@@ -831,6 +831,53 @@ def cmd_status(args):
     status(palace_path=palace_path)


+def cmd_palace_set_embedder(args):
+    """Record (or force-override) a palace's embedder identity (RFC 001).
+
+    Resolves the ``unknown`` state for a legacy palace, or records a specific
+    model with ``--model``. It records identity on the palace only; it does not
+    change the configured model — when the two differ it prints how to align
+    ``MEMPALACE_EMBEDDING_MODEL``. ``--force`` overwrites an existing,
+    differently-named identity.
+    """
+    from .backends.base import EmbedderIdentityMismatchError
+    from .palace import set_palace_embedder_identity
+
+    config = MempalaceConfig()
+    palace_path = os.path.abspath(
+        os.path.expanduser(args.palace) if args.palace else config.palace_path
+    )
+    model = getattr(args, "model", None)
+    try:
+        old, new = set_palace_embedder_identity(
+            palace_path,
+            model=model,
+            force=getattr(args, "force", False),
+            backend=_backend_arg(args),
+        )
+    except EmbedderIdentityMismatchError as exc:
+        print(f" ✗ {exc}")
+        raise SystemExit(2) from exc
+    if old is None:
+        print(f" ✓ recorded embedder identity: {new.model_name} (dim={new.dimension})")
+    elif old.model_name == new.model_name:
+        print(f" ✓ embedder identity unchanged: {new.model_name} (dim={new.dimension})")
+    else:
+        print(
+            f" ✓ embedder identity changed: {old.model_name} → {new.model_name} "
+            f"(dim={new.dimension})"
+        )
+    # set-embedder records the palace's identity; it does not change the
+    # configured model. If they differ, the next normal open would mismatch —
+    # tell the user how to align them.
+    configured = config.embedding_model
+    if new.model_name and configured and new.model_name != configured:
+        print(
+            f" ⚠ configured model is {configured!r}; set MEMPALACE_EMBEDDING_MODEL="
+            f"{new.model_name} (or run onboarding) so normal opens of this palace match."
+        )
+
+
 def cmd_repair_status(args):
     """Read-only HNSW capacity health check (#1222)."""
     palace_path = os.path.expanduser(args.palace) if args.palace else MempalaceConfig().palace_path
@@ -842,7 +889,12 @@ def cmd_repair_status(args):


 def cmd_repair(args):
-    """Rebuild palace vector index from SQLite metadata."""
+    """Rebuild palace vector index from SQLite metadata.
+
+    On success the palace SQLite file is VACUUMed and the FTS5 index is
+    rebuilt, so the next repair's integrity preflight reads a consistent
+    database (#1747).
+ """ config = MempalaceConfig() collection_name = config.collection_name palace_path = os.path.abspath( @@ -859,6 +911,7 @@ def cmd_repair(args): TruncationDetected, _close_chroma_handles, _extract_drawers, + _post_rebuild_cleanup, _rebuild_collection_via_temp, check_extraction_safety, maybe_repair_poisoned_max_seq_id_before_rebuild, @@ -1052,6 +1105,10 @@ def cmd_repair(args): print(f" Backup location: {backup_path}") sys.exit(1) + # The bulk delete + re-upsert cycle above leaves the FTS5 inverted index + # inconsistent, which fails the next repair's integrity preflight (#1747). + _post_rebuild_cleanup(palace_path, backend=backend, progress=print) + print(f"\n Repair complete. {filed} drawers rebuilt.") print(f" Backup saved at {backup_path}") print(f"\n{'=' * 55}\n") @@ -1693,6 +1750,31 @@ def main(): help="Storage backend to use for status (default: config/env/detected/chroma)", ) + p_palace = sub.add_parser("palace", help="Palace maintenance commands") + palace_sub = p_palace.add_subparsers(dest="palace_action") + p_set_embedder = palace_sub.add_parser( + "set-embedder", + help="Record/override the palace's embedder identity (resolve 'unknown', or switch models)", + ) + p_set_embedder.add_argument( + "--model", + default=None, + help="Embedder model to record (default: current configured model). 
" + "Records identity on the palace only; does not change the configured " + "model (prints how to align MEMPALACE_EMBEDDING_MODEL if they differ).", + ) + p_set_embedder.add_argument( + "--force", + action="store_true", + help="Overwrite an existing identity that names a different model " + "(only if you know the stored vectors are compatible)", + ) + p_set_embedder.add_argument( + "--backend", + default=None, + help="Storage backend (default: config/env/detected/chroma)", + ) + args = parser.parse_args() _apply_backend_arg(args) @@ -1717,6 +1799,13 @@ def main(): cmd_instructions(args) return + if args.command == "palace": + if getattr(args, "palace_action", None) == "set-embedder": + cmd_palace_set_embedder(args) + else: + p_palace.print_help() + return + dispatch = { "init": cmd_init, "mine": cmd_mine, diff --git a/mempalace/config.py b/mempalace/config.py index 36d7d82e9..05d542c5e 100644 --- a/mempalace/config.py +++ b/mempalace/config.py @@ -198,6 +198,12 @@ def sanitize_content(value: str, max_length: int = 100_000) -> str: DEFAULT_COLLECTION_NAME = "mempalace_drawers" DEFAULT_BACKEND = "chroma" +# How many timestamped palace backups to retain before the oldest are +# pruned. Applies to the accumulating backups written by ``mempalace +# migrate`` and ``mempalace repair max-seq-id`` — see +# ``MempalaceConfig.max_backups``. +DEFAULT_MAX_BACKUPS = 10 + @lru_cache(maxsize=1) def get_configured_collection_name() -> str: @@ -326,6 +332,18 @@ def tunnel_file(self): """Path to the tunnel file, sibling of palace_path.""" return os.path.join(os.path.dirname(self.palace_path), "tunnels.json") + @property + def hallway_file(self): + """Path to the hallway file, sibling of palace_path. + + Mirrors ``tunnel_file`` so within-wing hallway state is scoped to the + configured palace and survives palace rebuilds (it does not live in + ChromaDB which can be recreated). 
Prior to this property the path was + hardcoded under ``~/.mempalace/hallways.json`` and multiple palaces on + one host silently shared one file (see ``hallways._legacy_hallway_file``). + """ + return os.path.join(os.path.dirname(self.palace_path), "hallways.json") + @property def collection_name(self): """ChromaDB collection name.""" @@ -691,6 +709,36 @@ def topic_tunnel_min_count(self): parsed = 1 return max(1, parsed) + @property + def max_backups(self) -> int: + """Number of timestamped palace backups to retain before pruning. + + Applies to the accumulating, timestamped backups created by + ``mempalace migrate`` (``.pre-migrate.``) and + ``mempalace repair max-seq-id`` + (``chroma.sqlite3.max-seq-id-backup-``). Each of those + commands writes a fresh full-size copy every run and historically + never deleted the old ones, so on a machine that mines or repairs on + a schedule the backup set could silently grow until it filled the + disk. After each backup is written, copies beyond this count (oldest + first) are removed. + + Reads ``MEMPALACE_MAX_BACKUPS`` env first, then ``max_backups`` in + ``config.json``, then the default of ``10``. A value of ``0`` disables + pruning and keeps every backup (use when an external retention policy + manages cleanup). Negative or non-numeric values fall back to the + default rather than crashing migrate/repair. 
+ """ + env_val = os.environ.get("MEMPALACE_MAX_BACKUPS") + if env_val is not None: + coerced = self._try_coerce_int(env_val, minimum=0) + if coerced is not None: + return coerced + coerced = self._try_coerce_int( + self._file_config.get("max_backups", DEFAULT_MAX_BACKUPS), minimum=0 + ) + return DEFAULT_MAX_BACKUPS if coerced is None else coerced + @property def hook_silent_save(self): """Whether the stop hook saves directly (True) or blocks for MCP calls (False).""" diff --git a/mempalace/convo_miner.py b/mempalace/convo_miner.py index a1b61aad1..ad802595d 100644 --- a/mempalace/convo_miner.py +++ b/mempalace/convo_miner.py @@ -594,15 +594,14 @@ def _mine_convos_impl( wing = _resolve_wing(convo_path, wing) files = scan_convos(convo_dir) - if limit > 0: - files = files[:limit] print(f"\n{'=' * 55}") print(" MemPalace Mine — Conversations") print(f"{'=' * 55}") print(f" Wing: {wing}") print(f" Source: {convo_path}") - print(f" Files: {len(files)}") + limit_suffix = f" (limit: {limit} new)" if limit > 0 else "" + print(f" Files: {len(files)}{limit_suffix}") print(f" Palace: {palace_path}") if dry_run: print(" DRY RUN — nothing will be filed") @@ -620,10 +619,13 @@ def _mine_convos_impl( ) total_drawers = 0 + files_mined = 0 files_skipped = 0 + files_processed = 0 room_counts = defaultdict(int) for i, filepath in enumerate(files, 1): + files_processed = i source_file = str(filepath) # Skip if already filed at current NORMALIZE_VERSION @@ -684,6 +686,9 @@ def _mine_convos_impl( room_counts[c.get("memory_type", "general")] += 1 else: room_counts[room] += 1 + files_mined += 1 + if limit > 0 and files_mined >= limit: + break continue if extract_mode != "general": @@ -701,14 +706,17 @@ def _mine_convos_impl( room_counts[r] += n total_drawers += drawers_added + files_mined += 1 print(f" + [{i:4}/{len(files)}] {filepath.name[:50]:50} +{drawers_added}") + if limit > 0 and files_mined >= limit: + break if not dry_run: _validate_palace_fts5_after_mine(palace_path) 
print(f"\n{'=' * 55}") print(" Done.") - print(f" Files processed: {len(files) - files_skipped}") + print(f" Files processed: {files_processed - files_skipped}") print(f" Files skipped (already filed): {files_skipped}") print(f" Drawers filed: {total_drawers}") if room_counts: diff --git a/mempalace/embedding.py b/mempalace/embedding.py index 85bf1b928..9dfb5861e 100644 --- a/mempalace/embedding.py +++ b/mempalace/embedding.py @@ -32,6 +32,7 @@ from __future__ import annotations import logging +import threading from typing import Optional logger = logging.getLogger(__name__) @@ -56,6 +57,10 @@ ] _EF_CACHE: dict = {} +# Check-then-construct on the cache must be atomic: without it, two threads +# resolving the same key each keep their own EF instance, and each instance +# later lazy-loads its own copy of the model. +_EF_CACHE_LOCK = threading.Lock() _WARNED: set = set() @@ -135,6 +140,16 @@ def name() -> str: _EMBEDDINGGEMMA_PREFIX = "task: sentence similarity | query: " _EMBEDDINGGEMMA_DIM = 384 # Matryoshka truncation — first 384 dims of the 768 _EMBEDDINGGEMMA_MAX_LEN = 2048 +# Default docs per session.run. The ONNX graph has no internal batching, +# so one unchunked run over a repair-scale batch (5000 docs, repair.py/ +# cli.py) allocates attention buffers that grow with batch size and +# superlinearly with padded length (score tensors are batch x heads x +# len^2 per layer), and the kernel OOM-kills the process (#1770). 32 +# matches the internal batch size of chromadb's ONNXMiniLM_L6_V2, whose +# chunked _forward survives the same call sites. embeddinggemma's +# sentence_embedding output is attention-masked, so sub-batch padding +# does not change any row's vector. +_EMBEDDINGGEMMA_BATCH_SIZE = 32 class EmbeddinggemmaONNX: @@ -158,73 +173,105 @@ def name() -> str: # when switching models. Keep it stable. 
return "embeddinggemma_300m" - def __init__(self, preferred_providers=None): + def __init__(self, preferred_providers=None, batch_size: int = _EMBEDDINGGEMMA_BATCH_SIZE): + if batch_size < 1: + raise ValueError(f"batch_size must be >= 1, got {batch_size}") self._providers = ( list(preferred_providers) if preferred_providers else ["CPUExecutionProvider"] ) + self._batch_size = batch_size self._session = None self._tokenizer = None self._np = None self._output_idx = None + # Instances are shared across threads via _EF_CACHE; serialize the + # one-time model load so concurrent cold calls cannot build (and + # transiently hold) two full model sessions. + self._load_lock = threading.Lock() def _lazy_load(self) -> None: if self._session is not None: return - try: - import numpy as np - import onnxruntime as ort - from huggingface_hub import hf_hub_download - from tokenizers import Tokenizer - except ImportError as e: - raise ImportError( - "EmbeddinggemmaONNX requires huggingface_hub, tokenizers, and " - "numpy — these ship with mempalace core, so this error usually " - "means one was uninstalled or pinned to an incompatible version. " - "Reinstall with: pip install --upgrade --force-reinstall mempalace" - ) from e - - logger.info( - "Downloading %s/%s (cached after first run)…", - _EMBEDDINGGEMMA_REPO, - _EMBEDDINGGEMMA_ONNX, - ) - model_path = hf_hub_download( - _EMBEDDINGGEMMA_REPO, subfolder="onnx", filename=_EMBEDDINGGEMMA_ONNX - ) - hf_hub_download( - _EMBEDDINGGEMMA_REPO, subfolder="onnx", filename=_EMBEDDINGGEMMA_ONNX + "_data" - ) - tok_path = hf_hub_download(_EMBEDDINGGEMMA_REPO, filename="tokenizer.json") - - self._session = ort.InferenceSession(model_path, providers=self._providers) - out_names = [o.name for o in self._session.get_outputs()] - # Model card: sentence_embedding is the pooled output (last_hidden_state - # is the per-token output we don't want). 
- self._output_idx = ( - out_names.index("sentence_embedding") if "sentence_embedding" in out_names else 1 - ) - - tokenizer = Tokenizer.from_file(tok_path) - tokenizer.enable_padding() - tokenizer.enable_truncation(max_length=_EMBEDDINGGEMMA_MAX_LEN) - self._tokenizer = tokenizer - self._np = np + with self._load_lock: + if self._session is not None: + return + try: + import numpy as np + import onnxruntime as ort + from huggingface_hub import hf_hub_download + from tokenizers import Tokenizer + except ImportError as e: + raise ImportError( + "EmbeddinggemmaONNX requires huggingface_hub, tokenizers, and " + "numpy — these ship with mempalace core, so this error usually " + "means one was uninstalled or pinned to an incompatible version. " + "Reinstall with: pip install --upgrade --force-reinstall mempalace" + ) from e + + logger.info( + "Downloading %s/%s (cached after first run)…", + _EMBEDDINGGEMMA_REPO, + _EMBEDDINGGEMMA_ONNX, + ) + model_path = hf_hub_download( + _EMBEDDINGGEMMA_REPO, subfolder="onnx", filename=_EMBEDDINGGEMMA_ONNX + ) + hf_hub_download( + _EMBEDDINGGEMMA_REPO, subfolder="onnx", filename=_EMBEDDINGGEMMA_ONNX + "_data" + ) + tok_path = hf_hub_download(_EMBEDDINGGEMMA_REPO, filename="tokenizer.json") + + session = ort.InferenceSession(model_path, providers=self._providers) + out_names = [o.name for o in session.get_outputs()] + # Model card: sentence_embedding is the pooled output (last_hidden_state + # is the per-token output we don't want). 
+ output_idx = ( + out_names.index("sentence_embedding") if "sentence_embedding" in out_names else 1 + ) - def __call__(self, input): # noqa: A002 — ChromaDB EF protocol uses `input` + tokenizer = Tokenizer.from_file(tok_path) + tokenizer.enable_padding() + tokenizer.enable_truncation(max_length=_EMBEDDINGGEMMA_MAX_LEN) + self._output_idx = output_idx + self._tokenizer = tokenizer + self._np = np + # Session is assigned last: the unlocked fast path above treats a + # non-None session as "fully loaded", so every other attribute + # must already be in place when it becomes visible. + self._session = session + + def __call__(self, input: str | list[str] | None) -> list[list[float]]: # noqa: A002 — ChromaDB EF protocol + if isinstance(input, str): + # A bare string would be iterated character by character below, + # silently producing one garbage vector per character. + input = [input] + if input is None or len(input) == 0: + # None or zero docs: nothing to embed; skip the lazy model + # download. len() over truthiness so an array-like documents + # sequence is not rejected by ambiguous-truth-value semantics. + return [] self._lazy_load() np = self._np - texts = [_EMBEDDINGGEMMA_PREFIX + t for t in input] - encs = self._tokenizer.encode_batch(texts) - input_ids = np.asarray([e.ids for e in encs], dtype=np.int64) - attention_mask = np.asarray([e.attention_mask for e in encs], dtype=np.int64) - outputs = self._session.run( - None, {"input_ids": input_ids, "attention_mask": attention_mask} - ) - sent_emb = outputs[self._output_idx][:, :_EMBEDDINGGEMMA_DIM] - # L2-normalize so cosine similarity == dot product (matches what the - # MTEB methodology assumes; ChromaDB's distance is configured for it). 
- norms = np.linalg.norm(sent_emb, axis=1, keepdims=True) + 1e-12 - return (sent_emb / norms).tolist() + embeddings: list[list[float]] = [] + # Tokenize and run per sub-batch, not over the whole input: padding + # is to the longest sequence in the sub-batch, and the ONNX runtime + # only ever holds batch_size rows of attention buffers at a time + # (#1770). + for start in range(0, len(input), self._batch_size): + chunk = input[start : start + self._batch_size] + texts = [_EMBEDDINGGEMMA_PREFIX + t for t in chunk] + encs = self._tokenizer.encode_batch(texts) + input_ids = np.asarray([e.ids for e in encs], dtype=np.int64) + attention_mask = np.asarray([e.attention_mask for e in encs], dtype=np.int64) + outputs = self._session.run( + None, {"input_ids": input_ids, "attention_mask": attention_mask} + ) + sent_emb = outputs[self._output_idx][:, :_EMBEDDINGGEMMA_DIM] + # L2-normalize so cosine similarity == dot product (matches what the + # MTEB methodology assumes; ChromaDB's distance is configured for it). + norms = np.linalg.norm(sent_emb, axis=1, keepdims=True) + 1e-12 + embeddings.extend((sent_emb / norms).tolist()) + return embeddings def embed_query(self, input: list[str]) -> list[list[float]]: # noqa: A002 — ChromaDB EF protocol """Embed query documents (ChromaDB EF protocol).""" @@ -254,18 +301,22 @@ def get_embedding_function(device: Optional[str] = None, model: Optional[str] = providers, effective = _resolve_providers(device) cache_key = (model, tuple(providers)) - cached = _EF_CACHE.get(cache_key) + cached = _EF_CACHE.get(cache_key) # lock-free fast path; dict.get is GIL-atomic if cached is not None: return cached - - if model == "embeddinggemma": - ef = EmbeddinggemmaONNX(preferred_providers=providers) - else: - # Default: minilm (or anything we don't recognize — back-compat win). 
- ef_cls = _build_ef_class() - ef = ef_cls(preferred_providers=providers) - - _EF_CACHE[cache_key] = ef + with _EF_CACHE_LOCK: + cached = _EF_CACHE.get(cache_key) + if cached is not None: + return cached + + if model == "embeddinggemma": + ef = EmbeddinggemmaONNX(preferred_providers=providers) + else: + # Default: minilm (or anything we don't recognize — back-compat win). + ef_cls = _build_ef_class() + ef = ef_cls(preferred_providers=providers) + + _EF_CACHE[cache_key] = ef logger.info( "Embedding function initialized (model=%s device=%s providers=%s)", model, @@ -287,3 +338,59 @@ def describe_device(device: Optional[str] = None) -> str: device = MempalaceConfig().embedding_device _, effective = _resolve_providers(device) return effective + + +# Probed vector widths, keyed by resolved model name. Populated once per +# process the first time an identity is resolved for a model. +_DIM_CACHE: dict = {} + + +def current_model_name(model: Optional[str] = None) -> str: + """Resolve the canonical embedder model name (cheap, no model load). + + This is the configured ``embedding_model`` (``"minilm"`` / + ``"embeddinggemma"`` / ...), not the embedding function's internal + ``name()`` (which is spoofed to ``"default"`` for ChromaDB compatibility). + """ + if model is not None: + return str(model).strip().lower() + from .config import MempalaceConfig + + return MempalaceConfig().embedding_model + + +def probe_dimension(device: Optional[str] = None, model: Optional[str] = None) -> int: + """Return the embedder's output dimension by embedding a short probe. + + Model-agnostic — works for any model without a hardcoded table — and + cached per resolved model name so the probe is paid at most once per + process. Returns ``0`` if the probe fails (treated as "dimension unknown" + by the identity check, so a probe failure never blocks normal operation). 
+ """ + name = current_model_name(model) + cached = _DIM_CACHE.get(name) + if cached is not None: + return cached + try: + ef = get_embedding_function(device=device, model=model) + vectors = ef(input=["probe"]) + dim = len(vectors[0]) if vectors and vectors[0] is not None else 0 + except Exception: + logger.debug("Embedding dimension probe failed for model=%s", name, exc_info=True) + dim = 0 + _DIM_CACHE[name] = dim + return dim + + +def get_embedder_identity(device: Optional[str] = None, model: Optional[str] = None): + """Resolve the current embedder identity (RFC 001). + + ``model_name`` from config (cheap); ``dimension`` from a cached one-time + probe. Returns an :class:`~mempalace.backends.base.EmbedderIdentity`. + """ + from .backends.base import EmbedderIdentity + + return EmbedderIdentity( + model_name=current_model_name(model), + dimension=probe_dimension(device=device, model=model), + ) diff --git a/mempalace/format_miner.py b/mempalace/format_miner.py index f64a49084..296ca2911 100644 --- a/mempalace/format_miner.py +++ b/mempalace/format_miner.py @@ -507,7 +507,7 @@ def scan_formats(directory: Union[Path, str]) -> list[Path]: def _print_mine_summary( - files: list, + files_seen: int, files_with_text: int, files_skipped: int, files_errored: int, @@ -523,7 +523,7 @@ def _print_mine_summary( print(f"\n{'=' * 55}") print(" Summary") print(f"{'-' * 55}") - print(f" Files seen: {len(files)}") + print(f" Files seen: {files_seen}") print(f" Files extracted: {files_with_text}") print(f" Files skipped: {files_skipped}") print(f" Files errored: {files_errored}") @@ -789,9 +789,11 @@ def mine_formats( files: list = [] collection = None total_drawers = 0 + files_mined = 0 files_skipped = 0 files_with_text = 0 files_errored = 0 + files_processed = 0 status_counts: dict = defaultdict(int) try: @@ -799,15 +801,14 @@ def mine_formats( # ``~/docs`` and relative inputs work consistently. Per PR #1555 review # (Copilot #10). 
files = scan_formats(format_path) - if limit > 0: - files = files[:limit] print(f"\n{'=' * 55}") print(" MemPalace Mine — Format extraction") print(f"{'=' * 55}") print(f" Wing: {wing}") print(f" Source: {format_path}") - print(f" Files: {len(files)}") + limit_suffix = f" (limit: {limit} new)" if limit > 0 else "" + print(f" Files: {len(files)}{limit_suffix}") print(f" Palace: {palace_path}") if dry_run: print(" DRY RUN — nothing will be filed") @@ -816,6 +817,7 @@ def mine_formats( collection = get_collection(palace_path) if not dry_run else None for i, filepath in enumerate(files, 1): + files_processed = i source_file = str(filepath) # Per-file try/except so one bad file can't crash the whole mine. @@ -876,6 +878,9 @@ def mine_formats( if dry_run: print(f" [DRY RUN] {filepath.name} → {len(chunks)} drawers") total_drawers += len(chunks) + files_mined += 1 + if limit > 0 and files_mined >= limit: + break continue drawers_added, skipped = _file_chunks_locked( @@ -893,7 +898,10 @@ def mine_formats( continue total_drawers += drawers_added + files_mined += 1 print(f" + [{i:4}/{len(files)}] {filepath.name[:50]:50} +{drawers_added}") + if limit > 0 and files_mined >= limit: + break except Exception as exc: # Log and continue — one malformed file shouldn't kill the # whole mine. Mirrors miner.py's per-file recovery. @@ -973,7 +981,7 @@ def mine_formats( logger.debug("mine_formats: _cleanup_mine_pid_file failed", exc_info=True) _print_mine_summary( - files=files, + files_seen=files_processed, files_with_text=files_with_text, files_skipped=files_skipped, files_errored=files_errored, diff --git a/mempalace/hallways.py b/mempalace/hallways.py index 9bee6c60b..971bd8688 100644 --- a/mempalace/hallways.py +++ b/mempalace/hallways.py @@ -46,10 +46,11 @@ logger = logging.getLogger("mempalace_hallways") -# Persistence target. Mirrors ``palace_graph._TUNNEL_FILE`` so the storage -# pattern is uniform across the two related primitives. 
Tests override -# this via ``monkeypatch.setattr(hallways, "_HALLWAY_FILE", tmp_path/...)``. -_HALLWAY_FILE = os.path.join(os.path.expanduser("~"), ".mempalace", "hallways.json") +# Persistence target is resolved through ``_get_hallway_file`` below, which +# mirrors ``palace_graph._get_tunnel_file`` (the 3.3.6 palace-scoped pattern) +# so the storage layout is uniform across the two related primitives. Tests +# should monkey-patch ``_get_hallway_file`` and ``_legacy_hallway_file`` rather +# than poking a module-level constant. _SCHEMA_VERSION = 1 @@ -62,36 +63,72 @@ # ───────────────────────────────────────────────────────────────────────────── -# Persistence — JSON file at _HALLWAY_FILE, restricted perms (0600) on POSIX +# Persistence — JSON file resolved from MempalaceConfig.hallway_file, +# restricted perms (0600) on POSIX. Pre-3.3.6 behavior (hardcoded +# ~/.mempalace/hallways.json) is kept only as a one-time orphan detection +# fallback, matching the palace_graph tunnel-file migration pattern. # ───────────────────────────────────────────────────────────────────────────── -def _load_hallways() -> list[dict]: - """Read all hallway records. Returns ``[]`` if the file is missing or corrupt.""" - if not os.path.exists(_HALLWAY_FILE): - return [] - try: - with open(_HALLWAY_FILE, encoding="utf-8") as f: - raw = json.load(f) - except (OSError, json.JSONDecodeError): - logger.debug("hallways: load failed, treating as empty", exc_info=True) +def _get_hallway_file(config=None) -> str: + """Return the path to the hallways.json file, derived from MempalaceConfig.palace_path.""" + from .config import MempalaceConfig + + config = config or MempalaceConfig() + return config.hallway_file + + +def _legacy_hallway_file() -> str: + """The pre-palace-scoped hardcoded path. Kept only for one-time orphan detection.""" + return os.path.join(os.path.expanduser("~"), ".mempalace", "hallways.json") + + +def _load_hallways(config=None) -> list[dict]: + """Read all hallway records. 
Returns ``[]`` if the file is missing or corrupt. + + Backwards-compatibility: prior to this migration the hallway file was + hardcoded at ``~/.mempalace/hallways.json`` regardless of the configured + palace_path. If the configured hallway file is missing but a legacy file + exists at a different path, log a one-line warning naming both paths so + users can move the file manually. We do NOT auto-migrate — auto-merging + hallway state across two locations is too magical for a bugfix and risks + clobbering newer data. Same posture as ``palace_graph._load_tunnels``. + """ + current_hallway_file = _get_hallway_file(config) + if os.path.exists(current_hallway_file): + try: + with open(current_hallway_file, encoding="utf-8") as f: + raw = json.load(f) + except (OSError, json.JSONDecodeError): + logger.debug("hallways: load failed, treating as empty", exc_info=True) + return [] + if isinstance(raw, dict) and "hallways" in raw: + return raw.get("hallways") or [] + if isinstance(raw, list): + return raw return [] - if isinstance(raw, dict) and "hallways" in raw: - return raw.get("hallways") or [] - if isinstance(raw, list): - return raw + + legacy = _legacy_hallway_file() + if legacy != current_hallway_file and os.path.exists(legacy): + logger.warning( + "Legacy hallways file at '%s' is being ignored; configured location is '%s'. " + "Move or copy the legacy file to the configured path to recover its hallways.", + legacy, + current_hallway_file, + ) return [] -def _save_hallways(hallways: list[dict]) -> None: - """Atomically persist hallway records to _HALLWAY_FILE. +def _save_hallways(hallways: list[dict], config=None) -> None: + """Atomically persist hallway records to the configured hallway file. Uses an os.replace temp-file dance so a crash mid-write doesn't corrupt the file. POSIX permission is restricted to 0600 because hallways reveal within-wing entity connections that the user may not want world-readable. 
""" - directory = os.path.dirname(_HALLWAY_FILE) + hallway_file = _get_hallway_file(config) + directory = os.path.dirname(hallway_file) os.makedirs(directory, exist_ok=True) payload = { "schema_version": _SCHEMA_VERSION, @@ -106,7 +143,7 @@ def _save_hallways(hallways: list[dict]) -> None: except OSError: # Non-POSIX systems may not support chmod; not fatal. pass - os.replace(tmp_path, _HALLWAY_FILE) + os.replace(tmp_path, hallway_file) except Exception: try: os.unlink(tmp_path) diff --git a/mempalace/hook_shell.py b/mempalace/hook_shell.py new file mode 100644 index 000000000..2f2f32ebc --- /dev/null +++ b/mempalace/hook_shell.py @@ -0,0 +1,159 @@ +"""Compatibility helpers for legacy shell hooks. + +The shell hooks intentionally stay small and portable, but parsing Claude +hook JSON and counting UTF-8 JSONL transcripts is safer in Python than in +inline shell snippets. This module centralizes that behavior for both +hooks/mempal_save_hook.sh and hooks/mempal_precompact_hook.sh. +""" + +from __future__ import annotations + +import json +import re +import sys + + +_SESSION_ID_RE = re.compile(r"[^a-zA-Z0-9_-]") +_CONTROL_CHARS_RE = re.compile(r"[\x00\r\n]") + + +def sanitize_session_id(session_id: object) -> str: + """Keep session ids safe for state-file names.""" + sanitized = _SESSION_ID_RE.sub("", str(session_id or "")) + return sanitized or "unknown" + + +def normalize_transcript_path(path: object) -> str: + r"""Normalize a hook transcript path without destroying Windows paths. + + Claude Code on Windows sends paths like: + + C:\Users\me\.claude\projects\\.jsonl + + The old shell sanitizer removed both the drive-letter colon and + backslashes. That turned a valid transcript path into a nonexistent path. + For transcript paths, we only remove control characters that would break + newline-delimited shell parsing, and normalize backslashes to forward + slashes so Git Bash can still address the same Windows file. 
+ """ + + normalized = str(path or "").replace("\\", "/") + return _CONTROL_CHARS_RE.sub("", normalized) + + +def _stop_hook_active(value: object) -> str: + """Return the exact boolean string expected by the shell hook.""" + if value is True: + return "True" + if str(value).strip().lower() in ("true", "1", "yes"): + return "True" + return "False" + + +def parse_stop_payload(payload: dict) -> tuple[str, str, str]: + return ( + sanitize_session_id(payload.get("session_id", "")), + _stop_hook_active(payload.get("stop_hook_active", False)), + normalize_transcript_path(payload.get("transcript_path", "")), + ) + + +def parse_precompact_payload(payload: dict) -> tuple[str, str]: + return ( + sanitize_session_id(payload.get("session_id", "")), + normalize_transcript_path(payload.get("transcript_path", "")), + ) + + +def count_human_messages(path: str) -> int: + """Count user messages in a Claude transcript JSONL file. + + Claude transcripts are UTF-8. Windows Python defaults to cp1252 in many + environments, so the encoding must be explicit. Invalid bytes are ignored + to match the hooks' fail-soft behavior. + """ + + count = 0 + with open(path, encoding="utf-8", errors="ignore") as fh: + for line in fh: + try: + entry = json.loads(line) + except Exception: + continue + + msg = entry.get("message", {}) + if not isinstance(msg, dict) or msg.get("role") != "user": + continue + + content = msg.get("content", "") + if isinstance(content, str) and "" in content: + continue + + count += 1 + + return count + + +def _load_stdin_json() -> dict: + raw = sys.stdin.read() + + # Empty stdin is a legitimate hook state. Treat it as an empty payload so + # the sentinel is printed and the shell fail-loud guard does not spam disk. + if raw == "": + return {} + + # For non-empty malformed input, intentionally let json.loads raise. 
+ # The shell hooks capture this stderr in last_python_err.log and, because + # no sentinel is printed, write a bounded copy of the raw payload to + # last_input.log. That fail-loud contract is pinned by + # tests/test_hooks_bash_compat.py. + data = json.loads(raw) + + if not isinstance(data, dict): + raise TypeError(f"hook input must be a JSON object, got {type(data).__name__}") + + return data + + +def main(argv: list[str] | None = None) -> int: + argv = list(sys.argv[1:] if argv is None else argv) + if not argv: + print( + "usage: python -m mempalace.hook_shell ", + file=sys.stderr, + ) + return 2 + + command = argv[0] + + if command == "parse-stop": + session_id, stop_hook_active, transcript_path = parse_stop_payload(_load_stdin_json()) + print("__MEMPAL_PARSE_OK__") + print(session_id) + print(stop_hook_active) + print(transcript_path) + return 0 + + if command == "parse-precompact": + session_id, transcript_path = parse_precompact_payload(_load_stdin_json()) + print("__MEMPAL_PARSE_OK__") + print(session_id) + print(transcript_path) + return 0 + + if command == "count-human-messages": + if len(argv) != 2: + print("count-human-messages requires a transcript path", file=sys.stderr) + return 2 + try: + print(count_human_messages(argv[1])) + except Exception: + print(0) + return 0 + + print(f"unknown hook_shell command: {command}", file=sys.stderr) + return 2 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/mempalace/ids.py b/mempalace/ids.py index ff20946f4..de4c7ab67 100644 --- a/mempalace/ids.py +++ b/mempalace/ids.py @@ -22,7 +22,7 @@ # as legacy ``v1`` (pre-delimiter recipe), drawers with ``id_recipe="v2"`` # are guaranteed collision-safe within the v2 generation. The constant is # exported so call sites use ``ids.ID_RECIPE`` rather than a magic string. 
-ID_RECIPE: str = "v2" +ID_RECIPE: str = "v3" # '|' is reserved in Windows filenames and cannot appear in source paths # on any supported platform, making it strictly safer than ':' (which @@ -49,7 +49,7 @@ def _delimited_sha256(parts: tuple[object, ...], truncate: int) -> str: e.g. ``valid_from=None`` joins as the literal string ``"None"`` rather than crashing. """ - key = _DELIM.join(str(p) for p in parts).encode() + key = "".join(f"{len(part)}:{part}" for part in map(str, parts)).encode() return hashlib.sha256(key).hexdigest()[:truncate] diff --git a/mempalace/layers.py b/mempalace/layers.py index b92890aa2..6acf8523e 100644 --- a/mempalace/layers.py +++ b/mempalace/layers.py @@ -23,7 +23,12 @@ from .config import MempalaceConfig from .palace import get_collection as _get_collection -from .searcher import _first_or_empty, build_where_filter +from .searcher import ( + _distance_to_similarity, + _first_or_empty, + _metric_for_collection, + build_where_filter, +) # --------------------------------------------------------------------------- @@ -283,11 +288,12 @@ def search(self, query: str, wing: str = None, room: str = None, n_results: int if not docs: return "No results found." 
+ metric = _metric_for_collection(col) lines = [f'## L3 — SEARCH RESULTS for "{query}"'] for i, (doc, meta, dist) in enumerate(zip(docs, metas, dists), 1): meta = meta or {} doc = doc or "" - similarity = round(max(0.0, 1 - dist), 3) + similarity = round(_distance_to_similarity(dist, metric), 3) wing_name = meta.get("wing", "?") room_name = meta.get("room", "?") source = Path(meta.get("source_file", "")).name if meta.get("source_file") else "" @@ -327,6 +333,7 @@ def search_raw( except Exception: return [] + metric = _metric_for_collection(col) hits = [] for doc, meta, dist in zip( _first_or_empty(results, "documents"), @@ -346,7 +353,7 @@ def search_raw( "wing": meta.get("wing", "unknown"), "room": meta.get("room", "unknown"), "source_file": Path(meta.get("source_file", "?")).name, - "similarity": round(1 - dist, 3), + "similarity": round(_distance_to_similarity(dist, metric), 3), "metadata": meta, } ) diff --git a/mempalace/mcp_server.py b/mempalace/mcp_server.py index 8187ccaf2..9e2b3c6ab 100644 --- a/mempalace/mcp_server.py +++ b/mempalace/mcp_server.py @@ -74,7 +74,11 @@ ) from .backends import BackendMismatchError, PalaceRef, detect_backend_for_path # noqa: E402 from .query_sanitizer import sanitize_query # noqa: E402 -from .searcher import search_memories # noqa: E402 +from .searcher import ( # noqa: E402 + _distance_to_similarity, + _metric_for_collection, + search_memories, +) from .palace_graph import ( # noqa: E402 traverse, find_tunnels, @@ -84,6 +88,10 @@ delete_tunnel, follow_tunnels, ) +from .hallways import ( # noqa: E402 + list_hallways, + delete_hallway, +) from .knowledge_graph import KnowledgeGraph, DEFAULT_KG_PATH # noqa: E402 from .collision_scan import assert_no_collisions # noqa: E402 @@ -156,6 +164,21 @@ def _init_logging() -> None: logger = logging.getLogger("mempalace_mcp") +def _get_result_ids(result) -> list: + """Return ``get()`` result ids for both typed and dict-like collection results.""" + if result is None: + return [] + ids = 
getattr(result, "ids", None) + if ids is not None: + return ids + if isinstance(result, dict): + return result.get("ids") or [] + getter = getattr(result, "get", None) + if callable(getter): + return getter("ids") or [] + return [] + + def _parse_args(): parser = argparse.ArgumentParser(description="MemPalace MCP Server") parser.add_argument( @@ -346,6 +369,7 @@ def _force_chroma_cache_reset() -> None: _palace_db_mtime, \ _metadata_cache, \ _metadata_cache_time + cached_client = _client_cache _client_cache = None _collection_cache = None _collection_cache_backend = None @@ -361,7 +385,24 @@ def _force_chroma_cache_reset() -> None: backend = get_backend_for_palace(_config.palace_path) backend.close_palace(PalaceRef(id=_config.palace_path, local_path=_config.palace_path)) except Exception: - pass + logger.debug("Failed to close cached Chroma backend during cache reset", exc_info=True) + if cached_client is not None: + try: + close = getattr(cached_client, "close", None) + if callable(close): + close() + except Exception: + logger.debug( + "Failed to close MCP-local Chroma client during cache reset", exc_info=True + ) + try: + from chromadb.api.client import SharedSystemClient + + clear_system_cache = getattr(SharedSystemClient, "clear_system_cache", None) + if callable(clear_system_cache): + clear_system_cache() + except Exception: + logger.debug("Failed to clear Chroma shared system cache during cache reset", exc_info=True) # ── Vector-search disabled flag (#1222) ────────────────────────────────── @@ -672,6 +713,16 @@ def _get_collection(create=False): } return None + db_path = os.path.join(_config.palace_path, "chroma.sqlite3") + if not create and not os.path.isfile(db_path): + _force_chroma_cache_reset() + _collection_open_error = { + "error": "Chroma database missing", + "details": f"Could not open missing database at {db_path}.", + "hint": "Run: mempalace status or mempalace repair-status for diagnostics.", + } + return None + for attempt in range(2): try: if 
_collection_cache is not None and ( @@ -1234,9 +1285,10 @@ def tool_check_duplicate(content: str, threshold: float = 0.9): ) duplicates = [] if results["ids"] and results["ids"][0]: + metric = _metric_for_collection(col) for i, drawer_id in enumerate(results["ids"][0]): dist = results["distances"][0][i] - similarity = round(max(0.0, 1 - dist), 3) + similarity = round(_distance_to_similarity(dist, metric), 3) if similarity >= threshold: # Chroma 1.5.x can return None for partially-flushed rows; # coerce to empty sentinels so downstream .get() is safe. @@ -1349,6 +1401,22 @@ def tool_delete_tunnel(tunnel_id: str): return delete_tunnel(tunnel_id) +def tool_list_hallways(wing: str = None): + """List within-wing hallway records, optionally filtered by wing.""" + try: + wing = _sanitize_optional_name(wing, "wing") + except ValueError as e: + return {"error": str(e)} + return list_hallways(wing) + + +def tool_delete_hallway(hallway_id: str): + """Delete a hallway record by its ID.""" + if not hallway_id or not isinstance(hallway_id, str): + return {"error": "hallway_id is required"} + return {"deleted": delete_hallway(hallway_id)} + + def tool_follow_tunnels(wing: str, room: str): """Follow explicit tunnels from a room to see connected drawers in other wings.""" try: @@ -1365,6 +1433,250 @@ def tool_follow_tunnels(wing: str, room: str): # ==================== WRITE TOOLS ==================== +def _chroma_field(result, name, default=None): + if result is None: + return default + if isinstance(result, dict): + return result.get(name, default) + return getattr(result, name, default) + + +def _chunk_index(meta): + try: + return int((meta or {}).get("chunk_index", 0)) + except (TypeError, ValueError): + return 0 + + +def _response_safe_meta(meta): + safe_meta = _safe_meta(meta) + if safe_meta.get("source_file"): + safe_meta["source_file"] = Path(safe_meta["source_file"]).name + return safe_meta + + +def _content_preview(content): + return content[:200] + "..." 
if len(content) > 200 else content + + +def _single_drawer_record(col, drawer_id: str): + result = col.get(ids=[drawer_id], include=["documents", "metadatas"]) + ids = _chroma_field(result, "ids", []) or [] + if not ids: + return None + + docs = _chroma_field(result, "documents", []) or [] + metas = _chroma_field(result, "metadatas", []) or [] + doc = docs[0] if docs else "" + meta = _safe_meta(metas[0] if metas else {}) + + return { + "drawer_id": ids[0], + "ids": [ids[0]], + "documents": [doc or ""], + "metadatas": [meta], + "content": doc or "", + "metadata": meta, + "chunked": False, + } + + +def _logical_chunk_group(col, drawer_id: str): + try: + result = col.get( + where={"parent_drawer_id": drawer_id}, + include=["documents", "metadatas"], + ) + except Exception: + logger.debug("chunk group lookup failed for %s", drawer_id, exc_info=True) + return None + + ids = _chroma_field(result, "ids", []) or [] + if not ids: + return None + + docs = _chroma_field(result, "documents", []) or [] + metas = _chroma_field(result, "metadatas", []) or [] + + rows = [] + for idx, chunk_id in enumerate(ids): + doc = docs[idx] if idx < len(docs) else "" + meta = _safe_meta(metas[idx] if idx < len(metas) else {}) + rows.append((_chunk_index(meta), chunk_id, doc or "", meta)) + + rows.sort(key=lambda row: (row[0], row[1])) + + chunk_ids = [row[1] for row in rows] + chunk_docs = [row[2] for row in rows] + chunk_metas = [row[3] for row in rows] + first_meta = chunk_metas[0] if chunk_metas else {} + + return { + "drawer_id": drawer_id, + "ids": chunk_ids, + "documents": chunk_docs, + "metadatas": chunk_metas, + "content": "".join(chunk_docs), + "metadata": first_meta, + "chunked": True, + } + + +def _logical_drawer_record(col, drawer_id: str): + direct = _single_drawer_record(col, drawer_id) + if direct is not None: + return direct + return _logical_chunk_group(col, drawer_id) + + +def _drawer_payload(record): + safe_meta = _response_safe_meta(record["metadata"]) + + payload = { + 
"drawer_id": record["drawer_id"], + "content": record["content"], + "wing": safe_meta.get("wing", ""), + "room": safe_meta.get("room", ""), + "metadata": safe_meta, + } + + if record.get("chunked"): + payload["chunks"] = len(record["ids"]) + payload["chunk_ids"] = record["ids"] + payload["metadata"]["chunks"] = len(record["ids"]) + payload["metadata"]["chunk_ids"] = record["ids"] + + return payload + + +def _fetch_drawer_rows(col, where=None, page_size: int = 1000): + ids = [] + documents = [] + metadatas = [] + offset = 0 + + while True: + kwargs = { + "include": ["documents", "metadatas"], + "limit": page_size, + "offset": offset, + } + if where: + kwargs["where"] = where + + result = col.get(**kwargs) + batch_ids = _chroma_field(result, "ids", []) or [] + if not batch_ids: + break + + batch_docs = _chroma_field(result, "documents", []) or [] + batch_metas = _chroma_field(result, "metadatas", []) or [] + + ids.extend(batch_ids) + + for idx in range(len(batch_ids)): + documents.append(batch_docs[idx] if idx < len(batch_docs) else "") + metadatas.append(batch_metas[idx] if idx < len(batch_metas) else {}) + + offset += len(batch_ids) + if len(batch_ids) < page_size: + break + + return ids, documents, metadatas + + +def _collapse_drawer_rows(ids, documents, metadatas): + groups = {} + singles = [] + + for idx, drawer_id in enumerate(ids): + doc = documents[idx] if idx < len(documents) else "" + meta = _safe_meta(metadatas[idx] if idx < len(metadatas) else {}) + parent_id = meta.get("parent_drawer_id") + + if parent_id: + groups.setdefault(parent_id, []).append( + (_chunk_index(meta), drawer_id, doc or "", meta) + ) + else: + singles.append((drawer_id, doc or "", meta)) + + grouped_ids = set(groups) + drawers = [] + + for drawer_id, doc, meta in singles: + # If both a legacy logical row and chunks exist, display one logical row. 
+ if drawer_id in grouped_ids: + continue + + safe_meta = _response_safe_meta(meta) + drawers.append( + { + "drawer_id": drawer_id, + "wing": safe_meta.get("wing", ""), + "room": safe_meta.get("room", ""), + "content_preview": _content_preview(doc), + "metadata": safe_meta, + } + ) + + for parent_id, parts in groups.items(): + parts.sort(key=lambda row: (row[0], row[1])) + chunk_ids = [row[1] for row in parts] + content = "".join(row[2] for row in parts) + + safe_meta = _response_safe_meta(parts[0][3] if parts else {}) + safe_meta["chunks"] = len(chunk_ids) + safe_meta["chunk_ids"] = chunk_ids + + drawers.append( + { + "drawer_id": parent_id, + "wing": safe_meta.get("wing", ""), + "room": safe_meta.get("room", ""), + "content_preview": _content_preview(content), + "metadata": safe_meta, + "chunks": len(chunk_ids), + "chunk_ids": chunk_ids, + } + ) + + drawers.sort(key=lambda item: item["drawer_id"]) + return drawers + + +def _build_chunk_rows(drawer_id: str, content: str, meta: dict, chunk_size: int): + chunk_size = max(1, int(chunk_size or 1)) + + base_meta = _safe_meta(meta) + base_meta.pop("chunk_index", None) + base_meta["parent_drawer_id"] = drawer_id + + spans = ( + [(0, "")] + if content == "" + else [ + (start, content[start : start + chunk_size]) + for start in range(0, len(content), chunk_size) + ] + ) + + chunk_ids = [] + chunk_docs = [] + chunk_metas = [] + + for start, chunk_doc in spans: + chunk_index = start // chunk_size + chunk_ids.append(f"{drawer_id}_chunk_{chunk_index:06d}") + chunk_docs.append(chunk_doc) + + chunk_meta = dict(base_meta) + chunk_meta["chunk_index"] = chunk_index + chunk_metas.append(chunk_meta) + + return chunk_ids, chunk_docs, chunk_metas + + def tool_add_drawer( wing: str, room: str, content: str, source_file: str = None, added_by: str = "mcp" ): @@ -1435,10 +1747,11 @@ def tool_add_drawer( idempotency_probe_ids = [drawer_id, f"{drawer_id}_chunk_{last_chunk_idx:06d}"] try: existing = col.get(ids=idempotency_probe_ids, 
include=[]) - if existing.ids: + if _get_result_ids(existing): return {"success": True, "reason": "already_exists", "drawer_id": drawer_id} - except Exception: - logger.debug("Idempotency pre-check failed for %s", idempotency_probe_ids, exc_info=True) + except Exception as e: + logger.warning("Idempotency pre-check failed for %s", idempotency_probe_ids, exc_info=True) + return {"success": False, "error": f"Idempotency check failed before write: {e}"} try: if len(content) <= chunk_size: @@ -1448,7 +1761,7 @@ def tool_add_drawer( metadatas=[{**base_meta, "chunk_index": 0}], ) inserted = col.get(ids=[drawer_id], include=[]) - if not inserted.ids: + if not _get_result_ids(inserted): raise RuntimeError( "Drawer write was acknowledged but the new ID is not readable. " "The palace index may be stale; run reconnect or repair." @@ -1483,7 +1796,7 @@ def tool_add_drawer( # Probe the LAST chunk id, not the first — its presence confirms # the whole batch landed, not just the leading row. inserted = col.get(ids=[chunk_ids[-1]], include=[]) - if not inserted.ids: + if not _get_result_ids(inserted): raise RuntimeError( "Drawer write was acknowledged but the new ID is not readable. " "The palace index may be stale; run reconnect or repair." 
@@ -1503,38 +1816,250 @@ def tool_add_drawer( def tool_delete_drawer(drawer_id: str): - """Delete a single drawer by ID.""" + """Delete a single logical drawer by ID.""" global _metadata_cache + col = _get_collection() if not col: return _collection_error_or_no_palace() - existing = col.get(ids=[drawer_id]) - if not existing["ids"]: - return {"success": False, "error": f"Drawer not found: {drawer_id}"} - - # Log the deletion with the content being removed for audit trail - deleted_content = existing.get("documents", [""])[0] if existing.get("documents") else "" - deleted_meta = _safe_meta( - existing.get("metadatas", [{}])[0] if existing.get("metadatas") else {} - ) - _wal_log( - "delete_drawer", - { - "drawer_id": drawer_id, - "deleted_meta": deleted_meta, - "content_preview": deleted_content[:200], - }, - ) try: - col.delete(ids=[drawer_id]) + record = _logical_drawer_record(col, drawer_id) + if record is None: + return {"success": False, "error": f"Drawer not found: {drawer_id}"} + + _wal_log( + "delete_drawer", + { + "drawer_id": drawer_id, + "deleted_ids": record["ids"], + "deleted_meta": record["metadata"], + "content_preview": record["content"][:200], + }, + ) + + col.delete(ids=record["ids"]) _metadata_cache = None - logger.info(f"Deleted drawer: {drawer_id}") - return {"success": True, "drawer_id": drawer_id} + + logger.info("Deleted drawer: %s (%s rows)", drawer_id, len(record["ids"])) + + return { + "success": True, + "drawer_id": drawer_id, + "deleted_ids": record["ids"], + "chunks_deleted": len(record["ids"]), + } except Exception as e: return {"success": False, "error": str(e)} +def _capture_fd_stdout(fn): + """Run ``fn()`` with its stdout captured at both the Python and fd level. + + The mining engines (``miner.mine`` / ``convo_miner.mine_convos`` / + ``format_miner.mine_formats``) print progress and a summary to stdout. 
In + the MCP server stdout is the JSON-RPC channel (``_restore_stdout`` runs once + in ``main`` before the protocol loop), so that output would corrupt the + protocol. Two layers are needed: + + * ``contextlib.redirect_stdout`` captures Python-level ``print`` into a + buffer — this is what becomes the returned summary, and it works even when + ``sys.stdout`` has been swapped (e.g. under pytest capture). + * an ``os.dup2`` of fd 1 to a temp file contains C-level banners emitted by + onnxruntime / chromadb during embedding, which bypass ``sys.stdout`` + entirely (the same reason the module redirects fd 1 at import, #225), and + keeps any direct fd-1 write off the live JSON-RPC channel. + + Returns ``(result, captured_text)``. ``captured_text`` is handed back to the + caller verbatim as an opaque summary; it is never parsed into fields. Falls + back to Python-level capture alone on platforms without fd-level stdio + (embedded interpreters), matching the import-time fallback. + """ + import contextlib + import io + import tempfile + + buf = io.StringIO() + sys.stdout.flush() + sys.stderr.flush() + try: + saved_fd = os.dup(1) + except (OSError, AttributeError): + with contextlib.redirect_stdout(buf): + result = fn() + return result, buf.getvalue() + + try: + with tempfile.TemporaryFile() as tmp: + os.dup2(tmp.fileno(), 1) + try: + with contextlib.redirect_stdout(buf): + result = fn() + finally: + sys.stdout.flush() + os.dup2(saved_fd, 1) + tmp.seek(0) + fd_text = tmp.read().decode("utf-8", "replace") + return result, buf.getvalue() + fd_text + finally: + os.close(saved_fd) + + +def tool_mine( + source: str, + mode: str = "projects", + wing: str = None, + agent: str = "mempalace", + limit: int = 0, + dry_run: bool = False, + extract: str = "exchange", +): + """Mine a directory into the palace — the MCP equivalent of ``mempalace mine``. 
+ + Lets MCP clients that cannot shell out (Claude Desktop, LM Studio, Aionui, + Desktop Commander) trigger indexing in-conversation (#1662). Wraps the same + in-process miners the CLI's ``cmd_mine`` calls; it adds no new ingestion + logic of its own. + + mode: + ``"projects"`` (default) — code/docs via ``miner.mine``. + ``"convos"`` — chat transcripts via ``convo_miner.mine_convos``. + ``"extract"`` — office documents (PDF/DOCX/RTF/…) via + ``format_miner.mine_formats``; requires the + optional ``mempalace[extract]`` dependency. + wing: target wing (default: derived from the source directory name). + agent: recorded on every drawer (default ``"mempalace"``). + limit: max files to process (0 = all). + dry_run: walk + chunk and report, but file nothing. + extract: convos extraction strategy — ``"exchange"`` (default) or + ``"general"``; ignored by the other modes. + + Runs synchronously and mirrors the :func:`tool_sync` contract: success + returns ``{success: True, mode, dry_run, output[, output_truncated]}`` where ``output`` is + the miner's human-readable summary (captured so it cannot corrupt the + JSON-RPC stream); failure returns ``{success: False, error[, error_class]}``. + The palace write lock is held by the miners themselves, so a concurrent mine + surfaces as a structured already-running error. Orphan cleanup is not part of + mining — use ``mempalace_sync`` for that. 
+ """ + global _metadata_cache + from .palace import MineAlreadyRunning, MineValidationError + + if not _config.palace_path: + np = _no_palace() + return {"success": False, "error": np.get("error", "no palace"), "hint": np.get("hint")} + + valid_modes = ("projects", "convos", "extract") + if mode not in valid_modes: + return { + "success": False, + "error": f"invalid mode '{mode}'; expected one of: {', '.join(valid_modes)}", + } + + src = os.path.expanduser(source) if source else "" + if not src or not os.path.isdir(src): + return {"success": False, "error": f"source directory not found: {source!r}"} + + def _run(): + if mode == "convos": + from .convo_miner import mine_convos + + return mine_convos( + convo_dir=src, + palace_path=_config.palace_path, + wing=wing, + agent=agent, + limit=limit, + dry_run=dry_run, + extract_mode=extract, + ) + if mode == "extract": + from .format_miner import mine_formats + + return mine_formats( + format_dir=src, + palace_path=_config.palace_path, + wing=wing, + agent=agent, + limit=limit, + dry_run=dry_run, + ) + from .miner import mine + + return mine( + project_dir=src, + palace_path=_config.palace_path, + wing_override=wing, + agent=agent, + limit=limit, + dry_run=dry_run, + ) + + try: + try: + _result, output = _capture_fd_stdout(_run) + # Order matters: typed handlers precede the bare Exception (mirroring + # tool_sync) so MineAlreadyRunning / MineValidationError / ValueError + # don't fall into the generic "mine failed" branch. + except MineAlreadyRunning as exc: + return { + "success": False, + "error": f"another mine is in progress: {exc}", + "error_class": "LockHeldByOtherProcess", + } + except MineValidationError as exc: + return { + "success": False, + "error": f"palace integrity check failed after mine: {exc}", + "error_class": "MineValidationError", + } + except ImportError as exc: + # 'extract' mode pulls in the optional mempalace[extract] stack; + # name it so the caller knows to install the extra. 
Other modes have + # no optional imports, so an ImportError there is a real bug, not a + # missing extra — log the traceback and surface its type. + if mode == "extract": + return { + "success": False, + "error": f"mode 'extract' needs the mempalace[extract] extra: {exc}", + "error_class": "MissingDependency", + } + logger.exception("tool_mine: unexpected ImportError (mode=%s)", mode) + return {"success": False, "error": f"mine failed: {exc}", "error_class": "ImportError"} + except ValueError as exc: + return {"success": False, "error": str(exc), "error_class": "ValueError"} + except SystemExit as exc: + # A library mine() must never terminate the MCP server. miner.mine + # converts Ctrl-C into sys.exit(130) (CLI semantics); in-process + # that SystemExit is a BaseException that would slip past the + # protocol loop's `except Exception` and kill the server with no + # response. Convert it to a structured error instead. + return { + "success": False, + "error": f"mine exited early (code {exc.code})", + "error_class": "Interrupted", + } + except Exception as exc: + logger.exception("tool_mine: mine failed (mode=%s)", mode) + return { + "success": False, + "error": f"mine failed: {exc}", + "error_class": type(exc).__name__, + } + # Cap the echoed summary so a very large mine cannot return a multi-MB + # payload to the MCP client. The useful summary is at the tail, so keep + # the end and flag the truncation (never silently). 
+ payload = {"success": True, "mode": mode, "dry_run": dry_run, "output": output} + cap = 4000 + if len(output) > cap: + payload["output"] = output[-cap:] + payload["output_truncated"] = True + return payload + finally: + if not dry_run: + _metadata_cache = None + + def tool_sync(project_dir: str = None, wing: str = None, apply: bool = False): """Prune drawers whose source files are gitignored, missing, or moved (#1252).""" global _metadata_cache @@ -1574,88 +2099,57 @@ def tool_sync(project_dir: str = None, wing: str = None, apply: bool = False): def tool_get_drawer(drawer_id: str): - """Fetch a single drawer by ID. Returns full content and metadata.""" + """Fetch a single logical drawer by ID.""" col = _get_collection() if not col: return _collection_error_or_no_palace() + try: - result = col.get(ids=[drawer_id], include=["documents", "metadatas"]) - if not result["ids"]: + record = _logical_drawer_record(col, drawer_id) + if record is None: return {"error": f"Drawer not found: {drawer_id}"} - meta = _safe_meta(result["metadatas"][0]) - doc = result["documents"][0] - # source_file is the absolute filesystem path written by the - # miners. Reduce to its basename before handing it to the MCP - # client — same threat model as the palace_path leak fix: - # nested-agent / multi-server topologies treat the client as a - # separate trust domain. Basename preserves citation utility. - # Mirrors the searcher.search_memories() return shape. - safe_meta = dict(meta) if meta else {} - if safe_meta.get("source_file"): - safe_meta["source_file"] = Path(safe_meta["source_file"]).name - return { - "drawer_id": drawer_id, - "content": doc, - "wing": safe_meta.get("wing", ""), - "room": safe_meta.get("room", ""), - "metadata": safe_meta, - } + return _drawer_payload(record) except Exception as e: return {"error": str(e)} def tool_list_drawers(wing: str = None, room: str = None, limit: int = 20, offset: int = 0): - """List drawers with pagination. 
Optional wing/room filter.""" + """List logical drawers with pagination.""" limit = max(1, min(limit, _MAX_RESULTS)) offset = max(0, offset) + try: wing = _sanitize_optional_name(wing, "wing") room = _sanitize_optional_name(room, "room") except ValueError as e: return {"error": str(e)} + col = _get_collection() if not col: return _collection_error_or_no_palace() + try: where = None conditions = [] + if wing: conditions.append({"wing": wing}) if room: conditions.append({"room": room}) + if len(conditions) == 1: where = conditions[0] elif len(conditions) > 1: where = {"$and": conditions} - kwargs = {"include": ["documents", "metadatas"], "limit": limit, "offset": offset} - if where: - kwargs["where"] = where - result = col.get(**kwargs) + ids, documents, metadatas = _fetch_drawer_rows(col, where=where) + drawers = _collapse_drawer_rows(ids, documents, metadatas) + page = drawers[offset : offset + limit] - # Compute total matching drawers for pagination. - if where: - total_result = col.get(where=where, include=[]) - total = len(total_result["ids"]) - else: - total = col.count() - - drawers = [] - for i, did in enumerate(result["ids"]): - meta = _safe_meta(result["metadatas"][i]) - doc = result["documents"][i] - drawers.append( - { - "drawer_id": did, - "wing": meta.get("wing", ""), - "room": meta.get("room", ""), - "content_preview": doc[:200] + "..." 
if len(doc) > 200 else doc, - } - ) return { - "drawers": drawers, - "total": total, - "count": len(drawers), + "drawers": page, + "total": len(drawers), + "count": len(page), "offset": offset, "limit": limit, } @@ -1664,7 +2158,7 @@ def tool_list_drawers(wing: str = None, room: str = None, limit: int = 20, offse def tool_update_drawer(drawer_id: str, content: str = None, wing: str = None, room: str = None): - """Update an existing drawer's content and/or metadata.""" + """Update an existing logical drawer's content and/or metadata.""" global _metadata_cache if content is None and wing is None and room is None: @@ -1673,13 +2167,14 @@ def tool_update_drawer(drawer_id: str, content: str = None, wing: str = None, ro col = _get_collection() if not col: return _collection_error_or_no_palace() + try: - existing = col.get(ids=[drawer_id], include=["documents", "metadatas"]) - if not existing["ids"]: + record = _logical_drawer_record(col, drawer_id) + if record is None: return {"success": False, "error": f"Drawer not found: {drawer_id}"} - old_meta = _safe_meta(existing["metadatas"][0]) - old_doc = existing["documents"][0] + old_meta = _safe_meta(record["metadata"]) + old_doc = record["content"] new_doc = old_doc if content is not None: @@ -1689,16 +2184,22 @@ def tool_update_drawer(drawer_id: str, content: str = None, wing: str = None, ro return {"success": False, "error": str(e)} new_meta = dict(old_meta) + if wing is not None: try: - new_meta["wing"] = sanitize_name(wing, "wing") + wing = sanitize_name(wing, "wing") except ValueError as e: return {"success": False, "error": str(e)} + if wing.lower() != str(old_meta.get("wing") or "").lower(): + new_meta["wing"] = wing + if room is not None: try: - new_meta["room"] = sanitize_name(room, "room") + room = sanitize_name(room, "room") except ValueError as e: return {"success": False, "error": str(e)} + if room.lower() != str(old_meta.get("room") or "").lower(): + new_meta["room"] = room _wal_log( "update_drawer", @@ 
-1713,15 +2214,47 @@ def tool_update_drawer(drawer_id: str, content: str = None, wing: str = None, ro }, ) - update_kwargs = {"ids": [drawer_id]} + chunk_size = max(1, int(getattr(_config, "chunk_size", 800) or 800)) + should_chunk = bool(record.get("chunked")) or len(new_doc) > chunk_size + + if should_chunk: + chunk_ids, chunk_docs, chunk_metas = _build_chunk_rows( + drawer_id, + new_doc, + new_meta, + chunk_size, + ) + + col.upsert(ids=chunk_ids, documents=chunk_docs, metadatas=chunk_metas) + + keep_ids = set(chunk_ids) + stale_ids = [old_id for old_id in record["ids"] if old_id not in keep_ids] + if stale_ids: + col.delete(ids=stale_ids) + + _metadata_cache = None + + logger.info("Updated drawer: %s (%s rows)", drawer_id, len(chunk_ids)) + + return { + "success": True, + "drawer_id": drawer_id, + "wing": new_meta.get("wing", ""), + "room": new_meta.get("room", ""), + "chunks": len(chunk_ids), + "chunk_ids": chunk_ids, + } + + update_kwargs = {"ids": [record["ids"][0]]} if content is not None: update_kwargs["documents"] = [new_doc] update_kwargs["metadatas"] = [new_meta] - col.update(**update_kwargs) + col.update(**update_kwargs) _metadata_cache = None - logger.info(f"Updated drawer: {drawer_id}") + logger.info("Updated drawer: %s", drawer_id) + return { "success": True, "drawer_id": drawer_id, @@ -2472,6 +3005,30 @@ def tool_reconnect(): }, "handler": tool_delete_tunnel, }, + "mempalace_list_hallways": { + "description": "List within-wing hallway records (entity-to-entity co-occurrence links built at mine time). Optionally filter by wing.", + "input_schema": { + "type": "object", + "properties": { + "wing": { + "type": "string", + "description": "Filter hallways by wing", + }, + }, + }, + "handler": tool_list_hallways, + }, + "mempalace_delete_hallway": { + "description": "Delete a hallway record by its ID. 
Returns {deleted: bool}.", + "input_schema": { + "type": "object", + "properties": { + "hallway_id": {"type": "string", "description": "Hallway ID to delete"}, + }, + "required": ["hallway_id"], + }, + "handler": tool_delete_hallway, + }, "mempalace_follow_tunnels": { "description": "Follow tunnels from a room to see what it connects to in other wings. Returns connected rooms with drawer previews.", "input_schema": { @@ -2562,6 +3119,60 @@ def tool_reconnect(): }, "handler": tool_delete_drawer, }, + "mempalace_mine": { + "description": ( + "Mine a directory into the palace — the MCP equivalent of `mempalace mine`. " + "mode='projects' (default) ingests code/docs; mode='convos' ingests chat " + "transcripts; mode='extract' ingests office documents (PDF/DOCX/RTF, requires " + "the mempalace[extract] extra). Runs synchronously and returns the miner's " + "summary as `output`. The palace write lock is automatic; a concurrent mine " + "returns a structured already-running error. Orphan cleanup is separate — use " + "mempalace_sync." + ), + "input_schema": { + "type": "object", + "properties": { + "source": { + "type": "string", + "description": "Directory to mine.", + }, + "mode": { + "type": "string", + "enum": ["projects", "convos", "extract"], + "description": ( + "Ingest mode: projects (code/docs, default), convos (chat " + "transcripts), extract (office docs)." + ), + }, + "wing": { + "type": "string", + "description": "Target wing (default: source directory name).", + }, + "agent": { + "type": "string", + "description": "Recorded on every drawer (default: mempalace).", + }, + "limit": { + "type": "integer", + "description": "Max files to process (0 = all). Default: 0.", + }, + "dry_run": { + "type": "boolean", + "description": "Report what would be filed without writing. Default: false.", + }, + "extract": { + "type": "string", + "enum": ["exchange", "general"], + "description": ( + "Convos extraction strategy: exchange (default) or general. 
" + "Ignored by other modes." + ), + }, + }, + "required": ["source"], + }, + "handler": tool_mine, + }, "mempalace_sync": { "description": "Prune drawers whose source files are gitignored, deleted, or moved. Returns dry-run report by default; pass apply=true to commit deletions.", "input_schema": { @@ -2662,13 +3273,10 @@ def tool_reconnect(): "description": "Alias for 'entry' — accepted because add_drawer uses 'content'. Provide either 'entry' or 'content'; 'entry' wins if both are given.", }, }, - # agent_name is always required; 'entry' or its alias 'content' must - # be present (the server remaps content->entry at dispatch). + # 'entry' (or its alias 'content') is enforced at dispatch, not via a + # top-level anyOf: Anthropic rejects schemas with a top-level + # anyOf/oneOf/allOf and drops the whole tools array (400). "required": ["agent_name"], - "anyOf": [ - {"required": ["entry"]}, - {"required": ["content"]}, - ], }, "handler": tool_diary_write, }, diff --git a/mempalace/migrate.py b/mempalace/migrate.py index 36e18622d..0814bf514 100644 --- a/mempalace/migrate.py +++ b/mempalace/migrate.py @@ -19,6 +19,7 @@ """ import errno +import glob import os import shutil import sqlite3 @@ -28,6 +29,9 @@ from contextlib import closing from datetime import datetime +from .backups import prune_backups +from .config import MempalaceConfig + def _restore_stale_palace(palace_path: str, stale_path: str) -> None: """Roll back a failed swap. @@ -293,6 +297,16 @@ def migrate(palace_path: str, dry_run: bool = False, confirm: bool = False): print(f"\n Backing up to {backup_path}...") shutil.copytree(palace_path, backup_path) + # Enforce backup retention so repeated migrations cannot fill the disk + # with full-palace copies. The backup we just created is the newest, so + # it survives; only older ``.pre-migrate.*`` siblings beyond the limit + # are removed. Best-effort — never let cleanup fail the migration. 
+ prune_backups( + glob.escape(palace_path) + ".pre-migrate.*", + MempalaceConfig().max_backups, + log=print, + ) + # Build fresh palace in a temp directory (avoids chromadb reading old state). # Wrap the whole import-and-swap dance in try/finally so the temp dir is # cleaned up if any of the chromadb writes, the verify count, or the diff --git a/mempalace/miner.py b/mempalace/miner.py index 5566a0214..aed181070 100644 --- a/mempalace/miner.py +++ b/mempalace/miner.py @@ -1623,9 +1623,6 @@ def _mine_impl( respect_gitignore=respect_gitignore, include_ignored=include_ignored, ) - if limit > 0: - files = files[:limit] - from .embedding import describe_device print(f"\n{'=' * 55}") @@ -1633,7 +1630,8 @@ def _mine_impl( print(f"{'=' * 55}") print(f" Wing: {wing}") print(f" Rooms: {', '.join(r['name'] for r in rooms)}") - print(f" Files: {len(files)}") + limit_suffix = f" (limit: {limit} new)" if limit > 0 else "" + print(f" Files: {len(files)}{limit_suffix}") print(f" Palace: {palace_path}") print(f" Device: {describe_device()}") if dry_run: @@ -1652,6 +1650,7 @@ def _mine_impl( closets_col = None total_drawers = 0 + files_mined = 0 files_skipped = 0 files_skipped_chunk_cap = 0 files_processed = 0 @@ -1699,8 +1698,11 @@ def _mine_impl( else: total_drawers += drawers room_counts[room] += 1 + files_mined += 1 if not dry_run: print(f" + [{i:4}/{len(files)}] {filepath.name[:50]:50} +{drawers}") + if limit > 0 and files_mined >= limit: + break if not dry_run: # Cross-wing topic tunnels: after every file in this wing has been @@ -1753,7 +1755,7 @@ def _mine_impl( print(f"\n{'=' * 55}") print(" Done.") - print(f" Files processed: {len(files) - files_skipped}") + print(f" Files processed: {files_processed - files_skipped}") # The residual skip bucket label depends on mode: dry-run bypasses # the already-mined check, so the only paths producing (0, room, # None) under dry_run are OSError / too-short / post-lock re-check diff --git a/mempalace/palace.py b/mempalace/palace.py 
index 32d2d57e1..15a95f4ba 100644 --- a/mempalace/palace.py +++ b/mempalace/palace.py @@ -70,13 +70,109 @@ NORMALIZE_VERSION = 2 +# (palace_id, collection_name, model_name) tuples already validated this +# process, so the identity check (one metadata read) runs at most once per +# collection per run — keeps the hot get_collection path cheap. +_VALIDATED_IDENTITY: set = set() + + +def _enforce_embedder_identity(collection, palace_path, collection_name, *, create) -> None: + """Check (and, for a brand-new collection, record) embedder identity (RFC 001). + + Check at open so a model swap fails fast — before any query silently + returns degraded results. Record only when the collection is brand-new and + empty: recording the *current* model on a legacy palace that already holds + vectors from an unknown model would mislabel it, so populated-but-unrecorded + collections warn instead and are resolved with + ``mempalace palace set-embedder``. + + Bookkeeping must never break memory operations: only the deliberate + identity/dimension mismatch propagates; every other error is swallowed. + """ + import warnings + + from .backends.base import ( + DimensionMismatchError, + EmbedderIdentity, + EmbedderIdentityMismatchError, + EmbedderIdentityUnknownWarning, + check_embedder_identity, + ) + from .embedding import current_model_name + + # A server_embedder backend embeds with its own model and ignores the + # injected/core embedder, so its effective identity — not the configured + # model — is what must be checked and recorded. Fall back to the configured + # model name for the normal (core-embedder) case. 
+ current: Optional[EmbedderIdentity] = None + try: + effective = collection.effective_embedder_identity() + except Exception: + effective = None + if effective is not None and getattr(effective, "model_name", ""): + current = effective + else: + try: + model_name = current_model_name() + except Exception: + return + if not model_name: + return # nameless embedder — cannot enforce identity + current = EmbedderIdentity(model_name=model_name, dimension=0) + + model_name = current.model_name + key = (str(palace_path), str(collection_name), model_name) + if key in _VALIDATED_IDENTITY: + return + + try: + stored = collection.get_stored_embedder_identity() + except Exception: + logger.debug("embedder-identity read failed for %s", collection_name, exc_info=True) + return + try: + state = check_embedder_identity(stored, current) + except (EmbedderIdentityMismatchError, DimensionMismatchError): + raise # deliberate, user-facing — the whole point of the contract + except Exception: + return + + if state == "unknown" and stored is None: + try: + count = collection.count() + except Exception: + count = None + if count == 0: + if create: + try: + collection.set_embedder_identity(current) + except Exception: + logger.debug("embedder-identity record failed", exc_info=True) + elif count: + warnings.warn( + f"palace collection {collection_name!r} has no recorded embedder " + f"identity; assuming the current model {model_name!r}. Run " + "`mempalace palace set-embedder --model ` to record it.", + EmbedderIdentityUnknownWarning, + stacklevel=2, + ) + + _VALIDATED_IDENTITY.add(key) + + def get_collection( palace_path: str, collection_name: Optional[str] = None, create: bool = True, backend: Optional[str] = None, + _skip_identity_check: bool = False, ): - """Get the palace collection through the backend layer.""" + """Get the palace collection through the backend layer. 
+ + ``_skip_identity_check`` bypasses the embedder-identity enforcement so the + ``set-embedder`` override path can open a palace whose recorded model + differs from the current one (the very state it exists to repair). + """ if collection_name is None: from .config import get_configured_collection_name @@ -98,10 +194,71 @@ def get_collection( create=create, ) if "requires_explicit_embeddings" in getattr(backend_obj, "capabilities", frozenset()): - return EmbeddingCollection(collection) + collection = EmbeddingCollection(collection) + if not _skip_identity_check: + _enforce_embedder_identity(collection, palace_path, collection_name, create=create) return collection +def set_palace_embedder_identity( + palace_path: str, + model: Optional[str] = None, + *, + force: bool = False, + backend: Optional[str] = None, + collection_name: Optional[str] = None, +): + """Record (or force-override) a palace collection's embedder identity (RFC 001). + + Backs ``mempalace palace set-embedder``. Returns ``(old, new)`` identities. + Without ``force``, refuses to overwrite an existing identity that names a + different model (the user must confirm they know the vectors are + compatible). Opens with the identity check skipped so a mismatched palace — + the exact state being repaired — can be opened at all. + """ + from .backends.base import EmbedderIdentity, EmbedderIdentityMismatchError + from .config import MempalaceConfig + from .embedding import get_embedder_identity + + configured = MempalaceConfig().embedding_model + target = (model or configured or "").strip().lower() + if not target: + # No model given and none configured — there is nothing to record, and + # recording a nameless identity is a silent no-op in every backend. + raise ValueError( + "no embedder model to record: pass --model NAME or configure MEMPALACE_EMBEDDING_MODEL" + ) + if target == (configured or "").strip().lower(): + # Recording the in-use model — probe its dimension (already loaded). 
+ new = get_embedder_identity() + else: + # Explicit override of a non-configured model: record the name only, + # never load a foreign model (which can be a large download) just to + # probe a dimension. The model-name check is the actual protection. + new = EmbedderIdentity(model_name=target, dimension=0) + collection = get_collection( + palace_path, + collection_name=collection_name, + create=True, + backend=backend, + _skip_identity_check=True, + ) + try: + old = collection.get_stored_embedder_identity() + except Exception: + old = None + if old is not None and old.model_name != new.model_name and not force: + raise EmbedderIdentityMismatchError( + f"palace already records embedder {old.model_name!r}; pass --force to " + f"overwrite it with {new.model_name!r} (only if the vectors are compatible)" + ) + collection.set_embedder_identity(new) + # Reset the per-process validation cache so a re-open re-checks against the + # newly recorded identity rather than a stale verdict. + _VALIDATED_IDENTITY.clear() + return old, new + + def get_closets_collection( palace_path: str, create: bool = True, @@ -568,38 +725,185 @@ def mine_lock(source_file: str): Prevents multiple agents from mining the same file simultaneously, which causes duplicate drawers when the delete+insert cycle interleaves. 
""" + lock_path = _mine_lock_path(source_file) + lf = _acquire_mine_lock_file(lock_path) + try: + yield + finally: + try: + _unlock_mine_lock_file(lf) + except Exception: + logger.debug("Mine-lock release failed", exc_info=True) + try: + lf.close() + except Exception: + logger.debug("Mine-lock close failed", exc_info=True) + _cleanup_mine_lock_file(lock_path) + + +def _mine_lock_path(source_file: str) -> str: lock_dir = os.path.join(os.path.expanduser("~"), ".mempalace", "locks") os.makedirs(lock_dir, exist_ok=True) - lock_path = os.path.join( - lock_dir, hashlib.sha256(source_file.encode()).hexdigest()[:16] + ".lock" - ) + return os.path.join(lock_dir, hashlib.sha256(source_file.encode()).hexdigest()[:16] + ".lock") - lf = open(lock_path, "w") - try: - if os.name == "nt": - import msvcrt - msvcrt.locking(lf.fileno(), msvcrt.LK_LOCK, 1) - else: - import fcntl +def _open_mine_lock_file(lock_path: str, *, create: bool): + flags = os.O_RDWR + if create: + flags |= os.O_CREAT + fd = os.open(lock_path, flags, 0o600) + return os.fdopen(fd, "r+b") - fcntl.flock(lf, fcntl.LOCK_EX) - yield - finally: + +def _lock_mine_lock_file(lock_file, *, blocking: bool) -> bool: + lock_file.seek(0) + if os.name == "nt": + import msvcrt + + mode = msvcrt.LK_LOCK if blocking else msvcrt.LK_NBLCK try: - if os.name == "nt": - import msvcrt + msvcrt.locking(lock_file.fileno(), mode, 1) + except OSError: + if not blocking: + return False + raise + return True + + import fcntl + + flags = fcntl.LOCK_EX + if not blocking: + flags |= fcntl.LOCK_NB + try: + fcntl.flock(lock_file, flags) + except BlockingIOError: + if not blocking: + return False + raise + return True + - msvcrt.locking(lf.fileno(), msvcrt.LK_UNLCK, 1) - else: - import fcntl +def _unlock_mine_lock_file(lock_file) -> None: + lock_file.seek(0) + if os.name == "nt": + import msvcrt + + msvcrt.locking(lock_file.fileno(), msvcrt.LK_UNLCK, 1) + return + + import fcntl + + fcntl.flock(lock_file, fcntl.LOCK_UN) + + +def 
_mine_lock_file_is_current(lock_file, lock_path: str) -> bool: + """Return whether ``lock_file`` is still the inode reached by ``lock_path``. + + POSIX advisory locks attach to the opened inode, not the pathname. If a + lock file is unlinked while a contender is waiting, that contender can later + acquire a lock on an inode no new process will use. We reject that stale + handle and retry on the current pathname. + """ + if os.name == "nt": + return True + try: + path_stat = os.stat(lock_path) + file_stat = os.fstat(lock_file.fileno()) + except OSError: + return False + return (path_stat.st_dev, path_stat.st_ino) == (file_stat.st_dev, file_stat.st_ino) + + +def _acquire_open_mine_lock_file(lock_file, lock_path: str) -> bool: + """Acquire ``lock_file`` and return False if cleanup made it stale.""" + _lock_mine_lock_file(lock_file, blocking=True) + if _mine_lock_file_is_current(lock_file, lock_path): + return True + try: + _unlock_mine_lock_file(lock_file) + except Exception: + logger.debug("Mine-lock stale-handle release failed", exc_info=True) + return False - fcntl.flock(lf, fcntl.LOCK_UN) + +def _acquire_mine_lock_file(lock_path: str): + while True: + lf = _open_mine_lock_file(lock_path, create=True) + try: + if _acquire_open_mine_lock_file(lf, lock_path): + return lf except Exception: - logger.debug("Mine-lock release failed", exc_info=True) + lf.close() + raise lf.close() +def _cleanup_mine_lock_file(lock_path: str) -> None: + """Best-effort removal that preserves flock rendezvous semantics. + + A plain ``os.remove(lock_path)`` after closing the critical-section lock is + unsafe on POSIX: a waiter may already be blocked on the old inode while a + later process creates and locks a new inode at the same pathname. Instead, + cleanup briefly re-acquires the current file nonblocking. If it wins, it can + unlink that inode as cleanup-only work; waiters on the old inode will detect + the stale handle after waking and retry on the current path. 
+ """ + try: + lf = _open_mine_lock_file(lock_path, create=False) + except FileNotFoundError: + return + except OSError: + logger.debug("Mine-lock cleanup open failed for %s", lock_path, exc_info=True) + return + + acquired = False + closed = False + try: + try: + acquired = _lock_mine_lock_file(lf, blocking=False) + except OSError: + logger.debug("Mine-lock cleanup acquire failed for %s", lock_path, exc_info=True) + return + if not acquired: + return + if not _mine_lock_file_is_current(lf, lock_path): + return + + if os.name == "nt": + # Windows generally cannot unlink an open locked file. Release and + # close first; if another process opens the file in the gap, + # os.remove should fail and we leave the rendezvous file in place. + try: + _unlock_mine_lock_file(lf) + except Exception: + logger.debug("Mine-lock cleanup release failed", exc_info=True) + acquired = False + return + acquired = False + lf.close() + closed = True + try: + os.remove(lock_path) + except OSError: + pass + return + + try: + os.remove(lock_path) + except FileNotFoundError: + pass + except OSError: + logger.debug("Mine-lock cleanup remove failed for %s", lock_path, exc_info=True) + finally: + if not closed: + if acquired: + try: + _unlock_mine_lock_file(lf) + except Exception: + logger.debug("Mine-lock cleanup release failed", exc_info=True) + lf.close() + + class MineAlreadyRunning(RuntimeError): """Raised when another `mempalace mine` already holds the per-palace lock.""" diff --git a/mempalace/repair.py b/mempalace/repair.py index 7a4a28cd1..46de6228d 100644 --- a/mempalace/repair.py +++ b/mempalace/repair.py @@ -715,6 +715,19 @@ def _vacuum_and_rebuild_fts5(palace_path: str, progress=print) -> None: progress(f" Warning: post-repair cleanup failed (non-fatal): {exc}") +def _post_rebuild_cleanup(palace_path: str, backend: "ChromaBackend", progress=print) -> None: + """Close cached chroma handles, then VACUUM and rebuild the FTS5 index. 
+ + Shared epilogue for the two full-rebuild paths (``rebuild_index`` and the + CLI legacy ``cmd_repair``), so neither can drift out of the post-run + cleanup again (issues #1517, #1747). ChromaDB's PersistentClient keeps + chroma.sqlite3 open and VACUUM needs exclusive access, so the handles + are released first. + """ + _close_chroma_handles(palace_path, backend=backend) + _vacuum_and_rebuild_fts5(palace_path, progress=progress) + + def rebuild_index( palace_path=None, confirm_truncation_ok: bool = False, @@ -848,8 +861,7 @@ def rebuild_index( print(" Live collection was not replaced; leaving the original palace untouched.") raise - _close_chroma_handles(palace_path, backend=backend) - _vacuum_and_rebuild_fts5(palace_path, progress=progress) + _post_rebuild_cleanup(palace_path, backend=backend, progress=progress) print(f"\n Repair complete. {filed} drawers rebuilt.") print(" HNSW index is now clean with cosine distance metric.") @@ -1529,12 +1541,26 @@ def repair_max_seq_id( return result if backup: + import glob + + from .backups import prune_backups + from .config import MempalaceConfig + timestamp = datetime.now().strftime("%Y%m%d-%H%M%S") backup_path = os.path.join(palace_path, f"chroma.sqlite3.max-seq-id-backup-{timestamp}") shutil.copy2(db_path, backup_path) result["backup"] = backup_path print(f" Backup: {backup_path}") + # Retain only the most recent N backups (the copy just written is the + # newest and is kept). Without this, every max-seq-id repair leaves a + # full chroma.sqlite3 copy behind that is never cleaned up. 
+ prune_backups( + os.path.join(glob.escape(palace_path), "chroma.sqlite3.max-seq-id-backup-*"), + MempalaceConfig().max_backups, + log=print, + ) + _close_chroma_handles(palace_path) with sqlite3.connect(db_path) as conn: diff --git a/mempalace/searcher.py b/mempalace/searcher.py index ca0ba46ad..43796c322 100644 --- a/mempalace/searcher.py +++ b/mempalace/searcher.py @@ -130,18 +130,70 @@ def _bm25_scores( return scores +def _distance_to_similarity(distance, metric: str = "cosine") -> float: + """Map a backend-reported ``distance`` to a [0, 1]-ish similarity. + + The backend contract for the ``distances`` field is *lower = closer* + regardless of metric (RFC 001, backend metric declaration), so every + mapping here is monotonic decreasing in ``distance``. The output stays + bounded so it is + commensurable with the min-max-normalized BM25 term in + :func:`_hybrid_rank`. + + * ``cosine`` — distance ∈ [0, 2], 0 = identical: ``max(0, 1 - d)``. + * ``l2`` — Euclidean ∈ [0, ∞): ``1 / (1 + d)`` (1 at d=0, →0 as d→∞). + * ``ip`` — inner-product distance (e.g. pgvector ``<#>`` = -dot, lower = + closer), unbounded and signed: logistic squash ``1 / (1 + e^d)``. + Provisional until a real ip backend exercises it; no in-tree backend + uses ip today. + + ``distance is None`` (vector-unknown, e.g. a BM25-only candidate) maps to + 0.0 so the candidate scores on its BM25 contribution alone. + """ + if distance is None: + return 0.0 + m = (metric or "cosine").lower() + if m == "l2": + return 1.0 / (1.0 + max(0.0, distance)) + if m == "ip": + # Clamp the exponent so a large positive distance can't overflow. + return 1.0 / (1.0 + math.exp(min(60.0, distance))) + # cosine (default) + return max(0.0, 1.0 - distance) + + +def _metric_for_collection(col) -> str: + """Resolve a collection's declared distance metric, defaulting to cosine. + + Reads the ``distance_metric`` exposed by the backend collection (the + RFC 001 backend metric declaration). 
``EmbeddingCollection`` delegates the + attribute to its inner collection; legacy Chroma palaces report their + actual ``hnsw:space``. + Any failure falls back to ``"cosine"`` — the value all in-tree backends + use and the only metric MemPalace created palaces with historically. + """ + try: + metric = getattr(col, "distance_metric", "cosine") + except Exception: + return "cosine" + metric = str(metric or "cosine").lower() + return metric if metric in ("cosine", "l2", "ip") else "cosine" + + def _hybrid_rank( results: list, query: str, vector_weight: float = 0.6, bm25_weight: float = 0.4, + metric: str = "cosine", ) -> list: """Re-rank ``results`` by a convex combination of vector similarity and BM25. - * Vector similarity uses absolute cosine sim ``max(0, 1 - distance)`` — - ChromaDB's hnsw cosine distance lives in ``[0, 2]`` (0 = identical). - Absolute (not relative-to-max) means adding/removing a candidate - can't reshuffle the others. + * Vector similarity is derived from each candidate's backend-reported + ``distance`` via :func:`_distance_to_similarity`, interpreted in the + collection's declared ``metric`` (per RFC 001) rather than assuming + cosine. Absolute (not relative-to-max) means adding/removing a + candidate can't reshuffle the others. * BM25 is real Okapi-BM25 with corpus-relative IDF over the candidates themselves. Since the absolute scale is unbounded, BM25 is min-max normalized within the candidate set so weights are commensurable. 
@@ -164,11 +216,7 @@ def _hybrid_rank( scored = [] for r, raw, norm in zip(results, bm25_raw, bm25_norm): - distance = r.get("distance") - if distance is None: - vec_sim = 0.0 - else: - vec_sim = max(0.0, 1.0 - distance) + vec_sim = _distance_to_similarity(r.get("distance"), metric) r["bm25_score"] = round(raw, 3) scored.append((vector_weight * vec_sim + bm25_weight * norm, r)) @@ -202,6 +250,31 @@ def _extract_drawer_ids_from_closet(closet_doc: str) -> list: return list(seen.keys()) +def _scoped_source_filter(source_file: str, parent_drawer_id=None) -> dict: + """Build a Chroma ``where`` clause that scopes a query to ``source_file``, + additionally constrained by ``parent_drawer_id`` when one is supplied. + + Two unrelated oversized ``tool_add_drawer`` writes (chunked path from + #1539) can pass the same ``source_file`` (e.g. two pastes tagged + ``"chat.log"``); each call stores its own ``parent_drawer_id`` group + of chunks but the bare ``source_file`` filter pulls chunks from both + groups as if they were siblings (#1580). When the matched chunk + carries a ``parent_drawer_id`` the filter narrows to that logical + group. Otherwise (pre-#1539 drawers, single-chunk writes, and + ``diary_ingest`` drawers grouped by real file path) the original + file-global shape is preserved. Mirrors the conditional-``$and`` + precedent in ``build_where_filter``. + """ + if parent_drawer_id: + return { + "$and": [ + {"source_file": source_file}, + {"parent_drawer_id": parent_drawer_id}, + ] + } + return {"source_file": source_file} + + def _expand_with_neighbors(drawers_col, matched_doc: str, matched_meta: dict, radius: int = 1): """Expand a matched drawer with its ±radius sibling chunks in the same source file. 
@@ -225,15 +298,20 @@ def _expand_with_neighbors(drawers_col, matched_doc: str, matched_meta: dict, ra if not src or not isinstance(chunk_idx, int): return {"text": matched_doc, "drawer_index": chunk_idx, "total_drawers": None} + # Narrow by ``parent_drawer_id`` when present so chunks from unrelated + # logical drawers sharing ``source_file`` do not stitch (#1580). See + # ``_scoped_source_filter`` for the contract. + parent_id = matched_meta.get("parent_drawer_id") target_indexes = [chunk_idx + offset for offset in range(-radius, radius + 1)] + neighbor_clauses = [ + {"source_file": src}, + {"chunk_index": {"$in": target_indexes}}, + ] + if parent_id: + neighbor_clauses.append({"parent_drawer_id": parent_id}) try: neighbors = drawers_col.get( - where={ - "$and": [ - {"source_file": src}, - {"chunk_index": {"$in": target_indexes}}, - ] - }, + where={"$and": neighbor_clauses}, include=["documents", "metadatas"], ) except Exception: @@ -251,10 +329,16 @@ def _expand_with_neighbors(drawers_col, matched_doc: str, matched_meta: dict, ra else: combined_text = "\n\n".join(doc for _, doc in indexed_docs) - # Cheap total_drawers lookup: metadata-only scan of the source file. + # Cheap total_drawers lookup. When ``parent_drawer_id`` is present the + # count is scoped to that group so the returned number matches the + # text the caller gets back. Without a parent id, the legacy + # file-global count is preserved. 
total_drawers = None try: - all_meta = drawers_col.get(where={"source_file": src}, include=["metadatas"]) + all_meta = drawers_col.get( + where=_scoped_source_filter(src, parent_id), + include=["metadatas"], + ) total_drawers = len(all_meta.ids) if all_meta.ids else None except Exception: logger.debug("total_drawers lookup failed for %s", src, exc_info=True) @@ -350,11 +434,12 @@ def search(query: str, palace_path: str, wing: str = None, room: str = None, n_r # The MCP tool path already hybridizes BM25 with vector sim via # `_hybrid_rank`; do the same here so CLI results match what agents # see via `mempalace_search`. + metric = _metric_for_collection(col) hits = [ {"text": doc or "", "distance": float(dist), "metadata": meta or {}} for doc, meta, dist in zip(docs, metas, dists) ] - hits = _hybrid_rank(hits, query) + hits = _hybrid_rank(hits, query, metric=metric) print(f"\n{'=' * 60}") print(f' Results for: "{query}"') @@ -365,7 +450,7 @@ def search(query: str, palace_path: str, wing: str = None, room: str = None, n_r print(f"{'=' * 60}\n") for i, hit in enumerate(hits, 1): - vec_sim = round(max(0.0, 1 - hit["distance"]), 3) + vec_sim = round(_distance_to_similarity(hit["distance"], metric), 3) bm25 = hit.get("bm25_score", 0.0) meta = hit["metadata"] source = Path(meta.get("source_file", "?")).name @@ -374,7 +459,7 @@ def search(query: str, palace_path: str, wing: str = None, room: str = None, n_r print(f" [{i}] {wing_name} / {room_name}") print(f" Source: {source}") - print(f" Match: cosine={vec_sim} bm25={bm25}") + print(f" Match: {metric}_sim={vec_sim} bm25={bm25}") print() # Print the verbatim text, indented for line in hit["text"].strip().split("\n"): @@ -786,11 +871,12 @@ def _finalize_candidate_hits( "hint": "Use candidate_strategy='vector' or select a backend that supports lexical search.", } - hits = _hybrid_rank(hits, query)[:n_results] + hits = _hybrid_rank(hits, query, metric=_metric_for_collection(drawers_col))[:n_results] for h in hits: 
h.pop("_sort_key", None) h.pop("_source_file_full", None) h.pop("_chunk_index", None) + h.pop("_parent_drawer_id", None) return hits, None @@ -978,6 +1064,7 @@ def search_memories( if open_error: return open_error + metric = _metric_for_collection(drawers_col) where = build_where_filter(wing, room) # Hybrid retrieval: always query drawers directly (the floor), then use @@ -1070,7 +1157,7 @@ def search_memories( "room": meta.get("room", "unknown"), "source_file": Path(source).name if source else "?", "created_at": meta.get("filed_at", "unknown"), - "similarity": round(max(0.0, 1 - effective_dist), 3), + "similarity": round(_distance_to_similarity(effective_dist, metric), 3), "distance": round(dist, 4), "effective_distance": round(effective_dist, 4), "closet_boost": round(boost, 3), @@ -1082,6 +1169,7 @@ def search_memories( "_sort_key": effective_dist, "_source_file_full": source, "_chunk_index": meta.get("chunk_index"), + "_parent_drawer_id": meta.get("parent_drawer_id"), } if closet_preview: entry["closet_preview"] = closet_preview @@ -1102,9 +1190,11 @@ def search_memories( full_source = h.get("_source_file_full") or "" if not full_source: continue + # Narrow by ``parent_drawer_id`` when present so unrelated + # chunked drawers sharing ``source_file`` do not stitch (#1580). 
try: source_drawers = drawers_col.get( - where={"source_file": full_source}, + where=_scoped_source_filter(full_source, h.get("_parent_drawer_id")), include=["documents", "metadatas"], ) except Exception: diff --git a/mempalace/version.py b/mempalace/version.py index e5b9b45d6..36716249c 100644 --- a/mempalace/version.py +++ b/mempalace/version.py @@ -1,3 +1,3 @@ """Single source of truth for the MemPalace package version.""" -__version__ = "3.4.0" +__version__ = "3.4.1" diff --git a/pyproject.toml b/pyproject.toml index 761dcae60..4d27ed500 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "mempalace" -version = "3.4.0" +version = "3.4.1" description = "Give your AI a memory — mine projects and conversations into a searchable palace. No API key required." readme = "README.md" requires-python = ">=3.9" @@ -169,6 +169,15 @@ markers = [ "slow: tests that take more than 30 seconds", "stress: destructive scale tests (100K+ drawers)", ] +filterwarnings = [ + # Many tests build raw-chromadb palaces directly (no recorded embedder + # identity), which correctly emits this on open. The behavior itself is + # asserted in tests/test_embedder_identity.py; ignore the fixture noise. + # Matched by message (not by class path) so pytest does not import the + # mempalace package at config time — importing it before coverage starts + # would drop module-level lines from the report. + "ignore:palace collection.*has no recorded embedder identity", +] [tool.coverage.run] source = ["mempalace"] diff --git a/rules/mempalace-recall.mdc b/rules/mempalace-recall.mdc new file mode 100644 index 000000000..00c061a3d --- /dev/null +++ b/rules/mempalace-recall.mdc @@ -0,0 +1,23 @@ +--- +description: When the user asks about past work, prior decisions, people, projects, or events that may be filed in MemPalace, call mempalace_search (or mempalace_kg_query for relational or time-bound facts) before answering from model memory. 
Return stored content verbatim; never guess when the palace might know. +alwaysApply: false +--- + +# MemPalace recall + +Before answering anything that may already be in the user's memory +palace — past work, prior decisions, a person, a project, or "what did +we do / decide / discuss last time?" — search the palace first: + +1. Call `mempalace_search` with a short keyword query. Use + `mempalace_kg_query` for relational or time-bound facts. +2. Quote the drawer's **verbatim** text. Never summarize or paraphrase + stored content. +3. If results are empty, say so — do not invent an answer. If the MCP + server is unavailable, surface the error; do not fall back to guessing. + +Skip recall for pure greenfield work with no memory relevance (renaming +a variable, fixing a typo). Recall is question-driven, not reflexive. + +Full protocol: `integrations/shared/recall-protocol.md`. Deeper guidance: +the `mempalace-recall` skill. diff --git a/skills/mempalace-recall/SKILL.md b/skills/mempalace-recall/SKILL.md new file mode 100644 index 000000000..ee8cbf458 --- /dev/null +++ b/skills/mempalace-recall/SKILL.md @@ -0,0 +1,112 @@ +--- +name: mempalace-recall +description: "Recall protocol for MemPalace — search the palace before answering about past work, people, projects, or prior decisions. Apply when the user asks what was decided, what happened before, who someone is, what was discussed last time, or anything that may already be filed in their memory palace; or when mempalace-recall is invoked. Complements the mempalace setup skill and requires the mempalace-mcp server." +--- + +# MemPalace Recall + +Search-before-answer protocol for MemPalace. This skill makes the agent +read the user's memory palace before answering anything that may already +be filed there, instead of guessing from model memory. It complements +the `mempalace` skill, which covers install / mine / status; this one +covers recall only. 
+ +## Step 0 — Verify MemPalace is available + +Before relying on recall, confirm MemPalace is installed and reachable: + +- Official release page: +- Check installed: `mempalace --version` +- Do not assume a version — the MCP tool set is the source of truth for + what this installed build supports. + +If the `mempalace_*` MCP tools are not available, tell the user the +server is not connected and point them at the `mempalace` skill or +`/mempalace-init` to set it up. Do not silently fall back to answering +from model memory. + +## Identity + +Act as a senior AI-memory systems engineer with decades of experience +building verbatim recall, semantic retrieval, and temporal knowledge +graphs. Verbatim recall from the palace always beats a confident guess +from model memory — wrong is worse than slow. + +## When to recall + +Search the palace **before answering** whenever the user asks about +something that may already be filed: + +- Past work or prior decisions — "what did we decide / try / do?" +- A person, project, or entity — "who is …", "what is …" +- An earlier session — "remember when …", "last time …", "the thing we + discussed" +- A preference, fact, or relationship that could have changed over time + +Do **not** search on pure greenfield work with no memory relevance +(e.g. "rename this variable", "fix this typo"). Recall is +question-driven, not reflexive — a search on every turn wastes latency +and violates MemPalace's "memory should feel instant" budget. + +## Protocol + +1. On wake-up, if a session-start hook injected `additional_context`, + honour its wing scoping. +2. Before responding about people / projects / past events / prior + decisions: call `mempalace_search` first. Use `mempalace_kg_query` + for relational or time-bound facts. +3. If unsure about a fact: say "let me check the palace" and query. +4. Return the drawer's **verbatim** text. Never summarize or paraphrase + stored content — quoting the exact words is the point of the system. +5. 
After a substantive session, record continuity with + `mempalace_diary_write` (skip if a background hook already saved). +6. When a fact changes: `mempalace_kg_invalidate` the old fact, then + `mempalace_kg_add` the new one. + +The full canonical protocol — shared verbatim with the Cursor recall +rule and the other integrations — lives in +[`integrations/shared/recall-protocol.md`](../../integrations/shared/recall-protocol.md). + +## Tool selection + +| You need | Tool | +|---|---| +| Find any memory by meaning | `mempalace_search` (start here) | +| Relational / time-bound facts about an entity | `mempalace_kg_query` | +| The chronological story of an entity | `mempalace_kg_timeline` | +| Recent session continuity | `mempalace_diary_read` | +| Which wings / rooms exist (scope unknown) | `mempalace_list_wings`, `mempalace_list_rooms` | +| Record this session | `mempalace_diary_write` | + +`mempalace_search` takes a short natural-language `query` (keywords or a +question — not a system prompt or pasted conversation) plus optional +`wing` / `room` filters and `limit` (default 5). + +## Unhappy paths + +- **Empty results.** Say the palace has nothing on this; do not invent an + answer. Offer to widen the search (drop the `wing` filter) or to file + the new information. +- **MCP error / server down.** Surface the error and suggest the user + run `mempalace status` or re-run `/mempalace-init`. Never fall back to + guessing. +- **Conflicting facts.** Trust the knowledge graph's time-valid answer; + invalidate-then-add rather than overwriting silently. + +## Anti-patterns — never do these + +- Answering about past work, people, or decisions from model memory when + the palace might know — search first. +- Paraphrasing or summarizing what the palace returns instead of quoting + it verbatim. +- Searching on every turn, including greenfield tasks with no memory + relevance. 
+- Pasting the whole conversation or a system prompt into the `query` + argument — keep queries short and keyword-driven. + +## Official References + +- MemPalace: +- MemPalace releases: +- Cursor Skills documentation: +- Agent Skills specification: diff --git a/skills/mempalace/SKILL.md b/skills/mempalace/SKILL.md new file mode 100644 index 000000000..b318af014 --- /dev/null +++ b/skills/mempalace/SKILL.md @@ -0,0 +1,47 @@ +--- +name: mempalace +description: MemPalace — mine projects and conversations into a searchable memory palace. Use when the user asks about MemPalace, memory palace, mining memories, searching memories, palace setup, wings, rooms, or drawers; or when they want to recall past work that may already be filed in their palace. +--- + +# MemPalace + +A searchable memory palace for AI — mine projects and conversations, then search them semantically. + +## Prerequisites + +Ensure `mempalace` is installed: + +```bash +mempalace --version +``` + +If not installed (uv recommended): + +```bash +uv tool install mempalace # or: pip install mempalace +``` + +## Usage + +MemPalace provides dynamic, version-correct instructions via the CLI. To get instructions for any operation: + +```bash +mempalace instructions <command> +``` + +Where `<command>` is one of: `help`, `init`, `mine`, `search`, `status`. + +Run the appropriate instructions command, then follow the returned instructions step by step. + +## Recalling past work + +This skill covers setup, mining, and status. For questions about past +work, prior decisions, or people that may already be filed in the +palace, prefer the **`mempalace-recall`** skill — it enforces +search-before-answer so the agent reads the palace instead of guessing. + +## Cursor-specific notes + +- The `mempalace-mcp` server is auto-registered by this plugin. Once installed, all 33 MemPalace MCP tools (`mempalace_search`, `mempalace_add_drawer`, `mempalace_diary_write`, `mempalace_check_duplicate`, `mempalace_diary_read`, etc.)
are available to the agent without any further configuration. +- For automatic background saving every N agent turns plus session-start memory recall, also install the Cursor hooks separately by running `hooks/cursor/install.sh --scope user` from a cloned MemPalace repo. See [`website/guide/cursor-hooks.md`](../../website/guide/cursor-hooks.md) for the full walkthrough. +- The recommended `agent_name` when calling `mempalace_diary_write` from a Cursor session is `cursor-ide` (matches the precedent of `claude-code` and `codex`). diff --git a/tests/test_antigravity_hooks_install.py b/tests/test_antigravity_hooks_install.py new file mode 100644 index 000000000..1735de768 --- /dev/null +++ b/tests/test_antigravity_hooks_install.py @@ -0,0 +1,324 @@ +"""End-to-end tests for the Antigravity install.sh. + +Covers: + +* `--dry-run` is fully side-effect free. +* A real install creates the expected file tree. +* `hooks.json` is rendered with absolute paths (no `__PLUGIN_DIR__` leak). +* Re-running the installer is byte-identical (cmp gate works). +* `--uninstall` removes the dir cleanly. +* `--uninstall` refuses to wipe a directory whose basename isn't `mempalace`. +* `--uninstall` refuses if the dir is missing a `mempalace` plugin.json. +* Relative `--install-dir` is absolutized into the rendered hooks.json. +""" + +from __future__ import annotations + +import filecmp +import json +import os +import subprocess +from pathlib import Path + +import pytest + +REPO_ROOT = Path(__file__).resolve().parents[1] +INSTALL_SH = REPO_ROOT / "hooks" / "antigravity" / "install.sh" + +# Skip on Windows — install.sh is bash and uses POSIX path semantics. 
+pytestmark = pytest.mark.skipif( + os.name == "nt", + reason="install.sh is a bash script; Windows users use a separate code path.", +) + +EXPECTED_FILES = ( + "plugin.json", + "mcp_config.json", + "README.md", + "hooks.json", + "skills/mempalace/SKILL.md", + "hooks/lib/common.sh", + "hooks/mempal_save_hook_antigravity.sh", + "hooks/mempal_wake_hook_antigravity.sh", +) + + +def _run_install( + install_dir: Path, + *args: str, + cwd: Path | None = None, + timeout: float = 30.0, +) -> subprocess.CompletedProcess: + """Invoke install.sh with the given install dir and args.""" + cmd = [ + "bash", + str(INSTALL_SH), + "--install-dir", + str(install_dir), + *args, + ] + return subprocess.run( + cmd, + capture_output=True, + text=True, + cwd=str(cwd) if cwd else None, + timeout=timeout, + ) + + +def _assert_install_layout(install_dir: Path) -> None: + for rel in EXPECTED_FILES: + path = install_dir / rel + assert path.is_file(), f"missing after install: {rel}" + assert not path.is_symlink(), f"unexpected symlink at: {rel}" + + +# ── --dry-run ────────────────────────────────────────────────────────── + + +def test_dry_run_is_side_effect_free(tmp_path: Path) -> None: + """--dry-run must not create the install dir or any of its files.""" + install_dir = tmp_path / "mempalace" + result = _run_install(install_dir, "--dry-run") + assert result.returncode == 0, result.stderr + assert not install_dir.exists(), ( + f"dry-run created install dir {install_dir} — that is a side effect." + ) + # Output should mention DRY-RUN at least once for visibility. 
+ assert "DRY-RUN" in result.stdout, result.stdout + + +# ── Real install ────────────────────────────────────────────────────── + + +def test_real_install_creates_full_layout(tmp_path: Path) -> None: + install_dir = tmp_path / "mempalace" + result = _run_install(install_dir) + assert result.returncode == 0, f"install failed:\n{result.stdout}\n{result.stderr}" + _assert_install_layout(install_dir) + + +def test_install_renders_absolute_paths_in_hooks_json(tmp_path: Path) -> None: + """`__PLUGIN_DIR__` must be substituted into hooks.json command paths.""" + install_dir = tmp_path / "mempalace" + result = _run_install(install_dir) + assert result.returncode == 0 + hooks = json.loads((install_dir / "hooks.json").read_text()) + # Every "command" string must be absolute and live under the + # install dir. + cmds = [] + for ns_payload in hooks.values(): + if not isinstance(ns_payload, dict): + continue + for entries in ns_payload.values(): + for entry in entries: + cmds.append(entry["command"]) + assert cmds, f"no command entries found in rendered hooks.json: {hooks}" + install_str = str(install_dir) + for cmd in cmds: + assert "__PLUGIN_DIR__" not in cmd, f"placeholder leaked into rendered hooks.json: {cmd!r}" + assert cmd.startswith("/"), f"command path is not absolute: {cmd!r}" + assert cmd.startswith(install_str + "/"), ( + f"command path {cmd!r} does not live under install dir {install_str!r}" + ) + + +def test_install_executable_bits_preserved(tmp_path: Path) -> None: + """Both hook scripts must end up executable on the install side.""" + install_dir = tmp_path / "mempalace" + result = _run_install(install_dir) + assert result.returncode == 0 + for rel in ( + "hooks/mempal_save_hook_antigravity.sh", + "hooks/mempal_wake_hook_antigravity.sh", + ): + path = install_dir / rel + assert os.access(path, os.X_OK), f"hook script not executable: {rel}" + + +# ── Idempotency ─────────────────────────────────────────────────────── + + +def 
test_install_is_byte_identical_on_re_run(tmp_path: Path) -> None: + """Re-running the installer should leave every file byte-identical. + + The `cmp`-gated copy and template render is what makes the + installer safe to run from CI and from `babysit`-style cron loops. + """ + install_dir = tmp_path / "mempalace" + first = _run_install(install_dir) + assert first.returncode == 0, first.stderr + # Snapshot every file's content + mtime + mode. + snap1: dict[str, tuple[bytes, float, int]] = {} + for rel in EXPECTED_FILES: + p = install_dir / rel + st = p.stat() + snap1[rel] = (p.read_bytes(), st.st_mtime, st.st_mode) + + # Re-run. + second = _run_install(install_dir) + assert second.returncode == 0, second.stderr + + # Every file's contents must be byte-identical. + for rel in EXPECTED_FILES: + p = install_dir / rel + body, _, mode = snap1[rel] + assert p.read_bytes() == body, f"{rel} differs after re-install" + # Mode must be preserved (we don't enforce mtime since the + # cmp gate explicitly avoids re-writing). + assert p.stat().st_mode == mode, f"{rel} mode changed after re-install" + + # Use filecmp.dircmp as a belt-and-suspenders check. + cmp = filecmp.dircmp(install_dir, install_dir) + assert not cmp.diff_files + + +def test_install_logs_no_writes_on_idempotent_re_run(tmp_path: Path) -> None: + """The second run must not log "wrote: ..." for any file.""" + install_dir = tmp_path / "mempalace" + first = _run_install(install_dir) + assert first.returncode == 0 + # First run writes everything. 
+ assert "wrote:" in first.stdout + second = _run_install(install_dir) + assert second.returncode == 0 + assert "wrote:" not in second.stdout, ( + f"second install should be a no-op but wrote files:\n{second.stdout}" + ) + + +# ── Uninstall ───────────────────────────────────────────────────────── + + +def test_uninstall_removes_mempalace_install(tmp_path: Path) -> None: + install_dir = tmp_path / "mempalace" + install = _run_install(install_dir) + assert install.returncode == 0 + uninstall = _run_install(install_dir, "--uninstall") + assert uninstall.returncode == 0, uninstall.stderr + assert not install_dir.exists() + + +def test_uninstall_refuses_basename_mismatch(tmp_path: Path) -> None: + """Refuses to remove a directory whose basename isn't 'mempalace'. + + Honours the basename-match safety guard caught in the cursor PR + review — prevents an accidental wipe of a sibling like + 'mempalace-foo' or, in the worst case, the user's home directory. + """ + bad_dir = tmp_path / "totally-not-mempalace" + bad_dir.mkdir() + # Even though the dir has a plugin.json with name=mempalace, the + # basename mismatch must still refuse. 
+ (bad_dir / "plugin.json").write_text(json.dumps({"name": "mempalace"}), encoding="utf-8") + sentinel = bad_dir / "do-not-delete.txt" + sentinel.write_text("preserve me", encoding="utf-8") + + result = _run_install(bad_dir, "--uninstall") + assert result.returncode != 0, ( + f"uninstall should have refused but exited 0:\n{result.stdout}\n{result.stderr}" + ) + assert bad_dir.is_dir(), f"bad uninstall removed {bad_dir}" + assert sentinel.is_file(), "uninstall removed unrelated files inside bad dir" + + +def test_uninstall_refuses_when_plugin_json_missing(tmp_path: Path) -> None: + """Refuses when the dir is missing plugin.json (not actually our plugin).""" + install_dir = tmp_path / "mempalace" + install_dir.mkdir() + sentinel = install_dir / "stranger.txt" + sentinel.write_text("preserve me", encoding="utf-8") + result = _run_install(install_dir, "--uninstall") + assert result.returncode != 0 + assert install_dir.is_dir() + assert sentinel.is_file() + + +def test_uninstall_refuses_when_plugin_json_wrong_name(tmp_path: Path) -> None: + """Refuses when plugin.json names a different plugin.""" + install_dir = tmp_path / "mempalace" + install_dir.mkdir() + (install_dir / "plugin.json").write_text( + json.dumps({"name": "some-other-plugin"}), encoding="utf-8" + ) + sentinel = install_dir / "preserved.txt" + sentinel.write_text("safe", encoding="utf-8") + result = _run_install(install_dir, "--uninstall") + assert result.returncode != 0 + assert install_dir.is_dir() + assert sentinel.is_file() + + +def test_uninstall_no_op_when_target_missing(tmp_path: Path) -> None: + """Uninstalling a non-existent dir is a graceful no-op.""" + install_dir = tmp_path / "mempalace" + result = _run_install(install_dir, "--uninstall") + assert result.returncode == 0, result.stderr + + +# ── Relative path absolutization ────────────────────────────────────── + + +def test_relative_install_dir_is_absolutized(tmp_path: Path) -> None: + """A relative --install-dir must be absolutized in the 
rendered hooks.json. + + Catches the cursor PR review issue where a relative path baked + in verbatim left Antigravity unable to resolve hook commands. + """ + work = tmp_path / "work" + work.mkdir() + # Relative path resolved against $PWD at invocation time. + rel = "build/agy-out/mempalace" + result = _run_install(Path(rel), cwd=work) + assert result.returncode == 0, result.stderr + abs_install = work / rel + assert abs_install.is_dir(), f"installer did not create {abs_install} from relative path {rel}" + hooks = json.loads((abs_install / "hooks.json").read_text()) + cmds = [] + for ns_payload in hooks.values(): + if not isinstance(ns_payload, dict): + continue + for entries in ns_payload.values(): + for entry in entries: + cmds.append(entry["command"]) + for cmd in cmds: + assert cmd.startswith("/"), f"relative install dir leaked into rendered hooks.json: {cmd!r}" + assert "build/agy-out/mempalace" in cmd, ( + f"command path lost the relative-segment context: {cmd!r}" + ) + + +# ── Misc ────────────────────────────────────────────────────────────── + + +def test_install_help_does_not_write(tmp_path: Path) -> None: + """`--help` should print usage and exit 0 without touching the dir.""" + install_dir = tmp_path / "mempalace" + result = subprocess.run( + ["bash", str(INSTALL_SH), "--help"], + capture_output=True, + text=True, + timeout=10, + ) + assert result.returncode == 0 + assert "Usage:" in result.stdout or "Usage:" in result.stderr + assert not install_dir.exists() + + +def test_install_unknown_arg_exits_non_zero(tmp_path: Path) -> None: + """Unknown args must fail loudly rather than silently ignoring.""" + install_dir = tmp_path / "mempalace" + result = subprocess.run( + [ + "bash", + str(INSTALL_SH), + "--install-dir", + str(install_dir), + "--this-flag-does-not-exist", + ], + capture_output=True, + text=True, + timeout=10, + ) + assert result.returncode != 0 + assert not install_dir.exists() diff --git a/tests/test_antigravity_hooks_shell.py 
b/tests/test_antigravity_hooks_shell.py new file mode 100644 index 000000000..781078da6 --- /dev/null +++ b/tests/test_antigravity_hooks_shell.py @@ -0,0 +1,1295 @@ +"""End-to-end shell tests for the Antigravity hook scripts. + +Invokes the bash scripts directly via subprocess with synthetic stdin +JSON and asserts on their stdout / exit code / state-dir side effects. + +The two scripts under test are: + +* `hooks/antigravity/mempal_save_hook_antigravity.sh` — Stop event +* `hooks/antigravity/mempal_wake_hook_antigravity.sh` — PreInvocation event + +Test isolation: + +* Each test runs in its own temp dir. +* `MEMPAL_STATE_DIR` is overridden to point at the temp dir, so no + test ever touches the real `~/.mempalace/hook_state/`. +* `HOME` is overridden to a temp dir as well so the kill-switch + palace-existence check sees a hermetic state. +""" + +from __future__ import annotations + +import json +import os +import subprocess +import sys +from pathlib import Path + +import pytest + +REPO_ROOT = Path(__file__).resolve().parents[1] +HOOKS_DIR = REPO_ROOT / "hooks" / "antigravity" +SAVE_HOOK = HOOKS_DIR / "mempal_save_hook_antigravity.sh" +WAKE_HOOK = HOOKS_DIR / "mempal_wake_hook_antigravity.sh" +COMMON_LIB = HOOKS_DIR / "lib" / "common.sh" + +# Skip the entire module on Windows — bash 3.2+ is required. +pytestmark = pytest.mark.skipif( + os.name == "nt", + reason="Antigravity shell hooks require bash; Windows uses a separate code path.", +) + + +def _run_hook( + script: Path, + stdin_json: dict | str, + state_dir: Path, + home: Path, + extra_env: dict[str, str] | None = None, + timeout: float = 10.0, +) -> subprocess.CompletedProcess: + """Run a hook script with isolated env and synthetic stdin.""" + if isinstance(stdin_json, dict): + stdin = json.dumps(stdin_json) + else: + stdin = stdin_json + env = os.environ.copy() + # Hermetic env: HOME and state dir point at the test temp. 
+ home.mkdir(parents=True, exist_ok=True) + state_dir.mkdir(parents=True, exist_ok=True) + env["HOME"] = str(home) + env["MEMPAL_STATE_DIR"] = str(state_dir) + # Drop any leftover kill-switch envs from the user's environment so + # the test exercises the gate it intends to. + for k in ("MEMPAL_DISABLE_HOOK", "MEMPALACE_HOOKS_AUTO_SAVE", "MEMPAL_SAVE_INTERVAL"): + env.pop(k, None) + if extra_env: + env.update(extra_env) + return subprocess.run( + ["bash", str(script)], + input=stdin, + capture_output=True, + text=True, + env=env, + timeout=timeout, + ) + + +def _ensure_palace(home: Path) -> None: + """Create $HOME/.mempalace/ so the palace-nuke kill switch passes.""" + (home / ".mempalace").mkdir(parents=True, exist_ok=True) + + +def _poll_log_contains(log_path: Path, needle: str, timeout: float = 5.0) -> bool: + """Poll a log file until it contains ``needle`` or the timeout elapses. + + The save hook now writes mine/probe outcome lines from a detached + background subshell, so the foreground returns before those lines + are flushed. Callers that assert on background-written log lines + must poll rather than read once. 
+ """ + import time + + deadline = time.monotonic() + timeout + while time.monotonic() < deadline: + if log_path.is_file(): + body = log_path.read_text(errors="replace") + if needle in body: + return True + time.sleep(0.05) + return False + + +def _stop_payload(**overrides) -> dict: + base = { + "executionNum": 1, + "terminationReason": "model_stop", + "error": "", + "fullyIdle": True, + "conversationId": "test-conv-001", + "workspacePaths": ["/tmp/test-workspace"], + "transcriptPath": "/tmp/test-transcript.jsonl", + "artifactDirectoryPath": "/tmp/test-artifacts/", + } + base.update(overrides) + return base + + +def _wake_payload(**overrides) -> dict: + base = { + "invocationNum": 1, + "initialNumSteps": 0, + "conversationId": "test-conv-001", + "workspacePaths": ["/tmp/test-workspace"], + "transcriptPath": "/tmp/test-transcript.jsonl", + "artifactDirectoryPath": "/tmp/test-artifacts/", + } + base.update(overrides) + return base + + +# ── Syntax (bash -n) ────────────────────────────────────────────────── + + +@pytest.mark.parametrize("script", [SAVE_HOOK, WAKE_HOOK, COMMON_LIB], ids=lambda p: p.name) +def test_bash_n_clean(script: Path) -> None: + """All shell files parse cleanly under bash 3.2+.""" + result = subprocess.run( + ["bash", "-n", str(script)], + capture_output=True, + text=True, + timeout=10, + ) + assert result.returncode == 0, f"bash -n {script.name} failed:\n{result.stderr}" + + +# ── Save hook ───────────────────────────────────────────────────────── + + +def test_save_hook_emits_empty_object_on_kill_switch_env(tmp_path: Path) -> None: + """MEMPAL_DISABLE_HOOK=1 should silently emit `{}` and exit 0.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_DISABLE_HOOK": "1"}, + ) + assert result.returncode == 0, result.stderr + assert result.stdout.strip() == "{}", result.stdout + + +def 
test_save_hook_emits_empty_object_on_auto_save_false(tmp_path: Path) -> None: + """MEMPALACE_HOOKS_AUTO_SAVE=false short-circuits.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPALACE_HOOKS_AUTO_SAVE": "false"}, + ) + assert result.returncode == 0, result.stderr + assert result.stdout.strip() == "{}" + + +def test_save_hook_emits_empty_object_when_palace_dir_missing(tmp_path: Path) -> None: + """Removing $HOME/.mempalace acts as the strongest kill switch.""" + state = tmp_path / "state" + home = tmp_path / "home" + home.mkdir() + # Deliberately do NOT create ~/.mempalace + result = _run_hook(SAVE_HOOK, _stop_payload(), state_dir=state, home=home) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + + +def test_save_hook_emits_empty_object_when_config_disables(tmp_path: Path) -> None: + """~/.mempalace/config.json `hooks.auto_save: false` short-circuits.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + (home / ".mempalace" / "config.json").write_text( + json.dumps({"hooks": {"auto_save": False}}), + encoding="utf-8", + ) + result = _run_hook(SAVE_HOOK, _stop_payload(), state_dir=state, home=home) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + + +def test_save_hook_emits_empty_object_when_fully_idle_false(tmp_path: Path) -> None: + """fullyIdle=False defers the save; nothing should write to state.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(fullyIdle=False), + state_dir=state, + home=home, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + # The counter file must NOT exist — we deferred before incrementing. 
+ counter = state / "antigravity_save_count_test-conv-001" + assert not counter.exists(), f"counter advanced despite fullyIdle=false: {counter}" + + +def test_save_hook_emits_empty_object_on_error_termination(tmp_path: Path) -> None: + """terminationReason=error skips the save.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(terminationReason="error", error="model crashed"), + state_dir=state, + home=home, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + + +def test_save_hook_emits_empty_object_on_malformed_stdin(tmp_path: Path) -> None: + """Malformed JSON must not crash the hook — fail-open behaviour.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + "{not even close to json{", + state_dir=state, + home=home, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + + +def test_save_hook_emits_empty_object_on_empty_stdin(tmp_path: Path) -> None: + """Empty stdin must not crash the hook.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook(SAVE_HOOK, "", state_dir=state, home=home) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + + +def test_save_hook_never_emits_decision_continue(tmp_path: Path) -> None: + """The save hook must NEVER emit `{"decision":"continue"}`. + + That output would force Antigravity into an infinite agent + re-execution loop. Hard rule, separately tested. + """ + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook(SAVE_HOOK, _stop_payload(), state_dir=state, home=home) + assert result.returncode == 0 + # Parse the output so we don't false-match on substring of + # "Continue thread of work" or similar prose. 
+ try: + payload = json.loads(result.stdout) + except json.JSONDecodeError: + pytest.fail(f"save hook emitted non-JSON: {result.stdout!r}") + assert payload.get("decision") != "continue", ( + f"save hook emitted decision=continue, which would force an infinite " + f"agent loop. payload={payload!r}" + ) + + +def test_save_hook_counter_increments_per_fire(tmp_path: Path) -> None: + """Counter advances on each Stop fire.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + counter_path = state / "antigravity_save_count_test-conv-001" + + for expected in (1, 2, 3): + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "999"}, # high interval -> never trigger save + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + assert counter_path.is_file() + assert counter_path.read_text().strip() == str(expected) + + +def test_save_hook_floors_zero_save_interval_to_avoid_div_by_zero(tmp_path: Path) -> None: + """MEMPAL_SAVE_INTERVAL=0 must be floored, never cause `count % 0`.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "0"}, + ) + # Must NOT crash with bash arithmetic divide-by-zero. 
+ assert result.returncode == 0, ( + f"save hook crashed on MEMPAL_SAVE_INTERVAL=0:\n" + f"stdout={result.stdout!r}\nstderr={result.stderr!r}" + ) + assert result.stdout.strip() == "{}" + + +def test_save_hook_floors_negative_save_interval(tmp_path: Path) -> None: + """Negative MEMPAL_SAVE_INTERVAL falls back to default (no crash).""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "-5"}, + ) + assert result.returncode == 0, result.stderr + assert result.stdout.strip() == "{}" + + +@pytest.mark.parametrize("interval", ["08", "09", "008", "0099"]) +def test_save_hook_handles_leading_zero_save_interval(tmp_path: Path, interval: str) -> None: + """MEMPAL_SAVE_INTERVAL with leading zeros must NOT trigger bash octal arithmetic. + + bash arithmetic ($((COUNT % INTERVAL))) treats a numeric token with + a leading `0` as octal. "08" and "09" contain digits that are + invalid in octal, so the modulo step would crash with:: + + bash: 08: value too great for base (error token is "08") + + mempal_save_interval() in lib/common.sh strips leading zeros before + returning. Regression test for gemini-code-assist review on PR + #1633. + """ + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": interval}, + ) + assert result.returncode == 0, ( + f"save hook crashed on MEMPAL_SAVE_INTERVAL={interval!r}:\n" + f"stdout={result.stdout!r}\nstderr={result.stderr!r}" + ) + assert result.stdout.strip() == "{}" + # Stderr must not contain the octal "value too great for base" error.
+ assert "value too great for base" not in result.stderr, ( + f"bash octal parse error leaked through for MEMPAL_SAVE_INTERVAL={interval!r}: " + f"{result.stderr!r}" + ) + + +def test_common_sh_parser_omits_sentinel_on_malformed_json(tmp_path: Path) -> None: + """`mempal_parse_stdin` must NOT print the success sentinel on parse failure. + + The bash callers detect parse failure by checking whether line 1 + of the parser output is exactly ``__MEMPAL_PARSE_OK__``. If + json.load is wrapped in try/except (and falls back to data={}), + the sentinel still gets printed and the bash defense-in-depth + branch never engages. Regression test for gemini-code-assist + review on PR #1633. + """ + import shlex + + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + state.mkdir(parents=True, exist_ok=True) + # Source the lib and call mempal_parse_stdin with malformed JSON. + # The lib path is shell-quoted so a checkout path containing spaces + # still sources correctly. + cmd = f". {shlex.quote(str(COMMON_LIB))}; mempal_parse_stdin '{{not even close to json{{'" + result = subprocess.run( + ["bash", "-c", cmd], + capture_output=True, + text=True, + env={ + **os.environ, + "HOME": str(home), + "MEMPAL_STATE_DIR": str(state), + }, + timeout=10, + ) + # The function itself shouldn't error (the inner Python crashes, + # but the subshell catches it). Stdout must NOT contain the sentinel. + assert "__MEMPAL_PARSE_OK__" not in result.stdout, ( + f"parser printed success sentinel on bad JSON, defeating " + f"bash-side error detection: stdout={result.stdout!r}" + ) + + +def test_save_hook_missing_mempalace_python_module_does_not_crash(tmp_path: Path) -> None: + """When the resolved Python interpreter cannot run `-m mempalace`, fail open. + + The save hook now invokes mempalace via ``"$MEMPAL_PYTHON_BIN" + -m mempalace mine ...`` rather than the bare ``mempalace`` console + script. If MEMPAL_PYTHON points at an interpreter that doesn't + have the package installed, the hook must log the failure and + still emit ``{}`` — never crash, never block the user.
+ """ + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + transcript = tmp_path / "transcript.jsonl" + transcript.write_text("{}\n", encoding="utf-8") + # Point MEMPAL_PYTHON at a stub interpreter that has no mempalace + # package installed — `python -m mempalace --version` will fail. + stub = tmp_path / "stub_python" + stub.write_text( + "#!/bin/sh\n" + "# Minimal python stub: rejects every -m invocation so the\n" + '# hook hits the "module unrunnable" branch.\n' + 'case "$*" in\n' + ' *"-m mempalace"*) exit 1 ;;\n' + ' *) exec /usr/bin/env python3 "$@" ;;\n' + "esac\n", + encoding="utf-8", + ) + stub.chmod(0o755) + result = _run_hook( + SAVE_HOOK, + _stop_payload(transcriptPath=str(transcript)), + state_dir=state, + home=home, + extra_env={ + "MEMPAL_PYTHON": str(stub), + "MEMPAL_SAVE_INTERVAL": "1", + }, + ) + assert result.returncode == 0, result.stderr + assert result.stdout.strip() == "{}" + # The "is not runnable" line is written by the detached background + # subshell now (the probe moved off the foreground), so poll for it. + log = state / "antigravity_hook.log" + assert _poll_log_contains(log, "is not runnable via"), ( + f"expected the 'mempalace not runnable via $MEMPAL_PYTHON_BIN' log line; " + f"got:\n{log.read_text(errors='replace') if log.is_file() else ''}" + ) + + +def test_save_hook_uses_python_module_invocation(tmp_path: Path) -> None: + """The save hook source MUST invoke mempalace via `-m mempalace`. + + Locks in the gemini-code-assist fix so a future edit doesn't + silently regress to the bare ``mempalace`` console-script call, + which fails when the user's PATH doesn't expose the venv bin. + """ + body = SAVE_HOOK.read_text(encoding="utf-8") + assert '"$MEMPAL_PYTHON_BIN" -m mempalace' in body, ( + "save hook should invoke mempalace via $MEMPAL_PYTHON_BIN -m mempalace, " + "not the bare `mempalace` console script. The bare invocation breaks " + "when the venv's bin/ isn't on the hook's PATH." 
+ ) + # Also verify the bare invocation is gone (defense-in-depth). + # Allow `mempalace` to appear in comments / strings, but not as + # the start of a `nohup ... mempalace mine` command. + assert "nohup mempalace " not in body, ( + "bare `nohup mempalace ...` invocation found; should be " + '`nohup "$MEMPAL_PYTHON_BIN" -m mempalace ...`' + ) + + +def test_save_hook_backgrounds_probe_mine_and_cleanup_in_one_subshell(tmp_path: Path) -> None: + """Probe + mine + marker cleanup must live in ONE detached subshell. + + igorls' PR #1633 review: the foreground `mempalace --version` + probe pays the full chromadb/onnx cold-start import (the `mine` + subparser imports `mempalace.miner` before argparse handles + `--version`), which blows the save budget. Moving the probe into + the background subshell — together with the mine and the marker + cleanup — keeps the foreground instant. + + Folding cleanup into the same subshell also retires the previous + `kill -0 $MINE_PID` watcher: the `rm -f "$PENDING_FILE"` runs + sequentially after the mine in the shell that owns it, so there is + no sibling-PID `wait`/`kill -0` hazard anymore. This test locks in + that structure. + """ + body = SAVE_HOOK.read_text(encoding="utf-8") + # The buggy sibling `wait` must stay gone. + assert 'wait "$MINE_PID"' not in body, ( + 'buggy `wait "$MINE_PID"` reappeared. POSIX wait cannot watch a sibling pid.' + ) + # The kill -0 polling watcher is no longer needed and should be gone. + assert "kill -0" not in body, ( + "the `kill -0` watcher should have been retired when probe+mine+cleanup " + "were folded into a single background subshell." + ) + # Probe still happens (just inside the subshell now) and uses -m. + assert '"$MEMPAL_PYTHON_BIN" -m mempalace --version' in body, ( + "the runnability probe must still run via $MEMPAL_PYTHON_BIN -m mempalace" + ) + # The whole block is backgrounded: a subshell close followed by the + # detach redirection + `&`. 
+ assert ") >/dev/null 2>&1 < /dev/null &" in body, ( + "probe+mine+cleanup must be wrapped in a detached `( ... ) ... &` subshell" + ) + # MINE_PID capture is no longer used (no separate watcher). + assert "MINE_PID=$!" not in body, ( + "MINE_PID capture is vestigial now that there is no separate watcher" + ) + + +def test_save_hook_returns_before_slow_version_probe(tmp_path: Path) -> None: + """The hook must return immediately even if `--version` is slow. + + Proves the probe was moved off the foreground. We point + MEMPAL_PYTHON at a stub that sleeps for 3s on any `-m mempalace` + invocation. If the probe still ran in the foreground the hook + would block ~3s; with the probe backgrounded it returns in well + under a second. + """ + import time + + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + transcript = tmp_path / "transcript.jsonl" + transcript.write_text("{}\n", encoding="utf-8") + stub = tmp_path / "slow_python" + stub.write_text( + "#!/bin/sh\n" + "# Stub python: any -m mempalace call sleeps 3s, simulating a\n" + "# heavy cold-start import. Everything else proxies to python3.\n" + 'case "$*" in\n' + ' *"-m mempalace"*) sleep 3; exit 0 ;;\n' + ' *) exec /usr/bin/env python3 "$@" ;;\n' + "esac\n", + encoding="utf-8", + ) + stub.chmod(0o755) + start = time.monotonic() + result = _run_hook( + SAVE_HOOK, + _stop_payload(transcriptPath=str(transcript)), + state_dir=state, + home=home, + extra_env={"MEMPAL_PYTHON": str(stub), "MEMPAL_SAVE_INTERVAL": "1"}, + timeout=10.0, + ) + elapsed = time.monotonic() - start + assert result.returncode == 0, result.stderr + assert result.stdout.strip() == "{}" + assert elapsed < 2.0, ( + f"save hook took {elapsed:.2f}s with a 3s --version probe; the probe " + f"is still running in the foreground instead of the background subshell." + ) + + +def test_save_hook_counter_write_is_atomic(tmp_path: Path) -> None: + """Counter is written via mempal_write_counter_atomic (temp + mv). 
+ + igorls' PR #1633 review: the counter was written with a plain + `printf > file` (truncate-then-write) while the comment claimed it + was atomic. We verify (a) the source uses the atomic helper, not a + bare redirect into the counter file, (b) the counter still + increments correctly across fires, and (c) no temp file is left + behind. + """ + body = SAVE_HOOK.read_text(encoding="utf-8") + assert 'mempal_write_counter_atomic "$COUNTER_FILE" "$COUNT"' in body, ( + "save hook must write the counter via mempal_write_counter_atomic" + ) + assert 'printf \'%s\' "$COUNT" > "$COUNTER_FILE"' not in body, ( + "non-atomic `printf > $COUNTER_FILE` should have been replaced" + ) + + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + counter_path = state / "antigravity_save_count_test-conv-001" + for expected in (1, 2, 3): + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "999"}, + ) + assert result.returncode == 0 + assert counter_path.read_text().strip() == str(expected) + # No leftover temp files from the atomic write. + temps = [ + p.name + for p in state.iterdir() + if ".XXXXXX" in p.name or p.name.startswith("antigravity_save_count_test-conv-001.") + ] + assert not temps, f"atomic counter write left temp files behind: {temps}" + + +def test_save_hook_helper_uses_temp_and_mv(tmp_path: Path) -> None: + """mempal_write_counter_atomic must use a temp file + mv (not a bare redirect).""" + body = COMMON_LIB.read_text(encoding="utf-8") + assert "mempal_write_counter_atomic()" in body + assert "mktemp" in body, "atomic counter helper should create a temp file via mktemp" + assert "mv -f" in body, "atomic counter helper should promote the temp with mv -f" + + +def test_wake_hook_uses_sys_executable_module_invocation(tmp_path: Path) -> None: + """The wake hook's inner Python must invoke mempalace via sys.executable -m. 
+ + Same rationale as the save hook fix: the bare ``mempalace`` + console script fails when the venv's bin/ isn't on the hook's + PATH. Using ``[sys.executable, '-m', 'mempalace', ...]`` binds + the call to the same interpreter that resolved MEMPAL_PYTHON. + """ + body = WAKE_HOOK.read_text(encoding="utf-8") + assert "sys.executable, '-m', 'mempalace'" in body, ( + "wake hook should invoke mempalace via [sys.executable, '-m', 'mempalace', ...], " + "not ['mempalace', ...]. The bare invocation breaks when the venv's bin/ " + "isn't on the hook's PATH." + ) + assert "['mempalace', 'wake-up'" not in body, ( + "bare ['mempalace', 'wake-up', ...] invocation found in wake hook" + ) + + +def test_save_hook_rejects_traversal_in_transcript_path(tmp_path: Path) -> None: + """A `..` segment in transcriptPath must be rejected.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + # Set interval to 1 so the modulo gate would normally fire on the + # first Stop, then prove the path validator stops the spawn. 
+ result = _run_hook( + SAVE_HOOK, + _stop_payload(transcriptPath="/legit/../etc/passwd"), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "1"}, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + log = state / "antigravity_hook.log" + assert log.is_file() + log_body = log.read_text() + assert "invalid transcriptPath rejected" in log_body or "does not exist" in log_body + + +def test_save_hook_rejects_non_jsonl_transcript_path(tmp_path: Path) -> None: + """A transcriptPath ending in something other than .json[l] is rejected.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(transcriptPath="/tmp/transcript.txt"), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "1"}, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + + +def test_save_hook_state_files_are_namespaced_antigravity(tmp_path: Path) -> None: + """Every state file the save hook touches starts with `antigravity_`. + + The shared state directory is also home to Claude Code, Codex, and + (in the future) Cursor hook state. Namespacing prevents collisions. 
+ """ + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "999"}, + ) + assert result.returncode == 0 + leaks = [p.name for p in state.iterdir() if not p.name.startswith("antigravity_")] + assert not leaks, f"save hook created non-antigravity-namespaced state files: {leaks}" + + +def test_save_hook_pending_marker_blocks_concurrent_save(tmp_path: Path) -> None: + """A fresh pending marker should cause the next save to skip.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + pending = state / "antigravity_pending_test-conv-001" + state.mkdir(parents=True, exist_ok=True) + pending.touch() + # Force the modulo gate to fire by setting interval=1. + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "1"}, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + log_body = (state / "antigravity_hook.log").read_text(errors="replace") + assert "pending save still in flight" in log_body + + +# ── Wake hook ───────────────────────────────────────────────────────── + + +def test_wake_hook_emits_empty_object_on_kill_switch(tmp_path: Path) -> None: + """MEMPAL_DISABLE_HOOK=1 silences the wake hook.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + WAKE_HOOK, + _wake_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_DISABLE_HOOK": "1"}, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + + +@pytest.mark.parametrize("invocation", [0, 2, 5, 100]) +def test_wake_hook_emits_empty_when_invocation_num_not_one(tmp_path: Path, invocation: int) -> None: + """Only invocationNum == 1 triggers injection.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + WAKE_HOOK, + 
_wake_payload(invocationNum=invocation), + state_dir=state, + home=home, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}", ( + f"wake hook injected at invocationNum={invocation}: {result.stdout!r}" + ) + + +def test_wake_hook_loop_guard_prevents_repeat_injection(tmp_path: Path) -> None: + """A second fire for the same conversationId must skip via the mkdir guard.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + # Pre-create the woke marker dir. + woke = state / "antigravity_woke_test-conv-001" + state.mkdir(parents=True, exist_ok=True) + woke.mkdir() + result = _run_hook( + WAKE_HOOK, + _wake_payload(), + state_dir=state, + home=home, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + log_body = (state / "antigravity_hook.log").read_text(errors="replace") + assert "already woke this conversation" in log_body + + +def test_wake_hook_never_emits_decision_field(tmp_path: Path) -> None: + """The wake hook must never emit a `decision` key (that field is Stop-only).""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + WAKE_HOOK, + _wake_payload(), + state_dir=state, + home=home, + ) + assert result.returncode == 0 + try: + payload = json.loads(result.stdout) + except json.JSONDecodeError: + pytest.fail(f"wake hook emitted non-JSON: {result.stdout!r}") + assert "decision" not in payload, f"wake hook emitted a decision field: {payload!r}" + + +def test_wake_hook_emits_empty_when_mempalace_missing(tmp_path: Path) -> None: + """When mempalace can't be run, the wake hook degrades to `{}`. + + Antigravity's hook framework should never see a stack trace from + a missing CLI — emit `{}` and let the conversation start without + injection. + + Note: the wake hook binds the wake-up call to + ``[sys.executable, '-m', 'mempalace', ...]`` (the interpreter that + resolved MEMPAL_PYTHON), NOT a bare ``mempalace`` on PATH. 
So we + force MEMPAL_PYTHON="" (resolution falls back to ``python3`` on the + minimal PATH below) and strip PATH to a system python3 that has no + mempalace package installed. The inner ``python3 -m mempalace`` + then fails and the hook must emit `{}`. We also override + MEMPAL_PYTHON explicitly so a value exported in the developer's + shell can't leak in and point at an interpreter that *does* have + mempalace. + """ + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + # Strip PATH down to just the bash + python essentials, dropping + # any directory that might have a `mempalace` binary, and clear + # MEMPAL_PYTHON so resolution falls back to this minimal PATH. + minimal_path = "/usr/bin:/bin" + result = _run_hook( + WAKE_HOOK, + _wake_payload(), + state_dir=state, + home=home, + extra_env={"PATH": minimal_path, "MEMPAL_PYTHON": ""}, + ) + assert result.returncode == 0, ( + f"wake hook crashed when mempalace is missing:\n" + f"stdout={result.stdout!r}\nstderr={result.stderr!r}" + ) + assert result.stdout.strip() == "{}" + + +def test_wake_hook_state_files_are_namespaced_antigravity(tmp_path: Path) -> None: + """Wake hook state files are also namespaced.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook(WAKE_HOOK, _wake_payload(), state_dir=state, home=home) + assert result.returncode == 0 + leaks = [p.name for p in state.iterdir() if not p.name.startswith("antigravity_")] + assert not leaks, leaks + + +# ── Wing inference ──────────────────────────────────────────────────── + + +def test_wing_inference_picks_first_workspace_path(tmp_path: Path) -> None: + """Wing is derived from workspacePaths[0]'s leaf directory. + + Antigravity sends an array; the first element is canonical. + """ + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + # Set interval=1 and a real existing transcript so the save path + # logs the inferred wing. 
+ transcript = tmp_path / "transcript.jsonl" + transcript.write_text("{}\n", encoding="utf-8") + workspace = tmp_path / "myproj-with-dashes" + workspace.mkdir() + result = _run_hook( + SAVE_HOOK, + _stop_payload( + transcriptPath=str(transcript), + workspacePaths=[str(workspace), "/some/other/workspace"], + ), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "1"}, + ) + assert result.returncode == 0 + log_body = (state / "antigravity_hook.log").read_text(errors="replace") + # Hyphens become underscores; lowercase. + assert "wing=wing_myproj_with_dashes" in log_body, log_body + + +def test_wing_inference_defaults_to_sessions_when_workspace_empty(tmp_path: Path) -> None: + """An empty workspacePaths array yields wing_sessions.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + transcript = tmp_path / "transcript.jsonl" + transcript.write_text("{}\n", encoding="utf-8") + result = _run_hook( + SAVE_HOOK, + _stop_payload( + transcriptPath=str(transcript), + workspacePaths=[], + ), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "1"}, + ) + assert result.returncode == 0 + log_body = (state / "antigravity_hook.log").read_text(errors="replace") + assert "wing=wing_sessions" in log_body + + +# ── Performance budget (soft) ───────────────────────────────────────── + + +@pytest.mark.skipif(sys.platform == "win32", reason="not relevant on Windows code path") +def test_save_hook_returns_quickly_under_kill_switch(tmp_path: Path) -> None: + """Under the kill switch the hook should return well under 1s. + + The integration brief budgets hooks at <500ms. We allow a generous + 1500ms here because CI machines can be slow on cold-cache subprocess + spawn. The point of the test is to fail loudly if a future edit + introduces a synchronous mempalace import or DB connection. 
+ """ + import time + + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + start = time.monotonic() + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_DISABLE_HOOK": "1"}, + ) + elapsed = time.monotonic() - start + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + assert elapsed < 1.5, ( + f"save hook under kill switch took {elapsed:.3f}s; expected < 1.5s. " + "A regression here usually means a synchronous import / DB connection " + "is happening before the kill-switch short-circuit." + ) + + +# ── State-file GC (PR #1633 hygiene) ────────────────────────────────── + + +def _backdate(path: Path, days: int) -> None: + """Set a path's atime/mtime ``days`` days into the past.""" + import time + + past = time.time() - days * 86400 + os.utime(path, (past, past)) + + +def test_gc_removes_stale_state_files(tmp_path: Path) -> None: + """Stale per-conversation state (older than the TTL) is swept. + + igorls' PR #1633 review flagged unbounded growth of + antigravity_save_count_*, antigravity_pending_*, and + antigravity_woke_* artifacts. mempal_gc_stale_state removes those + older than MEMPAL_STATE_TTL_DAYS (default 30). + """ + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + state.mkdir(parents=True, exist_ok=True) + + # Stale artifacts (40 days old) — all three shapes. + stale_count = state / "antigravity_save_count_old-conv" + stale_pending = state / "antigravity_pending_old-conv" + stale_woke = state / "antigravity_woke_old-conv" + stale_count.write_text("7", encoding="utf-8") + stale_pending.write_text("", encoding="utf-8") + stale_woke.mkdir() + for p in (stale_count, stale_pending, stale_woke): + _backdate(p, 40) + + # Fresh artifacts must survive. + fresh_count = state / "antigravity_save_count_new-conv" + fresh_count.write_text("1", encoding="utf-8") + + # Protected files must never be touched even when stale. 
+ log = state / "antigravity_hook.log" + log.write_text("log line\n", encoding="utf-8") + _backdate(log, 99) + + cmd = f". {COMMON_LIB}; mempal_gc_stale_state" + result = subprocess.run( + ["bash", "-c", cmd], + capture_output=True, + text=True, + env={**os.environ, "HOME": str(home), "MEMPAL_STATE_DIR": str(state)}, + timeout=10, + ) + assert result.returncode == 0, result.stderr + + assert not stale_count.exists(), "stale counter file not swept" + assert not stale_pending.exists(), "stale pending marker not swept" + assert not stale_woke.exists(), "stale woke marker dir not swept" + assert fresh_count.exists(), "fresh counter file was wrongly swept" + assert log.exists(), "protected hook.log was swept (name glob too broad)" + + +def test_gc_is_throttled_to_once_per_day(tmp_path: Path) -> None: + """A fresh antigravity_last_sweep marker (<24h) skips the sweep.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + state.mkdir(parents=True, exist_ok=True) + + # Fresh sweep marker — GC should bail before touching anything. + marker = state / "antigravity_last_sweep" + marker.write_text("", encoding="utf-8") + + stale_count = state / "antigravity_save_count_old-conv" + stale_count.write_text("7", encoding="utf-8") + _backdate(stale_count, 40) + + cmd = f". {COMMON_LIB}; mempal_gc_stale_state" + result = subprocess.run( + ["bash", "-c", cmd], + capture_output=True, + text=True, + env={**os.environ, "HOME": str(home), "MEMPAL_STATE_DIR": str(state)}, + timeout=10, + ) + assert result.returncode == 0, result.stderr + # Throttled: stale file should still be present. 
+ assert stale_count.exists(), ( + "GC ran despite a fresh antigravity_last_sweep marker; throttle failed" + ) + + +def test_gc_runs_when_marker_is_stale(tmp_path: Path) -> None: + """A stale antigravity_last_sweep marker (>24h) lets the sweep run.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + state.mkdir(parents=True, exist_ok=True) + + marker = state / "antigravity_last_sweep" + marker.write_text("", encoding="utf-8") + _backdate(marker, 2) # 2 days old -> stale + + stale_count = state / "antigravity_save_count_old-conv" + stale_count.write_text("7", encoding="utf-8") + _backdate(stale_count, 40) + + cmd = f". {COMMON_LIB}; mempal_gc_stale_state" + result = subprocess.run( + ["bash", "-c", cmd], + capture_output=True, + text=True, + env={**os.environ, "HOME": str(home), "MEMPAL_STATE_DIR": str(state)}, + timeout=10, + ) + assert result.returncode == 0, result.stderr + assert not stale_count.exists(), "stale counter not swept despite stale marker" + + +def test_state_ttl_days_floors_and_strips(tmp_path: Path) -> None: + """mempal_state_ttl_days validates input like mempal_save_interval.""" + home = tmp_path / "home" + _ensure_palace(home) + + def ttl(value: str | None) -> str: + env = {**os.environ, "HOME": str(home)} + if value is None: + env.pop("MEMPAL_STATE_TTL_DAYS", None) + else: + env["MEMPAL_STATE_TTL_DAYS"] = value + out = subprocess.run( + ["bash", "-c", f". {COMMON_LIB}; mempal_state_ttl_days"], + capture_output=True, + text=True, + env=env, + timeout=10, + ) + return out.stdout.strip() + + assert ttl(None) == "30", "default TTL should be 30" + assert ttl("") == "30", "empty TTL falls back to 30" + assert ttl("abc") == "30", "garbage TTL falls back to 30" + assert ttl("7") == "7" + # Leading zeros must be stripped so `find -mtime +N` never sees octal-ish tokens. 
+ assert ttl("007") == "7" + assert ttl("0") == "0" + + +def test_save_hook_calls_gc(tmp_path: Path) -> None: + """The save hook wires mempal_gc_stale_state in and creates the sweep marker.""" + body = SAVE_HOOK.read_text(encoding="utf-8") + assert "mempal_gc_stale_state" in body, "save hook must call mempal_gc_stale_state" + + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_SAVE_INTERVAL": "999"}, + ) + assert result.returncode == 0 + assert (state / "antigravity_last_sweep").exists(), ( + "save hook should have created the antigravity_last_sweep throttle marker" + ) + + +def test_gc_does_not_run_under_kill_switch(tmp_path: Path) -> None: + """When the kill switch trips, the save hook returns before GC runs.""" + state = tmp_path / "state" + home = tmp_path / "home" + _ensure_palace(home) + result = _run_hook( + SAVE_HOOK, + _stop_payload(), + state_dir=state, + home=home, + extra_env={"MEMPAL_DISABLE_HOOK": "1"}, + ) + assert result.returncode == 0 + assert result.stdout.strip() == "{}" + # GC runs after the kill-switch check, so no sweep marker is written. + assert not (state / "antigravity_last_sweep").exists(), ( + "GC ran despite the kill switch being tripped" + ) + + +# ── Python interpreter resolution (mempal_resolve_python) ───────────── +# +# The hooks run `"$MEMPAL_PYTHON_BIN" -m mempalace`, so MEMPAL_PYTHON_BIN +# must resolve to an interpreter that owns the mempalace package. The +# common install path — `uv tool install mempalace` / `pipx install` — +# puts the console scripts on PATH inside an ISOLATED env whose +# interpreter is NOT system python3. The resolver derives that +# interpreter from the console-script shebang so mining works without +# the user having to set MEMPAL_PYTHON. Regression coverage for the +# silent-skip bug a real user hit on PR #1633. 
+ + +def _resolve_python(env: dict[str, str]) -> str: + """Source common.sh under ``env`` and return the resolved MEMPAL_PYTHON_BIN.""" + out = subprocess.run( + ["bash", "-c", f'. {COMMON_LIB}; printf "%s" "$MEMPAL_PYTHON_BIN"'], + capture_output=True, + text=True, + env=env, + timeout=10, + ) + assert out.returncode == 0, out.stderr + return out.stdout.strip() + + +def _make_fake_python(path: Path) -> Path: + """Create an executable file whose basename looks like a Python interpreter.""" + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text("#!/bin/sh\necho fake-python\n", encoding="utf-8") + path.chmod(0o755) + return path + + +def _make_console_script(path: Path, shebang_interp: str) -> Path: + """Create a fake mempalace console script with the given shebang interpreter.""" + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text(f"#!{shebang_interp}\nprint('hi')\n", encoding="utf-8") + path.chmod(0o755) + return path + + +def test_resolve_python_derives_interpreter_from_console_script_shebang( + tmp_path: Path, +) -> None: + """With MEMPAL_PYTHON unset, the resolver reads the mempalace-mcp shebang. + + Simulates a `uv tool install` layout: the console script is on PATH + but its interpreter is an isolated Python, NOT the system python3 + that PATH would otherwise resolve. + """ + home = tmp_path / "home" + _ensure_palace(home) + interp = _make_fake_python(tmp_path / "pyhome" / "python3.12") + bindir = tmp_path / "bin" + _make_console_script(bindir / "mempalace-mcp", str(interp)) + + env = {**os.environ, "HOME": str(home), "PATH": f"{bindir}:/usr/bin:/bin"} + env.pop("MEMPAL_PYTHON", None) + + assert _resolve_python(env) == str(interp), ( + "resolver should derive the interpreter from the mempalace-mcp " + "console-script shebang when MEMPAL_PYTHON is unset" + ) + + +def test_resolve_python_prefers_mcp_script_over_path_python3(tmp_path: Path) -> None: + """The shebang-derived interpreter must win over a system python3 on PATH. 
+ + This is the crux of the fix: a system python3 is present (and would + be picked by the old resolver) but lacks the package, while the + console script's interpreter owns it. + """ + home = tmp_path / "home" + _ensure_palace(home) + interp = _make_fake_python(tmp_path / "pyhome" / "python3.12") + bindir = tmp_path / "bin" + _make_console_script(bindir / "mempalace-mcp", str(interp)) + # A decoy python3 earlier on PATH must be ignored in favour of the shebang. + _make_fake_python(bindir / "python3") + + env = {**os.environ, "HOME": str(home), "PATH": f"{bindir}:/usr/bin:/bin"} + env.pop("MEMPAL_PYTHON", None) + + assert _resolve_python(env) == str(interp), ( + "shebang-derived interpreter must take precedence over a python3 on PATH" + ) + + +def test_resolve_python_override_beats_shebang(tmp_path: Path) -> None: + """An explicit MEMPAL_PYTHON override always wins over shebang derivation.""" + home = tmp_path / "home" + _ensure_palace(home) + override = _make_fake_python(tmp_path / "override" / "python3") + interp = _make_fake_python(tmp_path / "pyhome" / "python3.12") + bindir = tmp_path / "bin" + _make_console_script(bindir / "mempalace-mcp", str(interp)) + + env = { + **os.environ, + "HOME": str(home), + "PATH": f"{bindir}:/usr/bin:/bin", + "MEMPAL_PYTHON": str(override), + } + assert _resolve_python(env) == str(override), ( + "MEMPAL_PYTHON override must take precedence over the console-script shebang" + ) + + +def test_resolve_python_rejects_env_style_shebang(tmp_path: Path) -> None: + """A `#!/usr/bin/env python3` wrapper must be skipped, not used verbatim. + + The first shebang token would be `/usr/bin/env`, which is not a + Python interpreter. The resolver must reject it (basename guard) and + fall through to python3 on PATH rather than trying to run + `/usr/bin/env -m mempalace`. 
+ """ + home = tmp_path / "home" + _ensure_palace(home) + bindir = tmp_path / "bin" + _make_console_script(bindir / "mempalace-mcp", "/usr/bin/env python3") + + env = {**os.environ, "HOME": str(home), "PATH": f"{bindir}:/usr/bin:/bin"} + env.pop("MEMPAL_PYTHON", None) + + resolved = _resolve_python(env) + assert resolved != "/usr/bin/env", "resolver must not return /usr/bin/env" + assert os.path.basename(resolved).startswith("python"), ( + f"resolver should fall back to a python3 on PATH; got {resolved!r}" + ) + + +def test_resolve_python_skips_shebang_interp_that_is_not_executable( + tmp_path: Path, +) -> None: + """A shebang pointing at a missing/non-executable interpreter is skipped. + + Guards against a stale console script whose interpreter was deleted: + the resolver must fall through to python3 rather than returning a + dead path. + """ + home = tmp_path / "home" + _ensure_palace(home) + bindir = tmp_path / "bin" + missing = tmp_path / "pyhome" / "python3.12" # never created -> not -x + _make_console_script(bindir / "mempalace-mcp", str(missing)) + + env = {**os.environ, "HOME": str(home), "PATH": f"{bindir}:/usr/bin:/bin"} + env.pop("MEMPAL_PYTHON", None) + + resolved = _resolve_python(env) + assert resolved != str(missing), "resolver returned a non-executable shebang interp" + assert os.path.basename(resolved).startswith("python"), ( + f"resolver should fall back to python3 on PATH; got {resolved!r}" + ) + + +def test_resolve_python_falls_back_to_path_python3_without_console_scripts( + tmp_path: Path, +) -> None: + """With no mempalace console scripts on PATH, resolve to python3 (prior behaviour).""" + home = tmp_path / "home" + _ensure_palace(home) + + env = {**os.environ, "HOME": str(home), "PATH": "/usr/bin:/bin"} + env.pop("MEMPAL_PYTHON", None) + + resolved = _resolve_python(env) + assert os.path.basename(resolved).startswith("python"), ( + f"resolver should fall back to python3 on PATH; got {resolved!r}" + ) diff --git 
a/tests/test_antigravity_plugin_manifest.py b/tests/test_antigravity_plugin_manifest.py new file mode 100644 index 000000000..f3dc94c76 --- /dev/null +++ b/tests/test_antigravity_plugin_manifest.py @@ -0,0 +1,358 @@ +"""Schema tests for the .antigravity-plugin/ directory. + +Covers: + +* `plugin.json` matches the verified-minimal Antigravity schema + (`{"name": "..."}`, no fabricated fields). +* `mcp_config.json` registers `mempalace-mcp` under the `mcpServers` + key with the verified shape from + https://antigravity.google/docs/mcp. +* `hooks.json.tmpl` is valid JSON, references both hook scripts via + the `__PLUGIN_DIR__` placeholder, and pins per-event timeouts + inside the safety bounds. +* `skills/mempalace/SKILL.md` exists as a real file (no symlinks) and + carries the required YAML frontmatter (`description`). + +These are contract tests — they fail as soon as anyone changes the +in-repo shape in a way that drifts from Antigravity's documented +schema. See [hooks/antigravity/INVESTIGATION.md](../hooks/antigravity/INVESTIGATION.md) +for the source-of-truth audit driving the assertions. 
+""" + +import json +import re +from pathlib import Path + +import pytest + +REPO_ROOT = Path(__file__).resolve().parents[1] +PLUGIN_DIR = REPO_ROOT / ".antigravity-plugin" + +PLUGIN_JSON = PLUGIN_DIR / "plugin.json" +MCP_CONFIG = PLUGIN_DIR / "mcp_config.json" +HOOKS_TMPL = PLUGIN_DIR / "hooks.json.tmpl" +SKILL_MD = PLUGIN_DIR / "skills" / "mempalace" / "SKILL.md" +PLUGIN_README = PLUGIN_DIR / "README.md" + +# Recall layer (mirrors the Cursor branch's recall skill + rule + shared protocol) +RECALL_SKILL_MD = PLUGIN_DIR / "skills" / "mempalace-recall" / "SKILL.md" +RECALL_RULE_MD = PLUGIN_DIR / "rules" / "mempalace-recall.md" +SHARED_PROTOCOL = REPO_ROOT / "integrations" / "shared" / "recall-protocol.md" +INSTALL_SH = REPO_ROOT / "hooks" / "antigravity" / "install.sh" +SHARED_PROTOCOL_REF = ( + "https://github.com/MemPalace/mempalace/blob/main/integrations/shared/recall-protocol.md" +) + +EXPECTED_HOOKS = { + "Stop": { + "script_basename": "mempal_save_hook_antigravity.sh", + "timeout_floor": 10, + "timeout_ceiling": 60, + }, + "PreInvocation": { + "script_basename": "mempal_wake_hook_antigravity.sh", + "timeout_floor": 1, + "timeout_ceiling": 10, + }, +} + + +def test_plugin_dir_exists() -> None: + """The in-repo plugin directory exists and is laid out as expected.""" + assert PLUGIN_DIR.is_dir(), f"missing: {PLUGIN_DIR}" + for required in (PLUGIN_JSON, MCP_CONFIG, HOOKS_TMPL, SKILL_MD, PLUGIN_README): + assert required.is_file(), f"missing: {required}" + + +def test_plugin_json_minimal_schema() -> None: + """plugin.json must be `{"name": "mempalace"}` exactly — no fabricated fields. + + The third-party "antigravity-plugins" community skill at + ~/.gemini/skills/antigravity-plugins/SKILL.md documents a + `permissions` field that does not exist in any real + Google-shipped plugin. We pin to the verified minimal shape and + fail loudly if anyone re-introduces the fabrication. 
+ """ + data = json.loads(PLUGIN_JSON.read_text(encoding="utf-8")) + assert isinstance(data, dict), "plugin.json must be a JSON object" + assert data == {"name": "mempalace"}, ( + f"plugin.json must equal {{'name': 'mempalace'}} (verified shape); " + f"got {data!r}. The `permissions` field documented in the third-party " + "antigravity-plugins community skill is fabricated; do not add it." + ) + + +def test_mcp_config_registers_mempalace_mcp() -> None: + """mcp_config.json must register the mempalace stdio server.""" + data = json.loads(MCP_CONFIG.read_text(encoding="utf-8")) + assert isinstance(data, dict) + assert "mcpServers" in data, "missing top-level mcpServers key" + servers = data["mcpServers"] + assert isinstance(servers, dict) + assert "mempalace" in servers, "mcpServers.mempalace not registered" + entry = servers["mempalace"] + assert isinstance(entry, dict) + assert entry.get("command") == "mempalace-mcp", ( + f"mcpServers.mempalace.command must be 'mempalace-mcp'; got {entry.get('command')!r}" + ) + + +def test_hooks_template_valid_json() -> None: + """hooks.json.tmpl must be valid JSON (the `__PLUGIN_DIR__` placeholder is JSON-safe).""" + body = HOOKS_TMPL.read_text(encoding="utf-8") + try: + data = json.loads(body) + except json.JSONDecodeError as exc: + pytest.fail(f"hooks.json.tmpl is not valid JSON: {exc}") + assert isinstance(data, dict) + + +def test_hooks_template_uses_plugin_dir_placeholder() -> None: + """hooks.json.tmpl must use __PLUGIN_DIR__ — never bake an absolute path.""" + body = HOOKS_TMPL.read_text(encoding="utf-8") + assert "__PLUGIN_DIR__" in body, ( + "hooks.json.tmpl must use __PLUGIN_DIR__ as the install-dir placeholder. " + "Hard-coded absolute paths break the installer's idempotency promise." + ) + # Any `/Users/`, `/home/`, or `~/` segment in the template body is a sign + # that an absolute path leaked in. 
+ forbidden = ["/Users/", "/home/", "~/"] + for prefix in forbidden: + assert prefix not in body, ( + f"hooks.json.tmpl must not contain a hard-coded path segment {prefix!r}; " + "use the __PLUGIN_DIR__ placeholder instead." + ) + + +@pytest.mark.parametrize("event", sorted(EXPECTED_HOOKS)) +def test_hooks_template_event_present(event: str) -> None: + """Each expected event has exactly one entry pointing at the right script with bounded timeout.""" + data = json.loads(HOOKS_TMPL.read_text(encoding="utf-8")) + bounds = EXPECTED_HOOKS[event] + # Outer keys are hook namespace names, e.g. "mempalace-save". + matching = [ + (ns, payload[event]) + for ns, payload in data.items() + if isinstance(payload, dict) and event in payload + ] + assert len(matching) == 1, ( + f"expected exactly one hook namespace declaring event {event!r}; " + f"found {len(matching)}: {[m[0] for m in matching]}" + ) + _, entries = matching[0] + assert isinstance(entries, list) + assert len(entries) == 1, ( + f"{event}: expected exactly one handler entry, got {len(entries)}; " + "duplicate entries would double-fire the hook" + ) + handler = entries[0] + assert handler.get("type", "command") == "command", ( + f"{event}: only type=command is supported by Antigravity" + ) + cmd = handler.get("command", "") + assert cmd.startswith("__PLUGIN_DIR__/"), ( + f"{event}: command must be rooted at __PLUGIN_DIR__/, got {cmd!r}" + ) + assert cmd.endswith("/" + bounds["script_basename"]), ( + f"{event}: command must end with the expected script basename " + f"{bounds['script_basename']!r}; got {cmd!r}" + ) + timeout = handler.get("timeout") + is_int = isinstance(timeout, int) and not isinstance(timeout, bool) + assert is_int and bounds["timeout_floor"] <= timeout <= bounds["timeout_ceiling"], ( + f"{event}: timeout must be an int in " + f"[{bounds['timeout_floor']}, {bounds['timeout_ceiling']}]s; got {timeout!r}" + ) + + +def test_skill_is_real_file_not_symlink() -> None: + """SKILL.md at the discovery path must 
be a real file. + + Antigravity (like Cursor) loads skills by reading + `/skills//SKILL.md` directly. A symlink at that path + would work locally but break under any installer that does a + plain `cp`. Honouring constraint #6 in the integration brief. + """ + assert SKILL_MD.is_file(), f"missing: {SKILL_MD}" + assert not SKILL_MD.is_symlink(), ( + f"{SKILL_MD} must be a real file, not a symlink — installers that " + "cp without -L would otherwise carry the symlink into the install." + ) + + +def test_skill_has_required_frontmatter() -> None: + """SKILL.md must carry YAML frontmatter with a non-empty description. + + Antigravity's skill loader uses the `description` field to decide + when to surface the skill. An empty / missing description would + silently disable progressive disclosure. + """ + body = SKILL_MD.read_text(encoding="utf-8") + assert body.startswith("---\n"), "SKILL.md must begin with YAML frontmatter" + end = body.find("\n---\n", 4) + assert end > 0, "SKILL.md frontmatter is missing the closing fence" + front = body[4:end] + desc_match = re.search(r"^description:\s*(.+)$", front, re.MULTILINE) + assert desc_match is not None, "SKILL.md frontmatter missing `description` key" + desc_value = desc_match.group(1).strip() + assert desc_value, "SKILL.md `description` is empty" + # Sanity: the description should be substantive enough for the + # skill loader to act on. 30 chars is a soft floor, not a tight bound. + assert len(desc_value) >= 30, ( + f"SKILL.md description looks too short to be useful: {desc_value!r}" + ) + + +def test_no_symlinks_inside_plugin_dir() -> None: + """Nothing inside .antigravity-plugin/ may be a symlink. + + This is the broader version of `test_skill_is_real_file_not_symlink` + and a guard against silent regressions if someone re-introduces + the `skills -> ../skills` symlink pattern from the original plan + without honouring `cp -RL` semantics in the installer. 
+ """ + leaks = [p for p in PLUGIN_DIR.rglob("*") if p.is_symlink()] + assert not leaks, ( + f"symlinks found inside .antigravity-plugin/: {[str(p.relative_to(PLUGIN_DIR)) for p in leaks]}; " + "the entire plugin tree must be made of real files so any installer " + "(including those that cp without -L) gets a working install." + ) + + +def test_plugin_readme_present_and_substantive() -> None: + """README.md inside the plugin dir must exist and be substantive. + + Empty / placeholder READMEs are a frequent symptom of half-finished + refactors; a 200-byte floor catches those without being so tight + it discourages legitimate rewrites. + """ + body = PLUGIN_README.read_text(encoding="utf-8") + assert len(body) >= 200, ( + f".antigravity-plugin/README.md looks too short ({len(body)} bytes); " + "expected a substantive description of layout + install." + ) + # Must mention key concepts so the README can't degrade into prose + # that drops the operational links. + for needle in ("plugin.json", "mcp_config.json", "hooks.json"): + assert needle in body, f"README.md must mention {needle}" + + +# ── Recall layer: skill, rule, shared protocol ─────────────────────── +# +# Mirrors the three-layer recall wiring added for Cursor on +# feat/cursor-hooks-support, adapted for Antigravity's plugin surface. +# The wake hook (PreInvocation) is the eager layer; these files are the +# on-demand layers (skill + optional rule), both anchored to the single +# canonical protocol so they can never drift. + + +def test_shared_protocol_exists() -> None: + """The canonical recall protocol is the single source of truth. + + The recall skill and rule both link here rather than restating the + protocol, so the rule can never drift from the skill. 
+ """ + assert SHARED_PROTOCOL.is_file(), f"missing: {SHARED_PROTOCOL}" + body = SHARED_PROTOCOL.read_text(encoding="utf-8") + assert "MemPalace Recall Protocol" in body, "shared protocol must carry its canonical title" + + +def test_recall_skill_exists() -> None: + """The recall-only skill must be a real file at the discovery path.""" + assert RECALL_SKILL_MD.is_file(), f"missing: {RECALL_SKILL_MD}" + assert not RECALL_SKILL_MD.is_symlink(), ( + f"{RECALL_SKILL_MD} must be a real file, not a symlink — installers " + "that cp without -L would otherwise carry the symlink into the install." + ) + + +def test_recall_skill_has_required_frontmatter() -> None: + """The recall skill must carry YAML frontmatter with a non-empty description. + + Antigravity's skill loader uses `description` for progressive + disclosure, exactly like the ops `mempalace` skill. + """ + body = RECALL_SKILL_MD.read_text(encoding="utf-8") + assert body.startswith("---\n"), "recall SKILL.md must begin with YAML frontmatter" + end = body.find("\n---\n", 4) + assert end > 0, "recall SKILL.md frontmatter is missing the closing fence" + front = body[4:end] + desc_match = re.search(r"^description:\s*(.+)$", front, re.MULTILINE) + assert desc_match is not None, "recall SKILL.md frontmatter missing `description` key" + assert desc_match.group(1).strip(), "recall SKILL.md `description` is empty" + + +def test_recall_skill_has_required_sections() -> None: + """The recall skill must carry the load-bearing protocol sections.""" + body = RECALL_SKILL_MD.read_text(encoding="utf-8") + for section in ( + "When to recall", + "Protocol", + "Tool selection", + "Unhappy paths", + "Anti-patterns", + ): + assert section in body, f"recall SKILL.md must contain a '{section}' section" + + +def test_recall_skill_links_to_shared_protocol() -> None: + """The recall skill must defer to the canonical protocol, not restate it.""" + body = RECALL_SKILL_MD.read_text(encoding="utf-8") + assert SHARED_PROTOCOL_REF in body, ( + 
f"recall SKILL.md must reference {SHARED_PROTOCOL_REF} so the protocol stays single-sourced" + ) + + +def test_recall_rule_exists() -> None: + """The optional recall rule must be a real file under the plugin rules dir.""" + assert RECALL_RULE_MD.is_file(), f"missing: {RECALL_RULE_MD}" + assert not RECALL_RULE_MD.is_symlink(), f"{RECALL_RULE_MD} must be a real file" + + +def test_recall_rule_is_plain_markdown_not_mdc() -> None: + """Antigravity rules are plain `.md` with no YAML frontmatter. + + Per https://antigravity.google/docs plugins use `rules/.md` + (no `.mdc`, no Cursor-style `alwaysApply` frontmatter). Pin the + plain-markdown shape so nobody copies the Cursor `.mdc` verbatim. + """ + assert RECALL_RULE_MD.suffix == ".md", "Antigravity rule must use the .md extension" + assert not (RECALL_RULE_MD.parent / "mempalace-recall.mdc").exists(), ( + "an .mdc rule leaked in; Antigravity rules are plain .md" + ) + body = RECALL_RULE_MD.read_text(encoding="utf-8") + assert not body.startswith("---"), ( + "Antigravity rule must NOT carry YAML frontmatter (no Cursor-style " + "`alwaysApply`); it is plain markdown" + ) + + +def test_recall_rule_references_shared_protocol() -> None: + """The rule must point at the canonical protocol and the deeper skill.""" + body = RECALL_RULE_MD.read_text(encoding="utf-8") + assert SHARED_PROTOCOL_REF in body, f"recall rule must reference {SHARED_PROTOCOL_REF}" + + +def test_installer_creates_rules_dir() -> None: + """install.sh must create the skills/mempalace-recall and rules dirs.""" + body = INSTALL_SH.read_text(encoding="utf-8") + assert '"$INSTALL_DIR/rules"' in body, "install.sh mkdir block must create the rules/ directory" + assert '"$INSTALL_DIR/skills/mempalace-recall"' in body, ( + "install.sh mkdir block must create the skills/mempalace-recall/ directory" + ) + + +def test_installer_copies_recall_skill() -> None: + """install.sh must copy the recall skill into the install dir.""" + body = 
INSTALL_SH.read_text(encoding="utf-8") + assert "skills/mempalace-recall/SKILL.md" in body, ( + "install.sh must copy_file the mempalace-recall skill" + ) + + +def test_installer_copies_recall_rule() -> None: + """install.sh must copy the recall rule into the install dir.""" + body = INSTALL_SH.read_text(encoding="utf-8") + assert "rules/mempalace-recall.md" in body, ( + "install.sh must copy_file the mempalace-recall rule" + ) diff --git a/tests/test_backends.py b/tests/test_backends.py index f7845825e..a7be828da 100644 --- a/tests/test_backends.py +++ b/tests/test_backends.py @@ -768,6 +768,33 @@ def test_fix_blob_seq_ids_writes_marker_when_already_integer(tmp_path): assert marker.is_file(), "marker must be written even when no BLOBs found" +def test_fix_blob_seq_ids_closes_sqlite_connection(tmp_path, monkeypatch): + """The migration closes sqlite connections after the pre-open probe.""" + db_path = tmp_path / "chroma.sqlite3" + with closing(sqlite3.connect(str(db_path))) as conn: + conn.execute("CREATE TABLE embeddings (rowid INTEGER PRIMARY KEY, seq_id INTEGER)") + conn.execute("INSERT INTO embeddings (seq_id) VALUES (42)") + conn.commit() + + closed = [] + real_connect = sqlite3.connect + + class TrackingConnection(sqlite3.Connection): + def close(self): + closed.append(True) + super().close() + + def tracking_connect(*args, **kwargs): + kwargs["factory"] = TrackingConnection + return real_connect(*args, **kwargs) + + monkeypatch.setattr("mempalace.backends.chroma.sqlite3.connect", tracking_connect) + + _fix_blob_seq_ids(str(tmp_path)) + + assert closed == [True] + + def test_fix_blob_seq_ids_skips_sqlite_when_marker_present(tmp_path): """When the marker exists, ``_fix_blob_seq_ids`` does not open sqlite3. 
@@ -1467,6 +1494,35 @@ def test_quarantine_invalid_hnsw_metadata_keeps_consistent_missing_dimensionalit assert seg.exists() +def test_quarantine_invalid_hnsw_metadata_keeps_post_deletion_missing_dimensionality(tmp_path): + """A deleted-from segment has total_elements_added > live label count (the + counter is monotonic); that dim-None shape is recoverable, not corruption (#1710). + """ + palace = tmp_path / "palace" + palace.mkdir() + seg = palace / "abcd-1234-5678" + seg.mkdir() + (seg / "data_level0.bin").write_bytes(b"x" * 2048) + (seg / "link_lists.bin").write_bytes(b"x" * 128) + with open(seg / "index_metadata.pickle", "wb") as f: + pickle.dump( + { + "dimensionality": None, + "total_elements_added": 5, + "max_seq_id": None, + "id_to_label": {"a": 1, "b": 2}, + "label_to_id": {1: "a", 2: "b"}, + "id_to_seq_id": {}, + }, + f, + ) + + moved = quarantine_invalid_hnsw_metadata(str(palace)) + + assert moved == [] + assert seg.exists() + + def test_quarantine_invalid_hnsw_metadata_renames_mismatched_missing_dimensionality(tmp_path): palace = tmp_path / "palace" palace.mkdir() diff --git a/tests/test_backups.py b/tests/test_backups.py new file mode 100644 index 000000000..b11769c96 --- /dev/null +++ b/tests/test_backups.py @@ -0,0 +1,157 @@ +"""Tests for backup retention pruning (mempalace.backups.prune_backups). + +These guard the fix for unbounded backup growth: ``mempalace migrate`` and +``mempalace repair max-seq-id`` each drop a fresh full-size, timestamped copy +every run, and used to never delete the old ones — a palace was found with +hundreds of GB of stale backups beside a few hundred MB of live data. 
+""" + +import os + +import pytest + +from mempalace.backups import prune_backups + + +def _make_backup_dir(parent, name, mtime): + """Create a directory backup with a fixed mtime.""" + path = parent / name + path.mkdir() + (path / "chroma.sqlite3").write_text("db") + os.utime(path, (mtime, mtime)) + return path + + +def _make_backup_file(parent, name, mtime): + """Create a file backup with a fixed mtime.""" + path = parent / name + path.write_text("db") + os.utime(path, (mtime, mtime)) + return path + + +def test_prune_keeps_newest_and_removes_oldest(tmp_path): + # 5 backups, mtimes 100..500; keep 2 newest (400, 500). + paths = [_make_backup_file(tmp_path, f"b.{i}", mtime=i * 100) for i in range(1, 6)] + + removed = prune_backups(str(tmp_path / "b.*"), max_backups=2) + + surviving = {p.name for p in tmp_path.iterdir()} + assert surviving == {"b.4", "b.5"} + assert set(removed) == {str(paths[0]), str(paths[1]), str(paths[2])} + + +def test_prune_removes_directory_backups(tmp_path): + """migrate writes directory backups (full copytree) — must rmtree them.""" + _make_backup_dir(tmp_path, "palace.pre-migrate.1", mtime=100) + _make_backup_dir(tmp_path, "palace.pre-migrate.2", mtime=200) + keep = _make_backup_dir(tmp_path, "palace.pre-migrate.3", mtime=300) + + removed = prune_backups(str(tmp_path / "palace.pre-migrate.*"), max_backups=1) + + assert keep.is_dir() + assert len(removed) == 2 + assert not (tmp_path / "palace.pre-migrate.1").exists() + assert not (tmp_path / "palace.pre-migrate.2").exists() + + +def test_prune_noop_when_under_limit(tmp_path): + _make_backup_file(tmp_path, "b.1", mtime=100) + _make_backup_file(tmp_path, "b.2", mtime=200) + + removed = prune_backups(str(tmp_path / "b.*"), max_backups=10) + + assert removed == [] + assert len(list(tmp_path.iterdir())) == 2 + + +def test_prune_noop_when_exactly_at_limit(tmp_path): + _make_backup_file(tmp_path, "b.1", mtime=100) + _make_backup_file(tmp_path, "b.2", mtime=200) + + removed = 
prune_backups(str(tmp_path / "b.*"), max_backups=2) + + assert removed == [] + + +@pytest.mark.parametrize("disabled", [0, -1, None]) +def test_prune_disabled_keeps_everything(tmp_path, disabled): + for i in range(1, 6): + _make_backup_file(tmp_path, f"b.{i}", mtime=i * 100) + + removed = prune_backups(str(tmp_path / "b.*"), max_backups=disabled) + + assert removed == [] + assert len(list(tmp_path.iterdir())) == 5 + + +def test_prune_no_matches(tmp_path): + assert prune_backups(str(tmp_path / "nope.*"), max_backups=3) == [] + + +def test_prune_only_touches_matching_pattern(tmp_path): + """Live data and unrelated files must never be swept up by a backup glob.""" + _make_backup_file(tmp_path, "chroma.sqlite3.max-seq-id-backup-1", mtime=100) + _make_backup_file(tmp_path, "chroma.sqlite3.max-seq-id-backup-2", mtime=200) + _make_backup_file(tmp_path, "chroma.sqlite3.max-seq-id-backup-3", mtime=300) + # The live database and an unrelated file — must survive. + live = _make_backup_file(tmp_path, "chroma.sqlite3", mtime=400) + other = _make_backup_file(tmp_path, "tunnels.json", mtime=400) + + prune_backups( + str(tmp_path / "chroma.sqlite3.max-seq-id-backup-*"), + max_backups=1, + ) + + assert live.exists() + assert other.exists() + assert (tmp_path / "chroma.sqlite3.max-seq-id-backup-3").exists() + assert not (tmp_path / "chroma.sqlite3.max-seq-id-backup-1").exists() + assert not (tmp_path / "chroma.sqlite3.max-seq-id-backup-2").exists() + + +def test_prune_respects_glob_escape_for_metacharacter_paths(tmp_path): + """Palace paths can contain glob metacharacters like ``[``. + + Without ``glob.escape`` the pattern would silently match nothing (the + bracket is read as a character class), leaving backups unpruned. Callers + escape the literal prefix; this confirms the helper prunes correctly once + they do. 
+ """ + import glob + + weird = tmp_path / "weird[name]" + weird.mkdir() + for i in range(1, 4): + _make_backup_file(weird, f"chroma.sqlite3.max-seq-id-backup-{i}", mtime=i * 100) + + pattern = os.path.join(glob.escape(str(weird)), "chroma.sqlite3.max-seq-id-backup-*") + removed = prune_backups(pattern, max_backups=1) + + assert len(removed) == 2 + assert (weird / "chroma.sqlite3.max-seq-id-backup-3").exists() + + +def test_prune_is_best_effort_on_delete_failure(tmp_path, monkeypatch): + """A failed deletion is logged and skipped, never raised — pruning must + not undo a migrate/repair that already succeeded.""" + for i in range(1, 5): + _make_backup_file(tmp_path, f"b.{i}", mtime=i * 100) + + real_remove = os.remove + + def flaky_remove(path): + if path.endswith("b.1"): + raise OSError("permission denied") + return real_remove(path) + + monkeypatch.setattr(os, "remove", flaky_remove) + + logs = [] + removed = prune_backups(str(tmp_path / "b.*"), max_backups=2, log=logs.append) + + # b.1 and b.2 were over the limit; b.1 failed, b.2 succeeded. + assert str(tmp_path / "b.2") in removed + assert str(tmp_path / "b.1") not in removed + assert (tmp_path / "b.1").exists() + assert any("could not remove" in line for line in logs) diff --git a/tests/test_cli.py b/tests/test_cli.py index 3346b5cdd..d8977dc9b 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -6,6 +6,7 @@ import sqlite3 import subprocess import sys +from contextlib import closing from pathlib import Path from unittest.mock import MagicMock, call, patch @@ -1082,6 +1083,131 @@ def test_cmd_repair_restores_backup_on_live_rebuild_failure(mock_config_cls, tmp ] +def _repair_backend_mocks(mock_config_cls, palace_dir, create_collection_results=None): + """Config + backend mocks for a 2-drawer legacy repair run. + + ``create_collection_results`` overrides the ``create_collection`` + side_effect sequence; the default is a temp + live collection pair + that succeeds. 
+ """ + mock_config_cls.return_value.palace_path = str(palace_dir) + mock_config_cls.return_value.collection_name = "mempalace_drawers" + mock_col = MagicMock() + mock_col.count.return_value = 2 + mock_col.get.return_value = { + "ids": ["id1", "id2"], + "documents": ["doc1", "doc2"], + "metadatas": [{"wing": "a"}, {"wing": "b"}], + } + if create_collection_results is None: + mock_temp_col = MagicMock() + mock_temp_col.count.return_value = 2 + mock_new_col = MagicMock() + mock_new_col.count.return_value = 2 + create_collection_results = [mock_temp_col, mock_new_col] + mock_backend = _mock_backend_for(col=mock_col) + mock_backend.create_collection.side_effect = create_collection_results + return mock_backend + + +@patch("mempalace.cli.MempalaceConfig") +def test_cmd_repair_closes_handles_then_rebuilds_fts5(mock_config_cls, tmp_path): + """cmd_repair must close chroma handles, then run _vacuum_and_rebuild_fts5. + + Mirrors test_rebuild_index_calls_vacuum in test_repair.py: ChromaDB's + PersistentClient holds an open connection to chroma.sqlite3 and VACUUM + requires exclusive access, so the handles must be released first. See + #1747: the legacy path skipped this cleanup entirely. 
+ """ + palace_dir = tmp_path / "palace" + palace_dir.mkdir() + sqlite3.connect(str(palace_dir / "chroma.sqlite3")).close() + args = argparse.Namespace(palace=None, yes=True) + mock_backend = _repair_backend_mocks(mock_config_cls, palace_dir) + + call_order = [] + with ( + patch("mempalace.backends.chroma.ChromaBackend", return_value=mock_backend), + patch( + "mempalace.repair._close_chroma_handles", + side_effect=lambda *a, **kw: call_order.append("close"), + ) as mock_close, + patch( + "mempalace.repair._vacuum_and_rebuild_fts5", + side_effect=lambda *a, **kw: call_order.append("vacuum"), + ) as mock_vacuum, + ): + cmd_repair(args) + + mock_close.assert_called_once() + mock_vacuum.assert_called_once() + assert call_order == ["close", "vacuum"], "handles must be closed before VACUUM" + vacuum_args, _ = mock_vacuum.call_args + assert vacuum_args[0] == str(palace_dir) + + +@patch("mempalace.cli.MempalaceConfig") +def test_cmd_repair_success_rebuilds_fts5_and_vacuums(mock_config_cls, tmp_path, capsys): + """A clean legacy repair leaves the FTS5 index rebuilt and the file vacuumed. + + Regression test for #1747: cmd_repair printed "Repair complete" without + ever running _vacuum_and_rebuild_fts5, so the bulk delete + re-upsert + cycle left the FTS5 inverted index inconsistent and the next repair run + aborted at the integrity preflight. The two banners are the user-visible + contract that the cleanup ran. 
+ """ + palace_dir = tmp_path / "palace" + palace_dir.mkdir() + with closing(sqlite3.connect(str(palace_dir / "chroma.sqlite3"))) as conn: + conn.execute( + "CREATE VIRTUAL TABLE embedding_fulltext_search" + " USING fts5(string_value, tokenize='unicode61')" + ) + conn.execute("INSERT INTO embedding_fulltext_search(string_value) VALUES('hello world')") + conn.commit() + args = argparse.Namespace(palace=None, yes=True) + mock_backend = _repair_backend_mocks(mock_config_cls, palace_dir) + + with patch("mempalace.backends.chroma.ChromaBackend", return_value=mock_backend): + cmd_repair(args) + + out = capsys.readouterr().out + assert "Repair complete" in out + assert "FTS5 index rebuilt." in out + assert "SQLite VACUUM complete." in out + assert "post-repair cleanup failed" not in out + with closing(sqlite3.connect(str(palace_dir / "chroma.sqlite3"))) as conn: + result = conn.execute("PRAGMA quick_check").fetchall() + assert result == [("ok",)] + + +@patch("mempalace.cli.MempalaceConfig") +def test_cmd_repair_does_not_vacuum_when_rebuild_fails(mock_config_cls, tmp_path, capsys): + """Post-run FTS5 cleanup must not fire when the rebuild itself failed.""" + palace_dir = tmp_path / "palace" + palace_dir.mkdir() + sqlite3.connect(str(palace_dir / "chroma.sqlite3")).close() + args = argparse.Namespace(palace=None, yes=True) + mock_temp_col = MagicMock() + mock_temp_col.count.return_value = 2 + mock_backend = _repair_backend_mocks( + mock_config_cls, + palace_dir, + create_collection_results=[mock_temp_col, RuntimeError("live build failed")], + ) + + with ( + patch("mempalace.backends.chroma.ChromaBackend", return_value=mock_backend), + patch("mempalace.repair._vacuum_and_rebuild_fts5") as mock_vacuum, + ): + with pytest.raises(SystemExit) as excinfo: + cmd_repair(args) + + assert "Repair failed" in capsys.readouterr().out + assert excinfo.value.code == 1 + mock_vacuum.assert_not_called() + + @patch("mempalace.cli.MempalaceConfig") def 
test_cmd_repair_aborts_without_confirmation(mock_config_cls, tmp_path, capsys): palace_dir = tmp_path / "palace" diff --git a/tests/test_closets.py b/tests/test_closets.py index e57bf34b3..4744c1cb9 100644 --- a/tests/test_closets.py +++ b/tests/test_closets.py @@ -543,14 +543,21 @@ def test_closet_boost_marks_hit_as_drawer_plus_closet(self, palace_path, seeded_ ``closet_preview`` exposes the hydrated index line.""" closets = get_closets_collection(palace_path) # Seed the closet against the same source_file the drawer uses so - # the boost lookup keys align. - closets.upsert( - ids=["closet_proj_backend_aaa_01"], - documents=["JWT auth tokens|;|→drawer_proj_backend_aaa"], - metadatas=[{"wing": "project", "room": "backend", "source_file": "auth.py"}], + # the boost lookup keys align. Use several high-signal closet lines + # instead of one terse pointer so the ranking is stable across Chroma + # platform builds. + upsert_closet_lines( + closets, + closet_id_base="closet_proj_backend_aaa", + lines=[ + "JWT auth tokens|;|→drawer_proj_backend_aaa", + "session expiry authentication module|;|→drawer_proj_backend_aaa", + "HttpOnly refresh cookies|;|→drawer_proj_backend_aaa", + ], + metadata={"wing": "project", "room": "backend", "source_file": "auth.py"}, ) - result = search_memories("JWT authentication", palace_path) + result = search_memories("JWT auth tokens expiry", palace_path) assert result["results"], "hybrid search should still return results" # The JWT-bearing drawer should surface with closet agreement. boosted = [h for h in result["results"] if h["matched_via"] == "drawer+closet"] @@ -1518,3 +1525,304 @@ def test_hybrid_search_enrichment_populates_drawer_index_and_total(self, palace_ # Enriched text must include the grep-best chunk plus one neighbor # on each side (chunk boundary may clip). assert "chunk_" in top["text"] + + def test_expand_isolates_chunks_by_parent_drawer_id_when_source_file_shared(self, palace_path): + """Regression for #1580. 
After #1539 the chunked ``tool_add_drawer`` + path stores per-chunk drawers tagged with a ``parent_drawer_id`` + linking them to the logical group. If two unrelated logical + drawers happen to share the same ``source_file`` (e.g. two pastes + labelled ``source_file="chat.log"``), filtering only by + ``source_file + chunk_index`` pulls chunks from both groups as if + they were sequential neighbors, corrupting the enriched text. + Scoping by ``parent_drawer_id`` when present keeps each logical + group isolated. (``tool_diary_write`` chunks tag a different key + (``parent_entry_id``) and are written without ``source_file``, so + they never reach this enrichment path.) + """ + col = get_collection(palace_path) + source = "shared.log" + # Group A: 2 chunks under parent_drawer_id="drawer_A". + col.upsert( + ids=["drawer_A_chunk_000000", "drawer_A_chunk_000001"], + documents=["alpha-A-chunk-0 content", "alpha-A-chunk-1 content"], + metadatas=[ + { + "wing": "w", + "room": "r", + "source_file": source, + "chunk_index": 0, + "parent_drawer_id": "drawer_A", + "filed_at": "2026-04-13T00:00:00", + }, + { + "wing": "w", + "room": "r", + "source_file": source, + "chunk_index": 1, + "parent_drawer_id": "drawer_A", + "filed_at": "2026-04-13T00:00:00", + }, + ], + ) + # Group B: 2 chunks under the SAME source_file but a different + # parent_drawer_id. Chunk indices intentionally collide with A. 
+ col.upsert( + ids=["drawer_B_chunk_000000", "drawer_B_chunk_000001"], + documents=["bravo-B-chunk-0 content", "bravo-B-chunk-1 content"], + metadatas=[ + { + "wing": "w", + "room": "r", + "source_file": source, + "chunk_index": 0, + "parent_drawer_id": "drawer_B", + "filed_at": "2026-04-13T00:00:00", + }, + { + "wing": "w", + "room": "r", + "source_file": source, + "chunk_index": 1, + "parent_drawer_id": "drawer_B", + "filed_at": "2026-04-13T00:00:00", + }, + ], + ) + + matched_doc = "alpha-A-chunk-0 content" + matched_meta = { + "source_file": source, + "chunk_index": 0, + "parent_drawer_id": "drawer_A", + } + out = _expand_with_neighbors(col, matched_doc, matched_meta, radius=1) + text = out["text"] + # Group A's chunks are returned in chunk_index order. + assert "alpha-A-chunk-0" in text + assert "alpha-A-chunk-1" in text + # No leakage of group B's chunks through the shared source_file key. + assert "bravo-B-chunk-0" not in text + assert "bravo-B-chunk-1" not in text + # total_drawers is scoped to the parent group so the caller sees a + # count consistent with the text returned (2 chunks in group A), + # not 4 (every row sharing the source_file key). + assert out["total_drawers"] == 2 + assert out["drawer_index"] == 0 + + def test_expand_backwards_compat_no_parent_drawer_id_returns_all_source_neighbors( + self, palace_path + ): + """Drawers without a ``parent_drawer_id`` (single-chunk writes, + legacy palaces, ``diary_ingest`` chunks grouped by real file path) + must take the 2-clause fallback (``source_file + chunk_index``) + unchanged, so neighbor expansion still works file-globally for + those callers. 
+ """ + col, _ = self._seed_source_file(palace_path, "/proj/legacy.md", n_chunks=5) + matched_meta = {"source_file": "/proj/legacy.md", "chunk_index": 2} + out = _expand_with_neighbors( + col, "chunk_2 content about topic alpha", matched_meta, radius=1 + ) + # Same expectations as test_expand_returns_matched_plus_neighbors: + # no parent_drawer_id anywhere, so behavior is unchanged. + assert out["total_drawers"] == 5 + assert out["drawer_index"] == 2 + text = out["text"] + assert "chunk_1" in text + assert "chunk_2" in text + assert "chunk_3" in text + + def test_hybrid_search_enrichment_isolates_chunks_across_drawers_sharing_source_file( + self, palace_path + ): + """End-to-end for #1580. Two oversized add_drawer-shape groups + share a ``source_file``, a closet boosts that source, and the + ranked hit lands on group A. The enrichment step in + ``search_memories`` must return only group A's text, not a mix + of A and B chunks stitched as if they were sequential context. + """ + col = get_collection(palace_path) + source = "/proj/shared_log.md" + # Group A: 2 chunks under parent_drawer_id "drawer_proj_log_aaa". + col.upsert( + ids=[ + "drawer_proj_log_aaa_chunk_000000", + "drawer_proj_log_aaa_chunk_000001", + ], + documents=[ + "alpha JWT authentication flow", + "alpha continues the auth narrative", + ], + metadatas=[ + { + "wing": "proj", + "room": "log", + "source_file": source, + "chunk_index": 0, + "parent_drawer_id": "drawer_proj_log_aaa", + "filed_at": "2026-04-13T00:00:00", + }, + { + "wing": "proj", + "room": "log", + "source_file": source, + "chunk_index": 1, + "parent_drawer_id": "drawer_proj_log_aaa", + "filed_at": "2026-04-13T00:00:00", + }, + ], + ) + # Group B: 2 chunks under the SAME source_file but a different + # parent_drawer_id, with content unrelated to the JWT query. 
+ col.upsert( + ids=[ + "drawer_proj_log_bbb_chunk_000000", + "drawer_proj_log_bbb_chunk_000001", + ], + documents=[ + "bravo unrelated topic about database migrations", + "bravo continues with PostgreSQL specifics", + ], + metadatas=[ + { + "wing": "proj", + "room": "log", + "source_file": source, + "chunk_index": 0, + "parent_drawer_id": "drawer_proj_log_bbb", + "filed_at": "2026-04-13T00:00:00", + }, + { + "wing": "proj", + "room": "log", + "source_file": source, + "chunk_index": 1, + "parent_drawer_id": "drawer_proj_log_bbb", + "filed_at": "2026-04-13T00:00:00", + }, + ], + ) + # Closet pointing at group A's first chunk for this source. + closets = get_closets_collection(palace_path) + closets.upsert( + ids=["closet_proj_log_aaa_01"], + documents=["JWT auth|;|→drawer_proj_log_aaa_chunk_000000"], + metadatas=[{"wing": "proj", "room": "log", "source_file": source}], + ) + + result = search_memories("JWT authentication", palace_path) + assert result["results"] + boosted = [h for h in result["results"] if h["matched_via"] == "drawer+closet"] + assert boosted, "hybrid search should mark the closet-agreeing source" + top = boosted[0] + text = top["text"] + # Group A's content is present. + assert "alpha" in text + # Group B's content must not leak in through the shared source_file + # key. The enrichment loop fetches sibling chunks for the matched + # source, and prior to #1580 that fetch ignored parent_drawer_id. + assert "bravo" not in text, ( + "neighbor enrichment leaked group B's chunks through the shared " + "source_file key (see #1580)" + ) + # total_drawers on the enriched hit is scoped to the matched + # parent group (2 chunks in group A), not the full source_file + # row count (4 across both groups). Pins the scoping contract on + # the live enrichment path, not just the helper. + assert top["total_drawers"] == 2 + # Internal scoring-loop keys must be scrubbed before results are + # returned to MCP callers. 
``_parent_drawer_id`` is added during + # the #1580 fix and popped in the final cleanup loop alongside + # the existing internal keys. + for h in result["results"]: + assert "_parent_drawer_id" not in h + assert "_source_file_full" not in h + assert "_chunk_index" not in h + assert "_sort_key" not in h + + def test_expand_isolates_asymmetric_groups_under_shared_source_file(self, palace_path): + """Asymmetric coverage: group A has 1 chunk, group B has 3 chunks + under the shared ``source_file``. Catches a regression where + ``total_drawers`` accidentally drifts back to the unscoped + file-global count (4) when one group dominates the row mix. + """ + col = get_collection(palace_path) + source = "asym.log" + col.upsert( + ids=["drawer_solo_chunk_000000"], + documents=["solo-A-chunk-0 content"], + metadatas=[ + { + "wing": "w", + "room": "r", + "source_file": source, + "chunk_index": 0, + "parent_drawer_id": "drawer_solo", + "filed_at": "2026-04-13T00:00:00", + } + ], + ) + col.upsert( + ids=[ + "drawer_trio_chunk_000000", + "drawer_trio_chunk_000001", + "drawer_trio_chunk_000002", + ], + documents=[ + "trio-B-chunk-0 content", + "trio-B-chunk-1 content", + "trio-B-chunk-2 content", + ], + metadatas=[ + { + "wing": "w", + "room": "r", + "source_file": source, + "chunk_index": i, + "parent_drawer_id": "drawer_trio", + "filed_at": "2026-04-13T00:00:00", + } + for i in range(3) + ], + ) + + out = _expand_with_neighbors( + col, + "solo-A-chunk-0 content", + { + "source_file": source, + "chunk_index": 0, + "parent_drawer_id": "drawer_solo", + }, + radius=1, + ) + # Singleton group A: text is the matched chunk, total_drawers == 1. 
+ assert "solo-A-chunk-0" in out["text"] + assert "trio-B-chunk" not in out["text"] + assert out["total_drawers"] == 1 + assert out["drawer_index"] == 0 + + def test_expand_empty_string_parent_drawer_id_treated_as_absent(self, palace_path): + """Contract pin: an empty-string ``parent_drawer_id`` value + degrades to the 2-clause file-global filter (matches the + ``if not src`` empty-string handling for ``source_file`` at + ``searcher.py:239``). Writers in the codebase never emit an + empty parent id, but pinning the contract guards against a + future migration that does and avoids a silent narrow-then- + miss surprise. + """ + col, _ = self._seed_source_file(palace_path, "/proj/empty_parent.md", n_chunks=3) + matched_meta = { + "source_file": "/proj/empty_parent.md", + "chunk_index": 1, + "parent_drawer_id": "", + } + out = _expand_with_neighbors( + col, "chunk_1 content about topic alpha", matched_meta, radius=1 + ) + # Empty parent_drawer_id is treated as absent; full file-global + # neighborhood is returned. Mirrors backwards-compat behavior. 
+ assert out["total_drawers"] == 3 + assert "chunk_0" in out["text"] + assert "chunk_1" in out["text"] + assert "chunk_2" in out["text"] diff --git a/tests/test_config.py b/tests/test_config.py index 19847b6dd..06ecba162 100644 --- a/tests/test_config.py +++ b/tests/test_config.py @@ -688,3 +688,63 @@ def test_hooks_auto_save_env_override_true(): assert cfg.hooks_auto_save is True finally: del os.environ["MEMPALACE_HOOKS_AUTO_SAVE"] + + +# --- max_backups (backup retention) --- + + +def test_max_backups_default(monkeypatch): + monkeypatch.delenv("MEMPALACE_MAX_BACKUPS", raising=False) + cfg = MempalaceConfig(config_dir=tempfile.mkdtemp()) + assert cfg.max_backups == 10 + + +def test_max_backups_from_config(monkeypatch, tmp_path): + monkeypatch.delenv("MEMPALACE_MAX_BACKUPS", raising=False) + with open(tmp_path / "config.json", "w") as f: + json.dump({"max_backups": 3}, f) + cfg = MempalaceConfig(config_dir=str(tmp_path)) + assert cfg.max_backups == 3 + + +def test_max_backups_zero_disables(monkeypatch, tmp_path): + """0 is a valid, explicit "keep everything" — not garbage.""" + monkeypatch.delenv("MEMPALACE_MAX_BACKUPS", raising=False) + with open(tmp_path / "config.json", "w") as f: + json.dump({"max_backups": 0}, f) + cfg = MempalaceConfig(config_dir=str(tmp_path)) + assert cfg.max_backups == 0 + + +def test_max_backups_env_overrides_config(monkeypatch, tmp_path): + with open(tmp_path / "config.json", "w") as f: + json.dump({"max_backups": 3}, f) + monkeypatch.setenv("MEMPALACE_MAX_BACKUPS", "7") + cfg = MempalaceConfig(config_dir=str(tmp_path)) + assert cfg.max_backups == 7 + + +@pytest.mark.parametrize("bad", ["abc", "", "-5", "1.5", "true"]) +def test_max_backups_garbage_falls_back_to_default(monkeypatch, tmp_path, bad): + """A hand-edited bad value must never crash migrate/repair.""" + with open(tmp_path / "config.json", "w") as f: + json.dump({"max_backups": bad}, f) + monkeypatch.delenv("MEMPALACE_MAX_BACKUPS", raising=False) + cfg = 
MempalaceConfig(config_dir=str(tmp_path)) + assert cfg.max_backups == 10 + + +def test_max_backups_negative_in_config_falls_back(monkeypatch, tmp_path): + monkeypatch.delenv("MEMPALACE_MAX_BACKUPS", raising=False) + with open(tmp_path / "config.json", "w") as f: + json.dump({"max_backups": -3}, f) + cfg = MempalaceConfig(config_dir=str(tmp_path)) + assert cfg.max_backups == 10 + + +def test_max_backups_bad_env_falls_back_to_config(monkeypatch, tmp_path): + with open(tmp_path / "config.json", "w") as f: + json.dump({"max_backups": 4}, f) + monkeypatch.setenv("MEMPALACE_MAX_BACKUPS", "garbage") + cfg = MempalaceConfig(config_dir=str(tmp_path)) + assert cfg.max_backups == 4 diff --git a/tests/test_convo_miner.py b/tests/test_convo_miner.py index fd775ace6..22eefcdd0 100644 --- a/tests/test_convo_miner.py +++ b/tests/test_convo_miner.py @@ -416,3 +416,41 @@ def test_resolve_wing_empty_string_treated_as_no_wing(tmp_path): target = tmp_path / ".gemini" / "tmp" target.mkdir(parents=True) assert _resolve_wing(target, wing="") == "wing_api" + + +def test_mine_convos_limit_skips_already_mined(capsys): + """--limit N counts only new work, not already-mined skips (#1535).""" + tmpdir = tempfile.mkdtemp() + try: + convo_text = ( + "> What is topic {i}?\n" + "Topic {i} is about something important and interesting enough " + "to produce at least one exchange chunk for the test.\n\n" + "> Tell me more about topic {i}.\n" + "Sure, topic {i} has many facets worth exploring in detail.\n" + ) + for i in range(4): + with open(os.path.join(tmpdir, f"chat_{i}.txt"), "w") as f: + f.write(convo_text.format(i=i)) + + palace_path = os.path.join(tmpdir, "palace") + + mine_convos(tmpdir, palace_path, wing="test") + capsys.readouterr() + + for i in range(4, 7): + with open(os.path.join(tmpdir, f"chat_{i}.txt"), "w") as f: + f.write(convo_text.format(i=i)) + + mine_convos(tmpdir, palace_path, wing="test", limit=2) + out = capsys.readouterr().out + + assert "Files processed: 2" in out + assert 
"Drawers filed:" in out + for line in out.split("\n"): + if "Drawers filed:" in line: + filed = int(line.split(":")[1].strip()) + assert filed > 0, f"limit=2 should mine new files, got {filed}" + break + finally: + shutil.rmtree(tmpdir, ignore_errors=True) diff --git a/tests/test_cursor_hooks_install.py b/tests/test_cursor_hooks_install.py new file mode 100644 index 000000000..652d6558f --- /dev/null +++ b/tests/test_cursor_hooks_install.py @@ -0,0 +1,469 @@ +"""Contract tests for ``hooks/cursor/install.sh``. + +The installer's job is to merge MemPalace hook entries into a Cursor +``hooks.json`` file without: + +- modifying unrelated hook entries already in the file, +- duplicating MemPalace entries when re-run, +- writing to disk when ``--dry-run`` is passed, +- leaving stale MemPalace entries behind on ``--uninstall``. + +These four contracts are what protect a user's existing Cursor +configuration. Tests use ``--scope project --target `` so +the test never touches the real user ``~/.cursor/``. +""" + +from __future__ import annotations + +import json +import os +import subprocess +import sys +from pathlib import Path + +import pytest + +REPO_ROOT = Path(__file__).resolve().parent.parent +INSTALL_SH = REPO_ROOT / "hooks" / "cursor" / "install.sh" + +pytestmark = pytest.mark.skipif(os.name == "nt", reason="install.sh is POSIX-only") + + +# ── helpers ───────────────────────────────────────────────────────── + + +def _run_install( + *args: str, + target: Path, + home: Path | None = None, + expected_rc: int = 0, +) -> tuple[str, str]: + """Invoke install.sh with --scope project --target . + + Forces ``MEMPAL_PYTHON=sys.executable`` for the same reason the + shell-hook tests do: it ensures the JSON merge runs even if PATH on + a GUI-launched CI runner is missing python3. Forces a clean HOME + so the default --install-dir (under ~/.mempalace/hooks/cursor/) + lands in a sandboxed tmp tree rather than the developer's real + home.
+ """ + env = { + "HOME": str(home) if home else "/tmp/mempal-install-test-home", + "PATH": os.environ.get("PATH", "/usr/bin:/bin"), + "MEMPAL_PYTHON": sys.executable, + } + cmd = [ + "bash", + str(INSTALL_SH), + "--scope", + "project", + "--target", + str(target), + *args, + ] + p = subprocess.run( + cmd, + capture_output=True, + text=True, + env=env, + timeout=30, + ) + assert p.returncode == expected_rc, ( + f"install.sh exited {p.returncode} (expected {expected_rc}); " + f"stderr={p.stderr!r}; stdout={p.stdout!r}; argv={cmd}" + ) + return p.stdout, p.stderr + + +def _hooks_file(target: Path) -> Path: + return target / ".cursor" / "hooks.json" + + +def _seed(target: Path, payload: dict) -> Path: + cursor_dir = target / ".cursor" + cursor_dir.mkdir(parents=True, exist_ok=True) + hf = cursor_dir / "hooks.json" + hf.write_text(json.dumps(payload, indent=2)) + return hf + + +# ── --help and bash syntax ────────────────────────────────────────── + + +def test_bash_syntax_clean(): + p = subprocess.run( + ["bash", "-n", str(INSTALL_SH)], + capture_output=True, + text=True, + ) + assert p.returncode == 0, f"install.sh syntax error: {p.stderr}" + + +def test_help_describes_all_flags(): + p = subprocess.run( + ["bash", str(INSTALL_SH), "--help"], + capture_output=True, + text=True, + env={ + "HOME": "/tmp/mempal-help", + "PATH": os.environ.get("PATH", "/usr/bin:/bin"), + }, + timeout=10, + ) + assert p.returncode == 0 + for flag in ("--scope", "--target", "--variant", "--dry-run", "--uninstall"): + assert flag in p.stdout, f"--help must describe {flag}" + + +def test_unknown_flag_exits_nonzero(): + p = subprocess.run( + ["bash", str(INSTALL_SH), "--bogus-flag"], + capture_output=True, + text=True, + env={ + "HOME": "/tmp/mempal-bogus", + "PATH": os.environ.get("PATH", "/usr/bin:/bin"), + }, + timeout=10, + ) + assert p.returncode != 0 + + +# ── --dry-run contract ───────────────────────────────────────────── + + +class TestDryRun: + def 
test_dry_run_does_not_write_target_file(self, tmp_path): + stdout, _ = _run_install("--dry-run", target=tmp_path, home=tmp_path) + assert not _hooks_file(tmp_path).exists(), "--dry-run must not write the target file" + # But it must still print valid JSON to stdout for the user to + # review. + parsed = json.loads(stdout) + assert parsed["version"] == 1 + assert "hooks" in parsed + + def test_dry_run_does_not_copy_scripts(self, tmp_path): + install_dir = tmp_path / "install-dest" + _run_install( + "--dry-run", + "--install-dir", + str(install_dir), + target=tmp_path, + home=tmp_path, + ) + assert not install_dir.exists(), ( + "--dry-run must not copy any hook scripts to the install dir" + ) + + def test_dry_run_emits_full_variant_by_default(self, tmp_path): + stdout, _ = _run_install("--dry-run", target=tmp_path, home=tmp_path) + cfg = json.loads(stdout) + assert set(cfg["hooks"].keys()) >= {"sessionStart", "stop", "preCompact"} + + def test_dry_run_minimal_variant_only_wires_stop(self, tmp_path): + stdout, _ = _run_install( + "--dry-run", + "--variant", + "minimal", + target=tmp_path, + home=tmp_path, + ) + cfg = json.loads(stdout) + # `stop` is the only event we touch. Anything else (including + # sessionStart / preCompact) should not be present unless the + # seed file already had it. 
+ assert "stop" in cfg["hooks"] + assert "sessionStart" not in cfg["hooks"] + assert "preCompact" not in cfg["hooks"] + + +# ── merge-preservation contract ──────────────────────────────────── + + +class TestMergePreservation: + def test_preserves_unrelated_hook_events(self, tmp_path): + _seed( + tmp_path, + { + "version": 1, + "hooks": { + "afterFileEdit": [ + {"command": "/usr/local/bin/my-formatter.sh"}, + ], + "beforeShellExecution": [ + {"command": "/usr/local/bin/audit-shell.sh"}, + ], + }, + }, + ) + _run_install(target=tmp_path, home=tmp_path) + result = json.loads(_hooks_file(tmp_path).read_text()) + assert result["hooks"]["afterFileEdit"] == [ + {"command": "/usr/local/bin/my-formatter.sh"}, + ], "unrelated afterFileEdit entry must survive merge" + assert result["hooks"]["beforeShellExecution"] == [ + {"command": "/usr/local/bin/audit-shell.sh"}, + ], "unrelated beforeShellExecution entry must survive merge" + + def test_preserves_other_entries_on_same_event(self, tmp_path): + # User has their own `stop` hook. We must add MemPalace's + # entry alongside, not replace. + _seed( + tmp_path, + { + "version": 1, + "hooks": { + "stop": [ + {"command": "/usr/local/bin/my-stop-hook.sh"}, + ], + }, + }, + ) + _run_install(target=tmp_path, home=tmp_path) + result = json.loads(_hooks_file(tmp_path).read_text()) + stop_entries = result["hooks"]["stop"] + assert len(stop_entries) == 2, ( + f"expected user entry + MemPalace entry; got {stop_entries!r}" + ) + commands = {e["command"] for e in stop_entries} + assert "/usr/local/bin/my-stop-hook.sh" in commands + assert any("mempal_save_hook_cursor.sh" in c for c in commands) + + def test_creates_target_dir_when_missing(self, tmp_path): + # No .cursor/ exists yet; install must create both the + # directory and the file. 
+ assert not (tmp_path / ".cursor").exists() + _run_install(target=tmp_path, home=tmp_path) + assert _hooks_file(tmp_path).exists() + cfg = json.loads(_hooks_file(tmp_path).read_text()) + assert "stop" in cfg["hooks"] + + def test_refuses_to_overwrite_malformed_existing_json(self, tmp_path): + cursor_dir = tmp_path / ".cursor" + cursor_dir.mkdir(parents=True) + (cursor_dir / "hooks.json").write_text("{ this is not json") + # The merge step should fail with a non-zero exit; the file + # must remain untouched so the user can fix it. + _, stderr = _run_install(target=tmp_path, home=tmp_path, expected_rc=2) + assert "not valid JSON" in stderr or "Refusing to overwrite" in stderr + # File should be unchanged. + assert (cursor_dir / "hooks.json").read_text() == "{ this is not json" + + +# ── idempotency contract ─────────────────────────────────────────── + + +class TestIdempotency: + def test_running_install_twice_does_not_duplicate(self, tmp_path): + _run_install(target=tmp_path, home=tmp_path) + first = json.loads(_hooks_file(tmp_path).read_text()) + _run_install(target=tmp_path, home=tmp_path) + second = json.loads(_hooks_file(tmp_path).read_text()) + assert first == second, ( + "re-running install.sh must produce an identical config " + f"(idempotency); first={first!r} second={second!r}" + ) + # And specifically no duplicate MemPalace entry on `stop`. + stop = second["hooks"]["stop"] + mempal_entries = [e for e in stop if "mempal_save_hook_cursor.sh" in e["command"]] + assert len(mempal_entries) == 1, ( + f"re-running install must not duplicate the stop entry; got {stop!r}" + ) + + +# ── --uninstall contract ─────────────────────────────────────────── + + +class TestUninstall: + def test_uninstall_removes_only_mempalace_entries(self, tmp_path): + # Seed: user has their own stop hook AND an unrelated event. 
+ _seed( + tmp_path, + { + "version": 1, + "hooks": { + "stop": [ + {"command": "/usr/local/bin/my-stop-hook.sh"}, + ], + "afterFileEdit": [ + {"command": "/usr/local/bin/my-formatter.sh"}, + ], + }, + }, + ) + _run_install(target=tmp_path, home=tmp_path) + # MemPalace is now wired alongside the user's entries. + _run_install("--uninstall", target=tmp_path, home=tmp_path) + cfg = json.loads(_hooks_file(tmp_path).read_text()) + # User's stop hook must remain; MemPalace's must be gone. + commands = {e["command"] for e in cfg["hooks"].get("stop", [])} + assert commands == {"/usr/local/bin/my-stop-hook.sh"} + # Unrelated event untouched. + assert cfg["hooks"]["afterFileEdit"] == [ + {"command": "/usr/local/bin/my-formatter.sh"}, + ] + # sessionStart / preCompact (which were ONLY ever wired by us) + # must be removed entirely since they would otherwise dangle + # as empty lists. + assert "sessionStart" not in cfg["hooks"] + assert "preCompact" not in cfg["hooks"] + + def test_uninstall_removes_empty_file_when_no_user_hooks_remain(self, tmp_path): + # No pre-existing hooks; install then uninstall should leave + # an effectively-empty config -> file removed entirely. + _run_install(target=tmp_path, home=tmp_path) + assert _hooks_file(tmp_path).exists() + _run_install("--uninstall", target=tmp_path, home=tmp_path) + assert not _hooks_file(tmp_path).exists(), ( + "fully-empty hooks.json after uninstall must be removed, " + 'not left as `{"version": 1, "hooks": {}}`' + ) + + def test_uninstall_is_safe_when_file_missing(self, tmp_path): + # User never installed; running uninstall must not crash. + assert not _hooks_file(tmp_path).exists() + _run_install("--uninstall", target=tmp_path, home=tmp_path) + # File should still not exist (and definitely should not have + # been created with an empty config). 
+ assert not _hooks_file(tmp_path).exists() + + def test_uninstall_dry_run_does_not_modify_file(self, tmp_path): + _seed( + tmp_path, + { + "version": 1, + "hooks": { + "stop": [ + {"command": "/usr/local/bin/my-stop-hook.sh"}, + { + "command": ( + "/Users/anon/.mempalace/hooks/cursor/mempal_save_hook_cursor.sh" + ), + "loop_limit": 1, + }, + ], + }, + }, + ) + before = _hooks_file(tmp_path).read_text() + _run_install("--uninstall", "--dry-run", target=tmp_path, home=tmp_path) + after = _hooks_file(tmp_path).read_text() + assert before == after, "--uninstall --dry-run must not mutate the target file" + + +# ── scope handling ────────────────────────────────────────────────── + + +def test_project_scope_writes_to_project_dir(tmp_path): + # Sanity check that --scope project + --target lands the file at + # /.cursor/hooks.json and nowhere else. + home = tmp_path / "fake-home" + home.mkdir() + project = tmp_path / "fake-project" + project.mkdir() + _run_install(target=project, home=home) + assert (project / ".cursor" / "hooks.json").exists() + assert not (home / ".cursor" / "hooks.json").exists(), ( + "--scope project must not write into HOME" + ) + + +def test_invalid_scope_rejected(tmp_path): + p = subprocess.run( + ["bash", str(INSTALL_SH), "--scope", "bogus"], + capture_output=True, + text=True, + env={ + "HOME": str(tmp_path), + "PATH": os.environ.get("PATH", "/usr/bin:/bin"), + }, + timeout=10, + ) + assert p.returncode != 0 + assert "scope" in p.stderr.lower() + + +def test_invalid_variant_rejected(tmp_path): + p = subprocess.run( + ["bash", str(INSTALL_SH), "--variant", "bogus"], + capture_output=True, + text=True, + env={ + "HOME": str(tmp_path), + "PATH": os.environ.get("PATH", "/usr/bin:/bin"), + }, + timeout=10, + ) + assert p.returncode != 0 + assert "variant" in p.stderr.lower() + + +# ── --install-dir path resolution ─────────────────────────────────── + + +class TestInstallDirAbsolutePath: + """Regression for gh-PR review: a relative ``--install-dir`` 
must + be resolved to an absolute path BEFORE it is written into + ``hooks.json``. Cursor invokes hook commands from its own working + directory (typically the project root), so a relative command path + would silently fail to launch the hook. + """ + + def test_relative_install_dir_is_absolutized_in_hooks_json(self, tmp_path): + # Run install from a known cwd with a relative --install-dir. + # The resulting hooks.json must reference an absolute path. + cwd = tmp_path / "run-from-here" + cwd.mkdir() + relative_install_dir = "rel-install" + # NOTE: we deliberately do NOT pre-create the directory — the + # installer itself creates it. The test asserts on the path + # baked into hooks.json, not on filesystem state. + env = { + "HOME": str(tmp_path), + "PATH": os.environ.get("PATH", "/usr/bin:/bin"), + "MEMPAL_PYTHON": sys.executable, + } + p = subprocess.run( + [ + "bash", + str(INSTALL_SH), + "--scope", + "project", + "--target", + str(tmp_path), + "--install-dir", + relative_install_dir, + ], + capture_output=True, + text=True, + env=env, + cwd=str(cwd), + timeout=30, + ) + assert p.returncode == 0, f"install failed: {p.stderr!r}" + cfg = json.loads(_hooks_file(tmp_path).read_text()) + expected_abs = str(cwd / relative_install_dir) + stop_cmd = cfg["hooks"]["stop"][0]["command"] + assert stop_cmd.startswith("/"), ( + f"hook command must be absolute path, not relative; got {stop_cmd!r}" + ) + assert stop_cmd.startswith(expected_abs), ( + f"hook command must be resolved against cwd={cwd!s}; got {stop_cmd!r}" + ) + + def test_absolute_install_dir_is_preserved_verbatim(self, tmp_path): + """The relative-to-absolute resolution must not mangle paths + that were already absolute.""" + abs_install_dir = tmp_path / "abs-install" + _run_install( + "--install-dir", + str(abs_install_dir), + target=tmp_path, + home=tmp_path, + ) + cfg = json.loads(_hooks_file(tmp_path).read_text()) + stop_cmd = cfg["hooks"]["stop"][0]["command"] + assert 
stop_cmd.startswith(str(abs_install_dir)), ( + f"absolute --install-dir must be preserved verbatim; " + f"got {stop_cmd!r} for input {abs_install_dir!s}" + ) diff --git a/tests/test_cursor_hooks_shell.py b/tests/test_cursor_hooks_shell.py new file mode 100644 index 000000000..557e4e426 --- /dev/null +++ b/tests/test_cursor_hooks_shell.py @@ -0,0 +1,775 @@ +"""Behavioral coverage for the Cursor hook shell scripts. + +Mirrors ``tests/test_hooks_shell.py`` + ``tests/test_hooks_bash_compat.py`` +in shape so a future contributor recognises the pattern. The three +hooks live at ``hooks/cursor/`` and source ``hooks/cursor/lib/common.sh``. + +Covered contracts: + +- bash 3.2 compatibility (no ``mapfile`` / ``readarray``; ``sed -n 'Np'`` + used for line extraction; ``bash -n`` clean). +- The per-conversation counter increments atomically across ``stop`` + calls, and the save hook emits a ``followup_message`` only on the + configured interval. +- ``MEMPAL_DISABLE_HOOK=1`` and ``MEMPALACE_HOOKS_AUTO_SAVE=false`` both + short-circuit every hook to ``{}``. +- On malformed stdin, the hook dumps the payload to a bounded 0600 file + and logs a warning; it still exits 0 with ``{}`` so Cursor proceeds. +- ``loop_count > 0`` short-circuits the save hook (loop-prevention). +- A pending-save marker written by ``preCompact`` forces a save + followup on the very next ``stop`` regardless of the counter. +- ``infer_wing_from_cwd`` handles ``/``, trailing slashes, spaces, and + empty input. +- The wake hook emits ``additional_context`` referencing the inferred + wing. +- The precompact hook writes a pending-save marker and emits the + documented ``user_message`` shape.
+""" + +from __future__ import annotations + +import json +import os +import stat +import subprocess +import sys +import time +from pathlib import Path + +import pytest + +REPO_ROOT = Path(__file__).resolve().parent.parent +HOOKS_DIR = REPO_ROOT / "hooks" / "cursor" +SAVE_HOOK = HOOKS_DIR / "mempal_save_hook_cursor.sh" +PRECOMPACT_HOOK = HOOKS_DIR / "mempal_precompact_hook_cursor.sh" +WAKE_HOOK = HOOKS_DIR / "mempal_wake_hook_cursor.sh" +COMMON_LIB = HOOKS_DIR / "lib" / "common.sh" + +# All three .sh scripts, parametrised together for the source-level and +# universal-behaviour tests. ids= keeps pytest output readable. +ALL_HOOKS = pytest.mark.parametrize( + "hook", + [SAVE_HOOK, PRECOMPACT_HOOK, WAKE_HOOK], + ids=["save_hook", "precompact_hook", "wake_hook"], +) + +pytestmark = pytest.mark.skipif(os.name == "nt", reason="bash hook scripts are POSIX-only") + + +# ── helpers ────────────────────────────────────────────────────────── + + +def _run_hook( + hook: Path, + stdin: str, + home: Path, + *, + extra_env: dict | None = None, + path_prefix: list[Path] | None = None, + expected_rc: int = 0, +) -> tuple[str, str]: + """Invoke a hook with a controlled environment and assert exit code. + + Returns ``(stdout, stderr)``. Forces ``MEMPAL_PYTHON=sys.executable`` + so the hook always finds a Python that can ``import json`` — without + this, GUI-launched CI on macOS could hit a missing python3 on the + inherited PATH and produce spurious failures unrelated to the hook + logic. Forces a clean ``HOME`` so the state directory is sandboxed + under the test's ``tmp_path``. 
+ """ + env = { + "HOME": str(home), + "PATH": os.environ.get("PATH", "/usr/bin:/bin"), + "MEMPAL_PYTHON": sys.executable, + } + if path_prefix: + env["PATH"] = os.pathsep.join(str(p) for p in path_prefix) + os.pathsep + env["PATH"] + if extra_env: + env.update(extra_env) + p = subprocess.run( + ["bash", str(hook)], + input=stdin, + capture_output=True, + text=True, + env=env, + timeout=30, + # Force a permissive umask so the hook's own ``umask 077`` inside + # parsing subshells is provably the sole reason diagnostic files + # end up mode 0600. Without this, an ambient restrictive umask + # on the CI runner would mask a regression that drops the in-hook + # ``umask`` line. Mirrors tests/test_hooks_bash_compat.py. + preexec_fn=lambda: os.umask(0o022), + ) + assert p.returncode == expected_rc, ( + f"{hook.name} exited {p.returncode} (expected {expected_rc}); " + f"stderr={p.stderr!r}; stdout={p.stdout!r}" + ) + return p.stdout, p.stderr + + +def _stop_payload( + *, + conv: str = "conv-1", + loop_count: int = 0, + transcript: str = "", +) -> str: + return json.dumps( + { + "conversation_id": conv, + "loop_count": loop_count, + "status": "completed", + "model": "claude-sonnet-4-20250514", + "hook_event_name": "stop", + "transcript_path": transcript, + "workspace_roots": ["/Users/test/sampleProj"], + } + ) + + +def _precompact_payload(*, conv: str = "conv-1", transcript: str = "") -> str: + return json.dumps( + { + "conversation_id": conv, + "hook_event_name": "preCompact", + "trigger": "auto", + "transcript_path": transcript, + "workspace_roots": ["/Users/test/sampleProj"], + } + ) + + +def _session_start_payload(*, conv: str = "conv-1") -> str: + return json.dumps( + { + "conversation_id": conv, + "session_id": conv, + "hook_event_name": "sessionStart", + "is_background_agent": False, + "composer_mode": "agent", + "workspace_roots": ["/Users/test/sampleProj"], + } + ) + + +def _state_dir(home: Path) -> Path: + return home / ".mempalace" / "hook_state" + + +def 
_log_text(home: Path) -> str: + log = _state_dir(home) / "cursor_hook.log" + return log.read_text() if log.exists() else "" + + +# ── source-level bash 3.2 compat (matches tests/test_hooks_bash_compat.py) ── + + +class TestBash32Compat: + @ALL_HOOKS + def test_bash_syntax_clean(self, hook): + p = subprocess.run( + ["bash", "-n", str(hook)], + capture_output=True, + text=True, + ) + assert p.returncode == 0, f"{hook.name} syntax error: {p.stderr}" + + def test_common_lib_syntax_clean(self): + p = subprocess.run( + ["bash", "-n", str(COMMON_LIB)], + capture_output=True, + text=True, + ) + assert p.returncode == 0, f"common.sh syntax error: {p.stderr}" + + @ALL_HOOKS + def test_no_bash4_array_builtins(self, hook): + src = "\n".join( + line for line in hook.read_text().splitlines() if not line.lstrip().startswith("#") + ) + assert "mapfile" not in src, ( + f"{hook.name} uses mapfile; unavailable on macOS /bin/bash 3.2 (#1440)" + ) + assert "readarray" not in src, ( + f"{hook.name} uses readarray; unavailable on macOS /bin/bash 3.2 (#1440)" + ) + + def test_common_lib_no_bash4_array_builtins(self): + src = "\n".join( + line + for line in COMMON_LIB.read_text().splitlines() + if not line.lstrip().startswith("#") + ) + assert "mapfile" not in src and "readarray" not in src, ( + "common.sh uses bash-4-only array builtins; would break macOS bash 3.2" + ) + + def test_common_lib_uses_sed_n_for_extraction(self): + # Defense: if a future edit swaps the sed-based parser for + # mapfile (the bash-4 form), this catches it at source level + # before any behavioural test runs. ``parse_cursor_stdin`` reads + # seven values (sentinel + six fields) so we expect at least + # seven ``sed -n 'Np'`` calls. 
+ src = "\n".join( + line + for line in COMMON_LIB.read_text().splitlines() + if not line.lstrip().startswith("#") + ) + assert src.count("sed -n '") >= 7, ( + "common.sh must use sed -n 'Np' for POSIX-portable line extraction" + ) + + +# ── kill switches ─────────────────────────────────────────────────── + + +class TestKillSwitches: + @ALL_HOOKS + @pytest.mark.parametrize("value", ["1", "true", "yes", "on"]) + def test_disable_hook_env_short_circuits(self, hook, value, tmp_path): + out, _ = _run_hook( + hook, + _stop_payload(), + tmp_path, + extra_env={"MEMPAL_DISABLE_HOOK": value}, + ) + assert json.loads(out) == {}, f"MEMPAL_DISABLE_HOOK={value} must short-circuit; got {out!r}" + # No state files should be created when the kill switch fires. + state = _state_dir(tmp_path) + assert not (state / "cursor_hook.log").exists() or _log_text(tmp_path) == "" + + @ALL_HOOKS + @pytest.mark.parametrize("value", ["false", "0", "no", "off"]) + def test_auto_save_env_short_circuits(self, hook, value, tmp_path): + out, _ = _run_hook( + hook, + _stop_payload(), + tmp_path, + extra_env={"MEMPALACE_HOOKS_AUTO_SAVE": value}, + ) + assert json.loads(out) == {}, ( + f"MEMPALACE_HOOKS_AUTO_SAVE={value} must short-circuit; got {out!r}" + ) + + @ALL_HOOKS + def test_config_file_auto_save_false_short_circuits(self, hook, tmp_path): + cfg_dir = tmp_path / ".mempalace" + cfg_dir.mkdir(parents=True) + (cfg_dir / "config.json").write_text(json.dumps({"hooks": {"auto_save": False}})) + out, _ = _run_hook(hook, _stop_payload(), tmp_path) + assert json.loads(out) == {}, ( + f"config.json hooks.auto_save=false must short-circuit; got {out!r}" + ) + + +# ── malformed stdin ───────────────────────────────────────────────── + + +class TestMalformedStdin: + @ALL_HOOKS + def test_malformed_input_does_not_crash(self, hook, tmp_path): + out, _ = _run_hook(hook, "not-json garbage", tmp_path) + # Must still produce parseable JSON so Cursor proceeds. 
+ assert json.loads(out) == {}, f"hook must emit {{}} on malformed input; got {out!r}" + + @ALL_HOOKS + def test_malformed_input_logs_warning_and_dumps_payload(self, hook, tmp_path): + _run_hook(hook, "not-json garbage", tmp_path) + state = _state_dir(tmp_path) + log = (state / "cursor_hook.log").read_text() + assert "WARN: input parse failed" in log, ( + f"expected parse-failure warning in log; got: {log!r}" + ) + dump = state / "cursor_last_input.log" + assert dump.exists() + assert "not-json garbage" in dump.read_text() + + @ALL_HOOKS + def test_dump_is_mode_0600(self, hook, tmp_path): + _run_hook(hook, "not-json garbage", tmp_path) + dump = _state_dir(tmp_path) / "cursor_last_input.log" + mode = stat.S_IMODE(dump.stat().st_mode) + assert mode == 0o600, f"cursor_last_input.log mode should be 0600, got {oct(mode)}" + + @ALL_HOOKS + def test_dump_cap_at_4096_bytes(self, hook, tmp_path): + _run_hook(hook, "x" * 4097, tmp_path) + dump = _state_dir(tmp_path) / "cursor_last_input.log" + assert dump.stat().st_size == 4096, ( + f"cap must be exactly 4096 bytes; got {dump.stat().st_size}" + ) + + @ALL_HOOKS + def test_empty_stdin_does_not_dump(self, hook, tmp_path): + out, _ = _run_hook(hook, "", tmp_path) + assert json.loads(out) == {} + dump = _state_dir(tmp_path) / "cursor_last_input.log" + assert not dump.exists(), "empty stdin must not produce a dump file" + + @ALL_HOOKS + def test_successful_parse_leaves_no_python_err_log(self, hook, tmp_path): + if hook == SAVE_HOOK: + payload = _stop_payload() + elif hook == PRECOMPACT_HOOK: + payload = _precompact_payload() + else: + payload = _session_start_payload() + _run_hook(hook, payload, tmp_path) + err_log = _state_dir(tmp_path) / "cursor_last_python_err.log" + assert not err_log.exists(), "successful parse must clean up cursor_last_python_err.log" + + +# ── save hook: counter + threshold ────────────────────────────────── + + +class TestSaveHookCounter: + def test_counter_increments_across_invocations(self, tmp_path): + 
for _ in range(3): + out, _ = _run_hook(SAVE_HOOK, _stop_payload(conv="conv-A"), tmp_path) + assert json.loads(out) == {}, "below threshold must be a no-op" + counter_file = _state_dir(tmp_path) / "cursor_conv-A.count" + assert counter_file.read_text() == "3", ( + f"counter should be 3 after 3 invocations; got {counter_file.read_text()!r}" + ) + + def test_counter_per_conversation_isolated(self, tmp_path): + _run_hook(SAVE_HOOK, _stop_payload(conv="conv-A"), tmp_path) + _run_hook(SAVE_HOOK, _stop_payload(conv="conv-A"), tmp_path) + _run_hook(SAVE_HOOK, _stop_payload(conv="conv-B"), tmp_path) + assert (_state_dir(tmp_path) / "cursor_conv-A.count").read_text() == "2" + assert (_state_dir(tmp_path) / "cursor_conv-B.count").read_text() == "1" + + def test_threshold_emits_followup_message(self, tmp_path): + # Lower the interval to keep the test fast. + env = {"MEMPAL_SAVE_INTERVAL": "3"} + for _ in range(2): + out, _ = _run_hook(SAVE_HOOK, _stop_payload(), tmp_path, extra_env=env) + assert json.loads(out) == {} + out, _ = _run_hook(SAVE_HOOK, _stop_payload(), tmp_path, extra_env=env) + response = json.loads(out) + assert "followup_message" in response, ( + f"third invocation must emit a followup_message; got {response!r}" + ) + msg = response["followup_message"] + # Followup must reference the real MCP tool names (regression + # guard against future typos that would silently fail). 
+ assert "mempalace_add_drawer" in msg + assert "mempalace_check_duplicate" in msg + assert "mempalace_diary_write" in msg + assert "cursor-ide" in msg, "diary entries must be tagged agent_name=cursor-ide" + + def test_threshold_followup_references_inferred_wing(self, tmp_path): + env = {"MEMPAL_SAVE_INTERVAL": "1"} + # workspace_roots[0] = /Users/test/sampleProj -> wing=sampleproj + out, _ = _run_hook(SAVE_HOOK, _stop_payload(), tmp_path, extra_env=env) + msg = json.loads(out)["followup_message"] + assert "sampleproj" in msg, f"followup should reference inferred wing; got {msg!r}" + + def test_save_interval_zero_is_coerced_to_default(self, tmp_path): + """Regression for gh-PR review: MEMPAL_SAVE_INTERVAL=0 would + otherwise crash bash on `$((NEXT % 0))` (division by zero). + Zero must be coerced to the default interval (15) so the hook + survives a misconfigured env var without exiting non-zero. + """ + env = {"MEMPAL_SAVE_INTERVAL": "0"} + # Three independent invocations: each must succeed (rc=0) and + # emit {} since the coerced interval (15) is never reached. + for _ in range(3): + out, _ = _run_hook(SAVE_HOOK, _stop_payload(), tmp_path, extra_env=env) + assert json.loads(out) == {}, ( + f"SAVE_INTERVAL=0 must coerce to default and pass through; got {out!r}" + ) + + +# ── save hook: followup opt-out ───────────────────────────────────── + + +class TestSaveHookFollowupSilence: + """The Cursor followup_message is ON by default (it is the + load-bearing verbatim path because Cursor's transcript is unminable), + but users can silence it. These tests lock the opt-out contract. + """ + + def test_followup_on_by_default_at_threshold(self, tmp_path): + """Sanity baseline: with no silence flag, the threshold emits a + followup. 
Guards against an accidental default flip.""" + env = {"MEMPAL_SAVE_INTERVAL": "1"} + out, _ = _run_hook(SAVE_HOOK, _stop_payload(), tmp_path, extra_env=env) + assert "followup_message" in json.loads(out) + + @pytest.mark.parametrize("value", ["1", "true", "yes", "on"]) + def test_cursor_silent_suppresses_followup(self, value, tmp_path): + env = {"MEMPAL_SAVE_INTERVAL": "1", "MEMPAL_CURSOR_SILENT": value} + out, _ = _run_hook(SAVE_HOOK, _stop_payload(), tmp_path, extra_env=env) + assert json.loads(out) == {}, ( + f"MEMPAL_CURSOR_SILENT={value!r} must suppress the followup; got {out!r}" + ) + + @pytest.mark.parametrize("value", ["false", "0", "no", "off"]) + def test_verbose_false_suppresses_followup(self, value, tmp_path): + env = {"MEMPAL_SAVE_INTERVAL": "1", "MEMPAL_VERBOSE": value} + out, _ = _run_hook(SAVE_HOOK, _stop_payload(), tmp_path, extra_env=env) + assert json.loads(out) == {}, ( + f"MEMPAL_VERBOSE={value!r} must suppress the followup; got {out!r}" + ) + + def test_silenced_followup_still_increments_counter(self, tmp_path): + """Silence must not disable bookkeeping — the counter still + advances so cadence is preserved if the user re-enables.""" + env = {"MEMPAL_SAVE_INTERVAL": "5", "MEMPAL_CURSOR_SILENT": "1"} + for _ in range(2): + _run_hook(SAVE_HOOK, _stop_payload(conv="conv-S"), tmp_path, extra_env=env) + counter = _state_dir(tmp_path) / "cursor_conv-S.count" + assert counter.exists() and counter.read_text().strip() == "2", ( + "silenced followup must still maintain the per-conversation counter" + ) + + def test_silenced_pending_marker_emits_empty(self, tmp_path): + """A consumed pending marker normally forces a followup; under + silence it must emit {} but still clear the marker.""" + env = {"MEMPAL_CURSOR_SILENT": "1"} + pending = _state_dir(tmp_path) / "cursor_conv-P.pending" + pending.parent.mkdir(parents=True, exist_ok=True) + pending.touch() + out, _ = _run_hook(SAVE_HOOK, _stop_payload(conv="conv-P"), tmp_path, extra_env=env) + assert 
json.loads(out) == {} + assert not pending.exists(), "pending marker must be consumed even when silenced" + + +# ── save hook: loop-prevention ────────────────────────────────────── + + +class TestSaveHookLoopPrevention: + def test_loop_count_gt_zero_short_circuits(self, tmp_path): + out, _ = _run_hook( + SAVE_HOOK, + _stop_payload(loop_count=1), + tmp_path, + extra_env={"MEMPAL_SAVE_INTERVAL": "1"}, + ) + assert json.loads(out) == {}, ( + "loop_count > 0 must short-circuit even at the trigger interval" + ) + # No counter file should be written in the short-circuit path. + assert not (_state_dir(tmp_path) / "cursor_conv-1.count").exists() + + def test_loop_count_zero_does_not_short_circuit(self, tmp_path): + out, _ = _run_hook( + SAVE_HOOK, + _stop_payload(loop_count=0), + tmp_path, + extra_env={"MEMPAL_SAVE_INTERVAL": "1"}, + ) + assert "followup_message" in json.loads(out) + + +# ── save hook: pending-save marker from preCompact ────────────────── + + +class TestPendingSaveMarker: + def test_pending_marker_forces_followup_regardless_of_counter(self, tmp_path): + state = _state_dir(tmp_path) + state.mkdir(parents=True, exist_ok=True) + # Drop the marker as if precompact had run. + (state / "cursor_conv-1.pending").write_text("") + out, _ = _run_hook( + SAVE_HOOK, + _stop_payload(), + tmp_path, + # SAVE_INTERVAL=1000 ensures the normal counter path would + # not trigger; the marker is the only reason a followup + # gets emitted. + extra_env={"MEMPAL_SAVE_INTERVAL": "1000"}, + ) + response = json.loads(out) + assert "followup_message" in response, ( + "pending marker must force a followup even far below threshold" + ) + # Marker must be consumed on read. 
+ assert not (state / "cursor_conv-1.pending").exists(), ( + "pending marker must be deleted after consumption" + ) + + def test_pending_marker_is_per_conversation(self, tmp_path): + state = _state_dir(tmp_path) + state.mkdir(parents=True, exist_ok=True) + (state / "cursor_conv-OTHER.pending").write_text("") + out, _ = _run_hook( + SAVE_HOOK, + _stop_payload(conv="conv-1"), + tmp_path, + extra_env={"MEMPAL_SAVE_INTERVAL": "1000"}, + ) + # conv-1 has no marker -> counter path -> no trigger -> {}. + assert json.loads(out) == {} + # conv-OTHER marker must NOT be consumed by conv-1's invocation. + assert (state / "cursor_conv-OTHER.pending").exists() + + +# ── preCompact hook ───────────────────────────────────────────────── + + +class TestPreCompactHook: + def test_emits_user_message(self, tmp_path): + out, _ = _run_hook(PRECOMPACT_HOOK, _precompact_payload(), tmp_path) + response = json.loads(out) + # Cursor's preCompact only accepts user_message; never + # followup_message or decision. + assert "user_message" in response, f"expected user_message; got {response!r}" + assert "followup_message" not in response + assert "decision" not in response + + def test_drops_pending_marker(self, tmp_path): + _run_hook(PRECOMPACT_HOOK, _precompact_payload(conv="conv-X"), tmp_path) + marker = _state_dir(tmp_path) / "cursor_conv-X.pending" + assert marker.exists(), "preCompact must drop a pending-save marker" + + def test_logs_trigger(self, tmp_path): + _run_hook(PRECOMPACT_HOOK, _precompact_payload(conv="conv-Y"), tmp_path) + log = _log_text(tmp_path) + assert "event=preCompact" in log + assert "conv=conv-Y" in log + assert "trigger=auto" in log + + +# ── wake (sessionStart) hook ─────────────────────────────────────── + + +class TestWakeHook: + def test_emits_additional_context(self, tmp_path): + out, _ = _run_hook(WAKE_HOOK, _session_start_payload(), tmp_path) + response = json.loads(out) + assert "additional_context" in response, ( + f"sessionStart must emit additional_context; 
got {response!r}" + ) + ctx = response["additional_context"] + # Must reference the inferred wing AND the real MCP tools. + assert "sampleproj" in ctx, f"context should reference inferred wing; got {ctx!r}" + assert "mempalace_search" in ctx + assert "mempalace_diary_read" in ctx + assert "cursor-ide" in ctx + + def test_falls_back_to_env_when_workspace_roots_missing(self, tmp_path): + # Cursor always provides workspace_roots, but the env-var + # fallback path needs coverage so a future Cursor schema + # change cannot silently break the wake hook. + payload = json.dumps( + { + "conversation_id": "conv-Z", + "session_id": "conv-Z", + "hook_event_name": "sessionStart", + "is_background_agent": False, + "composer_mode": "agent", + } + ) + out, _ = _run_hook( + WAKE_HOOK, + payload, + tmp_path, + extra_env={"CURSOR_PROJECT_DIR": "/Users/test/envFallback"}, + ) + ctx = json.loads(out)["additional_context"] + assert "envfallback" in ctx, ( + f"env-var fallback workspace should drive wing inference; got {ctx!r}" + ) + + +# ── infer_wing_from_cwd via direct function call ──────────────────── + + +def _call_infer_wing(arg: str) -> str: + """Source common.sh in a bash subshell and invoke mempal_infer_wing. + + Returns the function's stdout. Uses bash -c so we never have to + pollute the test's own shell environment with the common.sh state + (which mkdir's directories and resolves Python paths). + """ + script = f'. "{COMMON_LIB}" >/dev/null 2>&1; mempal_infer_wing "$1"' + # The argument is passed as a positional so it survives any shell + # quirks around spaces/empty values exactly as the production hook + # would see them. 
+ p = subprocess.run( + ["bash", "-c", script, "_test", arg], + capture_output=True, + text=True, + env={ + "HOME": "/tmp", + "PATH": os.environ.get("PATH", "/usr/bin:/bin"), + "MEMPAL_PYTHON": sys.executable, + }, + timeout=10, + ) + assert p.returncode == 0, f"infer_wing call failed: {p.stderr!r}" + return p.stdout + + +class TestInferWing: + def test_basename_of_normal_path(self): + assert _call_infer_wing("/Users/me/myproject") == "myproject" + + def test_strips_trailing_slash(self): + assert _call_infer_wing("/Users/me/myproject/") == "myproject" + + def test_root_path_falls_back(self): + assert _call_infer_wing("/") == "root" + + def test_empty_input_falls_back(self): + assert _call_infer_wing("") == "cursor_session" + + def test_spaces_collapsed_to_underscore(self): + assert _call_infer_wing("/Users/me/my project") == "my_project" + + def test_lowercases_uppercase_basename(self): + # Cursor on macOS often hands us /Users//Projects/MyApp. + # The wing scoping in MemPalace's MCP tools is case-sensitive, + # so the wake hook and save hook must produce identical wings + # for the same workspace — lowercasing is the simplest + # contract. + assert _call_infer_wing("/Users/me/MyApp") == "myapp" + + def test_windows_style_path(self): + # Cursor on Windows passes C:\path\to\Project as workspace_root. + # The hook scripts are POSIX-only (we skip them on Windows) but + # WSL users may still hit a backslash-bearing path via the + # CURSOR_PROJECT_DIR env var when Cursor is launched from + # PowerShell. + assert _call_infer_wing(r"C:\Users\me\MyProj") == "myproj" + + +# ── state-file TTL + GC ───────────────────────────────────────────── + + +def _run_common_snippet(snippet: str, home: Path, *, extra_env: dict | None = None) -> str: + """Source common.sh and run a bash snippet against a sandboxed HOME. + + Returns stdout. Used to exercise mempal_state_ttl_days / + mempal_gc_stale_state directly without going through a full hook. + """ + script = f'. 
"{COMMON_LIB}" >/dev/null 2>&1; {snippet}' + env = { + "HOME": str(home), + "PATH": os.environ.get("PATH", "/usr/bin:/bin"), + "MEMPAL_PYTHON": sys.executable, + } + if extra_env: + env.update(extra_env) + p = subprocess.run( + ["bash", "-c", script, "_test"], + capture_output=True, + text=True, + env=env, + timeout=15, + ) + assert p.returncode == 0, f"snippet failed: {p.stderr!r}" + return p.stdout + + +def _age_file(path: Path, days: int) -> None: + old = time.time() - days * 86400 + os.utime(path, (old, old)) + + +class TestStateTtlDays: + @pytest.mark.parametrize( + "value,expected", + [ + ("", "30"), + ("abc", "30"), + ("45", "45"), + ("08", "8"), + ("007", "7"), + ("0", "0"), + ], + ) + def test_ttl_validation_and_octal_strip(self, value, expected, tmp_path): + out = _run_common_snippet( + "mempal_state_ttl_days", + tmp_path, + extra_env={"MEMPAL_STATE_TTL_DAYS": value} if value != "" else {}, + ) + assert out.strip() == expected, ( + f"MEMPAL_STATE_TTL_DAYS={value!r} should resolve to {expected!r}; got {out.strip()!r}" + ) + + def test_ttl_default_when_unset(self, tmp_path): + assert _run_common_snippet("mempal_state_ttl_days", tmp_path).strip() == "30" + + +class TestStateGc: + def test_removes_stale_count_and_pending(self, tmp_path): + sd = _state_dir(tmp_path) + sd.mkdir(parents=True, exist_ok=True) + stale_count = sd / "cursor_old.count" + stale_pending = sd / "cursor_old.pending" + fresh_count = sd / "cursor_new.count" + for f in (stale_count, stale_pending, fresh_count): + f.write_text("1") + _age_file(stale_count, 40) + _age_file(stale_pending, 40) + _run_common_snippet("mempal_gc_stale_state", tmp_path) + assert not stale_count.exists(), "stale .count older than TTL must be swept" + assert not stale_pending.exists(), "stale .pending older than TTL must be swept" + assert fresh_count.exists(), "recent state must be preserved" + + def test_preserves_shared_logs_and_other_editor_state(self, tmp_path): + sd = _state_dir(tmp_path) + 
sd.mkdir(parents=True, exist_ok=True) + # Shared logs + another editor's state, all aged well past the TTL. + keep = [ + sd / "cursor_hook.log", + sd / "cursor_last_input.log", + sd / "cursor_last_python_err.log", + sd / "antigravity_save_count_xyz", + sd / "hook.log", + ] + for f in keep: + f.write_text("x") + _age_file(f, 99) + _run_common_snippet("mempal_gc_stale_state", tmp_path) + for f in keep: + assert f.exists(), f"GC must never touch {f.name}" + + def test_creates_sweep_marker(self, tmp_path): + _run_common_snippet("mempal_gc_stale_state", tmp_path) + assert (_state_dir(tmp_path) / "cursor_last_sweep").exists() + + def test_throttled_within_24h(self, tmp_path): + sd = _state_dir(tmp_path) + sd.mkdir(parents=True, exist_ok=True) + # A fresh sweep marker must suppress a second sweep, so a stale + # file created afterwards survives until the throttle expires. + (sd / "cursor_last_sweep").write_text("") + stale = sd / "cursor_old.count" + stale.write_text("1") + _age_file(stale, 40) + _run_common_snippet("mempal_gc_stale_state", tmp_path) + assert stale.exists(), "GC must be throttled when last_sweep is recent" + + def test_gc_gated_by_kill_switch(self, tmp_path): + """A disabled hook must not sweep (or even create the marker).""" + sd = _state_dir(tmp_path) + sd.mkdir(parents=True, exist_ok=True) + stale = sd / "cursor_zombie.count" + stale.write_text("1") + _age_file(stale, 40) + _run_hook( + SAVE_HOOK, + _stop_payload(), + tmp_path, + extra_env={"MEMPAL_DISABLE_HOOK": "1"}, + ) + assert stale.exists(), "disabled hook must not GC state" + assert not (sd / "cursor_last_sweep").exists(), ( + "disabled hook must not even create the sweep marker" + ) + + +# ── logging discipline ───────────────────────────────────────────── + + +class TestLogging: + def test_log_uses_iso8601_utc_timestamps(self, tmp_path): + _run_hook(SAVE_HOOK, _stop_payload(), tmp_path) + log = _log_text(tmp_path) + # ISO 8601 with 'Z' suffix means UTC, locale-independent. 
+ # Regression guard against switching back to %H:%M:%S which + # loses both the date and the timezone. + assert "T" in log and "Z]" in log, f"log timestamps must be ISO 8601 UTC; got: {log!r}" diff --git a/tests/test_cursor_plugin_manifest.py b/tests/test_cursor_plugin_manifest.py new file mode 100644 index 000000000..6d122333f --- /dev/null +++ b/tests/test_cursor_plugin_manifest.py @@ -0,0 +1,527 @@ +"""Contract tests for ``.cursor-plugin/``. + +These tests protect the four things a Cursor user actually relies on +once they install the plugin: + +1. The manifest (``.cursor-plugin/plugin.json``) is valid JSON, satisfies + Cursor's required + structural fields, and every component path it + declares resolves to a real on-disk target. +2. The marketplace manifest (``.cursor-plugin/marketplace.json``) is + valid JSON and points at the same plugin. +3. The MCP config (``.cursor-plugin/mcp.json``) is valid JSON, wraps + server entries under the documented ``mcpServers`` key, and + registers the ``mempalace-mcp`` binary that ships with the package. +4. Every skill ``SKILL.md`` and command ``*.md`` parses as YAML + frontmatter + markdown body. Cursor derives the slash-command + slug from the **filename stem** (e.g. ``mempalace-help.md`` → + ``/mempalace-help``), so command files do NOT need a ``name`` + frontmatter field — only ``description`` is required. + +Run with:: + + uv run pytest tests/test_cursor_plugin_manifest.py -v + +All tests are pure file inspection (no subprocesses, no network) and +take milliseconds. They run on every CI platform without needing +Cursor itself. 
+""" + +from __future__ import annotations + +import json +import re +from pathlib import Path + +import pytest +import yaml + +REPO_ROOT = Path(__file__).resolve().parent.parent +PLUGIN_DIR = REPO_ROOT / ".cursor-plugin" +MANIFEST_PATH = PLUGIN_DIR / "plugin.json" +MARKETPLACE_PATH = PLUGIN_DIR / "marketplace.json" +MCP_PATH = PLUGIN_DIR / "mcp.json" +README_PATH = PLUGIN_DIR / "README.md" + +# Component directories: canonical location is at the plugin root (repo root), +# NOT inside .cursor-plugin/. Cursor's default discovery requires real +# directories at the plugin root; .cursor-plugin/ symlinks back to these. +SKILLS_DIR = REPO_ROOT / "skills" +COMMANDS_DIR = REPO_ROOT / "commands" +RULES_DIR = REPO_ROOT / "rules" + +# The slugs we promise to ship. The README's "Available Slash Commands" +# table is the user-facing contract; if you add/remove a command, +# update both the README and this list. +EXPECTED_COMMAND_NAMES = { + "mempalace-help", + "mempalace-init", + "mempalace-mine", + "mempalace-search", + "mempalace-status", +} + +# Per cursor.com/docs/reference/plugins: "Plugin identifier. Lowercase, +# kebab-case (alphanumerics, hyphens, and periods). Must start and end +# with an alphanumeric character." +KEBAB_RE = re.compile(r"^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$") + +# Cursor's submission checklist explicitly forbids these in manifest +# paths: "All paths in manifest are relative and valid (no `..`, no +# absolute paths)." Treat both as hard failures rather than warnings — +# the marketplace review bot would reject the plugin otherwise. +_FORBIDDEN_PATH_FRAGMENTS = ("..",) + + +# ── helpers ───────────────────────────────────────────────────────── + + +def _parse_frontmatter(text: str) -> tuple[dict, str]: + """Split a markdown file with YAML frontmatter into ``(meta, body)``. + + The frontmatter must start at byte 0 with a literal ``---\\n`` and + close with another ``---\\n`` line. 
Files without frontmatter are + treated as having an empty ``meta`` dict so the caller can decide + whether that's acceptable for the file type under test. + """ + if not text.startswith("---\n"): + return {}, text + end = text.find("\n---\n", 4) + if end == -1: + return {}, text + raw = text[4:end] + body = text[end + 5 :] + parsed = yaml.safe_load(raw) or {} + if not isinstance(parsed, dict): + raise AssertionError(f"Frontmatter parsed to {type(parsed).__name__}, expected dict") + return parsed, body + + +def _is_safe_relative(path_str: str) -> bool: + """Return True iff ``path_str`` is a relative, ``..``-free path.""" + if not isinstance(path_str, str) or not path_str: + return False + p = Path(path_str) + if p.is_absolute(): + return False + return not any(part in _FORBIDDEN_PATH_FRAGMENTS for part in p.parts) + + +# ── plugin.json ───────────────────────────────────────────────────── + + +class TestPluginManifest: + def test_manifest_exists(self): + assert MANIFEST_PATH.is_file(), f"{MANIFEST_PATH} is missing" + + def test_manifest_is_valid_json(self): + json.loads(MANIFEST_PATH.read_text(encoding="utf-8")) + + def test_manifest_has_required_name_field(self): + data = json.loads(MANIFEST_PATH.read_text(encoding="utf-8")) + assert isinstance(data.get("name"), str) and data["name"], ( + "plugin.json must have a non-empty 'name' (only required field per Cursor schema)" + ) + + def test_manifest_name_is_kebab_case(self): + data = json.loads(MANIFEST_PATH.read_text(encoding="utf-8")) + assert KEBAB_RE.match(data["name"]), ( + f"name must be lowercase kebab-case; got {data['name']!r}" + ) + + def test_manifest_has_recommended_optional_fields(self): + """``description`` and ``author.name`` aren't required by the + schema but ARE required by the submission checklist, so failing + early here saves a round-trip with the marketplace reviewers.""" + data = json.loads(MANIFEST_PATH.read_text(encoding="utf-8")) + assert isinstance(data.get("description"), str) and 
data["description"] + author = data.get("author") + assert isinstance(author, dict) and isinstance(author.get("name"), str) + assert author["name"], "author.name must be non-empty" + + def test_manifest_omits_hardcoded_version(self): + """plugin.json must NOT hardcode a ``version`` field. + + ``mempalace/version.py`` is the single source of truth (per + CLAUDE.md). A hardcoded version here silently drifts on the next + release (igorls review, PR #1632). The sibling Antigravity plugin + omits the field entirely; we match that. The marketplace resolves + the package version at publish time. + """ + data = json.loads(MANIFEST_PATH.read_text(encoding="utf-8")) + assert "version" not in data, ( + "plugin.json must omit the hardcoded 'version' field to avoid " + f"drift from mempalace/version.py; found {data.get('version')!r}" + ) + + def test_marketplace_entry_omits_hardcoded_version(self): + """Same drift guard for the marketplace plugin entry.""" + data = json.loads(MARKETPLACE_PATH.read_text(encoding="utf-8")) + for plugin in data.get("plugins", []): + if isinstance(plugin, dict): + assert "version" not in plugin, ( + "marketplace.json plugin entry must omit the hardcoded " + f"'version' field; found {plugin.get('version')!r}" + ) + + @pytest.mark.parametrize("field", ["skills", "commands", "mcpServers"]) + def test_manifest_component_paths_are_safe(self, field: str): + """Every path the manifest declares must be relative + ``..``-free. + + Cursor's submission checklist rejects ``..`` or absolute paths + outright. We check here so a typo doesn't fail review. 
+ """ + data = json.loads(MANIFEST_PATH.read_text(encoding="utf-8")) + value = data.get(field) + if value is None: + return # optional — if missing, default discovery kicks in + if isinstance(value, str): + paths = [value] + elif isinstance(value, list): + paths = [v for v in value if isinstance(v, str)] + else: + return # inline object form; nothing to validate path-wise + for p in paths: + assert _is_safe_relative(p), f"{field}: {p!r} must be relative and contain no '..'" + + @pytest.mark.parametrize( + "field,expected_type", + [ + ("skills", "dir"), + ("commands", "dir"), + ("mcpServers", "file"), + ], + ) + def test_manifest_component_paths_resolve(self, field: str, expected_type: str): + """Every component path must point at a real on-disk target. + + Use REPO_ROOT (not PLUGIN_DIR) as the resolution base because + Cursor resolves manifest paths against the plugin root, which + for our layout is the repo root (the dir containing + ``.cursor-plugin/``). + """ + data = json.loads(MANIFEST_PATH.read_text(encoding="utf-8")) + value = data.get(field) + if not isinstance(value, str): + return # inline object form or absent + target = (REPO_ROOT / value).resolve() + if expected_type == "dir": + assert target.is_dir(), f"{field}={value!r} -> {target} is not a directory" + else: + assert target.is_file(), f"{field}={value!r} -> {target} is not a file" + + +# ── marketplace.json ──────────────────────────────────────────────── + + +class TestMarketplaceManifest: + def test_marketplace_exists(self): + assert MARKETPLACE_PATH.is_file(), f"{MARKETPLACE_PATH} is missing" + + def test_marketplace_is_valid_json(self): + json.loads(MARKETPLACE_PATH.read_text(encoding="utf-8")) + + def test_marketplace_required_fields(self): + data = json.loads(MARKETPLACE_PATH.read_text(encoding="utf-8")) + assert isinstance(data.get("name"), str) and data["name"] + owner = data.get("owner") + assert isinstance(owner, dict) and isinstance(owner.get("name"), str) + plugins = data.get("plugins") 
+ assert isinstance(plugins, list) and 1 <= len(plugins) <= 500 + + def test_marketplace_lists_mempalace_plugin(self): + """The marketplace must list our plugin, and the listed name must + match the actual ``plugin.json::name`` — otherwise the marketplace + resolver looks up ``my-plugin/.cursor-plugin/plugin.json`` and + gets a name mismatch, which Cursor rejects at install time. + """ + data = json.loads(MARKETPLACE_PATH.read_text(encoding="utf-8")) + manifest = json.loads(MANIFEST_PATH.read_text(encoding="utf-8")) + names = {p.get("name") for p in data["plugins"] if isinstance(p, dict)} + assert manifest["name"] in names, ( + f"marketplace.json plugins list does not include {manifest['name']!r}" + ) + + +# ── mcp.json ──────────────────────────────────────────────────────── + + +class TestMcpConfig: + def test_mcp_config_exists(self): + assert MCP_PATH.is_file(), f"{MCP_PATH} is missing" + + def test_mcp_config_is_valid_json(self): + json.loads(MCP_PATH.read_text(encoding="utf-8")) + + def test_mcp_config_wraps_servers_under_mcpservers_key(self): + """Per cursor.com/docs/reference/plugins#mcp-servers, the MCP + config file must contain server entries under a ``mcpServers`` + key. Using the flat shape (used by Claude's ``.mcp.json``) here + would silently fail to register the server with Cursor. 
+ """ + data = json.loads(MCP_PATH.read_text(encoding="utf-8")) + assert "mcpServers" in data and isinstance(data["mcpServers"], dict), ( + "mcp.json must wrap servers under an 'mcpServers' object" + ) + + def test_mcp_config_registers_mempalace_server(self): + data = json.loads(MCP_PATH.read_text(encoding="utf-8")) + servers = data["mcpServers"] + assert "mempalace" in servers, "mcp.json must register a server named 'mempalace'" + entry = servers["mempalace"] + assert isinstance(entry, dict) and isinstance(entry.get("command"), str) + assert entry["command"] == "mempalace-mcp", ( + f"mempalace server command must be 'mempalace-mcp' (the binary " + f"shipped by the package); got {entry.get('command')!r}" + ) + + +# ── skills/ ───────────────────────────────────────────────────────── + + +class TestSkills: + def test_skills_dir_exists(self): + assert SKILLS_DIR.is_dir() + + def test_at_least_one_skill_present(self): + skill_files = list(SKILLS_DIR.glob("*/SKILL.md")) + assert skill_files, ( + f"{SKILLS_DIR} must contain at least one /SKILL.md " + "(otherwise Cursor's discovery treats the plugin as having no skills)" + ) + + def test_mempalace_skill_exists(self): + assert (SKILLS_DIR / "mempalace" / "SKILL.md").is_file() + + def test_mempalace_recall_skill_exists(self): + """The recall skill is the search-before-answer half of the + plugin (the ``mempalace`` skill covers setup/mine/status). If it + goes missing, recall silently regresses to model-memory guessing. + """ + assert (SKILLS_DIR / "mempalace-recall" / "SKILL.md").is_file() + + def test_each_skill_has_valid_frontmatter(self): + """Every SKILL.md must declare ``name`` (kebab-case) and a + non-empty ``description``. Skills missing these fields silently + fail to register in Cursor's skill picker (#1410 equivalent). 
+ """ + for skill_path in SKILLS_DIR.glob("*/SKILL.md"): + text = skill_path.read_text(encoding="utf-8") + meta, body = _parse_frontmatter(text) + ctx = f"{skill_path.relative_to(REPO_ROOT)}" + assert meta, f"{ctx}: missing YAML frontmatter" + assert isinstance(meta.get("name"), str) and meta["name"], ( + f"{ctx}: 'name' must be a non-empty string" + ) + assert KEBAB_RE.match(meta["name"]), ( + f"{ctx}: name must be lowercase kebab-case; got {meta['name']!r}" + ) + assert isinstance(meta.get("description"), str) and meta["description"], ( + f"{ctx}: 'description' must be a non-empty string" + ) + assert body.strip(), f"{ctx}: body must not be empty" + + def test_skill_name_matches_directory(self): + """The skill's directory name should equal the frontmatter + ``name`` — Cursor displays the directory name in the picker and + the frontmatter name in the API; mismatches confuse both users + and the agent. + """ + for skill_path in SKILLS_DIR.glob("*/SKILL.md"): + meta, _ = _parse_frontmatter(skill_path.read_text(encoding="utf-8")) + dir_name = skill_path.parent.name + assert meta.get("name") == dir_name, ( + f"{skill_path.relative_to(REPO_ROOT)}: name={meta.get('name')!r} " + f"must match directory {dir_name!r}" + ) + + +# ── rules/ ────────────────────────────────────────────────────────── + + +class TestRules: + """The plugin ships an optional recall rule at the plugin root under + ``rules/``. Like skills and commands, rules are discovered from a + real directory at the plugin root (the repo root), not from inside + ``.cursor-plugin/``. 
+ """ + + def test_rules_dir_exists(self): + assert RULES_DIR.is_dir(), "rules/ missing at repo root" + + def test_rules_dir_is_real_not_symlink(self): + assert not RULES_DIR.is_symlink(), ( + "rules/ must be a real directory, not a symlink — " + "Cursor does not follow symlinks for local-plugin discovery" + ) + + def test_recall_rule_exists(self): + assert (RULES_DIR / "mempalace-recall.mdc").is_file() + + def test_each_rule_has_valid_frontmatter(self): + """Every ``.mdc`` rule must declare a non-empty ``description`` + (Cursor's matcher reads it to decide relevance) and a boolean + ``alwaysApply``. A rule missing ``description`` never auto-applies. + """ + rule_files = list(RULES_DIR.glob("*.mdc")) + assert rule_files, f"{RULES_DIR} must contain at least one .mdc rule" + for rule_path in rule_files: + text = rule_path.read_text(encoding="utf-8") + meta, body = _parse_frontmatter(text) + ctx = f"{rule_path.relative_to(REPO_ROOT)}" + assert meta, f"{ctx}: missing YAML frontmatter" + assert isinstance(meta.get("description"), str) and meta["description"], ( + f"{ctx}: 'description' must be a non-empty string" + ) + assert isinstance(meta.get("alwaysApply"), bool), ( + f"{ctx}: 'alwaysApply' must be a boolean" + ) + assert body.strip(), f"{ctx}: body must not be empty" + + def test_shipped_recall_rule_is_not_always_apply(self): + """The plugin-shipped recall rule must be ``alwaysApply: false``. + + An always-on rule loads on every turn in every workspace the + plugin touches, adding MCP latency to unrelated work and fighting + MemPalace's "memory should feel instant" budget. The aggressive + ``alwaysApply: true`` variant is an opt-in shipped only under + examples/, never wired into the default plugin bundle. 
+ """ + meta, _ = _parse_frontmatter( + (RULES_DIR / "mempalace-recall.mdc").read_text(encoding="utf-8") + ) + assert meta.get("alwaysApply") is False, ( + "the plugin-shipped recall rule must be alwaysApply: false; " + "the always-on variant belongs in examples/cursor/rules/" + ) + + +# ── commands/ ─────────────────────────────────────────────────────── + + +class TestDefaultDiscoveryLayout: + """Cursor discovers plugin components from real ``commands/``, + ``skills/``, and ``mcp.json`` at the *plugin root* (our repo root). + + These must be real directories/files — Cursor does not follow + symlinks for local-plugin component discovery. We verified this + behaviour by comparing the cached Cloudflare plugin structure + (all real dirs) against our earlier broken symlink-only attempt. + """ + + def test_commands_is_real_dir_at_plugin_root(self): + target = REPO_ROOT / "commands" + assert target.is_dir(), "commands/ missing at repo root" + assert not target.is_symlink(), ( + "commands/ must be a real directory, not a symlink — " + "Cursor does not follow symlinks for local-plugin discovery" + ) + + def test_skills_is_real_dir_at_plugin_root(self): + target = REPO_ROOT / "skills" + assert target.is_dir(), "skills/ missing at repo root" + assert not target.is_symlink(), ( + "skills/ must be a real directory, not a symlink — " + "Cursor does not follow symlinks for local-plugin discovery" + ) + + def test_mcp_json_is_real_file_at_plugin_root(self): + target = REPO_ROOT / "mcp.json" + assert target.is_file(), "mcp.json missing at repo root" + assert not target.is_symlink(), ( + "mcp.json must be a real file, not a symlink — " + "Cursor does not follow symlinks for local-plugin discovery" + ) + + def test_no_symlinks_under_cursor_plugin_dir(self): + """No path under ``.cursor-plugin/`` may be a symlink. 
+ + igorls review (PR #1632): committed symlinks materialise as plain + text files containing the link target on Windows clones with + ``core.symlinks=false``, silently breaking the plugin. CI's + manifest tests skip Windows, so this guard runs on every platform. + The canonical components live at the repo root (``source: "."``); + the old ``.cursor-plugin/{commands,skills}`` convenience symlinks + were redundant and have been removed. + """ + offenders = [p for p in PLUGIN_DIR.rglob("*") if p.is_symlink()] + assert not offenders, f"Symlinks under .cursor-plugin/ break Windows clones: {offenders}" + + +class TestCommands: + def test_commands_dir_exists(self): + assert COMMANDS_DIR.is_dir() + + def test_command_set_matches_promised_set(self): + """The README documents exactly five slash commands. The files + on disk must match that set — no more, no fewer — otherwise the + README is lying to users. + + Cursor derives the slash-command slug from the filename stem, so + we compare stems, not frontmatter ``name`` values. + """ + actual = {cmd_path.stem for cmd_path in COMMANDS_DIR.glob("*.md")} + assert actual == EXPECTED_COMMAND_NAMES, ( + f"Command file stem set drifted from promised set. " + f"On disk: {sorted(actual)}. " + f"Expected: {sorted(EXPECTED_COMMAND_NAMES)}." + ) + + def test_each_command_has_valid_frontmatter(self): + """Every command file must have YAML frontmatter with a non-empty + ``description`` and a non-empty body. + + Cursor derives the slash-command slug from the filename stem, so + a ``name`` field is intentionally absent — the ``description`` + field is what Cursor shows in the command picker. 
+ """ + for cmd_path in COMMANDS_DIR.glob("*.md"): + text = cmd_path.read_text(encoding="utf-8") + meta, body = _parse_frontmatter(text) + ctx = f"{cmd_path.relative_to(REPO_ROOT)}" + assert meta, f"{ctx}: missing YAML frontmatter" + assert isinstance(meta.get("description"), str) and meta["description"], ( + f"{ctx}: 'description' must be a non-empty string" + ) + assert body.strip(), f"{ctx}: body must not be empty" + + def test_each_command_name_prefixed_with_mempalace(self): + """Cursor commands are global (not plugin-namespaced), so every + command file must be named ``mempalace-*.md`` to avoid colliding + with built-in or other-plugin commands. + + The slash-command slug is the filename stem, so + ``mempalace-help.md`` → ``/mempalace-help``. + """ + for cmd_path in COMMANDS_DIR.glob("*.md"): + stem = cmd_path.stem + assert stem.startswith("mempalace-"), ( + f"{cmd_path.relative_to(REPO_ROOT)}: filename stem {stem!r} " + "must be prefixed with 'mempalace-' to avoid global-namespace collisions" + ) + + +# ── README.md ─────────────────────────────────────────────────────── + + +class TestReadme: + def test_readme_exists(self): + assert README_PATH.is_file(), f"{README_PATH} is missing" + + def test_readme_documents_every_command(self): + """If the README has a command table, every command we ship + must be listed in it. This catches the drift case where someone + adds a command file but forgets to update the docs.""" + text = README_PATH.read_text(encoding="utf-8") + missing = [name for name in EXPECTED_COMMAND_NAMES if f"/{name}" not in text] + assert not missing, f"README does not document: {missing}" + + def test_readme_cross_references_hooks_install_path(self): + """Hooks are deliberately NOT part of the plugin (they're wired + via hooks/cursor/install.sh). The README must tell users where + to go for that, otherwise users will assume the plugin already + installed the hooks and wonder why nothing saves. 
+ """ + text = README_PATH.read_text(encoding="utf-8") + assert "hooks/cursor/install.sh" in text, ( + "README must reference hooks/cursor/install.sh so users know how to enable auto-save" + ) diff --git a/tests/test_distance_metric.py b/tests/test_distance_metric.py new file mode 100644 index 000000000..c4334b0db --- /dev/null +++ b/tests/test_distance_metric.py @@ -0,0 +1,227 @@ +"""Tests for backend-declared distance metrics (RFC 001) and the +metric-aware distance→similarity conversion in the searcher. + +Before this, the searcher hard-coded ``max(0, 1 - distance)`` everywhere, +which is correct only for cosine. A backend reporting L2 or inner-product +distances (or a legacy Chroma palace built without ``hnsw:space=cosine``) +was silently mis-ranked — L2 distances routinely exceed 1.0 and floored +every result's similarity to 0. The contract now lets a backend declare its +metric and the searcher converts accordingly. +""" + +import math +import types + +import pytest + +from mempalace.backends.base import BaseBackend, BaseCollection +from mempalace.backends.chroma import ChromaCollection +from mempalace.searcher import ( + _distance_to_similarity, + _hybrid_rank, + _metric_for_collection, +) + + +# --------------------------------------------------------------------------- +# Contract surface +# --------------------------------------------------------------------------- + + +def test_basebackend_declares_cosine_default(): + assert BaseBackend.distance_metric == "cosine" + + +def test_basecollection_reports_cosine_default(): + # A minimal concrete collection inherits the cosine default. + class _Col(BaseCollection): + def add(self, **k): ... + def upsert(self, **k): ... + def query(self, **k): ... + def get(self, **k): ... + def delete(self, **k): ... 
+ def count(self): + return 0 + + assert _Col().distance_metric == "cosine" + + +# --------------------------------------------------------------------------- +# _distance_to_similarity — per-metric math +# --------------------------------------------------------------------------- + + +def test_cosine_conversion(): + assert _distance_to_similarity(0.0, "cosine") == 1.0 + assert _distance_to_similarity(2.0, "cosine") == 0.0 + # cosine distance > 1 must floor at 0, never go negative. + assert _distance_to_similarity(1.5, "cosine") == 0.0 + + +def test_l2_conversion_is_monotonic_and_bounded(): + assert _distance_to_similarity(0.0, "l2") == 1.0 + assert _distance_to_similarity(1.0, "l2") == pytest.approx(0.5) + # Strictly decreasing, and a large L2 distance does NOT floor to 0 the + # way the old cosine formula did — that was the bug. + far = _distance_to_similarity(5.0, "l2") + near = _distance_to_similarity(1.0, "l2") + assert 0.0 < far < near < 1.0 + + +def test_l2_distance_above_one_keeps_signal(): + # The regression this fixes: under cosine, d=1.7 -> 0.0 (no signal). + # Under a correctly-declared L2 metric, it stays positive and ordered. + assert _distance_to_similarity(1.7, "cosine") == 0.0 + assert _distance_to_similarity(1.7, "l2") > 0.0 + + +def test_ip_conversion_monotonic_decreasing(): + # Inner-product distance is signed/unbounded (lower = closer). Logistic + # squash keeps it in (0, 1) and monotonic. + assert _distance_to_similarity(-5.0, "ip") > _distance_to_similarity(0.0, "ip") + assert _distance_to_similarity(0.0, "ip") == pytest.approx(0.5) + assert _distance_to_similarity(0.0, "ip") > _distance_to_similarity(5.0, "ip") + + +def test_ip_does_not_overflow_on_large_distance(): + # Exponent is clamped so a huge positive distance can't raise OverflowError. 
+ val = _distance_to_similarity(1e6, "ip") + assert val == pytest.approx(0.0, abs=1e-9) + assert not math.isinf(val) and not math.isnan(val) + + +def test_none_distance_maps_to_zero(): + # BM25-only candidates carry distance=None -> no vector signal. + assert _distance_to_similarity(None, "cosine") == 0.0 + assert _distance_to_similarity(None, "l2") == 0.0 + + +def test_unknown_metric_falls_back_to_cosine(): + assert _distance_to_similarity(0.3, "weird") == _distance_to_similarity(0.3, "cosine") + assert _distance_to_similarity(0.3, None) == _distance_to_similarity(0.3, "cosine") + + +# --------------------------------------------------------------------------- +# _metric_for_collection — resolution + delegation + safety +# --------------------------------------------------------------------------- + + +def test_metric_resolver_reads_declared_metric(): + col = types.SimpleNamespace(distance_metric="l2") + assert _metric_for_collection(col) == "l2" + + +def test_metric_resolver_normalizes_case_and_garbage(): + assert _metric_for_collection(types.SimpleNamespace(distance_metric="L2")) == "l2" + assert _metric_for_collection(types.SimpleNamespace(distance_metric="nonsense")) == "cosine" + assert _metric_for_collection(types.SimpleNamespace(distance_metric=None)) == "cosine" + + +def test_metric_resolver_defaults_when_absent(): + assert _metric_for_collection(object()) == "cosine" + + +def test_metric_resolver_follows_embeddingcollection_delegation(): + inner = types.SimpleNamespace(distance_metric="ip") + + class _Wrapper: + def __init__(self, i): + self._i = i + + def __getattr__(self, name): + return getattr(self._i, name) + + assert _metric_for_collection(_Wrapper(inner)) == "ip" + + +def test_metric_resolver_survives_raising_attribute(): + class _Boom: + @property + def distance_metric(self): + raise RuntimeError("backend down") + + assert _metric_for_collection(_Boom()) == "cosine" + + +def test_real_embeddingcollection_delegates_metric_not_shadowed(): + # 
Regression: BaseCollection defines distance_metric as a property, so on + # the real EmbeddingCollection subclass it resolves directly and + # __getattr__ never fires. Without an explicit override the wrapper would + # report the base "cosine" default and mask a wrapped non-cosine backend. + from mempalace.backends.embedding_wrapper import EmbeddingCollection + + class _Inner(BaseCollection): + distance_metric = "l2" + + def add(self, **k): ... + def upsert(self, **k): ... + def query(self, **k): ... + def get(self, **k): ... + def delete(self, **k): ... + def count(self): + return 0 + + wrapped = EmbeddingCollection(_Inner()) + assert wrapped.distance_metric == "l2" + assert _metric_for_collection(wrapped) == "l2" + + +# --------------------------------------------------------------------------- +# ChromaCollection — legacy L2 palace reports its real metric +# --------------------------------------------------------------------------- + + +def _chroma_col_with_metadata(meta): + fake_inner = types.SimpleNamespace(metadata=meta) + return ChromaCollection(fake_inner) + + +def test_chroma_reports_cosine_when_set(): + assert _chroma_col_with_metadata({"hnsw:space": "cosine"}).distance_metric == "cosine" + + +def test_chroma_legacy_l2_palace_reports_l2(): + # A pre-cosine palace: the property surfaces the real space so the + # searcher maps distances correctly instead of flooring to 0. + assert _chroma_col_with_metadata({"hnsw:space": "l2"}).distance_metric == "l2" + + +def test_chroma_missing_or_unknown_metadata_reports_l2(): + # Absent/empty/garbage hnsw:space means the collection never had cosine + # set, so it is genuinely using Chroma's HNSW default (L2). Reporting + # cosine here would reintroduce the floor-to-0 bug this fixes. 
+ assert _chroma_col_with_metadata({}).distance_metric == "l2" + assert _chroma_col_with_metadata({"hnsw:space": ""}).distance_metric == "l2" + assert _chroma_col_with_metadata({"hnsw:space": "bogus"}).distance_metric == "l2" + + +# --------------------------------------------------------------------------- +# _hybrid_rank — ranking actually respects the metric +# --------------------------------------------------------------------------- + + +def test_hybrid_rank_l2_keeps_far_candidate_ranked_above_unknown(): + # Two candidates with identical (zero) lexical overlap to the query, so + # only the vector term decides. Under cosine, a d=1.6 hit floors to 0 and + # ties a distance-None hit; under L2 it stays positive and ranks above. + results = [ + {"text": "alpha", "distance": None}, + {"text": "beta", "distance": 1.6}, + ] + ranked = _hybrid_rank(results, "zzzznomatch", metric="l2") + assert ranked[0]["text"] == "beta" # real vector signal beats vector-unknown + + +def test_hybrid_rank_cosine_unchanged_behavior(): + # Cosine path must be byte-for-byte the old behavior (max(0, 1-d)). + results = [ + {"text": "near", "distance": 0.1}, + {"text": "far", "distance": 0.9}, + ] + ranked = _hybrid_rank(results, "zzzznomatch", metric="cosine") + assert ranked[0]["text"] == "near" + assert ranked[1]["text"] == "far" + + +def test_hybrid_rank_empty_is_noop(): + assert _hybrid_rank([], "q", metric="l2") == [] diff --git a/tests/test_embedder_identity.py b/tests/test_embedder_identity.py new file mode 100644 index 000000000..93ddae02f --- /dev/null +++ b/tests/test_embedder_identity.py @@ -0,0 +1,408 @@ +"""Embedder-identity persistence and three-state enforcement (RFC 001). + +A same-dimension model swap (e.g. two 384-d models) silently corrupts +retrieval on the explicit-embedding backends, which have no native model +check. The contract records the model name and refuses a swap on open. 
These +tests avoid loading any embedding model: the enforcement *check* path needs +only the configured model name (cheap), and persistence is exercised with +``EmbedderIdentity`` objects and explicit vectors. +""" + +import os +import warnings + +import pytest + +from mempalace.backends.base import ( + DimensionMismatchError, + EmbedderIdentity, + EmbedderIdentityMismatchError, + EmbedderIdentityUnknownWarning, + PalaceRef, + check_embedder_identity, +) + + +# --------------------------------------------------------------------------- +# Three-state helper +# --------------------------------------------------------------------------- + + +def test_unknown_when_nothing_stored(): + assert check_embedder_identity(None, EmbedderIdentity("minilm", 384)) == "unknown" + + +def test_unknown_when_current_is_nameless(): + stored = EmbedderIdentity("minilm", 384) + assert check_embedder_identity(stored, EmbedderIdentity("", 384)) == "unknown" + assert check_embedder_identity(stored, None) == "unknown" + + +def test_known_match(): + a = EmbedderIdentity("minilm", 384) + assert check_embedder_identity(a, a) == "known_match" + + +def test_match_skips_unknown_dimension(): + # dimension 0 means "not probed" and must not be treated as a real conflict. + assert ( + check_embedder_identity(EmbedderIdentity("minilm", 384), EmbedderIdentity("minilm", 0)) + == "known_match" + ) + + +def test_model_swap_raises_identity_error(): + with pytest.raises(EmbedderIdentityMismatchError): + check_embedder_identity(EmbedderIdentity("minilm", 384), EmbedderIdentity("gemma", 384)) + + +def test_dimension_change_raises_dimension_error_first(): + # Width change is physically unusable — checked before the name swap. 
+ with pytest.raises(DimensionMismatchError): + check_embedder_identity(EmbedderIdentity("a", 384), EmbedderIdentity("b", 768)) + + +def test_force_returns_mismatch_without_raising(): + assert ( + check_embedder_identity( + EmbedderIdentity("minilm", 384), + EmbedderIdentity("gemma", 384), + force_model_swap=True, + ) + == "known_mismatch" + ) + + +# --------------------------------------------------------------------------- +# Per-backend persistence roundtrip (no model loads) +# --------------------------------------------------------------------------- + + +def _sqlite_collection(tmp_path): + from mempalace.backends.sqlite_exact import SQLiteExactBackend + + backend = SQLiteExactBackend() + ref = PalaceRef(id=str(tmp_path), local_path=str(tmp_path)) + return backend.get_collection(palace=ref, collection_name="mempalace_drawers", create=True) + + +def _chroma_collection(tmp_path): + from mempalace.backends.chroma import ChromaBackend + + backend = ChromaBackend() + ref = PalaceRef(id=str(tmp_path), local_path=str(tmp_path)) + return backend.get_collection(palace=ref, collection_name="mempalace_drawers", create=True) + + +def test_sqlite_identity_roundtrip(tmp_path): + col = _sqlite_collection(tmp_path) + assert col.get_stored_embedder_identity() is None + col.add(documents=["x"], ids=["a"], metadatas=[{}], embeddings=[[0.1, 0.2, 0.3, 0.4]]) + col.set_embedder_identity(EmbedderIdentity("minilm", 4)) + got = col.get_stored_embedder_identity() + assert got is not None and got.model_name == "minilm" and got.dimension == 4 + + +def test_sqlite_set_identity_ignores_nameless(tmp_path): + col = _sqlite_collection(tmp_path) + col.add(documents=["x"], ids=["a"], metadatas=[{}], embeddings=[[0.1, 0.2, 0.3, 0.4]]) + col.set_embedder_identity(EmbedderIdentity("", 4)) + assert col.get_stored_embedder_identity() is None + + +def test_chroma_identity_roundtrip_via_sidecar(tmp_path): + col = _chroma_collection(tmp_path) + assert col.get_stored_embedder_identity() is None + 
col.set_embedder_identity(EmbedderIdentity("minilm", 384)) + got = col.get_stored_embedder_identity() + assert got is not None and got.model_name == "minilm" and got.dimension == 384 + assert os.path.isfile(os.path.join(str(tmp_path), "mempalace_embedder.json")) + + +def test_pgvector_identity_survives_marker_rewrite(tmp_path): + # Identity lives in a sidecar, separate from the mismatch marker, so a + # marker rebuild (which happens on every write) must not affect it. + from mempalace.backends.pgvector import PgVectorBackend, _PgVectorConfig + + backend = PgVectorBackend() + cfg = _PgVectorConfig(dsn="postgresql://example", namespace=None) + ref = PalaceRef(id=str(tmp_path), local_path=str(tmp_path)) + # No marker needed to record identity — the sidecar is unguarded. + backend._set_embedder_identity(ref, "mempalace_drawers", EmbedderIdentity("minilm", 384)) + backend._write_marker(ref, cfg) + got = backend._get_embedder_identity(ref, "mempalace_drawers") + assert got is not None and got.model_name == "minilm" and got.dimension == 384 + + +def test_embeddingcollection_delegates_identity_not_shadowed(): + # BaseCollection defines these as concrete methods, so __getattr__ never + # delegates them — the wrapper needs explicit forwarding or it silently + # reports the no-op default and masks the wrapped backend's identity. + from mempalace.backends.base import BaseCollection + from mempalace.backends.embedding_wrapper import EmbeddingCollection + + class _Inner(BaseCollection): + def __init__(self): + self._ident = None + + def add(self, **k): ... + def upsert(self, **k): ... + def query(self, **k): ... + def get(self, **k): ... + def delete(self, **k): ... 
+ def count(self): + return 0 + + def get_stored_embedder_identity(self): + return self._ident + + def set_embedder_identity(self, identity): + self._ident = identity + + inner = _Inner() + wrapped = EmbeddingCollection(inner) + wrapped.set_embedder_identity(EmbedderIdentity("minilm", 384)) + assert inner._ident is not None and inner._ident.model_name == "minilm" + assert wrapped.get_stored_embedder_identity().model_name == "minilm" + + +# --------------------------------------------------------------------------- +# Enforcement via palace.get_collection (sqlite_exact, no model load) +# --------------------------------------------------------------------------- + + +@pytest.fixture +def clear_identity_cache(): + from mempalace import palace + + palace._VALIDATED_IDENTITY.clear() + yield + palace._VALIDATED_IDENTITY.clear() + + +def _seed_sqlite_with_identity(tmp_path, model): + col = _sqlite_collection(tmp_path) + col.add(documents=["x"], ids=["a"], metadatas=[{}], embeddings=[[0.1, 0.2, 0.3, 0.4]]) + if model is not None: + col.set_embedder_identity(EmbedderIdentity(model, 4)) + return col + + +def test_enforcement_match_does_not_raise(tmp_path, monkeypatch, clear_identity_cache): + monkeypatch.setenv("MEMPALACE_EMBEDDING_MODEL", "minilm") + monkeypatch.setenv("MEMPALACE_BACKEND", "sqlite_exact") + from mempalace import palace as P + + _seed_sqlite_with_identity(tmp_path, "minilm") + P._VALIDATED_IDENTITY.clear() + # Should not raise. 
+ P.get_collection(str(tmp_path), collection_name="mempalace_drawers", create=False) + + +def test_enforcement_model_swap_raises(tmp_path, monkeypatch, clear_identity_cache): + monkeypatch.setenv("MEMPALACE_BACKEND", "sqlite_exact") + monkeypatch.setenv("MEMPALACE_EMBEDDING_MODEL", "minilm") + from mempalace import palace as P + + _seed_sqlite_with_identity(tmp_path, "minilm") + P._VALIDATED_IDENTITY.clear() + monkeypatch.setenv("MEMPALACE_EMBEDDING_MODEL", "embeddinggemma") + with pytest.raises(EmbedderIdentityMismatchError): + P.get_collection(str(tmp_path), collection_name="mempalace_drawers", create=False) + + +def test_enforcement_brand_new_records_current_model(tmp_path, monkeypatch, clear_identity_cache): + monkeypatch.setenv("MEMPALACE_BACKEND", "sqlite_exact") + monkeypatch.setenv("MEMPALACE_EMBEDDING_MODEL", "minilm") + from mempalace import palace as P + + col = P.get_collection(str(tmp_path), collection_name="mempalace_drawers", create=True) + got = col.get_stored_embedder_identity() + assert got is not None and got.model_name == "minilm" + + +def test_enforcement_legacy_with_data_warns(tmp_path, monkeypatch, clear_identity_cache): + monkeypatch.setenv("MEMPALACE_BACKEND", "sqlite_exact") + monkeypatch.setenv("MEMPALACE_EMBEDDING_MODEL", "minilm") + from mempalace import palace as P + + _seed_sqlite_with_identity(tmp_path, None) # data, but no recorded identity + P._VALIDATED_IDENTITY.clear() + with pytest.warns(EmbedderIdentityUnknownWarning): + P.get_collection(str(tmp_path), collection_name="mempalace_drawers", create=False) + + +def test_enforcement_nameless_model_is_a_noop(tmp_path, monkeypatch, clear_identity_cache): + monkeypatch.setenv("MEMPALACE_BACKEND", "sqlite_exact") + monkeypatch.setenv("MEMPALACE_EMBEDDING_MODEL", "minilm") + from mempalace import palace as P + + _seed_sqlite_with_identity(tmp_path, None) + P._VALIDATED_IDENTITY.clear() + # A nameless current embedder cannot enforce — no raise, no warning. 
+ monkeypatch.setattr("mempalace.embedding.current_model_name", lambda model=None: "") + with warnings.catch_warnings(): + warnings.simplefilter("error") + P.get_collection(str(tmp_path), collection_name="mempalace_drawers", create=False) + + +# --------------------------------------------------------------------------- +# set_palace_embedder_identity override path +# --------------------------------------------------------------------------- + + +def test_set_palace_identity_override_requires_force(tmp_path, monkeypatch, clear_identity_cache): + monkeypatch.setenv("MEMPALACE_BACKEND", "sqlite_exact") + monkeypatch.setenv("MEMPALACE_EMBEDDING_MODEL", "minilm") + from mempalace import palace as P + + _seed_sqlite_with_identity(tmp_path, "minilm") + # Recording a different model without force is refused. + with pytest.raises(EmbedderIdentityMismatchError): + P.set_palace_embedder_identity(str(tmp_path), model="embeddinggemma", force=False) + # With force it goes through, recording the name only (no foreign load). + old, new = P.set_palace_embedder_identity(str(tmp_path), model="embeddinggemma", force=True) + assert old.model_name == "minilm" and new.model_name == "embeddinggemma" + + +def test_set_palace_identity_empty_target_raises(tmp_path, monkeypatch): + # No model given and none configured: recording is a no-op in every backend, + # so refuse rather than claim a phantom success. + monkeypatch.setattr("mempalace.config.MempalaceConfig.embedding_model", property(lambda s: "")) + from mempalace import palace as P + + with pytest.raises(ValueError): + P.set_palace_embedder_identity(str(tmp_path), model=None) + + +def test_enforcement_prefers_effective_identity(monkeypatch, clear_identity_cache): + # A server_embedder collection reports its own effective identity; the + # configured model must be ignored in favor of it. Here effective and + # stored disagree, so enforcement raises even though config says "minilm". 
+ monkeypatch.setenv("MEMPALACE_EMBEDDING_MODEL", "minilm") + from mempalace import palace as P + + class _ServerCol: + def effective_embedder_identity(self): + return EmbedderIdentity("server-model", 768) + + def get_stored_embedder_identity(self): + return EmbedderIdentity("other-model", 768) + + def count(self): + return 5 + + def set_embedder_identity(self, identity): + raise AssertionError("must not record on a mismatch") + + with pytest.raises(EmbedderIdentityMismatchError): + P._enforce_embedder_identity(_ServerCol(), "/tmp/x", "c", create=False) + + +def test_chroma_corrupt_sidecar_returns_none(tmp_path): + # A malformed sidecar (non-dict JSON) must not raise — degrade to unknown. + col = _chroma_collection(tmp_path) + path = os.path.join(str(tmp_path), "mempalace_embedder.json") + with open(path, "w", encoding="utf-8") as f: + f.write('["not", "a", "dict"]') + assert col.get_stored_embedder_identity() is None + # And a subsequent set still works (overwrites the junk). + col.set_embedder_identity(EmbedderIdentity("minilm", 384)) + assert col.get_stored_embedder_identity().model_name == "minilm" + + +# --------------------------------------------------------------------------- +# qdrant: identity persisted in the local marker (no live qdrant needed) +# --------------------------------------------------------------------------- + + +def _qdrant_collection(tmp_path, *, write_marker=True): + from mempalace.backends.qdrant import QdrantBackend, QdrantCollection, _QdrantConfig + + backend = QdrantBackend() + config = _QdrantConfig(url="http://localhost:6333", api_key=None, namespace=None) + ref = PalaceRef(id=str(tmp_path), local_path=str(tmp_path)) + if write_marker: + backend._write_marker(ref, config) + # The identity methods read/write the local marker only; the client is + # never touched, so a placeholder stands in for a live REST connection. 
+ return QdrantCollection( + backend=backend, + client=object(), + config=config, + palace=ref, + collection_name="mempalace_drawers", + remote_collection="mp_drawers_remote", + ) + + +def test_qdrant_identity_survives_marker_rewrite(tmp_path): + from mempalace.backends.qdrant import QdrantBackend, _QdrantConfig + + backend = QdrantBackend() + config = _QdrantConfig(url="http://localhost:6333", api_key=None, namespace=None) + ref = PalaceRef(id=str(tmp_path), local_path=str(tmp_path)) + backend._write_marker(ref, config) + backend._set_embedder_identity(ref, "mempalace_drawers", EmbedderIdentity("minilm", 384)) + backend._write_marker(ref, config) # rebuild must not wipe embedders + got = backend._get_embedder_identity(ref, "mempalace_drawers") + assert got is not None and got.model_name == "minilm" and got.dimension == 384 + + +def test_qdrant_collection_delegates_identity(tmp_path): + col = _qdrant_collection(tmp_path) + assert col.get_stored_embedder_identity() is None + col.set_embedder_identity(EmbedderIdentity("minilm", 384)) + got = col.get_stored_embedder_identity() + assert got is not None and got.model_name == "minilm" and got.dimension == 384 + + +def test_qdrant_set_identity_creates_sidecar_when_missing(tmp_path): + # Brand-new palace whose first write hasn't created the marker yet: + # recording identity must create it, not silently no-op into permanent + # "unknown" (the marker-on-write vs record-on-open timing gap). 
+ col = _qdrant_collection(tmp_path, write_marker=False) + assert not col._marker_exists() + col.set_embedder_identity(EmbedderIdentity("minilm", 384)) + got = col.get_stored_embedder_identity() + assert got is not None and got.model_name == "minilm" + + +def _pgvector_collection(tmp_path, *, write_marker=True): + from mempalace.backends.pgvector import PgVectorBackend, PgVectorCollection, _PgVectorConfig + + backend = PgVectorBackend() + config = _PgVectorConfig(dsn="postgresql://example", namespace=None) + ref = PalaceRef(id=str(tmp_path), local_path=str(tmp_path)) + if write_marker: + backend._write_marker(ref, config) + return PgVectorCollection( + backend=backend, + client=object(), + config=config, + palace=ref, + collection_name="mempalace_drawers", + table="mp_drawers_t", + ) + + +def test_pgvector_set_identity_creates_sidecar_when_missing(tmp_path): + # Same brand-new-palace timing gap as qdrant: recording must create the + # marker rather than no-op. + col = _pgvector_collection(tmp_path, write_marker=False) + assert not col._marker_exists() + col.set_embedder_identity(EmbedderIdentity("minilm", 384)) + got = col.get_stored_embedder_identity() + assert got is not None and got.model_name == "minilm" + + +def test_qdrant_enforcement_model_swap_raises(tmp_path, monkeypatch, clear_identity_cache): + # The enforcement check reads the marker (no server) and compares to the + # configured model — a swap raises just like the local backends. 
+ from mempalace import palace as P + + col = _qdrant_collection(tmp_path) + col.set_embedder_identity(EmbedderIdentity("minilm", 384)) + monkeypatch.setenv("MEMPALACE_EMBEDDING_MODEL", "embeddinggemma") + with pytest.raises(EmbedderIdentityMismatchError): + P._enforce_embedder_identity(col, str(tmp_path), "mempalace_drawers", create=False) diff --git a/tests/test_embeddinggemma.py b/tests/test_embeddinggemma.py index f108ff969..6ad398a9f 100644 --- a/tests/test_embeddinggemma.py +++ b/tests/test_embeddinggemma.py @@ -9,6 +9,8 @@ """ import sys +import threading +import time import pytest @@ -162,6 +164,171 @@ def fake_encode_batch(self, texts): assert any("raw text one" in t for t in captured) +def test_call_chunks_large_batches(patched_lazy_load, monkeypatch): + """A large input must be tokenized and run in bounded sub-batches. + + One unchunked session.run over a repair-scale batch (5000 docs) allocates + attention buffers beyond available RAM and the kernel kills the process + (#1770) — so __call__ may never see more than _EMBEDDINGGEMMA_BATCH_SIZE + docs per forward pass. + """ + batch_sizes = [] + captured_texts = [] + original_encode_batch = _FakeTokenizer.encode_batch + + def recording_encode_batch(self, texts): + batch_sizes.append(len(texts)) + captured_texts.extend(texts) + return original_encode_batch(self, texts) + + monkeypatch.setattr(_FakeTokenizer, "encode_batch", recording_encode_batch) + ef = embedding.EmbeddinggemmaONNX() + n = embedding._EMBEDDINGGEMMA_BATCH_SIZE * 2 + 6 + docs = [f"doc {i}" for i in range(n)] + out = ef(docs) + + assert batch_sizes == [ + embedding._EMBEDDINGGEMMA_BATCH_SIZE, + embedding._EMBEDDINGGEMMA_BATCH_SIZE, + 6, + ], f"expected bounded sub-batches, got {batch_sizes}" + # Sub-batches must cover the input in order; combined with the per-chunk + # extend in __call__ this pins output row order to input order. 
+ assert captured_texts == [embedding._EMBEDDINGGEMMA_PREFIX + d for d in docs] + arr = np.asarray(out) + assert arr.shape == (n, 384), f"chunked outputs must concatenate to (n, 384), got {arr.shape}" + assert np.allclose(np.linalg.norm(arr, axis=1), 1.0, atol=1e-5) + + +_B = 32 # mirrors _EMBEDDINGGEMMA_BATCH_SIZE; literal so the cases read plainly + + +@pytest.mark.parametrize( + ("n", "expected_batches"), + [ + (1, [1]), + (_B, [_B]), + (_B + 1, [_B, 1]), + (2 * _B, [_B, _B]), + ], +) +def test_call_chunk_boundaries(patched_lazy_load, monkeypatch, n, expected_batches): + """Exact-multiple and off-by-one inputs produce no empty or oversized runs.""" + assert _B == embedding._EMBEDDINGGEMMA_BATCH_SIZE, "update _B alongside the constant" + batch_sizes = [] + original_encode_batch = _FakeTokenizer.encode_batch + + def recording_encode_batch(self, texts): + batch_sizes.append(len(texts)) + return original_encode_batch(self, texts) + + monkeypatch.setattr(_FakeTokenizer, "encode_batch", recording_encode_batch) + ef = embedding.EmbeddinggemmaONNX() + out = ef([f"doc {i}" for i in range(n)]) + assert batch_sizes == expected_batches + assert len(out) == n + + +def test_custom_batch_size_is_honored(patched_lazy_load, monkeypatch): + """The constructor knob must drive the sub-batch split.""" + batch_sizes = [] + original_encode_batch = _FakeTokenizer.encode_batch + + def recording_encode_batch(self, texts): + batch_sizes.append(len(texts)) + return original_encode_batch(self, texts) + + monkeypatch.setattr(_FakeTokenizer, "encode_batch", recording_encode_batch) + ef = embedding.EmbeddinggemmaONNX(batch_size=10) + out = ef([f"doc {i}" for i in range(24)]) + assert batch_sizes == [10, 10, 4] + assert len(out) == 24 + + +def test_batch_size_below_one_is_rejected(): + """A zero or negative batch size would loop forever or embed nothing.""" + with pytest.raises(ValueError, match="batch_size"): + embedding.EmbeddinggemmaONNX(batch_size=0) + with pytest.raises(ValueError, 
match="batch_size"): + embedding.EmbeddinggemmaONNX(batch_size=-3) + + +def test_call_empty_input_returns_empty(patched_lazy_load): + """Zero docs must yield zero embeddings without loading the model.""" + ef = embedding.EmbeddinggemmaONNX() + assert ef([]) == [] + assert ef(None) == [] + assert patched_lazy_load["hf_hub_download"] == 0, "empty input must not trigger the download" + + +def test_call_bare_string_is_wrapped(patched_lazy_load): + """A single string is one document, not a sequence of characters.""" + ef = embedding.EmbeddinggemmaONNX() + out = ef("standalone document") + assert np.asarray(out).shape == (1, 384) + + +def test_concurrent_first_calls_load_model_once(patched_lazy_load, monkeypatch): + """Cold concurrent calls must build exactly one session. + + Instances are shared across threads via _EF_CACHE; without the load + lock, two cold callers would transiently hold two full model sessions. + """ + import huggingface_hub + + fixture_download = huggingface_hub.hf_hub_download + + def slow_download(*args, **kwargs): + time.sleep(0.05) # widen the race window the lock must close + return fixture_download(*args, **kwargs) + + monkeypatch.setattr(huggingface_hub, "hf_hub_download", slow_download) + + ef = embedding.EmbeddinggemmaONNX() + barrier = threading.Barrier(2) + results = [None, None] + + def worker(slot): + barrier.wait(timeout=5) + results[slot] = ef([f"doc {slot}"]) + + threads = [threading.Thread(target=worker, args=(slot,)) for slot in range(2)] + for t in threads: + t.start() + for t in threads: + t.join(timeout=10) + + assert patched_lazy_load["InferenceSession"] == 1 + assert all(r is not None and len(r) == 1 for r in results) + + +def test_concurrent_get_embedding_function_single_instance(monkeypatch): + """Concurrent cache misses must converge on one shared EF instance. 
+ + The instance-level load lock is not enough on its own: if the factory's + check-then-construct is unsynchronized, each thread keeps its own + instance and each one later loads its own copy of the model. + """ + monkeypatch.setattr( + embedding, "_resolve_providers", lambda device: (["CPUExecutionProvider"], "cpu") + ) + barrier = threading.Barrier(2) + instances = [None, None] + + def worker(slot): + barrier.wait(timeout=5) + instances[slot] = embedding.get_embedding_function(device="cpu", model="embeddinggemma") + + threads = [threading.Thread(target=worker, args=(slot,)) for slot in range(2)] + for t in threads: + t.start() + for t in threads: + t.join(timeout=10) + + assert instances[0] is not None, "worker thread did not complete" + assert instances[0] is instances[1], "factory must hand every thread the same EF" + + def test_get_embedding_function_dispatches_to_embeddinggemma(monkeypatch): """model='embeddinggemma' must build EmbeddinggemmaONNX, not the MiniLM EF.""" monkeypatch.setattr( diff --git a/tests/test_format_miner.py b/tests/test_format_miner.py index a9ed81682..5f998d09c 100644 --- a/tests/test_format_miner.py +++ b/tests/test_format_miner.py @@ -801,6 +801,36 @@ def test_mine_formats_respects_limit(_mine_formats_mocks): assert p_md.call_count == 2 +def test_mine_formats_limit_skips_already_mined(_mine_formats_mocks): + """--limit N counts only new work, not already-mined skips (#1535).""" + from unittest.mock import patch + from mempalace.format_miner import mine_formats + + tmp = _mine_formats_mocks["tmp_path"] + files = [] + for i in range(6): + p = tmp / f"f{i}.pdf" + p.write_bytes(b"%PDF-1.4 stub") + files.append(p) + + call_idx = 0 + + def fake_already_mined(collection, source_file, **kwargs): + nonlocal call_idx + call_idx += 1 + return call_idx <= 4 + + _mine_formats_mocks["file_already_mined"].side_effect = fake_already_mined + with ( + patch("mempalace.format_miner.scan_formats", return_value=files), + patch( + 
"mempalace.format_miner._extract_via_markitdown", return_value="long text " * 50 + ) as p_md, + ): + mine_formats(format_dir=str(tmp), palace_path=str(tmp / "palace"), limit=1) + assert p_md.call_count == 1 + + def test_mine_formats_wing_defaults_from_directory_name(_mine_formats_mocks): """When wing=None, the directory's basename becomes the wing.""" from unittest.mock import patch diff --git a/tests/test_hallways.py b/tests/test_hallways.py index 92ba1865a..b2c74368c 100644 --- a/tests/test_hallways.py +++ b/tests/test_hallways.py @@ -20,9 +20,18 @@ def _use_tmp_hallway_file(monkeypatch, tmp_path): - """Redirect hallway persistence to a per-test JSON file.""" + """Redirect both the hallway-file resolver and the legacy-file check at the + tmp_path so existing tests stay in the configured-path branch and don't + accidentally trip the new legacy-file warning branch in _load_hallways. + Mirrors the analogous helper in ``tests/test_palace_graph_tunnels.py``. + """ hallway_file = tmp_path / "hallways.json" - monkeypatch.setattr(hallways_mod, "_HALLWAY_FILE", str(hallway_file)) + monkeypatch.setattr(hallways_mod, "_get_hallway_file", lambda *a, **kw: str(hallway_file)) + monkeypatch.setattr( + hallways_mod, + "_legacy_hallway_file", + lambda: str(tmp_path / "legacy-hallways.json"), + ) return hallway_file diff --git a/tests/test_hallways_pagination.py b/tests/test_hallways_pagination.py index 8847071ae..b33b0b4c9 100644 --- a/tests/test_hallways_pagination.py +++ b/tests/test_hallways_pagination.py @@ -15,7 +15,13 @@ def _use_tmp_hallway_file(monkeypatch, tmp_path): - monkeypatch.setattr(hallways_mod, "_HALLWAY_FILE", str(tmp_path / "hallways.json")) + hallway_file = tmp_path / "hallways.json" + monkeypatch.setattr(hallways_mod, "_get_hallway_file", lambda *a, **kw: str(hallway_file)) + monkeypatch.setattr( + hallways_mod, + "_legacy_hallway_file", + lambda: str(tmp_path / "legacy-hallways.json"), + ) def _collection_that_rejects_where_get(drawers): diff --git 
a/tests/test_hallways_palace_scoped.py b/tests/test_hallways_palace_scoped.py new file mode 100644 index 000000000..2cff134f6 --- /dev/null +++ b/tests/test_hallways_palace_scoped.py @@ -0,0 +1,159 @@ +"""Tests for the palace-scoped hallway-file migration. + +The pre-3.4 hallway store was hardcoded at ``~/.mempalace/hallways.json`` +regardless of the configured ``palace_path``, so two palaces on one host +silently shared one file. This file covers the migration: ``hallways.py`` +now resolves the path through ``MempalaceConfig.hallway_file`` (sibling of +``palace_path``), mirroring the 3.3.6 tunnel-file migration in +``palace_graph._get_tunnel_file``. + +Style and structure mirror ``tests/test_palace_graph_tunnels.py``'s +analogous coverage for tunnels (orphaned-legacy warning, same-path +no-warning, palace_path-follows behavior). +""" + +import logging +import os +from unittest.mock import MagicMock, patch + +with patch.dict("sys.modules", {"chromadb": MagicMock()}): + from mempalace import hallways as hallways_mod + from mempalace.config import DEFAULT_PALACE_PATH, MempalaceConfig + + +# ============================================================================= +# Resolver: MempalaceConfig.hallway_file + _get_hallway_file +# ============================================================================= + + +class TestHallwayFileResolution: + def test_default_hallway_file_is_sibling_of_default_palace(self): + cfg = MempalaceConfig() + expected = os.path.join(os.path.dirname(DEFAULT_PALACE_PATH), "hallways.json") + assert cfg.hallway_file == expected + assert hallways_mod._get_hallway_file(cfg) == expected + + def test_hallway_file_follows_palace_path(self, tmp_path): + """Custom palace_path → hallway sits beside the palace, not at the + hardcoded legacy location.""" + custom_dir = tmp_path / "custom-palace" + cfg = MempalaceConfig(config_dir=tmp_path) + cfg._file_config["palace_path"] = str(custom_dir) + assert cfg.hallway_file == str(tmp_path / "hallways.json") 
+ assert hallways_mod._get_hallway_file(cfg) == str(tmp_path / "hallways.json") + + def test_palace_env_var_redirects_hallway_file(self, tmp_path, monkeypatch): + """MEMPALACE_PALACE_PATH must redirect the hallway file too.""" + custom_palace = tmp_path / "envpalace" / "palace" + monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(custom_palace)) + cfg = MempalaceConfig() + assert cfg.hallway_file == str(tmp_path / "envpalace" / "hallways.json") + + +# ============================================================================= +# Orphan detection: legacy file present, configured file missing +# ============================================================================= + + +class TestLegacyHallwayFileDetection: + def test_load_hallways_warns_on_orphaned_legacy_file(self, tmp_path, monkeypatch, caplog): + """When the configured hallway file is missing but a legacy file + exists at a different path, _load_hallways logs a one-line warning + naming both paths and returns []. Critically, it does NOT + auto-migrate — silent merging risks clobbering newer data.""" + configured = tmp_path / "configured" / "hallways.json" + legacy = tmp_path / "legacy" / "hallways.json" + legacy.parent.mkdir(parents=True) + legacy.write_text( + '{"schema_version": 1, "hallways": [' + '{"id":"orphan","wing":"a","entity_a":"Alice",' + '"entity_b":"Bob","co_occurrence_count":2,"rooms":["r"]}' + "]}", + encoding="utf-8", + ) + + # Point the module constant at the patched-legacy path so the + # back-compat shim treats it as "legacy, defer to resolver". 
+ monkeypatch.setattr(hallways_mod, "_get_hallway_file", lambda *a, **kw: str(configured)) + monkeypatch.setattr(hallways_mod, "_legacy_hallway_file", lambda: str(legacy)) + + with caplog.at_level(logging.WARNING, logger="mempalace_hallways"): + result = hallways_mod._load_hallways() + + assert result == [], "must not auto-migrate from legacy file" + assert str(legacy) in caplog.text + assert str(configured) in caplog.text + + def test_no_legacy_warning_when_paths_match(self, tmp_path, monkeypatch, caplog): + """If configured and legacy resolve to the same path (default install), + we must not emit a misleading 'legacy file ignored' warning when the + file simply doesn't exist yet.""" + same = tmp_path / "hallways.json" + monkeypatch.setattr(hallways_mod, "_get_hallway_file", lambda *a, **kw: str(same)) + monkeypatch.setattr(hallways_mod, "_legacy_hallway_file", lambda: str(same)) + + with caplog.at_level(logging.WARNING, logger="mempalace_hallways"): + assert hallways_mod._load_hallways() == [] + + assert "Legacy hallways file" not in caplog.text + + +# ============================================================================= +# Multi-palace isolation: two palaces no longer share the file +# ============================================================================= + + +class TestMultiPalaceIsolation: + def test_two_palaces_get_distinct_hallway_files(self, tmp_path, monkeypatch): + """The original bug: switching MEMPALACE_PALACE_PATH between two + palace dirs must produce two distinct hallway files, not one shared. 
+ """ + palace_a = tmp_path / "a" / "palace" + palace_b = tmp_path / "b" / "palace" + palace_a.mkdir(parents=True) + palace_b.mkdir(parents=True) + + monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(palace_a)) + file_a = MempalaceConfig().hallway_file + + monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(palace_b)) + file_b = MempalaceConfig().hallway_file + + assert file_a != file_b + assert file_a == str(tmp_path / "a" / "hallways.json") + assert file_b == str(tmp_path / "b" / "hallways.json") + + def test_save_then_load_under_different_palace_returns_empty(self, tmp_path, monkeypatch): + """End-to-end: writing hallways under palace-A and then loading under + palace-B must NOT return palace-A's records. This is the regression + guard for the original bug.""" + palace_a = tmp_path / "a" / "palace" + palace_b = tmp_path / "b" / "palace" + palace_a.mkdir(parents=True) + palace_b.mkdir(parents=True) + + # Pin the legacy-file lookup to a temp path so the legacy-warning + # branch never checks the host's real ~/.mempalace/hallways.json. 
+ monkeypatch.setattr( + hallways_mod, + "_legacy_hallway_file", + lambda: str(tmp_path / "legacy-hallways.json"), + ) + + monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(palace_a)) + hallways_mod._save_hallways( + [ + { + "id": "h_from_a", + "wing": "wing_a", + "entity_a": "Alice", + "entity_b": "Bob", + "co_occurrence_count": 2, + "rooms": ["room_a"], + } + ] + ) + assert os.path.exists(str(tmp_path / "a" / "hallways.json")) + + monkeypatch.setenv("MEMPALACE_PALACE_PATH", str(palace_b)) + assert hallways_mod._load_hallways() == [] diff --git a/tests/test_hook_shell.py b/tests/test_hook_shell.py new file mode 100644 index 000000000..4c41efe4a --- /dev/null +++ b/tests/test_hook_shell.py @@ -0,0 +1,128 @@ +import json +import subprocess +import sys + +from mempalace import hook_shell + + +def test_normalize_transcript_path_preserves_windows_drive_and_segments(): + path = r"C:\Users\me\.claude\projects\-Users-me-Proj\session.jsonl" + + assert ( + hook_shell.normalize_transcript_path(path) + == "C:/Users/me/.claude/projects/-Users-me-Proj/session.jsonl" + ) + + +def test_normalize_transcript_path_preserves_spaces_and_unicode(): + path = r"C:\Users\Me User\.claude\projects\emoji 🧠\session.jsonl" + + assert ( + hook_shell.normalize_transcript_path(path) + == "C:/Users/Me User/.claude/projects/emoji 🧠/session.jsonl" + ) + + +def test_parse_stop_payload_keeps_session_strict_but_path_not_over_sanitized(): + session_id, stop_active, transcript_path = hook_shell.parse_stop_payload( + { + "session_id": "../bad session!!", + "stop_hook_active": "yes", + "transcript_path": r"C:\Users\Me User\.claude\projects\emoji 🧠\session.jsonl", + } + ) + + assert session_id == "badsession" + assert stop_active == "True" + assert transcript_path == "C:/Users/Me User/.claude/projects/emoji 🧠/session.jsonl" + + +def test_parse_precompact_cli_outputs_sentinel_and_normalized_path(): + payload = { + "session_id": "sess-1", + "transcript_path": r"D:\Claude\projects\-Users-me-App\session.jsonl", + 
} + + result = subprocess.run( + [sys.executable, "-m", "mempalace.hook_shell", "parse-precompact"], + input=json.dumps(payload), + text=True, + capture_output=True, + check=True, + ) + + assert result.stdout.splitlines() == [ + "__MEMPAL_PARSE_OK__", + "sess-1", + "D:/Claude/projects/-Users-me-App/session.jsonl", + ] + + +def test_count_human_messages_reads_utf8_transcripts_tolerantly(tmp_path): + transcript = tmp_path / "session.jsonl" + transcript.write_text( + json.dumps( + {"message": {"role": "user", "content": "emoji: 🧠 café Привет"}}, + ensure_ascii=False, + ) + + "\n" + + json.dumps({"message": {"role": "user", "content": "ignore message"}}) + + "\n" + + json.dumps({"message": {"role": "assistant", "content": "ignored"}}) + + "\n" + + "{bad json\n", + encoding="utf-8", + ) + + assert hook_shell.count_human_messages(str(transcript)) == 1 + + result = subprocess.run( + [sys.executable, "-m", "mempalace.hook_shell", "count-human-messages", str(transcript)], + text=True, + capture_output=True, + check=True, + ) + + assert result.stdout.strip() == "1" + + +def test_parse_stop_cli_fails_loud_on_malformed_nonempty_stdin(): + result = subprocess.run( + [sys.executable, "-m", "mempalace.hook_shell", "parse-stop"], + input="not-json garbage", + text=True, + capture_output=True, + check=False, + ) + + assert result.returncode != 0 + assert "__MEMPAL_PARSE_OK__" not in result.stdout + assert "traceback" in result.stderr.lower() or "json" in result.stderr.lower() + + +def test_parse_precompact_cli_fails_loud_on_malformed_nonempty_stdin(): + result = subprocess.run( + [sys.executable, "-m", "mempalace.hook_shell", "parse-precompact"], + input="not-json garbage", + text=True, + capture_output=True, + check=False, + ) + + assert result.returncode != 0 + assert "__MEMPAL_PARSE_OK__" not in result.stdout + assert "traceback" in result.stderr.lower() or "json" in result.stderr.lower() + + +def test_parse_stop_cli_treats_empty_stdin_as_empty_payload(): + result = 
subprocess.run( + [sys.executable, "-m", "mempalace.hook_shell", "parse-stop"], + input="", + text=True, + capture_output=True, + check=True, + ) + + lines = result.stdout.splitlines() + assert lines[:3] == ["__MEMPAL_PARSE_OK__", "unknown", "False"] + assert result.stderr == "" diff --git a/tests/test_hybrid_search.py b/tests/test_hybrid_search.py index 793216aee..a2672de41 100644 --- a/tests/test_hybrid_search.py +++ b/tests/test_hybrid_search.py @@ -114,11 +114,13 @@ def test_closet_preview_exposed_when_boosted(self, tmp_path): palace, drawer_id="D1", source_file="fixture_D1.md", - topics=["JWT auth tokens", "24h expiry", "authentication"], + topics=["JWT auth tokens", "session expiry", "authentication service"], ) - result = search_memories("JWT authentication", palace, n_results=2) + result = search_memories("JWT auth tokens expiry", palace, n_results=2) top = result["results"][0] assert top["source_file"] == "fixture_D1.md" + assert top["matched_via"] == "drawer+closet" + assert top["closet_boost"] > 0 assert "closet_preview" in top def test_drawer_only_hits_have_no_closet_preview(self, tmp_path): diff --git a/tests/test_ids.py b/tests/test_ids.py index 10376027e..0eb1216c3 100644 --- a/tests/test_ids.py +++ b/tests/test_ids.py @@ -18,11 +18,11 @@ # ── ID_RECIPE constant ───────────────────────────────────────────────── -def test_id_recipe_constant_is_v2(): +def test_id_recipe_constant_is_v3(): """Audit code reads ids.ID_RECIPE to tag new drawers. 
The constant - must be the literal "v2" string; a typo here silently re-introduces + must be the literal "v3" string; a typo here silently re-introduces the ambiguity v2 was meant to fix.""" - assert ids.ID_RECIPE == "v2" + assert ids.ID_RECIPE == "v3" # ── make_drawer_id_from_chunk ───────────────────────────────────────── @@ -170,15 +170,17 @@ def test_make_triple_id_does_not_collide_across_iso_datetime_boundary(): # ── _delimited_sha256 (private helper, smoke test only) ─────────────── -def test_private_delimited_sha256_uses_pipe_delimiter(): - """Confirms the implementation actually uses '|' and not ':' — a - subtle copy-paste from the diary_ingest precedent or a stale - ':' precedent from convo_miner could regress the delimiter without - breaking the higher-level tests.""" +def test_private_delimited_sha256_uses_length_prefixing(): result = ids._delimited_sha256(("a", "b"), 64) - expected = hashlib.sha256(b"a|b").hexdigest() + expected = hashlib.sha256(b"1:a1:b").hexdigest() assert result == expected + # These tuples collapse to the same raw pipe-joined string: + # "a|b|c|d". The v3 length-prefixed recipe must keep them distinct. 
+ left = ids._delimited_sha256(("a", "b|c", "d"), 64) + right = ids._delimited_sha256(("a|b", "c", "d"), 64) + assert left != right + def test_private_delimited_sha256_truncation_honoured(): """Truncation argument actually shortens the hex output.""" diff --git a/tests/test_legacy_shell_hooks.py b/tests/test_legacy_shell_hooks.py new file mode 100644 index 000000000..7a97f2f2d --- /dev/null +++ b/tests/test_legacy_shell_hooks.py @@ -0,0 +1,26 @@ +from pathlib import Path + + +ROOT = Path(__file__).resolve().parents[1] + + +def _hook(name): + return (ROOT / "hooks" / name).read_text(encoding="utf-8") + + +def test_save_hook_uses_shared_parser_and_utf8_counter(): + body = _hook("mempal_save_hook.sh") + + assert "-m mempalace.hook_shell parse-stop" in body + assert "-m mempalace.hook_shell count-human-messages" in body + assert "transcript_path not found after normalization" in body + assert "safe = lambda" not in body + assert "with open(sys.argv[1]) as f:" not in body + + +def test_precompact_hook_uses_shared_parser(): + body = _hook("mempal_precompact_hook.sh") + + assert "-m mempalace.hook_shell parse-precompact" in body + assert "missing or invalid transcript path after normalization" in body + assert "safe = lambda" not in body diff --git a/tests/test_maintenance_hooks.py b/tests/test_maintenance_hooks.py new file mode 100644 index 000000000..95ac0fd8c --- /dev/null +++ b/tests/test_maintenance_hooks.py @@ -0,0 +1,275 @@ +"""Backend maintenance hooks (RFC 001). + +Maintenance is observable, not fire-and-forget: ``run_maintenance(kind)`` +returns a ``MaintenanceResult`` and MUST serialize concurrent same-kind runs. +The pgvector ``reindex`` path (the opt-in HNSW build) is exercised here with a +fake client so the advisory-lock flow is tested without a live Postgres. 
+""" + +import pytest + +from mempalace.backends.base import ( + BaseCollection, + MaintenanceResult, + PalaceRef, + UnsupportedMaintenanceKindError, +) + + +# --------------------------------------------------------------------------- +# Contract surface +# --------------------------------------------------------------------------- + + +def test_maintenance_result_shape(): + r = MaintenanceResult(kind="reindex", status="ran", stats={"ms": 12}) + assert r.kind == "reindex" and r.status == "ran" and r.stats["ms"] == 12 + assert MaintenanceResult(kind="analyze", status="noop").stats == {} + + +def test_default_collection_rejects_all_kinds(): + class _Col(BaseCollection): + def add(self, **k): ... + def upsert(self, **k): ... + def query(self, **k): ... + def get(self, **k): ... + def delete(self, **k): ... + def count(self): + return 0 + + col = _Col() + assert col.maintenance_state() == {} + with pytest.raises(UnsupportedMaintenanceKindError): + col.run_maintenance("analyze") + + +def test_backend_maintenance_kinds_declared(): + from mempalace.backends.chroma import ChromaBackend + from mempalace.backends.pgvector import PgVectorBackend + from mempalace.backends.qdrant import QdrantBackend + from mempalace.backends.sqlite_exact import SQLiteExactBackend + + assert SQLiteExactBackend.maintenance_kinds == frozenset({"analyze", "compact"}) + assert PgVectorBackend.maintenance_kinds == frozenset({"analyze", "reindex"}) + # qdrant self-optimizes; chroma maintenance is the separate repair CLI. 
+ assert QdrantBackend.maintenance_kinds == frozenset() + assert ChromaBackend.maintenance_kinds == frozenset() + + +# --------------------------------------------------------------------------- +# sqlite_exact (CI-runnable, real backend) +# --------------------------------------------------------------------------- + + +def _sqlite_collection(tmp_path, rows=20): + from mempalace.backends.sqlite_exact import SQLiteExactBackend + + col = SQLiteExactBackend().get_collection( + palace=PalaceRef(id=str(tmp_path), local_path=str(tmp_path)), + collection_name="mempalace_drawers", + create=True, + ) + for i in range(rows): + col.add( + documents=[f"doc {i}"], + ids=[f"id{i}"], + metadatas=[{}], + embeddings=[[0.1, 0.2, 0.3, 0.4]], + ) + return col + + +def test_sqlite_maintenance_state(tmp_path): + col = _sqlite_collection(tmp_path, rows=5) + state = col.maintenance_state() + assert state["row_count"] == 5 + assert state["vector_index"] is None # exact scan — no ANN index + assert "page_count" in state and "freelist_pages" in state + + +def test_sqlite_analyze_runs(tmp_path): + col = _sqlite_collection(tmp_path, rows=5) + r = col.run_maintenance("analyze") + assert r.kind == "analyze" and r.status == "ran" + + +def test_sqlite_compact_runs_and_reports_pages(tmp_path): + col = _sqlite_collection(tmp_path, rows=30) + col.delete(ids=[f"id{i}" for i in range(20)]) + r = col.run_maintenance("compact") + assert r.kind == "compact" and r.status == "ran" + assert "pages_reclaimed" in r.stats + + +def test_sqlite_omits_reindex(tmp_path): + # sqlite_exact has no ANN index, so reindex is omitted, not no-op'd. 
+ col = _sqlite_collection(tmp_path, rows=2) + with pytest.raises(UnsupportedMaintenanceKindError): + col.run_maintenance("reindex") + + +def test_sqlite_unknown_kind_raises(tmp_path): + col = _sqlite_collection(tmp_path, rows=2) + with pytest.raises(UnsupportedMaintenanceKindError): + col.run_maintenance("bogus") + + +# --------------------------------------------------------------------------- +# pgvector advisory-lock reindex flow (fake client, no live Postgres) +# --------------------------------------------------------------------------- + + +class _FakeClient: + def __init__(self, has_index=False): + self.has_index = has_index + self.locked = False + self.created = 0 + self.analyzed = 0 + + def table_exists(self, table): + return True + + def count_rows(self, table): + return 7 + + def has_vector_index(self, table): + return self.has_index + + def try_advisory_lock(self, classid, objid): + if self.locked: + return False + self.locked = True + return True + + def advisory_unlock(self, classid, objid): + self.locked = False + + def create_hnsw_index(self, table): + self.has_index = True + self.created += 1 + + def analyze_table(self, table): + self.analyzed += 1 + + +class _FakeBackend: + _closed = False + + +def _pg_collection(client): + from mempalace.backends.pgvector import PgVectorCollection, _PgVectorConfig + + return PgVectorCollection( + backend=_FakeBackend(), + client=client, + config=_PgVectorConfig(dsn="postgresql://example", namespace=None), + palace=PalaceRef(id="/tmp/p", local_path="/tmp/p"), + collection_name="mempalace_drawers", + table="mp_drawers_t", + ) + + +def test_pgvector_reindex_builds_index_under_lock(): + client = _FakeClient(has_index=False) + col = _pg_collection(client) + r = col.run_maintenance("reindex") + assert r.status == "ran" and r.stats.get("vector_index") == "hnsw" + assert client.created == 1 + assert client.locked is False # lock released in finally + + +def test_pgvector_reindex_noop_when_index_exists(): + client = 
_FakeClient(has_index=True) + col = _pg_collection(client) + r = col.run_maintenance("reindex") + assert r.status == "noop" + assert client.created == 0 # never attempted a build + + +def test_pgvector_reindex_already_running_when_lock_held(): + client = _FakeClient(has_index=False) + client.locked = True # another session is building + col = _pg_collection(client) + r = col.run_maintenance("reindex") + assert r.status == "already_running" + assert client.created == 0 # did not re-trigger the build + + +def test_pgvector_analyze_runs(): + client = _FakeClient() + col = _pg_collection(client) + r = col.run_maintenance("analyze") + assert r.status == "ran" and client.analyzed == 1 + + +def test_pgvector_unknown_kind_raises(): + col = _pg_collection(_FakeClient()) + with pytest.raises(UnsupportedMaintenanceKindError): + col.run_maintenance("compact") # pgvector omits compact (autovacuum) + + +def test_pgvector_maintenance_state_reports_index(): + col = _pg_collection(_FakeClient(has_index=True)) + state = col.maintenance_state() + assert state["row_count"] == 7 + assert state["vector_index"] == "hnsw" and state["index_build_complete"] is True + + +def test_pgvector_maintenance_noop_when_table_missing(): + # Collection opened create=True but never written: no table yet. Maintenance + # must noop, not let a raw "relation does not exist" error escape. + client = _FakeClient() + client.table_exists = lambda table: False + col = _pg_collection(client) + assert col.run_maintenance("reindex").status == "noop" + assert col.run_maintenance("analyze").status == "noop" + assert col.maintenance_state()["row_count"] == 0 + + +def test_hnsw_index_name_never_collides_with_table_name(): + # A naive [:63] truncation would return a 63-char table name verbatim, + # colliding in pg_class. _pg_identifier hashes the overflow instead. 
+ from mempalace.backends.pgvector import _hnsw_index_name + + for table in ("t", "mp_drawers", "x" * 63, "y" * 200): + name = _hnsw_index_name(table) + assert name != table + assert len(name.encode("utf-8")) <= 63 + + +def test_pgvector_advisory_key_is_signed_int4_and_stable(): + from mempalace.backends.pgvector import _MAINTENANCE_LOCK_CLASSID, _advisory_objid + + for table in ("a", "mempalace_drawers_xyz", "x" * 80): + objid = _advisory_objid(table) + assert -(2**31) <= objid < 2**31 + assert _advisory_objid(table) == objid # stable + assert -(2**31) <= _MAINTENANCE_LOCK_CLASSID < 2**31 + + +# --------------------------------------------------------------------------- +# EmbeddingCollection delegation +# --------------------------------------------------------------------------- + + +def test_embeddingcollection_delegates_maintenance(): + from mempalace.backends.embedding_wrapper import EmbeddingCollection + + class _Inner(BaseCollection): + def add(self, **k): ... + def upsert(self, **k): ... + def query(self, **k): ... + def get(self, **k): ... + def delete(self, **k): ... + def count(self): + return 0 + + def maintenance_state(self): + return {"row_count": 3} + + def run_maintenance(self, kind): + return MaintenanceResult(kind=kind, status="ran") + + wrapped = EmbeddingCollection(_Inner()) + assert wrapped.maintenance_state() == {"row_count": 3} + assert wrapped.run_maintenance("analyze").status == "ran" diff --git a/tests/test_mcp_mine.py b/tests/test_mcp_mine.py new file mode 100644 index 000000000..0f73addf4 --- /dev/null +++ b/tests/test_mcp_mine.py @@ -0,0 +1,262 @@ +""" +test_mcp_mine.py — Tests for the ``mempalace_mine`` MCP tool (#1662). + +Mining was previously CLI-only (``mempalace mine``); non-Claude-Code MCP clients +(Desktop Commander, LM Studio, Aionui) had no MCP-callable mine. ``tool_mine`` +wraps the same in-process miners the CLI uses — projects / convos / extract — +synchronously, mirroring the ``tool_sync`` contract. 
+ +The miners print progress + a summary to stdout, which in the MCP server is the +JSON-RPC channel. ``tool_mine`` therefore redirects stdout at the file-descriptor +level around the miner and returns the text as an opaque ``output`` field rather +than letting it corrupt the protocol. These tests assert the dispatch/return +contract, that convos mining actually files drawers (the #1662 gap), and that the +stdout isolation holds. +""" + +import os + +import chromadb + + +def _patch(monkeypatch, config): + from mempalace import mcp_server + + monkeypatch.setattr(mcp_server, "_config", config) + + +def _write(path, text): + with open(path, "w", encoding="utf-8") as fh: + fh.write(text) + + +# ── Registration ───────────────────────────────────────────────────────── + + +def test_registered_in_tools(): + from mempalace import mcp_server + + assert "mempalace_mine" in mcp_server.TOOLS + entry = mcp_server.TOOLS["mempalace_mine"] + assert entry["handler"] is mcp_server.tool_mine + assert entry["input_schema"]["required"] == ["source"] + + +# ── Guard rails ────────────────────────────────────────────────────────── + + +def test_no_palace_returns_structured_error(monkeypatch): + from mempalace import mcp_server + + class _EmptyConfig: + palace_path = "" + collection_name = "mempalace_drawers" + + monkeypatch.setattr(mcp_server, "_config", _EmptyConfig()) + result = mcp_server.tool_mine(source="/tmp") + assert result["success"] is False + assert "error" in result + + +def test_invalid_mode_returns_structured_error(monkeypatch, config, tmp_dir): + from mempalace import mcp_server + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "src") + os.makedirs(src) + result = mcp_server.tool_mine(source=src, mode="bogus") + assert result["success"] is False + assert "invalid mode" in result["error"].lower() + + +def test_missing_source_dir_returns_structured_error(monkeypatch, config): + from mempalace import mcp_server + + _patch(monkeypatch, config) + result = 
mcp_server.tool_mine(source="/nonexistent/path/xyz") + assert result["success"] is False + assert "source" in result["error"].lower() + + +# ── Dispatch + return contract ─────────────────────────────────────────── + + +def test_dry_run_projects_returns_success_and_output(monkeypatch, config, tmp_dir): + from mempalace import mcp_server + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "proj") + os.makedirs(src) + _write(os.path.join(src, "notes.md"), "# Title\n\n" + ("Some real content. " * 40)) + + result = mcp_server.tool_mine(source=src, mode="projects", dry_run=True) + assert result["success"] is True + assert result["mode"] == "projects" + assert result["dry_run"] is True + assert isinstance(result["output"], str) and result["output"] + + +def test_convos_mode_files_drawers(monkeypatch, config, tmp_dir): + """The #1662 core ask: mine conversation transcripts via MCP. + + Proves the tool eliminates the gap rather than masking it — after a real + convos mine the palace collection actually holds the drawers. + """ + from mempalace import mcp_server + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "convos") + os.makedirs(src) + _write( + os.path.join(src, "chat.txt"), + "> What is memory?\nMemory is persistence.\n\n" + "> Why does it matter?\nIt enables continuity across sessions.\n\n" + "> How do we build it?\nWith structured verbatim storage.\n", + ) + + result = mcp_server.tool_mine(source=src, mode="convos", wing="test_convos") + assert result["success"] is True + assert result["mode"] == "convos" + assert result["dry_run"] is False + + client = chromadb.PersistentClient(path=config.palace_path) + try: + col = client.get_collection("mempalace_drawers") + assert col.count() >= 2 + finally: + del client + + +def test_stdout_captured_not_leaked_to_fd(monkeypatch, config, tmp_dir, capfd): + """Miner stdout must land in ``output``, never on the real fd-1 JSON-RPC + channel. 
``tool_mine`` redirects fd 1 around the in-process miner.""" + from mempalace import mcp_server + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "convos") + os.makedirs(src) + _write( + os.path.join(src, "chat.txt"), + "> Q one?\nAnswer one is reasonably long so it forms a chunk here.\n\n" + "> Q two?\nAnswer two is also long enough to be filed as a drawer here.\n", + ) + + result = mcp_server.tool_mine(source=src, mode="convos", wing="cap", dry_run=True) + captured = capfd.readouterr() + assert "Done." in result["output"] + assert "Done." not in captured.out + + +def test_mine_already_running_surfaces_structured_error(monkeypatch, config, tmp_dir): + """A held palace lock (MineAlreadyRunning) surfaces as a structured + already-running error, mirroring tool_sync.""" + from mempalace import mcp_server + from mempalace.palace import MineAlreadyRunning + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "proj") + os.makedirs(src) + _write(os.path.join(src, "a.md"), "content " * 50) + + def _boom(*args, **kwargs): + raise MineAlreadyRunning("held by pid 999") + + monkeypatch.setattr("mempalace.miner.mine", _boom) + result = mcp_server.tool_mine(source=src, mode="projects") + assert result["success"] is False + assert result.get("error_class") == "LockHeldByOtherProcess" + + +def test_large_output_is_tail_truncated(monkeypatch, config, tmp_dir): + """A very large miner summary is tail-trimmed (and flagged, never silently) + so the MCP response stays bounded.""" + from mempalace import mcp_server + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "proj") + os.makedirs(src) + + def _chatty(*args, **kwargs): + print("X" * 5000) + return None + + monkeypatch.setattr("mempalace.miner.mine", _chatty) + result = mcp_server.tool_mine(source=src, mode="projects") + assert result["success"] is True + assert result["output_truncated"] is True + assert len(result["output"]) == 4000 + + +def 
test_import_error_outside_extract_is_not_mislabeled(monkeypatch, config, tmp_dir): + """An ImportError outside extract mode is a real bug, not a missing extra — + it must not be labelled MissingDependency.""" + from mempalace import mcp_server + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "proj") + os.makedirs(src) + + def _broken(*args, **kwargs): + raise ImportError("no module named 'totally_internal'") + + monkeypatch.setattr("mempalace.miner.mine", _broken) + result = mcp_server.tool_mine(source=src, mode="projects") + assert result["success"] is False + assert result.get("error_class") == "ImportError" + assert "mine failed" in result["error"] + + +def test_extract_missing_dependency_is_named(monkeypatch, config, tmp_dir): + """extract mode surfaces a MissingDependency error pointing at the extra.""" + from mempalace import mcp_server + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "docs") + os.makedirs(src) + + def _no_extra(*args, **kwargs): + raise ImportError("No module named 'markitdown'") + + monkeypatch.setattr("mempalace.format_miner.mine_formats", _no_extra) + result = mcp_server.tool_mine(source=src, mode="extract") + assert result["success"] is False + assert result.get("error_class") == "MissingDependency" + assert "mempalace[extract]" in result["error"] + + +def test_system_exit_from_miner_does_not_kill_server(monkeypatch, config, tmp_dir): + """miner.mine turns Ctrl-C into sys.exit(130); in-process that SystemExit + would escape the protocol loop (which only catches Exception) and kill the + server. 
tool_mine converts it to a structured error instead.""" + from mempalace import mcp_server + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "proj") + os.makedirs(src) + + def _exit(*args, **kwargs): + raise SystemExit(130) + + monkeypatch.setattr("mempalace.miner.mine", _exit) + result = mcp_server.tool_mine(source=src, mode="projects") + assert result["success"] is False + assert result.get("error_class") == "Interrupted" + + +def test_generic_exception_carries_error_class(monkeypatch, config, tmp_dir): + """An unexpected miner failure is surfaced with its exception type so the + caller can distinguish error kinds.""" + from mempalace import mcp_server + + _patch(monkeypatch, config) + src = os.path.join(tmp_dir, "proj") + os.makedirs(src) + + def _boom(*args, **kwargs): + raise RuntimeError("disk gone") + + monkeypatch.setattr("mempalace.miner.mine", _boom) + result = mcp_server.tool_mine(source=src, mode="projects") + assert result["success"] is False + assert "mine failed" in result["error"] + assert result.get("error_class") == "RuntimeError" diff --git a/tests/test_mcp_server.py b/tests/test_mcp_server.py index a5e406b0e..27f4251c9 100644 --- a/tests/test_mcp_server.py +++ b/tests/test_mcp_server.py @@ -9,6 +9,7 @@ from datetime import datetime import json import os +from types import SimpleNamespace import subprocess import sys from unittest.mock import MagicMock @@ -537,6 +538,20 @@ def test_tools_list(self): assert "mempalace_add_drawer" in names assert "mempalace_kg_add" in names + def test_no_tool_schema_uses_top_level_combinator(self): + """Anthropic's Messages API rejects a tool whose input schema has a + top-level anyOf/oneOf/allOf and drops the entire tools array with a + 400, killing the session (#1711). Cross-tool constraints must be + enforced at dispatch instead. 
+ """ + from mempalace.mcp_server import handle_request + + resp = handle_request({"method": "tools/list", "id": 2, "params": {}}) + for tool in resp["result"]["tools"]: + schema = tool["inputSchema"] + for keyword in ("anyOf", "oneOf", "allOf"): + assert keyword not in schema, f"{tool['name']} schema has top-level {keyword}" + def test_null_arguments_does_not_hang(self, monkeypatch, config, palace_path, seeded_kg): """Sending arguments: null should return a result, not hang (#394).""" _patch_mcp_server(monkeypatch, config, seeded_kg) @@ -1054,6 +1069,38 @@ def fake_reset(): assert "results" in result assert result.get("index_recovered") is True + def test_search_retry_preserves_collection_name(self, monkeypatch, config, kg): + """Retry path must query the same configured collection both times.""" + _patch_mcp_server(monkeypatch, config, kg) + from mempalace import mcp_server + + monkeypatch.setattr( + mcp_server, + "_config", + SimpleNamespace( + palace_path=config.palace_path, + collection_name="custom_drawers", + ), + ) + seen_collection_names = [] + + def fake_search(*args, **kwargs): + seen_collection_names.append(kwargs.get("collection_name")) + if len(seen_collection_names) == 1: + return { + "error": "Search error: Error executing plan: Internal error: Error finding id" + } + return {"results": [{"text": "ok", "wing": "w", "room": "r"}]} + + monkeypatch.setattr(mcp_server, "search_memories", fake_search) + monkeypatch.setattr(mcp_server, "_force_chroma_cache_reset", lambda: None) + monkeypatch.setattr(mcp_server.time, "sleep", lambda _: None) + + result = mcp_server.tool_search(query="anything", wing="wing_api") + + assert "results" in result + assert seen_collection_names == ["custom_drawers", "custom_drawers"] + def test_search_does_not_retry_on_non_transient_error(self, monkeypatch, config, kg): """Validation / unrelated errors must not trigger the retry path.""" _patch_mcp_server(monkeypatch, config, kg) @@ -1192,6 +1239,63 @@ def 
test_add_drawer_duplicate_detection(self, monkeypatch, config, palace_path, assert result2["success"] is True assert result2["reason"] == "already_exists" + def test_add_drawer_returns_failure_when_idempotency_precheck_raises( + self, monkeypatch, config, kg + ): + _patch_mcp_server(monkeypatch, config, kg) + from mempalace import mcp_server + + mock_col = MagicMock() + mock_col.get.side_effect = RuntimeError("precheck boom") + monkeypatch.setattr(mcp_server, "_get_collection", lambda create=False: mock_col) + + result = mcp_server.tool_add_drawer("w", "r", "content") + + assert result["success"] is False + assert "Idempotency check failed before write" in result["error"] + assert "precheck boom" in result["error"] + + def test_add_drawer_does_not_upsert_when_idempotency_precheck_raises( + self, monkeypatch, config, kg + ): + _patch_mcp_server(monkeypatch, config, kg) + from mempalace import mcp_server + + mock_col = MagicMock() + mock_col.get.side_effect = RuntimeError("precheck boom") + monkeypatch.setattr(mcp_server, "_get_collection", lambda create=False: mock_col) + + result = mcp_server.tool_add_drawer("w", "r", "content") + + assert result["success"] is False + mock_col.upsert.assert_not_called() + + def test_add_drawer_treats_dict_like_precheck_hit_as_already_exists( + self, monkeypatch, config, kg + ): + _patch_mcp_server(monkeypatch, config, kg) + from mempalace import mcp_server + + mock_col = MagicMock() + mock_col.get.return_value = {"ids": ["existing-drawer"]} + monkeypatch.setattr(mcp_server, "_get_collection", lambda create=False: mock_col) + + result = mcp_server.tool_add_drawer("w", "r", "content") + + assert result["success"] is True + assert result["reason"] == "already_exists" + mock_col.upsert.assert_not_called() + + def test_get_result_ids_normalizes_none_to_empty_list(self): + from mempalace import mcp_server + + class DictLikeResult: + def get(self, key, default=None): + return None + + assert mcp_server._get_result_ids({"ids": None}) == [] 
+ assert mcp_server._get_result_ids(DictLikeResult()) == [] + def test_add_drawer_fails_when_readback_misses(self, monkeypatch, config, kg): _patch_mcp_server(monkeypatch, config, kg) from mempalace import mcp_server @@ -1512,6 +1616,113 @@ def _raise(*args, **kwargs): assert result == {"error": msg} + # ── hallway MCP tools (mirror the tunnel pattern) ── + + def _seed_hallways(self, monkeypatch, tmp_path): + """Point hallways resolvers at a tmp file and seed two records.""" + from mempalace import hallways + + hallway_file = tmp_path / "hallways.json" + monkeypatch.setattr(hallways, "_get_hallway_file", lambda *a, **kw: str(hallway_file)) + monkeypatch.setattr( + hallways, + "_legacy_hallway_file", + lambda: str(tmp_path / "legacy-hallways.json"), + ) + seeded = [ + { + "id": "hallway_wing_a_X_Y_aaaa", + "wing": "wing_a", + "entity_a": "X", + "entity_b": "Y", + "co_occurrence_count": 3, + "rooms": ["room1"], + }, + { + "id": "hallway_wing_b_X_Z_bbbb", + "wing": "wing_b", + "entity_a": "X", + "entity_b": "Z", + "co_occurrence_count": 1, + "rooms": ["room2"], + }, + ] + hallways._save_hallways(seeded) + return seeded + + def test_tool_list_hallways_returns_all_without_filter(self, monkeypatch, tmp_path): + """tool_list_hallways with no wing returns every record.""" + from mempalace import mcp_server + + seeded = self._seed_hallways(monkeypatch, tmp_path) + result = mcp_server.tool_list_hallways() + assert isinstance(result, list) + assert len(result) == len(seeded) + ids = {h["id"] for h in result} + assert ids == {h["id"] for h in seeded} + + def test_tool_list_hallways_filters_by_wing(self, monkeypatch, tmp_path): + """tool_list_hallways with wing returns only that wing's records.""" + from mempalace import mcp_server + + self._seed_hallways(monkeypatch, tmp_path) + result = mcp_server.tool_list_hallways(wing="wing_a") + assert len(result) == 1 + assert result[0]["wing"] == "wing_a" + + def test_tool_list_hallways_rejects_invalid_wing_name(self, monkeypatch, 
tmp_path): + """Invalid wing names go through _sanitize_optional_name and return a + structured error rather than crashing — mirrors tool_list_tunnels.""" + from mempalace import mcp_server + + self._seed_hallways(monkeypatch, tmp_path) + # Forward-slash is not a valid name character per sanitize_name. + result = mcp_server.tool_list_hallways(wing="wing/with/slashes") + assert isinstance(result, dict) + assert "error" in result + + def test_tool_delete_hallway_removes_existing_record(self, monkeypatch, tmp_path): + """tool_delete_hallway removes the record and returns {deleted: True}.""" + from mempalace import mcp_server + + seeded = self._seed_hallways(monkeypatch, tmp_path) + target_id = seeded[0]["id"] + result = mcp_server.tool_delete_hallway(hallway_id=target_id) + assert result == {"deleted": True} + remaining = mcp_server.tool_list_hallways() + assert target_id not in {h["id"] for h in remaining} + + def test_tool_delete_hallway_unknown_id_returns_false(self, monkeypatch, tmp_path): + """Deleting an ID that doesn't exist returns {deleted: False} without error.""" + from mempalace import mcp_server + + self._seed_hallways(monkeypatch, tmp_path) + result = mcp_server.tool_delete_hallway(hallway_id="hallway_does_not_exist") + assert result == {"deleted": False} + + def test_tool_delete_hallway_requires_string_id(self): + """Missing or non-string hallway_id surfaces a structured error.""" + from mempalace import mcp_server + + assert mcp_server.tool_delete_hallway(hallway_id="") == {"error": "hallway_id is required"} + assert mcp_server.tool_delete_hallway(hallway_id=None) == { + "error": "hallway_id is required" + } + + def test_hallway_tools_registered_in_tools_registry(self): + """Both new tools must appear in the public TOOLS registry so MCP clients can dispatch them.""" + from mempalace import mcp_server + + assert "mempalace_list_hallways" in mcp_server.TOOLS + assert "mempalace_delete_hallway" in mcp_server.TOOLS + assert ( + 
mcp_server.TOOLS["mempalace_list_hallways"]["handler"] is mcp_server.tool_list_hallways + ) + assert ( + mcp_server.TOOLS["mempalace_delete_hallway"]["handler"] + is mcp_server.tool_delete_hallway + ) + def test_add_drawer_normal_content_single_drawer(self, monkeypatch, config, palace_path, kg): """Regression catch: content below CHUNK_SIZE produces exactly one drawer with ``chunks == 1``. Pre-#1539 contract preserved.""" @@ -1634,35 +1845,88 @@ def test_add_drawer_boundary_exact_chunk_size_stays_single( assert result["chunks"] == 1 assert "chunk_ids" not in result - def test_add_drawer_chunked_logical_id_not_fetchable_directly( - self, monkeypatch, config, palace_path, kg - ): - """Documented contract on the chunked path: ``tool_get_drawer`` - and ``tool_delete_drawer`` against the returned logical - ``drawer_id`` report ``not found`` because no row is stored - under that id. Callers must iterate ``chunk_ids`` or query by - ``parent_drawer_id`` metadata.""" - _patch_mcp_server(monkeypatch, config, kg) - _client, _col = _get_collection(palace_path, create=True) - del _client - from mempalace.mcp_server import tool_add_drawer, tool_delete_drawer, tool_get_drawer - result = tool_add_drawer(wing="w", room="r", content="P" * 4000) - assert result["success"] is True and result["chunks"] > 1 +def test_add_drawer_chunked_logical_id_fetches_deletes_and_lists_as_one( + monkeypatch, config, palace_path, kg +): + """Chunk rows are internal storage; MCP tools operate on the logical id.""" + _patch_mcp_server(monkeypatch, config, kg) + _client, _col = _get_collection(palace_path, create=True) + del _client + + from mempalace.mcp_server import ( + tool_add_drawer, + tool_delete_drawer, + tool_get_drawer, + tool_list_drawers, + ) + + result = tool_add_drawer(wing="w", room="r", content="P" * 4000) + + assert result["success"] is True + assert result["chunks"] > 1 + + logical_id = result["drawer_id"] + + fetched = tool_get_drawer(logical_id) + assert fetched["drawer_id"] == 
logical_id + assert fetched["content"] == "P" * 4000 + assert fetched["chunks"] == result["chunks"] + assert fetched["chunk_ids"] == result["chunk_ids"] - # tool_get_drawer against logical id: not found. - got_logical = tool_get_drawer(result["drawer_id"]) - assert "error" in got_logical and "not found" in got_logical["error"].lower() + listed = tool_list_drawers(wing="w", room="r") + assert listed["total"] == 1 + assert listed["count"] == 1 + assert listed["drawers"][0]["drawer_id"] == logical_id + assert listed["drawers"][0]["chunks"] == result["chunks"] - # tool_get_drawer against the first chunk id: found, full content slice. - got_chunk = tool_get_drawer(result["chunk_ids"][0]) - assert got_chunk["content"] == "P" * config.chunk_size - assert got_chunk["metadata"]["parent_drawer_id"] == result["drawer_id"] + deleted = tool_delete_drawer(logical_id) + assert deleted["success"] is True + assert deleted["chunks_deleted"] == result["chunks"] - # tool_delete_drawer against logical id: also not found. 
- deleted_logical = tool_delete_drawer(result["drawer_id"]) - assert deleted_logical["success"] is False - assert "not found" in deleted_logical["error"].lower() + missing = tool_get_drawer(logical_id) + assert "error" in missing + assert "not found" in missing["error"].lower() + + +def test_update_drawer_chunked_logical_id_rewrites_group(monkeypatch, config, palace_path, kg): + """Updating the returned logical id rewrites the underlying chunk group.""" + _patch_mcp_server(monkeypatch, config, kg) + _client, _col = _get_collection(palace_path, create=True) + del _client + + from mempalace.mcp_server import ( + tool_add_drawer, + tool_get_drawer, + tool_list_drawers, + tool_update_drawer, + ) + + result = tool_add_drawer(wing="old", room="old_room", content="A" * 2600) + assert result["success"] is True + assert result["chunks"] > 1 + + logical_id = result["drawer_id"] + + updated = tool_update_drawer( + logical_id, + content="B" * 1800, + wing="new", + room="new_room", + ) + + assert updated["success"] is True + assert updated["drawer_id"] == logical_id + + fetched = tool_get_drawer(logical_id) + assert fetched["drawer_id"] == logical_id + assert fetched["content"] == "B" * 1800 + assert fetched["wing"] == "new" + assert fetched["room"] == "new_room" + + listed = tool_list_drawers(wing="new", room="new_room") + assert listed["total"] == 1 + assert listed["drawers"][0]["drawer_id"] == logical_id # ── KG Tools ──────────────────────────────────────────────────────────── @@ -2338,10 +2602,20 @@ def test_missing_db_invalidates_cache(self, monkeypatch, config, palace_path, kg if os.path.isfile(db_file): os.remove(db_file) + make_client_calls = [] + + def fail_if_make_client_called(path): + make_client_calls.append(path) + raise AssertionError("_get_collection(create=False) should not open missing Chroma DB") + + monkeypatch.setattr(mcp_server.ChromaBackend, "make_client", fail_if_make_client_called) + # Cache should be invalidated; _get_collection returns None # because 
the backend can't open a missing DB without create=True - mcp_server._get_collection() + assert mcp_server._get_collection() is None # The key assertion: the old cached collection was dropped + assert make_client_calls == [] + assert mcp_server._collection_cache is None assert mcp_server._palace_db_inode == 0 assert mcp_server._palace_db_mtime == 0.0 diff --git a/tests/test_migrate.py b/tests/test_migrate.py index 5d9255a9f..1e0259ba1 100644 --- a/tests/test_migrate.py +++ b/tests/test_migrate.py @@ -286,3 +286,52 @@ def tracking_mkdtemp(*args, **kwargs): assert captured_temp_paths, "mkdtemp was never called — flow short-circuited" for p in captured_temp_paths: assert not os.path.exists(p), f"temp palace was not cleaned up: {p}" + + +def test_migrate_prunes_old_pre_migrate_backups(tmp_path, monkeypatch): + """Repeated migrations must not accumulate full-palace copies forever. + + The backup + prune happen right after copytree, before the (mocked) + chromadb step, so even a migration that fails afterward still trims the + backup set. We let copytree run for real so the fresh backup exists on + disk for the prune to evaluate. + """ + palace_dir = tmp_path / "palace" + palace_dir.mkdir() + (palace_dir / "chroma.sqlite3").write_text("db") + + # Pre-seed 3 stale .pre-migrate.* sibling dirs with old mtimes. 
+ for i in range(3): + stale = tmp_path / f"palace.pre-migrate.2026010{i}_000000" + stale.mkdir() + (stale / "chroma.sqlite3").write_text("old") + os.utime(stale, (1_700_000_000 + i, 1_700_000_000 + i)) + + monkeypatch.setenv("MEMPALACE_MAX_BACKUPS", "2") + + failing_backend = MagicMock() + failing_backend.get_collection.side_effect = Exception("unreadable") + failing_backend.get_or_create_collection.side_effect = RuntimeError("chromadb boom") + + import mempalace.backends.chroma as _chroma_mod + + with ( + patch("mempalace.migrate.detect_chromadb_version", return_value="0.5.x"), + patch( + "mempalace.migrate.extract_drawers_from_sqlite", + return_value=[{"id": "id1", "document": "doc", "metadata": {"wing": "w", "room": "r"}}], + ), + patch("builtins.input", return_value="y"), + patch.object(_chroma_mod, "ChromaBackend", return_value=failing_backend), + ): + try: + migrate(str(palace_dir), confirm=True) + except Exception: + pass + + backups = sorted(p.name for p in tmp_path.glob("palace.pre-migrate.*")) + # 3 stale + 1 fresh = 4 created; retention keeps only the 2 newest. + assert len(backups) == 2 + # The two oldest stale backups must be gone. 
+ assert "palace.pre-migrate.20260100_000000" not in backups + assert "palace.pre-migrate.20260101_000000" not in backups diff --git a/tests/test_mine_lock_lifecycle.py b/tests/test_mine_lock_lifecycle.py new file mode 100644 index 000000000..7b6f52970 --- /dev/null +++ b/tests/test_mine_lock_lifecycle.py @@ -0,0 +1,225 @@ +from __future__ import annotations + +import multiprocessing +import os +import time +from pathlib import Path + +import pytest + +import mempalace.palace as palace_module +from mempalace.palace import ( + _lock_mine_lock_file, + _mine_lock_path, + _open_mine_lock_file, + _unlock_mine_lock_file, + mine_lock, +) + + +def _set_home(monkeypatch, tmp_path: Path) -> None: + monkeypatch.setenv("HOME", str(tmp_path)) + monkeypatch.setenv("USERPROFILE", str(tmp_path)) + + +def _wait_for_path(path: Path, timeout: float = 10.0) -> bool: + deadline = time.monotonic() + timeout + while time.monotonic() < deadline: + if path.exists(): + return True + time.sleep(0.01) + return path.exists() + + +def _assert_path_absent_for(path: Path, duration: float = 0.5) -> None: + deadline = time.monotonic() + duration + while time.monotonic() < deadline: + assert not path.exists(), "waiter entered while replacement lock was held" + time.sleep(0.01) + assert not path.exists(), "waiter entered while replacement lock was held" + + +def _stale_waiter_target( + lock_path: str, + source_file: str, + opened_flag: str, + entered_flag: str, + release_flag: str, + result_q, +) -> None: + try: + from mempalace.palace import ( + _acquire_open_mine_lock_file as acquire_open, + _open_mine_lock_file as open_lock, + _unlock_mine_lock_file as unlock_file, + mine_lock as public_mine_lock, + ) + + lf = open_lock(lock_path, create=True) + Path(opened_flag).touch() + current = acquire_open(lf, lock_path) + result_q.put(("first-acquire-current", current)) + if current: + Path(entered_flag).touch() + _wait_for_path(Path(release_flag)) + unlock_file(lf) + lf.close() + result_q.put(("done", 
True)) + return + + lf.close() + result_q.put(("retrying", True)) + with public_mine_lock(source_file): + Path(entered_flag).touch() + _wait_for_path(Path(release_flag)) + result_q.put(("done", True)) + except BaseException as exc: # pragma: no cover - surfaced through queue + result_q.put(("error", repr(exc))) + + +def test_mine_lock_removes_uncontended_lock_file(tmp_path, monkeypatch): + _set_home(monkeypatch, tmp_path) + source_file = str(tmp_path / "source.txt") + lock_path = Path(_mine_lock_path(source_file)) + + with mine_lock(source_file): + assert lock_path.exists() + + assert not lock_path.exists() + + with mine_lock(source_file): + assert lock_path.exists() + + assert not lock_path.exists() + + +def test_mine_lock_close_failure_still_runs_cleanup(monkeypatch): + events = [] + + class FakeLock: + def close(self): + events.append("close") + raise OSError("close failed") + + fake_lock = FakeLock() + monkeypatch.setattr(palace_module, "_mine_lock_path", lambda source_file: "source.lock") + monkeypatch.setattr(palace_module, "_acquire_mine_lock_file", lambda lock_path: fake_lock) + monkeypatch.setattr( + palace_module, "_unlock_mine_lock_file", lambda lock_file: events.append("unlock") + ) + monkeypatch.setattr( + palace_module, + "_cleanup_mine_lock_file", + lambda lock_path: events.append(("cleanup", lock_path)), + ) + + with palace_module.mine_lock("source.txt"): + events.append("body") + + assert events == ["body", "unlock", "close", ("cleanup", "source.lock")] + + +def test_windows_cleanup_release_failure_does_not_retry_unlock(monkeypatch): + events = [] + + class FakeLock: + def close(self): + events.append("close") + + fake_lock = FakeLock() + + monkeypatch.setattr(palace_module.os, "name", "nt", raising=False) + monkeypatch.setattr( + palace_module, "_open_mine_lock_file", lambda lock_path, *, create: fake_lock + ) + monkeypatch.setattr(palace_module, "_lock_mine_lock_file", lambda lock_file, *, blocking: True) + monkeypatch.setattr( + palace_module, 
"_mine_lock_file_is_current", lambda lock_file, lock_path: True + ) + + def fail_unlock(lock_file): + events.append("unlock") + raise OSError("unlock failed") + + monkeypatch.setattr(palace_module, "_unlock_mine_lock_file", fail_unlock) + + palace_module._cleanup_mine_lock_file("source.lock") + + assert events == ["unlock", "close"] + + +@pytest.mark.skipif(os.name == "nt", reason="POSIX inode replacement regression") +def test_mine_lock_retries_when_waiter_wakes_on_unlinked_inode(tmp_path, monkeypatch): + """A waiter on an unlinked lock inode must not enter the critical section. + + This models the race from issue #1800: process A removes the path after + release while process B was already waiting on the old inode and process C + has locked a replacement path. B must reject the stale inode and retry. + """ + _set_home(monkeypatch, tmp_path) + source_file = str(tmp_path / "source.txt") + lock_path = Path(_mine_lock_path(source_file)) + + old_lf = _open_mine_lock_file(str(lock_path), create=True) + replacement_lf = None + child = None + try: + assert _lock_mine_lock_file(old_lf, blocking=False) + + opened_flag = tmp_path / "opened" + entered_flag = tmp_path / "entered" + release_flag = tmp_path / "release" + ctx = multiprocessing.get_context("spawn") + result_q = ctx.Queue() + child = ctx.Process( + target=_stale_waiter_target, + args=( + str(lock_path), + source_file, + str(opened_flag), + str(entered_flag), + str(release_flag), + result_q, + ), + ) + child.start() + assert _wait_for_path(opened_flag), "waiter did not open the original lock file" + + os.remove(lock_path) + replacement_lf = _open_mine_lock_file(str(lock_path), create=True) + assert _lock_mine_lock_file(replacement_lf, blocking=False) + + _unlock_mine_lock_file(old_lf) + old_lf.close() + old_lf = None + + assert result_q.get(timeout=10) == ("first-acquire-current", False) + assert result_q.get(timeout=10) == ("retrying", True) + _assert_path_absent_for(entered_flag) + + 
_unlock_mine_lock_file(replacement_lf) + replacement_lf.close() + replacement_lf = None + + assert _wait_for_path(entered_flag), "waiter did not retry on the replacement path" + release_flag.touch() + assert result_q.get(timeout=10) == ("done", True) + child.join(timeout=10) + assert child.exitcode == 0 + assert not lock_path.exists() + finally: + if child is not None and child.is_alive(): + child.terminate() + child.join(timeout=5) + if replacement_lf is not None: + try: + _unlock_mine_lock_file(replacement_lf) + except Exception: + pass + replacement_lf.close() + if old_lf is not None: + try: + _unlock_mine_lock_file(old_lf) + except Exception: + pass + old_lf.close() diff --git a/tests/test_miner.py b/tests/test_miner.py index 15d5931d6..34ceff6dc 100644 --- a/tests/test_miner.py +++ b/tests/test_miner.py @@ -2185,3 +2185,129 @@ def get(self, where=None, limit=None, offset=0, include=None): "must iterate all groups for the source_file (mirroring the existing " "paginated pattern in the extract_mode-is-set branch)." 
) + + +# ── --limit skips already-mined files (#1535) ────────────────────────── + + +def test_mine_limit_skips_already_mined_files(tmp_path, capsys): + """--limit N should count only NEW work, not already-mined skips (#1535).""" + from unittest.mock import patch + + project_root = tmp_path / "proj" + project_root.mkdir() + _make_minable_project(project_root, n_files=10) + palace_path = project_root / "palace" + + call_count = 0 + + def fake_process_file(*args, **kwargs): + nonlocal call_count + call_count += 1 + if call_count <= 8: + return (0, "general", None) + return (3, "general", None) + + with patch("mempalace.miner.process_file", side_effect=fake_process_file): + mine(str(project_root), str(palace_path), limit=5) + + out = capsys.readouterr().out + assert "Drawers filed: 6" in out + assert call_count == 10 + + +def test_mine_limit_stops_after_n_new_files(tmp_path, capsys): + """--limit 3 on 5 unmined files mines exactly 3 and stops.""" + from unittest.mock import patch + + project_root = tmp_path / "proj" + project_root.mkdir() + _make_minable_project(project_root, n_files=5) + palace_path = project_root / "palace" + + call_count = 0 + + def fake_process_file(*args, **kwargs): + nonlocal call_count + call_count += 1 + return (2, "general", None) + + with patch("mempalace.miner.process_file", side_effect=fake_process_file): + mine(str(project_root), str(palace_path), limit=3) + + assert call_count == 3 + out = capsys.readouterr().out + assert "Drawers filed: 6" in out + + +def test_mine_limit_zero_mines_all(tmp_path, capsys): + """--limit 0 (default) processes every file.""" + from unittest.mock import patch + + project_root = tmp_path / "proj" + project_root.mkdir() + _make_minable_project(project_root, n_files=4) + palace_path = project_root / "palace" + + call_count = 0 + + def fake_process_file(*args, **kwargs): + nonlocal call_count + call_count += 1 + return (1, "general", None) + + with patch("mempalace.miner.process_file", 
side_effect=fake_process_file): + mine(str(project_root), str(palace_path), limit=0) + + assert call_count == 4 + out = capsys.readouterr().out + assert "Drawers filed: 4" in out + + +def test_mine_limit_dry_run(tmp_path, capsys): + """--dry-run --limit N counts new files toward the limit.""" + from unittest.mock import patch + + project_root = tmp_path / "proj" + project_root.mkdir() + _make_minable_project(project_root, n_files=5) + palace_path = project_root / "palace" + + call_count = 0 + + def fake_process_file(*args, **kwargs): + nonlocal call_count + call_count += 1 + return (2, "general", None) + + with patch("mempalace.miner.process_file", side_effect=fake_process_file): + mine(str(project_root), str(palace_path), limit=3, dry_run=True) + + assert call_count == 3 + + +def test_mine_limit_summary_counts(tmp_path, capsys): + """Summary arithmetic is correct when limit causes early exit.""" + from unittest.mock import patch + + project_root = tmp_path / "proj" + project_root.mkdir() + _make_minable_project(project_root, n_files=8) + palace_path = project_root / "palace" + + call_idx = 0 + + def fake_process_file(*args, **kwargs): + nonlocal call_idx + call_idx += 1 + if call_idx % 2 == 0: + return (0, "general", None) + return (3, "general", None) + + with patch("mempalace.miner.process_file", side_effect=fake_process_file): + mine(str(project_root), str(palace_path), limit=2) + + out = capsys.readouterr().out + assert "Files processed: 2" in out + assert "Drawers filed: 6" in out + assert "(limit: 2 new)" in out diff --git a/tests/test_miner_fts5_validation.py b/tests/test_miner_fts5_validation.py index 7442159c1..2b085a594 100644 --- a/tests/test_miner_fts5_validation.py +++ b/tests/test_miner_fts5_validation.py @@ -87,10 +87,18 @@ def _corrupt_fts5_segment(sqlite_path: Path) -> None: pytest.skip("FTS5 segments empty: cannot fabricate FTS5-only corruption") target = next((r for r in rows if r[0] > 10), rows[0]) garbage = b"\xde\xad\xbe\xef" * 
(len(target[1]) // 4) - conn.execute( - "UPDATE embedding_fulltext_search_data SET block=? WHERE id=?", - (garbage, target[0]), - ) + try: + conn.execute( + "UPDATE embedding_fulltext_search_data SET block=? WHERE id=?", + (garbage, target[0]), + ) + except sqlite3.OperationalError as exc: + if "may not be modified" in str(exc): + pytest.skip( + "this SQLite build refuses direct FTS5 shadow-table writes; " + "cannot fabricate FTS5-only corruption" + ) + raise conn.commit() diff --git a/tests/test_pgvector_backend.py b/tests/test_pgvector_backend.py index 6c16ce209..a22df2ee0 100644 --- a/tests/test_pgvector_backend.py +++ b/tests/test_pgvector_backend.py @@ -1,4 +1,7 @@ import os +import sys +import threading +import types import pytest @@ -14,6 +17,8 @@ ) from mempalace.backends.pgvector import ( PgVectorBackend, + _PgVectorClient, + _PgVectorConfig, _matches_where, _vector_distance, _as_vector_array, @@ -509,3 +514,140 @@ def test_pgvector_live_roundtrip_when_enabled(tmp_path): except Exception: pass backend.close() + + +def test_client_concurrent_first_connect_single_connection(monkeypatch): + """Two threads racing ``_execute`` through the first ``_connect`` must end + up on one shared connection. + + The barrier inside the fake ``psycopg.connect`` releases immediately only + when both threads pass the ``self._conn is None`` check together: the + broken interleaving, which created two connections, leaked the loser, and + ran the threads on different connections. With ``_connect`` under + ``self._lock`` the second thread blocks on the lock, the winner's barrier + times out, and the loser reuses the winner's connection. 
+ """ + created = [] + barrier = threading.Barrier(2) + + class _FakeCursor: + def __enter__(self): + return self + + def __exit__(self, *exc): + return False + + def execute(self, sql, params=None): + return None + + def executemany(self, sql, params=None): + return None + + def fetchall(self): + return [(1,)] + + class _FakeConn: + def __init__(self): + self.closed = False + + def cursor(self): + return _FakeCursor() + + def commit(self): + return None + + def rollback(self): + return None + + def close(self): + self.closed = True + + fake_psycopg = types.ModuleType("psycopg") + + def racing_connect(dsn): + try: + barrier.wait(timeout=1.0) + except threading.BrokenBarrierError: + pass + conn = _FakeConn() + created.append(conn) + return conn + + fake_psycopg.connect = racing_connect + monkeypatch.setitem(sys.modules, "psycopg", fake_psycopg) + + client = _PgVectorClient(_PgVectorConfig(dsn="postgresql://localhost/unused", namespace=None)) + errors = [] + + def run_query(): + try: + client.ping() + except Exception as exc: + errors.append(exc) + + threads = [threading.Thread(target=run_query, daemon=True) for _ in range(2)] + for t in threads: + t.start() + for t in threads: + t.join(timeout=30) + + assert not any(t.is_alive() for t in threads) + assert errors == [] + assert len(created) == 1 + assert client._conn is created[0] + + client.close() + assert created[0].closed + + +def test_client_execute_after_close_raises(monkeypatch): + """``close()`` is terminal: a stale client reference must get an error + instead of silently reconnecting and leaking a session nobody closes.""" + created = [] + + class _FakeCursor: + def __enter__(self): + return self + + def __exit__(self, *exc): + return False + + def execute(self, sql, params=None): + return None + + def fetchall(self): + return [(1,)] + + class _FakeConn: + def __init__(self): + self.closed = False + + def cursor(self): + return _FakeCursor() + + def commit(self): + return None + + def close(self): + 
self.closed = True + + fake_psycopg = types.ModuleType("psycopg") + + def fake_connect(dsn): + conn = _FakeConn() + created.append(conn) + return conn + + fake_psycopg.connect = fake_connect + monkeypatch.setitem(sys.modules, "psycopg", fake_psycopg) + + client = _PgVectorClient(_PgVectorConfig(dsn="postgresql://localhost/unused", namespace=None)) + client.ping() + assert len(created) == 1 + + client.close() + assert created[0].closed + + with pytest.raises(BackendError, match="closed"): + client.ping() + assert len(created) == 1 diff --git a/tests/test_repair.py b/tests/test_repair.py index 981351ef8..8824dcea4 100644 --- a/tests/test_repair.py +++ b/tests/test_repair.py @@ -1240,6 +1240,57 @@ def test_max_seq_id_backup_created(tmp_path): assert rows[seg["drawers_meta"]] == seg["poisoned_values"][seg["drawers_meta"]] +def test_max_seq_id_backup_pruned_to_max_backups(tmp_path, monkeypatch): + """Old max-seq-id backups beyond MEMPALACE_MAX_BACKUPS are pruned after a repair. + + Without retention, every repair left a full chroma.sqlite3 copy behind + that was never cleaned up — the unbounded disk-growth bug this guards. + """ + palace = str(tmp_path / "palace") + _seed_poisoned_max_seq_id(palace) + + # Pre-seed 4 stale backups with old mtimes so the just-created one is + # unambiguously the newest. + for i in range(4): + stale = os.path.join(palace, f"chroma.sqlite3.max-seq-id-backup-2026010{i}-000000") + with open(stale, "w") as f: + f.write("old") + os.utime(stale, (1_700_000_000 + i, 1_700_000_000 + i)) + + monkeypatch.setenv("MEMPALACE_MAX_BACKUPS", "2") + + result = repair.repair_max_seq_id(palace, assume_yes=True) + + backups = sorted( + fn for fn in os.listdir(palace) if fn.startswith("chroma.sqlite3.max-seq-id-backup-") + ) + # 4 stale + 1 fresh = 5 written; retention keeps only the 2 newest. + assert len(backups) == 2 + # The backup created by this repair must be one of the survivors. 
+ assert os.path.basename(result["backup"]) in backups + + +def test_max_seq_id_backup_retained_when_pruning_disabled(tmp_path, monkeypatch): + """max_backups=0 keeps every backup (opt-out for external retention).""" + palace = str(tmp_path / "palace") + _seed_poisoned_max_seq_id(palace) + + for i in range(3): + stale = os.path.join(palace, f"chroma.sqlite3.max-seq-id-backup-2026010{i}-000000") + with open(stale, "w") as f: + f.write("old") + os.utime(stale, (1_700_000_000 + i, 1_700_000_000 + i)) + + monkeypatch.setenv("MEMPALACE_MAX_BACKUPS", "0") + + repair.repair_max_seq_id(palace, assume_yes=True) + + backups = [ + fn for fn in os.listdir(palace) if fn.startswith("chroma.sqlite3.max-seq-id-backup-") + ] + assert len(backups) == 4 + + def test_max_seq_id_rollback_on_verification_failure(tmp_path, monkeypatch): """If the post-update detector still sees poison, raise and leave a backup.""" palace = str(tmp_path / "palace") diff --git a/tests/test_searcher.py b/tests/test_searcher.py index 721bb117e..236d5b98d 100644 --- a/tests/test_searcher.py +++ b/tests/test_searcher.py @@ -343,8 +343,10 @@ def test_search_applies_bm25_hybrid_rerank(self, fake_palace_path, capsys): # Non-zero bm25 reported assert "bm25=" in first_block assert "bm25=0.0" not in first_block - # Cosine still reported for transparency - assert "cosine=" in first_block + # Metric-labeled vector similarity still reported for transparency. + # Label is now "_sim=" (honest about the backend's metric) + # rather than a hard-coded "cosine=". 
+ assert "cosine_sim=" in first_block def test_search_warns_when_palace_uses_wrong_distance_metric(self, fake_palace_path, capsys): """Legacy palaces created without `hnsw:space=cosine` silently diff --git a/tests/test_sqlite_exact_backend.py b/tests/test_sqlite_exact_backend.py index b5b953e1a..82322d35d 100644 --- a/tests/test_sqlite_exact_backend.py +++ b/tests/test_sqlite_exact_backend.py @@ -1,7 +1,10 @@ import math +import sqlite3 +import threading import pytest +import mempalace.backends.sqlite_exact as sqlite_exact_module from mempalace.backends import ( BackendMismatchError, CollectionNotInitializedError, @@ -375,3 +378,58 @@ def test_search_vector_disabled_fallback_is_chroma_only(tmp_path, monkeypatch): assert result["unsupported_capability"] == "chroma_hnsw_fallback" assert result["backend"] == "sqlite_exact" + + +def test_concurrent_first_open_single_connection_no_leak(tmp_path, monkeypatch): + """Two threads first-opening the same palace concurrently must share one + handle and one sqlite connection. + + The barrier inside the patched ``sqlite3.connect`` releases immediately + only when both threads pass the cache-miss check together: the broken + interleaving, which also ran ``_init_schema`` concurrently on a fresh + file and surfaced "database is locked". With creation serialized under + ``_clients_lock`` the second thread waits on the lock instead, the + winner's barrier times out, and exactly one connection is ever created. 
+ """ + created = [] + barrier = threading.Barrier(2) + real_connect = sqlite3.connect + + def racing_connect(*args, **kwargs): + try: + barrier.wait(timeout=1.0) + except threading.BrokenBarrierError: + pass + conn = real_connect(*args, **kwargs) + created.append(conn) + return conn + + monkeypatch.setattr(sqlite_exact_module.sqlite3, "connect", racing_connect) + + backend = SQLiteExactBackend() + palace = PalaceRef(id=str(tmp_path), local_path=str(tmp_path)) + results = [None, None] + errors = [] + + def open_collection(i): + try: + results[i] = backend.get_collection( + palace=palace, collection_name="drawers", create=True + ) + except Exception as exc: + errors.append(exc) + + threads = [threading.Thread(target=open_collection, args=(i,), daemon=True) for i in range(2)] + for t in threads: + t.start() + for t in threads: + t.join(timeout=30) + + assert not any(t.is_alive() for t in threads) + assert errors == [] + assert len(created) == 1 + assert results[0]._handle is results[1]._handle + + backend.close() + with pytest.raises(sqlite3.ProgrammingError): + created[0].execute("SELECT 1") diff --git a/uv.lock b/uv.lock index 2a5ce27b3..17efd938c 100644 --- a/uv.lock +++ b/uv.lock @@ -1951,7 +1951,7 @@ wheels = [ [[package]] name = "mempalace" -version = "3.3.6" +version = "3.4.1" source = { editable = "." 
} dependencies = [ { name = "chromadb" }, diff --git a/website/.vitepress/config.mts b/website/.vitepress/config.mts index 6f01b4024..c44ba0550 100644 --- a/website/.vitepress/config.mts +++ b/website/.vitepress/config.mts @@ -57,9 +57,11 @@ export default withMermaid( { text: 'Claude Code Plugin', link: '/guide/claude-code' }, { text: 'Claude Code Retention', link: '/guide/claude-code-retention' }, { text: 'Gemini CLI', link: '/guide/gemini-cli' }, + { text: 'Antigravity Plugin', link: '/guide/antigravity' }, { text: 'OpenClaw Skill', link: '/guide/openclaw' }, { text: 'Local Models', link: '/guide/local-models' }, { text: 'Auto-Save Hooks', link: '/guide/hooks' }, + { text: 'Cursor IDE Hooks', link: '/guide/cursor-hooks' }, { text: 'Configuration', link: '/guide/configuration' }, ], }, diff --git a/website/guide/antigravity.md b/website/guide/antigravity.md new file mode 100644 index 000000000..08ced5a68 --- /dev/null +++ b/website/guide/antigravity.md @@ -0,0 +1,283 @@ +# Antigravity Plugin + +MemPalace ships first-class support for Google's +[Antigravity IDE](https://antigravity.google/) as an installable +plugin. The plugin registers MemPalace's MCP server, ships the +`mempalace` skill, and wires two lifecycle hooks (Stop and +PreInvocation) for background mining and startup memory injection. + +## What gets registered + +| Surface | Antigravity component | +|-----------------|------------------------------------------------------------| +| MCP server | `mempalace` (stdio, runs `mempalace-mcp`) | +| Skill | `mempalace` (in-plugin `skills/mempalace/SKILL.md`) | +| Stop hook | `mempalace-save` — background-mines the conversation | +| PreInvocation | `mempalace-wake` — injects memory on the first model call | + +The full audit of which Antigravity surfaces we use, why, and what we +deliberately do not ship is in [`hooks/antigravity/INVESTIGATION.md`](https://github.com/MemPalace/mempalace/blob/main/hooks/antigravity/INVESTIGATION.md). 
+ +## Prerequisites + +- Python 3.9+ +- [`mempalace`](https://github.com/MemPalace/mempalace) installed and + on `$PATH` (`mempalace --version` to verify) +- [Antigravity IDE](https://antigravity.google/) installed (`~/.gemini/` + exists) + +## Install + +From the cloned `mempalace` repo: + +```bash +bash hooks/antigravity/install.sh +``` + +This installs to `~/.gemini/config/plugins/mempalace/`. Restart +Antigravity and the plugin loads automatically — you'll see +`mempalace` in the MCP store and the skill list. + +### Dry run first + +```bash +bash hooks/antigravity/install.sh --dry-run +``` + +### Custom install dir (workspace-scoped) + +```bash +bash hooks/antigravity/install.sh \ + --install-dir /.agents/plugins/mempalace +``` + +The installer absolutizes any relative path before baking it into the +rendered `hooks.json`, so the resulting plugin is portable to any +working directory. + +### Idempotency + +Re-running the installer produces a byte-identical install — +`cmp`-gated copies skip files whose contents already match. Safe to +run from CI. + +### Uninstall + +```bash +bash hooks/antigravity/install.sh --uninstall +``` + +The uninstaller has two safety guards: + +1. The basename of `--install-dir` must be exactly `mempalace`. +2. The directory must contain a `plugin.json` whose `name` is + `"mempalace"`. + +This prevents an accidental wipe of an unrelated directory if the +install dir is ever misconfigured. + +## How the hooks behave + +### Stop hook (`mempalace-save`) + +Fires every time the agent's execution loop terminates. Counts each +fire per-conversation; on every Nth fire (default 15, configurable +via `MEMPAL_SAVE_INTERVAL`), it spawns +`mempalace mine --mode convos` in the background. + +Defers when: + +- `fullyIdle == false` — background commands are still running, the + transcript is in motion. Try again on the next Stop fire. +- `terminationReason == "error"` — the transcript may be corrupt. 
+- A previous save for this conversation is still running. +- Any kill switch (see below) is set. + +The hook **always** returns `{}` to stdout — never +`{"decision": "continue"}`, which would force the agent into an +infinite re-execution loop. + +### PreInvocation hook (`mempalace-wake`) + +Fires before every model call, but is gated to `invocationNum == 1` +so memory only gets injected once per conversation (mimicking +Cursor's `sessionStart` semantics). + +When the gate passes, runs `mempalace wake-up --wing ` with +a 500ms hard timeout and emits the verbatim output as an +`ephemeralMessage`. The injection lives for one turn only and never +persists into the transcript. + +The wing is inferred from `workspacePaths[0]` (the first absolute +workspace path). If you have a multi-workspace conversation, the +first workspace wins. + +## Kill switches + +Any one of these silently disables both hooks: + +| Knob | Value | +|-------------------------------------|--------------------------------------| +| `MEMPAL_DISABLE_HOOK` | `1`, `true`, `yes` | +| `MEMPALACE_HOOKS_AUTO_SAVE` | `false`, `0`, `no` | +| `~/.mempalace/config.json` | `{ "hooks": { "auto_save": false }}` | +| (remove `~/.mempalace/` entirely) | palace nuke = no-op hooks | + +Each kill switch results in `{}` on stdout and exit 0 — the hook +becomes a no-op without removing the plugin. + +## Performance budget + +- The hook scripts are designed to return in under 100ms when the + kill switch trips or any gate fails. +- The Stop hook spawns mining in a detached background subprocess + (`nohup ... &`) so the hook itself returns immediately while the + mining proceeds. +- The PreInvocation hook enforces a 500ms hard cap on + `mempalace wake-up`. If the call doesn't return in time, the hook + emits `{}` and the conversation starts without injection rather + than blocking the user. 
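
The Stop-hook gating and kill switches described above can be sketched in a few lines. This is an illustration only, not the shipped bash hook: the function names `kill_switch_set`, `should_mine`, and `hook_stdout` are hypothetical, case-insensitive matching of the environment values is an assumption, and the "palace directory removed" switch is not modelled.

```python
import json

# Values taken from the kill-switch table; lower-casing them is an assumption.
DISABLE_VALUES = {"1", "true", "yes"}
AUTO_SAVE_OFF_VALUES = {"false", "0", "no"}


def kill_switch_set(env, config):
    """Return True if any of the documented env/config kill switches is active."""
    if env.get("MEMPAL_DISABLE_HOOK", "").strip().lower() in DISABLE_VALUES:
        return True
    if env.get("MEMPALACE_HOOKS_AUTO_SAVE", "").strip().lower() in AUTO_SAVE_OFF_VALUES:
        return True
    # Mirrors ~/.mempalace/config.json: {"hooks": {"auto_save": false}}
    return config.get("hooks", {}).get("auto_save", True) is False


def should_mine(payload, fire_count, interval, env, config, save_running=False):
    """Decide whether this Stop fire should spawn a background mine."""
    if kill_switch_set(env, config):
        return False
    if payload.get("fullyIdle") is False:
        return False  # background commands still running; retry on a later fire
    if payload.get("terminationReason") == "error":
        return False  # transcript may be corrupt
    if save_running:
        return False  # a previous save for this conversation is still running
    return fire_count % interval == 0  # only every Nth fire triggers mining


def hook_stdout():
    """The hook always prints an empty JSON object, whatever was decided."""
    return json.dumps({})
```

Keeping the decision separate from the output makes the invariant easy to see: whatever `should_mine` returns, stdout is `{}`, so the agent is never pushed into a re-execution loop.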
+ +## How the hooks find your `mempalace` install + +The hooks run `mempalace` as `python -m mempalace`, so they need a +Python interpreter that can actually import the package. In almost +every case this is resolved **automatically** — you should not need to +configure anything. The resolution order is: + +1. **`MEMPAL_PYTHON`** — an explicit interpreter path you export + (escape hatch; see below). +2. **The `mempalace-mcp` / `mempalace` console-script shebang.** When + you install via `uv tool install mempalace` or `pipx install`, the + package lives in an *isolated* environment whose interpreter is + **not** your system `python3`. The hooks read the shebang line of + the console script already on your `PATH` (the same one the MCP + server launches) to find that exact interpreter. This is what makes + the common install paths work with zero configuration. +3. **`python3` on `PATH`** — covers an activated virtualenv or an + editable (`pip install -e .`) dev checkout. +4. A bare `python3` fallback. + +### When you might need `MEMPAL_PYTHON` + +You only need to set it if the hooks can't otherwise reach a Python +with `mempalace` importable — for example, an unusual install layout, +or a wrapper interpreter the shebang heuristic can't follow. Point it +at the interpreter that owns the package: + +```bash +# uv tool install: the interpreter lives under `uv tool dir` +export MEMPAL_PYTHON="$(uv tool dir)/mempalace/bin/python" + +# or a project virtualenv +export MEMPAL_PYTHON=/path/to/.venv/bin/python +``` + +Add the line to your `~/.zshrc` / `~/.bashrc` so a GUI-launched +Antigravity (which may not inherit your interactive shell `PATH`) +picks it up. 
Verify with: + +```bash +"$MEMPAL_PYTHON" -m mempalace --version +``` + +## Verifying installation + +```bash +ls ~/.gemini/config/plugins/mempalace/ +# expect: README.md hooks/ hooks.json mcp_config.json plugin.json skills/ + +cat ~/.gemini/config/plugins/mempalace/hooks.json +# absolute paths to the two hook scripts + +mempalace-mcp --version +# binary on PATH + +bash -n ~/.gemini/config/plugins/mempalace/hooks/*.sh +# no syntax errors +``` + +After restarting Antigravity: + +1. The MCP store should list `mempalace` as a registered server. +2. Starting a fresh conversation should fire the wake hook — check + `~/.mempalace/hook_state/antigravity_hook.log` for an + `[event=preInvocation]` line. +3. Ending a turn should fire the save hook — same log. + +## Troubleshooting + +### "MCP server `mempalace` not found" + +The plugin file is in place but the binary isn't on `$PATH`: + +```bash +mempalace-mcp --version +# command not found? +``` + +Install via uv (recommended) or pip: + +```bash +uv tool install mempalace +# or +pip install mempalace +``` + +### Hooks aren't firing + +Check the antigravity hook log: + +```bash +tail -50 ~/.mempalace/hook_state/antigravity_hook.log +``` + +Each fire writes a line. No lines = the hook is not being invoked. +Verify `~/.gemini/config/plugins/mempalace/hooks.json` exists and the +`command` paths point to executable files. + +### Save fires but no mining happens + +Two common causes: + +1. **The interval hasn't elapsed.** Mining only triggers when + `count % MEMPAL_SAVE_INTERVAL == 0`. The log shows the running + counter and interval per fire — wait for the next save tick or set + `MEMPAL_SAVE_INTERVAL=1` for testing. +2. 
**The resolved Python can't import `mempalace`.** Look for this + line in `~/.mempalace/hook_state/antigravity_hook.log`: + + ``` + ERROR: mempalace is not runnable via -m mempalace; install mempalace or set MEMPAL_PYTHON + ``` + + If you see it, the interpreter resolution (see *How the hooks find + your `mempalace` install* above) landed on a Python without the + package. Set `MEMPAL_PYTHON` to the correct interpreter and restart + Antigravity. + +### Wake injection isn't appearing + +The wake hook is gated to `invocationNum == 1` AND only injects once +per conversation (atomic `mkdir` marker). Check +`~/.mempalace/hook_state/antigravity_woke_` exists +after a successful injection. + +For a manual re-test: + +```bash +rm -rf ~/.mempalace/hook_state/antigravity_woke_* +``` + +## See also + +- [`hooks/antigravity/INVESTIGATION.md`](https://github.com/MemPalace/mempalace/blob/main/hooks/antigravity/INVESTIGATION.md) + — every Antigravity surface investigated, with verbatim quotes from + the official docs. +- [`hooks/antigravity/STDIN_SHAPE.md`](https://github.com/MemPalace/mempalace/blob/main/hooks/antigravity/STDIN_SHAPE.md) + — exact wire format for both events. +- [`examples/antigravity/`](https://github.com/MemPalace/mempalace/tree/main/examples/antigravity) + — standalone `hooks.json` + `mcp_config.json` for users who don't + want the full plugin install. +- [Auto-Save Hooks](./hooks.md) — Claude Code equivalent. +- [Gemini CLI](./gemini-cli.md) — Gemini CLI integration (separate from Antigravity). diff --git a/website/guide/claude-code.md b/website/guide/claude-code.md index 94a73e084..a3b5f6121 100644 --- a/website/guide/claude-code.md +++ b/website/guide/claude-code.md @@ -15,7 +15,7 @@ Restart Claude Code, then type `/skills` to verify "mempalace" appears. 
With the plugin installed, Claude Code automatically: - Starts the MemPalace MCP server on launch -- Has access to all 29 tools +- Has access to all 33 tools - Learns the AAAK dialect and memory protocol from the `mempalace_status` response - Searches the palace before answering questions about past work diff --git a/website/guide/configuration.md b/website/guide/configuration.md index f2efd3f49..05b6da20f 100644 --- a/website/guide/configuration.md +++ b/website/guide/configuration.md @@ -8,7 +8,8 @@ Located at `~/.mempalace/config.json`: { "palace_path": "/custom/path/to/palace", "collection_name": "mempalace_drawers", - "people_map": {"Kai": "KAI", "Priya": "PRI"} + "people_map": {"Kai": "KAI", "Priya": "PRI"}, + "max_backups": 10 } ``` @@ -17,6 +18,7 @@ Located at `~/.mempalace/config.json`: | `palace_path` | `~/.mempalace/palace` | Where ChromaDB stores your drawers | | `collection_name` | `mempalace_drawers` | ChromaDB collection name | | `people_map` | `{}` | Entity name → AAAK code mappings | +| `max_backups` | `10` | How many timestamped palace backups to keep before the oldest are pruned. Applies to `mempalace migrate` (`.pre-migrate.*`) and `mempalace repair max-seq-id` (`chroma.sqlite3.max-seq-id-backup-*`), which each write a full copy every run. Set to `0` to keep every backup (e.g. when an external retention policy manages cleanup). 
| ## Project Config @@ -83,3 +85,4 @@ python -m mempalace.mcp_server --palace /custom/palace |----------|-------------| | `MEMPALACE_PALACE_PATH` | Override palace path (same as `--palace`) | | `MEMPAL_DIR` | Directory for auto-mining in hooks | +| `MEMPALACE_MAX_BACKUPS` | Override `max_backups` retention count (`0` disables pruning) | diff --git a/website/guide/cursor-hooks.md b/website/guide/cursor-hooks.md new file mode 100644 index 000000000..f477b8078 --- /dev/null +++ b/website/guide/cursor-hooks.md @@ -0,0 +1,367 @@ +# Cursor IDE Hooks + +Three hooks for the [Cursor](https://cursor.com) IDE that save memories +automatically and inject recall context at session start. No manual "save" +commands needed. + +These are additive to the existing [Claude Code + Codex hooks](/guide/hooks). +You can run both — they share the same `~/.mempalace/hook_state/` +directory and the same kill switches. + +::: tip Pair this with the Cursor plugin +The hooks here only handle the auto-save side. To also get MemPalace's +MCP server, slash commands (`/mempalace-search`, etc.), and the +guided `mempalace` skill, install the bundled +[Cursor plugin](https://github.com/MemPalace/mempalace/blob/main/.cursor-plugin/README.md) — +it's the `.cursor-plugin/` folder at the repo root, dropped into +`~/.cursor/plugins/local/mempalace`. The plugin and the hooks are +orthogonal: install whichever you want, in any order. The plugin +deliberately does **not** wire hooks itself because Cursor's hooks +system is configured per-user/per-project (in `~/.cursor/hooks.json`), +not per-plugin. +::: + +## Three layers of recall + +The `sessionStart` wake hook is one of three orthogonal ways MemPalace +gets the agent to read the palace before answering. Install any +combination — they reinforce each other and all reference the same +canonical protocol in +[`integrations/shared/recall-protocol.md`](https://github.com/MemPalace/mempalace/blob/develop/integrations/shared/recall-protocol.md). 
+ +| Layer | Fires | Scope | Get it from | +|-------|-------|-------|-------------| +| **`sessionStart` hook** | Once per new conversation | Injects wing-scoped recall context up front | The hooks on this page | +| **`mempalace-recall` skill** | When a request matches its description, or when attached | Full search-before-answer protocol | The [Cursor plugin](https://github.com/MemPalace/mempalace/blob/main/.cursor-plugin/README.md) (`skills/`) | +| **Recall rule** | When Cursor's matcher judges the turn recall-relevant | A short nudge to search first | The plugin (`rules/mempalace-recall.mdc`, `alwaysApply: false`) or [`examples/cursor/rules/`](https://github.com/MemPalace/mempalace/blob/develop/examples/cursor/rules/README.md) | + +The hook is the only layer that fires *automatically and exactly once* +per chat. The skill and rule are demand-driven: they kick in when the +user actually asks about past work, people, or prior decisions, and stay +out of the way on greenfield coding. For recall forced into every +conversation, copy the `alwaysApply: true` variant from +`examples/cursor/rules/` into `~/.cursor/rules/` — a heavier, deliberate +opt-in. + +## What They Do + +| Hook | When It Fires | What Happens | +|------|---------------|--------------| +| **Wake Hook** | `sessionStart` — when a new Cursor conversation opens | Returns `additional_context` telling the agent to recall scoped to the wing inferred from the workspace root. Cursor-only — Claude Code has no equivalent. | +| **Save Hook** | `stop` — after every agent turn | Counts stop invocations per conversation. Every 15 (default), emits a `followup_message` telling the agent to file the session into MemPalace and write a diary entry. | +| **PreCompact Hook** | `preCompact` — right before context compaction | Runs `mempalace mine` synchronously on the transcript before compaction summarises it. Drops a pending-save marker so the next stop forces a save followup. 
| + +**Two-layer capture:** the save and precompact hooks both mine the JSONL +transcript directly into the palace (capturing verbatim tool output — Shell +results, search findings, build errors). The save hook also nudges the AI +to write structured drawers and a diary entry. Belt-and-suspenders. + +## Install — Cursor + +The fastest path is the installer that ships in the repo. + +Preview the change first (writes nothing, just prints the would-be JSON): + +```bash +hooks/cursor/install.sh --scope user --dry-run +``` + +User scope — applies globally, writes `~/.cursor/hooks.json`: + +```bash +hooks/cursor/install.sh --scope user +``` + +Or project scope — only this repo, writes `/.cursor/hooks.json`: + +```bash +hooks/cursor/install.sh --scope project --target /path/to/your/repo +``` + +The installer copies the three hook scripts to `~/.mempalace/hooks/cursor/`, +merges the entries into your `hooks.json`, and preserves any unrelated +hooks already in that file. Re-running is idempotent. Pass `--variant +minimal` for the `stop`-only setup, or `--uninstall` to remove the +MemPalace entries (leaves other hooks intact). + +### Manual install — `~/.cursor/hooks.json` (user scope) + +```json +{ + "version": 1, + "hooks": { + "sessionStart": [ + { "command": "/absolute/path/to/hooks/cursor/mempal_wake_hook_cursor.sh" } + ], + "stop": [ + { + "command": "/absolute/path/to/hooks/cursor/mempal_save_hook_cursor.sh", + "loop_limit": 1 + } + ], + "preCompact": [ + { "command": "/absolute/path/to/hooks/cursor/mempal_precompact_hook_cursor.sh" } + ] + } +} +``` + +### Manual install — `.cursor/hooks.json` (project scope) + +Identical content. Project hooks load in any trusted workspace and are +checked into version control with the project. Cloud agents also load +project hooks. 
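
For a hand-written `hooks.json`, a quick sanity check of the `command` entries can catch the most common misconfigurations. The sketch below is not part of MemPalace; `find_bad_commands` is a hypothetical helper, and it assumes each `command` is a bare script path with no arguments, as in the manual-install example above.

```python
import json
import os


def find_bad_commands(hooks_json_text):
    """Return (event, command, reason) tuples for entries that look misconfigured."""
    config = json.loads(hooks_json_text)
    problems = []
    for event, entries in config.get("hooks", {}).items():
        for entry in entries:
            command = entry.get("command", "")
            if not os.path.isabs(command):
                problems.append((event, command, "not an absolute path"))
            elif not os.path.isfile(command):
                problems.append((event, command, "file does not exist"))
            elif not os.access(command, os.X_OK):
                problems.append((event, command, "not executable"))
    return problems
```

An empty list means every configured script exists at an absolute path and is executable by the current user.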
+ +Make the scripts executable once: + +```bash +chmod +x hooks/cursor/mempal_save_hook_cursor.sh \ + hooks/cursor/mempal_precompact_hook_cursor.sh \ + hooks/cursor/mempal_wake_hook_cursor.sh +``` + +Cursor watches `hooks.json` and reloads automatically after a save. If +hooks still do not fire, restart Cursor and check the Hooks panel in +Settings → Hooks. + +## Configuration + +All knobs are environment variables. Defaults match the Claude Code hooks +where they overlap. + +- **`MEMPAL_SAVE_INTERVAL=15`** — number of `stop` events between save + followups. Lower = more frequent saves, higher = less interruption. +- **`MEMPAL_CURSOR_SILENT=1`** — suppress the `followup_message` entirely + (the hook still runs its best-effort background mine and keeps its + counters). `MEMPAL_VERBOSE=false`/`0`/`no` is equivalent. Note the + followup is **on by default** for Cursor — see "Why the followup is on + by default" below. +- **`MEMPAL_STATE_DIR`** — where the hook keeps counter files, the + pending-save marker, and `cursor_hook.log`. Defaults to + `~/.mempalace/hook_state/`. +- **`MEMPAL_STATE_TTL_DAYS=30`** — age after which stale + `cursor_*.count` / `cursor_*.pending` files are swept. The hooks run a + daily-throttled garbage collection so per-conversation state can't grow + unbounded; only Cursor's own state is touched. +- **`MEMPAL_DIR`** — optional project directory (code, notes, docs) to + also mine on each save trigger, with `--mode projects`. The transcript + is always mined regardless — `MEMPAL_DIR` is purely additive. +- **`MEMPAL_PYTHON`** — path to a Python 3 interpreter. The hook's own + JSON parsing and the install script's JSON merge use this. Resolution + order: `$MEMPAL_PYTHON` → `command -v python3` → bare `python3`. Set + this when Cursor is launched from a GUI on macOS and the inherited + PATH lacks the Python where you installed MemPalace. +- **`MEMPAL_DISABLE_HOOK=1`** — emergency kill switch. Disables all + three hooks; they emit `{}` and exit 0. 
+- **`MEMPALACE_HOOKS_AUTO_SAVE=false`** — same effect as + `MEMPAL_DISABLE_HOOK=1`. Also honoured via `~/.mempalace/config.json`: + + ```json + { "hooks": { "auto_save": false } } + ``` + +## How It Works + +### Wake Hook (`sessionStart`) + +``` +Cursor opens new conversation → sessionStart fires + ↓ + Hook reads workspace_roots[0] + ↓ + Infers wing = basename(workspace_root) + ↓ + {"additional_context": "scope recall to wing=<...>"} + ↓ + Agent reads additional_context before first turn + ↓ + Agent calls mempalace_search + mempalace_diary_read + wing-scoped on the first relevant question +``` + +Cursor's `sessionStart` is fire-and-forget — the agent loop does not wait +for a blocking response and does not consume `continue` / `user_message`. +But it does honour `additional_context`, and that is the only field +MemPalace emits. + +### Save Hook (`stop` event) + +``` +User sends message → agent responds → Cursor fires stop hook + ↓ + Hook reads loop_count from stdin + ↓ + ┌─── loop_count > 0 (our own followup running) ──→ echo "{}" + │ + └─── loop_count == 0 + ↓ + Check pending-save marker from preCompact + ↓ + ┌── marker present ──→ delete + emit followup_message + │ + └── no marker + ↓ + Atomic counter++ for this conversation_id + ↓ + ┌── counter % SAVE_INTERVAL != 0 ──→ echo "{}" + │ + └── counter % SAVE_INTERVAL == 0 + ↓ + Background: mempalace mine + ↓ + Emit {"followup_message": "save key topics..."} + ↓ + Cursor auto-submits followup as next user turn + ↓ + Agent files drawers + writes diary + ↓ + Agent stops; stop fires again with loop_count = 1 + ↓ + Hook sees loop_count > 0 → echo "{}" → agent stops +``` + +The `loop_count > 0` short-circuit prevents infinite loops: emit once → +agent saves → stops → we see `loop_count = 1` → we let it through. This +is the Cursor equivalent of Claude Code's `stop_hook_active` flag. The +`loop_limit: 1` in `hooks.json` is defense-in-depth on top. 
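
The save-hook flow above can be condensed into a small decision function. This is a sketch in Python rather than the shipped bash script; `ConversationState` and `on_stop` are hypothetical names, the followup text is abbreviated as in the diagram, and the marker branch follows the diagram only (it does not model any additional mining).

```python
from dataclasses import dataclass

FOLLOWUP = {"followup_message": "save key topics..."}


@dataclass
class ConversationState:
    counter: int = 0            # stop events seen for this conversation_id
    pending_save: bool = False  # marker dropped by the preCompact hook


def on_stop(loop_count, state, interval=15):
    """Return (stdout_payload, start_background_mine) for one stop event."""
    if loop_count > 0:
        # Our own followup is running; let the agent stop.
        return {}, False
    if state.pending_save:
        # Consume the preCompact marker and force a save followup.
        state.pending_save = False
        return dict(FOLLOWUP), False
    state.counter += 1
    if state.counter % interval != 0:
        return {}, False
    # Interval reached: mine in the background and nudge the agent to save.
    return dict(FOLLOWUP), True
```

Because the `loop_count > 0` branch returns before the counter is touched, the followup turn itself never counts toward the next save interval.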
+ +### PreCompact Hook + +``` +Context window near full → Cursor fires preCompact (observational) + ↓ + Synchronously: mempalace mine + ↓ + Drop pending-save marker for this conversation_id + ↓ + {"user_message": "transcript snapshotted..."} + ↓ + Compaction proceeds (we cannot block it) + ↓ + Next stop event picks up the marker → forces save +``` + +Cursor's `preCompact` is documented as **observational only** — its only +output field is `user_message`, with no `followup_message` and no way to +block. That is fundamentally different from Claude Code's `PreCompact` +which can block until the AI has saved. We work around the limitation by +mining the verbatim transcript synchronously (zero LLM cost) and queueing +a save nudge for the next agent turn. + +::: tip Why synchronous (and what happens on a slow mine) +The pre-compaction mine runs **synchronously** on purpose: compaction is +irreversible, so we must finish ingesting before the hook returns — +background mining would race the compaction. On a very large transcript +this can exceed Cursor's per-hook timeout, in which case Cursor kills the +mine mid-run. That is safe: `mempalace mine` is incremental and +append-only, so a killed mine resumes cleanly on the next invocation +rather than corrupting the palace, and the pending-save marker still +forces a re-mine plus a verbatim save nudge on the next `stop`. +::: + +## Cursor-only extras + +The features below are not available in the Claude Code or Codex hooks +because their hook surfaces do not expose the necessary events. + +- **Session-start recall via `sessionStart`.** The wake hook injects + wing-scoped recall guidance into the conversation's initial system + context, so the agent searches the palace before answering anything + that touches past work. Verified output field — see the [Cursor hooks + reference](https://cursor.com/docs/hooks.md) section "sessionStart". 
+- **Per-script `loop_limit`.** Cursor's `loop_limit` (default 5, + configurable per script) is a hard cap on how many auto-followups + Cursor will issue. MemPalace sets it to `1` in the example + `hooks.json` as defense-in-depth on top of its own `loop_count` + check. +- **Inferred wing from `workspace_roots`.** Both the wake hook and the + save hook use `basename(workspace_roots[0])` to scope memory + operations. A user with multiple Cursor workspaces gets per-project + wings without any manual configuration. + +## Debugging + +```bash +cat ~/.mempalace/hook_state/cursor_hook.log +``` + +Example output (ISO-8601 timestamps, event + conversation id, message): + +``` +[2026-05-27T02:16:01Z] [event=sessionStart] [conv=abc123] workspace=/Users/me/proj wing=proj +[2026-05-27T02:21:33Z] [event=stop] [conv=abc123] counter 0 -> 1 (interval=15) +[2026-05-27T02:42:09Z] [event=stop] [conv=abc123] counter 14 -> 15 (interval=15) +[2026-05-27T02:42:09Z] [event=stop] [conv=abc123] TRIGGERING SAVE at counter=15 +[2026-05-27T02:42:11Z] [event=stop] [conv=abc123] loop_count>0; letting agent stop +[2026-05-27T03:05:44Z] [event=preCompact] [conv=abc123] trigger=auto transcript=/Users/me/.cursor/.../transcript.txt +[2026-05-27T03:05:46Z] [event=stop] [conv=abc123] consumed pending-save marker (post-compaction) +``` + +When a hook can't parse its stdin (corrupt payload, future Cursor schema +change), the raw input — capped at 4096 bytes, mode 0600 — lands at: + +``` +~/.mempalace/hook_state/cursor_last_input.log +~/.mempalace/hook_state/cursor_last_python_err.log +``` + +Both are overwritten on each failure, never appended, so a repeating +misconfiguration cannot grow disk usage. + +## Cost + +**Zero extra tokens spent by the hooks themselves.** The hooks are bash +scripts that run locally. They do not call any API. 
The `followup_message` +the save hook emits is a normal user turn — it counts the same as any +other user message and does not invoke any extra LLM call beyond the one +the user would otherwise make. To suppress it entirely, set +`MEMPAL_CURSOR_SILENT=1`. + +## Why the followup is on by default + +The Claude Code hook is **silent by default**: its background `mempalace +mine --mode convos` captures the verbatim transcript on its own (because +`normalize.py` has a Claude Code JSONL parser), and the LLM-driven diary +nudge is opt-in behind `MEMPAL_VERBOSE`. + +Cursor is different. Cursor's transcript format is **undocumented** and +`normalize.py` has **no Cursor parser**, so the background mine is +best-effort only and does not yet yield clean verbatim drawers. That makes +the `followup_message` — which drives the agent to file its own in-context +verbatim quotes via `mempalace_add_drawer` / `mempalace_diary_write` — the +**load-bearing verbatim-capture path** for Cursor. Turning it off by +default would leave a default install capturing nothing, so it is on by +default. + +If you want the Claude-style "zero tokens in the chat window" behaviour +and accept the reduced capture, set `MEMPAL_CURSOR_SILENT=1` (or +`MEMPAL_VERBOSE=false`). The proper long-term fix is a Cursor transcript +parser in `normalize.py` (tracked follow-up); once that works, this +default flips to silent to match Claude. + +## Known limitations + +- **Hooks load at session start.** Cursor watches `hooks.json` and reloads + the wiring when the file changes, but for the freshly-loaded hook + scripts to take effect on an existing conversation you usually have to + start a new conversation. This matches the behaviour of Claude Code's + hook lifecycle. +- **`preCompact` cannot block.** See the diagram above. The + pending-save marker is the workaround. 
+- **Transcript file format is opaque.** Cursor does not document the + schema of the file at `transcript_path`, and `mempalace/normalize.py` + has no Cursor parser yet, so the background `mempalace mine --mode + convos` is **best-effort** for Cursor — it does not yet produce clean + verbatim conversation drawers. The `followup_message` is the + load-bearing capture path (see "Why the followup is on by default" + above). Adding a Cursor parser to + `normalize.py` is tracked follow-up work; once it lands, the followup + can default to silent like the Claude hook. + +## Related + +- [Auto-Save Hooks (Claude Code + Codex)](/guide/hooks) — the analogous + feature for those tools. +- [`hooks/cursor/STDIN_SHAPE.md`](https://github.com/MemPalace/mempalace/blob/develop/hooks/cursor/STDIN_SHAPE.md) + — per-event JSON schema with citations. +- [Claude Code Retention](/guide/claude-code-retention) — broader + setup checklist if you mix Cursor with Claude Code. diff --git a/website/guide/mcp-integration.md b/website/guide/mcp-integration.md index 182bfaef7..6d8c7731a 100644 --- a/website/guide/mcp-integration.md +++ b/website/guide/mcp-integration.md @@ -1,6 +1,6 @@ # MCP Integration -MemPalace provides 29 tools through the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/), giving any MCP-compatible AI full read/write access to your palace. +MemPalace provides 33 tools through the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/), giving any MCP-compatible AI full read/write access to your palace. ## Setup @@ -26,7 +26,7 @@ claude mcp add mempalace -- python -m mempalace.mcp_server --palace /path/to/pal codex mcp add mempalace -- python -m mempalace.mcp_server --palace /path/to/palace ``` -Now your AI has all 29 tools available. +Now your AI has all 33 tools available.
Ask it anything: > *"What did we decide about auth last month?"* diff --git a/website/guide/openclaw.md b/website/guide/openclaw.md index a9ca6dc47..cdfe4f591 100644 --- a/website/guide/openclaw.md +++ b/website/guide/openclaw.md @@ -27,7 +27,7 @@ Or by directly editing your OpenClaw configuration: ## How It Works -Once connected, OpenClaw agents receive all 29 tools along with the **Memory Protocol**—a strict behavioral guide indicating they should: +Once connected, OpenClaw agents receive all 33 tools along with the **Memory Protocol**—a strict behavioral guide indicating they should: 1. **Never guess**: Query `mempalace_search` or `mempalace_kg_query` before confidently answering. 2. **Keep an agent diary**: Maintain continuity between sessions by writing to `mempalace_diary_write`. 3. **Manage the Knowledge Graph**: Update declarative facts when things change using `mempalace_kg_add` and `mempalace_kg_invalidate`. diff --git a/website/reference/mcp-tools.md b/website/reference/mcp-tools.md index 220b15e51..121014abd 100644 --- a/website/reference/mcp-tools.md +++ b/website/reference/mcp-tools.md @@ -1,6 +1,6 @@ # MCP Tools Reference -Detailed parameter schemas for all 30 MCP tools. +Detailed parameter schemas for all 33 MCP tools. ## Palace — Read Tools @@ -114,6 +114,24 @@ Delete a drawer by ID. Irreversible. --- +### `mempalace_mine` + +Mine a directory into the palace — the MCP equivalent of `mempalace mine`. Wraps the same in-process miners the CLI uses; runs synchronously and returns the miner's summary as `output`. The palace write lock is automatic — a concurrent mine returns a structured already-running error. Orphan cleanup is separate (see `mempalace_sync`). 
+ +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `source` | string | **Yes** | Directory to mine | +| `mode` | string | No | `projects` (code/docs, default), `convos` (chat transcripts), or `extract` (office docs; needs the `mempalace[extract]` extra) | +| `wing` | string | No | Target wing (default: source directory name) | +| `agent` | string | No | Recorded on every drawer (default: `mempalace`) | +| `limit` | integer | No | Max files to process (0 = all; default 0) | +| `dry_run` | boolean | No | Report what would be filed without writing (default false) | +| `extract` | string | No | Convos extraction strategy: `exchange` (default) or `general`; ignored by other modes | + +**Returns:** `{ success, mode, dry_run, output }` on success (`output` is the miner's human-readable summary; `output_truncated: true` is added when a very large summary is tail-trimmed), or `{ success: false, error, error_class? }` on failure. + +--- + ### `mempalace_sync` Prune drawers whose source files are gitignored, deleted, or moved. Returns a dry-run report by default; pass `apply=true` to commit deletions. @@ -319,6 +337,30 @@ Delete an explicit tunnel by its ID. --- +### `mempalace_list_hallways` + +List within-wing hallway records (entity-to-entity co-occurrence links built at mine time). Optionally filter by wing. + +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `wing` | string | No | Filter hallways by wing | + +**Returns:** `[ { id, wing, entity_a, entity_b, co_occurrence_count, rooms, ... }, ... ]` + +--- + +### `mempalace_delete_hallway` + +Delete a hallway record by its ID. + +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `hallway_id` | string | **Yes** | Hallway ID to delete | + +**Returns:** `{ deleted: bool }` + +--- + ### `mempalace_follow_tunnels` Follow tunnels from a room to see what it connects to in other wings. 
Returns connected rooms with drawer previews. diff --git a/website/reference/modules.md b/website/reference/modules.md index a4485f551..4c12ae9ce 100644 --- a/website/reference/modules.md +++ b/website/reference/modules.md @@ -9,7 +9,7 @@ mempalace/ ├── README.md ← project documentation ├── mempalace/ ← core package │ ├── cli.py ← CLI entry point -│ ├── mcp_server.py ← MCP server (29 tools) +│ ├── mcp_server.py ← MCP server (33 tools) │ ├── knowledge_graph.py ← temporal entity graph │ ├── palace_graph.py ← room navigation graph │ ├── dialect.py ← AAAK compression @@ -56,7 +56,7 @@ Argparse-based CLI with subcommands: `init`, `mine`, `split`, `search`, `compres ### `mcp_server.py` — MCP Server -JSON-RPC over stdin/stdout. Implements the MCP protocol with 29 tools covering palace read/write, drawer CRUD, knowledge graph, navigation, tunnels, agent diary, and system operations. Includes the Memory Protocol and AAAK Spec in status responses. +JSON-RPC over stdin/stdout. Implements the MCP protocol with 33 tools covering palace read/write, drawer CRUD, knowledge graph, navigation, tunnels, agent diary, and system operations. Includes the Memory Protocol and AAAK Spec in status responses. ### `searcher.py` — Semantic Search