fix(repair): point index-read failures to repair --mode from-sqlite (#1843) by mvalentsev · Pull Request #1847 · MemPalace/mempalace

mvalentsev · 2026-06-21T17:17:50Z

Addresses #1843.

What does this PR do?

When the chromadb compactor cannot apply the WAL into the drawers HNSW
segment (InternalError: Failed to apply logs to the hnsw segment writer),
mempalace repair (default legacy mode) and the library rebuild_index
both fail on their first read (Collection.count()) and then print:

Cannot recover — palace may need to be re-mined from source files.

That advice is harmful: the drawer rows are intact in chroma.sqlite3, and
repair --mode from-sqlite rebuilds the index from them. Re-mining instead
silently drops drawers added through the MCP server and diary entries, which
have no source file to mine. That is the data loss #1843 reports.

Both legacy read-failure sites now emit shared guidance pointing at the
working recovery:

If a MemPalace server or mine is still running against this palace,
stop it and retry. Otherwise the drawer index is likely corrupt
(for example a failed chromadb HNSW compaction) while your drawer
rows remain in chroma.sqlite3. Rebuild the index from SQLite rather
than re-mining:

    mempalace repair --mode from-sqlite --archive-existing

(Re-mining from source files would drop drawers added via the MCP
server and diary entries, which have no source file.)

The message is worded conditionally (it also covers a live server/mine still
holding the palace open) because the except Exception cannot prove which
case it caught.

Scope

This is the "at minimum, change the message" part of #1843's first request.
The larger asks (auto-fallback to from-sqlite, MCP reconnect resilience,
decoupling the durable SQLite write from HNSW indexing in add_drawer) are
out of scope here; the compaction failure itself lives in chromadb's core.

Two other repair.py re-mine messages are left as-is on purpose:
check_extraction_safety (a row-count truncation abort, not an index read)
and the RebuildCollectionError no-backup-after-live-swap path (where the
collection is genuinely gone and re-mining is a legitimate last resort).

How to test

uv run pytest tests/test_cli.py tests/test_repair.py -q

New regression tests:

test_cmd_repair_error_reading_points_to_from_sqlite_not_remine
test_rebuild_index_read_failure_points_to_from_sqlite
test_index_read_recovery_guidance_recommends_from_sqlite

Checklist

Tests pass (python -m pytest tests/ -v)
No hardcoded paths
Linter passes (ruff check .)

…emPalace#1843) When the chromadb compactor cannot apply the WAL into the drawers HNSW segment (InternalError: Failed to apply logs to the hnsw segment writer), the legacy repair paths fail on their first Collection.count() read and advise re-mining from source files. The drawer rows are intact in chroma.sqlite3, so repair --mode from-sqlite rebuilds them; re-mining silently drops drawers added via the MCP server and diary entries that have no source file. Both legacy read-failure sites (cmd_repair and rebuild_index) now emit shared guidance pointing at the from-sqlite recovery, worded conditionally so it also covers a live server or mine still holding the palace open. Co-Authored-By: undeadindustries <9536461+undeadindustries@users.noreply.github.com>

gemini-code-assist

Code Review

This pull request improves the recovery guidance provided to users when a drawer-index read fails (e.g., due to a chromadb HNSW compactor failure). Instead of advising users to re-mine from source files—which can cause data loss for drawers added via the MCP server or diary entries—the CLI and index rebuilder now recommend rebuilding the index from SQLite using mempalace repair --mode from-sqlite --archive-existing. Unit tests have been added to verify this behavior. There are no review comments, so I have no feedback to provide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist Bot reviewed Jun 21, 2026

View reviewed changes

mvalentsev marked this pull request as ready for review June 21, 2026 17:25

mvalentsev requested review from igorls and milla-jovovich as code owners June 21, 2026 17:25

undeadindustries mentioned this pull request Jun 22, 2026

fix: point diverged-index recovery at from-sqlite, not re-mine (#1843) #1849

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(repair): point index-read failures to repair --mode from-sqlite (#1843)#1847

fix(repair): point index-read failures to repair --mode from-sqlite (#1843)#1847
mvalentsev wants to merge 1 commit into
MemPalace:developfrom
mvalentsev:fix/1843-repair-from-sqlite-guidance

mvalentsev commented Jun 21, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mvalentsev commented Jun 21, 2026

What does this PR do?

Scope

How to test

Checklist

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant