feat: auto-discover wings from root directory on startup #219

Open
matrix9neonebuchadnezzar2199-sketch wants to merge 8 commits into MemPalace:develop from matrix9neonebuchadnezzar2199-sketch:feat/auto-discover-wings-from-root-dir
@matrix9neonebuchadnezzar2199-sketch

Summary

When users run mempalace init <dir>, the specified directory is now saved as root_dir in the config. On each MCP server startup (and on mempalace_status calls), subdirectories under root_dir are automatically scanned and registered as wings.

This eliminates the need to manually run mempalace mine <dir> --wing <name> every time a new project folder is created.

Problem

Currently, adding a new wing requires the user to manually run a mine command with --wing for each subdirectory. For users who organize multiple projects under a single parent directory, this is tedious and easy to forget. New projects silently remain invisible to MemPalace until the user remembers to register them.

Solution

Three files modified:

config.py

  • Added root_dir property — reads from config file or MEMPALACE_ROOT_DIR env var
  • Added _save() method — persists config changes to disk
  • Modified init() — accepts optional root_dir parameter and saves it
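
To make the config.py changes concrete, here is a minimal sketch of how a root_dir property and a _save() method could fit together. The class name ConfigSketch, the config.json layout, and the rule that MEMPALACE_ROOT_DIR takes precedence over the file are assumptions for illustration; the PR description does not state the precedence or show the real MempalaceConfig code.

```python
import json
import os
from pathlib import Path
from typing import Optional


class ConfigSketch:
    """Hypothetical stand-in for MempalaceConfig (not the real class)."""

    def __init__(self, config_dir: str) -> None:
        self._config_dir = Path(config_dir)
        self._config_file = self._config_dir / "config.json"
        self._data: dict = {}
        if self._config_file.exists():
            self._data = json.loads(self._config_file.read_text(encoding="utf-8"))

    @property
    def root_dir(self) -> Optional[str]:
        # Assumption: the environment variable overrides the saved value.
        return os.environ.get("MEMPALACE_ROOT_DIR") or self._data.get("root_dir")

    def _save(self) -> None:
        # Persist the in-memory config to disk, creating the directory if needed.
        self._config_dir.mkdir(parents=True, exist_ok=True)
        self._config_file.write_text(json.dumps(self._data, indent=2), encoding="utf-8")

    def init(self, root_dir: Optional[str] = None) -> None:
        # root_dir is optional, so calling init() without it leaves any saved value alone.
        if root_dir is not None:
            self._data["root_dir"] = str(root_dir)
        self._save()
```

Keeping root_dir optional in init() is what lets existing setups behave as before when no directory is recorded.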

mcp_server.py

  • Added _sync_wings_from_root() — scans root_dir subdirectories, compares against known wings in both ChromaDB metadata and wing_config.json, registers any new folders as wings
  • Added _folder_to_wing() — normalizes folder names to valid wing names
  • Added IGNORE_DIRS, which skips common non-project directories (node_modules, .git, __pycache__, etc.)
  • Called on server startup in main() and on each tool_status() call
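
The scanning step described for mcp_server.py can be sketched roughly as follows. This is not the PR's code: the function name discover_new_folders, the exact contents of IGNORE_DIRS, and the simplification of comparing raw folder names against known names (rather than normalized wing names) are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed subset of the ignore list; the real IGNORE_DIRS may contain more entries.
IGNORE_DIRS = {"node_modules", ".git", "__pycache__"}


def discover_new_folders(root_dir: str, known: Iterable[str]) -> List[str]:
    """Return names of subdirectories under root_dir that are not yet known."""
    root = Path(root_dir)
    if not root.is_dir():
        # A missing or unset root means there is nothing to discover.
        return []
    known_set = set(known)
    found = []
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue  # plain files never become wings
        if entry.name.startswith(".") or entry.name in IGNORE_DIRS:
            continue  # dot-directories and common non-project directories
        if entry.name in known_set:
            continue  # already registered, so no duplicate
        found.append(entry.name)
    return found
```

Because the function only reports additions, a folder that has been deleted from disk simply stops appearing in the scan and nothing is removed, which matches the "deleted folders are ignored" behavior listed below.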

cli.py

  • Modified cmd_init() — resolves the directory to an absolute path and passes it to config.init(root_dir=...)
  • Added TypeError fallback for detect_rooms_local() which does not accept yes kwarg in current release

Behavior

  • mempalace init ~/projects saves ~/projects as root_dir
  • On MCP server startup, subdirectories are scanned and new ones become wings automatically
  • Already-registered wings are skipped (no duplicates)
  • Deleted folders are ignored — their memories are preserved
  • Dotfiles and common non-project dirs (node_modules, .git, etc.) are excluded
  • No existing behavior is changed if root_dir is not set

Testing

Manually verified:

  1. root_dir is correctly saved to ~/.mempalace/config.json
  2. Subdirectories are detected as new wings on first scan
  3. Second scan returns empty (no duplicate registration)
  4. Creating a new folder and re-scanning detects it immediately
  5. IGNORE_DIRS folders are correctly skipped

99/101 tests pass. The 2 failures (test_convo_mining, test_project_mining) are a pre-existing Windows-specific issue — ChromaDB holds file locks on temp directories during shutil.rmtree cleanup. Unrelated to this change.

Automated tests for the new functionality would be a good follow-up — happy to add them if the approach looks good.

Related

This addresses the general usability concern of manual wing management. No existing issue tracks this specifically — happy to create one if preferred.

@adv3nt3

adv3nt3 commented Apr 8, 2026

Contributor

Interesting feature idea, auto-discovering wings would save a lot of manual setup for multi-project users.

Some concerns:

  1. Full metadata scan on every status call. _sync_wings_from_root() runs on each tool_status() and loads up to 10,000 metadata records from ChromaDB to check existing wing names. This is the same pattern flagged in #159 ("Bugs and improvements"), point 4. A status call should be fast; scanning the filesystem and loading all metadata each time is expensive. Could cache known wings or only sync on startup.

  2. Double sync on startup. _sync_wings_from_root() is called in main() before the loop and again inside tool_status(). If an agent calls status right after connect, that's 3 full scans in seconds.

  3. TypeError fallback swallows real errors. The try: detect_rooms_local(..., yes=...) except TypeError: detect_rooms_local(...) pattern catches any TypeError inside detect_rooms_local, not just a missing kwarg. A real bug inside that function would be silently swallowed and the function would re-run without the flag.

  4. Accesses config internals. _sync_wings_from_root() uses _config._config_dir directly to write wing_config.json. This couples the MCP server to private config internals; it should be a method on MempalaceConfig.

  5. Wing name collisions. _folder_to_wing("My-Project") and _folder_to_wing("my_project") both produce wing_my_project. Different folders could silently map to the same wing.

  6. No tests for a feature that runs on every startup and status call. The PR acknowledges this, but given the metadata scan cost and the new config file, automated tests would catch regressions early.
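
The collision in point 5 can be reproduced with a small sketch. The normalizer below is a guess at the kind of rule that would map both names to wing_my_project (lowercase, then collapse every non-alphanumeric run to an underscore); it is not the PR's actual _folder_to_wing(), and find_collisions is a hypothetical helper.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def naive_folder_to_wing(name: str) -> str:
    # Assumed rule: lowercase, then turn each non-alphanumeric run into "_".
    return "wing_" + re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def find_collisions(
    folders: Iterable[str], normalize: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Group folder names by wing name and keep only groups with more than one folder."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for folder in folders:
        groups[normalize(folder)].append(folder)
    return {wing: names for wing, names in groups.items() if len(names) > 1}


collisions = find_collisions(["My-Project", "my_project", "notes"], naive_folder_to_wing)
print(collisions)
```

A check along these lines could run during discovery so that a clash is reported rather than silently merged into one wing.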

@matrix9neonebuchadnezzar2199-sketch

Author

Thanks for the thorough review @adv3nt3 — all great catches. Pushed a fix:

  1. Full metadata scan: _sync_wings_from_root() now caches its result in a module-level variable. Subsequent calls are free unless force=True.
  2. Double sync: Removed the call from tool_status(). Now only fires once in main().
  3. TypeError fallback: Replaced broad except TypeError with inspect.signature() check so real bugs aren't swallowed.
  4. Private config access: Added public methods config_dir, load_wing_config(), save_wing_config() to MempalaceConfig. MCP server no longer touches _config_dir directly.
  5. Wing name collisions: _folder_to_wing() now preserves hyphens, so My-Project → wing_my-project and my_project → wing_my_project produce different names.
  6. Tests: Added tests/test_auto_discover.py covering _folder_to_wing, _sync_wings_from_root (cache, ignore dirs, no root_dir), and MempalaceConfig root_dir/wing_config roundtrip.
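
For item 3, a signature check along these lines avoids catching TypeError at all. This is a sketch under assumptions: call_with_supported_kwargs is a hypothetical helper, and detect_rooms_old / detect_rooms_new are stand-ins for the released and updated detect_rooms_local() signatures, not the real functions.

```python
import inspect
from typing import Any, Callable

_KEYWORD_KINDS = (
    inspect.Parameter.POSITIONAL_OR_KEYWORD,
    inspect.Parameter.KEYWORD_ONLY,
)


def call_with_supported_kwargs(func: Callable[..., Any], *args: Any, **optional: Any) -> Any:
    """Call func, passing only the optional keyword arguments its signature accepts."""
    params = inspect.signature(func).parameters
    takes_any_kwarg = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    supported = {
        name: value
        for name, value in optional.items()
        if takes_any_kwarg or (name in params and params[name].kind in _KEYWORD_KINDS)
    }
    # No try/except here: a TypeError raised inside func propagates to the caller.
    return func(*args, **supported)


def detect_rooms_old(project_dir: str) -> tuple:
    # Stand-in for a release whose function has no "yes" parameter.
    return ("old", project_dir)


def detect_rooms_new(project_dir: str, yes: bool = False) -> tuple:
    # Stand-in for a version that accepts "yes".
    return ("new", project_dir, yes)


print(call_with_supported_kwargs(detect_rooms_old, "proj", yes=True))
print(call_with_supported_kwargs(detect_rooms_new, "proj", yes=True))
```

The decision is made from the signature before the call, so an unrelated TypeError raised inside the function is no longer mistaken for a missing keyword argument.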

@web3guru888 web3guru888 left a comment

Review of #219: feat: auto-discover wings from root directory on startup

Scope: +298/−4 · 4 file(s) · touches core

  • mempalace/cli.py (modified: +12/−2)
  • mempalace/config.py (modified: +46/−1)
  • ⚠️ mempalace/mcp_server.py (modified: +116/−1)
  • tests/test_auto_discover.py (added: +124/−0)

Technical Analysis

  • 🔌 MCP server dispatch changes — verify JSON-RPC compliance and backward compatibility
  • 🪟 Windows compatibility — verify path handling works cross-platform

Issues

  • ⚠️ Touches mempalace/mcp_server.py — Core MCP server — maintainer guards this closely

Strengths

  • ✅ Includes test coverage

🟡 Needs attention — touches guarded files and has items to address.


🏛️ Reviewed by MemPalace-AGI · Autonomous research system with perfect memory · Showcase: Truth Palace of Atlantis

matrix9neonebuchadnezzar2199-sketch pushed a commit to matrix9neonebuchadnezzar2199-sketch/mempalace that referenced this pull request Apr 11, 2026
…leanup (MemPalace#219)

- Remove col.get(metadatas) from _sync_wings_from_root; use wing_config.json only
- _folder_to_wing now preserves CJK/Unicode characters via \w regex
- Add fallback for empty slug after sanitization
- tool_status calls cached _sync_wings_from_root(force=False)
- Remove stray blank lines
- 4 new tests (CJK, Korean, empty fallback, no-ChromaDB dependency)

Made-with: Cursor
@matrix9neonebuchadnezzar2199-sketch

Author

@web3guru888 Thank you for the review. Addressed all three items in the latest commit:

  1. MCP server impact (−ChromaDB full scan): Removed col.get(include=["metadatas"], limit=10000) from _sync_wings_from_root(). Wing registration now uses wing_config.json as the single source of truth, with no ChromaDB read at all during discovery. This eliminates the performance risk identified in #338 ("mempalace_list_wings / mempalace_list_rooms / mempalace_get_taxonomy silently return empty results on large collections") and reduces the mcp_server.py footprint of this feature.

  2. Windows / cross-platform compatibility: _folder_to_wing() now uses re.sub(r'[^\w\-]+', '', slug), where \w is Unicode-aware for str patterns in Python's re module. CJK folder names (日本語, 한국어, etc.) are preserved instead of being stripped to underscores. Added tests for CJK and Korean folder names, plus an empty-slug fallback to "unnamed".

  3. JSON-RPC compliance: tool_status() now calls _sync_wings_from_root(force=False), which returns the startup cache immediately (no I/O, no rescan). The JSON-RPC request/response format is untouched.
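
The force=False behavior described in item 3 amounts to a module-level cache. A minimal sketch, assuming a hypothetical sync_wings_cached() that takes the scan as a callable so it can be exercised without a filesystem (the real _sync_wings_from_root() signature is not shown in this thread beyond the force flag):

```python
from typing import Callable, List, Optional

# Module-level cache; None means "no scan has run yet".
_cached_wings: Optional[List[str]] = None


def sync_wings_cached(scan: Callable[[], List[str]], force: bool = False) -> List[str]:
    """Run scan() on the first call or when force=True; otherwise return the cached result."""
    global _cached_wings
    if _cached_wings is None or force:
        _cached_wings = scan()
    return _cached_wings


scan_calls = []


def fake_scan() -> List[str]:
    # Stand-in for the directory scan; records each invocation.
    scan_calls.append(1)
    return ["wing_alpha", "wing_beta"]


sync_wings_cached(fake_scan)              # startup: performs the scan
sync_wings_cached(fake_scan)              # status call: served from the cache
sync_wings_cached(fake_scan, force=True)  # explicit rescan
print(len(scan_calls))
```

One consequence of this design is that a folder created after startup would not be picked up until something requests a forced rescan or the server restarts.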

Tests: 16 passed (test_auto_discover.py), 131 total passed. 2 pre-existing Windows ChromaDB file-lock failures unrelated.

@bensig bensig changed the base branch from main to develop April 11, 2026 22:23
@bensig bensig requested a review from igorls as a code owner April 11, 2026 22:23
@igorls igorls added the area/cli (CLI commands), area/mcp (MCP server and tools), and enhancement (New feature or request) labels Apr 14, 2026
Use strip('_-') so '--project--' normalizes to wing_project; inner hyphens unchanged.

Made-with: Cursor
@matrix9neonebuchadnezzar2199-sketch matrix9neonebuchadnezzar2199-sketch force-pushed the feat/auto-discover-wings-from-root-dir branch from 9e0d0d4 to 59e940b Compare April 14, 2026 11:25
@matrix9neonebuchadnezzar2199-sketch

Author

@web3guru888 Thank you for the review. Addressed all three items and rebased onto current develop (post-#852 ChromaBackend refactor).

  1. MCP server impact (minimized): Removed col.get(include=["metadatas"]) entirely from _sync_wings_from_root(). Wing discovery now uses wing_config.json as the single source of truth, with zero ChromaDB reads during discovery. Compatible with the new ChromaBackend abstraction layer from #852 ("refactor: route all chromadb access through ChromaBackend (v4 prep)").

  2. Windows / cross-platform: _folder_to_wing() now uses re.sub(r'[^\w\-]+', '', slug) (Unicode-aware \w). CJK folder names are preserved. Empty slugs fall back to "unnamed". strip('_-') removes leading and trailing underscores and hyphens.

  3. JSON-RPC compliance: tool_status() calls _sync_wings_from_root(force=False), which returns the startup cache immediately with no I/O. JSON-RPC format untouched.
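
Putting the normalization rules from item 2 together, the function could look roughly like the sketch below. The regex, the strip('_-') call, and the "unnamed" fallback come from this thread; lowercasing and the wing_ prefix are inferred from the My-Project → wing_my-project example, while converting whitespace to underscores and prefixing the fallback as wing_unnamed are assumptions.

```python
import re


def folder_to_wing(folder_name: str) -> str:
    """Sketch of a Unicode-aware folder-name normalizer (not the PR's exact code)."""
    slug = folder_name.strip().lower()
    slug = re.sub(r"\s+", "_", slug)      # assumption: whitespace becomes "_"
    slug = re.sub(r"[^\w\-]+", "", slug)  # keep word characters (incl. CJK) and hyphens
    slug = slug.strip("_-")               # drop leading/trailing "_" and "-"
    return "wing_" + (slug or "unnamed")  # assumption: fallback is also prefixed


for name in ["My-Project", "my_project", "--project--", "日本語", "!!!"]:
    print(name, "->", folder_to_wing(name))
```

Since the name is lowercased, two folders that differ only in case would still map to the same wing on a case-sensitive filesystem, so a collision check may remain useful.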

Tests: 16 passed (test_auto_discover.py), 879 passed total. Rebased cleanly onto develop at b060171 (#852 merge).

matrix9neonebuchadnezzar2199-sketch added 2 commits April 14, 2026 20:57
@matrix9neonebuchadnezzar2199-sketch

Author

@web3guru888 Thank you for the review. Addressed all three items, rebased onto current develop (post-#852 ChromaBackend refactor), and CI is now fully green.

  1. MCP server impact (minimized): Removed col.get(include=["metadatas"]) entirely from _sync_wings_from_root(). Wing discovery now uses wing_config.json as the single source of truth, with zero ChromaDB reads during discovery. Compatible with the new ChromaBackend abstraction layer from #852 ("refactor: route all chromadb access through ChromaBackend (v4 prep)").

  2. Windows / cross-platform: _folder_to_wing() now uses re.sub(r'[^\w\-]+', '', slug) (Unicode-aware \w). CJK folder names are preserved. Empty slugs fall back to "unnamed". strip('_-') removes leading and trailing underscores and hyphens.

  3. JSON-RPC compliance: tool_status() calls _sync_wings_from_root(force=False), which returns the startup cache immediately, so status calls do no I/O. JSON-RPC format untouched.

Tests: 16 passed (test_auto_discover.py), full suite green. CI: 6/6 checks passed.

@igorls

igorls commented May 8, 2026

Member

Hi, thanks for the contribution.

This PR has merge conflicts with develop, and the branch has not been updated in over 7 days, which puts it before our most recent release. The conflicts are likely against work that landed in that release.

Could you rebase onto develop so we can take another look?

If this change is no longer relevant, feel free to close the PR.

(This message is part of a periodic backlog pass, sent to all open PRs that match this state.)

@igorls igorls added the needs-rebase label (PR has merge conflicts with develop and needs rebase) May 8, 2026

Labels

area/cli (CLI commands), area/mcp (MCP server and tools), enhancement (New feature or request), needs-rebase (PR has merge conflicts with develop and needs rebase)

4 participants