claude-memory-lint

Static linter for Claude Code memory directories — catches frontmatter rot, oversized files, and stop-word noise at write-time, before they pollute Claude's context window.

Claude Code memory files grow quickly. Once frontmatter goes inconsistent and files fill up with generic keywords (api, github, task, agent, …), the runtime router loses signal: every common word matches every file, context gets stuffed with irrelevant memory, and the model's attention drifts. claude-memory-lint (cml) surfaces those defects as ERROR / WARN / INFO when you write the file — not when you're already debugging a confused session.

Disclaimer: This is an independent third-party tool. It is not affiliated with, endorsed by, or sponsored by Anthropic. "Claude" and "Claude Code" are trademarks of Anthropic and are used here nominatively to identify the official CLI/product this tool integrates with.


Install

pip install claude-memory-lint
# development
pip install -e ".[dev]"

Python 3.10+. No required runtime dependencies.


Quickstart

# One-shot check
cml check ~/.claude/projects/my-project/memory/

# Fix missing frontmatter stubs (backup written alongside each changed file)
cml fix ~/.claude/projects/my-project/memory/

# Machine-readable output for CI
cml check --format json path/to/memory > findings.json

# SARIF for GitHub code scanning
cml check --format sarif path/to/memory > cml.sarif

# Command summary
cml check  ~/.claude/projects/<id>/memory/   # exits 1 on ERROR
cml fix    ~/.claude/projects/<id>/memory/   # auto-add missing aliases (.bak backup kept)
cml stats  ~/.claude/projects/<id>/memory/   # counts only, no file contents in output

How it works

[Figure: claude-memory-lint architecture]

Per-file rules (R001–R006, R008–R010, R012–R017) each receive one ParsedFile and the active config, and return a list of Violation objects. Corpus rules (R007, R011) receive the full list of parsed files (needed to detect cross-file duplicates and orphaned backups). All violations are collected and fed into one of three reporters: text (default), json, or sarif.
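The split between per-file and corpus rules can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the fields on ParsedFile and Violation, the dict-based config, the stem normalisation, and the run() driver are all assumptions made for the example.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical shapes: the real ParsedFile / Violation fields may differ.
@dataclass
class ParsedFile:
    path: str
    frontmatter: Optional[dict]  # None when missing or unterminated
    body: str = ""

@dataclass
class Violation:
    rule_id: str
    severity: str  # "ERROR", "WARN", or "INFO"
    path: str
    message: str

def r001_frontmatter(pf: ParsedFile, config: dict) -> list:
    """Per-file rule: sees exactly one file."""
    if pf.frontmatter is None:
        return [Violation("R001", "ERROR", pf.path, "frontmatter missing or unterminated")]
    return []

def r007_duplicate_stem(files: list, config: dict) -> list:
    """Corpus rule: needs every file to spot cross-file duplicates."""
    seen = {}
    found = []
    for pf in files:
        # Assumed normalisation: lowercase, underscores treated as hyphens.
        stem = PurePosixPath(pf.path).stem.lower().replace("_", "-")
        if stem in seen:
            found.append(Violation("R007", "INFO", pf.path, f"duplicate stem of {seen[stem]}"))
        else:
            seen[stem] = pf.path
    return found

def run(files: list, per_file_rules: list, corpus_rules: list, config: dict) -> list:
    """Collect violations from both rule kinds into one list for a reporter."""
    violations = []
    for pf in files:
        for rule in per_file_rules:
            violations.extend(rule(pf, config))
    for rule in corpus_rules:
        violations.extend(rule(files, config))
    return violations
```

Keeping the two rule kinds as plain functions with different first arguments means a reporter only ever deals with a flat list of Violation objects, regardless of which kind produced them.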


Rules

| ID | Severity | What it catches |
| --- | --- | --- |
| R001 | ERROR | frontmatter missing or unterminated |
| R002 | ERROR | both aliases: and triggers: are empty / absent |
| R003 | WARN at 25 KiB / ERROR at 30 KiB | file size threshold |
| R004 | WARN | filename stem is not kebab-case ASCII |
| R005 | INFO | file is not referenced by any other memory file (orphan) |
| R006 | WARN | stop-word density in name + description ≥ 40 % |
| R007 | INFO | duplicate normalised stem (likely stale copy) |
| R008 | INFO | mtime older than 180 days |
| R009 | ERROR | secret literal pattern (GitHub PAT / AWS / Anthropic / OpenAI / Google / Slack / Stripe); the match is never echoed in output |
| R010 | WARN (opt-in) | dangling markdown link (target not found in tree or archive/) |
| R011 | WARN (opt-in) | stale backup file (*.bak / .backup / .orig / trailing ~) older than 7 days |
| R012 | WARN (opt-in) | aliases: / triggers: items that reduce to stop-words only |
| R013 | WARN (opt-in) | high-salience emphasis emoji over threshold (collapses LLM attention weighting) |
| R014 | WARN (default ON) | supersedes: target does not resolve in tree or archive/ |
| R015 | WARN at 4 KiB / ERROR at 8 KiB (default ON) | auto-inject file body exceeds per-turn token budget |
| R016 | WARN (default ON) | INDEX-role file has > 200 lines or mixes body content into what should be an entry-list |
| R017 | WARN (opt-in) | body line promises an artifact at an absolute path that does not exist on disk |

Opt-in rules are enabled with --dangling-links, --stale-backup, --trigger-stopwords, --emphasis-density, or --phantom-artifact. Default-ON rules can be disabled with --no-supersedes-check, --no-inject-bloat-check, --no-index-purity-check.
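To make the R006 threshold concrete, here is a minimal sketch of a stop-word density check. The stop-word set (seeded with the four words named in the introduction), the tokenisation, and both function names are assumptions for illustration; cml's real list and tokeniser may well differ.

```python
import re

# Hypothetical stop-word list, seeded with the examples from the introduction.
STOP_WORDS = {"api", "github", "task", "agent"}

def stop_word_density(name: str, description: str) -> float:
    """Fraction of tokens in name + description that are stop-words."""
    # Assumed tokenisation: lowercase runs of ASCII letters and digits.
    tokens = re.findall(r"[a-z0-9]+", f"{name} {description}".lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in STOP_WORDS)
    return hits / len(tokens)

def exceeds_threshold(name: str, description: str, threshold: float = 0.40) -> bool:
    """True when density reaches the 40 % level that R006 warns at."""
    return stop_word_density(name, description) >= threshold
```

Under these assumptions, a file named "github api" with the description "task agent notes" has four stop-words out of five tokens and would be flagged, while "deploy-runbook" with a specific description would not.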

R009 note (added v0.2.0): A real-world incident where a GitHub PAT literal sat in a memory file for days under .gitignore "protection" motivated this rule. The rule reports only the pattern type and line number — the matched substring is never written to stdout, JSON, SARIF, or any error path — because a lint that leaks the secret it caught is worse than no lint.
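The "report the type and line, never the match" behaviour can be sketched as below. The two patterns use widely documented public token prefixes and are illustrative only; they are not cml's actual pattern set, and scan_for_secrets is a hypothetical name.

```python
import re

# Illustrative patterns only, based on well-known token prefixes.
SECRET_PATTERNS = {
    "github-pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "aws-access-key-id": re.compile(r"AKIA[0-9A-Z]{16}"),
}

def scan_for_secrets(text: str) -> list:
    """Return (pattern_type, line_number) pairs; the matched text is discarded."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                # Keep only the pattern name and line number, never the match.
                findings.append((kind, lineno))
    return findings
```

Because the match object is never stored or formatted, nothing downstream (a text, JSON, or SARIF reporter, or an exception message) can echo the secret by accident.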


Privacy posture

  • No LLM calls. Heuristics only.
  • No file body in stdout. stats and check --format json print filenames, rule IDs, and counts — never file content.
  • Auto-fix writes a .bak next to the original and never sends anything off-machine.
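As a rough sketch of the backup-before-write step, using only the standard library: the function name and the exact backup naming (original filename plus ".bak") are assumptions, and the real fixer may handle existing backups or encodings differently.

```python
import shutil
from pathlib import Path

def write_with_backup(path: Path, new_text: str) -> Path:
    """Copy the original to <name>.bak in the same directory, then overwrite it."""
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # copy2 also preserves timestamps where possible
    path.write_text(new_text, encoding="utf-8")
    return backup
```

Everything here happens on the local filesystem, which is the property the privacy posture relies on.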

Output formats

text   default — human-readable, one finding per line + summary
json   machine-readable; convenient for CI gates
sarif  SARIF 2.1.0 — uploadable to GitHub code scanning
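For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 log from a list of findings. The finding keys (rule_id, severity, path, line, message) and the severity-to-level mapping are assumptions for illustration; the output cml actually emits may carry more fields.

```python
import json

# Assumed mapping from cml severities to SARIF result levels.
SARIF_LEVEL = {"ERROR": "error", "WARN": "warning", "INFO": "note"}

def to_sarif(findings: list) -> dict:
    """Build a minimal SARIF 2.1.0 log with one run and one result per finding."""
    results = []
    for f in findings:
        results.append({
            "ruleId": f["rule_id"],
            "level": SARIF_LEVEL[f["severity"]],
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["path"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        })
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "claude-memory-lint"}},
            "results": results,
        }],
    }

if __name__ == "__main__":
    example = [{
        "rule_id": "R001",
        "severity": "ERROR",
        "path": "notes.md",
        "line": 1,
        "message": "frontmatter missing or unterminated",
    }]
    print(json.dumps(to_sarif(example), indent=2))
```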

Use as a pre-commit hook

repos:
  - repo: https://github.com/hinanohart/claude-memory-lint
    rev: v0.4.0
    hooks:
      - id: claude-memory-lint
        args: [check, --rule, R001, --rule, R002]

Use in CI

- name: lint memory directory
  run: |
    pip install claude-memory-lint
    cml check --format sarif path/to/memory > cml.sarif
- uses: github/codeql-action/upload-sarif@v3
  if: always()   # upload even when cml check exits 1 on ERROR findings
  with:
    sarif_file: cml.sarif

Companion: claude-memory-router

This lint pairs with claude-memory-router:

  • The lint enforces write-time quality (aliases, size, freshness).
  • The router consumes that quality at read-time and only injects the most relevant memory files into the prompt.

Garbage in, garbage out applies to routers as much as to any other engine. The lint exists so the router can do its best work.


Testing

pip install -e ".[dev]"
pytest -v

The test suite has 95 tests covering parser edge cases, per-rule positive/negative cases for R001–R017, corpus rules R007 and R011, and the CLI front-end including JSON / SARIF reporters.


License

MIT. See LICENSE.