Feature Request: Built-in CLI output filtering/compression to reduce token usage (inspired by RTK)
Hey team! 👋
I recently came across an interesting project called [RTK (Rust Token Killer)](https://github.com/rtk-ai/rtk), a CLI proxy that filters and compresses command output before it reaches the LLM context. The author shared their results on [Reddit](https://www.reddit.com/r/ClaudeAI/comments/1r2tt7q/i_saved_10m_tokens_89_on_my_claude_code_sessions/), and the numbers are impressive:
- `cargo test`: 155 lines → 3 lines (98% reduction)
- `git status`: 119 chars → 28 chars (76% reduction)
- `git log`: compact summaries instead of full output
- Total savings over 2 weeks: ~10M tokens (89%)
The core idea is simple: most CLI output sent to the LLM is noise, such as passing tests, verbose logs, progress bars, and redundant formatting. Stripping that out before it hits the context window saves a massive amount of tokens without losing any useful information.
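To make the idea concrete, here is a rough Python sketch of line-based noise filtering. It is not how RTK works internally; the `filter_output` function, the `NOISE_PATTERNS` list, and the sample output are all made up for illustration.

```python
import re

# Hypothetical noise patterns, chosen for illustration only.
NOISE_PATTERNS = [
    re.compile(r"^test .+ \.\.\. ok$"),                       # passing test lines
    re.compile(r"^\s*(Compiling|Downloading|Downloaded)\s"),  # build progress
    re.compile(r"^\s*$"),                                     # blank lines
]


def filter_output(raw: str) -> str:
    """Drop lines matching any noise pattern; keep the rest in order."""
    kept = [
        line
        for line in raw.splitlines()
        if not any(p.search(line) for p in NOISE_PATTERNS)
    ]
    return "\n".join(kept)


# Illustrative test-runner output (not captured from a real run).
sample = """\
   Compiling demo v0.1.0
running 3 tests
test parser::parses_empty ... ok
test parser::parses_nested ... FAILED
test lexer::handles_unicode ... ok

test result: FAILED. 2 passed; 1 failed
"""

print(filter_output(sample))
# running 3 tests
# test parser::parses_nested ... FAILED
# test result: FAILED. 2 passed; 1 failed
```

The failing test and the result line survive, while the passing tests and build progress are dropped. A real implementation would probably need per-command rules rather than one global pattern list.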
Why this would be valuable as a built-in feature in your project:
- Users wouldn't need to install and configure a separate tool
- Filtering rules could be context-aware and tightly integrated with your existing command execution pipeline
- It directly reduces costs and improves response quality (less noise = better reasoning)
- It could be opt-in with sensible defaults
Possible implementation scope:
- Configurable output filters for common commands (`git`, `npm`, `cargo`, `pip`, test runners, etc.)
- Smart truncation with summary (e.g., "47 tests passed, 2 failed" instead of full test output)
- User-defined rules for custom commands
- Toggle on/off per session or globally
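As a rough sketch of how these pieces might fit together, the Python below combines per-command rules, a summary line, user-supplied rules, and an on/off switch. Everything here (`FilterRule`, `compress`, `summarize_tests`, the `enabled` flag) is a hypothetical name for illustration, not an existing API in RTK or in your tool.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Assumes Python 3.9+ for the built-in generic annotations.


@dataclass
class FilterRule:
    """Hypothetical rule: which command it applies to and how to compress."""
    command_prefix: str                       # e.g. "cargo test"
    drop: list[re.Pattern]                    # patterns for lines to remove
    summarize: Optional[Callable[[list[str]], str]] = None


def summarize_tests(lines: list[str]) -> str:
    """Count pass/fail lines in the original output and build one summary line."""
    passed = sum(1 for line in lines if line.endswith("... ok"))
    failed = sum(1 for line in lines if line.endswith("... FAILED"))
    return f"{passed} tests passed, {failed} failed"


# Built-in defaults; users could append their own rules for custom commands.
DEFAULT_RULES = [
    FilterRule("cargo test", [re.compile(r"\.\.\. ok$")], summarize_tests),
]


def compress(command: str, output: str, enabled: bool = True,
             rules: Optional[list[FilterRule]] = None) -> str:
    """Apply the first rule whose prefix matches the command, if filtering is on."""
    if not enabled:
        return output  # toggle off: pass output through untouched
    lines = output.splitlines()
    for rule in (DEFAULT_RULES if rules is None else rules):
        if command.startswith(rule.command_prefix):
            kept = [l for l in lines if not any(p.search(l) for p in rule.drop)]
            if rule.summarize is not None:
                kept.append(rule.summarize(lines))
            return "\n".join(kept)
    return output  # no rule matched: leave the output unchanged


# Illustrative input: 47 passing tests and 2 failures.
raw = "\n".join(
    [f"test t{i} ... ok" for i in range(47)]
    + ["test a ... FAILED", "test b ... FAILED"]
)
print(compress("cargo test", raw))
# test a ... FAILED
# test b ... FAILED
# 47 tests passed, 2 failed
```

Passing `enabled=False` returns the raw output, and commands with no matching rule are left alone, which keeps the feature safely opt-in.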
I think this kind of optimization would be a huge quality-of-life improvement for users and a natural fit for your tool. Would love to hear your thoughts on whether this is something you'd consider exploring!