Skip to content

feat(api): unify line targeting using string specs#5

Merged
donk8r merged 4 commits into
masterfrom
fix/simplification
Jun 28, 2026
Merged

feat(api): unify line targeting using string specs#5
donk8r merged 4 commits into
masterfrom
fix/simplification

Conversation

@donk8r

@donk8r donk8r commented Jun 20, 2026

Copy link
Copy Markdown
Member
  • Replace integer and array line ranges with unified string specifications
  • Implement RangeSpec and PositionSpec for flexible line targeting
  • Support hyphenated ranges, hashes, and negative indices for EOF offsets
  • Update view, batch_edit, and extract_lines to use string-based ranges
  • Transition view command to support single path strings
  • Simplify line range parsing logic via new utility functions
  • Update JSON schemas to improve LLM guidance for range formats
  • Align test cases with the new string-based specification format

BREAKING CHANGE: line_range, from_range, append_line, and view parameters now require strings instead of integers or arrays

donk8r added 2 commits June 13, 2026 20:55
- Replace integer and array line ranges with unified string specifications
- Implement RangeSpec and PositionSpec for flexible line targeting
- Support hyphenated ranges, hashes, and negative indices for EOF offsets
- Update view, batch_edit, and extract_lines to use string-based ranges
- Transition view command to support single path strings
- Simplify line range parsing logic via new utility functions
- Update JSON schemas to improve LLM guidance for range formats
- Align test cases with the new string-based specification format

BREAKING CHANGE: line_range, from_range, append_line, and view parameters now require strings instead of integers or arrays
- Replace string-based line range specifications with typed start/end fields
- Remove multi-file support from view tool in favor of parallel calls
- Implement unified Endpoint enum to handle line numbers and hashes
- Update JSON schemas for view, batch_edit, and extract_lines tools
- Remove redundant multi-file view functions and range parsing logic
- Fix potential usize underflow during file start referencing
- Standardize endpoint resolution and line index clamping across tools
- Update all filesystem tests to align with new parameter schemas

BREAKING CHANGE: Line range strings are replaced by numeric start/end fields across all file operations, and the view tool now accepts a single path instead of an array.
@github-actions

Copy link
Copy Markdown

Octomind — developer:brief (octohub:glm)

📦 Brief: Unify all line-targeting parameters to compact string specs across view, batch_edit, and extract_lines

Overall risk: 🔴 HIGH · Cards: 2

# Change Risk Confidence
1 Line ranges/positions now require string specs ("10-25", "-1", "a3bd-c7f2") — old integer arrays rejected 🔴 ●●●
2 view parameter renamed pathspath with backward-compat serde alias 🟢 ●●●

Card 1/2: Line-targeting params converted from arrays/integers to string specs · 🔴 · ●●●

INTENT
Simplify the LLM-facing API by replacing heterogeneous line-target formats (bare integers, [start, end] arrays, ["hash1", "hash2"] arrays, nested arrays) with a single compact string representation ("10-25", "42", "-1", "a3bd-c7f2"). The commit message states this explicitly and flags it as a BREAKING CHANGE.

WHAT CHANGED

  • New parsing primitives in line_hash.rs: RangeSpec enum (Numbers(i64,i64) / Hashes(String,String)) and PositionSpec enum (Number(i64) / Hash(String)), with parse_range_spec() and parse_position_spec() functions. Number mode falls back to hash parsing if numeric parse fails.
  • view tool's lines parameter: was [start, end] integer array or nested [[1,50],[200,250]]; now a string "10-25" or array of strings ["1-50","200-250"]. Old integer arrays are rejected — range_tokens_from_array requires each element to be a Value::String.
  • batch_edit tool's line_range field: was BatchEditLineRange untagged enum (Single(i64), Range(Vec<i64>), Hash(String), HashRange(Vec<String>)); now a plain String. The BatchEditLineRange enum is deleted entirely. Bare JSON numbers are still tolerated at runtime (Value::Number(n) => n.to_string()), but arrays are not.
  • extract_lines tool: from_range changed from Vec<i64> to String; append_line changed from i64 to String. Bare JSON numbers are tolerated for append_line at runtime, but from_range arrays are rejected.
  • All JSON schemas (server.json, schemars schema functions in server.rs) updated to declare string types with examples.
  • All ~100+ test cases in fs_tests.rs updated to use string format.
  • Error messages throughout updated to suggest the new string format in guidance text.

IMPACT RADIUS
BREAKING for every MCP client (LLM or programmatic) currently calling these three tools. Any client sending the old array format (lines: [10, 20], line_range: [10, 15], from_range: [1, 5]) will receive an error instead of successful execution. There is no backward-compatibility shim for array formats — only bare JSON numbers are tolerated in batch_edit.line_range and extract_lines.append_line.

RISK
🔴 HIGH — This is a hard breaking change with no migration path for array-format callers. Any deployed client using the old [start, end] format will fail immediately. The view tool's lines parameter is the highest-impact surface since it's the most commonly called tool. The partial tolerance (bare numbers accepted for batch_edit.line_range and extract_lines.append_line but NOT for view.lines or from_range) creates an inconsistent compatibility surface that could confuse clients attempting incremental migration.

DIVERGENCE

  • 🧭 Schema vs runtime mismatch: ExtractLinesParams.append_line and BatchEditOperation.line_range are declared as String in the schema, but runtime parsing in core.rs (L654) and text_editing.rs (L1076) still accepts bare Value::Number. Clients guided by schema will send strings (the intent); old clients sending numbers still work. This is deliberate but undocumented — the schema doesn't reflect the runtime's permissiveness.

📎 Source

Card 2/2: `view` parameter renamed `paths` → `path` with backward-compat alias · 🟢 · ●●●

INTENT
Align the view tool's parameter name with the now-standard single-path usage pattern, while preserving backward compatibility for existing clients that send paths.

WHAT CHANGED

  • ViewParams.paths: Vec<String> renamed to ViewParams.path: Vec<String> with #[serde(alias = "paths")] — both keys deserialize correctly.
  • Runtime execute_view now checks path first, falls back to paths: call.parameters.get("path").or_else(|| call.parameters.get("paths")).
  • JSON schema (path_param_schema) and server.json updated to declare path as the primary key with oneOf: string | array.
  • Error messages updated from 'paths' to 'path'.
  • README and INSTRUCTIONS examples updated to use path.

IMPACT RADIUS
No breaking impact — the serde alias ensures old clients sending paths continue to work. The runtime fallback provides a second compatibility layer. New clients and LLMs reading the schema will use path.

RISK
🟢 LOW — Fully backward-compatible via serde alias + runtime fallback. No downstream impact detected.

📎 Source


📂 Files changed (7 files, ~600 lines)

{tokens:411541,cost:0.4396548}

- Align server.json with latest schema by removing deprecated fields
- Replace repositories array with repository object
- Update packages structure to include registry information
- Remove redundant tools definitions and categories from manifest
- Add line counts and estimated token costs to directory listings
- Implement parallel processing for file statistics
- Add binary file detection to skip metadata calculation
- Update tool descriptions to reflect new listing format
@donhardman donhardman force-pushed the fix/simplification branch from a4f4a84 to 94a21f2 Compare June 22, 2026 16:21
@github-actions

Copy link
Copy Markdown

Octomind — developer:brief (octohub:glm)

I have everything I need. Let me compose the brief.

📦 Brief: Line targeting API redesigned from string specs to typed scalar endpoints; multi-file view removed; directory listings now show line/token metadata; server.json migrated to MCP 2025-12 schema

Overall risk: 🟡 MEDIUM · Cards: 4

# Change Risk Confidence
1 Line targeting across all tools replaced with typed start/end scalar params (JSON type disambiguates number vs hash) — two consecutive breaking changes 🔴 ●●●
2 Multi-file view removed; view now accepts only a single path string 🟡 ●●●
3 Directory listings now read every file to annotate with line count + token estimate 🟡 ●●●
4 server.json migrated to MCP 2025-12-11 schema; inline tool definitions removed 🟢 ●●●
Card 1/4: Line targeting API replaced with typed scalar endpoints · 🔴 · ●●●

INTENT
Simplify the line-targeting API across view, batch_edit, and extract_lines by replacing string-based range specs ("10-25", "a3bd-c7f2") with individual scalar start/end fields where the JSON type itself disambiguates line numbers (integer) from hashes (string). This was done in two commits — first introducing string specs, then immediately replacing them with typed endpoints.

WHAT CHANGED

  • view: lines parameter (string/array of range strings) replaced by optional start and end scalar params. Omit both → whole file; start only → to EOF.
  • batch_edit: line_range string replaced by start (required) + optional end. Insert uses start as anchor; replace uses start..end (end omitted → single line).
  • extract_lines: from_range string replaced by from_start + optional from_end; append_line string replaced by scalar append_line (integer or hash string).
  • New Endpoint enum (Number(i64) / Hash(String)) and parse_endpoint() in line_hash.rs replace the deleted RangeSpec/PositionSpec/parse_range_spec/parse_position_spec.
  • In number mode, parse_endpoint tolerates stringified integers ("42"Number(42)); in hash mode, strings are always treated as hashes — all-digit hashes are now unambiguous because the JSON type disambiguates.

IMPACT RADIUS

  • All MCP clients calling view, batch_edit, or extract_lines must migrate parameter schemas. This is a BREAKING API change — any client using the old lines, line_range, from_range, or string append_line parameters will get errors.
  • server.json tool schemas updated to match (see Card 4).
  • text_editing::parse_line_range now calls parse_endpoint for each endpoint — the internal UnresolvedLineRange enum still carries Single/Range/Hash/HashRange variants, so downstream batch operation logic is unchanged.
  • resolve_line_range and resolve_line_range_clamped helpers in core.rs were removed; view now uses resolve_view_range which calls resolve_endpoint_to_line with clamping. extract_lines uses strict (non-clamping) resolve_line_index — out-of-bounds line numbers error instead of clamping.

RISK
🔴 HIGH — Two consecutive breaking changes to the same API surface in three commits. Any client pinned to the intermediate string-spec format (commit 596811f) will break again on cfb0759. The paths key is still accepted as a string alias in view, but passing an array to paths now explicitly errors with "make separate view calls."

DIVERGENCE

  • 🧩 Incomplete change: The paths legacy key is still accepted in execute_view as a string-only alias, but server.json no longer documents a paths field at all. Clients relying on paths as a string will still work, but this is undocumented behavior.

📎 Source

Card 2/4: Multi-file view removed — single path only · 🟡 · ●●●

INTENT
Simplify the view tool by removing multi-file batching. The caller now makes parallel view calls instead of passing an array of paths. This eliminates view_many_files, view_many_files_spec, view_file_multi_ranges, the ParsedLines enum, and all per-file range mapping logic.

WHAT CHANGED

  • execute_view now accepts only a single path string. Passing an array to path explicitly bails with "make separate view calls."
  • The 50-file limit check is gone.
  • file_ops.rs lost view_many_files, view_many_files_spec, and view_file_multi_ranges (~440 lines removed). The resolve_path import in file_ops.rs was removed (no longer needed).
  • ParsedLines enum (None/Single/MultiRangeSingleFile/PerFile) and parse_lines_param (~120 lines) deleted from core.rs.
  • view now goes straight to view_file_spec with an optional (start, end) range.

IMPACT RADIUS

  • Any MCP client passing path: ["file1.rs", "file2.rs"] or paths: [...] will get an error. The paths key still works but only as a string alias for path.
  • Directory listing and content search paths in execute_view are unchanged — they still dispatch to directory::list_directory.

RISK
🟡 MEDIUM — Breaking for clients that batch multiple files in one view call. The error message is clear and actionable. The paths string alias provides partial backward compatibility for single-file callers using the old key name.

📎 Source

Card 3/4: Directory listings now read every file to show line count + token estimate · 🟡 · ●●●

INTENT
Enhance directory listings to help LLMs scope unfamiliar trees and budget reads before opening files. Each file entry now shows path\tNL\t~Nt (line count + estimated token cost), with binary files marked (binary).

WHAT CHANGED

  • list_directory file-listing mode now reads every file via std::fs::read inside a spawn_blocking + rayon par_iter to count lines and estimate tokens.
  • Binary detection: NUL-density check on first 512 bytes (>10% NULs → labeled (binary), no line/token annotation).
  • Files that fail to read are listed without annotation (just the path).
  • Order is preserved via carried index so output stays alphabetic after parallel processing.
  • view tool description updated to mention the new listing format.

IMPACT RADIUS

  • All directory listing responses now have a different format — tab-separated metadata instead of bare filenames. Any client parsing listing output will need to handle the new format.
  • The estimate_tokens function (content.len().div_ceil(4)) is a rough heuristic — actual token counts will vary, but the ~ prefix signals approximation.

RISK
🟡 MEDIUM — Listing a directory with many large files now reads every file's full content (not just metadata) to count lines. For a directory with hundreds of MB of files, this adds significant I/O latency to what was previously a cheap directory walk. The spawn_blocking + rayon parallelism mitigates runtime impact but doesn't reduce total I/O. A directory with thousands of files could cause noticeable delay or memory pressure since each file is fully read into memory via std::fs::read.

QUESTIONS

  • Should there be a file-size cap for the line/token annotation (e.g., skip files >1MB and just list the path)? The binary check reads the first 512 bytes, but the full std::fs::read loads the entire file into memory before the NUL check can short-circuit.

📎 Source

Card 4/4: server.json migrated to MCP 2025-12-11 schema · 🟢 · ●●●

INTENT
Align server.json with the latest MCP server schema (2025-12-11). Replace deprecated fields (name_for_human, name_for_model, repositories array, modes, categories, tags) with the new format (title, repository object, packages array with registry info, websiteUrl).

WHAT CHANGED

  • repositories array → repository object with url + source fields.
  • name_for_humantitle; name_for_model removed (uses name).
  • modes.stdiopackages[0].transport + packageArguments.
  • Inline tools array (full input schemas for all 6 tools) removed entirely — tool definitions now come from the #[tool] macros in server.rs at runtime.
  • categories and tags removed.
  • pricing field removed (implicitly free).

IMPACT RADIUS

  • MCP registry/clients that read server.json for discovery will see the new schema. Tool schemas are no longer statically declared in server.json — they're served dynamically by the rmcp runtime.
  • No runtime behavior change — the Rust #[tool] annotations in server.rs are the source of truth for tool schemas.

RISK
🟢 LOW — Static manifest file change. No code behavior affected. The tool schemas served to clients come from the Rust code, not this file.

📎 Source

📂 Files changed (8 files, ~1200 lines)

{tokens:817758,cost:0.8317808}

- Add delete file functionality with undo support
- Implement negative line index resolution and clamped boundaries
- Improve batch edit diffs with cumulative offset tracking
- Refine line hash parsing and endpoint resolution utilities
- Enhance directory listings with size and token estimations
- Streamline file viewing and editing logic in MCP tools
- Update JSON schemas and documentation for better LLM compatibility
- Add comprehensive tests for line-targeting and batch edit operations
- Fix error handling for empty files and invalid line indices
@github-actions

Copy link
Copy Markdown

Octomind — developer:brief (octohub:glm)

Now I have all the evidence I need. Let me compile the brief.

📦 Brief: Delete command, shared line-index resolution, mode-aware schemas, and batch-edit diff offset tracking

Overall risk: 🟡 MEDIUM · Cards: 4

# Change Risk Confidence
1 text_editor delete command added — removes a file with undo support, refuses directories 🟢 ●●●
2 Line-index resolution (resolve_line_index / _clamped) consolidated into utils::line_hash with checked_neg overflow guard 🟡 ●●●
3 Line endpoint JSON schema is now mode-aware: plain integer in number mode, anyOf:[integer,string] in hash mode 🟢 ●●●
4 Batch-edit diff rendering tracks cumulative line-count offsets so multi-op diffs show correct final positions 🟡 ●●●

Card 1/4: text_editor delete command — file removal with undo · 🟢 · ●●●

INTENT
Add a delete command to the text_editor tool so the LLM can remove files without falling back to shell rm, with undo support for recoverability.

WHAT CHANGED

  • New TextEditorCommand::Delete variant added to the enum in server.rs; execute_text_editor dispatches "delete"text_editing::delete_file_spec.
  • delete_file_spec acquires the per-file lock, snapshots content into FILE_HISTORY (same undo stack as edits), then tokio_fs::remove_file. Refuses directories with a pointer to shell rm -r.
  • Error messages and hints across the codebase updated to reference delete alongside existing commands (e.g. shell misuse hints, create "file exists" guidance).

IMPACT RADIUS
Isolated change — the new command is dispatched only from execute_text_editor and uses the existing file-lock and undo-history infrastructure. No other tool's behavior changes.

RISK
🟢 LOW. The delete path reuses the same lock + history snapshot pattern as str_replace and batch_edit, so concurrency and undo semantics are consistent. Directory refusal prevents accidental recursive deletion.

📎 Source

Card 2/4: Line-index resolution consolidated to utils::line_hash with overflow guard · 🟡 · ●●●

INTENT
Eliminate duplicate resolve_line_index / resolve_line_index_clamped copies in core.rs and text_editing.rs by moving them to utils::line_hash as shared pub(crate) functions, and harden against i64::MIN negation overflow.

WHAT CHANGED

  • Both copies deleted from core.rs and text_editing.rs; replaced with use crate::utils::line_hash::{resolve_line_index, resolve_line_index_clamped}.
  • The consolidated versions use index.checked_neg() instead of -index, so i64::MIN (whose negation overflows in release builds) produces a clean out-of-range error or clamp instead of undefined/panicking behavior.
  • parse_endpoint split into parse_endpoint (reads global mode) + parse_endpoint_with_mode (takes explicit hash_mode: bool) for unit-testability without mutating the process-global OnceLock.

IMPACT RADIUS
All tools that resolve line numbers (view, batch_edit, extract_lines) now call the shared functions. Behavior is identical for all inputs except i64::MIN, which previously could overflow and now errors/clamps cleanly.

RISK
🟡 MEDIUM. The checked_neg fix addresses a real overflow path: i64::MIN passed as a negative line index would have triggered (-index) overflow in release mode. The fix is correct — checked_neg returns None, mapped to usize::MAX, which always exceeds total_lines and produces a clean error. The risk is low because i64::MIN as a line number is not a realistic LLM input, but the fix is still the right hardening.

📎 Source

Card 3/4: Mode-aware line endpoint JSON schema · 🟢 · ●●●

INTENT
Make the MCP tool schema for line endpoints (start/end/append_line) portable across all tool-calling stacks. In number mode, emit a plain integer (no union) so OpenAI strict structured outputs and Gemini — which reject oneOf/anyOf unions — can consume the schema. In hash mode, emit anyOf:[integer, string] using anyOf (not oneOf) for wider cross-stack support.

WHAT CHANGED

  • line_endpoint_schema in server.rs now branches on is_hash_mode(): plain {"type":"integer"} in number mode, anyOf union in hash mode.
  • Previously it always emitted oneOf:[integer, string] regardless of mode.
  • BatchEditOperation.end doc comment now states start and end must be the same kind (both numbers or both hashes).
  • parse_line_range in text_editing.rs enforces this: mixed (Number, Hash) or (Hash, Number) pairs return an error.

IMPACT RADIUS
Affects every MCP client consuming the tool schema. In number mode, clients that previously rejected the oneOf union (OpenAI strict, Gemini) can now use the schema. In hash mode, anyOf replaces oneOf — semantically equivalent for validation but with wider parser support.

RISK
🟢 LOW. The schema change is additive for number-mode clients (they get a simpler schema). Hash-mode clients see anyOf instead of oneOf, which is a strict superset in terms of validator support. The mixed number+hash rejection is a new validation that catches a real misuse pattern.

📎 Source

Card 4/4: Batch-edit diff tracks cumulative line-count offsets for correct final positions · 🟡 · ●●●

INTENT
Fix incorrect line numbers and hash prefixes in batch-edit diff output when multiple operations change the file's line count. Previously, each op's diff was rendered against original line coordinates, so any op after the first length-changing one showed stale positions — the on-disk write was always correct, but the diff the LLM saw was wrong.

WHAT CHANGED

  • A cumulative offset: i64 variable tracks the net line-count delta from all prior ops. Each op's rendered positions are shifted by this offset to reference the final file.
  • Context lines before/after each op are now pulled from new_lines (the final file) at shifted positions, not from original_lines at original positions.
  • Removed lines in replace diffs still use original coordinates (correct — they no longer exist in the final file).
  • Insert anchor on empty files: -1 ("after last line") now resolves to 0 (beginning) when total_lines == 0, instead of erroring.
  • execute_batch_edit in core.rs simplified: removed the defensive JSON-string parsing of operations (the typed Vec<BatchEditOperation> schema guarantees an array), and the 50-op limit moved to batch_edit_spec via #[schemars(length(max = 50))].
  • Batch-edit all-ops-failed error now lists each failed op's reason instead of just a count.

IMPACT RADIUS
Only affects the diff text returned to the LLM — the actual file write logic is unchanged. An LLM relying on diff line numbers to plan subsequent edits would have gotten wrong positions before; now they match the final file state.

RISK
🟡 MEDIUM. The offset tracking assumes non-overlapping operations (guaranteed by detect_conflicts for replaces and equal-anchor inserts). The code comment explicitly acknowledges this assumption. If a future change allows overlapping ops, the diff positions would silently drift. The ctx_line closure correctly guards against out-of-bounds access to new_lines, so no panic risk.

QUESTIONS
The comment says "an insert anchored strictly inside a replaced range renders approximately" — is this a known limitation that's acceptable because detect_conflicts prevents it in practice, or should it be hardened?

📎 Source

📂 Files changed (9 files, ~559 insertions / ~227 deletions)
  • src/utils/line_hash.rs — consolidated resolve_line_index/_clamped with checked_neg, parse_endpoint_with_mode split
  • src/mcp/fs/text_editing.rsdelete_file_spec, batch-edit diff offset tracking, mixed number+hash rejection, empty-file insert fix
  • src/mcp/fs/core.rsdelete dispatch, removed local resolve_line_index copies, simplified execute_batch_edit, empty-file view range fix
  • src/mcp/server.rsTextEditorCommand::Delete, mode-aware line_endpoint_schema, BatchEditOperation.end doc
  • src/mcp/fs/file_ops.rs — removed inline directory listing from view_file_spec, error message cleanup
  • src/mcp/fs/shell.rs — misuse hint updated to reference batch_edit
  • src/mcp/fs/fs_tests.rs — 278 lines of new tests: hash endpoints, mixed number+hash rejection, empty-file view, batch-edit diff positions
  • INSTRUCTIONS.md — updated file_ops.rs description, line-mode parsing clarification
  • README.md — delete command docs, content search examples, extract_lines endpoint docs

{tokens:489305,cost:0.5002016}

@donk8r donk8r merged commit c7fd6a4 into master Jun 28, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant