fix(msl): reject >32,767 fragments/entry on write instead of silent overflow #1079
Merged
trishorts merged 3 commits on Jun 29, 2026
…verflow

MslPrecursorRecord.FragmentCount is int16, so an entry with more than 32,767 fragment ions silently wrapped to a negative count on write and threw OverflowException only later on read. Validate on write and throw a clear ArgumentException naming the entry and the limit.

Surfaced while verifying the .msl manuscript's "unbounded fragments per entry" claim; reachable for ~250+ residue proteoforms annotated with internal ions. A true fix (widen FragmentCount to int32) is a versioned format change; this is the safe non-breaking guard.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TM8gpAxcjWYschz3yEjixZ
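The wrap described in this commit message can be reproduced outside the codebase. The following is a minimal sketch in Python, chosen only for brevity; the project code is C#, and `unchecked_narrow`, `write_count`, and `read_count` are hypothetical names, not MslWriter APIs. It assumes the count is stored as a little-endian signed 16-bit value.

```python
import ctypes
import struct

INT16_MAX = 32767  # largest value a signed 16-bit field can hold


def unchecked_narrow(count: int) -> int:
    """Mimic an unchecked (short) cast: keep the low 16 bits, reinterpret as signed."""
    return ctypes.c_int16(count).value


def write_count(count: int) -> bytes:
    """Hypothetical writer step: narrow first, then serialize as little-endian int16."""
    return struct.pack("<h", unchecked_narrow(count))


def read_count(raw: bytes) -> int:
    """Hypothetical reader step: decode the stored int16."""
    return struct.unpack("<h", raw)[0]


at_limit = read_count(write_count(INT16_MAX))
over_limit = read_count(write_count(INT16_MAX + 1))

print(at_limit)    # 32767
print(over_limit)  # -32768
```

The write step succeeds in both cases; only the value read back reveals the problem, which is why the failure shows up far from its cause.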
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #1079 +/- ##
==========================================
+ Coverage 82.33% 82.34% +0.01%
==========================================
Files 382 382
Lines 50509 50518 +9
Branches 6102 6103 +1
==========================================
+ Hits 41586 41599 +13
+ Misses 7734 7729 -5
- Partials 1189 1190 +1
Add four NUnit tests for the FragmentCount int16 overflow guard: the in-memory (Write) and streaming (WriteStreaming) paths each reject a 32,768-fragment entry with an ArgumentException naming the entry and the 32,767 limit, and the 32,767 boundary still writes with no wrap (locks the comparison as '>' rather than '>=').

Previously the throw branch was unhit, so whole-file coverage masked ~29% patch coverage on the new guard; this lifts it to 100%.

Ref: smith-chem-wisc#1079

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
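The boundary behaviour these tests lock in can be sketched as follows. This is an illustrative Python analogue, not the NUnit tests or the MslWriter guard from the PR: `validate_fragment_count`, `rejects`, and the entry name `PEPTIDE/2` are hypothetical, and `ValueError` stands in for C#'s `ArgumentException`.

```python
MAX_FRAGMENTS_PER_ENTRY = 32767  # int16 maximum, the limit named in the PR


def validate_fragment_count(entry_name: str, fragment_count: int) -> int:
    """Return the count unchanged if it fits in int16; otherwise raise.

    The comparison is '>' (not '>='), so exactly 32,767 fragments is accepted.
    """
    if fragment_count > MAX_FRAGMENTS_PER_ENTRY:
        raise ValueError(
            f"Entry '{entry_name}' has {fragment_count} fragment ions, which "
            f"exceeds the per-entry limit of {MAX_FRAGMENTS_PER_ENTRY}."
        )
    return fragment_count


def rejects(entry_name: str, fragment_count: int) -> bool:
    """True if validation raises and the message names both the entry and the limit."""
    try:
        validate_fragment_count(entry_name, fragment_count)
    except ValueError as err:
        message = str(err)
        return entry_name in message and str(MAX_FRAGMENTS_PER_ENTRY) in message
    return False


boundary_accepted = validate_fragment_count("PEPTIDE/2", 32767) == 32767
over_limit_rejected = rejects("PEPTIDE/2", 32768)
```

Testing both sides of the limit is what distinguishes a correct `>` from an off-by-one `>=`: the second would wrongly reject the 32,767 case.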
nbollis approved these changes Jun 24, 2026
pcruzparri approved these changes Jun 29, 2026
Problem
MslPrecursorRecord.FragmentCount is a signed 16-bit field, but MslWriter narrows the fragment count with an unchecked (short) cast. An entry with more than 32,767 fragment ions silently wraps to a negative count on write, producing a file that throws OverflowException (negative array length) only later, on read; i.e. silent corruption at write time, surfaced far from the cause.

This is reachable in practice: with internal-ion annotation (a supported .msl feature), the fragment count grows roughly quadratically with peptide length, so a proteoform of ~250+ residues annotated with internal ions can exceed 32,767 fragments.

Fix
Validate on write. MslWriter now throws a clear, actionable ArgumentException that names the offending entry and the limit, on both the streaming and in-memory write paths, instead of emitting a corrupt file. No on-disk format change.

Verification
Entry '…' has 32768 fragment ions, which exceeds the .msl per-entry limit of 32767 (FragmentCount is a 16-bit field in MslPrecursorRecord). …

(Previously: silent wrap on write → OverflowException on read.)

Scope
This is the safe, non-breaking guard against silent data corruption. Lifting the limit entirely (widening FragmentCount to int32) is a versioned format change and is intentionally out of scope here.

🤖 Generated with Claude Code
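As a rough check on the "~250+ residues" figure in the Problem section, the quadratic growth can be worked through numerically. The sketch below is in Python and rests on a simplifying assumption that is not taken from the PR: one internal fragment is counted for every contiguous subsequence that touches neither terminus, with no minimum length, a single ion type, and a single charge state. Under that assumption a sequence of n residues has (n-1)(n-2)/2 internal fragments.

```python
INT16_MAX = 32767  # per-entry limit imposed by the 16-bit FragmentCount field


def internal_fragment_count(n_residues: int) -> int:
    """Count contiguous subsequences touching neither terminus: (n-1)(n-2)/2.

    Assumption for illustration only; real annotation settings will differ.
    """
    if n_residues < 3:
        return 0
    return (n_residues - 1) * (n_residues - 2) // 2


def smallest_length_exceeding(limit: int) -> int:
    """Smallest residue count whose internal-fragment count is above the limit."""
    n = 1
    while internal_fragment_count(n) <= limit:
        n += 1
    return n


print(internal_fragment_count(250))          # 30876
print(smallest_length_exceeding(INT16_MAX))  # 258
```

Under this simplified counting, the limit is crossed at 258 residues, which is consistent in order of magnitude with the PR's estimate. Terminal ions, multiple ion types, or multiple charge states would lower the threshold, while a minimum internal-fragment length would raise it.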