Add isotopically-unresolved (average-mass) envelope support#1081
Draft
trishorts wants to merge 17 commits into
Draft
Add isotopically-unresolved (average-mass) envelope support#1081trishorts wants to merge 17 commits into
trishorts wants to merge 17 commits into
Conversation
…lopes Add proton-corrected MostAbundantObservedMass and AverageObservedMass to IsotopicEnvelope, plus an EnvelopeResolution flag, and GetMostAbundantOffset/ GetAverageOffset helpers on AverageResidue. These let downstream search select proteoform candidates by the experimentally most-detectable peak (the most abundant isotopologue, or the centroid for unresolved high-mass species) instead of the often-undetectable monoisotopic peak, avoiding off-by-N precursor mass errors. The existing un-proton-corrected MostAbundantObservedIsotopicMass is retained for compatibility. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tAbundantOffset, clarify mass docs Remove the GetMostAbundantOffset(mass) convenience from AverageResidue: it was just GetDiffToMonoisotopic(GetMostIntenseMassIndex(mass)), so consumers now compose the existing public methods directly (reviewer: "isn't this the same as diff to monoisotopic?"). Clarify the docs distinguishing MostAbundantObservedMass (proton- corrected neutral mass) from the existing MostAbundantObservedIsotopicMass (m/z x |charge|, not proton-corrected) (reviewer: "isn't this handled by the existing field?"). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MostAbundantObservedMass and AverageObservedMass are fully derivable from the readonly Peaks and Charge, so make them computed get-only properties instead of fields set in every constructor. Removes ComputeObservedMasses() and its three call sites, eliminating the constructor boilerplate and the redundant no-op on the single-peak file-read path. Legacy MostAbundantObservedIsotopicMass is unchanged. Addresses review comments from PR smith-chem-wisc#1078 (nbollis). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Bound the average offset from above so a sort-order/sign regression that inflates GetAverageOffset is caught (was lower-bound only). - Tighten the centroid-vs-most-abundant assertion from 0.5 Da slack to 1e-6 now that the perfect-envelope centroid is confirmed >= apex. - Add an AverageObservedMass == mono + GetAverageOffset round-trip, mirroring the existing most-abundant round-trip. - Cover the 5-arg deconvolution constructor via a BuildPerfectPeaks helper shared with the 6-arg path. Addresses test-quality findings from the PR smith-chem-wisc#1078 review. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Conflicts: # mzLib/MassSpectrometry/Deconvolution/AverageResidue/AverageResidue.cs
…o-intensity fallback Address minor test-quality findings from the PR smith-chem-wisc#1078 review. - AverageObservedMass_IsBetweenMonoAndAboveMostAbundant now asserts an upper bound (centroid <= heaviest peak's neutral mass), so the "between" contract the test name advertises is actually enforced rather than only lower-bounded. - Add AverageObservedMass_ZeroTotalIntensity_FallsBackToMostIntensePeakMass covering the totalIntensity == 0 centroid fallback (asserts a real mass, not NaN from divide-by-zero). The empty-peaks guard and GetAverageOffset's zero-total-intensity fallback are unreachable through the public API (the constructor's MaxBy throws on an empty value-type sequence; averagine intensities are never all-zero), so those defensive branches are left documented rather than exercised via reflection. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ield + -1 sentinel Keep this PR focused on the resolved most-abundant-peak feature and move the isotopically-unresolved (centroid/average) machinery to a separate PR. - Expose MostAbundantObservedIsotopicMass as public (was internal) so consumers can read the observed most-abundant peak, and document the terminology (most-abundant vs monoisotopic vs average/centroid) explicitly on the members. - Constructors with no observed isotopic envelope (the neutral-mass-from-file path) now set MostAbundantObservedIsotopicMass = -1 (sentinel for "no most-abundant peak available") instead of a misleading 0. - Derive MostAbundantObservedMass from that single field (proton-corrected; differs by exactly charge*ProtonMass) so the two can never disagree, and it propagates the -1 sentinel. Also removes the per-access peak rescan. - Move out to the unresolved PR: AverageObservedMass, the EnvelopeResolution enum, Resolution/SetResolution, AverageResidue.GetAverageOffset, and their tests. The resolved search uses the pre-existing averagine most-abundant offset (GetDiffToMonoisotopic(GetMostIntenseMassIndex)). Deconvolution suite green (667 passed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011QVxqemPqC9BjvYtVg6aKY
Future-work follow-up to the resolved most-abundant PR. Adds the centroid/ average-mass machinery for isotopically unresolved (high-mass, >~35 kDa) envelopes, which have no individually resolved most-abundant peak. - IsotopicEnvelope.AverageObservedMass: intensity-weighted centroid neutral mass (the mass-spec "average mass"), distinct from the most-abundant (tallest peak) mass; shares the -1 "no observed envelope" sentinel. - EnvelopeResolution enum + Resolution/SetResolution: marks an envelope as Resolved (use most-abundant peak) or Unresolved (use average mass); participates in Equals/GetHashCode. - AverageResidue.GetAverageOffset: monoisotopic -> centroid offset on the averagine model. - TestUnresolvedMass: covers the centroid offset/mass, the zero-intensity and sentinel fallbacks, and the resolution-state equality. Explicit terminology comments throughout (most-abundant vs average/centroid vs monoisotopic). The unresolved charge-determination algorithm that would set Resolution.Unresolved in production is still to come; nothing sets it yet. Deconvolution suite green (674 passed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011QVxqemPqC9BjvYtVg6aKY
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #1081 +/- ##
=======================================
Coverage 82.34% 82.35%
=======================================
Files 382 382
Lines 50518 50550 +32
Branches 6103 6109 +6
=======================================
+ Hits 41600 41630 +30
Misses 7729 7729
- Partials 1189 1191 +2
🚀 New features to boost your workflow:
|
…servedNeutralMass Per review feedback: the property is the proton-corrected neutral mass of the most-abundant peak, so name it explicitly to distinguish it from the m/z×|charge| MostAbundantObservedIsotopicMass. Adjust the doc comments and tests to the new name; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011QVxqemPqC9BjvYtVg6aKY
…to unresolved branch Merge the resolved rename and update this branch's own references (AverageObservedMass + EnvelopeResolution doc crefs, TestUnresolvedMass usages). No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011QVxqemPqC9BjvYtVg6aKY
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Future-work follow-up to #1078, stacked on it. Splits the isotopically-unresolved (high-mass, >~35 kDa) average/centroid-mass machinery out of the resolved most-abundant PR so each PR maps to one concept.
Until #1078 merges, this PR's diff vs
masteralso shows the resolved changes from #1078; the net new surface here is:IsotopicEnvelope.AverageObservedMass— intensity-weighted centroid neutral mass (the mass-spec "average mass"), distinct from the most-abundant (tallest single peak) mass; shares the-1"no observed envelope" sentinel.EnvelopeResolutionenum +Resolution/SetResolution— marks an envelope Resolved (use the most-abundant peak) vs Unresolved (use the average mass); participates inEquals/GetHashCode.AverageResidue.GetAverageOffset— monoisotopic → centroid offset on the averagine model.TestUnresolvedMass— covers the centroid offset/mass, zero-intensity and sentinel fallbacks, and resolution-state equality.Status: draft. The unresolved charge-determination algorithm that would set
Resolution.Unresolvedin production does not exist yet — nothing sets it outside tests. This PR only lands the supporting API + tests so the later algorithm is additive.Terminology is documented explicitly on the members: most-abundant = tallest single peak; average/centroid = intensity-weighted mean; monoisotopic = all-light-isotope mass.
🤖 Generated with Claude Code