Modified config loading to be able to accept flags similar to model weights#652
Modified config loading to be able to accept flags similar to model weights#652bhavyak7-lab wants to merge 13 commits into
Conversation
|
Warning Review limit reached
More reviews will be available in 16 minutes and 15 seconds. Learn how PR review limits work. Your organization has reached its usage spending cap. Adjust your spending cap in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughConfig now maps multiple built-in YAMLs, resolves --config identifiers (exact or partial) or existing paths via _resolve_config_file, stores the resolved path in self._config_file, always loads the default builtin into _params, and conditionally loads a user YAML into _user_config. A unit test verifies timstof aliasing. ChangesConfiguration file resolution enhancement
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
casanovo/config.py (1)
112-120:⚠️ Potential issue | 🟠 Major | ⚡ Quick winLoad the config from the resolved path, not raw input.
Line 119 opens
Path(config_file)instead of the_resolve_config_file(...)result. Alias inputs (e.g., built-in names) still fail at file-open time.Proposed fix
def __init__(self, config_file: Optional[str] = None): """Initialize a Config object.""" - self.file = str(self._resolve_config_file(config_file)) + resolved_config_file = self._resolve_config_file(config_file) + self.file = str(resolved_config_file) with self._default_config.open() as f_in: self._params = yaml.safe_load(f_in) if config_file is None: self._user_config = {} else: - with Path(config_file).open() as f_in: + with resolved_config_file.open() as f_in: self._user_config = yaml.safe_load(f_in)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@casanovo/config.py` around lines 112 - 120, The code opens the original config_file path instead of the resolved path stored by _resolve_config_file, which can fail for aliased inputs; update the block that sets self._user_config to open the resolved path (use the same resolved value as assigned to self.file or call self._resolve_config_file(config_file) and convert to Path) so that Path(...) uses the resolved path rather than raw config_file; ensure you reference and use self.file or the result of _resolve_config_file(...) when creating the Path to open and yaml.safe_load into self._user_config.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@casanovo/config.py`:
- Around line 112-113: The change assigns a filesystem path to self.file even
when the default logical config is used, breaking the previous contract where
config.file could be the string "default"; update the initialization in the
Config class so that self.file remains the string "default" when
_resolve_config_file returned the sentinel/default case (use the existing
sentinel or test for _default_config being used), and only set self.file to a
filesystem path when a non-default config file is resolved; preserve use of
_resolve_config_file and _default_config.open for file IO but ensure self.file
stays "default" for the default flow so downstream metadata continues to receive
the original logical identifier.
- Around line 167-172: The _builtin_configs method only inspects instance attrs
via vars(self) so it misses class-level built-ins like _default_config and
_timstof_config; update _builtin_configs to iterate both the instance and its
class (e.g., inspect vars(self.__class__) and vars(self)) or combine them so
attributes ending with "_config" are found whether defined on the class or
instance, preserving the same filtering (isinstance(value, Path)) and the name
normalization (removeprefix/removesuffix) used currently.
---
Outside diff comments:
In `@casanovo/config.py`:
- Around line 112-120: The code opens the original config_file path instead of
the resolved path stored by _resolve_config_file, which can fail for aliased
inputs; update the block that sets self._user_config to open the resolved path
(use the same resolved value as assigned to self.file or call
self._resolve_config_file(config_file) and convert to Path) so that Path(...)
uses the resolved path rather than raw config_file; ensure you reference and use
self.file or the result of _resolve_config_file(...) when creating the Path to
open and yaml.safe_load into self._user_config.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: be0962d1-df1b-4a58-b062-8a00ee2e78b0
📒 Files selected for processing (1)
casanovo/config.py
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #652 +/- ##
==========================================
- Coverage 95.47% 95.46% -0.01%
==========================================
Files 14 14
Lines 1656 1676 +20
==========================================
+ Hits 1581 1600 +19
- Misses 75 76 +1 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
casanovo/config.py (1)
169-174:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBuilt-in config discovery still excludes class-level built-ins.
At Line 172, iterating
vars(self)misses class attributes like_default_configand_timstof_config. That leaves_builtin_configs()empty during resolution, so flag-style built-in names (for example,timstof) fail to resolve and raiseFileNotFoundError.💡 Minimal fix
def _builtin_configs(self): + attrs = {**vars(type(self)), **vars(self)} return { name.removeprefix("_").removesuffix("_config"): value - for name, value in vars(self).items() + for name, value in attrs.items() if name.endswith("_config") and isinstance(value, Path) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@casanovo/config.py` around lines 169 - 174, _builtin_configs currently only inspects instance attributes via vars(self) and therefore misses class-level built-ins like _default_config and _timstof_config; update _builtin_configs to also include class attributes (e.g. merge vars(type(self)) with vars(self) or use inspect.getmembers(self.__class__) / vars(self.__class__)) so the dict comprehension finds entries whose names end with "_config" and whose values are Path instances; keep the same name normalization (removeprefix/removesuffix) and return semantics but iterate both sources (class then instance) to ensure class-level built-ins are discovered.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@casanovo/config.py`:
- Around line 178-181: The warnings.warn call in casanovo.config that emits "No
configuration specified; using default configuration." should include
stacklevel=2 so the warning points to the caller's code instead of this module;
update the warnings.warn invocation in the configuration initialization (the
warn call located near the default-config handling in casanovo.config) to pass
stacklevel=2 as an argument.
---
Duplicate comments:
In `@casanovo/config.py`:
- Around line 169-174: _builtin_configs currently only inspects instance
attributes via vars(self) and therefore misses class-level built-ins like
_default_config and _timstof_config; update _builtin_configs to also include
class attributes (e.g. merge vars(type(self)) with vars(self) or use
inspect.getmembers(self.__class__) / vars(self.__class__)) so the dict
comprehension finds entries whose names end with "_config" and whose values are
Path instances; keep the same name normalization (removeprefix/removesuffix) and
return semantics but iterate both sources (class then instance) to ensure
class-level built-ins are discovered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1f5f0ff1-e9be-4013-8406-aac8d55d275f
📒 Files selected for processing (1)
casanovo/config.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
casanovo/config.py (1)
176-182:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd
stacklevel=2to the warning.The
warnings.warncall at lines 178-181 lacksstacklevel=2, causing the warning to point at this module's internals rather than the caller's site. A past review flagged this and it was marked as addressed, but the fix is not present in the current code.Proposed fix
if config_file is None: warnings.warn( "No configuration specified; using default configuration.", UserWarning, + stacklevel=2, ) return self._orbitrap_config🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@casanovo/config.py` around lines 176 - 182, The warnings.warn call inside _resolve_config_file currently omits stacklevel, so update the call to include stacklevel=2 (i.e., warnings.warn("No configuration specified; using default configuration.", UserWarning, stacklevel=2)) so the warning points to the caller; modify the warning invocation in the _resolve_config_file method to pass stacklevel=2.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@casanovo/config.py`:
- Around line 176-182: The warnings.warn call inside _resolve_config_file
currently omits stacklevel, so update the call to include stacklevel=2 (i.e.,
warnings.warn("No configuration specified; using default configuration.",
UserWarning, stacklevel=2)) so the warning points to the caller; modify the
warning invocation in the _resolve_config_file method to pass stacklevel=2.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4b43ff06-711d-4716-bdfa-37c31e2d7066
📒 Files selected for processing (2)
casanovo/config.pytests/unit_tests/test_config.py
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit_tests/test_config.py (1)
19-28: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winAdd docstring for consistency.
All other test functions in this file include docstrings. Please add one here to document what this test verifies (e.g., "Test that timstof config resolution works with exact and partial matches").
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit_tests/test_config.py` around lines 19 - 28, Add a docstring to the test_timstof function describing its purpose (e.g., "Test that timstof config resolution works with exact and partial matches"); locate the function named test_timstof and place a short descriptive string literal immediately below its def line so the test matches the style of other tests in the file and documents that it verifies Config resolution for both exact ("timstof") and partial ("tims") inputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@casanovo/config.py`:
- Around line 189-203: The current permissive substring matching (using key
derived from config file and self._builtin_configs with the matches = [(name,
path) for name, path in builtins.items() if key in name] logic) allows
single-character keys to match unintended builtins; change the resolution logic
to first check for an exact key in builtins, then only perform substring
matching when len(key) >= 3 (or another minimal threshold you choose), and if
multiple substring matches remain return an explicit error or list the candidate
names (include the matched names from _builtin_configs in the message) rather
than silently picking one; update the branch that currently returns
matches[0][1] to instead surface the ambiguity or require a longer/explicit key.
- Around line 51-52: Remove the redundant module-level Path attributes
_orbitrap_config and _timstof_config and update code that references them to use
the canonical mapping _builtin_configs keyed by _default_config_key;
specifically, delete the definitions of _orbitrap_config and _timstof_config,
replace any use of _orbitrap_config (e.g., the references at the locations
calling it around where config is loaded and where it’s passed into functions —
previously at the spots noted in the review) with
_builtin_configs[_default_config_key], and ensure any leftover reference to
_timstof_config is removed or replaced similarly so only
_builtin_configs/_default_config_key drive built-in paths.
- Line 51: The builtin mapping for "orbitrap" is pointing to a non-existent
config_orbitrap.yaml while _orbitrap_config is defined as Path(__file__).parent
/ "config.yaml"; update _builtin_configs["orbitrap"] to reference
_orbitrap_config (or the canonical "config.yaml") so resolving "orbitrap" opens
the existing file, and either remove the unused _timstof_config symbol or wire
it into _builtin_configs if a TIMSTOF builtin is intended; modify the
_builtin_configs dict and cleanup _timstof_config accordingly.
In `@CHANGELOG.md`:
- Line 9: Update the CHANGELOG entry to add a trailing period and replace the
vague phrase "flag matching" with a clearer description such as "select configs
by model type using exact or partial name matching" (so the line reads: "Added
modified config loading to allow selecting configs by model type using exact or
partial name matching.").
In `@docs/file_formats.md`:
- Line 223: In docs/file_formats.md locate the sentence "Casanovo can be ran by
specifying a config file based on the type of model." and replace the incorrect
phrase "can be ran" with the correct past participle "can be run" so the line
reads "Casanovo can be run by specifying a config file based on the type of
model."
In `@tests/unit_tests/test_config.py`:
- Around line 30-34: Update the test in tests/unit_tests/test_config.py to
remove the unused local assignments inside the pytest.raises contexts and assert
error messages with the match parameter: replace "with
pytest.raises(ValueError): config = Config("t")" with "with
pytest.raises(ValueError, match='Ambiguous config'):\n Config('t')" and
replace "with pytest.raises(FileNotFoundError): config = Config('zzz')" with
"with pytest.raises(FileNotFoundError, match='Unknown config'):\n
Config('zzz')". This references the Config constructor behavior in
casanovo/config.py.
---
Outside diff comments:
In `@tests/unit_tests/test_config.py`:
- Around line 19-28: Add a docstring to the test_timstof function describing its
purpose (e.g., "Test that timstof config resolution works with exact and partial
matches"); locate the function named test_timstof and place a short descriptive
string literal immediately below its def line so the test matches the style of
other tests in the file and documents that it verifies Config resolution for
both exact ("timstof") and partial ("tims") inputs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6371f635-c555-4e3d-a697-f842361340fc
📒 Files selected for processing (4)
CHANGELOG.mdcasanovo/config.pydocs/file_formats.mdtests/unit_tests/test_config.py
| Casanovo operates based on settings defined in a [YAML configuration file](https://github.com/Noble-Lab/casanovo/blob/main/casanovo/config.yaml). | ||
| This file contains several options that affect how Casanovo processes your data and predicts peptide sequences. | ||
| If you run Casanovo without specifying a configuration file, it uses a set of default settings. | ||
| Casanovo can be ran by specifying a config file based on the type of model. |
There was a problem hiding this comment.
Grammar: Replace "can be ran" with "can be run".
"Ran" is the simple past tense; the correct past participle after "can be" is "run."
📝 Proposed fix
-Casanovo can be ran by specifying a config file based on the type of model.
+Casanovo can be run by specifying a config file based on the type of model. 📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Casanovo can be ran by specifying a config file based on the type of model. | |
| Casanovo can be run by specifying a config file based on the type of model. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/file_formats.md` at line 223, In docs/file_formats.md locate the
sentence "Casanovo can be ran by specifying a config file based on the type of
model." and replace the incorrect phrase "can be ran" with the correct past
participle "can be run" so the line reads "Casanovo can be run by specifying a
config file based on the type of model."
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
casanovo/config.py (1)
182-184:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winValidate that an existing config path is a file, not just “exists”.
Line 183 accepts directories; later
open()then fails withIsADirectoryError. This should fail early with a clear config resolution error.Suggested fix
config_path = Path(config_file) - if config_path.exists(): + if config_path.is_file(): return config_path + if config_path.exists(): + raise FileNotFoundError( + f"Config path '{config_file}' is not a file." + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@casanovo/config.py` around lines 182 - 184, The current config resolution uses config_path.exists() which accepts directories and later causes IsADirectoryError on open; change the check to ensure the path is a file (use config_path.is_file()) and if not raise a clear configuration resolution error (e.g., raise a ValueError or the module's ConfigResolutionError) with a descriptive message; update the block where config_path = Path(config_file) / the function that returns the resolved config path so callers get an early, explicit error rather than an IsADirectoryError.
♻️ Duplicate comments (2)
docs/file_formats.md (1)
236-236:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winGrammar: Replace "can be ran" with "can be run".
"Ran" is the simple past tense; the correct past participle after "can be" is "run."
📝 Proposed fix
-Casanovo can be ran by specifying a config file based on the type of model. +Casanovo can be run by specifying a config file based on the type of model.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/file_formats.md` at line 236, Replace the incorrect phrase "Casanovo can be ran by specifying a config file based on the type of model." with "Casanovo can be run by specifying a config file based on the type of model." — locate the exact sentence in the docs (the line containing that phrase) and update "ran" to "run" to correct the grammar.CHANGELOG.md (1)
9-9:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd proper changelog subsection header.
The Unreleased section should follow the same structure as other releases with subsection headers (### Added, ### Changed, ### Fixed). Line 9 should be under a
### Addedsubsection to maintain consistency with the Keep a Changelog format.📝 Proposed fix
## [Unreleased] +### Added + - Added config file selection by model type name with exact or partial string matching (e.g., `--config timstof` or `--config tims`). + ## [5.2.0] - 2026-06-02🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CHANGELOG.md` at line 9, Add a "### Added" subsection header under the "Unreleased" section and place the existing bullet "Added config file selection by model type name with exact or partial string matching (e.g., `--config timstof` or `--config tims`)." beneath it so the Unreleased section follows the same Keep a Changelog structure as other releases; update the CHANGELOG.md Unreleased block to include the "### Added" header and move that line into that subsection.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@casanovo/config.py`:
- Around line 195-199: The list comprehension that builds matches [(name, path)
for name, path in builtins.items() if key in name and len(key) > 2] prevents
short keys (e.g., "t") from participating and thus bypasses the ambiguous-value
handling; remove the length gate so matches is computed as [(name, path) for
name, path in builtins.items() if key in name], ensuring the
ambiguous-resolution branch (ValueError) can be reached when multiple builtins
match the provided key (referencing the variables key, builtins and the matches
comprehension in casanovo/config.py).
---
Outside diff comments:
In `@casanovo/config.py`:
- Around line 182-184: The current config resolution uses config_path.exists()
which accepts directories and later causes IsADirectoryError on open; change the
check to ensure the path is a file (use config_path.is_file()) and if not raise
a clear configuration resolution error (e.g., raise a ValueError or the module's
ConfigResolutionError) with a descriptive message; update the block where
config_path = Path(config_file) / the function that returns the resolved config
path so callers get an early, explicit error rather than an IsADirectoryError.
---
Duplicate comments:
In `@CHANGELOG.md`:
- Line 9: Add a "### Added" subsection header under the "Unreleased" section and
place the existing bullet "Added config file selection by model type name with
exact or partial string matching (e.g., `--config timstof` or `--config tims`)."
beneath it so the Unreleased section follows the same Keep a Changelog structure
as other releases; update the CHANGELOG.md Unreleased block to include the "###
Added" header and move that line into that subsection.
In `@docs/file_formats.md`:
- Line 236: Replace the incorrect phrase "Casanovo can be ran by specifying a
config file based on the type of model." with "Casanovo can be run by specifying
a config file based on the type of model." — locate the exact sentence in the
docs (the line containing that phrase) and update "ran" to "run" to correct the
grammar.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 851337e5-f385-44ed-9ae5-68b1aec5a3af
📒 Files selected for processing (4)
CHANGELOG.mdcasanovo/config.pydocs/file_formats.mdtests/unit_tests/test_config.py
Summary by CodeRabbit
New Features
Bug Fixes / Improvements
Documentation
Tests