Releases: mllam/neural-lam
v0.6.0
This release introduces new features including GIF animation support, wandb run resumption, and improved ensemble loading, alongside a large number of bug fixes and maintenance updates.
Added
- Add support for GIF animation generation for model predictions #218 @kartikangiras
- Add `AGENTS.md` file to the repo to give agents more information about the codebase and the contribution culture. #416 @sadamov
- Enable `pin_memory` in DataLoaders when GPU is available for faster async CPU-to-GPU data transfers #236 @abhaygoudannavar
- Expose `--wandb_id` CLI argument to allow resuming an existing W&B run by ID. When provided, `resume="allow"` is set automatically so the same job script works for both the initial submission and all resubmissions, making it suitable for HPC systems with limited job runtimes or that may crash. #197 @Mani212005
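The `--wandb_id` behaviour above can be sketched roughly as follows. This is an illustration under stated assumptions, not neural-lam's actual code: the helper `build_wandb_init_kwargs` and the project name are hypothetical, and the resulting dictionary is only meant to show which keyword arguments (`id`, `resume`) would be forwarded to `wandb.init`.

```python
import argparse


def build_wandb_init_kwargs(args):
    """Hypothetical helper: assemble keyword arguments for wandb.init()."""
    kwargs = {"project": "neural-lam-demo"}  # placeholder project name
    if args.wandb_id is not None:
        # Reuse the given run ID. With resume="allow", W&B resumes the run
        # if it already exists and otherwise starts a new run with that ID.
        kwargs["id"] = args.wandb_id
        kwargs["resume"] = "allow"
    return kwargs


parser = argparse.ArgumentParser()
parser.add_argument("--wandb_id", type=str, default=None)

# Without --wandb_id: a fresh run, no resume settings.
first = build_wandb_init_kwargs(parser.parse_args([]))
# With --wandb_id: initial submission and resubmissions share one job script.
resubmitted = build_wandb_init_kwargs(
    parser.parse_args(["--wandb_id", "abc123"])
)
```

Because `resume="allow"` tolerates both a missing and an existing run, the same command line can be used unchanged every time the job is resubmitted.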
Changed
- Change the default ensemble-loading behavior in `WeatherDataset`/`WeatherDataModule` to use all ensemble members as independent samples for ensemble datastores (with matching ensemble-member selection for forcing when available); single-member behavior now requires explicitly opting in via `--load_single_member` #332 @kshirajahere
- Refactor graph loading: move zero-indexing out of the model and update plotting to prepare using the research-branch graph I/O #184 @zweihuehner
- Replace `print()`-based `rank_zero_print` with `loguru` `logger.info()` for structured log-level control #33
Fixed
- Fix validation crash in `plot_error_map` and resolve DDP NCCL initialization error on single-device setups #193 @AdityaKumarSethia
- Fix `--metrics_watch` handling to avoid AttributeError when unset and improve warning behavior during evaluation #420 @archit7-beep
- Standardize all script references to use `create_graph` instead of the legacy `create_mesh` name in README and `pyproject.toml`, and fix minor README typos #426 @GiGiKoneti
- Initialize `da_forcing_mean` and `da_forcing_std` to `None` when forcing data is absent, fixing `AttributeError` in `WeatherDataset` with `standardize=True` #369 @Sir-Sloth-The-Lazy
- Ensure proper sorting of `analysis_time` in `NpyFilesDatastoreMEPS._get_analysis_times` independent of the order in which files are processed with glob #386 @Gopisokk
- Switch to lat/lon-based plotting with `pcolormesh` and `cartopy` for accurate spatial visualisation regardless of underlying projection. #168 @sadamov
- Replace `shell=True` subprocess call in `compute_standardization_stats.py` with a safe argument list and Python-side hostname parsing to prevent command injection via `SLURM_JOB_NODELIST` #264 @ashum9
- Avoid NaN when standardizing fields with zero std #189 @varunsiravuri
- Replaces multiple `assert` statements used for runtime input validation with explicit `ValueError` #279 @Sir-Sloth-The-Lazy
- Fix README image paths to use absolute GitHub URLs so images display correctly on PyPI #188 @bk-simon
- Fix typo in `ar_model.py` that causes `AttributeError` during evaluation #204 @Ritinikhil
- Changed the hardcoded True to a conditional check "persistent_workers=self.num_workers > 0" #235 @santhil-cyber
- Avoid eager download of the MEPS example dataset during pytest collection by lazily initializing it in `tests/conftest.py`, allowing tests to run without triggering a dataset download at import time. #391 @Saptami191
- `fractional_plot_bundle` now correctly multiplies by fraction instead of dividing #222 @santhil-cyber
- Fix `all_gather_cat` producing wrong shapes on single-device runs by only flattening when `all_gather` actually introduces a new leading dimension #424 @RajdeepKushwaha5
- Infer spatial coordinate names for MDPDatastore (rather than assuming names `x` and `y`), allows for e.g. lat/lon regular grids #169 @leifdenby
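To illustrate the zero-std item in the list above: dividing by a standard deviation of zero yields NaN or inf in array libraries (and a `ZeroDivisionError` in plain Python). One common guard, shown here as a sketch, is to fall back to a divisor of 1. The function name, the `eps` threshold and the fallback value are assumptions for illustration; the actual fix in #189 may work differently.

```python
def standardize(values, mean, std, eps=1e-8):
    """Standardize values, guarding against a (near-)zero standard deviation."""
    # Assumption: treat any std at or below eps as 1.0, so that a constant
    # field maps to zeros instead of producing NaN/inf.
    safe_std = std if std > eps else 1.0
    return [(v - mean) / safe_std for v in values]


# A constant field has std == 0; without the guard this would divide by zero.
constant_field = [3.0, 3.0, 3.0]
result = standardize(constant_field, mean=3.0, std=0.0)

# A field with non-zero std is standardized as usual.
regular = standardize([1.0, 3.0], mean=2.0, std=1.0)
```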
Maintenance
- Update PR template to clarify milestone/roadmap requirement and maintenance changes #186 @joeloskarsson
- Update CI/CD to use python 3.13 for testing and full range of current python versions for linting (3.10 - 3.14) #173 @observingClouds
- Move development dependencies to dependency-group #174 @observingClouds
- Update CI/CD to use only uv for full test suite and drop pdm #178 @observingClouds
- Fix caching of MEPS example data in CI/CD #181 @observingClouds
- Migrated build backend from PDM to Hatchling with hatch-vcs and added uv build in deploy CI
- Warn when running with `--eval` without `--load` to avoid accidentally evaluating randomly initialized weights #190 @varunsiravuri
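The `--eval` without `--load` warning from the list above could look roughly like the sketch below. The helper name, the warning text and the argument definitions are hypothetical; in particular, it is assumed here that `--eval` selects a data split (`val` or `test`) and that `--load` takes a checkpoint path.

```python
import argparse
import warnings


def warn_if_eval_without_load(args):
    """Return True (and warn) when evaluating without a loaded checkpoint."""
    if args.eval is not None and args.load is None:
        warnings.warn(
            "--eval was given without --load: the model would be evaluated "
            "with randomly initialized weights.",
            UserWarning,
        )
        return True
    return False


parser = argparse.ArgumentParser()
# Assumption: --eval selects a data split and --load takes a checkpoint path.
parser.add_argument("--eval", type=str, default=None, choices=["val", "test"])
parser.add_argument("--load", type=str, default=None)

args = parser.parse_args(["--eval", "val"])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is recorded
    warned = warn_if_eval_without_load(args)
```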
v0.5.0
This release contains maintenance updates and fixes, preventing some unexpected crashes and improving CI/CD and testing.
Added
- Expose run name as optional command line argument `--logger_run_name` to allow user-defined names #156 @observingClouds
Fixed
- Change default logging argument to prevent crash when running eval #145 @joeloskarsson
- Fix wrong grid dimensionality when running with --output_std, resulting in crash #147 @joeloskarsson
- Fix the order in create_graph.py which caused wrong G2M and M2G #150 @YUTAIPAN
- Adding a more robust LaTeX availability check function #162 @lorenzo30salgado
Maintenance
- Add link to full MEPS data #102 @joeloskarsson
- Introducing `mypy` for static type checking and fixing type hints accordingly #113 @observingClouds
- Change all argparse instances to use ArgumentDefaultsHelpFormatter to make defaults easier to maintain. #145 @joeloskarsson
- Allow triggering CI manually to e.g. test for recent software incompatibilities. #152 @observingClouds
- Fix `torch` version detection during CI when testing on CPU with pdm #154 @leifdenby
- Update link to MEPS example data #155 @joeloskarsson
- Change deprecated `pynvml` dependency to `nvidia-ml-py` #176 @observingClouds
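As a brief illustration of the `ArgumentDefaultsHelpFormatter` item above: with this standard-library formatter, argparse appends each argument's default to its help text, so defaults only need to be maintained in one place. The `--epochs` argument and its default value are made up for this sketch and are not taken from neural-lam.

```python
import argparse

parser = argparse.ArgumentParser(
    description="Demo parser",
    # Appends "(default: ...)" to the help text of each argument
    # that has a help string.
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
# Hypothetical argument, used only to demonstrate the formatter.
parser.add_argument("--epochs", type=int, default=200, help="training epochs")

help_text = parser.format_help()
```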
v0.4.0
This release introduces a number of improvements to logging, multi-node training and variable rescaling, without making any major changes to the neural-lam structure.
Added
- Add support for MLFlow logging and metrics tracking. #77 @khintz
- Add support for multi-node training. #103 @SimonKamuk @sadamov
- Add option to clamp output prediction using limits specified in config file #92 @SimonKamuk
- Add publication of releases to pypi.org. #71 @leifdenby, @observingClouds
Fixed
- Only print on rank 0 to avoid duplicates of all print statements. #103 @SimonKamuk @sadamov
- Fix MLFlow exception import introduced in #77. #111 @observingClouds
- Fix duplicate tensor copy to CPU #106 @observingClouds
- Fix bug where the inverse_softplus used in clamping caused nans in the gradients #123 @SimonKamuk
- Add standardization to state diff stats from mdp datastore #122 @SimonKamuk
- Set ci/cd badges to refer to the new test matrix #130 @SimonKamuk
- use correct split of data with the `--eval val` or `--eval test` cli arguments #139 @SimonKamuk
Maintenance
- update ci/cd testing to use cuda 12.8 #140 @SimonKamuk
- update ci/cd testing to use pre-commit v3.0.1 #140 @SimonKamuk
- update AWS GPU ci/cd to use ami with larger (200GB) root volume and ensure nvme drive is used for pip venv #126, @leifdenby
- update ci/cd testing setup to install torch version compatible with neural-lam dependencies #115, @leifdenby
- switch to new npyfiles MEPS and mdp DANRA test datasets which are coincident in time and space (on cropped ~100x100 grid-point domain) #110, @leifdenby
- use dynamic versioning based on git tags and commit hashes #118, @observingClouds
- add detect_anomaly=True to pl.Trainer in test_training.py #124, @SimonKamuk
v0.3.0
This release introduces Datastores to represent input data from different sources (including zarr and numpy) while keeping graph generation within neural-lam.
Added
- Introduce Datastores to represent input data from different sources, including zarr and numpy. #66 @leifdenby @sadamov
Fixed
- Fix wandb environment variable disabling wandb during tests. Now correctly uses WANDB_MODE=disabled. #94 @joeloskarsson
- Fix bugs introduced with datastores functionality relating to visualisation plots #91 @leifdenby
v0.2.0
Highlights
This release focuses on setting up the neural-lam repository and codebase to enable collaboration.
Detailed Changes
Added
- Added tests for loading dataset, creating graph, and training model based on reduced MEPS dataset stored on AWS S3, along with automatic running of tests on push/PR to GitHub, including push to main branch. Added caching of test data to speed up running tests. #38 #55 @SimonKamuk
- Replaced `constants.py` with `data_config.yaml` for data configuration management #31 @sadamov
- new metrics (`nll` and `crps_gauss`) and `metrics` submodule, stddiv output option c14b6b4 @joeloskarsson
- ability to "watch" metrics and log c14b6b4 @joeloskarsson
- pre-commit setup for linting and formatting #6, #8 @sadamov, @joeloskarsson
- added github pull-request template to ease contribution and review process #53, @leifdenby
- ci/cd setup for running both CPU and GPU-based testing both with pdm and pip based installs #37, @khintz, @leifdenby
Changed
- Clarify routine around requesting reviewer and assignee in PR template #74 @joeloskarsson
- Argument Parser updated to use action="store_true" instead of 0/1 for boolean arguments. (#72) @ErikLarssonDev
- Optional multi-core/GPU support for statistics calculation in `create_parameter_weights.py` #22 @sadamov
- Robust restoration of optimizer and scheduler using `ckpt_path` #17 @sadamov
- Updated scripts and modules to use `data_config.yaml` instead of `constants.py` #31 @sadamov
- Added new flags in `train_model.py` for configuration previously in `constants.py` #31 @sadamov
- moved batch-static features ("water cover") into forcing component returned by `WeatherDataset` #13 @joeloskarsson
- change validation metric from `mae` to `rmse` c14b6b4 @joeloskarsson
- change RMSE definition to compute sqrt after all averaging #10 @joeloskarsson
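The RMSE change in the list above matters because the square root does not commute with averaging. The sketch below contrasts taking the square root after all averaging with taking it per batch and then averaging. The function names and numbers are illustrative only, and the "sqrt first" variant is an assumption used for contrast, not a description of the earlier neural-lam code.

```python
import math


def rmse_sqrt_last(squared_errors):
    """Average all squared errors first, then take a single square root."""
    flat = [e for batch in squared_errors for e in batch]
    return math.sqrt(sum(flat) / len(flat))


def rmse_sqrt_first(squared_errors):
    """Take a square root per batch, then average the per-batch RMSEs."""
    per_batch = [math.sqrt(sum(b) / len(b)) for b in squared_errors]
    return sum(per_batch) / len(per_batch)


# Two batches of squared errors: one perfect, one with large errors.
squared_errors = [[0.0, 0.0], [16.0, 16.0]]

after_averaging = rmse_sqrt_last(squared_errors)    # sqrt(32 / 4) = sqrt(8)
before_averaging = rmse_sqrt_first(squared_errors)  # (0 + 4) / 2 = 2.0
```

Since the square root is concave, averaging per-batch roots never exceeds the root of the overall mean, so the two definitions generally report different values for the same errors.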
Removed
- `WeatherDataset(torch.Dataset)` no longer returns "batch-static" component of training item (only `prev_state`, `target_state` and `forcing`), the batch static features are instead included in forcing #13 @joeloskarsson
Maintenance
- simplify pre-commit setup by 1) reducing linting to only cover static analysis excluding imports from external dependencies (this will be handled in build/test cicd action introduced later), 2) pinning versions of linting tools in pre-commit config (and remove from `requirements.txt`) and 3) using github action to run pre-commit. #29 @leifdenby
- change copyright formulation in license to encompass all contributors #47 @joeloskarsson
- Fix incorrect ordering of x- and y-dimensions in comments describing tensor shapes for MEPS data #52 @joeloskarsson
- Cap numpy version to < 2.0.0 (this cap was removed in #37, see below) #68 @joeloskarsson
- Remove numpy < 2.0.0 version cap #37 @leifdenby
- turn `neural-lam` into a python package by moving all `*.py`-files into the `neural_lam/` source directory and updating imports accordingly. This means all cli functions are now invoked through the package name, e.g. `python -m neural_lam.train_model` instead of `python train_model.py` (and can be done anywhere once the package has been installed). #32, @leifdenby
- move from `requirements.txt` to `pyproject.toml` for defining package dependencies. #37, @leifdenby
- Add slack and new publication info to readme #78 @joeloskarsson
Compatibility
This version has been tested with Python 3.9-3.12.
Upgrade Steps
To upgrade to neural-lam v0.2.0:
- Update your local version: `git pull` the latest changes from the repository and install locally with `pip install -e .`
  - Note: This release is not yet available on PyPI. You will need to install it from the GitHub repository.
- Adapt any code relying on constants previously defined in `neural_lam/constants.py` to instead read them from the new YAML config file.
v0.1.0
Initial version of Neural-LAM, reproduces the workshop paper.