Releases · huggingface/pytorch-image-models

08 May 20:06

rwightman

v1.0.27

0e968e1

Release v1.0.27 Latest

April 23, 2026

Add Gemma4 ViT encoders w/ NaFlex pipeline support (variable aspect/size per image). Thanks Yonghye Kwon
Support DINOv3 weights in NaFlexVit. Thanks Yonghye Kwon
Some improvements to Muon fallback (AdamW/NadamW) lr behavior

What's Changed

🔒 Pin GitHub Actions to commit SHAs by @paulinebm in #2689
Improve fallback (adamw/nadamw) LR handling for Muon optimizer by @rwightman in #2688
Fix NaFlexVit DINOv3 support: propagate rope_type and rotate_half by @developer0hye in #2692
chore: bump doc-builder SHA for PR upload workflow by @rtrompier in #2694
Gemma4 by @rwightman in #2697
Add encoder_pool option to gemma4 classification model to toggle soft… by @rwightman in #2698
Fix some performance regressions with torch.compile + Tasks. Fix #2693 by @rwightman in #2699

New Contributors

@paulinebm made their first contribution in #2689

Full Changelog: v1.0.26...v1.0.27

Contributors

rtrompier, rwightman, and 2 other contributors

23 Mar 18:13

rwightman

v1.0.26

8d0f79e

Release v1.0.26

March 23, 2026

Improve pickle checkpoint handling security. Default all loading to weights_only=True, add safe_global for ArgParse.
Improve attention mask handling for core ViT/EVA models & layers. Resolve bool masks, pass is_causal through for SSL tasks.
Fix class & register token uses with ViT and no pos embed enabled.
Add Patch Representation Refinement (PRR) as a pooling option in ViT. Thanks Sina (https://github.com/sinahmr).
Improve consistency of output projection / MLP dimensions for attention pooling layers.
Hiera model F.SDPA optimization to allow Flash Attention kernel use.
Caution added to SGDP optimizer.
Release 1.0.26. First maintenance release since my departure from Hugging Face.
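The reason for defaulting to weights_only=True is that an unrestricted pickle load can import and call arbitrary globals. The allow-list idea can be sketched with the standard library alone. This is an illustration of the concept, not how timm or PyTorch implement it; `SAFE_GLOBALS`, `AllowListUnpickler`, and `safe_loads` are hypothetical names.

```python
import argparse
import io
import pickle

# Hypothetical allow-list: only these (module, name) pairs may be resolved.
SAFE_GLOBALS = {("argparse", "Namespace")}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not explicitly approved."""

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


def safe_loads(data: bytes):
    return AllowListUnpickler(io.BytesIO(data)).load()


# Plain containers plus an allow-listed class load normally.
checkpoint = {"epoch": 3, "args": argparse.Namespace(lr=0.1)}
restored = safe_loads(pickle.dumps(checkpoint))

# A payload that references a non-approved global is rejected at load time.
payload = pickle.dumps(print)
try:
    safe_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Checkpoints that store training args as an argparse Namespace are the typical case where one extra class has to be approved explicitly, which is presumably what the safe_global entry in the note above addresses.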

What's Changed

fix: replace 5 bare except clauses with except Exception by @haosenwang1018 in #2672
Add timmx model export tool to README by @Boulaouaney in #2673
Enhance SGDP optimizer with caution parameter by @Yuan-Jinghui in #2675
Fix CLS and Reg tokens usage when pos_embed is disabled by @sinahmr in #2676
default weights_only=True for load fns by @rwightman in #2679
Fix Hiera global attention to use 4D tensors for efficient SDPA dispatch by @Raiden129 in #2680
Improve 2d and latent attention pool dimension handling. Fix #2682 by @rwightman in #2684
Improve attention mask handling for vision_transformer and eva and related blocks by @rwightman in #2686
Implement PRR as a pooling module. Alternative to #2678 by @rwightman in #2685
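For context on the bare except change listed above: a bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, while `except Exception` lets them propagate. A minimal sketch of the difference follows; the function names are made up for illustration and are not taken from timm.

```python
def run_guarded_bare(fn):
    """Bare except: swallows everything, including KeyboardInterrupt."""
    try:
        fn()
    except:  # noqa: E722 (kept only to illustrate the problem)
        return "swallowed"
    return "ok"


def run_guarded(fn):
    """except Exception: handles ordinary errors, lets interrupts propagate."""
    try:
        fn()
    except Exception:
        return "swallowed"
    return "ok"


def interrupt():
    raise KeyboardInterrupt


def failure():
    raise ValueError("bad value")


bare_result = run_guarded_bare(interrupt)  # the interrupt is silently eaten
normal_result = run_guarded(failure)       # ordinary errors are still handled

try:
    run_guarded(interrupt)
    propagated = False
except KeyboardInterrupt:
    propagated = True                      # the interrupt reaches the caller
```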

New Contributors

@haosenwang1018 made their first contribution in #2672
@Raiden129 made their first contribution in #2680

Full Changelog: v1.0.25...v1.0.26

Contributors

rwightman, sinahmr, and 4 other contributors

23 Feb 17:22

rwightman

v1.0.25

9326ff2

Release v1.0.25

Feb 23, 2026

Add token distillation training support to distillation task wrappers
Remove some torch.jit usage in prep for official deprecation
Caution added to AdamP optimizer
Call reset_parameters() even if meta-device init so that buffers get init w/ hacks like init_empty_weights
Tweak Muon optimizer to work with DTensor/FSDP2 (clamp_ instead of clamp_min_, alternate NS branch for DTensor)
Release 1.0.25

Jan 21, 2026

Compat Break: Fix oversight w/ QKV vs MLP bias in ParallelScalingBlock (& DiffParallelScalingBlock)
- Does not impact any trained timm models but could impact downstream use.

What's Changed

Token distill task & distill task refactoring by @rwightman in #2647
Fix distilled head dropout using wrong token in PiT forward_head by @hassonofer in #2649
Fix #2653, no models with weights impacted so just a clean fix by @rwightman in #2654
Add the cautious optimizer to AdamP. by @Yuan-Jinghui in #2657
Enhance the numerical stability of the Cautious Optimizer by @Yuan-Jinghui in #2658
Some misc fixes for torch.jit deprecation and meta device init by @rwightman in #2664
fix(optim): replace bare except with Exception in Lion optimizer by @llukito in #2666
Change clamp_min_ to clamp_(min=) as former doesn't work with DTensor / FSDP2 by @rwightman in #2668
Add DTensor compatible NS impl for Muon by @rwightman in #2669

New Contributors

@Yuan-Jinghui made their first contribution in #2657
@llukito made their first contribution in #2666

Full Changelog: v1.0.24...v1.0.25

Contributors

rwightman, hassonofer, and 2 other contributors

07 Jan 00:28

rwightman

v1.0.24

90cae8c

Release v1.0.24

Jan 5 & 6, 2026

Patch Release 1.0.24 (fix for 1.0.23)
Add new benchmark result csv files for inference timing on all models w/ RTX Pro 6000, 5090, and 4090 cards w/ PyTorch 2.9.1
Fix moved module error in deprecated timm.models.layers import path that impacts legacy imports
Release 1.0.23

Dec 30, 2025

Add better NAdaMuon trained dpwee, dwee, dlittle (differential) ViTs with a small boost over previous runs
- https://huggingface.co/timm/vit_dlittle_patch16_reg1_gap_256.sbb_nadamuon_in1k (83.24% top-1)
- https://huggingface.co/timm/vit_dwee_patch16_reg1_gap_256.sbb_nadamuon_in1k (81.80% top-1)
- https://huggingface.co/timm/vit_dpwee_patch16_reg1_gap_256.sbb_nadamuon_in1k (81.67% top-1)
Add a ~21M param timm variant of the CSATv2 model at 512x512 & 640x640
- https://huggingface.co/timm/csatv2_21m.sw_r640_in1k (83.13% top-1)
- https://huggingface.co/timm/csatv2_21m.sw_r512_in1k (82.58% top-1)
Factor non-persistent param init out of __init__ into a common method that can be externally called via init_non_persistent_buffers() after meta-device init.

Dec 12, 2025

Add CSATV2 model (thanks https://github.com/gusdlf93) -- a lightweight but high res model with DCT stem & spatial attention. https://huggingface.co/Hyunil/CSATv2
Add AdaMuon and NAdaMuon optimizer support to existing timm Muon impl. Appears more competitive vs AdamW with familiar hparams for image tasks.
End of year PR cleanup, merge aspects of several long-open PRs
- Merge differential attention (DiffAttention), add corresponding DiffParallelScalingBlock (for ViT), train some wee vits
  - https://huggingface.co/timm/vit_dwee_patch16_reg1_gap_256.sbb_in1k
  - https://huggingface.co/timm/vit_dpwee_patch16_reg1_gap_256.sbb_in1k
- Add a few pooling modules, LsePlus and SimPool
- Cleanup, optimize DropBlock2d (also add support to ByobNet based models)
Bump unit tests to PyTorch 2.9.1 + Python 3.13 on upper end, lower still PyTorch 1.13 + Python 3.10

Dec 1, 2025

Add lightweight task abstraction, add logits and feature distillation support to train script via new tasks.
Remove old APEX AMP support
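As a rough idea of what logit distillation computes, here is one common formulation (temperature-softened soft targets with a KL term). It is a sketch only: the loss used by the timm task wrappers may differ, and `kd_loss` and its `temperature` default are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
    )
    return kl * temperature ** 2


teacher = [2.0, 0.5, -1.0]
loss_same = kd_loss(teacher, teacher)          # identical logits: no penalty
loss_diff = kd_loss([0.0, 0.0, 0.0], teacher)  # uniform student: positive penalty
```

In training this term is usually mixed with the ordinary label loss; the weighting between the two is a hyperparameter and is not shown here.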

What's Changed

Add val-interval argument by @t0278611 in #2606
Add coord attn and some variants that I had lying around by @rwightman in #2617
Distill fixups by @rwightman in #2598
A simplification and some fixes for DropBlock2d. by @rwightman in #2620
Other pooling... by @rwightman in #2621
Experimenting with differential attention by @rwightman in #2314
Differential + parallel attn by @rwightman in #2625
AdaMuon impl w/ a few other ideas based on recent reading by @rwightman in #2626
Csatv2 contribution by @rwightman in #2627
Add HParams sections to hfdocs by @rwightman in #2630
Upgrade GitHub Actions for Node 24 compatibility by @salmanmkc in #2633
[BUG] Modify autocasting in fast normalization functions to handle optional weight params safely by @tesfaldet in #2631
'init_non_persistent_buffers' scheme by @rwightman in #2632
Add docstrings to layer helper functions and modules by @raimbekovm in #2634
refactor(scheduler): add type hints to CosineLRScheduler by @haru-256 in #2640
A few misc weights to close out 2025 by @rwightman in #2639
Update typing in other scheduler classes. Add unit tests. by @rwightman in #2641

New Contributors

@t0278611 made their first contribution in #2606
@salmanmkc made their first contribution in #2633
@tesfaldet made their first contribution in #2631
@raimbekovm made their first contribution in #2634
@haru-256 made their first contribution in #2640

Full Changelog: v1.0.22...v1.0.24

Contributors

tesfaldet, rwightman, and 4 other contributors

05 Jan 21:42

rwightman

v1.0.23

b643217

Release v1.0.23

Dec 30, 2025

Add better NAdaMuon trained dpwee, dwee, dlittle (differential) ViTs with a small boost over previous runs
- https://huggingface.co/timm/vit_dlittle_patch16_reg1_gap_256.sbb_nadamuon_in1k (83.24% top-1)
- https://huggingface.co/timm/vit_dwee_patch16_reg1_gap_256.sbb_nadamuon_in1k (81.80% top-1)
- https://huggingface.co/timm/vit_dpwee_patch16_reg1_gap_256.sbb_nadamuon_in1k (81.67% top-1)
Add a ~21M param timm variant of the CSATv2 model at 512x512 & 640x640
- https://huggingface.co/timm/csatv2_21m.sw_r640_in1k (83.13% top-1)
- https://huggingface.co/timm/csatv2_21m.sw_r512_in1k (82.58% top-1)
Factor non-persistent param init out of __init__ into a common method that can be externally called via init_non_persistent_buffers() after meta-device init.

Dec 12, 2025

Add CSATV2 model (thanks https://github.com/gusdlf93) -- a lightweight but high res model with DCT stem & spatial attention. https://huggingface.co/Hyunil/CSATv2
Add AdaMuon and NAdaMuon optimizer support to existing timm Muon impl. Appears more competitive vs AdamW with familiar hparams for image tasks.
End of year PR cleanup, merge aspects of several long-open PRs
- Merge differential attention (DiffAttention), add corresponding DiffParallelScalingBlock (for ViT), train some wee vits
  - https://huggingface.co/timm/vit_dwee_patch16_reg1_gap_256.sbb_in1k
  - https://huggingface.co/timm/vit_dpwee_patch16_reg1_gap_256.sbb_in1k
- Add a few pooling modules, LsePlus and SimPool
- Cleanup, optimize DropBlock2d (also add support to ByobNet based models)
Bump unit tests to PyTorch 2.9.1 + Python 3.13 on upper end, lower still PyTorch 1.13 + Python 3.10

Dec 1, 2025

Add lightweight task abstraction, add logits and feature distillation support to train script via new tasks.
Remove old APEX AMP support

What's Changed

Add val-interval argument by @t0278611 in #2606
Add coord attn and some variants that I had lying around by @rwightman in #2617
Distill fixups by @rwightman in #2598
A simplification and some fixes for DropBlock2d. by @rwightman in #2620
Other pooling... by @rwightman in #2621
Experimenting with differential attention by @rwightman in #2314
Differential + parallel attn by @rwightman in #2625
AdaMuon impl w/ a few other ideas based on recent reading by @rwightman in #2626
Csatv2 contribution by @rwightman in #2627
Add HParams sections to hfdocs by @rwightman in #2630
Upgrade GitHub Actions for Node 24 compatibility by @salmanmkc in #2633
[BUG] Modify autocasting in fast normalization functions to handle optional weight params safely by @tesfaldet in #2631
'init_non_persistent_buffers' scheme by @rwightman in #2632
Add docstrings to layer helper functions and modules by @raimbekovm in #2634
refactor(scheduler): add type hints to CosineLRScheduler by @haru-256 in #2640
A few misc weights to close out 2025 by @rwightman in #2639
Update typing in other scheduler classes. Add unit tests. by @rwightman in #2641

New Contributors

@t0278611 made their first contribution in #2606
@salmanmkc made their first contribution in #2633
@tesfaldet made their first contribution in #2631
@raimbekovm made their first contribution in #2634
@haru-256 made their first contribution in #2640

Full Changelog: v1.0.22...v1.0.23

Contributors

tesfaldet, rwightman, and 4 other contributors

05 Nov 04:08

rwightman

v1.0.22

78c3724

Release v1.0.22

Patch release for priority LayerScale initialization regression in 1.0.21

What's Changed

Add some weights for efficientnet_x / efficientnet_h models by @rwightman in #2602
Update result csvs by @rwightman in #2603
Fix LayerScale ignoring init_values by @Ilya-Fradlin in #2605

New Contributors

@Ilya-Fradlin made their first contribution in #2605

Full Changelog: v1.0.21...v1.0.22

Contributors

rwightman and Ilya-Fradlin

24 Oct 22:39

rwightman

v1.0.21

625386b

Release v1.0.21

Oct 16-20, 2025

Add an impl of the Muon optimizer (based on https://github.com/KellerJordan/Muon) with customizations
- extra flexibility and improved handling for conv weights and fallbacks for weight shapes not suited for orthogonalization
- small speedup for NS iterations by reducing allocs and using fused (b)add(b)mm ops
- by default uses AdamW (or NAdamW if nesterov=True) updates if muon not suitable for parameter shape (or excluded via param group flag)
- like torch impl, select from several LR scale adjustment fns via adjust_lr_fn
- select from several NS coefficient presets or specify your own via ns_coefficients
First 2 steps of 'meta' device model initialization supported
- Fix several ops that were breaking creation under 'meta' device context
- Add device & dtype factory kwarg support to all models and modules (anything inheriting from nn.Module) in timm
License fields added to pretrained cfgs in code
Release 1.0.21
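The NS (Newton-Schulz) iteration mentioned in the Muon notes above approximately orthogonalizes the update matrix. Below is a dependency-free sketch of the quintic iteration, using the coefficients from the reference repo (https://github.com/KellerJordan/Muon). It is not timm's implementation: the timm presets selectable via ns_coefficients may differ, and the helper names are hypothetical.

```python
# Quintic Newton-Schulz coefficients from the reference Muon implementation.
NS_COEFFS = (3.4445, -4.7750, 2.0315)


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def newton_schulz(g, steps=5, coeffs=NS_COEFFS, eps=1e-7):
    """Approximately orthogonalize a small matrix given as a list of rows."""
    a, b, c = coeffs
    # Normalize by the Frobenius norm so the spectral norm is at most 1.
    norm = sum(v * v for row in g for v in row) ** 0.5 + eps
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        s = matmul(x, transpose(x))  # S = X X^T
        s2 = matmul(s, s)            # S^2
        # P = b*S + c*S^2, then X <- a*X + P X
        p = [[b * u + c * w for u, w in zip(r1, r2)] for r1, r2 in zip(s, s2)]
        px = matmul(p, x)
        x = [[a * u + w for u, w in zip(r1, r2)] for r1, r2 in zip(x, px)]
    return x


# For a diagonal input the singular values are the diagonal entries, so the
# effect is easy to inspect: both scales end up in a band around 1.
update = newton_schulz([[3.0, 0.0], [0.0, 1.0]])
```

The iteration does not converge to exactly 1; with these coefficients the singular values settle into a band around 1, which is generally treated as sufficient for the optimizer update.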

What's Changed

Add calculate_drop_path_rates helper by @rwightman in #2589
Review huggingface_hub integration by @Wauplin in #2592
Adding device/dtype factory_kwargs to modules and models by @rwightman in #2591
Consistent license handling throughout timm by @alexanderdann in #2585
Add impl of Muon optimizer. Fix #2580 by @rwightman in #2596
Rename 'simple' flag for Muon to 'fallback' by @rwightman in #2599
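For readers unfamiliar with per-block drop path rates: a common stochastic depth rule increases the rate linearly from 0 at the first block to the configured maximum at the last. The sketch below shows only that rule. `linear_drop_path_rates` is a hypothetical stand-in, not the signature or behavior of the calculate_drop_path_rates helper added in #2589.

```python
def linear_drop_path_rates(drop_path_rate, depth):
    """Linearly spaced rates from 0.0 to drop_path_rate over `depth` blocks."""
    if depth < 1:
        raise ValueError("depth must be >= 1")
    if depth == 1:
        return [0.0]
    step = drop_path_rate / (depth - 1)
    return [i * step for i in range(depth)]


rates = linear_drop_path_rates(0.1, depth=5)
```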

New Contributors

@alexanderdann made their first contribution in #2585

Full Changelog: v1.0.20...v1.0.21

Contributors

rwightman, Wauplin, and alexanderdann

21 Sep 17:28

rwightman

v1.0.20

234907e

Release v1.0.20

Sept 21, 2025

Remap DINOv3 ViT weight tags from lvd_1689m -> lvd1689m to match (same for sat_493m -> sat493m)
Release 1.0.20

Sept 17, 2025

DINOv3 (https://arxiv.org/abs/2508.10104) ConvNeXt and ViT models added. ConvNeXt models were mapped to existing timm model. ViT support done via the EVA base model w/ a new RotaryEmbeddingDinoV3 to match the DINOv3 specific RoPE impl
- HuggingFace Hub: https://huggingface.co/collections/timm/timm-dinov3-68cb08bb0bee365973d52a4d
MobileCLIP-2 (https://arxiv.org/abs/2508.20691) vision encoders. New MCI3/MCI4 FastViT variants added and weights mapped to existing FastViT and B, L/14 ViTs.
MetaCLIP-2 Worldwide (https://arxiv.org/abs/2507.22062) ViT encoder weights added.
SigLIP-2 (https://arxiv.org/abs/2502.14786) NaFlex ViT encoder weights added via timm NaFlexViT model.
Misc fixes and contributions

What's Changed

Pass init_values at hieradet_sam2 by @hassonofer in #2559
Add mobileclip2 encoder weights by @rwightman in #2560
Add support for Gemma 3n MobileNetV5 encoder weight loading by @rwightman in #2561
Fix #2562, add siglip2 naflex vit encoder weights by @rwightman in #2564
fix: create results_dir if missing before saving results by @zhima771 in #2576
feat(validate): add precision, recall, and F1 metrics by @ha405 in #2568
Allow user to ask for features other than image and label in ImageDataset by @grodino in #2571
Add MobileCLIP2 image encoders by @rwightman in #2578
Add DINOv3 support by @rwightman in #2579
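As a reference for the precision, recall, and F1 metrics listed above, here is a minimal macro-averaged computation from integer labels. The averaging mode is an assumption (the validate script may aggregate differently), and `macro_prf` is a hypothetical helper.

```python
def macro_prf(y_true, y_pred, num_classes):
    """Macro-averaged precision, recall and F1 from integer class labels."""
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = float(num_classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


labels = [0, 0, 1, 1]
preds = [0, 1, 1, 1]
precision, recall, f1 = macro_prf(labels, preds, num_classes=2)
```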

New Contributors

@hassonofer made their first contribution in #2559
@zhima771 made their first contribution in #2576
@ha405 made their first contribution in #2568

Full Changelog: v1.0.19...v1.0.20

Contributors

rwightman, grodino, and 3 other contributors

24 Jul 03:06

rwightman

v1.0.19

d08d5a0

Release v1.0.19

Patch release for Python 3.9 compat break in 1.0.18

July 23, 2025

Add set_input_size() method to EVA models, used by OpenCLIP 3.0.0 to allow resizing for timm based encoder models.
Release 1.0.18, needed for PE-Core S & T models in OpenCLIP 3.0.0
Fix small typing issue that broke Python 3.9 compat. 1.0.19 patch release.

July 21, 2025

ROPE support added to NaFlexViT. All models covered by the EVA base (eva.py), including EVA, EVA02, Meta PE ViT, timm SBB ViT w/ ROPE, and Naver ROPE-ViT, can now be loaded in NaFlexViT when use_naflex=True is passed at model creation time.
More Meta PE ViT encoders added, including small/tiny variants, lang variants w/ tiling, and more spatial variants.
PatchDropout fixed with NaFlexViT and also w/ EVA models (regression after adding Naver ROPE-ViT)
Fix XY order with grid_indexing='xy', impacted non-square image use in 'xy' mode (only ROPE-ViT and PE impacted).
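To see why an XY ordering mistake only shows up with non-square inputs, it helps to compare the two meshgrid conventions on a small grid. The `meshgrid` function below is a pure-Python stand-in that mirrors the usual 'ij' / 'xy' semantics. It is a hypothetical helper and does not reproduce the actual timm code path that was fixed.

```python
def meshgrid(a, b, indexing="ij"):
    """Minimal 2D meshgrid over two index sequences (nested lists)."""
    if indexing == "ij":
        first = [[x for _ in b] for x in a]   # shape (len(a), len(b))
        second = [[y for y in b] for _ in a]
    elif indexing == "xy":
        first = [[x for x in a] for _ in b]   # shape (len(b), len(a))
        second = [[y for _ in a] for y in b]
    else:
        raise ValueError(f"unknown indexing: {indexing}")
    return first, second


def shape(grid):
    return (len(grid), len(grid[0]))


height, width = 2, 3  # non-square patch grid

# 'ij': pass (rows, cols) and get (height, width)-shaped grids of (y, x).
yy, xx = meshgrid(range(height), range(width), indexing="ij")

# 'xy': pass (cols, rows) and get the same layout back as (x, y).
xx2, yy2 = meshgrid(range(width), range(height), indexing="xy")

# Keeping the 'ij' argument order while asking for 'xy' transposes the layout,
# which goes unnoticed whenever height == width.
bad_first, _ = meshgrid(range(height), range(width), indexing="xy")
```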

What's Changed

Add ROPE support to NaFlexVit (axial and mixed), and support most (all?) EVA based vit models & weights by @rwightman in #2552
Support set_input_size() in EVA models by @rwightman in #2554

Full Changelog: v1.0.17...v1.0.18

Contributors

rwightman

23 Jul 20:03

rwightman

v1.0.18

e6ab6bc

Release v1.0.18

July 23, 2025

Add set_input_size() method to EVA models, used by OpenCLIP 3.0.0 to allow resizing for timm based encoder models.
Release 1.0.18, needed for PE-Core S & T models in OpenCLIP 3.0.0

July 21, 2025

ROPE support added to NaFlexViT. All models covered by the EVA base (eva.py), including EVA, EVA02, Meta PE ViT, timm SBB ViT w/ ROPE, and Naver ROPE-ViT, can now be loaded in NaFlexViT when use_naflex=True is passed at model creation time.
More Meta PE ViT encoders added, including small/tiny variants, lang variants w/ tiling, and more spatial variants.
PatchDropout fixed with NaFlexViT and also w/ EVA models (regression after adding Naver ROPE-ViT)
Fix XY order with grid_indexing='xy', impacted non-square image use in 'xy' mode (only ROPE-ViT and PE impacted).

What's Changed

Add ROPE support to NaFlexVit (axial and mixed), and support most (all?) EVA based vit models & weights by @rwightman in #2552
Support set_input_size() in EVA models by @rwightman in #2554

Full Changelog: v1.0.17...v1.0.18

Contributors

rwightman
