Releases: huggingface/pytorch-image-models

Release v1.0.27

08 May 20:06

April 23, 2026

  • Add Gemma4 ViT encoders w/ NaFlex pipeline support (variable aspect/size per image). Thanks Yonghye Kwon
  • Support DINOv3 weights in NaFlexVit. Thanks Yonghye Kwon
  • Some improvements to Muon fallback (AdamW/NAdamW) lr behavior

Full Changelog: v1.0.26...v1.0.27

Release v1.0.26

23 Mar 18:13

March 23, 2026

  • Improve pickle checkpoint handling security. Default all loading to weights_only=True, add safe_global for ArgParse.
  • Improve attention mask handling for core ViT/EVA models & layers. Resolve bool masks, pass is_causal through for SSL tasks.
  • Fix class & register token uses with ViT and no pos embed enabled.
  • Add Patch Representation Refinement (PRR) as a pooling option in ViT. Thanks Sina (https://github.com/sinahmr).
  • Improve consistency of output projection / MLP dimensions for attention pooling layers.
  • Hiera model F.SDPA optimization to allow Flash Attention kernel use.
  • Caution added to SGDP optimizer.
  • Release 1.0.26. First maintenance release since my departure from Hugging Face.
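
The idea behind weights_only-style loading with an allowlist of safe globals can be sketched with the standard library alone. This is an analogy, not timm's or PyTorch's implementation: SAFE_GLOBALS, AllowlistUnpickler, and safe_loads are hypothetical names, and the allowlist contents are assumed for illustration.

```python
import argparse
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist of (module, name) pairs the loader may resolve.
SAFE_GLOBALS = {
    ("collections", "OrderedDict"),
    ("argparse", "Namespace"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not explicitly allowlisted."""

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowlisted")


def safe_loads(data: bytes):
    return AllowlistUnpickler(io.BytesIO(data)).load()


# A checkpoint-like dict holding plain containers plus an argparse.Namespace.
checkpoint = {
    "state_dict": OrderedDict(w=[1.0, 2.0]),
    "args": argparse.Namespace(lr=0.1),
}
restored = safe_loads(pickle.dumps(checkpoint))

# A hand-written pickle that tries to resolve os.system is rejected
# inside find_class, before anything can be called.
malicious = b"cos\nsystem\n(S'echo unsafe'\ntR."
try:
    safe_loads(malicious)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

In this sketch the Namespace loads only because it is on the allowlist, which is why training args stored alongside weights need an explicit safe-global entry once restricted loading is the default.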

Full Changelog: v1.0.25...v1.0.26

Release v1.0.25

23 Feb 17:22

Feb 23, 2026

  • Add token distillation training support to distillation task wrappers
  • Remove some torch.jit usage in prep for official deprecation
  • Caution added to AdamP optimizer
  • Call reset_parameters() even if meta-device init so that buffers get init w/ hacks like init_empty_weights
  • Tweak Muon optimizer to work with DTensor/FSDP2 (clamp_ instead of clamp_min_, alternate NS branch for DTensor)
  • Release 1.0.25
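
For readers unfamiliar with "caution" in optimizers, the masking step can be sketched as follows. This follows the general cautious-update idea (drop update components whose sign disagrees with the current gradient, then rescale by the fraction kept). The function name and the eps value are assumptions, and timm's AdamP/SGDP integration may differ in detail.

```python
def cautious_update(update, grad, eps=1e-3):
    """Zero out update components that disagree in sign with the gradient.

    `update` and `grad` are flat lists of floats; `eps` (assumed value) keeps
    the rescale factor bounded when almost every component is masked out.
    """
    mask = [1.0 if u * g > 0 else 0.0 for u, g in zip(update, grad)]
    kept_fraction = max(sum(mask) / len(mask), eps)
    # Rescale the surviving components so the overall step size stays comparable.
    return [u * m / kept_fraction for u, m in zip(update, mask)]


# Components 1 and 3 disagree with the gradient sign and are dropped.
step = cautious_update([1.0, -2.0, 3.0, 4.0], [0.5, 1.0, 2.0, -1.0])
```

Half of the components survive here, so the remaining two are scaled by 2.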

Jan 21, 2026

  • Compat Break: Fix oversight w/ QKV vs MLP bias in ParallelScalingBlock (& DiffParallelScalingBlock)
    • Does not impact any trained timm models but could impact downstream use.

What's Changed

  • Token distill task & distill task refactoring by @rwightman in #2647
  • Fix distilled head dropout using wrong token in PiT forward_head by @hassonofer in #2649
  • Fix #2653, no models with weights impacted so just a clean fix by @rwightman in #2654
  • Add the cautious optimizer to AdamP. by @Yuan-Jinghui in #2657
  • Enhance the numerical stability of the Cautious Optimizer by @Yuan-Jinghui in #2658
  • Some misc fixes for torch.jit deprecation and meta device init by @rwightman in #2664
  • fix(optim): replace bare except with Exception in Lion optimizer by @llukito in #2666
  • Change clamp_min_ to clamp_(min=) as former doesn't work with DTensor / FSDP2 by @rwightman in #2668
  • Add DTensor compatible NS impl for Muon by @rwightman in #2669

Full Changelog: v1.0.24...v1.0.25

Release v1.0.24

07 Jan 00:28

Jan 5 & 6, 2026

  • Patch Release 1.0.24 (fix for 1.0.23)
  • Add new benchmark result csv files for inference timing on all models w/ RTX Pro 6000, 5090, and 4090 cards w/ PyTorch 2.9.1
  • Fix moved module error in deprecated timm.models.layers import path that impacts legacy imports
  • Release 1.0.23

Dec 30, 2025

Dec 12, 2025

Dec 1, 2025

  • Add lightweight task abstraction, add logits and feature distillation support to train script via new tasks.
  • Remove old APEX AMP support
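
As a rough illustration of what logits distillation computes, here is a minimal sketch of a temperature-scaled KL loss between teacher and student predictions. It is the classic soft-target formulation, not necessarily what the timm task wrappers implement; the function names and the default temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logit_distill_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        t * (math.log(t) - math.log(s))
        for t, s in zip(p_teacher, p_student)
        if t > 0
    )
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


same = logit_distill_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = logit_distill_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

The loss is zero when the student matches the teacher and grows as the two distributions diverge. Feature distillation would compare intermediate activations instead of logits.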

Full Changelog: v1.0.22...v1.0.24

Release v1.0.23

05 Jan 21:42

Dec 30, 2025

Dec 12, 2025

Dec 1, 2025

  • Add lightweight task abstraction, add logits and feature distillation support to train script via new tasks.
  • Remove old APEX AMP support

Full Changelog: v1.0.22...v1.0.23

Release v1.0.22

05 Nov 04:08

Patch release for priority LayerScale initialization regression in 1.0.21

Full Changelog: v1.0.21...v1.0.22

Release v1.0.21

24 Oct 22:39

Oct 16-20, 2025

  • Add an impl of the Muon optimizer (based on https://github.com/KellerJordan/Muon) with customizations
    • extra flexibility and improved handling for conv weights and fallbacks for weight shapes not suited for orthogonalization
    • small speedup for NS iterations by reducing allocs and using fused (b)add(b)mm ops
    • by default uses AdamW (or NAdamW if nesterov=True) updates if muon not suitable for parameter shape (or excluded via param group flag)
    • like torch impl, select from several LR scale adjustment fns via adjust_lr_fn
    • select from several NS coefficient presets or specify your own via ns_coefficients
  • First 2 steps of 'meta' device model initialization supported
    • Fix several ops that were breaking creation under 'meta' device context
    • Add device & dtype factory kwarg support to all models and modules (anything inheriting from nn.Module) in timm
  • License fields added to pretrained cfgs in code
  • Release 1.0.21
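
The NS (Newton-Schulz) iteration mentioned above can be sketched in plain Python. This is a minimal, unoptimized illustration of the quintic iteration used for orthogonalization in Muon-style optimizers. The coefficient values are assumed to match the reference repo's defaults, the function names are hypothetical, and timm's version (fused ops, presets, DTensor branch) will differ.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def newton_schulz(grad, steps=5, coeffs=(3.4445, -4.7750, 2.0315), eps=1e-7):
    """Approximately orthogonalize a 2-D update with a quintic NS iteration."""
    a, b, c = coeffs
    tall = len(grad) > len(grad[0])
    # Work on the wide orientation so the Gram matrix X X^T is the smaller one.
    x = transpose(grad) if tall else [row[:] for row in grad]
    norm = math.sqrt(sum(v * v for row in x for v in row)) + eps
    x = [[v / norm for v in row] for row in x]
    for _ in range(steps):
        gram = matmul(x, transpose(x))          # A = X X^T
        gram_sq = matmul(gram, gram)            # A^2
        poly = [[b * g + c * g2 for g, g2 in zip(r1, r2)]
                for r1, r2 in zip(gram, gram_sq)]   # B = b*A + c*A^2
        bx = matmul(poly, x)
        x = [[a * v + w for v, w in zip(r1, r2)]
             for r1, r2 in zip(x, bx)]          # X <- a*X + B X
    return transpose(x) if tall else x


# Singular values 3 and 1 are both pushed toward magnitude ~1, signs preserved.
ortho = newton_schulz([[3.0, 0.0], [0.0, -1.0]])
```

The iteration does not converge to exactly orthogonal output. With these coefficients the singular values end up near 1 rather than equal to it, which is generally treated as good enough for the optimizer step.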

Full Changelog: v1.0.20...v1.0.21

Release v1.0.20

21 Sep 17:28

Sept 21, 2025

  • Remap DINOv3 ViT weight tags from lvd_1689m -> lvd1689m to match (same for sat_493m -> sat493m)
  • Release 1.0.20

Sept 17, 2025

Full Changelog: v1.0.19...v1.0.20

Release v1.0.19

24 Jul 03:06

Patch release for Python 3.9 compat break in 1.0.18

July 23, 2025

  • Add set_input_size() method to EVA models, used by OpenCLIP 3.0.0 to allow resizing for timm based encoder models.
  • Release 1.0.18, needed for PE-Core S & T models in OpenCLIP 3.0.0
  • Fix small typing issue that broke Python 3.9 compat. 1.0.19 patch release.

July 21, 2025

  • ROPE support added to NaFlexViT. All models covered by the EVA base (eva.py), including EVA, EVA02, Meta PE ViT, timm SBB ViT w/ ROPE, and Naver ROPE-ViT, can now be loaded in NaFlexViT when use_naflex=True is passed at model creation time
  • More Meta PE ViT encoders added, including small/tiny variants, lang variants w/ tiling, and more spatial variants.
  • PatchDropout fixed with NaFlexViT and also w/ EVA models (regression after adding Naver ROPE-ViT)
  • Fix XY order with grid_indexing='xy', impacted non-square image use in 'xy' mode (only ROPE-ViT and PE impacted).
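
Why an 'xy' ordering mistake only shows up on non-square inputs can be seen with a small pure-Python stand-in for a 2-D meshgrid. The meshgrid helper below is hypothetical and mirrors the usual 'ij' / 'xy' conventions. It is not the timm code, and the actual bug may have had a different shape.

```python
def meshgrid(first, second, indexing="ij"):
    """Return two 2-D grids (lists of rows) built from two 1-D coordinate lists."""
    if indexing == "ij":
        # Output shape: (len(first), len(second))
        g1 = [[a for _ in second] for a in first]
        g2 = [[b for b in second] for _ in first]
    elif indexing == "xy":
        # Output shape: (len(second), len(first))
        g1 = [[a for a in first] for _ in second]
        g2 = [[b for _ in first] for b in second]
    else:
        raise ValueError(f"unknown indexing: {indexing}")
    return g1, g2


height, width = 2, 3
ys, xs = list(range(height)), list(range(width))

# 'ij' takes (rows, cols); 'xy' takes (x, y). Both yield height x width grids.
gy_ij, gx_ij = meshgrid(ys, xs, indexing="ij")
gx_xy, gy_xy = meshgrid(xs, ys, indexing="xy")

# Passing the 'ij' argument order under 'xy' silently transposes the layout.
swapped, _ = meshgrid(ys, xs, indexing="xy")
swapped_shape = (len(swapped), len(swapped[0]))
```

For a square grid the swapped call would still produce the right shape, which is how this kind of ordering issue can go unnoticed until non-square images are used.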

What's Changed

  • Add ROPE support to NaFlexVit (axial and mixed), and support most (all?) EVA based vit models & weights by @rwightman in #2552
  • Support set_input_size() in EVA models by @rwightman in #2554

Full Changelog: v1.0.17...v1.0.18

Release v1.0.18

23 Jul 20:03

July 23, 2025

  • Add set_input_size() method to EVA models, used by OpenCLIP 3.0.0 to allow resizing for timm based encoder models.
  • Release 1.0.18, needed for PE-Core S & T models in OpenCLIP 3.0.0

July 21, 2025

  • ROPE support added to NaFlexViT. All models covered by the EVA base (eva.py), including EVA, EVA02, Meta PE ViT, timm SBB ViT w/ ROPE, and Naver ROPE-ViT, can now be loaded in NaFlexViT when use_naflex=True is passed at model creation time
  • More Meta PE ViT encoders added, including small/tiny variants, lang variants w/ tiling, and more spatial variants.
  • PatchDropout fixed with NaFlexViT and also w/ EVA models (regression after adding Naver ROPE-ViT)
  • Fix XY order with grid_indexing='xy', impacted non-square image use in 'xy' mode (only ROPE-ViT and PE impacted).

What's Changed

  • Add ROPE support to NaFlexVit (axial and mixed), and support most (all?) EVA based vit models & weights by @rwightman in #2552
  • Support set_input_size() in EVA models by @rwightman in #2554

Full Changelog: v1.0.17...v1.0.18