Releases: ludwig-ai/ludwig

v0.17.6

@w4nderlust w4nderlust released this 26 Jun 23:15

What's new

Preprocessing progress callback: implement on_preprocess_progress in a Callback subclass to receive live progress updates (0.0 to 1.0) during feature preprocessing. It works with the pandas, Dask, and Ray backends out of the box.

```
from ludwig.callbacks import Callback

class MyCallback(Callback):
    def on_preprocess_progress(self, progress: float, **kwargs):
        print(f"Preprocessing: {progress:.0%}")
```

Fixes

  • MLflow 3.x filesystem tracking store compatibility in CI
  • GPU Docker images now correctly ship CUDA-enabled PyTorch wheels (cu126)

v0.17.5

@w4nderlust w4nderlust released this 29 May 22:07

What's changed

Bug fix: GPU Docker images now install the CUDA build of PyTorch

The GPU Docker images (ludwig-gpu, ludwig-ray-gpu) were incorrectly shipping the CPU build of PyTorch (torch==2.12.0 without a +cu* suffix) despite being GPU images. This meant GPU training silently fell back to CPU.

Root cause (two issues):

  1. The --force-reinstall step used --extra-index-url instead of --index-url. With --extra-index-url, pip checks PyPI first and finds the CPU wheel (torch==2.12.0) there, so it never looks at the PyTorch CUDA index.

  2. The CUDA index suffix was cu124, but torch==2.12.0 is not published on the cu124 index (which only goes up to 2.6.0+cu124). The fix switches to cu126, where torch==2.12.0+cu126 is available.

Fix:

  • Changed --extra-index-url to --index-url on the force-reinstall step in both GPU Dockerfiles so pip goes exclusively to the PyTorch wheel server.
  • Changed cu124 to cu126 throughout both GPU Dockerfiles (including the Ray base image tag).

Verified locally: torch==2.12.0+cu126 with CUDA build version: 12.6 confirmed inside the rebuilt image.
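
The verification above relies on the local version suffix (+cu126) in the torch version string. As a minimal sketch, the check can be expressed in plain Python. The helper name cuda_tag is hypothetical and not part of Ludwig or PyTorch, and the sketch follows the convention used in these notes that a version without a +cu* suffix is the CPU build. Inside a rebuilt image, the strings to inspect would come from torch.__version__ and torch.version.cuda.

```python
import re
from typing import Optional


def cuda_tag(torch_version: str) -> Optional[str]:
    """Return the CUDA tag (e.g. 'cu126') from a torch version string, or None.

    Hypothetical helper: it only inspects the local version suffix and does
    not import torch or query the GPU.
    """
    match = re.search(r"\+(cu\d+)$", torch_version)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Version strings taken from the release notes above.
    for version in ("2.12.0", "2.12.0+cu126", "2.6.0+cu124"):
        print(version, "->", cuda_tag(version) or "no CUDA suffix")
```

A check like this only inspects the version string; it does not confirm that a GPU is actually visible at runtime.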

Updated Docker images (0.17.5)

  • ludwigai/ludwig:0.17.5 — CPU
  • ludwigai/ludwig-gpu:0.17.5 — CUDA 12.6 (torch 2.12.0+cu126)
  • ludwigai/ludwig-ray:0.17.5 — CPU + Ray
  • ludwigai/ludwig-ray-gpu:0.17.5 — CUDA 12.6 + Ray (torch 2.12.0+cu126)

v0.17.3

@w4nderlust w4nderlust released this 24 May 17:19

What's changed

Bug fixes

  • Fixed LLM fine-tuning crash with torchao>=0.17 (#4170)
    torchao 0.17 requires torch>=2.11 (uses torch.utils._pytree.register_constant), but the previous Docker images shipped torch==2.6.0, causing an AttributeError on import. Ludwig itself now enforces torch>=2.11 so the combination can never resolve to an incompatible pair again.

  • Fixed torchaudio / torchcodec audio loading: torchaudio>=2.11 delegates all audio I/O to torchcodec, which requires FFmpeg. The CI images and Docker images now install FFmpeg explicitly.

  • Fixed CI dependency resolver poisoning — passing --extra-index-url https://download.pytorch.org/whl/cpu to the Ludwig [test] install step caused uv to resolve all packages (including datasets, ray, packaging) through the PyTorch wheel server, returning ancient versions. The install order is now: pin all torch-family packages from the CPU extra-index first, then install .[test] against plain PyPI.

Dependency changes

Updated lower bounds in pyproject.toml:

| Package | Old lower bound | New lower bound |
|---|---|---|
| torch | >=1.13 | >=2.11 |
| torchaudio | >=0.13 | >=2.11 |
| torchvision | >=0.14 | >=0.26 |
| torchcodec | (new) | >=0.1 |
| transformers | >=4.36 | >=5.0 |
| torchao (llm extra) | >=0.8.0 | >=0.17.0 |
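
To see whether an existing environment already satisfies the new lower bounds, a rough check along the following lines may help. This is only a sketch: release_tuple, meets_lower_bound, and check_installed are hypothetical helpers, the comparison looks at the numeric release part only (so a pre-release such as 2.11.0rc1 is treated like 2.11.0), and pip's own resolver remains the authority.

```python
import re
from importlib import metadata
from typing import Dict, Optional, Tuple

# New lower bounds as listed in the release notes above.
LOWER_BOUNDS = {
    "torch": "2.11",
    "torchaudio": "2.11",
    "torchvision": "0.26",
    "torchcodec": "0.1",
    "transformers": "5.0",
    "torchao": "0.17.0",  # only relevant when the llm extra is installed
}


def release_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric release part: '2.12.0+cu126' -> (2, 12, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(p) for p in match.group(0).split(".")) if match else ()


def meets_lower_bound(installed: str, lower_bound: str) -> bool:
    """Compare release tuples, padding the shorter one with zeros."""
    have, need = release_tuple(installed), release_tuple(lower_bound)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_installed(bounds: Dict[str, str]) -> Dict[str, Optional[bool]]:
    """Map each package to True/False, or None if it is not installed."""
    report: Dict[str, Optional[bool]] = {}
    for name, bound in bounds.items():
        try:
            report[name] = meets_lower_bound(metadata.version(name), bound)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for package, ok in check_installed(LOWER_BOUNDS).items():
        print(f"{package}: {'not installed' if ok is None else ok}")
```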

Docker images

All four images (ludwig, ludwig-ray, ludwig-gpu, ludwig-ray-gpu) are rebuilt with:

  • torch==2.12.0
  • torchvision==0.27.0
  • torchaudio==2.11.0
  • torchcodec (CPU/CUDA variant as appropriate)
  • FFmpeg installed in the image

AutoML improvement

  • Changed the default tabular combiner in AutoML from tabnet to ft_transformer, which generally performs better on tabular datasets.

Ludwig 0.17.2

@w4nderlust w4nderlust released this 23 May 06:32

What's changed

Bug fixes

  • Fix LLM fine-tuning crash (AttributeError: module 'torch.utils._pytree' has no attribute 'register_constant') caused by torchao>=0.11 being installed against torch<2.11 — tighten all torch-family lower bounds to prevent incompatible combinations (#4170)
  • Update all Docker images from torch==2.6.0 to torch==2.12.0 / torchvision==0.27.0 / torchaudio==2.11.0, eliminating the mismatch between the Ray base image's bundled torch and the torchao version pulled in by ludwig[llm]

New features

  • AutoML default tabular combiner changed from tabnet to ft_transformer — better out-of-the-box performance on most tabular datasets
  • Added ft_transformer and tabtransformer to AutoML combiner defaults with tuned hyperopt parameter spaces

Dependency updates

| Package | Before | After |
|---|---|---|
| torch | >=2.0 | >=2.11 |
| torchaudio | >=2.0 | >=2.11 |
| torchvision | >=0.15 | >=0.26 |
| transformers | >=4.36 | >=5.0 |
| torchao (llm extra) | >=0.9.0 | >=0.17.0 |

Docker images

  • ludwigai/ludwig:0.17.2 — CPU, torch 2.12.0
  • ludwigai/ludwig-gpu:0.17.2 — CUDA 12.4, torch 2.12.0+cu124
  • ludwigai/ludwig-ray:0.17.2 — CPU + Ray 2.54, torch 2.12.0
  • ludwigai/ludwig-ray-gpu:0.17.2 — CUDA 12.4 + Ray 2.54, torch 2.12.0+cu124

Ludwig 0.17.1

@w4nderlust w4nderlust released this 18 May 21:37

What's changed

Bug fixes

  • Fix preprocessing pipeline crash for output features without preprocessing config (e.g. anomaly output type) — KeyError in build_preprocessing_parameters, build_dataset, build_metadata, and build_data
  • Fix v0.17.0 PyPI deployment: the v0.17.0 tag was created before the version bump commit landed, causing the PyPI build to produce ludwig-0.16.2 and fail with a duplicate-upload error

New features

  • StudioCallback: writes metrics.jsonl and trials.jsonl for Ludwig Studio integration
  • Per-call callbacks on train() and predict()
  • Hyperopt executor hardened for ray-free environments (OptunaExecutor no longer requires Ray)
  • SIGUSR1/SIGUSR2 pause/resume support in training
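
For readers unfamiliar with signal-driven pause/resume, the general pattern can be sketched with the standard library alone. This is not Ludwig's implementation: the notes do not say which signal pauses and which resumes, so the mapping below (SIGUSR1 pauses, SIGUSR2 resumes) is an assumption, and all function names are hypothetical. The sketch is Unix-only and the handlers must be installed from the main thread.

```python
import signal
import threading

# Set while training may proceed; cleared while paused.
_running = threading.Event()
_running.set()


def _pause(signum, frame):
    _running.clear()


def _resume(signum, frame):
    _running.set()


def install_pause_resume_handlers() -> None:
    """Assumed mapping: SIGUSR1 pauses, SIGUSR2 resumes."""
    signal.signal(signal.SIGUSR1, _pause)
    signal.signal(signal.SIGUSR2, _resume)


def is_paused() -> bool:
    return not _running.is_set()


def train(num_steps: int) -> int:
    """Toy loop that checks the pause flag before every step."""
    completed = 0
    for _ in range(num_steps):
        # Wake up periodically so a pending resume signal gets handled.
        while not _running.wait(timeout=0.5):
            pass
        completed += 1  # stand-in for one real training step
    return completed
```

With handlers like these installed, a process could be paused and resumed from another shell with kill -USR1 <pid> and kill -USR2 <pid>.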

Upgrade notes

  • Replace pip install ludwig==0.17.0 with pip install ludwig==0.17.1 (0.17.0 was never successfully published to PyPI)

Ludwig 0.17.0

@w4nderlust w4nderlust released this 16 May 20:30
b6b6bc9

New Features

Lazy Preprocessing for Audio and Image

Ludwig 0.17 introduces lazy preprocessing — the most significant change to the training pipeline in several releases. Previously, audio and image features required a full preprocessing pass before training could begin: decode every file, resize/resample, write to disk. For large multimodal datasets, this meant waiting hours before a single training step ran.

Now you can start training immediately.

  • preprocessing_mode: lazy — audio and image features are decoded on the fly during training, directly from raw file paths. No upfront pass. Training starts in seconds.
  • preprocessing_mode: lazy_cached — decoded tensors are cached as memory-mapped files on the first pass through the data. Subsequent epochs hit the cache directly, with zero decode overhead after the first.
  • preprocessing_mode: eager — the previous default, preserved for full backwards compatibility.
  • prefetch_size — configurable prefetch queue depth for the background decoder thread, letting you tune the CPU/GPU overlap for your hardware.
  • A background prefetch thread decodes ahead of the training loop, keeping the GPU fed without blocking the forward pass.
  • Ray distributed training decodes lazy features inside the Ray data pipeline (not inside training actors), so decode work is properly parallelized across the cluster.

This matters most for audio and image datasets too large to preprocess in full — but it also makes iteration faster for any multimodal workload. (#4171, #4173)
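
The background prefetch idea described above can be illustrated with a bounded queue and a worker thread. The sketch below is a generic illustration, not Ludwig's code: the prefetch function and its decode argument are hypothetical, and prefetch_size here simply sets the queue depth, mirroring the role the option of the same name plays in the notes.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def prefetch(
    paths: Iterable[T], decode: Callable[[T], U], prefetch_size: int = 4
) -> Iterator[U]:
    """Yield decode(path) for each path, decoding ahead in a background thread."""
    # The bounded queue caps how far the decoder can run ahead of the consumer.
    buffer: queue.Queue = queue.Queue(maxsize=prefetch_size)

    def worker() -> None:
        try:
            for path in paths:
                buffer.put(("item", decode(path)))  # blocks when the queue is full
            buffer.put(("done", None))
        except Exception as exc:
            # Hand decode errors to the consumer instead of losing them.
            buffer.put(("error", exc))

    # Daemon thread: it will not keep the process alive if the consumer stops early.
    threading.Thread(target=worker, daemon=True).start()

    while True:
        kind, payload = buffer.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload


if __name__ == "__main__":
    # str.upper stands in for a real audio or image decoder.
    for decoded in prefetch(["a.wav", "b.wav"], str.upper, prefetch_size=1):
        print(decoded)
```

A larger queue depth lets decoding run further ahead at the cost of holding more decoded items in memory, which is the trade-off a prefetch depth setting exposes.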


Mega-AutoML: Rebuilt from the Ground Up

The AutoML infrastructure in Ludwig has been completely overhauled. (#4168, #4169)

YAML search space — search spaces are now declared in YAML and loaded via SearchSpace._from_specs(). This makes it straightforward to define custom search spaces, share them across experiments, and version-control them alongside your configs.

Dataset quality analysis — before building a search space, Ludwig now profiles your dataset: size, class balance, modality distribution, output cardinality. The search space is then constructed with awareness of what the data actually looks like.

Dataset-size-aware hyperparameter caps — epoch counts and batch sizes are now automatically capped based on dataset size, preventing both under-training on large datasets and OOM on tiny ones. Transformer-based combiners (cross_attention, perceiver) get additional batch size caps to prevent GPU memory exhaustion.

Learning rate and stability fixes — transformer-based combiners now have a capped learning rate upper bound to prevent NaN loss during hyperopt on sensitive architectures.

configs_from_dataframe improvements: default_epochs is now correctly threaded through to TrainerSpec, and the default search space builder is properly imported and called.


HuggingFace Dataset Library — 500+ Datasets

Ludwig now ships with a built-in library of over 500 HuggingFace datasets, usable directly from ludwig.datasets. This release adds:

  • Datasets spanning every modality — text classification, NER, question answering, summarization, audio classification, image classification, multimodal tasks, and more.
  • Custom loaders for complex datasets — for HF datasets that require non-trivial loading logic (custom splits, column renaming, label normalization, multi-config handling).
  • ESC-50 (environmental audio classification), WikiANN (multilingual NER), GoEmotions (fine-grained emotion classification), New Yorker Caption Contest (multimodal humor), and many more.

These aren't just dataset references — each ships with a Ludwig config that maps columns to features, sets appropriate output types, and is smoke-tested end-to-end.


Refactoring

Visualize Package

The visualize.py module had grown to 4,144 lines. It's now split into a domain-scoped visualize/ package, with submodules organized by visualization type (learning curves, confusion matrices, calibration, hyperopt, etc.). The CLI entrypoint and all public APIs are unchanged. (#4154)

Text Encoders

Removed a stale duplicate text/encoders.py that had diverged from the canonical encoder implementations. Deep nesting in several encoder classes has been flattened for readability. (#4159)


Bug Fixes

  • LLM extra now requires torch>=2.7 — the LLM extra (pip install ludwig[llm]) now pins torch>=2.7, which is required for quantization and Flash Attention 2 support in the current transformers stack. Non-OSError exceptions during pretrained model loading no longer trigger the retry loop.
  • TabPFN v2 guard — Ludwig now raises a clear error at config validation time when a tabpfn_v2 combiner is configured but the tabpfn package is not installed, instead of failing at model construction.
  • Ray lazy decode placement — lazy audio/image features are now decoded inside the Ray data pipeline, before data reaches training actors. This keeps decode work off the critical path and avoids serialization of decoded tensors across Ray object store. Missing lazy_audio_params / lazy_image_params now emit a warning rather than a silent no-op.
  • Smoke test stability — diversity retry logic for sorted classification datasets, media-aware shuffle buffer sizing, and per-modality buffer tuning to eliminate label collapse in small evaluation splits.

CI

Distributed integration tests now run in 6 parallel groups (up from 1), cutting distributed test wall time by ~5×. Integration test groups are renamed to sequential letters for clarity. (#4172)


Installation

```
pip install ludwig==0.17.0
```

GitHub: https://github.com/ludwig-ai/ludwig
Docs: https://ludwig.ai

v0.16.2

@w4nderlust w4nderlust released this 08 May 07:50

Bug Fixes & Test Improvements

Bug Fixes

  • fix: call init_dist_strategy("local") in tune_batch_size_fn and tune_learning_rate_fn — fixes RuntimeError: Distributed strategy not initialized (#4149) when auto-tuning batch size or learning rate via Ray with non-MeanMetric output features (e.g. number output → MSEMetric).
  • fix: remove redundant dtype check in text_feature — removes a strict integer-only dtype guard that broke tests passing float32 tensors (the function already casts to int32 internally).
  • fix: add visualize __main__.py for python -m ludwig.visualize — restores python -m ludwig.visualize after the visualize module was split into a package.
  • fix: update test_serve_v2 to use numpy_to_python — updates test import after _numpy_safe was renamed to numpy_to_python in data_utils.

Refactoring

  • refactor: major api.py cleanup — guard clauses, extraction, type annotations, docstring fixes across api.py, serve.py, serve_v2.py, visualize/, and several utils.

Tests

  • test: regression tests for transformers import safety and Ray tune dist strategy — prevents recurrence of #4142 (broken PreTrainedModel import) and #4149 (missing init_dist_strategy in Ray tune fns).
  • fix: rewrite torch_utils tests to not fail in no-CUDA CI environments — CUDA-specific GPU isolation tests now skip gracefully on CPU-only runners; idempotency test rewritten to verify behavior directly.

v0.16.1

@w4nderlust w4nderlust released this 07 May 05:01

Bug Fixes

  • from ludwig.api import LudwigModel fails on Python 3.12 — When torchao and PyTorch are version-mismatched (torchao calls torch.utils._pytree.register_constant, added in PyTorch 2.5+), transformers's lazy loader raises ModuleNotFoundError for any class defined in modeling_utils.py, including PreTrainedModel. llm_utils.py and text_feature.py both imported these classes at module level; they are now deferred to TYPE_CHECKING. Follows the same fix applied to hf_utils.py in v0.15.1. (#4142)

  • Replace assert with explicit exceptions; fix mutable default arg — Internal assert statements replaced with ValueError/RuntimeError so they aren't silently stripped with python -O. Fixed a mutable default argument that could cause cross-call state leakage. (#4152)

  • Unified ruff toolchain — Replaced black + isort + flake8 with a single ruff invocation for linting and formatting. No behavior change for users.
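
The two Python pitfalls behind the assert and mutable-default fix above are easy to reproduce in isolation. The functions below are illustrative only and are not taken from the Ludwig codebase.

```python
from typing import List, Optional


def append_bad(item: int, bucket: List[int] = []) -> List[int]:
    # Bug: the default list is created once, at definition time,
    # and is shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_good(item: int, bucket: Optional[List[int]] = None) -> List[int]:
    # Fix: use None as the default and create a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


def validate_batch_size(batch_size: int) -> int:
    # An `assert batch_size > 0` would be stripped under `python -O`,
    # so the check is written as an explicit exception instead.
    if batch_size <= 0:
        raise ValueError(f"batch_size must be positive, got {batch_size}")
    return batch_size


if __name__ == "__main__":
    append_bad(1)
    print(append_bad(2))   # state leaked from the previous call
    append_good(1)
    print(append_good(2))  # each call starts from an empty list
```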

v0.16.0

@w4nderlust w4nderlust released this 06 May 16:06

New Features

Timeseries Forecasting

  • PatchTST & N-BEATS encoders — State-of-the-art patch-based and basis-expansion timeseries encoders. Both support multivariate and univariate forecasting with TimeseriesOutputFeature. (#4147)
  • MASE & sMAPE metrics — Mean Absolute Scaled Error and symmetric Mean Absolute Percentage Error added for forecasting evaluation. (#4147)
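
For reference, the two forecasting metrics can be written out in a few lines. The notes do not state which variants Ludwig uses, so the sketch below assumes common textbook definitions: sMAPE as the mean of 2|f - a| / (|a| + |f|), returned as a fraction between 0 and 2, and MASE as the forecast MAE divided by the in-sample MAE of a naive forecast with a configurable seasonal lag. The function names and signatures are hypothetical.

```python
from typing import Sequence


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE as a fraction in [0, 2]; terms with a zero denominator count as 0."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equally long")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        if denom:
            total += 2.0 * abs(f - a) / denom
    return total / len(actual)


def mase(
    actual: Sequence[float],
    forecast: Sequence[float],
    train: Sequence[float],
    season: int = 1,
) -> float:
    """Forecast MAE scaled by the MAE of a naive lag-`season` forecast on `train`."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equally long")
    if len(train) <= season:
        raise ValueError("train must be longer than the seasonal lag")
    naive_errors = [abs(train[t] - train[t - season]) for t in range(season, len(train))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    if naive_mae == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / naive_mae
```

A MASE below 1 means the forecast beats the naive baseline on average; a value above 1 means it does worse.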

Advanced PEFT Adapters

  • New adapter types — TinyLoRA, C3A, OFT (Orthogonal Fine-Tuning), HRA (Householder Reflection Adaptation), WaveFT, LN-Tuning, VBLoRA. (#4146)
  • New LoRA initializers — PiSSA (Principal Singular values and Singular vectors Adaptation), EVA (Explained Variance Adaptation), CorDA/LoftQ. (#4146)

Phase 6: Future Capabilities

  • LLM config generation: ludwig generate_config "describe your task" uses an LLM to write the YAML config for you. (#4092)
  • HyperNetwork combiner — Conditioning-based feature fusion where one feature generates weights for others. (#4092)
  • Nash-MTL & Pareto-MTL — Game-theoretic and preference-based multi-task loss balancing strategies. (#4092)

New Examples

  • VLM fine-tuning — LLaVA, Qwen2-VL, InternVL via is_multimodal: true. (#4140)
  • Mamba-2 / Jamba encoders — State-space model encoders for sequence tasks. (#4140)
  • Ray Serve & KServe deployment — Distributed and Kubernetes-native serving shims. (#4140)
  • Multi-task & HyperNetwork examples (#4112)

Bug Fixes

  • Python 3.12 import fix — Deferred PreTrainedModel import to TYPE_CHECKING to fix ImportError on Python 3.12 when HuggingFace transformers is not installed in all code paths.
  • Dask image bytes UnicodeDecodeError: dask.config.set({"dataframe.convert-string": False}) is now applied at import time, preventing UnicodeDecodeError when image bytes pass through Dask string columns. (#4151)
  • Dask shuffle partd race condition — Replaced file-based Dask shuffle (which hit a partd lock race under concurrent workers) with tasks-based shuffle. (#4150)
  • Encoder input_shape contract — Fixed a contract violation where certain encoders did not correctly report or handle input_shape, causing shape mismatches during model construction. (#4148)
  • Ray backend GPU underutilization: RayDatasetBatcher now runs to_tensors locally in the producer thread instead of spawning remote Ray tasks per block. Datasets are materialized before training to avoid Parquet re-reads on every epoch. (#4144)

v0.15.1

@w4nderlust w4nderlust released this 05 May 05:29

Bug fixes

  • Ray training 3.7x slowdown eliminated: LudwigProgressBar was calling rt.report() on every training batch when running inside Ray workers (~1.9 s/call through the Ray GCS). With hundreds of batches this completely dominated wall-clock time. Per-batch progress reporting is now suppressed; training metrics continue to be reported correctly at eval/checkpoint time. Ray overhead is now ~1.7x vs local (fixed TorchTrainer setup cost), down from 3.7x. (#4144)

  • GPU underutilization in Ray backend fixed: RayDatasetBatcher was running to_tensors via map_batches (spawning a Ray remote task per dataset block with scheduling overhead). It now runs locally in the producer thread. Also: datasets are now materialized before training to avoid re-reading Parquet from disk on every epoch. (#4144)

  • Python 3.14 compatibility: LudwigBaseConfig subclasses crashed with PydanticUserError: Field requires a type annotation on Python 3.14 because annotations are now stored lazily via __annotate_func__. Fixed in _LudwigModelMeta.__new__. (#4144)

  • ModernBERT tokenizer — Models containing "bert" in their name (e.g. answerdotai/ModernBERT-base) were incorrectly routed to BertTokenizer (WordPiece), causing Missing [UNK] token errors. ModernBERT now correctly uses HFTokenizer (AutoTokenizer / BPE). (#4144)

  • Dask meta= parameter — Multiple feature types (binary, category, sequence, timeseries, text, vector) called .map() without a meta= argument, causing ValueError: Metadata inference failed in map when using the Dask engine (backend: {type: ray, processor: {type: dask}}). All bare .map() calls are now fixed. (#4144)
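
The ModernBERT tokenizer bug listed above is an instance of a general hazard: routing on a substring of the model name. The sketch below reproduces the failure mode in miniature. Both functions and the exclusion rule are hypothetical and do not reflect how Ludwig actually selects tokenizers; only the tokenizer names and the example model ID come from the notes.

```python
def pick_tokenizer_naive(model_name: str) -> str:
    """Substring match: any name containing 'bert' is treated as classic BERT."""
    return "BertTokenizer" if "bert" in model_name.lower() else "HFTokenizer"


def pick_tokenizer(model_name: str) -> str:
    """Hypothetical fix: exclude known non-WordPiece families before the substring match."""
    base = model_name.lower().rsplit("/", 1)[-1]  # drop the organization prefix
    if base.startswith("modernbert"):
        return "HFTokenizer"
    return "BertTokenizer" if "bert" in base else "HFTokenizer"


if __name__ == "__main__":
    name = "answerdotai/ModernBERT-base"
    print(pick_tokenizer_naive(name))  # misrouted to the WordPiece tokenizer
    print(pick_tokenizer(name))        # routed to the generic tokenizer
```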