Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -102,13 +102,16 @@ pipe = LTX2Pipeline.from_pretrained("Lightricks/LTX-2", torch_dtype=torch.bfloat

LTX-2 supports a two-stage pipeline (base + latent upsample) for production quality. For a demo, the single-stage pipeline is usually sufficient and faster.

**Don't default two-stage on for IC-LoRAs — check the card.** The common LTX-2.3 two-stage recipe runs stage 2 as an x2 latent-upsample + refine **with the IC-LoRA disabled** (`disable_lora()`), re-rendering on the bare base — which re-degrades exactly the LoRA-conditioned detail (text, identity). Whether two-stage is appropriate is per-LoRA, per the card: e.g. the *In/Out-painting* card explicitly says "both tasks use a two-stage pipeline," whereas the *Ingredients* card recommends a single-pass 30-step recipe and never mentions two stages. Single-stage is the safer demo default unless the card calls for two-stage.

## Inference defaults

- **Resolution**: typically multiples of 32. Common sizes: 768×512, 1216×704, 704×480.
- **Frame count**: typically `8k+1` (e.g. 121, 161, 257). Compute from `duration_seconds * fps` and round.
- **fps**: 24 by default; some LoRAs are trained at different rates (16, 30) — check the model card.
- **`num_inference_steps`**: 30–50 for non-distilled. Distilled checkpoints (look for "distilled" in the name) often run at 8–12.
- **`negative_prompt`**: LTX is sensitive to negative prompts. A good default: `"worst quality, inconsistent motion, blurry, jittery, distorted"`.
- **Distilled IC-LoRA — disable audio guidance too, not just `guidance_scale`.** `LTX2InContextPipeline` computes `do_classifier_free_guidance = guidance_scale > 1 OR audio_guidance_scale > 1`, and **`audio_guidance_scale` defaults to `7.0`**. So `guidance_scale=1.0` is not enough — CFG stays on via audio (and `stg_scale` defaults on), which doubles/mis-batches the forward against the in-context reference tokens (wrong recipe, sometimes a runtime error). For a distilled IC-LoRA pass all four: `pipe(..., guidance_scale=1.0, stg_scale=0.0, audio_guidance_scale=1.0, audio_stg_scale=0.0)`.

## Frame-count helper

Expand All @@ -120,7 +123,16 @@ def num_frames_for_duration(seconds, fps=24, base=8):

## Native pipeline path (LTX-2.3 fallback)

The native repo is a fallback when the diffusers path doesn't work, or for specific conditionings that diffusers may not support yet. Try `LTX2InContextPipeline` on the diffusers path first (see the IC-LoRAs section above). The pattern looks like:
The native repo is a fallback when the diffusers path doesn't work, or for specific conditionings that diffusers may not support yet. Try `LTX2InContextPipeline` on the diffusers path first (see the IC-LoRAs section above).

**The native repo depends on the model generation** — pick by the LoRA's base model (and what its card's snippet imports):

| Base model | Native repo | Package(s) / import | Pipeline classes |
|---|---|---|---|
| `LTX-Video` (0.9.x) | `github.com/Lightricks/LTX-Video` | `ltx_video` | `from ltx_video.pipelines import LTXPipeline` |
| `LTX-2`, `LTX-2.3` | `github.com/Lightricks/LTX-2` | `ltx-core` + `ltx-pipelines` (editable) | `ltx_pipelines.ic_lora.ICLoraPipeline`, `TI2VidOneStagePipeline`, … |

For the 0.9.x series the pattern looks like:

```bash
# in requirements.txt
Expand All @@ -136,6 +148,19 @@ The native path doesn't use `load_lora_weights`. Instead, LoRAs are usually wire

When the LoRA's model card has a Python snippet using the native repo, copy its construction pattern verbatim. The native API changes more often than diffusers' does, so don't paraphrase.

### LTX-2.x native path on ZeroGPU — gotchas

For `LTX-2` / `LTX-2.3` (the `ltx-core` + `ltx-pipelines` repo), clone + `pip install -e packages/ltx-core packages/ltx-pipelines` at runtime in `app.py` and pin a commit. Three things bite on ZeroGPU:

- **Native loader bypasses ZeroGPU virtualization → "No CUDA GPUs available" at startup.** The native safetensors loader does `safe_open(path, device="cuda")` and copies host→device inside safetensors' own C++ (`cudaMemcpy`), **bypassing `torch.Tensor.to`** — the call ZeroGPU patches to virtualize + pack module-scope weights. Nothing packs and module-scope placement raises *"No CUDA GPUs are available."* Fix: monkeypatch the loader to open on CPU then move with `.to`:
```python
with safetensors.safe_open(shard, framework="pt", device="cpu") as f:
value = f.get_tensor(name).to(device=device) # torch path → ZeroGPU-virtualisable
```
Also: patch attention to **SDPA** (FA3 crashes on Blackwell ZeroGPU), and **never call `torch.cuda.*` / `get_device_capability()` at module scope** (it poisons virtualization).
- **Native two-stage pins two 22B transformers → offload-disk overflow.** `ICLoraPipeline` builds two independent `ModelLedger`s (stage-1 with the LoRA, stage-2 without), each loading its **own** full transformer. The CLI loads them sequentially, but ZeroGPU pins all weights at module scope, so enabling stage 2 pins **both** (~143G) and overflows the offload disk (`OSError: [Errno 28] No space left on device`, ~96G cap). For a demo: pin stage 1, then have stage 2 **reuse stage 1's pinned modules** (`for n in [...]: setattr(s2, n, getattr(s1, n))`) so only one transformer is resident (~71G).
- **`ICLoraPipeline` is distilled-only.** Per the LTX-2 `ltx-pipelines` guide, IC-LoRA inference runs in distilled-only mode (fixed 8-step sigmas, no `num_inference_steps`/`guidance_scale`/`negative_prompt`); the docs do not describe non-distilled IC-LoRA or IC-LoRA + CFG/STG (guidance lives only on full-model pipelines like `TI2VidOneStagePipeline`, and in two-stage is stage-1-only). So if a card recommends a *non-distilled* recipe (e.g. 30 steps + guidance + STG on a `…-dev` base), there is no stock IC-LoRA pipeline for it — use the supported distilled `ICLoraPipeline` and note the recipe difference, or fall back to the diffusers `LTX2InContextPipeline` (which exposes steps/guidance).

## IC-LoRAs (in-context conditioning)

LTX IC-LoRAs condition the model on a reference video alongside a first-frame image. The diffusers path uses `LTX2InContextPipeline` with standard `load_lora_weights` + `set_adapters`.
Expand Down Expand Up @@ -173,6 +198,7 @@ Set `@spaces.GPU(duration=...)` to comfortably exceed expected generation time.
## Things to watch

- **Latest pipelines often need git diffusers.** Pin to `git+https://github.com/huggingface/diffusers` in `requirements.txt` for any LTX variant released in the last few weeks.
- **`git+diffusers` is a moving target — verify LTX2 output quality.** Bare `@main` inherits whatever's there. The LTX2 text connector carried a token-reversal regression (PR #13564) that scrambled prompt tokens/registers — degrading prompt adherence and fine detail (e.g. garbled on-screen text), worst on short prompts — until **PR #13931 (merged 2026-06-19)** fixed it. If LTX2 output looks weak or garbled, confirm the installed diffusers commit is **at or past #13931**, and prefer pinning to a known-good commit (`git+https://github.com/huggingface/diffusers@<sha>`) over bare `@main` for reproducibility.
- **Don't `torch.compile`.** LTX is fast enough on ZeroGPU without it; compile is incompatible anyway.
- **`enable_vae_tiling()` for higher resolutions.** LTX VAE memory grows with resolution; enable tiling for outputs above 768px on either axis.
- **Negative prompts matter more than for image models.** Don't ship with an empty negative_prompt unless the LoRA's model card says so.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -170,6 +170,8 @@ When sharing the URL with the user:
- Offer to look at the logs if anything fails.
- Don't add a long postamble — they want to click the link, not read more text.

**Confirm a redeploy is actually live before testing it.** An `app.py`-only push does **not** change the Space's reported `runtime.stage` — the old replica keeps serving "RUNNING" while the new build swaps in, so a `gradio_client` test can silently hit **stale code**. To be sure: push → `api.restart_space(repo)` → poll until the stage leaves and returns to RUNNING → grep the boot logs (`/logs/run`) for a unique `[VERSION] …` marker you printed at module scope → then test. Also set `demo.launch(show_error=True)` so `gradio_client` surfaces the real traceback instead of a generic `AppError`.

## Publish-time failures (before the build starts)

These happen during `create_repo` or `upload_file`, *before* the Space build pipeline runs. Diagnose by reading the exception, not the container logs (the container hasn't started yet).
Expand Down
Loading