Add Single Stage Training support to core modules #5
Open
shun31y wants to merge 5 commits into
…onalGaussianDistribution class
…ng VQ, enhancing decoder and encoder configurations
Pull Request Overview
Adds configuration flags and modules to support single-stage end-to-end training, clustering-based VQ, and enhanced perceptual loss options.
- Introduces the `is_single_training` flag and a new `ReconstructionLoss_Single_Stage` to unify stage-1 and stage-2 training.
- Adds `clustering_vq` support and `DiagonalGaussianDistribution` in VectorQuantizer.
- Extends perceptual loss to combine LPIPS and ConvNeXt and provides 3D perceptual loss implementations.
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| utils/train_utils.py | Imports and selects ReconstructionLoss_Single_Stage based on is_single_training |
| modeling/titok.py | Adds is_single_training, clustering_vq, and only_finetune_decoder flags; updates encoder/decoder logic |
| modeling/quantizer/quantizer.py | Introduces clustering_vq logic in forward and adds DiagonalGaussianDistribution |
| modeling/modules/perceptual_loss.py | Implements combined LPIPS+ConvNeXt and 3D perceptual loss classes |
| modeling/modules/lpips.py | Adds LPIPS implementation with model download and VGG backbone |
| modeling/modules/losses.py | Adds ReconstructionLoss_Single_Stage subclass |
| modeling/modules/blocks.py | Updates TiTok encoder/decoder reshape paths based on is_single_training |
| modeling/modules/`__init__.py` | Exposes ReconstructionLoss_Single_Stage in package exports |
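The flag-based selection described for `utils/train_utils.py` might look roughly like the sketch below. Only `ReconstructionLoss_Single_Stage` and `is_single_training` come from this PR; the base class `BaselineReconstructionLoss` and the helper `select_loss_class` are hypothetical stand-ins, and the real classes take configuration arguments that are omitted here.

```python
class BaselineReconstructionLoss:
    """Hypothetical stand-in for the loss used when single-stage training is off."""


class ReconstructionLoss_Single_Stage(BaselineReconstructionLoss):
    """Stand-in for the subclass added in modeling/modules/losses.py."""


def select_loss_class(is_single_training: bool) -> type:
    # Hypothetical helper: choose the loss class from the flag.
    if is_single_training:
        return ReconstructionLoss_Single_Stage
    return BaselineReconstructionLoss


# Usage: the trainer would instantiate whichever class is returned.
loss_cls = select_loss_class(is_single_training=True)
```

Keeping the choice in one small function means the rest of the training loop does not need to branch on the flag.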
Comments suppressed due to low confidence (5)
modeling/titok.py:80
- This comment is no longer accurate after adding `is_single_training` and `only_finetune_decoder`. Update it to reflect the new flag logic or remove it.
# This should be False for stage1 and True for stage2.
modeling/quantizer/quantizer.py:131
- The new `DiagonalGaussianDistribution` class lacks unit tests. Consider adding tests for the `sample`, `mode`, and `kl` methods to ensure correct behavior.
class DiagonalGaussianDistribution(object):
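As a rough illustration of the kind of tests meant here, the sketch below checks `mode`, `kl`, and `sample` against properties any diagonal Gaussian should satisfy. The class is a pure-Python stand-in written for this example (flat lists instead of tensors, KL taken against a standard normal); it is not the class in `modeling/quantizer/quantizer.py`, whose constructor and tensor shapes may differ.

```python
import math
import random


class DiagonalGaussianDistribution:
    """Stand-in for illustration only: parameters are flat lists of floats."""

    def __init__(self, mean, logvar, deterministic=False):
        self.mean = list(mean)
        # Clamp log-variance to keep exp() numerically safe (assumed range).
        self.logvar = [min(max(lv, -30.0), 20.0) for lv in logvar]
        self.deterministic = deterministic
        self.std = [0.0 if deterministic else math.exp(0.5 * lv) for lv in self.logvar]

    def sample(self, rng=random):
        # Reparameterized draw: mean + std * eps, with eps ~ N(0, 1).
        return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(self.mean, self.std)]

    def mode(self):
        return list(self.mean)

    def kl(self):
        # KL(N(mean, var) || N(0, 1)), summed over dimensions.
        if self.deterministic:
            return 0.0
        return 0.5 * sum(
            m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(self.mean, self.logvar)
        )


def test_mode_returns_mean():
    dist = DiagonalGaussianDistribution([1.5, -2.0], [0.3, -0.7])
    assert dist.mode() == [1.5, -2.0]


def test_kl_is_zero_for_standard_normal():
    dist = DiagonalGaussianDistribution([0.0, 0.0], [0.0, 0.0])
    assert abs(dist.kl()) < 1e-12


def test_kl_matches_closed_form():
    # One dimension, mean 1, unit variance: 0.5 * (1 + 1 - 1 - 0) = 0.5.
    dist = DiagonalGaussianDistribution([1.0], [0.0])
    assert abs(dist.kl() - 0.5) < 1e-12


def test_deterministic_sample_equals_mean():
    dist = DiagonalGaussianDistribution([1.5, -2.0], [0.3, -0.7], deterministic=True)
    assert dist.sample() == [1.5, -2.0]


def test_sample_mean_is_close_to_mean():
    rng = random.Random(0)
    dist = DiagonalGaussianDistribution([2.0], [0.0])
    draws = [dist.sample(rng)[0] for _ in range(20000)]
    assert abs(sum(draws) / len(draws) - 2.0) < 0.05
```

The same assertions should carry over to the tensor-based class with elementwise comparisons (for example `torch.allclose`) in place of list equality.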
modeling/modules/lpips.py:4
- Please replace `(year)` with the actual year or remove placeholder text to complete the license header.
All Bytedance's Modifications are Copyright (year) Bytedance Ltd. and/or its affiliates.
modeling/quantizer/quantizer.py:80
- The variables `min_encoding_indices` and `d` used in the `clustering_vq` block are not defined in this scope, which will cause a NameError. Ensure that these are computed earlier in `forward` or passed into this block before use.
encoding_indices = gather(min_encoding_indices)
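One way to make both names available before the `clustering_vq` block is to compute them with a nearest-codebook lookup, sketched below. This assumes `d` holds squared Euclidean distances between each latent vector and each codebook entry, which is the usual meaning in vector quantizers but is not confirmed by this PR. Plain Python lists stand in for tensors, and `nearest_codes` is a hypothetical helper name.

```python
def nearest_codes(z_flat, codebook):
    """Return (d, min_encoding_indices) for N latent vectors and K codebook entries.

    d[i][k] is the squared Euclidean distance between z_flat[i] and codebook[k];
    min_encoding_indices[i] is the index of the closest codebook entry.
    """
    d = [
        [sum((zi - ei) ** 2 for zi, ei in zip(z, e)) for e in codebook]
        for z in z_flat
    ]
    min_encoding_indices = [min(range(len(row)), key=row.__getitem__) for row in d]
    return d, min_encoding_indices


# Usage with two 2-D latents and a two-entry codebook.
z_flat = [[0.0, 0.0], [1.0, 1.0]]
codebook = [[0.1, 0.0], [0.9, 1.2]]
d, min_encoding_indices = nearest_codes(z_flat, codebook)
```

In a tensor implementation the same two values would typically come from a pairwise distance matrix followed by an argmin over the codebook dimension.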
modeling/modules/perceptual_loss.py:543
- Index 8 is out of range for a tensor with `num_frames = 8` (valid indices 0–7). Change to a valid frame index (e.g., 7) or use `num_frames - 1`.
target[:, :, 8] = torch.rand(2, 3, w, h).clamp(0, 1)
if "lpips" in model_name and "convnext_s" in model_name:
    loss_config = model_name.split('-')[-2:]
    self.loss_weight_lpips, self.loss_weight_convnext = float(loss_config[0]), float(loss_config[1])
    print(f"self.loss_weight_lpips, self.loss_weight_convnext: {self.loss_weight_lpips}, {self.loss_weight_convnext}")
[nitpick] Using print inside __init__ can clutter logs; consider using a logger or removing this debug statement.
Suggested change
- print(f"self.loss_weight_lpips, self.loss_weight_convnext: {self.loss_weight_lpips}, {self.loss_weight_convnext}")
+ logger.debug(f"self.loss_weight_lpips, self.loss_weight_convnext: {self.loss_weight_lpips}, {self.loss_weight_convnext}")
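The suggested change only works if a `logger` object exists in the module. A minimal sketch of the parsing step with a module-level logger is shown below; the helper name `parse_loss_weights` and the example model name are assumptions made for illustration, while the `split('-')[-2:]` logic mirrors the snippet under review.

```python
import logging

# Module-level logger, as the suggested logger.debug(...) call would need.
logger = logging.getLogger(__name__)


def parse_loss_weights(model_name: str) -> tuple[float, float]:
    """Read the LPIPS and ConvNeXt weights from the last two '-' separated fields."""
    if "lpips" not in model_name or "convnext_s" not in model_name:
        raise ValueError(f"not a combined LPIPS+ConvNeXt model name: {model_name!r}")
    lpips_weight, convnext_weight = (float(x) for x in model_name.split("-")[-2:])
    # Lazy %-style arguments avoid building the message when DEBUG is disabled.
    logger.debug(
        "loss_weight_lpips=%s, loss_weight_convnext=%s", lpips_weight, convnext_weight
    )
    return lpips_weight, convnext_weight


# Usage with an assumed name of the form "<...>-<lpips weight>-<convnext weight>".
weights = parse_loss_weights("lpips-convnext_s-1.0-0.1")
```

Raising on a malformed name surfaces configuration mistakes early instead of failing later inside `float(...)` with a less specific message.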
ensan-hcl approved these changes on Aug 18, 2025
kentosasaki-jp approved these changes on Aug 28, 2025
Member
@shun31y Can you upload the config file?
Why
Train end-to-end in a single stage, without the MaskGIT proxy codes ("pseudo-codes").
What
Add Single Stage Training (based on TA-TiTok).
How