I am having trouble reproducing the FID score of 7.13 with SD-VAE. I replaced VA-VAE with SD-VAE in your code framework and used the same training settings as in lightningdit_xl_vavae_f16d32_64ep_cfg.yaml, training for 100,000 steps. Unfortunately, the resulting model only reaches an FID of 8.970012796829849.
I kindly request your assistance in reproducing the FID score of 7.13. Could you please provide the training code or offer some guidance?
Thank you once again for your exceptional work!
The following are the possible reasons I can think of.
Latent normalization. For the SD-VAE training run, we kept the same normalization as DiT, which multiplies the latent directly by 0.18215. This means you need to set latent_norm to false and latent_multiplier to 0.18215 in the config. Because we trained a large number of tokenizers for the study, we adopted channel-wise normalization as a stable default, but it may not be optimal for SD-VAE.
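To make the difference between the two settings concrete, here is a minimal sketch in plain Python. Only latent_norm and latent_multiplier come from the config described above; the function name scale_latent, the per-channel mean and std arguments, and the exact order of operations are assumptions for illustration, not the repository's actual code.

```python
def scale_latent(latent, latent_norm, latent_multiplier, mean=None, std=None):
    """Scale a latent given as a list of channels (each a flat list of floats).

    Hypothetical helper: with latent_norm=False only the scalar multiplier
    is applied (DiT-style); with latent_norm=True each channel is first
    standardized with its own mean and std (assumed form of channel-wise
    normalization), then multiplied.
    """
    scaled = []
    for c, channel in enumerate(latent):
        if latent_norm:
            channel = [(v - mean[c]) / std[c] for v in channel]
        scaled.append([v * latent_multiplier for v in channel])
    return scaled


# DiT-style setting suggested for SD-VAE: no channel-wise statistics.
sd_vae_latent = scale_latent([[1.0, -2.0]], latent_norm=False,
                             latent_multiplier=0.18215)

# Channel-wise setting: made-up per-channel statistics, multiplier 1.0.
channelwise_latent = scale_latent([[1.0, 3.0]], latent_norm=True,
                                  latent_multiplier=1.0,
                                  mean=[2.0], std=[1.0])
```

If the latents are scaled one way during training and another way at sampling time, the decoder sees inputs at the wrong scale, so it is worth checking that both stages use the same two settings.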
Sampling details. For this part of the experiment, we used the dopri5 sampling method rather than Euler with 250 steps, and we did not use CFG interval or timestep shift. A friendly reminder: make sure the FID is computed on 50k samples. Our paper also reports some results for 10k samples, but those values are higher than the FID at 50k.
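The effect of the sample count can be illustrated with a toy one-dimensional analogue of FID. This is only a sketch under simplifying assumptions: real FID uses Inception features and full covariance matrices, and the function names, sample sizes, and reference statistics below are made up for the illustration. In one dimension the Fréchet distance between two Gaussians reduces to the squared difference of means plus the squared difference of standard deviations.

```python
import random
import statistics


def fid_1d(samples, ref_mean=0.0, ref_std=1.0):
    """1-D Frechet distance between the sample Gaussian and a fixed reference."""
    m = statistics.fmean(samples)
    s = statistics.pstdev(samples)
    return (m - ref_mean) ** 2 + (s - ref_std) ** 2


def mean_fid(n_samples, trials=50, seed=0):
    """Average the toy FID over several independent draws of n_samples points."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        # The "generated" samples follow the reference distribution exactly,
        # so any nonzero score comes purely from finite-sample error.
        samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        scores.append(fid_1d(samples))
    return statistics.fmean(scores)


small = mean_fid(100)    # few samples
large = mean_fid(5000)   # many samples
print(small > large)
```

Even though the generator here is perfect, the score with fewer samples is larger on average, which is why an FID measured on 10k samples should not be compared directly with one measured on 50k.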
Feel free to provide more details for further discussion.
Thank you for your outstanding work.