Loss doesn't decrease during Hunyuan LoRA training #89
Comments
Your LR is really low for that number of steps. Usually my loss starts around 0.2 and by the end reaches around 0.1, at least with timestep sampling "shift" and flow shift 7.0. I usually run for about 2000 steps at 2e-4, so at 1e-5 you would need significantly more, I think. I haven't seen anyone have much success below 5e-5, FWIW. That said, how is your output? Loss is not the be-all-end-all metric; for instance, if I train with timestep sampling "sigmoid" and flow shift 1 (good for characters), the loss doesn't really decrease but the target is still learned.
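For readers unfamiliar with these settings, here is a minimal Python sketch of how the timestep sampling mode and flow shift change which noise levels the model trains on. The function name and the exact shift formula are assumptions for illustration (the shift mapping shown is the one commonly used for rectified-flow models); check your trainer's source for its precise implementation.

```python
import torch

def sample_timesteps(batch_size, mode="sigmoid", shift=1.0):
    """Illustrative sketch (not the trainer's exact code) of how the
    sampling mode and flow shift pick noise levels t in (0, 1)."""
    if mode == "sigmoid":
        # Sigmoid of a normal biases t toward the middle of the schedule.
        t = torch.sigmoid(torch.randn(batch_size))
    else:
        # A uniform base draw for the "shift"-style mode.
        t = torch.rand(batch_size)
    # A flow shift > 1 pushes samples toward higher noise levels.
    t = shift * t / (1.0 + (shift - 1.0) * t)
    return t

print(sample_timesteps(4, mode="sigmoid", shift=1.0))
print(sample_timesteps(4, mode="shift", shift=7.0))
```

This is also why the absolute loss numbers are not directly comparable between a sigmoid/1 run and a shift/7 run: the loss is averaged over different distributions of noise levels.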
Hi Sarania, thanks again.
Sorry it took me a minute to reply, I've had a rough week. This is my loss graph for a recent run with sigmoid/1, versus the same dataset and everything else identical but with shift/7 (loss graphs attached in the original comment). Both of these had LR 1e-4 for 3600 steps, and the sigmoid run turned out better, which seems to be the case any time I train on images only.
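Raw per-step losses are noisy, so comparing two runs is easier on a smoothed curve. A small sketch, assuming you have already parsed the per-step losses out of your training log or a TensorBoard export into plain Python lists; the example values below are hypothetical.

```python
def ema_smooth(losses, alpha=0.98):
    """Exponential moving average, similar to TensorBoard's smoothing slider."""
    smoothed, prev = [], losses[0]
    for x in losses:
        prev = alpha * prev + (1 - alpha) * x
        smoothed.append(prev)
    return smoothed

# Hypothetical per-step losses for two runs, parsed from their logs.
run_sigmoid = [0.21, 0.19, 0.24, 0.18]
run_shift7 = [0.20, 0.17, 0.15, 0.16]
print(ema_smooth(run_sigmoid))
print(ema_smooth(run_shift7))
```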
My captions aren't special really. For image datasets I tend to use fairly simple natural-language captions that first describe the subject I'm training, then the background. For instance: "photo of ***** sitting in the grass with her legs crossed, wearing a white dress and smiling. Her hair is curled and she's wearing brown boots. In the background are bushes, trees, and a building, all slightly blurred." Or: "photo of ***** in a pink bra and jean shorts posing with her arms up. Behind her is a table and a mirror and on the table are various objects including a hairbrush." (The ****s are just censored personal names.)

I usually use Florence2 or Joycaption to autocaption them, then refine them manually. For videos I do everything manually and tend to be more verbose, making sure to mention the action or movement in the first sentence because that's what gets weighted most heavily. If you're training a character, it definitely helps to use their name in context, like "photo of Name standing in the park".

If you're having trouble getting any results at all no matter how you switch the settings around, it's possible something in your dataset is wrecking your gradients. I've had that happen before and sometimes it can be hard to weed out, so if you aren't getting any results at all I'd recommend trying with another dataset and seeing how that goes.
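If you suspect a broken sample is wrecking the gradients, a quick sanity pass over the dataset can catch the usual culprits (unreadable images, missing or empty captions) before retraining. A rough sketch; the folder layout (each image next to a same-named `.txt` caption) and the `dataset/aorun` path are assumptions, so adjust them to how your dataset is actually organized.

```python
from pathlib import Path
from PIL import Image

def check_dataset(root):
    """Flag unreadable images and missing or empty caption files."""
    problems = []
    for img_path in sorted(Path(root).glob("*")):
        if img_path.suffix.lower() not in {".jpg", ".jpeg", ".png", ".webp"}:
            continue
        try:
            with Image.open(img_path) as im:
                im.verify()  # cheap integrity check, does not fully decode
        except Exception as e:
            problems.append(f"{img_path.name}: unreadable image ({e})")
            continue
        cap_path = img_path.with_suffix(".txt")
        if not cap_path.exists():
            problems.append(f"{img_path.name}: no caption file")
        elif not cap_path.read_text(encoding="utf-8").strip():
            problems.append(f"{img_path.name}: empty caption")
    return problems

for p in check_dataset("dataset/aorun"):  # hypothetical dataset folder
    print(p)
```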
FYI, I have finished the Hunyuan LoRA training and everything ran fine, but the loss is quite high:
steps: 100%|██████████| 3720/3720 [5:40:44<00:00, 5.50s/it, avr_loss=2.77]saving checkpoint: /home/ubuntu/ComfyUI/output/lora_hunyuan/hunyuan-aorun-000001.safetensors
My hyperparameters are as follows: