Cannot offload / swap blocks. RuntimeError: "fill cpu" not implemented for Float8_efm3fn
#90
Comments
Firstly, you don't need fp8_llm. Secondly, I experienced OOM when using the blocks-to-swap feature, so I recommend not using it for now. And since you are using xformers, use split_attn too ("--xformers --split_attn"); on Linux you should use Triton and SageAttention instead, by the way. I trained some LoRAs using only 8GB of VRAM, so 12GB is possible!
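For example (a sketch only; the script name and dataset path are placeholders, not from this thread):

```sh
# Sketch: pair --split_attn with --xformers on the training command line.
# hv_train_network.py and dataset.toml are assumed placeholders; check --help.
python hv_train_network.py \
  --dataset_config dataset.toml \
  --xformers --split_attn
```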
Also, first time seeing Float8_efm3fn ☣
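(The dtype PyTorch actually defines is torch.float8_e4m3fn. For what it's worth, the error can be reproduced outside the trainer on PyTorch builds whose CPU kernels lack fp8 support; whether this one-liner raises depends on your version:)

```sh
# Minimal repro sketch: fill_() on a CPU fp8 tensor. On affected PyTorch builds this
# raises: RuntimeError: "fill_cpu" not implemented for 'Float8_e4m3fn'
python -c "import torch; torch.empty(4, dtype=torch.float8_e4m3fn).fill_(0.0)"
```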
OK, a couple of things.

If you are training for a character (not a style), try these to improve the quality:

I hope this will help you!
I managed to use blocks_to_swap with FP8. I'm training on a video dataset, so I don't know whether this will work for you or not, but the key is to clear the data cache and to lower the resolution for long videos (to about half or one third of the resolution of the short videos).
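By clearing the cache I mean deleting the cached latent/text-encoder files and re-running the cache scripts, roughly like this (the paths are placeholders; double-check the flags against the repo's README):

```sh
# Hypothetical paths; adjust to your setup. Remove stale cached files, then re-cache
# at the new (lower) resolution defined in the dataset config.
rm -rf /path/to/cache_dir
python cache_latents.py --dataset_config dataset.toml --vae /path/to/vae.safetensors
python cache_text_encoder_outputs.py --dataset_config dataset.toml \
  --text_encoder1 /path/to/text_encoder1 --text_encoder2 /path/to/text_encoder2
```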
I'm not using video at all. Not sure if that makes a difference here, but nothing I did with FP8 worked with blocks_to_swap for me. I might have to try different learning rates. The output is complete trash and the loss per epoch never drops below 0.25, so definitely trash output, and it runs for FAR longer than all the guides show. I'm not training a specific character, rather trying to train pictures of an animal; all slightly different specifics, but generally the same. IDK. I cropped all my photos to 544x960 and have only that one resolution now. Trying again with the settings above.
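For reference, the single-resolution setup lives in my dataset config, roughly like this (sketching from memory, so the section and key names may be off; double-check against the repo's dataset docs):

```sh
# Hypothetical dataset.toml for one fixed 544x960 (width x height) image resolution.
cat > dataset.toml <<'EOF'
[general]
resolution = [544, 960]
caption_extension = ".txt"
batch_size = 1

[[datasets]]
image_directory = "/path/to/photos"
cache_directory = "/path/to/cache_dir"
EOF
```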
0.25 is far too high. Can you share your training parameters?
I left this thread open in case there is something about the
I am trying to train on photos with a 12GB card on Linux. I was getting OOM, so I tried to enable block swapping (36 blocks). Regardless of what value I use, I get a traceback that includes this runtime error in the middle.
I don't have a lot of VRAM, so I thought I would try the FP8 model instead of the 25GB FP16 one. I got this error, swapped it out for the full model, and got the same error.
This is my script to start it (a shell script; ignore the backslash line continuations). I copied it from a photo (traveling today), so it might have a typo somewhere, but the gist should be there. I suppose I could have tried removing --fp8_base. I remember trying --fp16_base, which apparently isn't a thing.
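It was shaped roughly like this sketch (generic, from memory: apart from --fp8_base, the block-swap option, and the attention flags discussed in this thread, every name, path, and value below is a placeholder, not my actual setting):

```sh
#!/bin/sh
# Generic sketch of a LoRA training launch. Paths and most values are placeholders;
# flags other than the ones discussed in this thread are assumptions.
python hv_train_network.py \
  --dit /path/to/transformer_fp8.safetensors \
  --dataset_config dataset.toml \
  --xformers --split_attn \
  --fp8_base \
  --blocks_to_swap 36 \
  --mixed_precision bf16 \
  --gradient_checkpointing \
  --network_module networks.lora --network_dim 32 \
  --learning_rate 1e-3 --optimizer_type adamw8bit \
  --max_train_epochs 16 --save_every_n_epochs 1 \
  --output_dir output --output_name my_lora
```

If the fp8 path keeps failing during block swap, dropping --fp8_base (at the cost of VRAM) or updating PyTorch may sidestep the missing CPU fp8 kernel.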
Generally:

3. In my training data of normal-resolution photos, should I pre-crop and resize them?
4. The training seems to be able to handle AVIF pictures and WebP, considering it didn't error out on me. Should I manually convert these to JPG to be safe?
Really great work, thanks a ton!