Update pull.yml to test snapshot saving and loading #1486
base: main
Conversation
test snapshot saving and loading
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1486
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit e733606 with merge base 083fdaf, the following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Fixed typos.
cuda-32.json because somebody would rather fail a job than accept a partial group
@jerryzh168 @Jack-Khuu can you please have a look at what happens with reloading of the Int4 quantized Linear class from torchao? https://hud.pytorch.org/pr/pytorch/torchchat/1486#36825796920 shows this:
pull / test-gpu-aoti-bfloat16 (cuda, stories15M) / linux-job (gh)
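For context on this class of failure, here is a minimal stdlib sketch (not torchao or torchchat code; the class name is a hypothetical stand-in) of why reloading a snapshot that contains instances of a custom class can fail: the serialized data records the class's qualified name, not its code, so the class must be importable under that exact name at load time.

```python
import pickle

# Hypothetical stand-in for a quantized-linear wrapper; in the real
# failure the class would come from torchao, not from this module.
class Int4LinearStandIn:
    def __init__(self, packed_weight):
        self.packed_weight = packed_weight

snapshot = pickle.dumps(Int4LinearStandIn(b"\x12\x34"))

# Round-trip works while the class is importable under the same name.
restored = pickle.loads(snapshot)
assert restored.packed_weight == b"\x12\x34"

# Simulate loading the snapshot in an environment where the class is
# missing or renamed: unpickling then fails while resolving the name.
saved_cls = Int4LinearStandIn
del Int4LinearStandIn
try:
    pickle.loads(snapshot)
    outcome = "loaded"
except AttributeError:
    outcome = "class not found"
Int4LinearStandIn = saved_cls  # restore the name for clarity

print(outcome)
```

The same resolution step happens when a saved model references a tensor subclass: loading in an environment with a different torchao version (or without torchao) cannot reconstruct those objects.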
Removed the fp16 and fp32 int4 quantized models for now. @jerryzh168 I'm not sure why these dtypes are not compatible with int4 quantization?
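For readers unfamiliar with the scheme under discussion, here is a generic symmetric int4 weight-only quantize/dequantize sketch in pure Python. This is illustrative only, not torchao's implementation: torchao's packed int4 CUDA kernels work on packed weight layouts and are specialized for particular activation dtypes, which is presumably why some dtype combinations are rejected.

```python
def quantize_int4(weights):
    """Map floats to signed 4-bit integers in [-8, 7] with a per-tensor scale."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 7.0  # 7 is the largest positive int4 value
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    """Reconstruct approximate float weights from 4-bit codes."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.875]
q, scale = quantize_int4(weights)
approx = dequantize_int4(q, scale)
print(q)       # 4-bit codes, each in [-8, 7]
print(approx)  # reconstruction, within one quantization step of the input
```

Real int4 kernels additionally pack two 4-bit codes per byte and fuse the dequantize into the matmul, so the supported activation dtype is a property of the kernel, not of the scheme itself.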
Thanks for the find. It's using cuda, so it should be using the new subclass APIs too, hmm. See torchchat/torchchat/utils/quantize.py, lines 114 to 117 in 53a1004.