Add support for deepseek architecture .gguf #36144
Comments
cc @SunMarc @muellerzr @MekkCyber - who's the right person to ping for GGUF loading?
Seems that all deepseek-r1 gguf checkpoints are sharded; I think we should add sharded gguf support first.
Anyway, setting aside the sharded gguf weights (we can merge them with the tool from llama.cpp), will #35926 block us for now? The deepseek-v3 support hasn't landed yet.
Yes, deepseek v3 is still not supported for now; the PR is functional, but some small adjustments are needed.
Hello @MekkCyber! Isn't the reason transformers doesn't support it that the deepseek .gguf files can't be merged?
Hello @zh-jp! I think it can be merged using the tool from llama.cpp mentioned above.
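For reference, a minimal sketch of merging the shards before loading, assuming llama.cpp's `llama-gguf-split` binary is on the PATH. The binary name and flags can differ between llama.cpp releases, and the shard/output file names below are purely illustrative:

```python
# A minimal sketch, assuming llama.cpp's llama-gguf-split binary is on PATH.
# Binary name and flags can differ between llama.cpp releases, and the
# shard/output file names here are purely illustrative.
import subprocess

def merge_gguf_shards(first_shard: str, merged_out: str) -> str:
    """Merge a *-00001-of-000XX.gguf shard series into one file by
    pointing llama.cpp's split/merge tool at the first shard."""
    subprocess.run(
        ["llama-gguf-split", "--merge", first_shard, merged_out],
        check=True,
    )
    return merged_out

merged = merge_gguf_shards(
    "DeepSeek-R1-Q4_K_M-00001-of-00009.gguf",  # first shard (illustrative name)
    "DeepSeek-R1-Q4_K_M.gguf",                 # merged single-file output
)
print("merged checkpoint written to", merged)
```

Merging only works around the sharding issue, though; transformers would still need to recognize the deepseek architecture inside the merged file.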
we need GGUF supported! please! |
6 similar comments
Feature request
The current version does not support .gguf files for the deepseek architecture. It would be great if the deepseek architecture were added to the list of supported model architectures. [supported-model-architectures]
Motivation
Some frameworks built on transformers (e.g. vLLM) raise an error when loading a .gguf file of a deepseek model or a quantized deepseek model.
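For context, this is how GGUF loading already works in transformers for a supported architecture such as Llama; the request is for the same `gguf_file` path to work for deepseek checkpoints. The model id and file name below are illustrative:

```python
# Loading a GGUF checkpoint for an already-supported architecture (Llama).
# The model id and file name are illustrative examples.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
gguf_file = "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf"

# The gguf_file argument tells transformers to dequantize the GGUF
# weights into a standard torch model.
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)
```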
Your contribution
Is there any guidance to help users add the relevant support?
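As a rough starting point (an assumption for illustration, not a verified spec): GGUF support in transformers maps GGUF metadata keys onto config attributes, following the pattern of the existing entries in `src/transformers/integrations/ggml.py`. A deepseek entry might look roughly like this:

```python
# Illustrative assumption, not a verified spec: GGUF loading maps metadata
# keys from the .gguf header to transformers config attributes. The key
# names below mirror the style of existing entries in
# src/transformers/integrations/ggml.py, but are guesses for illustration.
DEEPSEEK_GGUF_CONFIG_MAPPING = {
    "context_length": "max_position_embeddings",
    "block_count": "num_hidden_layers",
    "embedding_length": "hidden_size",
    "feed_forward_length": "intermediate_size",
    "attention.head_count": "num_attention_heads",
    "attention.head_count_kv": "num_key_value_heads",
    "attention.layer_norm_rms_epsilon": "rms_norm_eps",
    "vocab_size": "vocab_size",
}
```

On top of the config mapping, the tensor names stored in the GGUF file also have to be mapped back to transformers parameter names, which is typically the bulk of the work when adding a new architecture.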