
Cannot load pre-quantized Janus Pro 7B #1498

Open

neilmehta24 opened this issue Jan 30, 2025 · 0 comments
System Info

Windows 11 Home
x64
python 3.11.9

env:

accelerate==1.3.0
attrdict==2.0.1
bitsandbytes==0.45.1
certifi==2024.12.14
charset-normalizer==3.4.1
colorama==0.4.6
einops==0.8.0
filelock==3.13.1
fsspec==2024.6.1
huggingface-hub==0.28.0
idna==3.10
-e git+https://github.com/deepseek-ai/Janus.git@a74a59f8a9084b78c7760c955037503b9b55b862#egg=janus
Jinja2==3.1.4
MarkupSafe==2.1.5
mpmath==1.3.0
networkx==3.3
numpy==2.1.2
packaging==24.2
pillow==11.0.0
psutil==6.1.1
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
safetensors==0.5.2
sentencepiece==0.2.0
six==1.17.0
sympy==1.13.1
timm==1.0.14
tokenizers==0.21.0
torch==2.6.0+cu124
torchaudio==2.6.0+cu124
torchvision==0.21.0+cu124
tqdm==4.67.1
transformers==4.48.1
typing_extensions==4.12.2
urllib3==2.3.0

Reproduction

Clone the Janus repo:

https://github.com/deepseek-ai/Janus. The rest of the examples are run from the root of that repo.

Download and run quantized model:

from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM

# Pre-quantized 4-bit checkpoint produced by the upload code further down
model_path = "neilmehta24/janus-pro-7b-4bit"

vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True
)

Error:

  File "C:\Users\windo\neil\Janus\load_janus_pro.py", line 21, in <module>
    vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\models\auto\auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\modeling_utils.py", line 4224, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\modeling_utils.py", line 4794, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\modeling_utils.py", line 875, in _load_state_dict_into_meta_model
    hf_quantizer.create_quantized_param(model, param, param_name, param_device, state_dict, unexpected_keys)
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\quantizers\quantizer_bnb_4bit.py", line 226, in create_quantized_param
    new_value = bnb.nn.Params4bit.from_prequantized(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\bitsandbytes\nn\modules.py", line 280, in from_prequantized
    self.quant_state = QuantState.from_dict(qs_dict=quantized_stats, device=device)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\bitsandbytes\functional.py", line 756, in from_dict
    raise ValueError(
ValueError: There should be exactly one `quant_state` item with ending from ['bitsandbytes__fp4', 'bitsandbytes__nf4'].
Detected ['aligner.layers.0.weight.quant_state.bitsandbytes__fp4', 'gen_aligner.layers.0.weight.quant_state.bitsandbytes__fp4'].

Here are the keys in qs_dict at the source of the error:

dict_keys(['aligner.layers.0.weight.absmax', 'aligner.layers.0.weight.quant_map', 'aligner.layers.0.weight.quant_state.bitsandbytes__fp4', 'gen_aligner.layers.0.weight.absmax', 'gen_aligner.layers.0.weight.quant_map', 'gen_aligner.layers.0.weight.quant_state.bitsandbytes__fp4'])
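
If I'm reading the traceback right, the qs_dict for aligner.layers.0.weight also picks up the gen_aligner.layers.0.weight.* entries because the parameter name is matched as a substring of the state-dict keys, and "aligner.layers.0.weight." happens to be a suffix of "gen_aligner.layers.0.weight.". A minimal sketch of that failure mode (my reconstruction, not the actual transformers/bitsandbytes code):

# Reconstruction of the stats-gathering step (assumed behavior, not the
# exact transformers code): per-parameter quant stats are collected by
# substring-matching the parameter name against state-dict keys.
state_dict_keys = [
    "aligner.layers.0.weight.absmax",
    "aligner.layers.0.weight.quant_map",
    "aligner.layers.0.weight.quant_state.bitsandbytes__fp4",
    "gen_aligner.layers.0.weight.absmax",
    "gen_aligner.layers.0.weight.quant_map",
    "gen_aligner.layers.0.weight.quant_state.bitsandbytes__fp4",
]

param_name = "aligner.layers.0.weight"

# "aligner.layers.0.weight." is a substring of "gen_aligner.layers.0.weight.",
# so the filter sweeps in both parameters' stats.
qs_dict_keys = [k for k in state_dict_keys if param_name + "." in k]
print(qs_dict_keys)  # all six keys -- exactly the dict_keys shown above

# QuantState.from_dict then expects exactly one `quant_state` entry but
# finds two, which matches the ValueError in the traceback.
quant_state_keys = [k for k in qs_dict_keys if "quant_state" in k]
assert len(quant_state_keys) == 2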

For reference, the original config.json:

https://huggingface.co/deepseek-ai/Janus-Pro-7B/blob/main/config.json

the quantized config.json:

https://huggingface.co/neilmehta24/janus-pro-7b-4bit/blob/main/config.json

the model upload code:

from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM

model_path = "deepseek-ai/Janus-Pro-7B"

# Quantize to 4-bit on the fly, then push the quantized weights to the Hub
quantization_config = dict(load_in_4bit=True)
vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, quantization_config=quantization_config, torch_dtype="auto"
)
vl_gpt.push_to_hub("janus-pro-7b-4bit")
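
For reference, the dict above is shorthand for an explicit BitsAndBytesConfig. Spelled out (BitsAndBytesConfig is the standard transformers API; fp4 is its default 4-bit quant type, which matches the *.quant_state.bitsandbytes__fp4 keys in the error):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from janus.models import MultiModalityCausalLM

# Equivalent to quantization_config=dict(load_in_4bit=True); the default
# bnb_4bit_quant_type is "fp4", consistent with the serialized
# *.quant_state.bitsandbytes__fp4 keys.
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
vl_gpt = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/Janus-Pro-7B",
    trust_remote_code=True,
    quantization_config=quantization_config,
    torch_dtype="auto",
)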

Notes:

  • The model runs just fine when it's quantized on the fly without being saved (the upload code above minus the push_to_hub call).
  • This looks like a shortcoming either in how the quantization state is written out or in how the keys are validated on load, especially since on-the-fly quantization works. If the substring matching described above is the culprit, a stricter key filter would avoid it; see the sketch below.
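
A hypothetical illustration of that stricter filter (exact-prefix instead of substring matching; not a patch against the actual transformers source):

state_dict_keys = [
    "aligner.layers.0.weight.quant_state.bitsandbytes__fp4",
    "gen_aligner.layers.0.weight.quant_state.bitsandbytes__fp4",
]
param_name = "aligner.layers.0.weight"

# Substring match (the behavior reconstructed above): both keys match.
assert len([k for k in state_dict_keys if param_name + "." in k]) == 2

# Exact-prefix match: only the aligner entry matches, leaving exactly one
# `quant_state` item per parameter, as QuantState.from_dict expects.
assert len([k for k in state_dict_keys if k.startswith(param_name + ".")]) == 1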

Expected behavior

I would expect the model to load, since it can load and quantize just fine when it has access to the full weights.
