Hello! Thanks for the nice work!
I want to quantize Llama-2-70B. I was able to export the quantized model without any error. However, when I test the model, I encounter this error:

```
/home/mmilin/projects/ming_benchmark_vllm/autosmq-venv/lib/python3.9/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
/home/mmilin/projects/ming_benchmark_vllm/autosmq-venv/lib/python3.9/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
/home/mmilin/projects/ming_benchmark_vllm/autosmq-venv/lib/python3.9/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
Loading checkpoint shards: 0%| | 0/15 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/mmilin/projects/ming_benchmark_vllm/AutoSmoothQuant/autosmoothquant/examples/test_model.py", line 62, in <module>
main()
File "/home/mmilin/projects/ming_benchmark_vllm/autosmq-venv/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/mmilin/projects/ming_benchmark_vllm/AutoSmoothQuant/autosmoothquant/examples/test_model.py", line 40, in main
model = Int8LlamaForCausalLM.from_pretrained(args.model_path, quant_config, device_map="sequential")
File "/home/mmilin/projects/ming_benchmark_vllm/autosmq-venv/lib/python3.9/site-packages/transformers/modeling_utils.py", line 3706, in from_pretrained
) = cls._load_pretrained_model(
File "/home/mmilin/projects/ming_benchmark_vllm/autosmq-venv/lib/python3.9/site-packages/transformers/modeling_utils.py", line 4116, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/mmilin/projects/ming_benchmark_vllm/autosmq-venv/lib/python3.9/site-packages/transformers/modeling_utils.py", line 778, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/mmilin/projects/ming_benchmark_vllm/autosmq-venv/lib/python3.9/site-packages/accelerate/utils/modeling.py", line 345, in set_module_tensor_to_device
raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([1024, 8192]) in "weight" (which has shape torch.Size([8192, 8192])), this look incorrect.
```
BTW, I was able to convert and load the Llama-2-7b model without any error. Any idea how to fix this? It looks related to grouped-query attention; see the sketch below.
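For reference, here is a minimal sketch of why grouped-query attention would produce exactly these shapes. The config values below are the standard published Llama-2-70B ones and are an assumption here, not read from the failing checkpoint:

```python
# Minimal sketch: projection shapes under grouped-query attention (GQA).
# These are the standard Llama-2-70B config values (assumed, for illustration).
hidden_size = 8192
num_attention_heads = 64
num_key_value_heads = 8                         # GQA: 8 KV heads shared by 64 query heads
head_dim = hidden_size // num_attention_heads   # 128

q_proj_shape = (num_attention_heads * head_dim, hidden_size)   # (8192, 8192)
kv_proj_shape = (num_key_value_heads * head_dim, hidden_size)  # (1024, 8192)

# The checkpoint's k_proj/v_proj weights are (1024, 8192), which matches the
# tensor in the ValueError above; a module built as (8192, 8192) would be
# assuming plain multi-head attention (num_key_value_heads == num_attention_heads).
print(q_proj_shape, kv_proj_shape)
```

If the Int8 module builds its k_proj/v_proj from num_attention_heads alone, it would expect (8192, 8192) and fail exactly as above.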
Many thanks ahead!
Hi! We haven't encountered this problem before. Could you please post your config.json for both models?
And in which linear layer does the shape mismatch occur?
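For anyone cross-checking the two config.json files, these are the fields that should explain the mismatch. The numbers below are the standard published values, shown as an assumption rather than read from the poster's files:

```python
# Config fields worth comparing across the two models' config.json files.
# Standard published values (assumed here for illustration):
llama2_7b = {"hidden_size": 4096, "num_attention_heads": 32,
             "num_key_value_heads": 32}   # MHA: KV heads == query heads
llama2_70b = {"hidden_size": 8192, "num_attention_heads": 64,
              "num_key_value_heads": 8}   # GQA: fewer KV heads

for name, cfg in [("7B", llama2_7b), ("70B", llama2_70b)]:
    head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
    print(name, "kv out features:", cfg["num_key_value_heads"] * head_dim)
# 7B  kv out features: 4096 -> k_proj matches hidden_size, loads fine
# 70B kv out features: 1024 -> mismatch if the module ignores num_key_value_heads
```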