Unable to start tgi-gaudi server with meta-llama/Llama-3.2-11B-Vision-Instruct #270

Open · dmsuehir opened this issue Feb 6, 2025 · 2 comments

System Info

Docker image: ghcr.io/huggingface/tgi-gaudi:2.3.1
OS: Ubuntu 22.04
HL-SMI Version: hl-1.18.0-fw-53.1.1.1
Driver Version: 1.18.0-ee698fb

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

I'm trying to run tgi-gaudi with meta-llama/Llama-3.2-11B-Vision-Instruct. I'm starting the tgi-gaudi server using the 2.3.1 Docker image as follows:

docker run -d \
    --name tgi-gaudi-llama-vision \
    -p 8399:80 \
    --env http_proxy=$http_proxy \
    --env https_proxy=$https_proxy \
    --env HF_HUB_DISABLE_PROGRESS_BARS=1 \
    --env HF_HUB_ENABLE_HF_TRANSFER=0 \
    --env HABANA_VISIBLE_DEVICES=0 \
    --env OMPI_MCA_btl_vader_single_copy_mechanism=none \
    --env PREFILL_BATCH_BUCKET_SIZE=1 \
    --env BATCH_BUCKET_SIZE=1 \
    --env MAX_BATCH_TOTAL_TOKENS=4096 \
    --env ENABLE_HPU_GRAPH=true \
    --env LIMIT_HPU_GRAPH=true \
    --env USE_FLASH_ATTENTION=true \
    --env FLASH_ATTENTION_RECOMPUTE=true \
    --env HF_TOKEN=${HF_TOKEN} \
    --runtime=habana \
    --cap-add=sys_nice \
    --ipc=host \
    ghcr.io/huggingface/tgi-gaudi:2.3.1 \
    --model-id meta-llama/Llama-3.2-11B-Vision-Instruct \
    --max-input-tokens 3048 \
    --max-total-tokens 4096
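
Since the container is launched detached (-d), I pull its output with docker logs, e.g.:

# container name matches the --name used in the run command above
docker logs -f tgi-gaudi-llama-vision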

I'm seeing the following error in the logs:

Traceback (most recent call last):
  File "/usr/local/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/usr/local/lib/python3.10/dist-packages/text_generation_server/cli.py", line 170, in serve
    server.serve(
  File "/usr/local/lib/python3.10/dist-packages/text_generation_server/server.py", line 259, in serve
    asyncio.run(
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/dist-packages/text_generation_server/server.py", line 213, in serve_inner
    model = get_model_with_lora_adapters(
  File "/usr/local/lib/python3.10/dist-packages/text_generation_server/models/__init__.py", line 227, in get_model_with_lora_adapters
    model = get_model(
  File "/usr/local/lib/python3.10/dist-packages/text_generation_server/models/__init__.py", line 200, in get_model
    return CausalLM(
  File "/usr/local/lib/python3.10/dist-packages/text_generation_server/models/causal_lm.py", line 695, in __init__
    raise ValueError(f"Model type {model.config.model_type} is not supported!")
ValueError: Model type mllama_text_model is not supported!

Expected behavior

I would expect the server to start successfully, since meta-llama/Llama-3.2-11B-Vision-Instruct is listed on the supported models page as "Mllama".
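
For reference, once the server is up, the first sanity check I would run is a plain text-only request against TGI's /generate endpoint on the mapped port (a minimal sketch; the prompt and parameters are placeholders):

# sketch: text-only request to TGI's /generate endpoint
# (port 8399 is mapped to the container's port 80 in the docker run above)
curl -s http://localhost:8399/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "What is deep learning?", "parameters": {"max_new_tokens": 32}}'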

yuanwu2017 (Collaborator) commented:

It will be enabled in the next release.

dmsuehir (Author) commented:

> It will be enabled in the next release.

Good to hear, thanks.
