failed to load model from /mnt/models/model.file when trying to run granite model #691

Open
miabbott opened this issue Jan 31, 2025 · 5 comments

@miabbott

Using a ThinkPad T14s Gen 2i with Fedora 41; installed v0.5.4 via pip install ramalama

When trying to run a Granite model, I got an error: failed to load model from /mnt/models/model.file. See details below.

$  ramalama run huggingface://ibm-granite/granite-3.1-8b-instruct
Fetching 15 files:   0%| | 0/15 [00:00<?, ?it/s]
Downloading 'generation_config.json' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/3EVKVggOldJcKSsGjSdoUCN1AyQ=.0e7ca8f1a4cf587858c703b962bf0730b90ce699.incomplete'
Downloading 'model-00001-of-00004.safetensors' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/IO4xwqmZYzFmxznkwkiNSBwO1H0=.191c4e9c6263d9cf591104f2d16ab2c39dcc43c1ad0680cc5a34d5c86d61ee41.incomplete'
Downloading 'added_tokens.json' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/SeqzFlf9ZNZ3or_wZAOIdsM3Yxw=.183eb810668f005a1ed0e0c8be060e5f47c23f2f.incomplete'
Downloading '.gitattributes' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/wPaCkH-WbT7GsmxMKKrNZTV4nSM=.a6344aac8c09253b3b630fb776ae94478aa0275b.incomplete'
Downloading 'model-00002-of-00004.safetensors' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/t9msAuTjAZjuQnmzGOwTjiptvIU=.c7c38b0d5a436775b09d764465ed6e6eb7a8c4e302d05e301e151c96e3076f22.incomplete'
Downloading 'merges.txt' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/PtHk0z_I45atnj23IIRhTExwT3w=.f8479fb696fe07332c55300a6accf8cc191acc6a.incomplete'
Downloading 'config.json' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/8_PA_wEVGiVa2goH2H4KQOQpvVY=.9e06dd77173ab8d7abcb7c5d8104c9979a09158f.incomplete'
generation_config.json: 100%|██████████| 132/132 [00:00<00:00, 773kB/s]
added_tokens.json: 100%|██████████| 87.0/87.0 [00:00<00:00, 546kB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/generation_config.json
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/added_tokens.json
Downloading 'README.md' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/Xn7B-BWUGOee2Y6hCZtEhtFu4BE=.9374560b34a68eb0c9edd6666048eb2018e448f1.incomplete'
.gitattributes: 100%|██████████| 1.52k/1.52k [00:00<00:00, 11.5MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.gitattributes
README.md: 100%|██████████| 20.4k/20.4k [00:00<00:00, 47.1MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/README.md
Downloading 'model-00004-of-00004.safetensors' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/-dFtyT7kcgbTHt1cy9JKqruJCR4=.9d86d201ff8e73d8a46e92b543c9dd44f133e60b35ccada4a76439af62f22212.incomplete'
Downloading 'model-00003-of-00004.safetensors' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/DaGOU-KRMVrY0aYktrsE34tL0Bs=.f02784b72391fa04e9b986313c1a1720ce88f0eb40f7ae81fa0daadc93049457.incomplete'
Downloading 'model.safetensors.index.json' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/yVzAsSxRSINSz-tQbpx-TLpfkLU=.6e06d9ba85709934f6a8cad738d36225c8044890.incomplete'
Downloading 'special_tokens_map.json' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/ahkChHUJFxEmOdq5GDFEmerRzCY=.386500a5040da66c6db3d8b9c44ccd1ee202c744.incomplete'
merges.txt: 100%|██████████| 442k/442k [00:00<00:00, 4.39MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/merges.txt
model.safetensors.index.json: 100%|██████████| 29.8k/29.8k [00:00<00:00, 8.39MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/model.safetensors.index.json
Downloading 'tokenizer.json' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/HgM_lKo9sdSCfRtVg7MMFS7EKqo=.11f3f2c66235d30bc9bf04ad7e1b4a19d80e84da.incomplete'
Downloading 'tokenizer_config.json' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/vzaExXFZNBay89bvlQv-ZcI6BTg=.3e1e4c1b08817b77f3010e7aab1b87ad645e6a42.incomplete'
tokenizer.json: 100%|██████████| 3.48M/3.48M [00:00<00:00, 26.0MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/tokenizer.json
Downloading 'vocab.json' to '/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/.cache/huggingface/download/j3m-Hy6QvBddw8RXA1uSWl1AJ0c=.0a11f2016e660fd490f7bf168e6d1f9c86a8f744.incomplete'
config.json: 100%|██████████| 790/790 [00:00<00:00, 5.13MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/config.json
special_tokens_map.json: 100%|██████████| 701/701 [00:00<00:00, 4.45MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/special_tokens_map.json
vocab.json: 100%|██████████| 777k/777k [00:00<00:00, 11.2MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/vocab.json
tokenizer_config.json: 100%|██████████| 8.07k/8.07k [00:00<00:00, 76.1MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/tokenizer_config.json
model-00004-of-00004.safetensors: 100%|██████████| 1.41G/1.41G [02:15<00:00, 10.4MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/model-00004-of-00004.safetensors
model-00003-of-00004.safetensors: 100%|██████████| 4.97G/4.97G [07:57<00:00, 10.4MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/model-00003-of-00004.safetensors
model-00001-of-00004.safetensors: 100%|██████████| 4.97G/4.97G [07:58<00:00, 10.4MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/model-00001-of-00004.safetensors
model-00002-of-00004.safetensors: 100%|██████████| 4.99G/4.99G [07:59<00:00, 10.4MB/s]
Download complete. Moving file to /var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/model-00002-of-00004.safetensors
Fetching 15 files: 100%|██████████| 15/15 [08:00<00:00, 32.01s/it]
Trying to pull quay.io/ramalama/ramalama:latest...
Getting image source signatures
Copying blob e6e7a002185f done   | 
Copying blob ec465ce79861 done   | 
Copying blob facf1e7dd3e0 done   | 
Copying blob c1a2e726093f done   | 
Copying config a7c40555ab done   | 
Writing manifest to image destination
Loading model
gguf_init_from_file_impl: failed to read magic
llama_model_load: error loading model: llama_model_loader: failed to load model from /mnt/models/model.file

llama_model_load_from_file_impl: failed to load model
initialize_model: error: unable to load model from file: /mnt/models/model.file
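
For context on the "failed to read magic" message: a GGUF file begins with the four-byte magic GGUF, and llama.cpp's loader bails out immediately when those bytes are missing. A minimal, standalone way to check what is actually at a path (an illustration only, not ramalama or llama.cpp code):

import sys

GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file

def looks_like_gguf(path: str) -> bool:
    # Mirrors the check that fails above: if the path is a directory or a
    # non-GGUF file (e.g. a safetensors checkpoint), the magic read cannot succeed.
    try:
        with open(path, "rb") as f:
            return f.read(4) == GGUF_MAGIC
    except OSError:  # includes IsADirectoryError
        return False

print(looks_like_gguf(sys.argv[1]))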

Repeated run with --debug:

$ ramalama --debug run huggingface://ibm-granite/granite-3.1-8b-instruct
run_cmd:  podman inspect quay.io/ramalama/ramalama:0.5
Working directory: None
Ignore stderr: False
Ignore all: True
exec_cmd:  podman run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_bHsGsFm5ih --pull=newer -t --device /dev/dri --mount=type=bind,src=/var/home/miabbott/.local/share/ramalama/models/huggingface/ibm-granite/granite-3.1-8b-instruct,destination=/mnt/models/model.file,ro quay.io/ramalama/ramalama:latest llama-run -c 2048 --temp 0.8 --jinja -v /mnt/models/model.file
Loading model
llama_model_load_from_file_impl: using device Kompute0 (Intel(R) Xe Graphics (TGL GT2)) - 15896 MiB free
gguf_init_from_file_impl: failed to read magic
llama_model_load: error loading model: llama_model_loader: failed to load model from /mnt/models/model.file

llama_model_load_from_file_impl: failed to load model
initialize_model: error: unable to load model from file: /mnt/models/model.file

See the details of the raw podman run command with debug logging:

$ podman --log-level=debug run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_bHsGsFm5ih --pull=newer -t --device /dev/dri --mount=type=bind,src=/var/home/miabbott/.local/share/ramalama/models/huggingface/ibm-granite/granite-3.1-8b-instruct,destination=/mnt/models/model.file,ro quay.io/ramalama/ramalama:latest llama-run -c 2048 --temp 0.8 --jinja -v /mnt/models/model.file
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called run.PersistentPreRunE(podman --log-level=debug run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_bHsGsFm5ih --pull=newer -t --device /dev/dri --mount=type=bind,src=/var/home/miabbott/.local/share/ramalama/models/huggingface/ibm-granite/granite-3.1-8b-instruct,destination=/mnt/models/model.file,ro quay.io/ramalama/ramalama:latest llama-run -c 2048 --temp 0.8 --jinja -v /mnt/models/model.file)
DEBU[0000] Using conmon: "/usr/bin/conmon"                                                                                                                                                    
INFO[0000] Using sqlite as database backend                                                                                                                                                   
DEBU[0000] Using graph driver overlay                                                                                                                                                         
DEBU[0000] Using graph root /var/home/miabbott/.local/share/containers/storage                                                                                                                                                                                                                                                                                                               
DEBU[0000] Using run root /run/user/1000/containers                                                                                                                                                                                                                                                                                                                                          
DEBU[0000] Using static dir /var/home/miabbott/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp                  
DEBU[0000] Using volume path /var/home/miabbott/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                                                                                                                                                       
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs
DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false                                                                                                                                                                                                                                                                                              
DEBU[0000] Initializing event backend journald                                                                                                                                                                                                                                                                                                                                               
DEBU[0000] Configured OCI runtime crun-vm initialization failed: no valid executable found for OCI runtime crun-vm: invalid argument                                                                                                                                                                                                                                                         
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument 
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument                                                                                                                                                                                                                                                               
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument                                                  
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument                                                                                                                                                                                                                                                             
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Setting parallel job count to 25
DEBU[0000] Pulling image quay.io/ramalama/ramalama:latest (policy: newer)
DEBU[0000] Looking up image "quay.io/ramalama/ramalama:latest" in local containers storage
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux  [] }
DEBU[0000] Trying "quay.io/ramalama/ramalama:latest" ...
DEBU[0000] parsed reference into "[overlay@/var/home/miabbott/.local/share/containers/storage+/run/user/1000/containers:overlay.mount_program=/usr/bin/fuse-overlayfs]@a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492" 
DEBU[0000] Found image "quay.io/ramalama/ramalama:latest" as "quay.io/ramalama/ramalama:latest" in local containers storage 
DEBU[0000] Found image "quay.io/ramalama/ramalama:latest" as "quay.io/ramalama/ramalama:latest" in local containers storage ([overlay@/var/home/miabbott/.local/share/containers/storage+/run/user/1000/containers:overlay.mount_program=/usr/bin/fuse-overlayfs]@a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492) 
DEBU[0000] exporting opaque data as blob "sha256:a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492" 
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf"                                                                                                                 
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf.d/000-shortnames.conf" 
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux  [] }                                                                                                                              
DEBU[0000] Attempting to pull candidate quay.io/ramalama/ramalama:latest for quay.io/ramalama/ramalama:latest 
DEBU[0000] Using registries.d directory /etc/containers/registries.d 
DEBU[0000] Trying to access "quay.io/ramalama/ramalama:latest"                                                                                                                                
DEBU[0000] No credentials matching quay.io/ramalama/ramalama found in /run/user/1000/containers/auth.json 
DEBU[0000] No credentials matching quay.io/ramalama/ramalama found in /var/home/miabbott/.config/containers/auth.json 
DEBU[0000] Found credentials for quay.io/ramalama/ramalama in credential helper containers-auth.json in file /var/home/miabbott/.docker/config.json 
DEBU[0000]  No signature storage configuration found for quay.io/ramalama/ramalama:latest, using built-in default file:///var/home/miabbott/.local/share/containers/sigstore 
DEBU[0000] Looking for TLS certificates and private keys in /etc/docker/certs.d/quay.io 
DEBU[0000] GET https://quay.io/v2/                                                                                                                                                            
DEBU[0000] Ping https://quay.io/v2/ status 401                                                 
DEBU[0000] GET https://quay.io/v2/auth?account=miabbott&scope=repository%3Aramalama%2Framalama%3Apull&service=quay.io 
DEBU[0000] Increasing token expiration to: 60 seconds                                                                                                                                         
DEBU[0000] GET https://quay.io/v2/ramalama/ramalama/manifests/latest                                                                                                                          
DEBU[0001] Content-Type from manifest GET is "application/vnd.oci.image.index.v1+json"                                                                                                        
DEBU[0001] GET https://quay.io/v2/ramalama/ramalama/manifests/sha256:1f370473600430d8f2572a5f385911c125dc0b58989910afca7a80ec965ad8b6 
DEBU[0001] Content-Type from manifest GET is "application/vnd.oci.image.manifest.v1+json"                                                                                                     
DEBU[0001] Skipping pull candidate quay.io/ramalama/ramalama:latest as the image is not newer (pull policy newer)
DEBU[0001] Looking up image "quay.io/ramalama/ramalama:latest" in local containers storage
DEBU[0001] Normalized platform linux/amd64 to {amd64 linux  [] } 
DEBU[0001] Trying "quay.io/ramalama/ramalama:latest" ...                                                                                                                   
DEBU[0001] parsed reference into "[overlay@/var/home/miabbott/.local/share/containers/storage+/run/user/1000/containers:overlay.mount_program=/usr/bin/fuse-overlayfs]@a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492" 
DEBU[0001] Found image "quay.io/ramalama/ramalama:latest" as "quay.io/ramalama/ramalama:latest" in local containers storage 
DEBU[0001] Found image "quay.io/ramalama/ramalama:latest" as "quay.io/ramalama/ramalama:latest" in local containers storage ([overlay@/var/home/miabbott/.local/share/containers/storage+/run/user/1000/containers:overlay.mount_program=/usr/bin/fuse-overlayfs]@a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492) 
DEBU[0001] [graphdriver] trying provided driver "overlay" 
DEBU[0001] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0001] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false  
DEBU[0001] exporting opaque data as blob "sha256:a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492" 
DEBU[0001] Looking up image "quay.io/ramalama/ramalama:latest" in local containers storage 
DEBU[0001] Normalized platform linux/amd64 to {amd64 linux  [] } 
DEBU[0001] Trying "quay.io/ramalama/ramalama:latest" ... 
DEBU[0001] parsed reference into "[overlay@/var/home/miabbott/.local/share/containers/storage+/run/user/1000/containers:overlay.mount_program=/usr/bin/fuse-overlayfs]@a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492" 
DEBU[0001] Found image "quay.io/ramalama/ramalama:latest" as "quay.io/ramalama/ramalama:latest" in local containers storage 
DEBU[0001] Found image "quay.io/ramalama/ramalama:latest" as "quay.io/ramalama/ramalama:latest" in local containers storage ([overlay@/var/home/miabbott/.local/share/containers/storage+/run/user/1000/containers:overlay.mount_program=/usr/bin/fuse-overlayfs]@a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492) 
DEBU[0001] exporting opaque data as blob "sha256:a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492" 
DEBU[0001] Inspecting image a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492 
DEBU[0001] exporting opaque data as blob "sha256:a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492" 
DEBU[0001] Inspecting image a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492 
DEBU[0001] Inspecting image a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492 
DEBU[0001] Inspecting image a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492 
DEBU[0001] using systemd mode: false                    
DEBU[0001] setting container name ramalama_bHsGsFm5ih   
DEBU[0001] Non-CDI device /dev/dri; assuming standard device 
DEBU[0001] No hostname set; container's hostname will default to runtime default 
DEBU[0001] Loading seccomp profile from "/usr/share/containers/seccomp.json" 
DEBU[0001] Adding mount /proc                           
DEBU[0001] Adding mount /dev                            
DEBU[0001] Adding mount /dev/pts                        
DEBU[0001] Adding mount /dev/mqueue                     
DEBU[0001] Adding mount /sys                            
DEBU[0001] Adding mount /sys/fs/cgroup                  
DEBU[0001] Adding mount /dev/dri/card1                  
DEBU[0001] Adding mount /dev/dri/renderD128             
DEBU[0001] Allocated lock 0 for container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] exporting opaque data as blob "sha256:a7c40555ab12c0b25cb99c2130506b15958aa4c228e4f56e301cb738f68c2492" 
DEBU[0001] Created container "b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a"  
DEBU[0001] Container "b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a" has work directory "/var/home/miabbott/.local/share/containers/storage/overlay-containers/b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a/userdata" 
DEBU[0001] Container "b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a" has run directory "/run/user/1000/containers/overlay-containers/b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a/userdata" 
DEBU[0001] Handling terminal attach                     
INFO[0001] Received shutdown.Stop(), terminating!        PID=1108751
DEBU[0001] Enabling signal proxying                     
DEBU[0001] Made network namespace at /run/user/1000/netns/netns-f18aeaab-d8b7-e891-a918-304ad34eb395 for container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] overlay: mount_data=lowerdir=/var/home/miabbott/.local/share/containers/storage/overlay/l/I3DZLKAY5GM6E3AI2YKODNCA7M:/var/home/miabbott/.local/share/containers/storage/overlay/l/36XUVWVBWD5UHEJI6DNVFSHWPQ:/var/home/miabbott/.local/share/containers/storage/overlay/l/B6ZIDKFUQFAFPRMTVJYVHGZFQP:/var/home/miabbott/.local/share/containers/storage/overlay/l/SIZV35HKMFAKJCORFGGGT3CAXB,upperdir=/var/home/miabbott/.local/share/containers/storage/overlay/02891134fbbb38d7381ff0301f2efd43325b1065f06f4caf6a3b7e502fb80922/diff,workdir=/var/home/miabbott/.local/share/containers/storage/overlay/02891134fbbb38d7381ff0301f2efd43325b1065f06f4caf6a3b7e502fb80922/work,volatile,context="system_u:object_r:container_file_t:s0:c1022,c1023"
DEBU[0001] pasta arguments: --config-net --dns-forward 169.254.1.1 -t none -u none -T none -U none --no-map-gw --quiet --netns /run/user/1000/netns/netns-f18aeaab-d8b7-e891-a918-304ad34eb395 --map-guest-addr 169.254.1.2 
DEBU[0001] Mounted container "b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a" at "/var/home/miabbott/.local/share/containers/storage/overlay/02891134fbbb38d7381ff0301f2efd43325b1065f06f4caf6a3b7e502fb80922/merged" 
DEBU[0001] Created root filesystem for container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a at /var/home/miabbott/.local/share/containers/storage/overlay/02891134fbbb38d7381ff0301f2efd43325b1065f06f4caf6a3b7e502fb80922/merged 
INFO[0001] pasta logged warnings: "Couldn't get any nameserver address" 
DEBU[0001] /proc/sys/crypto/fips_enabled does not contain '1', not adding FIPS mode bind mounts  
DEBU[0001] Setting Cgroups for container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a to user.slice:libpod:b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] Set root propagation to "rslave"             
DEBU[0001] reading hooks from /usr/share/containers/oci/hooks.d 
DEBU[0001] Workdir "/" resolved to host path "/var/home/miabbott/.local/share/containers/storage/overlay/02891134fbbb38d7381ff0301f2efd43325b1065f06f4caf6a3b7e502fb80922/merged" 
DEBU[0001] Created OCI spec for container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a at /var/home/miabbott/.local/share/containers/storage/overlay-containers/b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a/userdata/config.json 
DEBU[0001] /usr/bin/conmon messages will be logged to syslog 
DEBU[0001] running conmon: /usr/bin/conmon               args="[--api-version 1 -c b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a -u b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a -r /usr/bin/crun -b /var/home/miabbott/.local/share/containers/storage/overlay-containers/b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a/userdata -p /run/user/1000/containers/overlay-containers/b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a/userdata/pidfile -n ramalama_bHsGsFm5ih --exit-dir /run/user/1000/libpod/tmp/exits --persist-dir /run/user/1000/libpod/tmp/persist/b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a --full-attach -s -l journald --log-level debug --syslog -t --conmon-pidfile /run/user/1000/containers/overlay-containers/b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/home/miabbott/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000/containers --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /var/home/miabbott/.local/share/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg sqlite --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --stopped-only --exit-command-arg --rm --exit-command-arg b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a]"
DEBU[0001] Received: 1108806                            
INFO[0001] Got Conmon PID as 1108804                    
DEBU[0001] Created container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a in OCI runtime 
DEBU[0001] found local resolver, using "/run/systemd/resolve/resolv.conf" to get the nameservers 
DEBU[0001] Attaching to container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] Received a resize event: {Width:381 Height:89} 
DEBU[0001] Starting container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a with command [llama-run -c 2048 --temp 0.8 --jinja -v /mnt/models/model.file] 
DEBU[0001] Started container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] Notify sent successfully                     
Loading model
llama_model_load_from_file_impl: using device Kompute0 (Intel(R) Xe Graphics (TGL GT2)) - 15896 MiB free
gguf_init_from_file_impl: failed to read magic
llama_model_load: error loading model: llama_model_loader: failed to load model from /mnt/models/model.file

llama_model_load_from_file_impl: failed to load model
initialize_model: error: unable to load model from file: /mnt/models/model.file
DEBU[0001] Checking if container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a should restart 
DEBU[0001] Removing container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] Cleaning up container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] Tearing down network namespace at /run/user/1000/netns/netns-f18aeaab-d8b7-e891-a918-304ad34eb395 for container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] Successfully cleaned up container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] Unmounted container "b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a" 
DEBU[0001] Removing all exec sessions for container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a 
DEBU[0001] Container b10ba7f989e301fefb82e9542897767505836df0b9b36b18922169fa1fdc793a storage is already unmounted, skipping... 
DEBU[0001] Called run.PersistentPostRunE(podman --log-level=debug run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_bHsGsFm5ih --pull=newer -t --device /dev/dri --mount=type=bind,src=/var/home/miabbott/.local/share/ramalama/models/huggingface/ibm-granite/granite-3.1-8b-instruct,destination=/mnt/models/model.file,ro quay.io/ramalama/ramalama:latest llama-run -c 2048 --temp 0.8 --jinja -v /mnt/models/model.file)
DEBU[0001] Shutting down engines                        

It looks like the source of the bind mount is a symbolic link:

$ ls -latr /var/home/miabbott/.local/share/ramalama/models/huggingface/ibm-granite/granite-3.1-8b-instruct
lrwxrwxrwx. 1 miabbott miabbott 62 Jan 31 12:11 /var/home/miabbott/.local/share/ramalama/models/huggingface/ibm-granite/granite-3.1-8b-instruct -> ../../../repos/huggingface/ibm-granite/granite-3.1-8b-instruct

Could that be tripping things up? Maybe?

If I swap out the symbolic link for the realpath, the podman run command is a bit more successful:

$ podman run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_bHsGsFm5ih --pull=newer -t --device /dev/dri --mount=type=bind,src=/var/home/miabbott/.local/share/ramalama/repos/huggingface/ibm-granite/granite-3.1-8b-instruct/,destination=/mnt/models/model.file,ro quay.io/ramalama/ramalama:latest llama-run -c 2048 --temp 0.8 --jinja -v /mnt/models/model.filea
curl_easy_perform() failed: HTTP response code said error
terminate called after throwing an instance of 'nlohmann::json_abi_v3_11_3::detail::parse_error'
  what():  [json.exception.parse_error.101] parse error at line 1, column 1: attempting to parse an empty input; check that your input string or stream contains the expected JSON
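
Side note: if the symlink itself is what trips up the mount, resolving it before building the --mount option is cheap. A rough sketch of what that could look like (the variable names are illustrative, not actual ramalama code):

import os

model_path = "/var/home/miabbott/.local/share/ramalama/models/huggingface/ibm-granite/granite-3.1-8b-instruct"

# Follow symlinks so podman mounts the real target rather than the link.
real_src = os.path.realpath(model_path)

mount_opt = f"--mount=type=bind,src={real_src},destination=/mnt/models/model.file,ro"
print(mount_opt)

Even with the real path, though, the source here is still a directory of safetensors files, so llama-run would not find a GGUF at /mnt/models/model.file either way.
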
@kush-gupt
Contributor

That initial Hugging Face link points to a full model directory, not directly to a GGUF file. Try pointing at a quantized GGUF, or run an Ollama model instead:

ramalama run huggingface://lmstudio-community/granite-3.1-8b-instruct-GGUF/granite-3.1-8b-instruct-Q4_K_M.gguf
or, for Ollama:
ramalama run granite3.1-dense

@miabbott
Author

Ah, this is my own ignorance about running models. I was just clicking around the IBM Granite docs and found my way to their model page (https://huggingface.co/ibm-granite/granite-3.1-8b-instruct). I thought it was in a format that could be run directly.

Perhaps ramalama could be enhanced to detect when no GGUF is found and fail more gracefully with a more helpful error?

FWIW, using the suggestion of pointing to the GGUF directly worked successfully for me.
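
A pre-flight check of the kind suggested above might look roughly like this (a sketch only; the function and where it would hook into ramalama are hypothetical):

import os

def ensure_gguf(path: str) -> None:
    real = os.path.realpath(path)
    if os.path.isdir(real):
        raise SystemExit(
            f"{path} is a directory of raw checkpoint files, not a GGUF model.\n"
            "Hint: point at a quantized GGUF instead, e.g.\n"
            "  ramalama run huggingface://lmstudio-community/granite-3.1-8b-instruct-GGUF/granite-3.1-8b-instruct-Q4_K_M.gguf"
        )
    with open(real, "rb") as f:
        if f.read(4) != b"GGUF":
            raise SystemExit(f"{path} does not look like a GGUF model file.")

Run before the container is launched, a check like this would turn the opaque "failed to read magic" into an actionable message.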

@jasonbrooks

I had this same experience today, both w/ this granite model and the new neuralmagic models. I could use some sort of explainer on what's expected to work -- perhaps out of scope for ramalama, but it'd help make working with AI more boring ;)

@dougsland
Collaborator

dougsland commented Feb 1, 2025

I had this same experience today, both w/ this granite model and the new neuralmagic models. I could use some sort of explainer on what's expected to work -- perhaps out of scope for ramalama, but it'd help make working with AI more boring ;)

@jasonbrooks Could you please share the neuralmagic models used so we can test as well?

@rhatdan
Member

rhatdan commented Feb 1, 2025

I think RamaLama should support these models, but I believe llama.cpp cannot handle them, so we would need to change the runtime to vLLM.
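
If the runtime switch lands, the invocation would presumably be something like the following (assuming ramalama's --runtime option accepts vllm, per its documentation; untested here):

$ ramalama --runtime=vllm run huggingface://ibm-granite/granite-3.1-8b-instruct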
