Certain ONNX models ignore the system prompt #1172
Where/how do you set your system prompt?
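(For reference, the standard way to pass a system prompt in Transformers.js is via the chat messages array. A minimal sketch, with an illustrative model id:)

```js
import { pipeline } from "@huggingface/transformers";

// Illustrative model id; any chat model whose template supports a system role works.
const generator = await pipeline(
  "text-generation",
  "HuggingFaceTB/SmolLM2-1.7B-Instruct",
);

const messages = [
  { role: "system", content: "You are a pirate. Always answer in pirate speak." },
  { role: "user", content: "What is the capital of France?" },
];

// The pipeline applies the tokenizer's chat template, so the system
// turn is prepended to the prompt before generation.
const output = await generator(messages, { max_new_tokens: 128 });
console.log(output[0].generated_text);
```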
Could you provide more information about the problem you are facing? Is the model producing incorrect results?
It has no awareness of the system prompt. This could possibly be due to quantisation: SmolLM was quantized with some calibration samples and doesn't seem to have the issue.
Can you please provide an example of the input/output you are seeing? It may be that the model itself doesn't support a system role, which you can check by looking at the chat template in the tokenizer_config.json file.
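(One way to run that check is to render the chat template directly with the tokenizer. A minimal sketch, with an illustrative model id:)

```js
import { AutoTokenizer } from "@huggingface/transformers";

// Illustrative model id; use the tokenizer of the model under test.
const tokenizer = await AutoTokenizer.from_pretrained(
  "HuggingFaceTB/SmolLM2-1.7B-Instruct",
);

const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Hi!" },
];

// Render the chat template as plain text; if the system turn is missing
// from the output, the template (not the runtime) is dropping it.
const prompt = tokenizer.apply_chat_template(messages, { tokenize: false });
console.log(prompt);
```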
Sure, here is a full repo: https://github.com/TrelisResearch/llama-system-prompt-issue. By the way, good point on checking the tokenizer; the system prompt is indeed in the chat template.
This may just be a limitation of the model itself. Are you able to get good performance with the Python library? It may be good to use that as a benchmark for the model's capabilities.
Yeah, the models themselves work fine with the Python transformers library and follow instructions.
Hey folks, I'm seeing the same issue here. With onnxruntime via Python the model follows the system prompt, but not with Android's onnxruntime. Both run the same model. My hypothesis is that the attention mask is not being set properly in the Android version.
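(If that hypothesis is right, constructing the mask explicitly should change the output. A minimal onnxruntime-web sketch of a single forward pass; the Android Java/Kotlin API differs, but the model itself takes the same named inputs. The input/output names and placeholder values here are assumptions, as noted in the comments:)

```js
import * as ort from "onnxruntime-web";

// Hypothetical single forward pass. The names "input_ids", "attention_mask",
// and "logits" follow common Hugging Face ONNX exports and are assumptions;
// real decoder exports usually also expect position_ids and past_key_values,
// omitted here for brevity.
const session = await ort.InferenceSession.create("model.onnx");

const ids = [1n, 2n, 3n]; // placeholder token ids from the tokenizer
const dims = [1, ids.length];
const inputIds = new ort.Tensor("int64", BigInt64Array.from(ids), dims);

// All-ones mask: every prompt token, including the system turn, is attended to.
const attentionMask = new ort.Tensor(
  "int64",
  BigInt64Array.from(ids.map(() => 1n)),
  dims,
);

const results = await session.run({
  input_ids: inputIds,
  attention_mask: attentionMask,
});
console.log(results.logits.dims);
```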
System Info
Here's a model that follows the system prompt:
Here are two that do not:
Is this intentional or accidental?
Environment/Platform
Description
I'm running these models in q4f16 with webgpu
Reproduction
I'm following the SmolLM example provided in the examples, but swapping in the model.
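(For concreteness, a minimal sketch of that setup; the model id is a placeholder for the affected models:)

```js
import { pipeline } from "@huggingface/transformers";

// Placeholder model id; substitute one of the models that ignores the prompt.
const generator = await pipeline(
  "text-generation",
  "onnx-community/Llama-3.2-1B-Instruct",
  { dtype: "q4f16", device: "webgpu" },
);

const messages = [
  { role: "system", content: "Answer in French only." },
  { role: "user", content: "What is the capital of France?" },
];

// A model that honours the system prompt should answer in French here.
const output = await generator(messages, { max_new_tokens: 64 });
console.log(output[0].generated_text);
```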