[Model] Add DeepSeek-R1-Distill and Hermes-3-Llama-3.2 #652

Merged 1 commit on Jan 21, 2025

CharlieFRuan
Contributor

This PR adds the following models to the prebuilt list:

  • DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC
  • DeepSeek-R1-Distill-Qwen-7B-q4f32_1-MLC
  • DeepSeek-R1-Distill-Llama-8B-q4f16_1-MLC
  • DeepSeek-R1-Distill-Llama-8B-q4f32_1-MLC
  • Hermes-3-Llama-3.2-3B-q4f16_1-MLC
  • Hermes-3-Llama-3.2-3B-q4f32_1-MLC

DeepSeek-R1-Distill-Qwen-1.5B currently has correctness issues, so we will add it afterward.

Separately, we fix the handling of role_content_sep and role_empty_sep when either is the empty string "". Because "" evaluates to false, it was being replaced with ": ", which is inconsistent with what the model expects.
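
The empty-string pitfall described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `ConvTemplateSeps`, `DEFAULT_SEP`, `resolveSepBuggy`, `resolveSep`, and `formatTurn` are hypothetical names, and only the field names role_content_sep and role_empty_sep and the ": " fallback come from the description.

```typescript
// Hypothetical shape for illustration; only the two field names
// are taken from the PR description.
interface ConvTemplateSeps {
  role_content_sep?: string;
  role_empty_sep?: string;
}

// Fallback separator mentioned in the PR description.
const DEFAULT_SEP = ": ";

// Buggy pattern: "" is falsy, so `||` silently replaces an
// intentionally empty separator with the default.
function resolveSepBuggy(sep: string | undefined): string {
  return sep || DEFAULT_SEP;
}

// One possible fix: `??` falls back only on null or undefined,
// so an explicit "" is preserved.
function resolveSep(sep: string | undefined): string {
  return sep ?? DEFAULT_SEP;
}

// Joins a role tag and its content using the resolved separator.
function formatTurn(
  role: string,
  content: string,
  seps: ConvTemplateSeps,
): string {
  return role + resolveSep(seps.role_content_sep) + content;
}
```

With a template that sets role_content_sep to "", `formatTurn` concatenates the role tag and the content directly, whereas the `||` variant would insert ": " between them. The same reasoning applies to role_empty_sep.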

@CharlieFRuan CharlieFRuan merged commit 808685b into mlc-ai:main Jan 21, 2025
1 check passed
CharlieFRuan added a commit that referenced this pull request Jan 21, 2025
### Change

- The only change is #652, adding the following models:
  - `DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC`
  - `DeepSeek-R1-Distill-Qwen-7B-q4f32_1-MLC`
  - `DeepSeek-R1-Distill-Llama-8B-q4f16_1-MLC`
  - `DeepSeek-R1-Distill-Llama-8B-q4f32_1-MLC`
  - `Hermes-3-Llama-3.2-3B-q4f16_1-MLC`
  - `Hermes-3-Llama-3.2-3B-q4f32_1-MLC`

### TVMjs
- No change: still version `0.18.0-dev2`, the same as in 0.2.71