model: Support Janus-pro #3203

mickqian · 2025-01-29T04:28:55Z

Motivation

Support deepseek-ai/Janus-Pro models, #3195

Modifications

Janus model files
related tests

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling.

zhyncs · 2025-01-29T07:14:00Z

Coooooool!

codeaxn · 2025-01-31T13:16:04Z

Forgive my noobishness, but how do I test the PR?
I ran:

python -m sglang.launch_server --model-path deepseek-ai/Janus-Pro-7B --trust-remote-code --chat-template janus --port 30000 --host 0.0.0.0

And it aborts with the error below:

  File "/home/user/tools/sglang/python/sglang/srt/models/registry.py", line 65, in resolve_model_cls
    return self._raise_for_unsupported(architectures)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/tools/sglang/python/sglang/srt/models/registry.py", line 32, in _raise_for_unsupported
    raise ValueError(
ValueError: Model architectures ['MultiModalityCausalLM'] are not supported for now. Supported architectures: dict_keys(['BaichuanForCausalLM', 'ChatGLMModel', 'CohereForCausalLM', 'Cohere2ForCausalLM', 'DbrxForCausalLM', 'DeepseekForCausalLM', 'DeepseekV2ForCausalLM', 'DeepseekV3ForCausalLM', 'ExaoneForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'Gemma2ForSequenceClassification', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GraniteForCausalLM', 'Grok1ForCausalLM', 'Grok1ModelForCausalLM', 'InternLM2ForCausalLM', 'InternLM2ForRewardModel', 'LlamaForCausalLM', 'Phi3ForCausalLM', 'InternLM3ForCausalLM', 'LlamaForClassification', 'LlamaForCausalLMEagle', 'LlamaEmbeddingModel', 'MistralModel', 'LlamaForSequenceClassification', 'LlamaForSequenceClassificationWithNormal_Weights', 'LlavaLlamaForCausalLM', 'LlavaQwenForCausalLM', 'LlavaMistralForCausalLM', 'LlavaVidForCausalLM', 'MiniCPMForCausalLM', 'MiniCPM3ForCausalLM', 'MiniCPMV', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MllamaForConditionalGeneration', 'OlmoForCausalLM', 'Olmo2ForCausalLM', 'OlmoeForCausalLM', 'Phi3SmallForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2ForCausalLMEagle', 'Qwen2MoeForCausalLM', 'Qwen2VLForConditionalGeneration', 'StableLmForCausalLM', 'TorchNativeLlamaForCausalLM', 'TorchNativePhi3ForCausalLM', 'XverseForCausalLM', 'XverseMoeForCausalLM', 'YiVLForCausalLM'])

[2025-01-31 13:12:23] Received sigquit from a child proces. It usually means the child failed.
Killed

mickqian · 2025-02-01T06:58:32Z

Could you make sure the local repo has been updated, and python has chosen the correct sglang version? BTW, this PR will be unstable until merged.

codeaxn · 2025-02-01T16:03:56Z

The issue was caused by a missing dependency. Managed to get it working after running "pip install addict".
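For anyone hitting the same error, a minimal sketch of an environment check follows. It uses only the standard library to report where Python would load `sglang` from and whether `addict` is installed. The helper name `locate` is made up for this example, and the check is only a sanity test, not part of this PR.

```python
import importlib.util
from typing import Optional


def locate(module_name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None if it is not installed."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    # origin can be None for namespace packages; that still means "installed".
    return spec.origin or "<namespace package>"


if __name__ == "__main__":
    # "sglang" should point at the checked-out PR branch, "addict" at site-packages.
    for name in ("sglang", "addict"):
        path = locate(name)
        print(f"{name}: {path if path else 'NOT INSTALLED'}")
```

If the `sglang` path points somewhere other than the local checkout, Python is picking up a different installation than the one containing this branch.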

codeaxn · 2025-02-01T16:26:08Z

Any plans to add text to image?

yizhang2077

Sorry for the late review. I did a basic pass and left some comments here; I'll take a closer look in the next few days.

python/sglang/test/test_utils.py

test/srt/test_vision_openai_server.py

python/sglang/srt/hf_transformers_utils.py

python/sglang/srt/managers/schedule_batch.py

BaiStone2017 · 2025-02-24T09:17:01Z

Janus-Pro supports both multimodal understanding and multimodal generation. Does the current code only support multimodal understanding?

yizhang2077

I left some questions and comments; the review is ongoing...

yizhang2077 · 2025-03-02T17:31:29Z

python/sglang/srt/managers/image_processor.py

@@ -134,7 +152,7 @@ def calculate_max_num_frames() -> int:
            ret = (max_req_input_len - len(input_ids)) // self.NUM_TOKEN_PER_FRAME
            return min(ret, 100)

-        MAX_NUM_FRAMES = calculate_max_num_frames()
+        MAX_NUM_FRAMES = 30


Why do we change max_num_frames here?

It's because I find it hard to estimate the final number of tokens an image expands to; that is implementation-specific. As a result, a hard limit is set here. Maybe I should loosen it a bit?

One possible way to address it is to introduce something like max_sampled_frames into SamplingParams. But I haven't seen anything like this in existing popular frameworks.

If it is hard to estimate, I think we can remove calculate_max_num_frames(), and we need to tune MAX_NUM_FRAMES to a value that doesn't cause OOM in most cases.

Sure, but how much memory should we assume we have?

I think we can expose the number through server_args or an environment variable (a default of 30 is OK). If an OOM occurs, the user then has a way to change it.

Seems fine to me, will do it in a separate PR.
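A minimal sketch of the environment-variable option discussed above, assuming a hypothetical variable name `SGLANG_MAX_NUM_FRAMES` and a hypothetical helper `resolve_max_num_frames`. Neither name comes from this PR, and the follow-up PR may implement it differently, for example through server_args.

```python
import os

DEFAULT_MAX_NUM_FRAMES = 30  # the default value agreed on in the thread


def resolve_max_num_frames(env=None, var_name="SGLANG_MAX_NUM_FRAMES"):
    """Return the frame cap from the environment, falling back to the default.

    `var_name` is a hypothetical variable name used only for illustration.
    """
    env = os.environ if env is None else env
    raw = env.get(var_name)
    if raw is None:
        return DEFAULT_MAX_NUM_FRAMES
    try:
        value = int(raw)
    except ValueError:
        # Ignore values that are not integers rather than crashing at startup.
        return DEFAULT_MAX_NUM_FRAMES
    # A non-positive cap makes no sense, so fall back to the default.
    return value if value > 0 else DEFAULT_MAX_NUM_FRAMES
```

With this shape, a user who hits OOM could lower the cap (for example to 8) without touching the code, while everyone else keeps the default of 30.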

test/srt/test_vision_openai_server.py

python/sglang/srt/layers/rotary_embedding.py

python/sglang/srt/models/minicpmv.py

python/sglang/srt/model_loader/weight_utils.py

yizhang2077 · 2025-03-02T17:54:13Z

Could you paste the MMMU benchmark results here?

python/sglang/srt/configs/qwen2_5_vl_config.py

yizhang2077 · 2025-03-02T17:58:00Z

Janus-Pro supports both multimodal understanding and multimodal generation. Does the current code only support multimodal understanding?

Hi @BaiStone2017, I think this commit only supports multimodal understanding.

python/sglang/srt/configs/janus_pro.py

python/sglang/srt/models/deepseek_janus_pro.py

mickqian · 2025-03-03T12:46:24Z

MMMU results:
(Janus doesn't provide a Transformers implementation, but I managed to test it locally)

Model         sglang  hf
Janus-Pro-1B  0.337   0.345
Janus-Pro-7B  0.356   0.346
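To make the comparison explicit, the gap between the two columns (sglang minus hf) can be computed directly from the numbers reported above. This is only arithmetic on the posted scores; the function name `score_gaps` is made up for this example.

```python
# MMMU scores as posted in the comment: (sglang, hf)
scores = {
    "Janus-Pro-1B": (0.337, 0.345),
    "Janus-Pro-7B": (0.356, 0.346),
}


def score_gaps(table):
    """Return sglang minus hf for each model, rounded to three decimals."""
    return {model: round(sg - hf, 3) for model, (sg, hf) in table.items()}


if __name__ == "__main__":
    for model, gap in score_gaps(scores).items():
        print(f"{model}: {gap:+.3f}")
```

The gap is -0.008 for Janus-Pro-1B and +0.010 for Janus-Pro-7B.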

python/sglang/srt/models/deepseek_janus_pro.py

yizhang2077 · 2025-03-03T16:25:23Z

This seems to change too many files. There is some refactored code, and some of it changes the structure under the managers folder. Do you think this change is OK? @merrymercy

Add python/sglang/srt/managers/multi_modality_padding.py for common image-padding methods.
Split the image processor into several files and put them in the image_processors folder.

mickqian force-pushed the janus-pro branch from 250e105 to 0183d9d Compare January 29, 2025 04:38

mickqian marked this pull request as ready for review January 29, 2025 07:13

mickqian requested review from merrymercy, Ying1123, zhyncs, hnyls2002, ispobock and ByronHsu as code owners January 29, 2025 07:13

zhyncs assigned yizhang2077 Jan 29, 2025

mickqian force-pushed the janus-pro branch 8 times, most recently from b6542eb to a29f545 Compare January 31, 2025 13:10

mickqian force-pushed the janus-pro branch from a29f545 to 8d47848 Compare February 2, 2025 13:58

mickqian mentioned this pull request Feb 6, 2025

model: Intern vl 2.5 #3351

Open

5 tasks

yizhang2077 reviewed Feb 12, 2025

python/sglang/test/test_utils.py Outdated

test/srt/test_vision_openai_server.py

python/sglang/srt/hf_transformers_utils.py Outdated

python/sglang/srt/hf_transformers_utils.py Outdated

yizhang2077 reviewed Feb 12, 2025

python/sglang/srt/managers/schedule_batch.py Outdated

mickqian force-pushed the janus-pro branch from 8d47848 to ae142ec Compare February 13, 2025 03:19

mickqian requested review from HandH1998 and BBuf as code owners February 13, 2025 03:19

mickqian force-pushed the janus-pro branch from ae142ec to 34028ad Compare February 13, 2025 03:20

mickqian force-pushed the janus-pro branch 2 times, most recently from 8446258 to a8b0764 Compare February 16, 2025 15:37

mickqian changed the title ~~feat: Support Janus-pro~~ model: Support Janus-pro Feb 16, 2025

mickqian force-pushed the janus-pro branch 2 times, most recently from 0ee809f to 85d8d81 Compare February 16, 2025 16:01

This was referenced Feb 18, 2025

fix: apply cache size limit of attention mask for VisionAttention #3657

Merged

[Bug] Vision attention mask cache is never released and cause OOM #3651

Closed

mickqian force-pushed the janus-pro branch 5 times, most recently from 9e8aecd to a1226ed Compare March 2, 2025 05:09

yizhang2077 reviewed Mar 2, 2025

python/sglang/srt/model_loader/weight_utils.py

yizhang2077 reviewed Mar 2, 2025

python/sglang/srt/configs/qwen2_5_vl_config.py Outdated

merrymercy requested a review from HaiShaw as a code owner March 3, 2025 08:12

yizhang2077 reviewed Mar 3, 2025

python/sglang/srt/configs/janus_pro.py

python/sglang/srt/models/deepseek_janus_pro.py Outdated

yizhang2077 reviewed Mar 3, 2025

python/sglang/srt/models/deepseek_janus_pro.py

mickqian force-pushed the janus-pro branch from a1226ed to ca3b095 Compare March 3, 2025 11:15

mickqian force-pushed the janus-pro branch from ca3b095 to a3b5cac Compare March 3, 2025 12:50

yizhang2077 reviewed Mar 3, 2025

python/sglang/srt/models/deepseek_janus_pro.py Outdated

python/sglang/srt/models/deepseek_janus_pro.py Outdated

zhaochenyang20 mentioned this pull request Mar 4, 2025

Development Roadmap (2025 H1) #4042

Open

55 tasks

mickqian force-pushed the janus-pro branch from a3b5cac to 8e6765d Compare March 4, 2025 01:10

model: Support Janus-pro

931c250

mickqian force-pushed the janus-pro branch from 8e6765d to 931c250 Compare March 4, 2025 02:20
