
Question about apply_seq_parallel_monkey_patch("zigzag_ring_attn", "llama") in eval_vision_niah.py for Qwen2 model and ring attention #30

Open
zhang9302002 opened this issue Oct 31, 2024 · 1 comment

Comments

@zhang9302002

Hello, thanks for your great work. I have a few small questions.

When testing a Qwen2-based model, such as llava_qwen or lmms-lab/LongVA-7B, on the V-NIAH benchmark,

eval_vision_niah.py calls the function apply_seq_parallel_monkey_patch("zigzag_ring_attn", "llama").

  • How can this monkey patch work since Qwen2 has a different architecture from LLaMA?
import transformers

# new_flash_attn_forward and new_decoder_forward are the zigzag ring-attention
# replacements defined elsewhere in the repo's easy_context module.
def apply_zigzag_ring_attn_monkey_patch_llama():
    # Rebind LLaMA's flash-attention forward to the zigzag ring-attention version
    transformers.models.llama.modeling_llama.LlamaFlashAttention2._flash_attention_forward = (
        new_flash_attn_forward
    )
    # Rebind the decoder-layer forward to handle sequence-parallel (sharded) inputs
    transformers.models.llama.modeling_llama.LlamaDecoderLayer.forward = (
        new_decoder_forward
    )
  • Does this replacement have any effect on the class Qwen2ForCausalLM_RingAttn?
  • If not, how is zigzag_ring_attn performed during benchmarking for llava_qwen-based models?
@jzhang38
Collaborator

jzhang38 commented Nov 20, 2024

We did not use the monkey patch for Qwen; we directly load Qwen2ForCausalLM_RingAttn from easy_context/modeling_qwen2.py.

No, this monkey patch has no effect on Qwen2ForCausalLM_RingAttn.
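The reason the patch is a no-op for Qwen2 can be sketched in plain Python (the class and function names below are illustrative, not the actual transformers or LongVA code): assigning to an attribute of one class object rebinds the method on that class only, so a separately defined class with its own forward is untouched.

```python
# Stand-in for the LLaMA attention class that the monkey patch targets.
class LlamaAttention:
    def forward(self):
        return "vanilla llama attention"

# Stand-in for Qwen2ForCausalLM_RingAttn, which ships with ring attention
# built into its own forward and is never patched.
class Qwen2RingAttention:
    def forward(self):
        return "ring attention baked in directly"

# Stand-in for the zigzag ring-attention replacement function.
def new_flash_attn_forward(self):
    return "zigzag ring attention via patch"

# The monkey patch rebinds the method on LlamaAttention only.
LlamaAttention.forward = new_flash_attn_forward

print(LlamaAttention().forward())      # the patched forward
print(Qwen2RingAttention().forward())  # unaffected by the patch
```

This is why patching the "llama" classes is harmless when the loaded model is the standalone Qwen2 ring-attention class.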

Then how is zigzag_ring_attn performed during benchmarking for llava_qwen based models?

The only place we need zigzag_ring_attn is during text training and evaluation on V-NIAH. We do not use zigzag_ring_attn for the other benchmarks in LMMs-Eval or for image-text training.
