Fix sdpa in sam and refactor relative position embeddings #36422
## What does this PR do?
This PR identifies and fixes the following problems with `SamVisionSdpaAttention`:

1. The `attn_weights` returned by `SamVisionAttention` and `SamVisionSdpaAttention` are of different sizes: `[batch_size, num_attention_head, seq_len, seq_len]` and `[batch_size * num_attention_head, seq_len, seq_len]`, respectively.
2. `scaled_dot_product_attention` in `torch` does not return `attn_weights`, so these `attn_weights` are calculated manually in `SamVisionSdpaAttention`, which defeats the purpose of sdpa in the first place. Instead, if `output_attentions==True`, we should fall back to the eager implementation in `SamVisionAttention`, as sketched after this list.
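For illustration, here is a minimal sketch of that dispatch, assuming 4-D `[batch_size, num_heads, seq_len, head_dim]` inputs; `sdpa_attention_forward` is a hypothetical standalone helper, not the actual `SamVisionSdpaAttention.forward`:

```python
import torch
import torch.nn.functional as F

def sdpa_attention_forward(query, key, value, output_attentions=False):
    """Hypothetical sketch: use torch sdpa, fall back to eager when weights are requested."""
    if output_attentions:
        # Eager path: torch's fused sdpa kernel cannot return attention
        # weights, so compute them explicitly, keeping the 4-D
        # [batch_size, num_heads, seq_len, seq_len] shape in both paths.
        scale = query.shape[-1] ** -0.5
        attn_weights = torch.softmax((query * scale) @ key.transpose(-2, -1), dim=-1)
        return attn_weights @ value, attn_weights
    # Fast path: let the fused kernel do the work; no weights are materialized.
    return F.scaled_dot_product_attention(query, key, value), None
```

With this dispatch the manual re-computation of `attn_weights` disappears from the sdpa class, and both implementations report weights of the same shape.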
`SamVisionAttention.get_decomposed_rel_pos`

I had to change the `add_decomposed_rel_pos` function, since when `output_attentions==True` we fall back to "eager", in which case the overloaded method is not called and the parent's method is not compatible; a sketch of the resulting helper is shown below.
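To illustrate the refactor, here is a minimal sketch of the decomposed relative position bias computed as a standalone tensor. It assumes `rel_h`/`rel_w` are the per-axis relative position embeddings already sliced to the query/key window (the interpolation step of the real helper is omitted); the function name mirrors the PR, but the body is illustrative only:

```python
import torch

def get_decomposed_rel_pos(query, rel_h, rel_w, q_size, k_size):
    # query: [batch, q_h * q_w, dim], rel_h: [q_h, k_h, dim], rel_w: [q_w, k_w, dim]
    q_h, q_w = q_size
    k_h, k_w = k_size
    batch, _, dim = query.shape
    r_query = query.reshape(batch, q_h, q_w, dim)
    rel_h_bias = torch.einsum("bhwc,hkc->bhwk", r_query, rel_h)  # [b, q_h, q_w, k_h]
    rel_w_bias = torch.einsum("bhwc,wkc->bhwk", r_query, rel_w)  # [b, q_h, q_w, k_w]
    # Return the bias instead of adding it to attention scores in place:
    # the eager path can add it to attn_weights before the softmax, and the
    # sdpa path can pass the same tensor to scaled_dot_product_attention
    # as attn_mask, so a single helper serves both implementations.
    bias = rel_h_bias[:, :, :, :, None] + rel_w_bias[:, :, :, None, :]
    return bias.reshape(batch, q_h * q_w, k_h * k_w)
```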
This change affected some other classes as well:

- `GotOcr2VisionAttention` completely inherits from `SamVisionAttention`; since the model is modular, just running `utils/modular_model_converter.py` should fix it (see the command after this list).
- `TFSamVisionAttention` mimics the `SamVisionAttention` class; the two now look slightly different.
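For reference, the regeneration step mentioned above would look something like the following; the `--files_to_parse` flag is taken from the modular-transformers docs, while the exact modular file path for GOT-OCR2 is my assumption:

```
python utils/modular_model_converter.py --files_to_parse src/transformers/models/got_ocr2/modular_got_ocr2.py
```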
A few questions from my side, related to `add_decomposed_rel_pos`:

- Should I change the `TFSamVisionAttention` class to also look like `SamVisionAttention`?
- Or should I keep `add_decomposed_rel_pos`, make slight changes so it is compatible with `SamVisionSdpaAttention` as well, and apply these changes to `TFSamVisionAttention` too?

## Who can review?
@amyeroberts, @qubvel, @zucchini-nlp