Mask2Former _init_weights #35877

tommiekerssies · 2025-01-24T13:26:45Z

System Info

transformers version: 4.45.2
Platform: Linux-6.8.0-51-generic-x86_64-with-glibc2.39
Python version: 3.11.9
Huggingface_hub version: 0.24.6
Safetensors version: 0.4.4
Accelerate version: not installed
Accelerate config: not found
PyTorch version (GPU?): 2.4.0+cu121 (True)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using distributed or parallel set-up in script?: no
Using GPU in script?: yes
GPU type: NVIDIA RTX A6000

Who can help?

No response

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Run _init_weights

Expected behavior

The _init_weights method of Mask2Former has multiple problems. First, it initializes nn.Embedding modules with a std of 0.02, whereas the original Mask2Former code uses PyTorch's default init (std of 1.0). Similarly, the mask MLP is wrongly initialised with zero biases. Finally, the initialisation of the multi-scale deformable attention is overwritten by the branch for Mask2FormerPixelDecoderEncoderOnly.
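To make two of these points concrete, here is a minimal sketch in plain PyTorch. It does not use the transformers code itself: ToyAttention, ToyEncoder and toy_init are hypothetical stand-ins, and the parent branch that re-initialises every parameter is an assumption about how such an overwrite could happen, not a copy of the actual _init_weights. The first part compares the default nn.Embedding init with a std of 0.02; the second shows that nn.Module.apply visits children before their parent, so a parent branch can undo a child's init.

```python
import torch
from torch import nn

torch.manual_seed(0)

# 1) nn.Embedding draws its weights from N(0, 1) by default.
default_emb = nn.Embedding(100, 256)
rescaled_emb = nn.Embedding(100, 256)
# The std the issue says _init_weights applies to embeddings.
nn.init.normal_(rescaled_emb.weight, mean=0.0, std=0.02)

default_std = default_emb.weight.std().item()
rescaled_std = rescaled_emb.weight.std().item()
print(f"default embedding std:  {default_std:.3f}")
print(f"rescaled embedding std: {rescaled_std:.3f}")


# 2) Toy modules (hypothetical names) standing in for an attention layer
#    nested inside an encoder.
class ToyAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)


class ToyEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.attn = ToyAttention()


def toy_init(module):
    if isinstance(module, ToyAttention):
        # Child-specific init: zero the projection weight.
        nn.init.constant_(module.proj.weight, 0.0)
    elif isinstance(module, ToyEncoder):
        # Assumed parent branch: re-initialise every matrix parameter,
        # including the ones that belong to child modules.
        for p in module.parameters():
            if p.dim() > 1:
                nn.init.xavier_uniform_(p)


encoder = ToyEncoder()
# apply() calls toy_init on the children first and on the parent last,
# so the ToyEncoder branch runs after the ToyAttention branch.
encoder.apply(toy_init)

attention_init_survived = bool((encoder.attn.proj.weight == 0).all())
print("attention init survived:", attention_init_survived)
```

If the real code has the same shape, one possible fix is to have the parent branch skip parameters owned by submodules that have their own init branch, but that is a design choice for the PR rather than something established here.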


Rocketknight1 · 2025-01-24T16:37:49Z

cc @qubvel for Mask2Former, but I think we would accept a PR to fix this! If you make that PR, can you include links to the relevant lines in the original Mask2Former code so we can compare against the reference implementation?

qubvel · 2025-01-27T09:02:38Z

Hi @tommiekerssies, thanks for opening the issue! The _init_weights implementations are not ideal for some vision models in transformers; we would appreciate a PR to fix this!

tommiekerssies added the bug label Jan 24, 2025

qubvel added the Core: Modeling (Internals of the library; Models.) and Vision labels Jan 27, 2025
