Mask2Former _init_weights #35877

Open
4 tasks
tommiekerssies opened this issue Jan 24, 2025 · 2 comments
Labels
bug, Core: Modeling, Vision

Comments

tommiekerssies commented Jan 24, 2025

System Info

  • transformers version: 4.45.2
  • Platform: Linux-6.8.0-51-generic-x86_64-with-glibc2.39
  • Python version: 3.11.9
  • Huggingface_hub version: 0.24.6
  • Safetensors version: 0.4.4
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.4.0+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: no
  • Using GPU in script?: yes
  • GPU type: NVIDIA RTX A6000

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run _init_weights

Expected behavior

The _init_weights method of Mask2Former has multiple problems. First, it initializes nn.Embedding layers with a std of 0.02, whereas the original Mask2Former code uses PyTorch's default init, which has a std of 1.0. Similarly, the mask MLP is wrongly initialized with zero biases. Finally, the initialization of the multi-scale deformable attention is overwritten by the branch for Mask2FormerPixelDecoderEncoderOnly.
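
A minimal sketch of these three points in plain PyTorch, not the actual transformers code. ToyAttention, ToyEncoder and toy_init_weights are hypothetical stand-ins, and the overwrite mechanism shown (a parent branch re-initializing every parameter after the child branch has run) is an assumption about how such an override can happen, based on nn.Module.apply visiting children before their parent.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

# 1) nn.Embedding: PyTorch's default init is N(0, 1).
#    Re-initializing with std=0.02 shrinks the weights by roughly 50x.
emb = nn.Embedding(1000, 256)
default_emb_std = emb.weight.std().item()
nn.init.normal_(emb.weight, mean=0.0, std=0.02)
reinit_emb_std = emb.weight.std().item()

# 2) nn.Linear: PyTorch's default init draws biases from
#    U(-1/sqrt(fan_in), 1/sqrt(fan_in)), so they are non-zero
#    unless an init hook explicitly zeroes them.
lin = nn.Linear(256, 256)
default_bias_bound = 1.0 / math.sqrt(lin.in_features)
default_bias_max = lin.bias.abs().max().item()
default_bias_nonzero = bool((lin.bias != 0).any())


# 3) nn.Module.apply calls the function on children before their parent,
#    so a parent branch that touches all parameters runs last and wins.
class ToyAttention(nn.Module):
    """Hypothetical stand-in for a deformable attention layer."""

    def __init__(self):
        super().__init__()
        self.sampling_offsets = nn.Linear(8, 8)


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for an encoder that contains the attention layer."""

    def __init__(self):
        super().__init__()
        self.attn = ToyAttention()


def toy_init_weights(module):
    """Hypothetical init hook with one branch per module type."""
    if isinstance(module, ToyAttention):
        # Child-specific init: zero the sampling offset weights.
        nn.init.constant_(module.sampling_offsets.weight, 0.0)
    elif isinstance(module, ToyEncoder):
        # Parent branch: re-initializes every matrix parameter below it,
        # including the one the child branch has just set.
        for p in module.parameters():
            if p.dim() > 1:
                nn.init.xavier_uniform_(p)


enc = ToyEncoder()
enc.apply(toy_init_weights)
child_init_survived = bool((enc.attn.sampling_offsets.weight == 0).all())

print(f"embedding std: default={default_emb_std:.3f}, after re-init={reinit_emb_std:.3f}")
print(f"linear bias non-zero by default: {default_bias_nonzero}")
print(f"child init survived parent branch: {child_init_survived}")
```

In this toy, one way to keep the child-specific init is to have the parent branch skip parameters that belong to submodules with their own branch. Whether that is the right fix for Mask2Former would need to be checked against the reference implementation.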

@Rocketknight1
Member

cc @qubvel for Mask2Former, but I think we would accept a PR to fix this! If you make that PR, can you include links to the relevant lines in the original Mask2Former code so we can compare against the reference implementation?

@qubvel
Member

qubvel commented Jan 27, 2025

Hi @tommiekerssies, thanks for opening the issue! The _init_weights implementations are not ideal for some vision models in transformers, so we would appreciate a PR to fix this!

qubvel added the Core: Modeling and Vision labels Jan 27, 2025
3 participants