Using distributed or parallel set-up in script?: no
Using GPU in script?: yes
GPU type: NVIDIA RTX A6000
Who can help?
No response
Information
The official example scripts
My own modified scripts
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
Run _init_weights
Expected behavior
The _init_weights method of Mask2Former has multiple problems. It initializes nn.Embedding modules with a std of 0.02, whereas the original Mask2Former code uses PyTorch's default init with a std of 1.0. Similarly, the mask MLP is wrongly initialized with zero biases. As another example, the initialization of the multi-scale deformable attention is overwritten by the branch for Mask2FormerPixelDecoderEncoderOnly.
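To make two of these points concrete, here is a minimal sketch in plain PyTorch. It does not use the actual transformers code: init_embedding_sketch, Child, Parent and init_with_overwrite are hypothetical stand-ins, and the overwrite mechanism shown (a parent-level branch that loops over all parameters) is only an assumption about how such an overwrite could happen. The first part compares the default nn.Embedding init with a re-init at std 0.02; the second part relies on nn.Module.apply visiting children before their parent.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Part 1: PyTorch's default nn.Embedding init draws weights from N(0, 1).
default_emb = nn.Embedding(1000, 256)

def init_embedding_sketch(module: nn.Module, std: float = 0.02) -> None:
    # Hypothetical stand-in for an _init_weights branch that re-initializes
    # embeddings with a small std, as described in the issue.
    if isinstance(module, nn.Embedding):
        module.weight.data.normal_(mean=0.0, std=std)

reinit_emb = nn.Embedding(1000, 256)
init_embedding_sketch(reinit_emb)

default_std = default_emb.weight.std().item()
reinit_std = reinit_emb.weight.std().item()
print(f"default std: {default_std:.3f}, after re-init: {reinit_std:.3f}")

# Part 2: a parent branch can silently undo a child's specialized init.
class Child(nn.Module):
    # Hypothetical stand-in for the deformable attention module.
    def __init__(self) -> None:
        super().__init__()
        self.proj = nn.Linear(4, 4)

class Parent(nn.Module):
    # Hypothetical stand-in for the enclosing encoder module.
    def __init__(self) -> None:
        super().__init__()
        self.attn = Child()

def init_with_overwrite(module: nn.Module) -> None:
    if isinstance(module, Child):
        # Specialized init: zero the projection weight.
        nn.init.constant_(module.proj.weight, 0.0)
    elif isinstance(module, Parent):
        # Assumed parent branch: re-initializes every matrix parameter,
        # including the ones owned by the child.
        for p in module.parameters():
            if p.dim() > 1:
                nn.init.xavier_uniform_(p)

parent = Parent()
# apply() calls the function on children first, then on the parent itself.
parent.apply(init_with_overwrite)

# True if the child's zero init did not survive the parent branch.
overwritten = bool(parent.attn.proj.weight.abs().sum() > 0)
print(f"child init overwritten: {overwritten}")
```

If the real code follows a similar pattern, one possible fix is for the parent branch to skip parameters that belong to submodules with their own init branch, but that would need to be checked against the reference implementation.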
cc @qubvel for Mask2Former, but I think we would accept a PR to fix this! If you make that PR, can you include links to the relevant lines in the original Mask2Former code so we can compare against the reference implementation?
Hi @tommiekerssies, thanks for opening the issue! The _init_weights implementations are not ideal for some vision models in transformers; we would appreciate a PR to fix it!
System Info
transformers version: 4.45.2