Lacking clarity of parameters #7

Open
AmitMY opened this issue Feb 14, 2025 · 1 comment
Labels: bug (Something isn't working), documentation (Improvements or additions to documentation)

Comments

@AmitMY (Contributor) commented Feb 14, 2025

  1. What is this being used for?
    https://github.com/GerrySant/multimodalhugs/blob/master/examples/multimodal_translation/pose2text_translation/configs/example_config.yaml#L12-L14

It is not super clear to me what the valid values are. Are these valid?

  backbone_name: "google/byt5-small"            
  pretrained_backbone: "google/byt5-small" 
  2. And for feat_dim - maybe explain in the comment that it comes from the pose, for example 178 points with 3 dimensions

I also think this should be optional:
https://github.com/GerrySant/multimodalhugs/blob/master/examples/multimodal_translation/pose2text_translation/configs/example_config.yaml#L19

  3. Is this actually float16 or bfloat16?
    https://github.com/GerrySant/multimodalhugs/blob/master/examples/multimodal_translation/pose2text_translation/configs/example_config.yaml#L48

  4. Learning rate seems too high
    https://github.com/GerrySant/multimodalhugs/blob/master/examples/multimodal_translation/pose2text_translation/configs/example_config.yaml#L31

  5. Does this only modify the src tokenizer?
    https://github.com/GerrySant/multimodalhugs/blob/master/examples/multimodal_translation/pose2text_translation/configs/example_config.yaml#L56
    The name suggests so, but you do allow generation_prompt which is in the output tokenizer

@GerrySant (Owner)

  1. (Work in progress) I must correct the config; it should look like this:
  backbone_name: "<backbone-model-type>"                   # Identifier for the pretrained backbone type (e.g., "m2m100", "t5").
  pretrained_backbone: "<pretrained-backbone-weights>"     # Weights or checkpoint identifier for the pretrained backbone, for instance "google/byt5-small".
  feat_dim: 534                                            # Dimensionality of the features produced by the feature extractor, if present; otherwise, the dimensionality of the features that are input to the network.

For poses, let's say that each pose has the shape [t, people, d, xyz]; then feat_dim = d * xyz * people.
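
As a quick illustration (assuming a single person, which seems to match the example config), the 178-points-with-3-dimensions case mentioned above gives exactly the 534 shown in the config:

  # Illustrative only: computing feat_dim for pose inputs with the
  # [t, people, d, xyz] layout described above.
  num_people = 1    # assumed: a single signer per example
  num_points = 178  # keypoints per frame (the 178-point example above)
  num_coords = 3    # x, y, z per keypoint

  feat_dim = num_people * num_points * num_coords
  print(feat_dim)   # 534, the value in the example config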

  2. (Work in progress) Yeah, it should be better explained.

  3. It uses the float precision adopted by the Trainer.

  4. (Work in progress) The values in the “training” section of the example configuration were chosen without any particular criteria. The intention is to replace them with values obtained after training some models to a decent performance.
    Also, there may be a bug in the prioritization of training arguments between those specified in the config and those specified in the training command, so for the moment it is recommended to specify the training hyperparameters in the multimodalhugs-train command.
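
For context (and as an assumption about how these config values eventually reach the Trainer), the underlying Hugging Face Trainer distinguishes float16 and bfloat16 through separate TrainingArguments flags rather than a single dtype field; a minimal sketch with illustrative values:

  from transformers import TrainingArguments

  # Minimal sketch: precision and learning rate as the Hugging Face Trainer
  # sees them. Values are illustrative, not recommended settings.
  training_args = TrainingArguments(
      output_dir="outputs",  # hypothetical output directory
      learning_rate=2e-5,    # illustrative value only
      fp16=False,            # set True to train in float16 mixed precision
      bf16=False,            # set True to train in bfloat16 instead
  )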

  5. Yes. The new tokens are used to extend the pretrained tokenizer, but the extended tokenizer is only used to create the new embeddings for the encoder (the new embeddings extend the pretrained backbone's embeddings); see the sketch below.
    (Work in progress) Regarding this behaviour excluding generation_prompt, fixing it has now been identified as a priority.
    (Work in progress) Finally, it is planned to rename the tokenizer_src_langs_path parameter to new_vocabulary in the near future, to better reflect the way it works.
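
A rough sketch of the mechanism described in point 5, using the standard Hugging Face pattern (this is an assumption about how it works, not the actual multimodalhugs implementation; the token names are hypothetical):

  from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

  tokenizer = AutoTokenizer.from_pretrained("google/byt5-small")
  model = AutoModelForSeq2SeqLM.from_pretrained("google/byt5-small")

  # Hypothetical new source-language tokens extending the pretrained tokenizer.
  new_tokens = ["__ase__", "__gsg__"]
  num_added = tokenizer.add_tokens(new_tokens, special_tokens=True)

  if num_added > 0:
      # Adds freshly initialised embedding rows for the new tokens on top of
      # the pretrained backbone's embedding matrix.
      model.resize_token_embeddings(len(tokenizer))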

@GerrySant added the bug and documentation labels on Feb 15, 2025