Missing documentation - new languages #12
Even when my file is just one new token per line, I get:

Saving the dataset (1/1 shards): 100%|████████████████████████████████████████████████████████████████| 96404/96404 [00:00<00:00, 1277394.34 examples/s]
Saving the dataset (1/1 shards): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1738/1738 [00:00<00:00, 165089.69 examples/s]
Saving the dataset (1/1 shards): 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1090/1090 [00:00<00:00, 99849.11 examples/s]
/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
warnings.warn(
Traceback (most recent call last):
File "/data/amoryo/conda/envs/multimodalhugs/bin/multimodalhugs-setup", line 8, in <module>
sys.exit(main())
^^^^^^
File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/multimodalhugs_cli/training_setup.py", line 34, in main
pose2text_setup(args.config_path)
File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/training_setup/pose2sign_training_setup.py", line 66, in main
model = model_class.build_model(**model_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/models/multimodal_embedder.py", line 451, in build_model
source_embeddings = SpecialTokensEmbeddings.build_module(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/modules/special_tokens_embeddings.py", line 49, in build_module
custom_embeddings = CustomEmbedding.build_module(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/modules/custom_embedding.py", line 55, in build_module
module.old_embeddings.weight.data[:] = old_embs_weight[:used_size]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
RuntimeError: The expanded size of the tensor (1024) must match the existing size (1472) at non-singleton dimension 1. Target sizes: [384, 1024]. Tensor sizes: [384, 1472]
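If I read the traceback right, the copy that fails is a hidden-size mismatch rather than a vocabulary problem: the rebuilt embedding table has embedding_dim 1024, while the pretrained weights being copied into it are 1472-dimensional. Here is a minimal PyTorch sketch that reproduces the same error, using only the shapes from the traceback (everything else is made up for illustration):

```python
import torch
import torch.nn as nn

# Shapes taken from the traceback above; names are illustrative only.
old_embs_weight = torch.randn(384, 1472)   # pretrained embeddings: 384 tokens, dim 1472
new_embeddings = nn.Embedding(384, 1024)   # rebuilt table, but with dim 1024
used_size = 384

# Same kind of copy as in custom_embedding.py line 55 -> fails on dimension 1 (1024 vs 1472)
new_embeddings.weight.data[:] = old_embs_weight[:used_size]
# RuntimeError: The expanded size of the tensor (1024) must match the existing size (1472)
# at non-singleton dimension 1. Target sizes: [384, 1024]. Tensor sizes: [384, 1472]
```

So the new-languages file may not be the only culprit: it looks like the model is being built with a different hidden size than the pretrained checkpoint it tries to reuse.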
Idea 1: implement the code necessary to set …
Idea 2: In …
There's this file:
https://github.com/GerrySant/multimodalhugs/blob/master/examples/multimodal_translation/pose2text_translation/other/new_languages_how2sign.txt
But I don't know how it should be constructed.
Also, why is there no __slt__ token like in the documentation, or __en__?

In setup there should be validation of that file, to make sure it fits a format.

In my run, I created a file with one new token per line, but it errors out, so I expect that's wrong.
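On the validation point, here is a rough sketch of the kind of check multimodalhugs-setup could run before building the model. It assumes the file is meant to contain one special token per line (which is just my guess from this issue); the file name, the token pattern, and the validate_new_languages_file helper are all hypothetical, not existing multimodalhugs API:

```python
import re
from pathlib import Path

# Hypothetical helper, not part of multimodalhugs: sketches what a format check could look like.
TOKEN_PATTERN = re.compile(r"^__[A-Za-z0-9_-]+__$")  # e.g. __en__, __slt__ (assumed format)

def validate_new_languages_file(path: str) -> list[str]:
    """Return the tokens in the file, or raise a readable error if the format looks wrong."""
    tokens = []
    for lineno, line in enumerate(Path(path).read_text(encoding="utf-8").splitlines(), start=1):
        token = line.strip()
        if not token:
            continue  # tolerate blank lines
        if not TOKEN_PATTERN.match(token):
            raise ValueError(
                f"{path}:{lineno}: {token!r} does not look like a language token "
                f"(expected something like __en__ or __slt__)"
            )
        tokens.append(token)
    if not tokens:
        raise ValueError(f"{path} is empty; expected one new language token per line")
    return tokens

# Usage sketch:
# new_tokens = validate_new_languages_file("new_languages_how2sign.txt")
```

That would at least turn the shape error deep inside build_model into an actionable message about the file, or confirm that the file is fine and the problem is elsewhere.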