remove_unused_columns

The `training` section in the `config.yaml` doesn't do anything? It seems like the train script does not load a config.

I run:
```bash
multimodalhugs-train \
    --task "translation" \
    --model_name_or_path $MODEL_PATH \
    --processor_name_or_path $PROCESSOR_PATH \
    --run_name $MODEL_NAME \
    --dataset_dir $DATA_PATH \
    --output_dir $OUTPUT_PATH \
    --do_train True \
    --do_eval True \
    --fp16 \
    --label_smoothing_factor 0.1 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --evaluation_strategy "steps" \
    --eval_steps 2000 \
    --save_strategy "steps" \
    --save_steps 2000 \
    --save_total_limit 3 \
    --load_best_model_at_end true \
    --metric_for_best_model 'chrf' \
    --overwrite_output_dir \
    --gradient_accumulation_steps 4 \
    --learning_rate 1e-3 \
    --warmup_steps 20000 \
    --max_steps 200000 \
    --predict_with_generate True
```
I get this error:
```text
checkpoint: None
[INFO|trainer.py:811] 2025-02-14 17:13:23,349 >> The following columns in the training set don't have a corresponding argument in `MultiModalEmbedderModel.forward` and have been ignored: source, source_start, output_text, generation_prompt, source_end, source_prompt. If source, source_start, output_text, generation_prompt, source_end, source_prompt are not expected by `MultiModalEmbedderModel.forward`, you can safely ignore this message.
Traceback (most recent call last):
  File "/data/amoryo/conda/envs/multimodalhugs/bin/multimodalhugs-train", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/multimodalhugs_cli/train.py", line 25, in main
    translation_main()
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/tasks/run_translation.py", line 715, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 1938, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 1967, in _inner_training_loop
    train_dataloader = self.get_train_dataloader()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 892, in get_train_dataloader
    train_dataset = self._remove_unused_columns(train_dataset, description="training")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 820, in _remove_unused_columns
    raise ValueError(
ValueError: No columns in the dataset match the model's forward method signature. The following columns have been ignored: [source, source_start, output_text, generation_prompt, source_end, source_prompt]. Please check the dataset and model. You may need to set `remove_unused_columns=False` in `TrainingArguments`.
```

Full log:

```text
(multimodalhugs) amoryo@u20-chiivm0-604:~/sign-language/signwriting-transcription$ multimodalhugs-train --task translation --model_name_or_path /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model --processor_name_or_path /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/pose2text_translation_processor --run_name signwriting_transcription_model --dataset_dir /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/datasets/pose2text --output_dir /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/output --do_train True --do_eval True --fp16 --label_smoothing_factor 0.1 --per_device_train_batch_size 8 --per_device_eval_batch_size 8 --evaluation_strategy steps --eval_steps 2000 --save_strategy steps --save_steps 2000 --save_total_limit 3 --load_best_model_at_end true --metric_for_best_model chrf --overwrite_output_dir --gradient_accumulation_steps 4 --learning_rate 1e-3 --warmup_steps 20000 --max_steps 200000 --predict_with_generate True /data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/training_args.py:1525: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead warnings.warn( WARNING:multimodalhugs.tasks.run_translation:Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, 16-bits training: True INFO:multimodalhugs.tasks.run_translation:Training/evaluation parameters Seq2SeqTrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, batch_eval_metrics=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=None, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=2000, eval_strategy=IntervalStrategy.STEPS, eval_use_gather_object=False, evaluation_strategy=steps, fp16=True, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, generation_config=None, generation_max_length=None, generation_num_beams=None, gradient_accumulation_steps=4, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=True, group_by_length=False, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=False, hub_strategy=HubStrategy.EVERY_SAVE, hub_token=<HUB_TOKEN>, ignore_data_skip=False, include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.1, learning_rate=0.001, length_column_name=length, load_best_model_at_end=True, local_rank=0, log_level=passive, 
log_level_replica=warning, log_on_each_node=True, logging_dir=/scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/output/runs/Feb14_17-13-03_u20-chiivm0-604, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=500, logging_strategy=IntervalStrategy.STEPS, lr_scheduler_kwargs={}, lr_scheduler_type=SchedulerType.LINEAR, max_grad_norm=1.0, max_steps=200000, metric_for_best_model=chrf, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=3.0, optim=OptimizerNames.ADAMW_TORCH, optim_args=None, optim_target_modules=None, output_dir=/scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/output, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=8, per_device_train_batch_size=8, predict_with_generate=True, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=<PUSH_TO_HUB_TOKEN>, ray_scope=last, remove_unused_columns=True, report_to=['wandb'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=signwriting_transcription_model, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=2000, save_strategy=IntervalStrategy.STEPS, save_total_limit=3, seed=42, skip_memory_metrics=True, sortish_sampler=False, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=20000, weight_decay=0.0, ) [INFO|configuration_utils.py:731] 2025-02-14 17:13:04,230 >> loading configuration file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model/config.json [INFO|configuration_utils.py:800] 2025-02-14 17:13:04,231 >> Model config MultiModalEmbedderConfig(model_type='multimodal_embedder', feat_dim=534, feature_extractor_type=None, no_scale_embedding=False, pretrained_feature_extractor=None, freeze_feature_extractor=False, vl_mapper_type='linear', vl_mapper_layer_norm_before=True, vl_mapper_layer_norm=False, vl_mapper_activation=False, vl_factor=None, vl_mapper_dropout=0.1, freeze_vl_mapper=False, new_embeddings_vocab_size=11, backbone_used_vocab_size=384, init_lang_abbr='avg', freeze_new_embeddings=False, freeze_old_embeddings=False, backbone_name='t5', backbone_cfg=None, pretrained_backbone='google/byt5-small', freeze_backbone=False, encoder_embed_dim=1472, feature_extractor_cfg=None, is_encoder_decoder=True, pad_token_id=0, bos_token_id=None, eos_token_id=1, max_length=20) [INFO|configuration_utils.py:1038] 2025-02-14 17:13:04,231 >> Generate config GenerationConfig { "eos_token_id": 1, "pad_token_id": 0 } [INFO|processing_utils.py:660] 2025-02-14 17:13:04,232 >> loading configuration file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/pose2text_translation_processor/processor_config.json [INFO|tokenization_utils_base.py:2267] 2025-02-14 17:13:04,237 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:2267] 2025-02-14 17:13:04,237 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2267] 2025-02-14 17:13:04,237 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2267] 2025-02-14 17:13:04,237 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:2513] 2025-02-14 17:13:04,239 
>> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. [INFO|processing_utils.py:660] 2025-02-14 17:13:04,239 >> loading configuration file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/pose2text_translation_processor/processor_config.json [WARNING|processing_utils.py:953] 2025-02-14 17:13:04,239 >> Some kwargs in processor config are unused and will not have any effect: reduce_holistic_poses. [INFO|processing_utils.py:722] 2025-02-14 17:13:04,242 >> Processor Pose2TextTranslationProcessor: - tokenizer: ByT5Tokenizer(name_or_path='/scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/pose2text_translation_processor', vocab_size=256, model_max_length=1000000000000000019884624838656, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'eos_token': '</s>', 'unk_token': '<unk>', 'pad_token': '<pad>', 'additional_special_tokens': ['<extra_id_0>', '<extra_id_1>', '<extra_id_2>', '<extra_id_3>', '<extra_id_4>', '<extra_id_5>', '<extra_id_6>', '<extra_id_7>', '<extra_id_8>', '<extra_id_9>', '<extra_id_10>', '<extra_id_11>', '<extra_id_12>', '<extra_id_13>', '<extra_id_14>', '<extra_id_15>', '<extra_id_16>', '<extra_id_17>', '<extra_id_18>', '<extra_id_19>', '<extra_id_20>', '<extra_id_21>', '<extra_id_22>', '<extra_id_23>', '<extra_id_24>', '<extra_id_25>', '<extra_id_26>', '<extra_id_27>', '<extra_id_28>', '<extra_id_29>', '<extra_id_30>', '<extra_id_31>', '<extra_id_32>', '<extra_id_33>', '<extra_id_34>', '<extra_id_35>', '<extra_id_36>', '<extra_id_37>', '<extra_id_38>', '<extra_id_39>', '<extra_id_40>', '<extra_id_41>', '<extra_id_42>', '<extra_id_43>', '<extra_id_44>', '<extra_id_45>', '<extra_id_46>', '<extra_id_47>', '<extra_id_48>', '<extra_id_49>', '<extra_id_50>', '<extra_id_51>', '<extra_id_52>', '<extra_id_53>', '<extra_id_54>', '<extra_id_55>', '<extra_id_56>', '<extra_id_57>', '<extra_id_58>', '<extra_id_59>', '<extra_id_60>', '<extra_id_61>', '<extra_id_62>', '<extra_id_63>', '<extra_id_64>', '<extra_id_65>', '<extra_id_66>', '<extra_id_67>', '<extra_id_68>', '<extra_id_69>', '<extra_id_70>', '<extra_id_71>', '<extra_id_72>', '<extra_id_73>', '<extra_id_74>', '<extra_id_75>', '<extra_id_76>', '<extra_id_77>', '<extra_id_78>', '<extra_id_79>', '<extra_id_80>', '<extra_id_81>', '<extra_id_82>', '<extra_id_83>', '<extra_id_84>', '<extra_id_85>', '<extra_id_86>', '<extra_id_87>', '<extra_id_88>', '<extra_id_89>', '<extra_id_90>', '<extra_id_91>', '<extra_id_92>', '<extra_id_93>', '<extra_id_94>', '<extra_id_95>', '<extra_id_96>', '<extra_id_97>', '<extra_id_98>', '<extra_id_99>', '<extra_id_100>', '<extra_id_101>', '<extra_id_102>', '<extra_id_103>', '<extra_id_104>', '<extra_id_105>', '<extra_id_106>', '<extra_id_107>', '<extra_id_108>', '<extra_id_109>', '<extra_id_110>', '<extra_id_111>', '<extra_id_112>', '<extra_id_113>', '<extra_id_114>', '<extra_id_115>', '<extra_id_116>', '<extra_id_117>', '<extra_id_118>', '<extra_id_119>', '<extra_id_120>', '<extra_id_121>', '<extra_id_122>', '<extra_id_123>', '<extra_id_124>', '__pose__', '__gsg__', '__slf__', '__asq__', '__ssr__', '__ase__', '__ils__', '__sgg__', '__cse__', '__svk__', '__dse__']}, clean_up_tokenization_spaces=True), added_tokens_decoder={ 0: AddedToken("<pad>", rstrip=False, lstrip=False, single_word=False, normalized=True, special=True), 1: AddedToken("</s>", rstrip=False, lstrip=False, single_word=False, normalized=True, special=True), 2: AddedToken("<unk>", 
rstrip=False, lstrip=False, single_word=False, normalized=True, special=True), 259: AddedToken("<extra_id_0>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 260: AddedToken("<extra_id_1>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 261: AddedToken("<extra_id_2>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 262: AddedToken("<extra_id_3>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 263: AddedToken("<extra_id_4>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 264: AddedToken("<extra_id_5>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 265: AddedToken("<extra_id_6>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 266: AddedToken("<extra_id_7>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 267: AddedToken("<extra_id_8>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 268: AddedToken("<extra_id_9>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 269: AddedToken("<extra_id_10>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 270: AddedToken("<extra_id_11>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 271: AddedToken("<extra_id_12>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 272: AddedToken("<extra_id_13>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 273: AddedToken("<extra_id_14>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 274: AddedToken("<extra_id_15>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 275: AddedToken("<extra_id_16>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 276: AddedToken("<extra_id_17>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 277: AddedToken("<extra_id_18>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 278: AddedToken("<extra_id_19>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 279: AddedToken("<extra_id_20>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 280: AddedToken("<extra_id_21>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 281: AddedToken("<extra_id_22>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 282: AddedToken("<extra_id_23>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 283: AddedToken("<extra_id_24>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 284: AddedToken("<extra_id_25>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 285: AddedToken("<extra_id_26>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 286: AddedToken("<extra_id_27>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 287: AddedToken("<extra_id_28>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 288: AddedToken("<extra_id_29>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 289: AddedToken("<extra_id_30>", rstrip=False, lstrip=False, single_word=False, 
normalized=False, special=True), 290: AddedToken("<extra_id_31>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 291: AddedToken("<extra_id_32>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 292: AddedToken("<extra_id_33>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 293: AddedToken("<extra_id_34>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 294: AddedToken("<extra_id_35>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 295: AddedToken("<extra_id_36>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 296: AddedToken("<extra_id_37>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 297: AddedToken("<extra_id_38>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 298: AddedToken("<extra_id_39>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 299: AddedToken("<extra_id_40>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 300: AddedToken("<extra_id_41>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 301: AddedToken("<extra_id_42>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 302: AddedToken("<extra_id_43>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 303: AddedToken("<extra_id_44>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 304: AddedToken("<extra_id_45>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 305: AddedToken("<extra_id_46>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 306: AddedToken("<extra_id_47>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 307: AddedToken("<extra_id_48>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 308: AddedToken("<extra_id_49>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 309: AddedToken("<extra_id_50>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 310: AddedToken("<extra_id_51>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 311: AddedToken("<extra_id_52>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 312: AddedToken("<extra_id_53>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 313: AddedToken("<extra_id_54>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 314: AddedToken("<extra_id_55>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 315: AddedToken("<extra_id_56>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 316: AddedToken("<extra_id_57>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 317: AddedToken("<extra_id_58>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 318: AddedToken("<extra_id_59>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 319: AddedToken("<extra_id_60>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 320: AddedToken("<extra_id_61>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 321: 
AddedToken("<extra_id_62>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 322: AddedToken("<extra_id_63>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 323: AddedToken("<extra_id_64>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 324: AddedToken("<extra_id_65>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 325: AddedToken("<extra_id_66>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 326: AddedToken("<extra_id_67>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 327: AddedToken("<extra_id_68>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 328: AddedToken("<extra_id_69>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 329: AddedToken("<extra_id_70>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 330: AddedToken("<extra_id_71>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 331: AddedToken("<extra_id_72>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 332: AddedToken("<extra_id_73>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 333: AddedToken("<extra_id_74>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 334: AddedToken("<extra_id_75>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 335: AddedToken("<extra_id_76>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 336: AddedToken("<extra_id_77>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 337: AddedToken("<extra_id_78>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 338: AddedToken("<extra_id_79>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 339: AddedToken("<extra_id_80>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 340: AddedToken("<extra_id_81>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 341: AddedToken("<extra_id_82>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 342: AddedToken("<extra_id_83>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 343: AddedToken("<extra_id_84>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 344: AddedToken("<extra_id_85>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 345: AddedToken("<extra_id_86>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 346: AddedToken("<extra_id_87>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 347: AddedToken("<extra_id_88>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 348: AddedToken("<extra_id_89>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 349: AddedToken("<extra_id_90>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 350: AddedToken("<extra_id_91>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 351: AddedToken("<extra_id_92>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 352: AddedToken("<extra_id_93>", rstrip=False, 
lstrip=False, single_word=False, normalized=False, special=True), 353: AddedToken("<extra_id_94>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 354: AddedToken("<extra_id_95>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 355: AddedToken("<extra_id_96>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 356: AddedToken("<extra_id_97>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 357: AddedToken("<extra_id_98>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 358: AddedToken("<extra_id_99>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 359: AddedToken("<extra_id_100>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 360: AddedToken("<extra_id_101>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 361: AddedToken("<extra_id_102>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 362: AddedToken("<extra_id_103>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 363: AddedToken("<extra_id_104>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 364: AddedToken("<extra_id_105>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 365: AddedToken("<extra_id_106>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 366: AddedToken("<extra_id_107>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 367: AddedToken("<extra_id_108>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 368: AddedToken("<extra_id_109>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 369: AddedToken("<extra_id_110>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 370: AddedToken("<extra_id_111>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 371: AddedToken("<extra_id_112>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 372: AddedToken("<extra_id_113>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 373: AddedToken("<extra_id_114>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 374: AddedToken("<extra_id_115>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 375: AddedToken("<extra_id_116>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 376: AddedToken("<extra_id_117>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 377: AddedToken("<extra_id_118>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 378: AddedToken("<extra_id_119>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 379: AddedToken("<extra_id_120>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 380: AddedToken("<extra_id_121>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 381: AddedToken("<extra_id_122>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 382: AddedToken("<extra_id_123>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 383: AddedToken("<extra_id_124>", rstrip=False, lstrip=False, 
single_word=False, normalized=False, special=True), 384: AddedToken("__pose__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 385: AddedToken("__gsg__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 386: AddedToken("__slf__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 387: AddedToken("__asq__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 388: AddedToken("__ssr__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 389: AddedToken("__ase__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 390: AddedToken("__ils__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 391: AddedToken("__sgg__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 392: AddedToken("__cse__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 393: AddedToken("__svk__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 394: AddedToken("__dse__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), } { "processor_class": "Pose2TextTranslationProcessor", "reduce_holistic_poses": true } [INFO|modeling_utils.py:3675] 2025-02-14 17:13:04,329 >> loading weights file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model/model.safetensors [INFO|configuration_utils.py:1038] 2025-02-14 17:13:08,925 >> Generate config GenerationConfig { "eos_token_id": 1, "pad_token_id": 0 } [INFO|configuration_utils.py:733] 2025-02-14 17:13:09,413 >> loading configuration file config.json from cache at /home/amoryo/data/.cache/huggingface/hub/models--google--byt5-small/snapshots/68377bdc18a2ffec8a0533fef03b1c513a4dd49d/config.json [INFO|configuration_utils.py:800] 2025-02-14 17:13:09,414 >> Model config T5Config { "_name_or_path": "/home/patrick/t5/byt5-small", "architectures": [ "T5ForConditionalGeneration" ], "classifier_dropout": 0.0, "d_ff": 3584, "d_kv": 64, "d_model": 1472, "decoder_start_token_id": 0, "dense_act_fn": "gelu_new", "dropout_rate": 0.1, "eos_token_id": 1, "feed_forward_proj": "gated-gelu", "gradient_checkpointing": false, "initializer_factor": 1.0, "is_encoder_decoder": true, "is_gated_act": true, "layer_norm_epsilon": 1e-06, "model_type": "t5", "num_decoder_layers": 4, "num_heads": 6, "num_layers": 12, "pad_token_id": 0, "relative_attention_max_distance": 128, "relative_attention_num_buckets": 32, "tie_word_embeddings": false, "tokenizer_class": "ByT5Tokenizer", "transformers_version": "4.44.2", "use_cache": true, "vocab_size": 384 } [INFO|modeling_utils.py:3678] 2025-02-14 17:13:09,537 >> loading weights file pytorch_model.bin from cache at /home/amoryo/data/.cache/huggingface/hub/models--google--byt5-small/snapshots/68377bdc18a2ffec8a0533fef03b1c513a4dd49d/pytorch_model.bin [INFO|configuration_utils.py:1038] 2025-02-14 17:13:17,346 >> Generate config GenerationConfig { "decoder_start_token_id": 0, "eos_token_id": 1, "pad_token_id": 0 } [INFO|modeling_utils.py:4507] 2025-02-14 17:13:17,403 >> All model checkpoint weights were used when initializing T5ForConditionalGeneration. [INFO|modeling_utils.py:4515] 2025-02-14 17:13:17,403 >> All the weights of T5ForConditionalGeneration were initialized from the model checkpoint at google/byt5-small. 
If your task is similar to the task the model of the checkpoint was trained on, you can already use T5ForConditionalGeneration for predictions without further training. [INFO|configuration_utils.py:993] 2025-02-14 17:13:17,534 >> loading configuration file generation_config.json from cache at /home/amoryo/data/.cache/huggingface/hub/models--google--byt5-small/snapshots/68377bdc18a2ffec8a0533fef03b1c513a4dd49d/generation_config.json [INFO|configuration_utils.py:1038] 2025-02-14 17:13:17,534 >> Generate config GenerationConfig { "decoder_start_token_id": 0, "eos_token_id": 1, "pad_token_id": 0 } [INFO|modeling_utils.py:4507] 2025-02-14 17:13:17,575 >> All model checkpoint weights were used when initializing MultiModalEmbedderModel. [INFO|modeling_utils.py:4515] 2025-02-14 17:13:17,575 >> All the weights of MultiModalEmbedderModel were initialized from the model checkpoint at /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model. If your task is similar to the task the model of the checkpoint was trained on, you can already use MultiModalEmbedderModel for predictions without further training. [INFO|configuration_utils.py:991] 2025-02-14 17:13:17,599 >> loading configuration file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model/generation_config.json [INFO|configuration_utils.py:1038] 2025-02-14 17:13:17,600 >> Generate config GenerationConfig { "eos_token_id": 1, "pad_token_id": 0 } WARNING:multimodalhugs.tasks.run_translation:label_smoothing is enabled but the `prepare_decoder_input_ids_from_labels` method is not defined for `MultiModalEmbedderModel`. This will lead to loss being calculated twice and will take up more memory train_dataset: Dataset({ features: ['source', 'source_start', 'source_end', 'source_prompt', 'generation_prompt', 'output_text'], num_rows: 96404 }) [WARNING|trainer.py:598] 2025-02-14 17:13:23,154 >> max_steps is given, it will override any value given in num_train_epochs [INFO|trainer.py:648] 2025-02-14 17:13:23,154 >> Using auto half precision backend INFO:multimodalhugs.tasks.run_translation: MultiModalEmbedderModel( (vl_mapper): VLMapper( (layer_norm_before): LayerNorm((534,), eps=1e-05, elementwise_affine=True) (mapping_layer): Linear(in_features=534, out_features=1472, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (special_tokens_embeddings): SpecialTokensEmbeddings( (special_tokens_embeddings): CustomEmbedding( (old_embeddings): Embedding(384, 1472) (new_embeddings): Embedding(11, 1472) ) ) (backbone): T5ForConditionalGeneration( (shared): Embedding(384, 1472) (encoder): T5Stack( (embed_tokens): Embedding(384, 1472) (block): ModuleList( (0): T5Block( (layer): ModuleList( (0): T5LayerSelfAttention( (SelfAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) (relative_attention_bias): Embedding(32, 6) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (1): T5LayerFF( (DenseReluDense): T5DenseGatedActDense( (wi_0): Linear(in_features=1472, out_features=3584, bias=False) (wi_1): Linear(in_features=1472, out_features=3584, bias=False) (wo): Linear(in_features=3584, out_features=1472, bias=False) (dropout): Dropout(p=0.1, inplace=False) (act): NewGELUActivation() ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) 
(1-11): 11 x T5Block( (layer): ModuleList( (0): T5LayerSelfAttention( (SelfAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (1): T5LayerFF( (DenseReluDense): T5DenseGatedActDense( (wi_0): Linear(in_features=1472, out_features=3584, bias=False) (wi_1): Linear(in_features=1472, out_features=3584, bias=False) (wo): Linear(in_features=3584, out_features=1472, bias=False) (dropout): Dropout(p=0.1, inplace=False) (act): NewGELUActivation() ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) ) (final_layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (decoder): T5Stack( (embed_tokens): Embedding(384, 1472) (block): ModuleList( (0): T5Block( (layer): ModuleList( (0): T5LayerSelfAttention( (SelfAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) (relative_attention_bias): Embedding(32, 6) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (1): T5LayerCrossAttention( (EncDecAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (2): T5LayerFF( (DenseReluDense): T5DenseGatedActDense( (wi_0): Linear(in_features=1472, out_features=3584, bias=False) (wi_1): Linear(in_features=1472, out_features=3584, bias=False) (wo): Linear(in_features=3584, out_features=1472, bias=False) (dropout): Dropout(p=0.1, inplace=False) (act): NewGELUActivation() ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) (1-3): 3 x T5Block( (layer): ModuleList( (0): T5LayerSelfAttention( (SelfAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (1): T5LayerCrossAttention( (EncDecAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (2): T5LayerFF( (DenseReluDense): T5DenseGatedActDense( (wi_0): Linear(in_features=1472, out_features=3584, bias=False) (wi_1): Linear(in_features=1472, out_features=3584, bias=False) (wo): Linear(in_features=3584, out_features=1472, bias=False) (dropout): Dropout(p=0.1, inplace=False) (act): NewGELUActivation() ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) ) (final_layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (lm_head): Linear(in_features=1472, out_features=384, bias=False) ) ) INFO:multimodalhugs.tasks.run_translation: Model 
Summary:
+--------------------------------+-------------------+---------------------------+
| Module Name                    | N_parameters      | N_training_parameters     |
+--------------------------------+-------------------+---------------------------+
| vl_mapper                      | 788,588           | 788,588                   |
| special_tokens_embeddings      | 581,440           | 581,440                   |
| backbone                       | 299,072,512       | 299,072,512               |
+--------------------------------+-------------------+---------------------------+
checkpoint: None
[INFO|trainer.py:811] 2025-02-14 17:13:23,349 >> The following columns in the training set don't have a corresponding argument in `MultiModalEmbedderModel.forward` and have been ignored: source, source_start, output_text, generation_prompt, source_end, source_prompt. If source, source_start, output_text, generation_prompt, source_end, source_prompt are not expected by `MultiModalEmbedderModel.forward`, you can safely ignore this message.
Traceback (most recent call last):
  File "/data/amoryo/conda/envs/multimodalhugs/bin/multimodalhugs-train", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/multimodalhugs_cli/train.py", line 25, in main
    translation_main()
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/tasks/run_translation.py", line 715, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 1938, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 1967, in _inner_training_loop
    train_dataloader = self.get_train_dataloader()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 892, in get_train_dataloader
    train_dataset = self._remove_unused_columns(train_dataset, description="training")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 820, in _remove_unused_columns
    raise ValueError(
ValueError: No columns in the dataset match the model's forward method signature. The following columns have been ignored: [source, source_start, output_text, generation_prompt, source_end, source_prompt]. Please check the dataset and model. You may need to set `remove_unused_columns=False` in `TrainingArguments`.
(multimodalhugs) amoryo@u20-chiivm0-604:~/sign-language/signwriting-transcription$
```
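For anyone hitting this in the meantime, the `ValueError` itself names the escape hatch: append `--remove_unused_columns False` to the command above, since `remove_unused_columns` is a standard `Seq2SeqTrainingArguments` field. A minimal sketch of what the flag changes (the `output_dir` here is a placeholder, not from this setup):

```python
from transformers import Seq2SeqTrainingArguments

# With remove_unused_columns=False, Trainer.get_train_dataloader() skips the
# _remove_unused_columns() call that raised the ValueError above, so the
# custom columns (source, source_prompt, ...) reach the data collator intact.
args = Seq2SeqTrainingArguments(
    output_dir="out",             # placeholder path
    remove_unused_columns=False,  # keep every dataset column
)
print(args.remove_unused_columns)  # False
```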
yeah, I think so...
Solution: `multimodalhugs-train` should inject `remove_unused_columns=False` if it always has to be set.
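A minimal sketch of what that injection could look like in the CLI entry point — the module and function names are taken from the traceback above, but the `sys.argv` patching is an assumption about the fix, not the actual multimodalhugs code:

```python
import sys

# Inferred from the traceback: multimodalhugs_cli/train.py defines main(),
# which calls translation_main() from multimodalhugs.tasks.run_translation.
from multimodalhugs.tasks.run_translation import main as translation_main


def main():
    # Force the safe default before HfArgumentParser reads sys.argv,
    # unless the user set the flag explicitly.
    if not any(arg.startswith("--remove_unused_columns") for arg in sys.argv):
        sys.argv.append("--remove_unused_columns=False")
    translation_main()
```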
Now working on the usage of the `config.training` arguments.
Regarding specifying the wandb project:

```bash
export WANDB_PROJECT=my_project_name
```
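Setting it from Python before training starts also works, since transformers' `WandbCallback` reads the project name from the environment (and falls back to `"huggingface"` when it is unset):

```python
import os

# Must be set before the Trainer initializes its W&B run.
os.environ["WANDB_PROJECT"] = "my_project_name"
```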
Referenced commits: beed242, 3036356