Training script fails: must add remove_unused_columns #13

Closed · AmitMY opened this issue Feb 14, 2025 · 3 comments

AmitMY (Contributor) commented Feb 14, 2025:

  1. Is it the case that the `training` section in config.yaml doesn't do anything? It seems the train script does not load a config.
  2. How is the WANDB project specified?
  3. Running the train command:
multimodalhugs-train \
    --task "translation" \
    --model_name_or_path $MODEL_PATH \
    --processor_name_or_path $PROCESSOR_PATH \
    --run_name $MODEL_NAME \
    --dataset_dir $DATA_PATH \
    --output_dir $OUTPUT_PATH \
    --do_train True \
    --do_eval True \
    --fp16 \
    --label_smoothing_factor 0.1 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --evaluation_strategy "steps" \
    --eval_steps 2000 \
    --save_strategy "steps" \
    --save_steps 2000 \
    --save_total_limit 3 \
    --load_best_model_at_end true \
    --metric_for_best_model 'chrf' \
    --overwrite_output_dir \
    --gradient_accumulation_steps 4 \
    --learning_rate 1e-3 \
    --warmup_steps 20000 \
    --max_steps 200000 \
    --predict_with_generate True

I get this error:

checkpoint: None
[INFO|trainer.py:811] 2025-02-14 17:13:23,349 >> The following columns in the training set don't have a corresponding argument in `MultiModalEmbedderModel.forward` and have been ignored: source, source_start, output_text, generation_prompt, source_end, source_prompt. If source, source_start, output_text, generation_prompt, source_end, source_prompt are not expected by `MultiModalEmbedderModel.forward`,  you can safely ignore this message.
Traceback (most recent call last):
  File "/data/amoryo/conda/envs/multimodalhugs/bin/multimodalhugs-train", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/multimodalhugs_cli/train.py", line 25, in main
    translation_main()
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/tasks/run_translation.py", line 715, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 1938, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 1967, in _inner_training_loop
    train_dataloader = self.get_train_dataloader()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 892, in get_train_dataloader
    train_dataset = self._remove_unused_columns(train_dataset, description="training")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 820, in _remove_unused_columns
    raise ValueError(
ValueError: No columns in the dataset match the model's forward method signature. The following columns have been ignored: [source, source_start, output_text, generation_prompt, source_end, source_prompt]. Please check the dataset and model. You may need to set `remove_unused_columns=False` in `TrainingArguments`.
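
A possible workaround until the CLI handles this itself, assuming `multimodalhugs-train` forwards standard `Seq2SeqTrainingArguments` flags like the ones already used above: add `--remove_unused_columns False` to the invocation.
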
Full log
(multimodalhugs) amoryo@u20-chiivm0-604:~/sign-language/signwriting-transcription$ multimodalhugs-train --task translation --model_name_or_path /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model --processor_name_or_path /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/pose2text_translation_processor --run_name signwriting_transcription_model --dataset_dir /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/datasets/pose2text --output_dir /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/output --do_train True --do_eval True --fp16 --label_smoothing_factor 0.1 --per_device_train_batch_size 8 --per_device_eval_batch_size 8 --evaluation_strategy steps --eval_steps 2000 --save_strategy steps --save_steps 2000 --save_total_limit 3 --load_best_model_at_end true --metric_for_best_model chrf --overwrite_output_dir --gradient_accumulation_steps 4 --learning_rate 1e-3 --warmup_steps 20000 --max_steps 200000 --predict_with_generate True
/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/training_args.py:1525: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead
  warnings.warn(
WARNING:multimodalhugs.tasks.run_translation:Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, 16-bits training: True
INFO:multimodalhugs.tasks.run_translation:Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False},
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
batch_eval_metrics=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_persistent_workers=False,
dataloader_pin_memory=True,
dataloader_prefetch_factor=None,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=True,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_do_concat_batches=True,
eval_on_start=False,
eval_steps=2000,
eval_strategy=IntervalStrategy.STEPS,
eval_use_gather_object=False,
evaluation_strategy=steps,
fp16=True,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=4,
gradient_checkpointing=False,
gradient_checkpointing_kwargs=None,
greater_is_better=True,
group_by_length=False,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=HubStrategy.EVERY_SAVE,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
include_num_input_tokens_seen=False,
include_tokens_per_second=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.1,
learning_rate=0.001,
length_column_name=length,
load_best_model_at_end=True,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=/scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/output/runs/Feb14_17-13-03_u20-chiivm0-604,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_kwargs={},
lr_scheduler_type=SchedulerType.LINEAR,
max_grad_norm=1.0,
max_steps=200000,
metric_for_best_model=chrf,
mp_parameters=,
neftune_noise_alpha=None,
no_cuda=False,
num_train_epochs=3.0,
optim=OptimizerNames.ADAMW_TORCH,
optim_args=None,
optim_target_modules=None,
output_dir=/scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/output,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=8,
per_device_train_batch_size=8,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=['wandb'],
restore_callback_states_from_checkpoint=False,
resume_from_checkpoint=None,
run_name=signwriting_transcription_model,
save_on_each_node=False,
save_only_model=False,
save_safetensors=True,
save_steps=2000,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=3,
seed=42,
skip_memory_metrics=True,
sortish_sampler=False,
split_batches=None,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torch_empty_cache_steps=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=20000,
weight_decay=0.0,
)
[INFO|configuration_utils.py:731] 2025-02-14 17:13:04,230 >> loading configuration file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model/config.json
[INFO|configuration_utils.py:800] 2025-02-14 17:13:04,231 >> Model config MultiModalEmbedderConfig(model_type='multimodal_embedder', feat_dim=534, feature_extractor_type=None, no_scale_embedding=False, pretrained_feature_extractor=None, freeze_feature_extractor=False, vl_mapper_type='linear', vl_mapper_layer_norm_before=True, vl_mapper_layer_norm=False, vl_mapper_activation=False, vl_factor=None, vl_mapper_dropout=0.1, freeze_vl_mapper=False, new_embeddings_vocab_size=11, backbone_used_vocab_size=384, init_lang_abbr='avg', freeze_new_embeddings=False, freeze_old_embeddings=False, backbone_name='t5', backbone_cfg=None, pretrained_backbone='google/byt5-small', freeze_backbone=False, encoder_embed_dim=1472, feature_extractor_cfg=None, is_encoder_decoder=True, pad_token_id=0, bos_token_id=None, eos_token_id=1, max_length=20)
[INFO|configuration_utils.py:1038] 2025-02-14 17:13:04,231 >> Generate config GenerationConfig {
  "eos_token_id": 1,
  "pad_token_id": 0
}

[INFO|processing_utils.py:660] 2025-02-14 17:13:04,232 >> loading configuration file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/pose2text_translation_processor/processor_config.json
[INFO|tokenization_utils_base.py:2267] 2025-02-14 17:13:04,237 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2267] 2025-02-14 17:13:04,237 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2267] 2025-02-14 17:13:04,237 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2267] 2025-02-14 17:13:04,237 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2513] 2025-02-14 17:13:04,239 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|processing_utils.py:660] 2025-02-14 17:13:04,239 >> loading configuration file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/pose2text_translation_processor/processor_config.json
[WARNING|processing_utils.py:953] 2025-02-14 17:13:04,239 >> Some kwargs in processor config are unused and will not have any effect: reduce_holistic_poses. 
[INFO|processing_utils.py:722] 2025-02-14 17:13:04,242 >> Processor Pose2TextTranslationProcessor:
- tokenizer: ByT5Tokenizer(name_or_path='/scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/pose2text_translation_processor', vocab_size=256, model_max_length=1000000000000000019884624838656, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'eos_token': '</s>', 'unk_token': '<unk>', 'pad_token': '<pad>', 'additional_special_tokens': ['<extra_id_0>', '<extra_id_1>', '<extra_id_2>', '<extra_id_3>', '<extra_id_4>', '<extra_id_5>', '<extra_id_6>', '<extra_id_7>', '<extra_id_8>', '<extra_id_9>', '<extra_id_10>', '<extra_id_11>', '<extra_id_12>', '<extra_id_13>', '<extra_id_14>', '<extra_id_15>', '<extra_id_16>', '<extra_id_17>', '<extra_id_18>', '<extra_id_19>', '<extra_id_20>', '<extra_id_21>', '<extra_id_22>', '<extra_id_23>', '<extra_id_24>', '<extra_id_25>', '<extra_id_26>', '<extra_id_27>', '<extra_id_28>', '<extra_id_29>', '<extra_id_30>', '<extra_id_31>', '<extra_id_32>', '<extra_id_33>', '<extra_id_34>', '<extra_id_35>', '<extra_id_36>', '<extra_id_37>', '<extra_id_38>', '<extra_id_39>', '<extra_id_40>', '<extra_id_41>', '<extra_id_42>', '<extra_id_43>', '<extra_id_44>', '<extra_id_45>', '<extra_id_46>', '<extra_id_47>', '<extra_id_48>', '<extra_id_49>', '<extra_id_50>', '<extra_id_51>', '<extra_id_52>', '<extra_id_53>', '<extra_id_54>', '<extra_id_55>', '<extra_id_56>', '<extra_id_57>', '<extra_id_58>', '<extra_id_59>', '<extra_id_60>', '<extra_id_61>', '<extra_id_62>', '<extra_id_63>', '<extra_id_64>', '<extra_id_65>', '<extra_id_66>', '<extra_id_67>', '<extra_id_68>', '<extra_id_69>', '<extra_id_70>', '<extra_id_71>', '<extra_id_72>', '<extra_id_73>', '<extra_id_74>', '<extra_id_75>', '<extra_id_76>', '<extra_id_77>', '<extra_id_78>', '<extra_id_79>', '<extra_id_80>', '<extra_id_81>', '<extra_id_82>', '<extra_id_83>', '<extra_id_84>', '<extra_id_85>', '<extra_id_86>', '<extra_id_87>', '<extra_id_88>', '<extra_id_89>', '<extra_id_90>', '<extra_id_91>', '<extra_id_92>', '<extra_id_93>', '<extra_id_94>', '<extra_id_95>', '<extra_id_96>', '<extra_id_97>', '<extra_id_98>', '<extra_id_99>', '<extra_id_100>', '<extra_id_101>', '<extra_id_102>', '<extra_id_103>', '<extra_id_104>', '<extra_id_105>', '<extra_id_106>', '<extra_id_107>', '<extra_id_108>', '<extra_id_109>', '<extra_id_110>', '<extra_id_111>', '<extra_id_112>', '<extra_id_113>', '<extra_id_114>', '<extra_id_115>', '<extra_id_116>', '<extra_id_117>', '<extra_id_118>', '<extra_id_119>', '<extra_id_120>', '<extra_id_121>', '<extra_id_122>', '<extra_id_123>', '<extra_id_124>', '__pose__', '__gsg__', '__slf__', '__asq__', '__ssr__', '__ase__', '__ils__', '__sgg__', '__cse__', '__svk__', '__dse__']}, clean_up_tokenization_spaces=True),  added_tokens_decoder={
	0: AddedToken("<pad>", rstrip=False, lstrip=False, single_word=False, normalized=True, special=True),
	1: AddedToken("</s>", rstrip=False, lstrip=False, single_word=False, normalized=True, special=True),
	2: AddedToken("<unk>", rstrip=False, lstrip=False, single_word=False, normalized=True, special=True),
	259: AddedToken("<extra_id_0>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	260: AddedToken("<extra_id_1>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	261: AddedToken("<extra_id_2>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	262: AddedToken("<extra_id_3>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	263: AddedToken("<extra_id_4>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	264: AddedToken("<extra_id_5>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	265: AddedToken("<extra_id_6>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	266: AddedToken("<extra_id_7>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	267: AddedToken("<extra_id_8>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	268: AddedToken("<extra_id_9>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	269: AddedToken("<extra_id_10>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	270: AddedToken("<extra_id_11>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	271: AddedToken("<extra_id_12>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	272: AddedToken("<extra_id_13>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	273: AddedToken("<extra_id_14>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	274: AddedToken("<extra_id_15>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	275: AddedToken("<extra_id_16>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	276: AddedToken("<extra_id_17>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	277: AddedToken("<extra_id_18>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	278: AddedToken("<extra_id_19>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	279: AddedToken("<extra_id_20>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	280: AddedToken("<extra_id_21>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	281: AddedToken("<extra_id_22>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	282: AddedToken("<extra_id_23>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	283: AddedToken("<extra_id_24>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	284: AddedToken("<extra_id_25>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	285: AddedToken("<extra_id_26>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	286: AddedToken("<extra_id_27>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	287: AddedToken("<extra_id_28>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	288: AddedToken("<extra_id_29>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	289: AddedToken("<extra_id_30>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	290: AddedToken("<extra_id_31>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	291: AddedToken("<extra_id_32>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	292: AddedToken("<extra_id_33>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	293: AddedToken("<extra_id_34>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	294: AddedToken("<extra_id_35>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	295: AddedToken("<extra_id_36>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	296: AddedToken("<extra_id_37>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	297: AddedToken("<extra_id_38>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	298: AddedToken("<extra_id_39>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	299: AddedToken("<extra_id_40>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	300: AddedToken("<extra_id_41>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	301: AddedToken("<extra_id_42>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	302: AddedToken("<extra_id_43>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	303: AddedToken("<extra_id_44>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	304: AddedToken("<extra_id_45>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	305: AddedToken("<extra_id_46>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	306: AddedToken("<extra_id_47>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	307: AddedToken("<extra_id_48>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	308: AddedToken("<extra_id_49>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	309: AddedToken("<extra_id_50>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	310: AddedToken("<extra_id_51>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	311: AddedToken("<extra_id_52>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	312: AddedToken("<extra_id_53>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	313: AddedToken("<extra_id_54>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	314: AddedToken("<extra_id_55>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	315: AddedToken("<extra_id_56>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	316: AddedToken("<extra_id_57>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	317: AddedToken("<extra_id_58>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	318: AddedToken("<extra_id_59>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	319: AddedToken("<extra_id_60>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	320: AddedToken("<extra_id_61>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	321: AddedToken("<extra_id_62>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	322: AddedToken("<extra_id_63>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	323: AddedToken("<extra_id_64>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	324: AddedToken("<extra_id_65>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	325: AddedToken("<extra_id_66>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	326: AddedToken("<extra_id_67>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	327: AddedToken("<extra_id_68>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	328: AddedToken("<extra_id_69>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	329: AddedToken("<extra_id_70>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	330: AddedToken("<extra_id_71>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	331: AddedToken("<extra_id_72>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	332: AddedToken("<extra_id_73>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	333: AddedToken("<extra_id_74>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	334: AddedToken("<extra_id_75>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	335: AddedToken("<extra_id_76>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	336: AddedToken("<extra_id_77>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	337: AddedToken("<extra_id_78>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	338: AddedToken("<extra_id_79>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	339: AddedToken("<extra_id_80>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	340: AddedToken("<extra_id_81>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	341: AddedToken("<extra_id_82>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	342: AddedToken("<extra_id_83>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	343: AddedToken("<extra_id_84>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	344: AddedToken("<extra_id_85>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	345: AddedToken("<extra_id_86>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	346: AddedToken("<extra_id_87>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	347: AddedToken("<extra_id_88>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	348: AddedToken("<extra_id_89>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	349: AddedToken("<extra_id_90>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	350: AddedToken("<extra_id_91>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	351: AddedToken("<extra_id_92>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	352: AddedToken("<extra_id_93>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	353: AddedToken("<extra_id_94>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	354: AddedToken("<extra_id_95>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	355: AddedToken("<extra_id_96>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	356: AddedToken("<extra_id_97>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	357: AddedToken("<extra_id_98>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	358: AddedToken("<extra_id_99>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	359: AddedToken("<extra_id_100>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	360: AddedToken("<extra_id_101>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	361: AddedToken("<extra_id_102>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	362: AddedToken("<extra_id_103>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	363: AddedToken("<extra_id_104>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	364: AddedToken("<extra_id_105>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	365: AddedToken("<extra_id_106>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	366: AddedToken("<extra_id_107>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	367: AddedToken("<extra_id_108>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	368: AddedToken("<extra_id_109>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	369: AddedToken("<extra_id_110>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	370: AddedToken("<extra_id_111>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	371: AddedToken("<extra_id_112>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	372: AddedToken("<extra_id_113>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	373: AddedToken("<extra_id_114>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	374: AddedToken("<extra_id_115>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	375: AddedToken("<extra_id_116>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	376: AddedToken("<extra_id_117>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	377: AddedToken("<extra_id_118>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	378: AddedToken("<extra_id_119>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	379: AddedToken("<extra_id_120>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	380: AddedToken("<extra_id_121>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	381: AddedToken("<extra_id_122>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	382: AddedToken("<extra_id_123>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	383: AddedToken("<extra_id_124>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	384: AddedToken("__pose__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	385: AddedToken("__gsg__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	386: AddedToken("__slf__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	387: AddedToken("__asq__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	388: AddedToken("__ssr__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	389: AddedToken("__ase__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	390: AddedToken("__ils__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	391: AddedToken("__sgg__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	392: AddedToken("__cse__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	393: AddedToken("__svk__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	394: AddedToken("__dse__", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}

{
  "processor_class": "Pose2TextTranslationProcessor",
  "reduce_holistic_poses": true
}

[INFO|modeling_utils.py:3675] 2025-02-14 17:13:04,329 >> loading weights file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model/model.safetensors
[INFO|configuration_utils.py:1038] 2025-02-14 17:13:08,925 >> Generate config GenerationConfig {
  "eos_token_id": 1,
  "pad_token_id": 0
}

[INFO|configuration_utils.py:733] 2025-02-14 17:13:09,413 >> loading configuration file config.json from cache at /home/amoryo/data/.cache/huggingface/hub/models--google--byt5-small/snapshots/68377bdc18a2ffec8a0533fef03b1c513a4dd49d/config.json
[INFO|configuration_utils.py:800] 2025-02-14 17:13:09,414 >> Model config T5Config {
  "_name_or_path": "/home/patrick/t5/byt5-small",
  "architectures": [
    "T5ForConditionalGeneration"
  ],
  "classifier_dropout": 0.0,
  "d_ff": 3584,
  "d_kv": 64,
  "d_model": 1472,
  "decoder_start_token_id": 0,
  "dense_act_fn": "gelu_new",
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "gated-gelu",
  "gradient_checkpointing": false,
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "is_gated_act": true,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "num_decoder_layers": 4,
  "num_heads": 6,
  "num_layers": 12,
  "pad_token_id": 0,
  "relative_attention_max_distance": 128,
  "relative_attention_num_buckets": 32,
  "tie_word_embeddings": false,
  "tokenizer_class": "ByT5Tokenizer",
  "transformers_version": "4.44.2",
  "use_cache": true,
  "vocab_size": 384
}

[INFO|modeling_utils.py:3678] 2025-02-14 17:13:09,537 >> loading weights file pytorch_model.bin from cache at /home/amoryo/data/.cache/huggingface/hub/models--google--byt5-small/snapshots/68377bdc18a2ffec8a0533fef03b1c513a4dd49d/pytorch_model.bin
[INFO|configuration_utils.py:1038] 2025-02-14 17:13:17,346 >> Generate config GenerationConfig {
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0
}

[INFO|modeling_utils.py:4507] 2025-02-14 17:13:17,403 >> All model checkpoint weights were used when initializing T5ForConditionalGeneration.

[INFO|modeling_utils.py:4515] 2025-02-14 17:13:17,403 >> All the weights of T5ForConditionalGeneration were initialized from the model checkpoint at google/byt5-small.
If your task is similar to the task the model of the checkpoint was trained on, you can already use T5ForConditionalGeneration for predictions without further training.
[INFO|configuration_utils.py:993] 2025-02-14 17:13:17,534 >> loading configuration file generation_config.json from cache at /home/amoryo/data/.cache/huggingface/hub/models--google--byt5-small/snapshots/68377bdc18a2ffec8a0533fef03b1c513a4dd49d/generation_config.json
[INFO|configuration_utils.py:1038] 2025-02-14 17:13:17,534 >> Generate config GenerationConfig {
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0
}

[INFO|modeling_utils.py:4507] 2025-02-14 17:13:17,575 >> All model checkpoint weights were used when initializing MultiModalEmbedderModel.

[INFO|modeling_utils.py:4515] 2025-02-14 17:13:17,575 >> All the weights of MultiModalEmbedderModel were initialized from the model checkpoint at /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use MultiModalEmbedderModel for predictions without further training.
[INFO|configuration_utils.py:991] 2025-02-14 17:13:17,599 >> loading configuration file /scratch/amoryo/tmp/signwriting-transcription/results/signwriting_transcription_model/trained_model/generation_config.json
[INFO|configuration_utils.py:1038] 2025-02-14 17:13:17,600 >> Generate config GenerationConfig {
  "eos_token_id": 1,
  "pad_token_id": 0
}

WARNING:multimodalhugs.tasks.run_translation:label_smoothing is enabled but the `prepare_decoder_input_ids_from_labels` method is not defined for `MultiModalEmbedderModel`. This will lead to loss being calculated twice and will take up more memory
train_dataset: Dataset({
    features: ['source', 'source_start', 'source_end', 'source_prompt', 'generation_prompt', 'output_text'],
    num_rows: 96404
})
[WARNING|trainer.py:598] 2025-02-14 17:13:23,154 >> max_steps is given, it will override any value given in num_train_epochs
[INFO|trainer.py:648] 2025-02-14 17:13:23,154 >> Using auto half precision backend
INFO:multimodalhugs.tasks.run_translation:
MultiModalEmbedderModel(
  (vl_mapper): VLMapper(
    (layer_norm_before): LayerNorm((534,), eps=1e-05, elementwise_affine=True)
    (mapping_layer): Linear(in_features=534, out_features=1472, bias=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (special_tokens_embeddings): SpecialTokensEmbeddings(
    (special_tokens_embeddings): CustomEmbedding(
      (old_embeddings): Embedding(384, 1472)
      (new_embeddings): Embedding(11, 1472)
    )
  )
  (backbone): T5ForConditionalGeneration(
    (shared): Embedding(384, 1472)
    (encoder): T5Stack(
      (embed_tokens): Embedding(384, 1472)
      (block): ModuleList(
        (0): T5Block(
          (layer): ModuleList(
            (0): T5LayerSelfAttention(
              (SelfAttention): T5Attention(
                (q): Linear(in_features=1472, out_features=384, bias=False)
                (k): Linear(in_features=1472, out_features=384, bias=False)
                (v): Linear(in_features=1472, out_features=384, bias=False)
                (o): Linear(in_features=384, out_features=1472, bias=False)
                (relative_attention_bias): Embedding(32, 6)
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (1): T5LayerFF(
              (DenseReluDense): T5DenseGatedActDense(
                (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                (wo): Linear(in_features=3584, out_features=1472, bias=False)
                (dropout): Dropout(p=0.1, inplace=False)
                (act): NewGELUActivation()
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
        (1-11): 11 x T5Block(
          (layer): ModuleList(
            (0): T5LayerSelfAttention(
              (SelfAttention): T5Attention(
                (q): Linear(in_features=1472, out_features=384, bias=False)
                (k): Linear(in_features=1472, out_features=384, bias=False)
                (v): Linear(in_features=1472, out_features=384, bias=False)
                (o): Linear(in_features=384, out_features=1472, bias=False)
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (1): T5LayerFF(
              (DenseReluDense): T5DenseGatedActDense(
                (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                (wo): Linear(in_features=3584, out_features=1472, bias=False)
                (dropout): Dropout(p=0.1, inplace=False)
                (act): NewGELUActivation()
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (final_layer_norm): T5LayerNorm()
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (decoder): T5Stack(
      (embed_tokens): Embedding(384, 1472)
      (block): ModuleList(
        (0): T5Block(
          (layer): ModuleList(
            (0): T5LayerSelfAttention(
              (SelfAttention): T5Attention(
                (q): Linear(in_features=1472, out_features=384, bias=False)
                (k): Linear(in_features=1472, out_features=384, bias=False)
                (v): Linear(in_features=1472, out_features=384, bias=False)
                (o): Linear(in_features=384, out_features=1472, bias=False)
                (relative_attention_bias): Embedding(32, 6)
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (1): T5LayerCrossAttention(
              (EncDecAttention): T5Attention(
                (q): Linear(in_features=1472, out_features=384, bias=False)
                (k): Linear(in_features=1472, out_features=384, bias=False)
                (v): Linear(in_features=1472, out_features=384, bias=False)
                (o): Linear(in_features=384, out_features=1472, bias=False)
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (2): T5LayerFF(
              (DenseReluDense): T5DenseGatedActDense(
                (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                (wo): Linear(in_features=3584, out_features=1472, bias=False)
                (dropout): Dropout(p=0.1, inplace=False)
                (act): NewGELUActivation()
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
        (1-3): 3 x T5Block(
          (layer): ModuleList(
            (0): T5LayerSelfAttention(
              (SelfAttention): T5Attention(
                (q): Linear(in_features=1472, out_features=384, bias=False)
                (k): Linear(in_features=1472, out_features=384, bias=False)
                (v): Linear(in_features=1472, out_features=384, bias=False)
                (o): Linear(in_features=384, out_features=1472, bias=False)
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (1): T5LayerCrossAttention(
              (EncDecAttention): T5Attention(
                (q): Linear(in_features=1472, out_features=384, bias=False)
                (k): Linear(in_features=1472, out_features=384, bias=False)
                (v): Linear(in_features=1472, out_features=384, bias=False)
                (o): Linear(in_features=384, out_features=1472, bias=False)
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (2): T5LayerFF(
              (DenseReluDense): T5DenseGatedActDense(
                (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                (wo): Linear(in_features=3584, out_features=1472, bias=False)
                (dropout): Dropout(p=0.1, inplace=False)
                (act): NewGELUActivation()
              )
              (layer_norm): T5LayerNorm()
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (final_layer_norm): T5LayerNorm()
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (lm_head): Linear(in_features=1472, out_features=384, bias=False)
  )
)

INFO:multimodalhugs.tasks.run_translation:
Model Summary:
+--------------------------------+-------------------+---------------------------+
| Module Name                    | N_parameters      | N_training_parameters     |
+--------------------------------+-------------------+---------------------------+
| vl_mapper                      |           788,588 |                   788,588 |
| special_tokens_embeddings      |           581,440 |                   581,440 |
| backbone                       |       299,072,512 |               299,072,512 |
+--------------------------------+-------------------+---------------------------+

checkpoint: None
[INFO|trainer.py:811] 2025-02-14 17:13:23,349 >> The following columns in the training set don't have a corresponding argument in `MultiModalEmbedderModel.forward` and have been ignored: source, source_start, output_text, generation_prompt, source_end, source_prompt. If source, source_start, output_text, generation_prompt, source_end, source_prompt are not expected by `MultiModalEmbedderModel.forward`,  you can safely ignore this message.
Traceback (most recent call last):
  File "/data/amoryo/conda/envs/multimodalhugs/bin/multimodalhugs-train", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/multimodalhugs_cli/train.py", line 25, in main
    translation_main()
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/multimodalhugs/tasks/run_translation.py", line 715, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 1938, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 1967, in _inner_training_loop
    train_dataloader = self.get_train_dataloader()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 892, in get_train_dataloader
    train_dataset = self._remove_unused_columns(train_dataset, description="training")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/amoryo/conda/envs/multimodalhugs/lib/python3.11/site-packages/transformers/trainer.py", line 820, in _remove_unused_columns
    raise ValueError(
ValueError: No columns in the dataset match the model's forward method signature. The following columns have been ignored: [source, source_start, output_text, generation_prompt, source_end, source_prompt]. Please check the dataset and model. You may need to set `remove_unused_columns=False` in `TrainingArguments`.
(multimodalhugs) amoryo@u20-chiivm0-604:~/sign-language/signwriting-transcription$ 
GerrySant (Owner) commented:

yeah, I think so...

AmitMY (Contributor, Author) commented Feb 15, 2025:

Solution: `multimodalhugs-train` should inject `remove_unused_columns=False` itself, since it always has to be set.
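
A minimal sketch of what that could look like in the CLI entry point, assuming the CLI parses `Seq2SeqTrainingArguments` via `HfArgumentParser` (the helper below and its return shape are assumptions, not the actual multimodalhugs code):

from transformers import HfArgumentParser, Seq2SeqTrainingArguments

def parse_training_args():
    # Hypothetical helper; the real CLI also parses model/data arguments.
    parser = HfArgumentParser(Seq2SeqTrainingArguments)
    training_args, remaining = parser.parse_args_into_dataclasses(
        return_remaining_strings=True
    )
    # Columns such as `source`, `source_prompt` and `output_text` are presumably
    # consumed by the processor/collator rather than by
    # `MultiModalEmbedderModel.forward`, so the Trainer must never prune them.
    training_args.remove_unused_columns = False
    return training_args, remaining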

AmitMY changed the title from "Training script fails" to "Training script fails: must add remove_unused_columns" on Feb 15, 2025
GerrySant (Owner) commented:

Now working on making use of the `config.training` arguments.
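
For reference, one possible direction, as a sketch only: assume the config is a YAML file whose `training` section mirrors `Seq2SeqTrainingArguments` fields (every name below is an assumption, not the actual implementation):

import yaml
from transformers import HfArgumentParser, Seq2SeqTrainingArguments

def training_args_from_config(config_path: str) -> Seq2SeqTrainingArguments:
    # Load config.yaml and map its `training` section onto the Trainer's
    # argument dataclass; the section must supply required fields such as
    # `output_dir`.
    with open(config_path) as f:
        config = yaml.safe_load(f)
    parser = HfArgumentParser(Seq2SeqTrainingArguments)
    (training_args,) = parser.parse_dict(
        config.get("training", {}), allow_extra_keys=True
    )
    return training_args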

Regarding specifying the wandb project:

export WANDB_PROJECT=my_project_name
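
(The log above shows `report_to=['wandb']`, so the Transformers W&B callback is already active; it reads the `WANDB_PROJECT` environment variable at run start and falls back to the project name `huggingface` if it is unset.)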
