When I run the following code on the colab, I run into some problems #25

Open
keyingfan opened this issue Dec 12, 2024 · 5 comments

@keyingfan
keyingfan commented Dec 12, 2024

When I run the following command on Colab, it fails with the error shown below.

!accelerate launch --mixed_precision="fp16" finetune_instruct_pix2pix.py \
  --pretrained_model_name_or_path="timbrooks/instruct-pix2pix" \
  --dataset_name="instruction-tuning-sd/cartoonization" \
  --use_ema \
  --enable_xformers_memory_efficient_attention \
  --resolution=256 --random_flip \
  --train_batch_size=2 --gradient_accumulation_steps=4 --gradient_checkpointing \
  --max_train_steps=15000 \
  --checkpointing_steps=5000 --checkpoints_total_limit=1 \
  --learning_rate=5e-05 --lr_warmup_steps=0 \
  --mixed_precision=fp16 \
  --val_image_url="https://hf.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png" \
  --validation_prompt="Generate a cartoonized version of the natural image" \
  --seed=42 \
  --output_dir="cartoonization-finetuned" \
  --report_to=wandb \
  --push_to_hub

Error:

  warn(f"Failed to load image Python extension: {e}")
The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
/usr/local/lib/python3.10/dist-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
  warn(f"Failed to load image Python extension: {e}")
2024-12-12 19:59:02.380421: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-12 19:59:02.399663: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-12 19:59:02.405402: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-12 19:59:02.419254: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-12 19:59:04.004323: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
12/12/2024 19:59:06 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: fp16

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_deprecation.py:131: FutureWarning: 'Repository' (from 'huggingface_hub.repository') is deprecated and will be removed from version '1.0'. Please prefer the http-based alternatives instead. Given its large adoption in legacy code, the complete removal is only planned on next major release.
For more details, please read https://huggingface.co/docs/huggingface_hub/concepts/git_vs_http.
  warnings.warn(warning_message, FutureWarning)
/content/instruction-tuned-sd/cartoonization-finetuned is already a clone of https://huggingface.co/xxxxx/cartoonization-finetuned. Make sure you pull the latest changes with `repo.git_pull()`.
12/12/2024 19:59:07 - WARNING - huggingface_hub.repository - /content/instruction-tuned-sd/cartoonization-finetuned is already a clone of https://huggingface.co/xxxxx/cartoonization-finetuned. Make sure you pull the latest changes with `repo.git_pull()`.
{'variance_type', 'timestep_spacing', 'rescale_betas_zero_snr', 'clip_sample_range'} was not found in config. Values will be initialized to default values.
{'mid_block_add_attention', 'latents_mean', 'force_upcast', 'shift_factor', 'scaling_factor', 'use_quant_conv', 'latents_std', 'use_post_quant_conv'} was not found in config. Values will be initialized to default values.
{'encoder_hid_dim', 'class_embeddings_concat', 'attention_type', 'conv_in_kernel', 'resnet_skip_time_act', 'time_cond_proj_dim', 'encoder_hid_dim_type', 'addition_time_embed_dim', 'time_embedding_dim', 'dropout', 'cross_attention_norm', 'conv_out_kernel', 'projection_class_embeddings_input_dim', 'transformer_layers_per_block', 'num_attention_heads', 'addition_embed_type_num_heads', 'timestep_post_act', 'reverse_transformer_layers_per_block', 'mid_block_only_cross_attention', 'resnet_out_scale_factor', 'addition_embed_type', 'time_embedding_type', 'time_embedding_act_fn'} was not found in config. Values will be initialized to default values.
Traceback (most recent call last):
  File "/content/instruction-tuned-sd/finetune_instruct_pix2pix.py", line 1137, in <module>
    main()
  File "/content/instruction-tuned-sd/finetune_instruct_pix2pix.py", line 652, in main
    dataset = load_dataset(
  File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 2129, in load_dataset
    builder_instance = load_dataset_builder(
  File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 1886, in load_dataset_builder
    builder_instance: DatasetBuilder = builder_cls(
  File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 342, in __init__
    self.config, self.config_id = self._create_builder_config(
  File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 590, in _create_builder_config
    raise ValueError(f"BuilderConfig {builder_config} doesn't have a '{key}' key.")
ValueError: BuilderConfig ParquetConfig(name='default', version=0.0.0, data_dir=None, data_files={'train': ['hf://datasets/instruction-tuning-sd/cartoonization@948d883aac8e449656494f9e8dd3254b3d102a8f/data/train-00000-of-00007-342bf1c5df43ea16.parquet', 'hf://datasets/instruction-tuning-sd/cartoonization@948d883aac8e449656494f9e8dd3254b3d102a8f/data/train-00001-of-00007-ce8cae1ac51b900d.parquet', 'hf://datasets/instruction-tuning-sd/cartoonization@948d883aac8e449656494f9e8dd3254b3d102a8f/data/train-00002-of-00007-0c0432cda6ca5870.parquet', 'hf://datasets/instruction-tuning-sd/cartoonization@948d883aac8e449656494f9e8dd3254b3d102a8f/data/train-00003-of-00007-6d3d7d54376e0dd5.parquet', 'hf://datasets/instruction-tuning-sd/cartoonization@948d883aac8e449656494f9e8dd3254b3d102a8f/data/train-00004-of-00007-98f0165b64247bae.parquet', 'hf://datasets/instruction-tuning-sd/cartoonization@948d883aac8e449656494f9e8dd3254b3d102a8f/data/train-00005-of-00007-c3ae3df137feae84.parquet', 'hf://datasets/instruction-tuning-sd/cartoonization@948d883aac8e449656494f9e8dd3254b3d102a8f/data/train-00006-of-00007-6ad3fcf037ee2a98.parquet']}, description=None, batch_size=None, columns=None, features=None, filters=None) doesn't have a 'use_auth_token' key.
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1168, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 763, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'finetune_instruct_pix2pix.py', '--pretrained_model_name_or_path=timbrooks/instruct-pix2pix', '--dataset_name=instruction-tuning-sd/cartoonization', '--use_ema', '--enable_xformers_memory_efficient_attention', '--resolution=256', '--random_flip', '--train_batch_size=2', '--gradient_accumulation_steps=4', '--gradient_checkpointing', '--max_train_steps=15000', '--checkpointing_steps=5000', '--checkpoints_total_limit=1', '--learning_rate=5e-05', '--lr_warmup_steps=0', '--mixed_precision=fp16', '--val_image_url=https://hf.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png', '--validation_prompt=Generate a cartoonized version of the natural image', '--seed=42', '--output_dir=cartoonization-finetuned', '--report_to=wandb', '--push_to_hub']' returned non-zero exit status 1.

@sayakpaul
Member

Please properly format the command and the error message.

@keyingfan
Author

> Please properly format the command and the error message.

Sorry, it's been edited.

@sayakpaul
Member

Can you check if load_dataset on the dataset related arguments is working as expected?

from datasets import load_dataset

dataset = load_dataset("instruction-tuning-sd/cartoonization", None, cache_dir=args.cache_dir,)

@keyingfan
Author

> Can you check if load_dataset on the dataset related arguments is working as expected?
>
> from datasets import load_dataset
>
> dataset = load_dataset("instruction-tuning-sd/cartoonization", None, cache_dir=args.cache_dir,)

Luckily, I've solved the problem.
I deleted line 656 (`use_auth_token=True,`) in finetune_instruct_pix2pix.py, and the error disappeared.
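
For anyone else hitting this: the traceback shows `use_auth_token` ending up in the builder config, which suggests the installed `datasets` version no longer treats it as a `load_dataset` argument. My understanding (an assumption, please check against your installed version) is that newer releases expect `token` instead. Deleting the line works here because the dataset is public. If you need authentication, a rough sketch of a version-tolerant alternative could look like this. `auth_kwargs` is a hypothetical helper, not part of the training script or of `datasets`:

```python
import inspect


def auth_kwargs(load_fn, token=True):
    """Return the auth keyword that load_fn explicitly declares, if any.

    Hypothetical helper: it only inspects the function signature, so it
    never forwards a keyword the function would pass on as a config key.
    """
    params = inspect.signature(load_fn).parameters
    if "token" in params:
        # Assumed to be the current spelling in newer `datasets` releases.
        return {"token": token}
    if "use_auth_token" in params:
        # Older spelling, used by the training script at line 656.
        return {"use_auth_token": token}
    # Neither keyword is declared: pass nothing rather than risk a ValueError.
    return {}


# Intended usage (not executed here; needs `datasets` and network access):
#   from datasets import load_dataset
#   dataset = load_dataset(
#       "instruction-tuning-sd/cartoonization",
#       **auth_kwargs(load_dataset),
#   )
```

Checking the signature rather than the package version avoids hard-coding a version number I am not certain of.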

@sayakpaul
Member

Feel free to open a PR :)
