Cosine LR decay doesn't seem to work #81

Open
paul-lupu opened this issue Feb 10, 2025 · 0 comments
Hello! I tried switching to a cosine LR schedule to break past a plateau, but it doesn't seem to have had any effect. Am I missing something glaringly obvious?


accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 \
    hv_train_network.py --dit ckpts/mp_rank_00_model_states.pt \
    --blocks_to_swap 36 --save_every_n_steps 250 \
    --dataset_config character.toml --mixed_precision bf16 --xformers --split_attn \
    --optimizer_type adamw8bit --learning_rate 1e-4 --gradient_checkpointing \
    --max_data_loader_n_workers 1 --persistent_data_loader_workers \
    --network_module networks.lora --network_dim 32 --gradient_accumulation_steps 2 \
    --timestep_sampling shift --discrete_flow_shift 7.0 \
    --lr_warmup_steps 100 --lr_scheduler cosine \
    --lr_decay_steps 1000 --lr_scheduler_num_cycles 2 \
    --max_train_epochs 400 --seed 42 --network_weights output/dancing_characterg-step00003500.safetensors \
    --output_dir output --output_name dancing_characterg_2 --log_with tensorboard --logging_dir ./logs

From my understanding, I should have seen the learning rate oscillate in the second row of charts.

Image
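
For reference, here is a minimal sketch of the two curve shapes in question: a plain cosine decay (half a cosine period, so it only goes down after warmup) and a cosine with hard restarts (which jumps back up at each cycle boundary). The formulas follow the shape commonly used by warmup-plus-cosine schedulers. The function names and the total of 1100 steps are hypothetical, and none of this is taken from `hv_train_network.py`; whether `--lr_scheduler cosine` honors `--lr_scheduler_num_cycles` in that script is not established here.

```python
import math


def cosine_multiplier(step, warmup_steps, total_steps, num_cycles=0.5):
    """LR multiplier: linear warmup, then a single cosine decay to zero.

    With num_cycles=0.5 the curve covers half a cosine period, so it
    never rises again after warmup (no oscillation).
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * 2.0 * num_cycles * progress)))


def cosine_restarts_multiplier(step, warmup_steps, total_steps, num_cycles=2):
    """LR multiplier: linear warmup, then num_cycles cosine decays.

    Each cycle decays from 1 toward 0 and then restarts at 1, which is
    what produces a visibly oscillating LR chart.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0))))


if __name__ == "__main__":
    base_lr = 1e-4        # matches --learning_rate in the command above
    warmup_steps = 100    # matches --lr_warmup_steps
    total_steps = 1100    # hypothetical total, chosen only for illustration
    for step in (0, 50, 100, 350, 599, 600, 850, 1100):
        plain = base_lr * cosine_multiplier(step, warmup_steps, total_steps)
        restarts = base_lr * cosine_restarts_multiplier(step, warmup_steps, total_steps)
        print(f"step {step:5d}  plain={plain:.2e}  restarts={restarts:.2e}")
```

If the logged LR looks like the first curve (or stays flat), the run is not using a restart-style schedule, whatever the cycle count passed on the command line.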
