- Environment Setup
- Configuration Guide
- Training Process
- Image Generation
- Troubleshooting
- Best Practices
- Google Colab Pro+ (Recommended)
- NVIDIA GPU (T4 or better)
- Python 3.9+
- 15GB+ Disk Space
# Install core dependencies
!pip install torch==2.0.1+cu118 --index-url https://download.pytorch.org/whl/cu118
!pip install diffusers==0.19.3 transformers==4.31.0 accelerate==0.21.0
# Install additional utilities
!pip install xformers wandb safetensors
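After the installs, it is worth confirming that the CUDA build of PyTorch is the one the runtime actually picked up. A quick check using only standard version attributes, nothing project-specific:

```python
import torch
import diffusers
import transformers

# Verify the CUDA-enabled wheel is active and a GPU is attached to the runtime
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
print("diffusers:", diffusers.__version__, "| transformers:", transformers.__version__)
```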
from google.colab import drive
drive.mount('/content/drive')
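Once Drive is mounted, a quick check that the training images are actually reachable from the runtime can save a failed run later. A minimal sketch, using the `images_dir` path configured below:

```python
from pathlib import Path

# Should match images_dir in config.yaml
images_dir = Path("/content/drive/MyDrive/babanne-images")

image_files = sorted(
    p for p in images_dir.glob("*") if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
)
print(f"Found {len(image_files)} images in {images_dir}")
assert image_files, "No images found - check the Drive folder path before training"
```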
drive_mount_path: "/content/drive"
images_dir: "/content/drive/MyDrive/babanne-images"
lora_output_dir: "/content/drive/MyDrive/lora_output"
instance_prompt: "a photo of <myspecialstyle> lace fabric"
# ... other parameters
- Set `images_dir` to your Google Drive folder containing lace images
- Customize `instance_prompt` with your unique token
- Adjust training parameters based on GPU capacity:

train_batch_size: 1  # Reduce if OOM errors occur
resolution: 512      # 768 for higher quality (requires more VRAM)
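Before committing to a long run, the configuration can be loaded and sanity-checked from a notebook cell. A minimal sketch, assuming PyYAML and the keys shown above (the project's full config schema may contain more fields):

```python
from pathlib import Path

import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

# Catch the most common misconfigurations before training starts
assert Path(cfg["images_dir"]).is_dir(), f"images_dir not found: {cfg['images_dir']}"
assert "<" in cfg["instance_prompt"], "instance_prompt should contain your unique token"
Path(cfg["lora_output_dir"]).mkdir(parents=True, exist_ok=True)

print("batch size:", cfg.get("train_batch_size"), "| resolution:", cfg.get("resolution"))
```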
python main.py --mode train
Mounting Google Drive...
Starting LoRA training...
Loading base model: stabilityai/stable-diffusion-xl-base-1.0
Creating annotations for 250 images...
Step 100, Loss: 0.1245
Saved checkpoint at step 500
Training completed successfully!
- Check loss values decreasing over time
- Verify checkpoint saving
- Monitor GPU memory usage (nvidia-smi)
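GPU memory can also be polled from a separate notebook cell while training runs, which avoids switching to a terminal for `nvidia-smi`. A minimal sketch (illustrative only, not part of main.py):

```python
import subprocess

# Ask the driver for used/total memory; works no matter which process holds the GPU
result = subprocess.run(
    ["nvidia-smi", "--query-gpu=memory.used,memory.total", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
print("GPU memory (used, total):", result.stdout.strip())
```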
python main.py --mode inference
- `refined_output.png` in your `lora_output_dir`
- Multiple versions with timestamps if run repeatedly
Modify `instance_prompt` in `config.yaml`:
instance_prompt: "close-up of <myspecialstyle> lace pattern with gold threads"
1. CUDA Out of Memory
# Solutions:
- Reduce batch_size in config.yaml
- Lower resolution to 512
- Enable memory optimizations:
```python
# Apply after loading the pipeline; requires xformers to be installed
pipe.enable_xformers_memory_efficient_attention()
# Offloads idle model components to CPU (slower inference, much lower VRAM use)
pipe.enable_model_cpu_offload()
```
2. Missing Dependencies
# Fix missing packages
!pip install [missing-package]
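To see in one pass which of the packages from the setup step are missing, rather than chasing import errors one by one, a quick check:

```python
import importlib.util

packages = ["torch", "diffusers", "transformers", "accelerate", "xformers", "wandb", "safetensors"]
missing = [pkg for pkg in packages if importlib.util.find_spec(pkg) is None]
print("Missing packages:", missing or "none")
```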
3. Poor Generation Quality
- Increase training steps (2000-5000)
- Use higher quality source images
- Experiment with different learning rates (1e-5 to 1e-4)
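One way to run the learning-rate experiment is to rewrite config.yaml and relaunch training in a loop. A rough sketch, assuming PyYAML and a `learning_rate` key in the config (that key name is hypothetical; use whatever main.py actually expects):

```python
import subprocess

import yaml

for lr in (1e-5, 5e-5, 1e-4):
    with open("config.yaml") as f:
        cfg = yaml.safe_load(f)
    cfg["learning_rate"] = lr  # hypothetical key
    cfg["lora_output_dir"] = f"/content/drive/MyDrive/lora_output/lr_{lr:.0e}"
    with open("config.yaml", "w") as f:
        yaml.safe_dump(cfg, f, sort_keys=False)
    subprocess.run(["python", "main.py", "--mode", "train"], check=True)
```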
- Use 200-300 high-quality JPEG images
- Maintain consistent image dimensions
- Use descriptive prompts with unique token
- Start with 1000 training steps, increase gradually
- Try different refiner strengths (0.2-0.5)
- Experiment with guidance scales (5-15)
- Combine with negative prompts:
negative_prompt: "blurry, low quality, duplicate"
# config.yaml optimizations
mixed_precision: "fp16" # For modern GPUs
gradient_checkpointing: true
use_xformers: true
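These flags usually translate into a handful of calls inside the training script. A rough sketch of how they are commonly wired up with accelerate and diffusers (illustrative only; main.py's actual implementation may differ):

```python
from accelerate import Accelerator
from diffusers import UNet2DConditionModel

# mixed_precision: "fp16" -> handled by the Accelerator
# (model, optimizer, and dataloader would later go through accelerator.prepare)
accelerator = Accelerator(mixed_precision="fp16")

unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet"
)

# gradient_checkpointing: true -> recompute activations in backprop to save memory
unet.enable_gradient_checkpointing()

# use_xformers: true -> memory-efficient attention kernels (requires xformers)
unet.enable_xformers_memory_efficient_attention()
```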