Super-class guided Transformer for Zero-Shot Attribute Classification

Sehyung Kim*, Chanhyeong Yang*, Jihwan Park, Taehoon Song, Hyunwoo J. Kim†.

AAAI 2025


SugaFormer

This is the official implementation of the AAAI 2025 paper "Super-class guided Transformer for Zero-Shot Attribute Classification".


Environment Setting

git clone https://github.com/mlvlab/SugaFormer.git
cd SugaFormer
conda create -n sugaformer python==3.9
conda activate sugaformer
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
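
After installing, a quick sanity check (our suggestion, not part of the official setup) confirms that the CUDA-enabled PyTorch build is visible from the new environment:

# Optional: verify the CUDA-enabled PyTorch install before proceeding.
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"

On a machine with a working CUDA 11.7-compatible driver this should print 1.13.1+cu117, 11.7, and True.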

Dataset Preparation

To run experiments for VAW, you need both the images from the Visual Genome dataset and the annotation files. Follow the steps below:

  1. Download the Visual Genome images from the link.
  2. Download the annotation files for VAW experiments from the link.

Organize the Data

After downloading the Visual Genome images and annotation files, organize them into the following directory structure:

data/
└── vaw/
     ├── images/
     │   ├── VG_100K/
     │   └── VG_100K_2/
     │
     └── annotations/
         ├── train.json
         ├── test.json
         ├── ...
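
As a minimal sketch of how to arrange the downloads into this layout (the archive names images.zip, images2.zip, and annotations.zip below are placeholders; substitute the names of the files you actually downloaded):

# Placeholders: replace the archive names with the files you downloaded.
mkdir -p data/vaw/images data/vaw/annotations
unzip images.zip  -d data/vaw/images/            # produces VG_100K/
unzip images2.zip -d data/vaw/images/            # produces VG_100K_2/
unzip annotations.zip -d data/vaw/annotations/   # train.json, test.json, ...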

Training

VAW Fully-Supervised

Train the model in the fully-supervised setting:

./configs/vaw/train_fs.sh

VAW Zero-Shot (base2novel)

Train the model in the zero-shot setting:

./configs/vaw/train_zs.sh
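
Both training scripts are plain shell scripts; as a hedged example, they can be invoked directly with a single GPU selected through the standard CUDA_VISIBLE_DEVICES variable (the scripts themselves define the actual launch command and arguments):

chmod +x configs/vaw/*.sh                            # only needed if the executable bit was lost on clone
CUDA_VISIBLE_DEVICES=0 ./configs/vaw/train_fs.sh     # fully-supervised
CUDA_VISIBLE_DEVICES=0 ./configs/vaw/train_zs.sh     # zero-shot (base2novel)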

Evaluation

VAW Fully-Supervised

Evaluate the model in the fully-supervised setting:

./configs/vaw/eval_fs.sh

VAW Zero-Shot (base2novel)

Evaluate the model in the zero-shot setting:

./configs/vaw/eval_zs.sh

Acknowledgements

This repository is built upon the following works:

  • DETR (Facebook Research): the codebase we build upon, which serves as the foundation for our base model.

  • LAVIS (Salesforce): provides the pre-trained vision-language model (BLIP-2) that we use for feature extraction and knowledge transfer.

Contact

If you have any questions, please create an issue in this repository or contact us at [email protected].

Citation

If you find our work interesting, please consider giving a ⭐ and a citation.

@article{kim2025super,
  title={Super-class guided Transformer for Zero-Shot Attribute Classification},
  author={Kim, Sehyung and Yang, Chanhyeong and Park, Jihwan and Song, Taehoon and Kim, Hyunwoo J},
  journal={arXiv preprint arXiv:2501.05728},
  year={2025}
}