Associate Everything Detected: Facilitating Tracking-by-Detection to the Unknown


This repository is an official implementation of the paper Associate Everything Detected: Facilitating Tracking-by-Detection to the Unknown.

This repository is still under development; feel free to raise issues at any time.

Abstract

Multi-object tracking (MOT) emerges as a pivotal and highly promising branch in the field of computer vision. Classical closed-vocabulary MOT (CV-MOT) methods aim to track objects of predefined categories. Recently, some open-vocabulary MOT (OV-MOT) methods have successfully addressed the problem of tracking unknown categories. However, we found that the CV-MOT and OV-MOT methods each struggle to excel in the tasks of the other. In this paper, we present a unified framework, Associate Everything Detected (AED), that simultaneously tackles CV-MOT and OV-MOT by integrating with any off-the-shelf detector and supports unknown categories. Different from existing tracking-by-detection MOT methods, AED gets rid of prior knowledge (e.g. motion cues) and relies solely on highly robust feature learning to handle complex trajectories in OV-MOT tasks while keeping excellent performance in CV-MOT tasks. Specifically, we model the association task as a similarity decoding problem and propose a sim-decoder with an association-centric learning mechanism. The sim-decoder calculates similarities in three aspects: spatial, temporal, and cross-clip. Subsequently, association-centric learning leverages these threefold similarities to ensure that the extracted features are appropriate for continuous tracking and robust enough to generalize to unknown categories. Compared with existing powerful OV-MOT and CV-MOT methods, AED achieves superior performance on TAO, SportsMOT, and DanceTrack without any prior knowledge.

News 🔥

  • (2024/12/10) The demo using GroundingDINO + AED has been released. You can now track your own videos!
  • (2024/9/14) Our paper is available at arXiv.

Coming soon

  • Track on your own video.
  • Deploy AED using TensorRT.

Main Results

TAO Test Set

| Method | Training Data | Detector | Base-TETA | Base-AssocA | Novel-TETA | Novel-AssocA | URL |
|--------|---------------|----------|-----------|-------------|------------|--------------|-----|
| AED | TAO-train | RegionCLIP | 37.2 | 40.4 | 27.8 | 29.1 | ⬇️ |
| AED | TAO-train | Co-DETR | 54.8 | 54.1 | 48.9 | 51.8 | model |

SportsMOT Test Set

| Method | Training Data | HOTA | IDF1 | AssA | MOTA | URL |
|--------|---------------|------|------|------|------|-----|
| AED | TAO-train | 72.8 | 76.8 | 61.4 | 95.0 | |
| AED | SportsMOT-train | 77.0 | 80.0 | 68.1 | 95.1 | model |

DanceTrack Test Set

| Method | Training Data | HOTA | IDF1 | AssA | MOTA | URL |
|--------|---------------|------|------|------|------|-----|
| AED | TAO-train | 55.2 | 57.0 | 37.8 | 91.0 | |
| AED | DanceTrack-train | 66.6 | 69.7 | 54.3 | 92.2 | model |

Installation

The codebase is built on top of MOTRv2.

Requirements

  • Install PyTorch>=1.5.1 and torchvision>=0.6.1 (using conda is optional):
conda create -n aed python=3.8
conda activate aed
# pytorch installation please refer to https://pytorch.org/get-started/previous-versions/
# e.g. for cuda 11.3
conda install pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.3 -c pytorch -c conda-forge
  • Other Requirements
pip install -r requirements.txt
  • Build MultiScaleDeformableAttention
cd <AED_HOME>
cd ./models/ops
sh ./make.sh
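
After the build, you can optionally verify that the compiled extension imports cleanly. The module name below follows Deformable-DETR-style ops builds and is an assumption about this repo:

# optional sanity check of the compiled ops
python -c "import MultiScaleDeformableAttention; print('ops import OK')"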

Dataset preparation

It is recommended to symlink the dataset root to <AED_HOME>/data.
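
For example, a minimal sketch (the source path is a placeholder for wherever your datasets actually live):

# create <AED_HOME>/data as a symlink to your dataset root
ln -s /path/to/your/datasets <AED_HOME>/data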

TAO Dataset

  1. Please download TAO from here.
  2. Note that you need to fill in this form to request missing AVA and HACS videos in the TAO dataset.
  3. Convert TAO to COCO format and generate the TAO val & test v1 files following OVTrack, or simply download them from here.

SportsMOT Dataset

Please download SportsMOT from SportsMOT.

DanceTrack Dataset

Please download DanceTrack from DanceTrack.

Detection Results

We ran inference with two detectors, RegionCLIP and Co-DETR, and saved their detection results as JSON files.

For YOLOX, we take the SportsMOT and DanceTrack detection results from MixSort and MOTRv2, respectively.

All of the detection results can be downloaded from here.

Here are the details of the JSON files:

| JSON File | Dataset | Detector |
|-----------|---------|----------|
| TAO_Co-DETR_test.json | TAO (base + novel), test | Co-DETR (LVIS) |
| TAO_Co-DETR_train.json | TAO (base + novel), train | Co-DETR (LVIS) |
| TAO_Co-DETR_val.json | TAO (base + novel), val | Co-DETR (LVIS) |
| TAO_RegionCLIP_test.json | TAO (base + novel), test | RegionCLIP (regionclip_finetuned-lvis_rn50 + rpn_lvis_866_lsj) |
| TAO_RegionCLIP_train.json | TAO (base + novel), train | RegionCLIP (regionclip_finetuned-lvis_rn50 + rpn_lvis_866_lsj) |
| TAO_RegionCLIP_val.json | TAO (base + novel), val | RegionCLIP (regionclip_finetuned-lvis_rn50 + rpn_lvis_866_lsj) |
| YOLOX_DanceTrack_train_val_test.json | DanceTrack, train + val + test | YOLOX from MOTRv2 |
| YOLOX_SportsMOT_train_val_test.json | SportsMOT, train + val + test | YOLOX from MixSort |
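
To sanity-check a downloaded file, you can pretty-print its first lines; this assumes only that the file is valid JSON and sits under data/detections as in the layout below:

# peek at the head of a detection file
python -m json.tool data/detections/TAO_Co-DETR_val.json | head -n 20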

When the downloads are complete, the folder structure should look like this:

β”œβ”€β”€ configs
β”‚   β”œβ”€β”€ dancetrack.args
β”‚   β”œβ”€β”€ sportsmot.args
β”‚   └── tao.args
β”œβ”€β”€ data
β”‚   β”œβ”€β”€ DanceTrack
β”‚   β”‚   β”œβ”€β”€ dancetrack_url.xlsx
β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”‚   β”‚   β”œβ”€β”€ dancetrack0003
β”‚   β”‚   β”‚   └── ...
β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   β”œβ”€β”€ dancetrack0001
β”‚   β”‚   β”‚   └── ...
β”‚   β”‚   └── val
β”‚   β”‚       β”œβ”€β”€ dancetrack0004
β”‚   β”‚       └── ...
β”‚   β”œβ”€β”€ detections
β”‚   β”‚   β”œβ”€β”€ TAO_Co-DETR_test.json
β”‚   β”‚   └── ...
β”‚   β”œβ”€β”€ SportsMOT
β”‚   β”‚   β”œβ”€β”€ dataset
β”‚   β”‚   β”‚   β”œβ”€β”€ annotations
β”‚   β”‚   β”‚   β”œβ”€β”€ test
β”‚   β”‚   β”‚   β”œβ”€β”€ train
β”‚   β”‚   β”‚   └── val
β”‚   β”‚   └── splits_txt
β”‚   β”‚       β”œβ”€β”€ basketball.txt
β”‚   β”‚       β”œβ”€β”€ football.txt
β”‚   β”‚       β”œβ”€β”€ test.txt
β”‚   β”‚       β”œβ”€β”€ train.txt
β”‚   β”‚       β”œβ”€β”€ val.txt
β”‚   β”‚       └── volleyball.txt
β”‚   └── TAO
β”‚       β”œβ”€β”€ annotations
β”‚       β”‚   β”œβ”€β”€ checksums
β”‚       β”‚   β”œβ”€β”€ README.md
β”‚       β”‚   β”œβ”€β”€ tao_test_burst_v1.json
β”‚       β”‚   β”œβ”€β”€ train_ours_v1.json
β”‚       β”‚   β”œβ”€β”€ validation_ours_v1.json
β”‚       β”‚   └── ...
β”‚       └── frames
β”‚           β”œβ”€β”€ test
β”‚           β”œβ”€β”€ train
β”‚           └── val
└── ...

Training

Download the COCO-pretrained weights from here (Deformable DETR + iterative bounding box refinement) first, then put the downloaded weights into <AED_HOME>/pretrained. Please make sure --pretrained, --mot_path, --train_det_path, and --val_det_path are set to the correct absolute paths.
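
For reference, the path-related lines of a config such as configs/tao.args might look like the sketch below; the flag names come from this section, while the one-flag-per-line layout and the absolute paths are assumptions for illustration:

--pretrained /home/user/AED/pretrained/r50_deformable_detr_plus_iterative_bbox_refinement-checkpoint.pth
--mot_path /home/user/AED/data
--train_det_path /home/user/AED/data/detections/TAO_Co-DETR_train.json
--val_det_path /home/user/AED/data/detections/TAO_Co-DETR_val.json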

# TAO
cd <AED_HOME>
# e.g. bash ./tools/train_tao.sh configs/tao.args 0
bash tools/train_tao.sh [config path] [GPU index]
# SportsMOT
bash tools/train_sportsmot.sh [config path] [GPU index]
# DanceTrack
bash tools/train_dancetrack.sh [config path] [GPU index]

Multi-GPU training is not supported yet. After training, the results are saved in <AED_HOME>/exps/[dataset name].

Inference

Put the downloaded weights into <AED_HOME>/pretrained like:

pretrained
β”œβ”€β”€ dancetrack_ckpt_train.pth
β”œβ”€β”€ r50_deformable_detr_plus_iterative_bbox_refinement-checkpoint.pth
β”œβ”€β”€ sportsmot_ckpt_train.pth
└── tao_ckpt_train_base.pth

Start inference:

cd <AED_HOME>
# TAO
# e.g. bash tools/inference_tao.sh pretrained/tao_ckpt_train_base.pth configs/tao.args test 0
# Remember to choose the right --val_det_path in the config to specify a detector.
bash tools/inference_tao.sh [checkpoint path] [config path] [split (val / test)] [GPU index]
# SportsMOT
bash tools/inference_sportsmot.sh [checkpoint path] [config path] [split (val / test)] [GPU index]
# DanceTrack
bash tools/inference_dancetrack.sh [checkpoint path] [config path] [split (val / test)] [GPU index]

After inference, the results are saved in <AED_HOME>/exps/[dataset name]_infer_results.

For SportsMOT and DanceTrack, you can upload the results to CodaLab to get the final score.

Evaluations (Optional)

TAO

cd <AED_HOME>
# e.g. python tools/eval_tao.py --ann_file ./data/TAO/annotations/validation_ours_v1.json --res_path exps/tao_infer_results/infer1/inference_result/infer_result.json
python tools/eval_tao.py --ann_file path_to_annotations --res_path path_to_results

SportsMOT & DanceTrack

Use TrackEval for evaluation (val set).

# move to the path of AED
cd <AED_HOME>
# e.g. 
# bash tools/eval_sportsMOT.sh \
# ./data/SportsMOT/dataset/val \
# ./data/SportsMOT/splits_txt/val.txt \
# exps/sportsmot_infer_results/infer1/result_txt \
# exps/sportsmot_infer_results/infer1
bash tools/eval_dancetrack.sh [GT path] [split txt path] [result_txt path] [output path]
bash tools/eval_sportsMOT.sh [GT path] [split txt path] [result_txt path] [output path]

The split txt files for DanceTrack can be found here.

Demo (AED + GroundingDINO)

Install GroundingDINO following the GroundingDINO repository:

cd <AED_HOME>
cd GroundingDINO
# set the CUDA_HOME, e.g. /usr/local/cuda
export CUDA_HOME=/usr/local/cuda
pip install -e .
cd <AED_HOME>
mkdir -p pretrained
cd pretrained
wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth

Run the demo:

cd <AED_HOME>
# e.g. bash tools/run_demo.sh configs/demo.args 0
bash tools/run_demo.sh configs/demo.args [GPU index]

You can set the --text_prompt argument in configs/demo.args, following GroundingDINO, to track other categories.
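
GroundingDINO expects category names separated by " . " in the prompt, so a hypothetical entry in configs/demo.args could look like:

--text_prompt "person . dog . basketball ."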

Acknowledgements & Citation

We would like to express our sincere gratitude to the following works (in no particular order): MOTRv2, OVTrack, QDTrack, RegionCLIP, Co-DETR, YOLOX, and GroundingDINO.

If you find this work useful, please consider citing our paper:

@article{fang2024associate,
  title={Associate Everything Detected: Facilitating Tracking-by-Detection to the Unknown},
  author={Fang, Zimeng and Liang, Chao and Zhou, Xue and Zhu, Shuyuan and Li, Xi},
  journal={arXiv preprint arXiv:2409.09293},
  year={2024}
}
