This repository provides QAT fine-tuning for yolov5 using TensorRT's pytorch_quantization tool.
Using a docker environment is recommended.
Download docker image:
docker pull longxiaowyh/yolov5:v2.0
Create docker container:
nvidia-docker run -itu root:root --name yolov5 --gpus all -v /your_path:/target_path -v /tmp/.X11-unix/:/tmp/.X11-unix/ -e DISPLAY=unix$DISPLAY -e GDK_SCALE -e GDK_DPI_SCALE -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility --shm-size=64g yolov5:v2.0 /bin/bash
1. Clone and apply patch
git clone [email protected]:yhwang-hub/yolov7_quantization.git
2. Install dependencies
pip install pytorch-quantization --extra-index-url
3. Prepare coco dataset
├── annotations
│   ├── captions_train2017.json
│   ├── captions_val2017.json
│   ├── instances_train2017.json
│   ├── instances_val2017.json
│   ├── person_keypoints_train2017.json
│   └── person_keypoints_val2017.json
├── coco -> coco
├── coco128
│   ├── images
│   ├── labels
│   └── README.txt
├── images
│   ├── train2017
│   └── val2017
├── labels
│   ├── train2017
│   ├── train2017.cache
│   └── val2017
├── train2017.cache
├── train2017.txt
├── val2017.cache
└── val2017.txt
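Before launching the scripts, it can help to confirm that the dataset directory resembles the layout above. The following is a minimal sketch only: the helper name `missing_entries` is hypothetical, and the choice of which entries to check is an assumption, since the scripts may read more or fewer of these paths.

```python
from pathlib import Path

# Entries copied from the directory tree above.
# Which of them the scripts actually require is an assumption.
EXPECTED_ENTRIES = [
    "annotations/instances_val2017.json",
    "images/train2017",
    "images/val2017",
    "labels/train2017",
    "labels/val2017",
    "train2017.txt",
    "val2017.txt",
]

def missing_entries(coco_dir, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under coco_dir."""
    root = Path(coco_dir)
    return [entry for entry in expected if not (root / entry).exists()]
```

Calling `missing_entries("/home/wyh/disk/coco/")` returns an empty list when every listed entry is present, and otherwise the relative paths that were not found.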
python --weights ./weights/ --cocodir /home/wyh/disk/coco/ --batch_size 5 --save_ptq True --eval_origin --eval_ptq --sensitive True
Modify the ignore_layers parameter as follows:
parser.add_argument("--ignore_layers", type=str, default=r"model\.24\.m\.(.*)", help="regex")
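To see which layers a pattern like this would exclude from quantization, here is a minimal sketch. The layer names are hypothetical examples in the style of PyTorch module names, and matching with `re.match` is an assumption; the script may apply the pattern differently.

```python
import re

# Pattern taken from the --ignore_layers default above.
# A raw string keeps the backslashes literal for the regex engine.
IGNORE_PATTERN = r"model\.24\.m\.(.*)"

def should_ignore(layer_name, pattern=IGNORE_PATTERN):
    """Return True if the layer name matches the pattern from its start."""
    return re.match(pattern, layer_name) is not None

# Hypothetical layer names, for illustration only.
layer_names = ["model.0.conv", "model.24.m.0", "model.24.m.2", "model.240.m.0"]
ignored = [name for name in layer_names if should_ignore(name)]
print(ignored)
```

Escaping the dots matters here: with unescaped dots, a name such as `model.240.m.0` could be matched by accident.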
python --weights ./weights/ --cocodir /home/wyh/disk/coco/ --batch_size 5 --save_ptq True --eval_origin --eval_ptq --sensitive False
python --weights ./weights/ --cocodir /home/wyh/disk/coco/ --batch_size 5 --save_ptq True --save_qat True --eval_origin --eval_ptq --eval_qat
This script includes the steps below:
Insert Q&DQ nodes to get a fake-quant PyTorch model. The PyTorch quantization tool can insert Q/DQ nodes automatically. For the yolov7 model, however, automatic insertion cannot match PTQ performance, because in explicit mode (QAT mode) TensorRT uses the placement of the Q/DQ nodes to restrict the precision of the model. Some of the automatically added Q&DQ nodes cannot be fused with other layers, which causes extra, useless precision conversions. In our script, we found some rules and restrictions for yolov7: QDQ nodes are automatically analyzed and configured in a rule-based manner, so that their placement is optimal under TensorRT and all nodes run in INT8 (confirmed with the tool trt-engine-explorer, see scripts/). For details of this part, please refer to quantization/. For guidance on Q&DQ insertion, please refer to Guidance_of_QAT_performance_optimization.
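As an illustration of what a Q/DQ pair computes, here is a minimal sketch of symmetric per-tensor fake quantization in plain Python. The function name `fake_quantize` and its parameters are hypothetical; this is not the pytorch_quantization implementation, only the arithmetic that a quantize step followed by a dequantize step performs.

```python
def fake_quantize(values, amax, num_bits=8):
    """Quantize to signed integers, then dequantize back to float."""
    if amax <= 0:
        raise ValueError("amax must be positive")
    qmax = 2 ** (num_bits - 1) - 1           # 127 for INT8
    scale = amax / qmax                       # float step per integer level
    result = []
    for x in values:
        q = round(x / scale)                  # Q: nearest integer level
        q = max(-qmax, min(qmax, q))          # clamp to the representable range
        result.append(q * scale)              # DQ: back to float
    return result

# Values beyond amax are clipped to +/- amax.
print(fake_quantize([0.0, 0.5, 1.0, 3.0], amax=1.27))
```

The output stays in floating point, which is why the model is called fake-quant: training sees the rounding and clipping error, while TensorRT later replaces each Q/DQ pair with real INT8 computation where fusion is possible.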
PTQ calibration. After inserting Q&DQ nodes, we recommend running PTQ calibration first. In our experiments, Histogram (MSE) was the best PTQ calibration method for yolov7. Note: if you are satisfied with the PTQ result, you can skip QAT.
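The idea behind MSE calibration can be sketched as follows: try several candidate clipping ranges (amax) and keep the one whose fake-quantized values are closest to the originals in mean squared error. This simplified sketch searches over raw sample values rather than a collected histogram, and the names `quant_mse` and `calibrate_amax_mse` are hypothetical; it is not the library's calibrator.

```python
import random

def quant_mse(values, amax, num_bits=8):
    """Mean squared error between values and their fake-quantized versions."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = amax / qmax
    total = 0.0
    for x in values:
        q = max(-qmax, min(qmax, round(x / scale)))
        total += (x - q * scale) ** 2
    return total / len(values)

def calibrate_amax_mse(values, num_candidates=100):
    """Pick the candidate amax in (0, max|x|] with the lowest quantization MSE."""
    max_abs = max(abs(x) for x in values)
    if max_abs == 0:
        raise ValueError("all values are zero; amax is undefined")
    candidates = [max_abs * ((i + 1) / num_candidates) for i in range(num_candidates)]
    return min(candidates, key=lambda amax: quant_mse(values, amax))

# Synthetic activations for illustration only (fixed seed for repeatability).
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(2000)]
best_amax = calibrate_amax_mse(samples)
print(best_amax, quant_mse(samples, best_amax))
```

Because the largest absolute value is always among the candidates, the selected amax never yields a higher MSE than plain max calibration on the same samples.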
QAT training. After calibration, fine-tune the model with QAT. Once the accuracy is satisfactory, save the weights to files.