diff --git a/README.md b/README.md
index 253332ed4..d1d8c896f 100755
--- a/README.md
+++ b/README.md
@@ -1,18 +1,22 @@
-#  Multi-Channel Multi-Part Network for Person Re-identification
+#  Lightweight Multi-Branch Network for Person Re-Identification
 
-This repo support
+PyTorch implementation of the paper [Lightweight Multi-Branch Network for Person Re-Identification]
+<!-- (https://arxiv.org/).  -->
+
+![](/utils/LightMBN.png)
+
+This repo supports
 - [x] easy dataset preparation, including Market-1501, DukeMTMC-ReID, CUHK03, MOT17...
 - [x] sota deep neural networks and various options(tricks) for reid
 - [x] easy combination of different kinds of loss function
 - [x] end-to-end training and evaluation
 - [x] less package requirements
 
-
 List of functions
 - Warm up cosine annealing learning rate
 - Random erasing augmentation
 - Cutout augmentation
-- Batch Drop Block and Batch Erasing
+- Drop Block and Batch Erasing
 - Label smoothing(Cross Entropy loss)
 - Triplet loss
 - Multi-Simulatity loss
@@ -24,28 +28,35 @@ List of functions
 - BNNeck
 
 Inplemented networks:
-- Multi-Channel Multi-Part Network, which we proposed
+- Lightweight Multi-Branch Network (LightMBN), which we proposed
 - PCB [[link]](https://arxiv.org/pdf/1711.09349.pdf)
 - MGN [[link]](https://arxiv.org/abs/1804.01438)
 - Bag of tricks [[link]](http://openaccess.thecvf.com/content_CVPRW_2019/papers/TRMTMCT/Luo_Bag_of_Tricks_and_a_Strong_Baseline_for_Deep_Person_CVPRW_2019_paper.pdf)
 - OSNet [[link]](https://arxiv.org/abs/1905.00953)
 - Batch Drop Block(BDB) for Person ReID [[link]](https://arxiv.org/abs/1811.07130)
 
-
 ## Getting Started
 The designed code architecture is concise and easy explicable, where the file engine.py defines the train/ test process and main.py controls the overall epochs, and the folders model, loss, optimizer including respective parts of neural network.
 
-The user-friendly command-line module argparse helps us indicate different datasets, networks, loss functions, and tricks as we need, 
-the detailed options/configurations are described in the bottom of this page.
+The command-line module argparse lets us select different datasets, networks, loss functions, and tricks as needed; the detailed options/configurations are described at the bottom of this page.
 
-If you don't have any dataset yet, run `git clone https://github.com/jixunbo/ReIDataset.git` to download Market-1501, DukeMTMC, and MOT17.
+If you don't have any dataset yet, run 
+```
+git clone https://github.com/jixunbo/ReIDataset.git
+```
+to download Market-1501, DukeMTMC, CUHK03 and MOT17.
 
-To inplement Multi-Parts Multi-Channels Network with Multi-Similarity loss, run
+To implement our Lightweight Multi-Branch Network with Multi-Similarity loss, run
 
-`python [path to repo]/main.py --datadir [path to datasets] --data_train dukemtmc --data_test dukemtmc --model MCMP_n --batchid 8 --batchimage 6 --batchtest 32 --test_every 20 --epochs 110 --loss 0.5*CrossEntropy+0.5*MSLoss --margin 0.7 --nGPU 1 --lr 6e-4 --optimizer ADAM --random_erasing --feats 512 --pool avg --save '' --if_labelsmooth --w_cosine_annealing`
+```
+python [path to repo]/main.py --datadir [path to datasets] --data_train market1501 --data_test market1501 --model LMBN_n --batchid 6 --batchimage 8 --batchtest 32 --test_every 20 --epochs 110 --loss 0.5*CrossEntropy+0.5*MSLoss --margin 0.7 --nGPU 1 --lr 6e-4 --optimizer ADAM --random_erasing --feats 512 --save '' --if_labelsmooth --w_cosine_annealing
+```
 
 Also, using pre-defined config file
-`python [path to repo]/main.py --config [path to repo]/mcmp_config.yaml --save ''`
+
+```
+python [path to repo]/main.py --config [path to repo]/lmbn_config.yaml --save ''
+```
 
 All logs, results and parameters will be saved in folder 'experiment'.
 
@@ -57,40 +68,58 @@ Note that, the option '--datadir' is the dataset root, which contains folder Mar
 
 '--epochs' is the epochs we'd like to train, while '--test_every 10' means evaluation will be excuted in every 10 epochs, the parameters of network and optimizer are updated after every every evaluation. 
 
-Actually, for the MCMP model we have two kinds of backbone, MCMP_r we use ResNet 50 or ResNet 50 IBN as backbone, while MCMP_n is OSNet, OSNet contrains much less parameters but could achieve a little bit better performance than ResNet50.
-
-If you would like to re-inplement Bag of Tricks, run
-
-`python [path to repo]/main.py --datadir [path to datasets] --data_train Market1501 --data_test Market1501 --model ResNet50 --batchid 16 --batchimage 4 --batchtest 32 --test_every 10 --epochs 120 --save '' --decay_type step_40_70 --loss 0.5*CrossEntropy+0.5*Triplet --margin 0.3 --nGPU 1 --lr 3.5e-4 --optimizer ADAM --random_erasing --warmup 'linear' --if_labelsmooth`
+The LightMBN model comes with two backbones: LMBN_r uses ResNet50, while LMBN_n uses OSNet. OSNet contains far fewer parameters but achieves slightly better performance than ResNet50.
 
-or 
-
-`python [path to repo]/main.py --config [path to repo]/bag_of_tricks_config.yaml --save`
+### Results
+| Model | Market1501 | DukeMTMC-reID | CUHK03-D | CUHK03-L |
+| --- | -- | -- | --- | --- |
+| LightMBN(OSNet) | 96.3 (91.5) | 92.1 (83.7) | 85.4 (82.6) | 87.2 (85.1) |
+| LightMBN(ResNet) | 96.1 (90.4) | 90.5 (82.2) | 81.0 (79.2) | 85.2 (83.5) |
+| BoT | 94.2 (85.4) |  86.7 (75.8) |  |  |
+| PCB | 95.1 (86.3) |  87.6 (76.6) |  |  |
+| MGN | 94.7 (87.5) | 88.7 (79.4) |  |  |
 
-If you would like to re-inplement PCB with powerful training tricks, run
+Note: results are reported as Rank-1 (mAP). They are produced by our repo without re-ranking; models and configurations may differ from the original papers.
 
-`python [path to repo]/main.py --datadir [path to datasets] --data_train Market1501 --data_test Market1501 --model PCB --batchid 8 --batchimage 8 --batchtest 32 --test_every 10 --epochs 120 --save '' --decay_type step_50_80_110 --loss 0.5*CrossEntropy+0.5*MSLoss --margin 0.7 --nGPU 1 --lr 5e-3 --optimizer ADAM --random_erasing --warmup 'linear' --if_labelsmooth --bnneck --parts 3`
+Additionally, the evaluation metric is the same as in the Bag of Tricks [repo](https://github.com/michuanhaohao/reid-strong-baseline/blob/master/utils/reid_metric.py).
 
-Note that, the option '--parts' is used to set the number of stripes to be devided, original paper set 6.
 
-And also, for MGN model run
-
-`python [path to repo]/main.py --datadir [path to datasets] --data_train Market1501 --data_test Market1501 --model MGN --batchid 16 --batchimage 4 --batchtest 32 --test_every 10 --epochs 120 --save '' --decay_type step_50_80_110 --loss 0.5*CrossEntropy+0.5*Triplet --margin 1.2 --nGPU 1 --lr 2e-4 --optimizer ADAM --random_erasing --warmup 'linear' --if_labelsmooth`
+### Pre-trained models
+Pre-trained models and the corresponding config files can be found [here](https://1drv.ms/u/s!Ap1wlV4d0agrao4DxXe8loc_k30?e=I9PJXP).
 
 If you have pretrained model and config file, run
+```
+python [path to repo]/main.py --test_only --config [path to repo]/lmbn_config.yaml --pre_train [path to pretrained model]
+```
+to see the performance of the model.
 
-`python [path to repo]/main.py --test_only --config [path to repo]/mcmp_config.yaml --pre_train [path to pretrained model]` to see the performance of the model.
-
+If you would like to re-implement Bag of Tricks, run
+```
+python [path to repo]/main.py --datadir [path to datasets] --data_train market1501 --data_test market1501 --model ResNet50 --batchid 16 --batchimage 4 --batchtest 32 --test_every 10 --epochs 120 --save '' --decay_type step_40_70 --loss 0.5*CrossEntropy+0.5*Triplet --margin 0.3 --nGPU 1 --lr 3.5e-4 --optimizer ADAM --random_erasing --warmup 'linear' --if_labelsmooth
+```
+or 
+```
+python [path to repo]/main.py --config [path to repo]/bag_of_tricks_config.yaml --save ''
+```
 
+If you would like to re-implement PCB with powerful training tricks, run
+```
+python [path to repo]/main.py --datadir [path to datasets] --data_train Market1501 --data_test Market1501 --model PCB --batchid 8 --batchimage 8 --batchtest 32 --test_every 10 --epochs 120 --save '' --decay_type step_50_80_110 --loss 0.5*CrossEntropy+0.5*MSLoss --margin 0.7 --nGPU 1 --lr 5e-3 --optimizer SGD --random_erasing --warmup 'linear' --if_labelsmooth --bnneck --parts 3
+```
 
-[here](https://drive.google.com/open?id=1dIsI0b9kgytd02tl5cPLMBON7eHlyIA5) is the MCMP **pre-trained model** and config file.
+Note that the option '--parts' sets the number of stripes to divide into; the original paper uses 6.
 
+Similarly, for the MGN model, run
+```
+python [path to repo]/main.py --datadir [path to datasets] --data_train Market1501 --data_test Market1501 --model MGN --batchid 16 --batchimage 4 --batchtest 32 --test_every 10 --epochs 120 --save '' --decay_type step_50_80_110 --loss 0.5*CrossEntropy+0.5*Triplet --margin 1.2 --nGPU 1 --lr 2e-4 --optimizer ADAM --random_erasing --warmup 'linear' --if_labelsmooth
+```
 
+### Resume Training
 
 If you want to resume training process, we assume you have the checkpoint file 'model-latest.pth', run
-
-`python [path to repo]/main.py --config [path to repo]/mcmp_config.yaml --load [path to checkpoint]`
-
+```
+python [path to repo]/main.py --config [path to repo]/lmbn_config.yaml --load [path to checkpoint]
+```
 Of course, you can also set options individually using argparse command-line without config file.
 
 ## Easy Inplementation
@@ -101,34 +130,21 @@ Open this [notebook](https://colab.research.google.com/drive/14aRebdOqJSfNlwXiI5
 Please be sure that your are using Google's powerful GPU(Tesla P100 or T4).
 
 The whole training process(120 epochs) takes ~9 hours.
+If you are a hard-core player ^ ^ and you'd like to try different models or options, see Getting Started above.
 
-If you are hard-core player ^ ^ and you'd like to try different models or options, see Get Started as follows.
-
-### Results
-| Model | Market1501 | DukeMTMC-reID |
-| --- | -- | -- |
-| MCMP_n | 96.3 (91.3) |  92.0 (83.2) |
-| MCMP_r | 95.8 (90.4) |  90.5 (82.0) |
-| BoT | 94.2 (85.4) |  86.7 (75.8) |
-| PCB | 95.1 (86.3) |  87.6 (76.6) |
-| MGN | 94.7 (87.5) | 88.7 (79.4) |
-
-Note, Rank-1(mAP), the results are produced by our repo without re-ranking, models and configurations may differ from original paper.
-
-Additionally, the evaluation metric method is the same as bag of tricks [repo](https://github.com/michuanhaohao/reid-strong-baseline/blob/master/utils/reid_metric.py).
 
-### Option Description
+## Option Description
 '--nThread': type=int, default=4, number of threads for data loading.
 
 '--cpu', action='store_true', if raise, use cpu only.
 
 '--nGPU', type=int, default=1, number of GPUs.
 
-''--config', type=str, default="", config path,if you have config file,use to set options, you don't need to input any option again.
+'--config', type=str, default="", config path; if you have a config file, use it to set the options, and you don't need to input any option again.
 
- '--datadir', type=str, is the dataset root, which contains folder Market-1501, DukeMTMC-ReID etw..
+'--datadir', type=str, is the dataset root, which contains the folders Market-1501, DukeMTMC-ReID, etc.
 
-'--data_train' and '--data_test', type=str, specify the name of train/test dataset, which we can train on one dataset but test on another dataset, supported options: Market1501, DukeMTMC, MOT17, CUHK03.
+'--data_train' and '--data_test', type=str, specify the names of the train/test datasets; we can train on one dataset but test on another. Supported options: market1501, dukemtmc, MOT17, cuhk03_spilited (767/700 protocol).
 
 '--batchid 6' and '--batchimage 8': type=int, indicate that each batch contrains 6 persons, each person has 8 different images, totally 48 images.
 
@@ -146,7 +162,7 @@ Additionally, the evaluation metric method is the same as bag of tricks [repo](h
 
 '--epochs', type=int, is the epochs we'd like to train, while '--test_every 10' means evaluation will be excuted in every 10 epochs, the parameters of network and optimizer are updated after every every evaluation. 
 
-'--model', default='MGN', name of model, options: MCMP_n, MCMP_r,  ResNet50, PCB, MGN.
+'--model', default='LMBN_n', name of model, options: LMBN_n, LMBN_r, ResNet50, PCB, MGN, etc.
 
 '--loss', type=str, default='0.5\*CrossEntropy+0.5\*Triplet', you can combine different loss functions and corresponding weights, you can use only one loss function or 2 and more functions, e.g. '1\*CrossEntropy', '0.5\*CrossEntropy+0.5\*MSLoss+0.0005\*CenterLoss', options: CrossEntropy, Triplet, MSLoss, CenterLoss, Focal, GroupLoss.
 
@@ -166,9 +182,9 @@ Additionally, the evaluation metric method is the same as bag of tricks [repo](h
 
 ''--width', type=int, default=128, width of the input image.
 
-'--num_classes', type=int, default=751, number of classes of train dataset, but normally you don't need to set it, it'll be automatically setted.
+'--num_classes', type=int, default=751, number of classes of the train dataset; normally you don't need to set it, as it is set automatically depending on the dataset.
 
-'--lr', type=float, default=2e-4, initial learning rate.
+'--lr', type=float, default=6e-4, initial learning rate.
 
 '--gamma', type=float, default=0.1,learning rate decay factor for step decay.
 
@@ -190,7 +206,7 @@ Additionally, the evaluation metric method is the same as bag of tricks [repo](h
 
 '--cutout', action='store_true', if raise, use cutout augmentation.
 
-'--random_erasing', action='store_true', use random erasing augmentation.
+'--random_erasing', action='store_true', if raise, use random erasing augmentation.
 
 '--probability', type=float, default=0.5, probability of random erasing.
 
@@ -199,4 +215,4 @@ Additionally, the evaluation metric method is the same as bag of tricks [repo](h
 '--num_anchors', type=int, default=1, number of iterations of computing group loss.
 
 ### Acknowledgments
-The codes are expanded from [deep-person-reid](https://github.com/KaiyangZhou/deep-person-reid) and [MGN-pytorch](https://github.com/seathiefwang/MGN-pytorch).
+The code is built on top of [deep-person-reid](https://github.com/KaiyangZhou/deep-person-reid) and [MGN-pytorch](https://github.com/seathiefwang/MGN-pytorch). We thank the authors for sharing their code publicly.
diff --git a/data_v2/datamanager.py b/data_v2/datamanager.py
index 542438c9b..028f96753 100755
--- a/data_v2/datamanager.py
+++ b/data_v2/datamanager.py
@@ -150,7 +150,7 @@ def __init__(self, args):
         batch_size_test = args.batchtest
         workers = args.nThread
         train_sampler = 'random'
-        cuhk03_labeled = False
+        cuhk03_labeled = args.cuhk03_labeled
         cuhk03_classic_split = False
         market1501_500k = False
 
diff --git a/engine_v3.py b/engine_v3.py
index 8be13413a..74d9ff91e 100644
--- a/engine_v3.py
+++ b/engine_v3.py
@@ -1,9 +1,7 @@
-import os
 import torch
 import numpy as np
-from scipy.spatial.distance import cdist
-from utils.functions import cmc, mean_ap, cmc_baseline, eval_liaoxingyu
-from utils.re_ranking import re_ranking
+from utils.functions import evaluation
+from utils.re_ranking import re_ranking, re_ranking_gpu
 
 
 class Engine():
@@ -26,16 +24,12 @@ def __init__(self, args, model, optimizer, scheduler, loss, loader, ckpt):
 
         if torch.cuda.is_available():
             self.ckpt.write_log('[INFO] GPU: ' + torch.cuda.get_device_name(0))
-            # print(torch.backends.cudnn.benchmark)
 
         self.ckpt.write_log(
             '[INFO] Starting from epoch {}'.format(self.scheduler.last_epoch + 1))
 
-        # print(ckpt.log)
-        # print(self.scheduler._last_lr)
-
     def train(self):
-        # self.loss.step()
+
         epoch = self.scheduler.last_epoch
         lr = self.scheduler.get_last_lr()[0]
 
@@ -64,8 +58,6 @@ def train(self):
                 batch + 1, len(self.train_loader),
                 self.loss.display_loss(batch)),
                 end='' if batch + 1 != len(self.train_loader) else '\n')
-            # if batch == 0:
-            #     break
 
         self.scheduler.step()
         self.loss.end_log(len(self.train_loader))
@@ -77,8 +69,7 @@ def test(self):
         self.model.eval()
 
         self.ckpt.add_log(torch.zeros(1, 6))
-        # qf = self.extract_feature(self.query_loader,self.args).numpy()
-        # gf = self.extract_feature(self.test_loader,self.args).numpy()
+
         with torch.no_grad():
 
             qf, query_ids, query_cams = self.extract_feature(
@@ -87,45 +78,16 @@ def test(self):
                 self.test_loader, self.args)
 
         if self.args.re_rank:
-            q_g_dist = np.dot(qf, np.transpose(gf))
-            q_q_dist = np.dot(qf, np.transpose(qf))
-            g_g_dist = np.dot(gf, np.transpose(gf))
-            dist = re_ranking(q_g_dist, q_q_dist, g_g_dist)
+            # q_g_dist = np.dot(qf, np.transpose(gf))
+            # q_q_dist = np.dot(qf, np.transpose(qf))
+            # g_g_dist = np.dot(gf, np.transpose(gf))
+            # dist = re_ranking(q_g_dist, q_q_dist, g_g_dist)
+            dist = re_ranking_gpu(qf, gf, 20, 6, 0.3)
         else:
-            # dist = cdist(qf, gf,metric='cosine')
-
             # cosine distance
             dist = 1 - torch.mm(qf, gf.t()).cpu().numpy()
 
-            # m, n = qf.shape[0], gf.shape[0]
-            # dist = torch.pow(qf, 2).sum(dim=1, keepdim=True).expand(m, n) + \
-            #           torch.pow(gf, 2).sum(dim=1, keepdim=True).expand(n, m).t()
-            # dist.addmm_(1, -2, qf, gf.t())
-            # dist = dist.cpu().numpy()
-            # dist = np.dot(qf,np.transpose(gf))
-        # print('2')
-
-        # r = cmc(dist, self.queryset.ids, self.testset.ids, self.queryset.cameras, self.testset.cameras,
-        #         separate_camera_set=False,
-        #         single_gallery_shot=False,
-        #         first_match_break=True)
-        # m_ap = mean_ap(dist, self.queryset.ids, self.testset.ids,
-        #                self.queryset.cameras, self.testset.cameras)
-        # r = cmc(dist, query_label, gallery_label, query_cam, gallery_cam,
-        #         separate_camera_set=False,
-        #         single_gallery_shot=False,
-        #         first_match_break=True)
-        # m_ap = mean_ap(dist, query_label, gallery_label, query_cam, gallery_cam)
-        # r, m_ap = cmc_baseline(dist, query_label, gallery_label, query_cam, gallery_cam,
-        #         separate_camera_set=False,
-        #         single_gallery_shot=False,
-        #         first_match_break=True)
-        # r, m_ap = cmc_baseline(dist, query_ids, gallery_ids, query_cams, gallery_cams,
-        #                        separate_camera_set=False,
-        #                        single_gallery_shot=False,
-        #                        first_match_break=True)
-        # r,m_ap=eval_liaoxingyu(dist, query_label, gallery_label, query_cam, gallery_cam, 50)
-        r, m_ap = eval_liaoxingyu(
+        r, m_ap = evaluation(
             dist, query_ids, gallery_ids, query_cams, gallery_cams, 50)
 
         self.ckpt.log[-1, 0] = epoch
@@ -145,8 +107,7 @@ def test(self):
         )
 
         if not self.args.test_only:
-            # self.ckpt.save(self, epoch, is_best=(
-            #     self.ckpt.log[best[1][1], 0] == epoch))
+
             self._save_checkpoint(epoch, r[0], self.ckpt.dir, is_best=(
                 self.ckpt.log[best[1][1], 0] == epoch))
             self.ckpt.plot_map_rank(epoch)
@@ -173,35 +134,21 @@ def extract_feature(self, loader, args):
             outputs = self.model(input_img)
             f2 = outputs.data.cpu()
 
-            # else:
-            #     f1 = outputs[-1].data.cpu()
-            #     # flip
-            #     inputs = inputs.index_select(
-            #         3, torch.arange(inputs.size(3) - 1, -1, -1))
-            #     input_img = inputs.to(self.device)
-            #     outputs = self.model(input_img)
-            #     f2 = outputs[-1].data.cpu()
-
             ff = f1 + f2
             if ff.dim() == 3:
                 fnorm = torch.norm(
                     ff, p=2, dim=1, keepdim=True)  # * np.sqrt(ff.shape[2])
                 ff = ff.div(fnorm.expand_as(ff))
                 ff = ff.view(ff.size(0), -1)
-                # ff = ff.view(ff.size(0), -1)
-                # fnorm = torch.norm(ff, p=2, dim=1, keepdim=True)
-                # ff = ff.div(fnorm.expand_as(ff))
 
             else:
                 fnorm = torch.norm(ff, p=2, dim=1, keepdim=True)
                 ff = ff.div(fnorm.expand_as(ff))
-                # pass
-            # fnorm = torch.norm(ff, p=2, dim=1, keepdim=True)
-            # ff = ff.div(fnorm.expand_as(ff))
+
             features = torch.cat((features, ff), 0)
             pids.extend(pid)
             camids.extend(camid)
-            # print(features.shape)
+
         return features, np.asarray(pids), np.asarray(camids)
 
     def terminate(self):
diff --git a/mcmp_config.yaml b/lmbn_config.yaml
similarity index 82%
rename from mcmp_config.yaml
rename to lmbn_config.yaml
index b850ea2e8..cbbebda9d 100644
--- a/mcmp_config.yaml
+++ b/lmbn_config.yaml
@@ -1,9 +1,8 @@
 T: 3
 act: relu
-activation_map: false
 amsgrad: false
-batchid: 8
-batchimage: 6
+batchid: 6
+batchimage: 8
 batchtest: 32
 beta1: 0.9
 beta2: 0.999
@@ -11,14 +10,15 @@ bnneck: false
 config: ''
 cosine_annealing: false
 cpu: false
+cuhk03_labeled: false
 cutout: false
 dampening: 0
-data_test: dukemtmc
-data_train: dukemtmc
+data_test: market1501
+data_train: market1501
 datadir: /content/ReIDataset/
 decay_type: step_50_80_110
 drop_block: false
-epochs: 110
+epochs: 140
 epsilon: 1.0e-08
 feat_inference: after
 feats: 512
@@ -30,10 +30,11 @@ loss: 0.5*CrossEntropy+0.5*MSLoss
 lr: 0.0006
 lr_decay: 60
 margin: 0.7
-model: MCMP_n
+model: LMBN_n
 momentum: 0.9
 nGPU: 1
 nThread: 4
+nep_id: ''
 nesterov: false
 num_anchors: 1
 num_classes: 751
@@ -45,7 +46,7 @@ probability: 0.5
 random_erasing: true
 reset: false
 sampler: true
-test_every: 20
+test_every: 10
 w_cosine_annealing: true
 w_ratio: 1.0
 warmup: constant
diff --git a/loss/__init__.py b/loss/__init__.py
index a3c538dac..d5d363a07 100644
--- a/loss/__init__.py
+++ b/loss/__init__.py
@@ -20,7 +20,7 @@
 class LossFunction():
     def __init__(self, args, ckpt):
         super(LossFunction, self).__init__()
-        print('[INFO] Making loss...')
+        ckpt.write_log('[INFO] Making loss...')
 
         self.nGPU = args.nGPU
         self.args = args
@@ -29,17 +29,16 @@ def __init__(self, args, ckpt):
             weight, loss_type = loss.split('*')
             if loss_type == 'CrossEntropy':
                 if args.if_labelsmooth:
-                    # print(args.num_classes)
                     loss_function = CrossEntropyLabelSmooth(
                         num_classes=args.num_classes)
-                    # print('Label smooth on')
+                    ckpt.write_log('[INFO] Label Smoothing On.')
                 else:
                     loss_function = nn.CrossEntropyLoss()
             elif loss_type == 'Triplet':
                 loss_function = TripletLoss(args.margin)
             elif loss_type == 'GroupLoss':
                 loss_function = GroupLoss(
-                    T=args.T, num_classes=args.num_classes, num_anchors=args.num_anchors)
+                    total_classes=args.num_classes, max_iter=args.T, num_anchors=args.num_anchors)
             elif loss_type == 'MSLoss':
                 loss_function = MultiSimilarityLoss(margin=args.margin)
             elif loss_type == 'Focal':
@@ -50,17 +49,6 @@ def __init__(self, args, ckpt):
                 loss_function = CenterLoss(
                     num_classes=args.num_classes, feat_dim=args.feats)
 
-            # elif loss_type == 'Mix':
-            #     self.fl = FocalLoss(reduction='mean')
-            #     if args.if_labelsmooth:
-            #         self.ce = CrossEntropyLabelSmooth(
-            #             num_classes=args.num_classes)
-            #         print('Label smooth on')
-            #     else:
-            #         self.ce = nn.CrossEntropyLoss()
-
-            #     self.tri = TripletLoss(args.margin)
-
             self.loss.append({
                 'type': loss_type,
                 'weight': float(weight),
@@ -70,37 +58,17 @@ def __init__(self, args, ckpt):
         if len(self.loss) > 1:
             self.loss.append({'type': 'Total', 'weight': 0, 'function': None})
 
-        # for l in self.loss:
-        #     if l['function'] is not None:
-        #         print('{:.3f} * {}'.format(l['weight'], l['type']))
-        #         self.loss_module.append(l['function'])
-
         self.log = torch.Tensor()
-        # self.start_log()
-        # print(self.log,'kkkk')
-
-        # device = torch.device('cpu' if args.cpu else 'cuda')
-        # self.loss_module.to(device)
-
-        # # if args.load != '':
-        # #     self.load(ckpt.dir, cpu=args.cpu)
-        # if not args.cpu and args.nGPU > 1:
-        #     self.loss_module = nn.DataParallel(
-        #         self.loss_module, range(args.nGPU)
-        #     )
 
     def compute(self, outputs, labels):
         losses = []
-        # print(self.log, 'iiuu')
-        # print(self.loss,'oooooo')
+
         for i, l in enumerate(self.loss):
-            # print(i,'iiiiii')
             if l['type'] in ['CrossEntropy']:
 
                 if isinstance(outputs[0], list):
                     loss = [l['function'](output, labels)
                             for output in outputs[0]]
-                    # print(loss)
                 elif isinstance(outputs[0], torch.Tensor):
                     loss = [l['function'](outputs[0], labels)]
                 else:
@@ -110,18 +78,28 @@ def compute(self, outputs, labels):
                 loss = sum(loss)
                 effective_loss = l['weight'] * loss
                 losses.append(effective_loss)
-                # print(self.log,'llllog')
-                # print(self.log.device)
+
                 self.log[-1, i] += effective_loss.item()
 
             elif l['type'] in ['Triplet', 'MSLoss']:
-                # print('ppppppppp')
                 if isinstance(outputs[-1], list):
-                    # print('99999999')
                     loss = [l['function'](output, labels)
                             for output in outputs[-1]]
                 elif isinstance(outputs[-1], torch.Tensor):
-                    # print('6666666666')
+                    loss = [l['function'](outputs[-1], labels)]
+                else:
+                    raise TypeError(
+                        'Unexpected type: {}'.format(type(outputs[-1])))
+                loss = sum(loss)
+                effective_loss = l['weight'] * loss
+                losses.append(effective_loss)
+                self.log[-1, i] += effective_loss.item()
+
+            elif l['type'] in ['GroupLoss']:
+                if isinstance(outputs[-1], list):
+                    loss = [l['function'](output[0], labels, output[1])
+                            for output in zip(outputs[-1], outputs[0][:3])]
+                elif isinstance(outputs[-1], torch.Tensor):
                     loss = [l['function'](outputs[-1], labels)]
                 else:
                     raise TypeError(
@@ -176,8 +154,7 @@ def plot_loss(self, apath, epoch):
             label = '{} Loss'.format(l['type'])
             fig = plt.figure()
             plt.title(label)
-            # print(self.log[:, i].numpy(), label)
-            # print(axis)
+
             plt.plot(axis, self.log[:, i].numpy(), label=label)
             plt.legend()
             plt.xlabel('Epochs')
diff --git a/loss/grouploss.py b/loss/grouploss.py
index be2572ab7..049275b61 100644
--- a/loss/grouploss.py
+++ b/loss/grouploss.py
@@ -1,158 +1,307 @@
+"""The Group Loss for Deep Metric Learning
+
+Reference:
+Elezi et al. The Group Loss for Deep Metric Learning. ECCV 2020.
+
+Code adapted from https://github.com/dvl-tum/group_loss
+
+"""
+
+import torch.nn as nn
 import torch
-from torch import nn
 import torch.nn.functional as F
-
 import numpy as np
 
 
-class GroupLoss(nn.Module):
-    """Triplet loss with hard positive/negative mining.
+def dynamics(W, X, tol=1e-6, max_iter=5, mode='replicator', **kwargs):
+    """
+    Selector for dynamics
+    Input:
+    W:  the pairwise nxn similarity matrix (with zero diagonal)
+    X:  an (n,m)-array whose rows reside in the m-dimensional simplex
+    tol:  error tolerance
+    max_iter:  maximum number of iterations
+    mode: 'replicator' to run the replicator dynamics
+    """
+
+    if mode == 'replicator':
+        X = _replicator(W, X, tol, max_iter)
+    else:
+        raise ValueError('mode \'' + mode + '\' is not defined.')
 
-    Reference:
-    Hermans et al. In Defense of the Triplet Loss for Person Re-Identification. arXiv:1703.07737.
+    return X
 
-    Code imported from https://github.com/Cysu/open-reid/blob/master/reid/loss/triplet.py.
 
-    Args:
-        margin (float): margin for triplet.
+def _replicator(W, X, tol, max_iter):
+    """
+    Replicator Dynamics
+    Runs exactly max_iter update steps; tol is currently unused.
+    Output:
+    X:  the population(s) after max_iter iterations
+    (the iteration count and the reached precision are not returned)
     """
 
-    def __init__(self, T=10, num_classes=751, num_anchors=0):
-        super(GroupLoss, self).__init__()
+    i = 0
+    while i < max_iter:
+        X = X * torch.matmul(W, X)
+        X /= X.sum(dim=X.dim() - 1).unsqueeze(X.dim() - 1)
+        i += 1
 
-        self.T = T
-        self.num_classes = num_classes
+    return X
+
+
+class GroupLoss(nn.Module):
+    def __init__(self, total_classes, tol=-1., max_iter=5, num_anchors=3, tem=79, mode='replicator', device='cuda:0'):
+        super(GroupLoss, self).__init__()
+        self.m = total_classes
+        self.tol = tol
+        self.max_iter = max_iter
+        self.mode = mode
+        self.device = device
+        self.criterion = nn.NLLLoss().to(device)
         self.num_anchors = num_anchors
-        self.nllloss = nn.NLLLoss()
-        # self.cross_entropy=nn.CrossEntropyLoss()
-
-    def forward(self, features, X, targets):
-        """
-        Args:
-            inputs: feature matrix with shape (batch_size, feat_dim)
-            targets: ground truth labels with shape (num_classes)
-        """
-        n, m = X.size()
-        device = X.device
-        # compute pearson r
-        ff = features.clone().detach()
-        fnorm = torch.norm(ff, p=2, dim=1, keepdim=True)
-        ff = ff.div(fnorm.expand_as(ff)).cpu().numpy()
-        coef = np.corrcoef(ff)
-
-        # features_ = features.detach().cpu().numpy()
-        # coef = np.corrcoef(features_)
-
-        diago = np.arange(coef.shape[0])
-        coef[diago, diago] = 0
-        # W = F.relu(torch.tensor((coef - np.diag(np.diag(coef))),
-        #                         dtype=torch.float, device=device))
-        W = F.relu(torch.tensor(coef,
-                                dtype=torch.float, device=device))
-        # print(W,'wwwwwwwwwwww')
-        for i in range(n):
-            if torch.sum(W[i]) == 0:
-                # print(W,'wwwwwwwwwwww')
-
-                W[i, i] = 1
-                # print(W,'wwwwwwwwwwww')
-
-        # print(W,'wwwwwwwww')
-        X = F.softmax(X, dim=1)
-        # print(X)
-        # print(torch.argmax(X,dim=1))
-        # ramdom select anchors
-        ids = torch.unique(targets)
-        # num_samples = n / len(ids)
-        # print(X.dtype)
-        # print(targets)
-        # print(id(X))
-        # X_=X.clone().detach()
-        anchors = []
-        for id_ in ids:
-            anchor = list(np.random.choice(torch.where(targets == id_)[
-                0].cpu(), size=self.num_anchors, replace=False))
-            # print(id,'ididiid')
-            # print(torch.sum(X[anchors]))
-            # print(torch.argmax(X[anchors]))
-            anchors += anchor
-
-            # print(torch.argmax(X[anchors]))
-
-        # print(X[:20,:5],'xxxxxxx')
-        # print(id(X))
-        # print(torch.where(X==torch.max(X,dim=1)))
-
-        for i in range(self.T):
-            X_ = X.clone().detach()
-            X_[anchors] = torch.tensor(F.one_hot(
-                targets[anchors], self.num_classes), dtype=torch.float, device=device)
-            # print(i)
-            # print(X,'xxxxxxxxxxxx')
-            # print(X_,'---------')
-            Pi = torch.mm(W, X_)
-            # print(Pi)
-            # print(Pi, 'pipipi')
-
-            PX = torch.mul(X, Pi)
-
-            # X = F.normalize(PX, dim=1, p=1)
-
-            # print(PX,'pxpxpx')
-            # print(PX.shape)
-
-            # 111111111111111111111111
-            # Norm = np.sum(PX.detach().cpu().numpy(),
-            #               axis=1).reshape(-1)  # .expand(n,m)
-            # # print(Norm,'norm')
-            # Q = 1 / Norm
-            # # print(Q,'QQQQQQQQQ')
-            # Q = torch.diag(torch.tensor(Q, dtype=torch.float, device=device))
-
-            # 2222222222222222222222222
-            # denom = PX.detach().norm(p=1, dim=1, keepdim=True).clamp_min(1e-12).expand_as(PX)
-            # X=PX/denom
-
-            # 3333333333333333333333
-            # Q = torch.diag(1 / PX.norm(p=1, dim=1).clamp_min(1e-12))
-            Q = torch.diag(1 / PX.detach().norm(p=1, dim=1).clamp_min(1e-12))
-            X = torch.mm(Q, PX)
-
-            # 444444444444444444444444444444
-            # Q = torch.diag(1 / torch.matmul(
-            #     PX, torch.ones(m, dtype=torch.float, device=device)))
-            # print(Q,'qqqqq')
-            # X = torch.matmul(Q, PX)
-            # Q=torch.pow(Q,-1)
-            # print(X)
-
-            # 555555555555555555555555555555555555
-        # X = F.softmax(PX, dim=1)
-
-        # print(X.requires_grad)
-        loss = self.nllloss(torch.log(X.clamp_min(1e-12)), targets)
-
-        # loss= self.cross_entropy(X,targets)
+        self.temperature = tem
+
+    def _init_probs_uniform(self, labs, L, U):
+        """ Initialize the GTG probabilities from a uniform distribution """
+        n = len(L) + len(U)
+        ps = torch.zeros(n, self.m).to(self.device)
+        ps[U, :] = 1. / self.m
+        ps[L, labs] = 1.
+
+        # check if probs sum up to 1.
+        assert torch.allclose(ps.sum(dim=1), torch.ones(n).to(self.device))
+        return ps
+
+    def _init_probs_prior(self, probs, labs, L, U):
+        """ Initialize the probabilities from the softmax layer of the CNN """
+        n = len(L) + len(U)
+        ps = torch.zeros(n, self.m).to(self.device)
+        ps[U, :] = probs[U, :]
+        ps[L, labs] = 1.
+
+        # check if probs sum up to 1.
+        assert torch.allclose(ps.sum(dim=1), torch.ones(n).to(self.device))
+        return ps
+
+    def _init_probs_prior_only_classes(self, probs, labs, L, U, classes_to_use):
+        """ Variant of _init_probs_prior that considers only the classes in the minibatch;
+            might need tuning in order to reach the same performance as _init_probs_prior """
+        n = len(L) + len(U)
+        ps = torch.zeros(n, self.m).to(self.device)
+        ps[U, :] = probs[torch.meshgrid(
+            torch.tensor(U), torch.from_numpy(classes_to_use))]
+        ps[L, labs] = 1.
+        ps /= ps.sum(dim=ps.dim() - 1).unsqueeze(ps.dim() - 1)
+        return ps
+
+    def set_negative_to_zero(self, W):
+        return F.relu(W)
+
+    def _get_W(self, x):
+
+        x = (x - x.mean(dim=1).unsqueeze(1))
+        norms = x.norm(dim=1)
+        W = torch.mm(x, x.t()) / torch.ger(norms, norms)
+
+        W = self.set_negative_to_zero(W.to(self.device))
+        return W
+
+    def get_labeled_and_unlabeled_points(self, labels, num_points_per_class, num_classes=100):
+        labs, L, U = [], [], []
+        labs_buffer = np.zeros(num_classes)
+        num_points = labels.shape[0]
+        for i in range(num_points):
+            if labs_buffer[labels[i]] == num_points_per_class:
+                U.append(i)
+            else:
+                L.append(i)
+                labs.append(labels[i])
+                labs_buffer[labels[i]] += 1
+        return labs, L, U
+
+    def forward(self, fc7, labels, probs, classes_to_use=None):
+        # print(fc7)
+        # print(type(fc7))
+        # print(labels)
+        # print(type(labels))
+        # print(probs)
+        # print(type(probs))
+        probs = F.softmax(probs / self.temperature, dim=1) if probs is not None else None
+        labs, L, U = self.get_labeled_and_unlabeled_points(
+            labels, self.num_anchors, self.m)
+        W = self._get_W(fc7)
+        if probs is None:
+            ps = self._init_probs_uniform(labs, L, U)
+        else:
+            if classes_to_use is None:
+                ps = probs
+                ps = self._init_probs_prior(ps, labs, L, U)
+            else:
+                ps = probs
+                ps = self._init_probs_prior_only_classes(
+                    ps, labs, L, U, classes_to_use)
+        ps = dynamics(W, ps, self.tol, self.max_iter, self.mode)
+        probs_for_gtg = torch.log(ps + 1e-12)
+        loss = self.criterion(probs_for_gtg, labels)
         return loss
 
-        # #inputs = 1. * inputs / (torch.norm(inputs, 2, dim=-1, keepdim=True).expand_as(inputs) + 1e-12)
-        # # Compute pairwise distance, replace by the official when merged
-        # dist = torch.pow(inputs, 2).sum(dim=1, keepdim=True).expand(n, n)
-        # dist = dist + dist.t()
-        # dist.addmm_(1, -2, inputs, inputs.t())
-        # dist = dist.clamp(min=1e-12).sqrt()  # for numerical stability
-        # # For each anchor, find the hardest positive and negative
-        # mask = targets.expand(n, n).eq(targets.expand(n, n).t())
-        # print(mask[:8, :8])
-        # dist_ap, dist_an = [], []
-        # for i in range(n):
-        #     dist_ap.append(dist[i][mask[i]].max().unsqueeze(0))
-        #     dist_an.append(dist[i][mask[i] == 0].min().unsqueeze(0))
-        # dist_ap = torch.cat(dist_ap)
-        # dist_an = torch.cat(dist_an)
-        # # Compute ranking hinge loss
-        # y = torch.ones_like(dist_an)
-        # loss = self.ranking_loss(dist_an, dist_ap, y)
-        # if self.mutual:
-        #     return loss, dist
-        # return loss
+
+# import torch
+# from torch import nn
+# import torch.nn.functional as F
+
+# import numpy as np
+
+
+# class GroupLoss(nn.Module):
+#     """Triplet loss with hard positive/negative mining.
+
+#     Reference:
+#     Hermans et al. In Defense of the Triplet Loss for Person Re-Identification. arXiv:1703.07737.
+
+#     Code imported from https://github.com/Cysu/open-reid/blob/master/reid/loss/triplet.py.
+
+#     Args:
+#         margin (float): margin for triplet.
+#     """
+
+#     def __init__(self, T=10, num_classes=751, num_anchors=0):
+#         super(GroupLoss, self).__init__()
+
+#         self.T = T
+#         self.num_classes = num_classes
+#         self.num_anchors = num_anchors
+#         self.nllloss = nn.NLLLoss()
+#         # self.cross_entropy=nn.CrossEntropyLoss()
+
+#     def forward(self, features, X, targets):
+#         """
+#         Args:
+#             inputs: feature matrix with shape (batch_size, feat_dim)
+#             targets: ground truth labels with shape (num_classes)
+#         """
+#         n, m = X.size()
+#         device = X.device
+#         # compute pearson r
+#         ff = features.clone().detach()
+#         fnorm = torch.norm(ff, p=2, dim=1, keepdim=True)
+#         ff = ff.div(fnorm.expand_as(ff)).cpu().numpy()
+#         coef = np.corrcoef(ff)
+
+#         # features_ = features.detach().cpu().numpy()
+#         # coef = np.corrcoef(features_)
+
+#         diago = np.arange(coef.shape[0])
+#         coef[diago, diago] = 0
+#         # W = F.relu(torch.tensor((coef - np.diag(np.diag(coef))),
+#         #                         dtype=torch.float, device=device))
+#         W = F.relu(torch.tensor(coef,
+#                                 dtype=torch.float, device=device))
+#         # print(W,'wwwwwwwwwwww')
+#         for i in range(n):
+#             if torch.sum(W[i]) == 0:
+#                 # print(W,'wwwwwwwwwwww')
+
+#                 W[i, i] = 1
+#                 # print(W,'wwwwwwwwwwww')
+
+#         # print(W,'wwwwwwwww')
+#         X = F.softmax(X, dim=1)
+#         # print(X)
+#         # print(torch.argmax(X,dim=1))
+#         # randomly select anchors
+#         ids = torch.unique(targets)
+#         # num_samples = n / len(ids)
+#         # print(X.dtype)
+#         # print(targets)
+#         # print(id(X))
+#         # X_=X.clone().detach()
+#         anchors = []
+#         for id_ in ids:
+#             anchor = list(np.random.choice(torch.where(targets == id_)[
+#                 0].cpu(), size=self.num_anchors, replace=False))
+#             # print(id,'ididiid')
+#             # print(torch.sum(X[anchors]))
+#             # print(torch.argmax(X[anchors]))
+#             anchors += anchor
+
+#             # print(torch.argmax(X[anchors]))
+
+#         # print(X[:20,:5],'xxxxxxx')
+#         # print(id(X))
+#         # print(torch.where(X==torch.max(X,dim=1)))
+
+#         for i in range(self.T):
+#             X_ = X.clone().detach()
+#             X_[anchors] = torch.tensor(F.one_hot(
+#                 targets[anchors], self.num_classes), dtype=torch.float, device=device)
+#             # print(i)
+#             # print(X,'xxxxxxxxxxxx')
+#             # print(X_,'---------')
+#             Pi = torch.mm(W, X_)
+#             # print(Pi)
+#             # print(Pi, 'pipipi')
+
+#             PX = torch.mul(X, Pi)
+
+#             # X = F.normalize(PX, dim=1, p=1)
+
+#             # print(PX,'pxpxpx')
+#             # print(PX.shape)
+
+#             # 111111111111111111111111
+#             # Norm = np.sum(PX.detach().cpu().numpy(),
+#             #               axis=1).reshape(-1)  # .expand(n,m)
+#             # # print(Norm,'norm')
+#             # Q = 1 / Norm
+#             # # print(Q,'QQQQQQQQQ')
+#             # Q = torch.diag(torch.tensor(Q, dtype=torch.float, device=device))
+
+#             # 2222222222222222222222222
+#             # denom = PX.detach().norm(p=1, dim=1, keepdim=True).clamp_min(1e-12).expand_as(PX)
+#             # X=PX/denom
+
+#             # 3333333333333333333333
+#             # Q = torch.diag(1 / PX.norm(p=1, dim=1).clamp_min(1e-12))
+#             Q = torch.diag(1 / PX.detach().norm(p=1, dim=1).clamp_min(1e-12))
+#             X = torch.mm(Q, PX)
+
+#             # 444444444444444444444444444444
+#             # Q = torch.diag(1 / torch.matmul(
+#             #     PX, torch.ones(m, dtype=torch.float, device=device)))
+#             # print(Q,'qqqqq')
+#             # X = torch.matmul(Q, PX)
+#             # Q=torch.pow(Q,-1)
+#             # print(X)
+
+#             # 555555555555555555555555555555555555
+#         # X = F.softmax(PX, dim=1)
+
+#         # print(X.requires_grad)
+#         loss = self.nllloss(torch.log(X.clamp_min(1e-12)), targets)
+
+#         # loss= self.cross_entropy(X,targets)
+#         return loss
+
+#         # #inputs = 1. * inputs / (torch.norm(inputs, 2, dim=-1, keepdim=True).expand_as(inputs) + 1e-12)
+#         # # Compute pairwise distance, replace by the official when merged
+#         # dist = torch.pow(inputs, 2).sum(dim=1, keepdim=True).expand(n, n)
+#         # dist = dist + dist.t()
+#         # dist.addmm_(1, -2, inputs, inputs.t())
+#         # dist = dist.clamp(min=1e-12).sqrt()  # for numerical stability
+#         # # For each anchor, find the hardest positive and negative
+#         # mask = targets.expand(n, n).eq(targets.expand(n, n).t())
+#         # print(mask[:8, :8])
+#         # dist_ap, dist_an = [], []
+#         # for i in range(n):
+#         #     dist_ap.append(dist[i][mask[i]].max().unsqueeze(0))
+#         #     dist_an.append(dist[i][mask[i] == 0].min().unsqueeze(0))
+#         # dist_ap = torch.cat(dist_ap)
+#         # dist_an = torch.cat(dist_an)
+#         # # Compute ranking hinge loss
+#         # y = torch.ones_like(dist_an)
+#         # loss = self.ranking_loss(dist_an, dist_ap, y)
+#         # if self.mutual:
+#         #     return loss, dist
+#         # return loss
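Reviewer note: the new group loss relies on two small steps that are easy to check in isolation: splitting a batch into labeled anchors and unlabeled points, and building a non-negative Pearson-correlation affinity matrix. The sketch below is a dependency-free illustration of that logic, not the code in this patch; the names `split_labeled_unlabeled` and `pearson_affinity` are hypothetical, and it assumes no feature row is constant (a constant row would give a zero norm, as it would in `_get_W`).

```python
import math


def split_labeled_unlabeled(labels, num_points_per_class):
    """Return (labs, L, U): anchor labels, anchor indices, unlabeled indices.

    The first `num_points_per_class` samples of each class become anchors;
    every later sample of that class is treated as unlabeled.
    """
    labs, L, U = [], [], []
    seen = {}  # class id -> number of anchors picked so far
    for i, lab in enumerate(labels):
        if seen.get(lab, 0) == num_points_per_class:
            U.append(i)
        else:
            L.append(i)
            labs.append(lab)
            seen[lab] = seen.get(lab, 0) + 1
    return labs, L, U


def pearson_affinity(rows):
    """Pairwise Pearson correlation between rows, with negatives clamped to 0."""
    centered = []
    for row in rows:
        mean = sum(row) / len(row)
        centered.append([v - mean for v in row])
    norms = [math.sqrt(sum(v * v for v in row)) for row in centered]
    n = len(rows)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(centered[i], centered[j]))
            # Same effect as F.relu on the correlation matrix.
            W[i][j] = max(0.0, dot / (norms[i] * norms[j]))
    return W
```

With one anchor per class, `split_labeled_unlabeled([0, 0, 1, 1, 0], 1)` keeps indices 0 and 2 as anchors and leaves the rest unlabeled; anti-correlated feature rows get an affinity of zero.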
diff --git a/main.py b/main.py
index c51632501..612dc171c 100644
--- a/main.py
+++ b/main.py
@@ -1,10 +1,10 @@
 import data_v1
 import data_v2
-import loss
-import model
-import optim
-import engine_v1
-import engine_v2
+from loss import make_loss
+from model import make_model
+from optim import make_optimizer, make_scheduler
+# import engine_v1
+# import engine_v2
 import engine_v3
 import os.path as osp
 from option import args
@@ -25,10 +25,9 @@
 # loader = data.Data(args)
 ckpt = utility.checkpoint(args)
 loader = data_v2.ImageDataManager(args)
-model = model.Model(args, ckpt)
-optimzer = optim.make_optimizer(args, model)
-loss = loss.make_loss(args, ckpt) if not args.test_only else None
-
+model = make_model(args, ckpt)
+optimzer = make_optimizer(args, model)
+loss = make_loss(args, ckpt) if not args.test_only else None
 
 start = -1
 if args.load != '':
@@ -38,7 +37,7 @@
 if args.pre_train != '':
     ckpt.load_pretrained_weights(model, args.pre_train)
 
-scheduler = optim.make_scheduler(args, optimzer, start)
+scheduler = make_scheduler(args, optimzer, start)
 
 # print('[INFO] System infomation: \n {}'.format(get_pretty_env_info()))
 ckpt.write_log('[INFO] Model parameters: {com[0]} flops: {com[1]}'.format(com=compute_model_complexity(model, (1, 3, args.height, args.width))
diff --git a/model/__init__.py b/model/__init__.py
index 1d9e6c465..69a133e1c 100755
--- a/model/__init__.py
+++ b/model/__init__.py
@@ -1,4 +1,3 @@
-import os
 from importlib import import_module
 
 import torch
@@ -7,63 +6,67 @@
 from collections import OrderedDict
 
 
-class Model(nn.Module):
+def make_model(args, ckpt):
 
-    def __init__(self, args, ckpt):
-        super(Model, self).__init__()
-        ckpt.write_log('[INFO] Making {} model...'.format(args.model))
-        if args.drop_block:
-            ckpt.write_log('[INFO] Using batch drop block with h_ratio {} and w_ratio {}.'.format(args.h_ratio, args.w_ratio))
+    ckpt.write_log('[INFO] Building {} model...'.format(args.model))
 
-        self.device = torch.device('cpu' if args.cpu else 'cuda')
-        self.nGPU = args.nGPU
-        # self.save_models = args.save_models
+    device = torch.device('cpu' if args.cpu else 'cuda')
+    # nGPU = args.nGPU
 
-        module = import_module('model.' + args.model.lower())
-        # self.model = module.make_model(args).to(self.device)
-        self.model = getattr(module, args.model)(args).to(self.device)
+    module = import_module('model.' + args.model.lower())
+    model = getattr(module, args.model)(args).to(device)
 
-        if not args.cpu and args.nGPU > 1:
-            self.model = nn.DataParallel(self.model, range(args.nGPU))
+    if not args.cpu and args.nGPU > 1:
+        model = nn.DataParallel(model, range(args.nGPU))
 
-        # if args.load != '' or args.pre_train != '':
-        #     print(ckpt.dir)
-        #     self.load(
-        #         ckpt.dir,
-        #         pre_train=args.pre_train,
-        #         resume=args.resume,
-        #         cpu=args.cpu
-        #     )
-        # else:
-        #     print('Pretained or latest model not exist, training from scratch.')
+    return model
 
-    def forward(self, x):
-        return self.model(x)
+# class Model(nn.Module):
 
-    def get_model(self):
-        if self.nGPU == 1:
-            return self.model
-        else:
-            return self.model.module
-
-    def save(self, apath, epoch, is_best=False):
-        target = self.get_model()
-        torch.save(
-            target.state_dict(),
-            os.path.join(apath, 'model', 'model_latest.pt')
-        )
-        if is_best:
-            torch.save(
-                target.state_dict(),
-                os.path.join(apath, 'model', 'model_best.pt')
-            )
+#     def __init__(self, args, ckpt):
+#         super(Model, self).__init__()
+#         ckpt.write_log('[INFO] Making {} model...'.format(args.model))
+#         if args.drop_block:
+#             ckpt.write_log('[INFO] Using batch drop block with h_ratio {} and w_ratio {}.'.format(args.h_ratio, args.w_ratio))
 
-        if self.save_models:
-            torch.save(
-                target.state_dict(),
-                os.path.join(apath, 'model', 'model_{}.pt'.format(epoch))
-            )
+#         self.device = torch.device('cpu' if args.cpu else 'cuda')
+#         self.nGPU = args.nGPU
+
+#         module = import_module('model.' + args.model.lower())
+#         # self.model = module.make_model(args).to(self.device)
+#         self.model = getattr(module, args.model)(args).to(self.device)
 
+#         if not args.cpu and args.nGPU > 1:
+#             self.model = nn.DataParallel(self.model, range(args.nGPU))
+
+#     def forward(self, x):
+#         return self.model(x)
+
+#     def get_model(self):
+#         if self.nGPU == 1:
+#             return self.model
+#         else:
+#             return self.model.module
+
+#     def save(self, apath, epoch, is_best=False):
+#         target = self.get_model()
+#         torch.save(
+#             target.state_dict(),
+#             os.path.join(apath, 'model', 'model_latest.pt')
+#         )
+#         if is_best:
+#             torch.save(
+#                 target.state_dict(),
+#                 os.path.join(apath, 'model', 'model_best.pt')
+#             )
+
+
+#         if self.save_models:
+#             torch.save(
+#                 target.state_dict(),
+#                 os.path.join(apath, 'model', 'model_{}.pt'.format(epoch))
+#             )
+'''
     def load(self, apath, pre_train='', resume=-1, cpu=False):
         if cpu:
             kwargs = {'map_location': lambda storage, loc: storage}
@@ -135,4 +138,4 @@ def load(self, apath, pre_train='', resume=-1, cpu=False):
                 ),
                 # strict=False
             )
-
+'''
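Reviewer note: `make_model` resolves the network class by name, importing `model.<name in lowercase>` and then instantiating the attribute called `<name>`. A minimal sketch of that lookup pattern, using a standard-library package so that it runs without this repository, might look as follows; `make_object` is a hypothetical helper, not part of the patch.

```python
from importlib import import_module


def make_object(package, name, *args, **kwargs):
    """Import `<package>.<name.lower()>` and instantiate its attribute `name`."""
    module = import_module(package + '.' + name.lower())
    return getattr(module, name)(*args, **kwargs)


# Resolves to email.message.Message, mirroring how 'model.' + args.model.lower()
# is expected to expose a class named args.model.
msg = make_object('email', 'Message')
```

The convention this depends on is that the module file name is the lowercase form of the class name, so a rename such as `mcmp_n_drop.py` to `lmbn_n.py` also requires the class and the `--model` argument to follow the new name.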
diff --git a/model/attention.py b/model/attention.py
index 857a416ac..70aeb5955 100644
--- a/model/attention.py
+++ b/model/attention.py
@@ -1,45 +1,15 @@
-###########################################################################
-# Created by: CASIA IVA
-# Email: jliu@nlpr.ia.ac.cn
-# Copyright (c) 2018
-
-# Reference: Dual Attention Network for Scene Segmentation
-# https://arxiv.org/pdf/1809.02983.pdf
-# https://github.com/junfu1115/DANet/blob/master/encoding/nn/attention.py
-###########################################################################
-
-import numpy as np
 import torch
 import math
 import random
 from torch.nn import Module, Sequential, Conv2d, ReLU, AdaptiveMaxPool2d, AdaptiveAvgPool2d, \
     NLLLoss, BCELoss, CrossEntropyLoss, AvgPool2d, MaxPool2d, Parameter, Linear, Sigmoid, Softmax, Dropout, Embedding
 from torch.nn import functional as F
-from torch.autograd import Variable
 from torch import nn
 
 torch_ver = torch.__version__[:3]
 
-__all__ = ['BatchDrop', 'BatchRandomErasing','PAM_Module', 'CAM_Module', 'Dual_Module', 'SE_Module']
-
-
-class BatchDrop(nn.Module):
-    def __init__(self, h_ratio, w_ratio):
-        super(BatchDrop, self).__init__()
-        self.h_ratio = h_ratio
-        self.w_ratio = w_ratio
-
-    def forward(self, x):
-        if self.training:
-            h, w = x.size()[-2:]
-            rh = round(self.h_ratio * h)
-            rw = round(self.w_ratio * w)
-            sx = random.randint(0, h - rh)
-            sy = random.randint(0, w - rw)
-            mask = x.new_ones(x.size())
-            mask[:, :, sx:sx + rh, sy:sy + rw] = 0
-            x = x * mask
-        return x
+__all__ = ['BatchDrop', 'BatchFeatureErase_Top', 'BatchRandomErasing',
+           'PAM_Module', 'CAM_Module', 'Dual_Module', 'SE_Module']
 
 
 class BatchRandomErasing(nn.Module):
@@ -72,17 +42,49 @@ def forward(self, img):
                     x1 = random.randint(0, img.size()[2] - h)
                     y1 = random.randint(0, img.size()[3] - w)
                     if img.size()[1] == 3:
-                        img[:,0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                        img[:,1, x1:x1 + h, y1:y1 + w] = self.mean[1]
-                        img[:,2, x1:x1 + h, y1:y1 + w] = self.mean[2]
+                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
+                        img[:, 1, x1:x1 + h, y1:y1 + w] = self.mean[1]
+                        img[:, 2, x1:x1 + h, y1:y1 + w] = self.mean[2]
                     else:
-                        img[:,0, x1:x1 + h, y1:y1 + w] = self.mean[0]
+                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
                     return img
 
         return img
 
 
+class BatchDrop(nn.Module):
+    """
+    Ref: Batch DropBlock Network for Person Re-identification and Beyond
+    https://github.com/daizuozhuo/batch-dropblock-network/blob/master/models/networks.py
+    Created by: daizuozhuo
+    """
+
+    def __init__(self, h_ratio, w_ratio):
+        super(BatchDrop, self).__init__()
+        self.h_ratio = h_ratio
+        self.w_ratio = w_ratio
+
+    def forward(self, x):
+        if self.training:
+            h, w = x.size()[-2:]
+            rh = round(self.h_ratio * h)
+            rw = round(self.w_ratio * w)
+            sx = random.randint(0, h - rh)
+            sy = random.randint(0, w - rw)
+            mask = x.new_ones(x.size())
+            mask[:, :, sx:sx + rh, sy:sy + rw] = 0
+            x = x * mask
+        return x
+
+
 class BatchDropTop(nn.Module):
+    """
+    Ref: Top-DB-Net: Top DropBlock for Activation Enhancement in Person Re-Identification
+    https://github.com/RQuispeC/top-dropblock/blob/master/torchreid/models/bdnet.py
+    Created by: RQuispeC
+
+    """
+
     def __init__(self, h_ratio):
         super(BatchDropTop, self).__init__()
         self.h_ratio = h_ratio
@@ -115,30 +117,28 @@ def forward(self, x, visdrop=False):
 
 
 class BatchFeatureErase_Top(nn.Module):
+    """
+    Ref: Top-DB-Net: Top DropBlock for Activation Enhancement in Person Re-Identification
+    https://github.com/RQuispeC/top-dropblock/blob/master/torchreid/models/bdnet.py
+    Created by: RQuispeC
+
+    """
+
     def __init__(self, channels, bottleneck_type, h_ratio=0.33, w_ratio=1., double_bottleneck=False):
         super(BatchFeatureErase_Top, self).__init__()
-        # if double_bottleneck:
-        #     self.drop_batch_bottleneck = nn.Sequential(
-        #         Bottleneck(channels, 512),
-        #         Bottleneck(channels, 512)
-        #     )
-        # else:
-        #     self.drop_batch_bottleneck = Bottleneck(channels, 512)
 
         self.drop_batch_bottleneck = bottleneck_type(channels, 512)
 
-        # self.drop_batch_drop_basic = BatchDrop(h_ratio, w_ratio)
+        self.drop_batch_drop_basic = BatchDrop(h_ratio, w_ratio)
         self.drop_batch_drop_top = BatchDropTop(h_ratio)
 
     def forward(self, x, drop_top=True, bottleneck_features=True, visdrop=False):
         features = self.drop_batch_bottleneck(x)
+
         if drop_top:
             x = self.drop_batch_drop_top(features, visdrop=visdrop)
-
-        # if drop_top:
-        #     x = self.drop_batch_drop_top(x, visdrop=visdrop)
-        # else:
-        #     x = self.drop_batch_drop_basic(features, visdrop=visdrop)
+        else:
+            x = self.drop_batch_drop_basic(features)
         if visdrop:
             return x  # x is dropmask
         if bottleneck_features:
@@ -244,6 +244,15 @@ def forward(self, x):
 
 
 class Dual_Module(Module):
+    """
+    Created by: CASIA IVA
+    Email: jliu@nlpr.ia.ac.cn
+    Copyright (c) 2018
+
+    Reference: Dual Attention Network for Scene Segmentation
+    https://arxiv.org/pdf/1809.02983.pdf
+    https://github.com/junfu1115/DANet/blob/master/encoding/nn/attention.py
+    """
 
     def __init__(self, in_dim):
         super(Dual_Module).__init__()
@@ -255,5 +264,3 @@ def forward(self, x):
         out1 = self.pam(x)
         out2 = self.cam(x)
         return out1 + out2
-
-
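Reviewer note: `BatchDrop` in model/attention.py zeroes one randomly placed rectangle and applies the same mask to every sample and channel in the batch. The sketch below reproduces only the mask geometry on nested lists so that it can be checked without PyTorch; `batch_drop_mask` and its `rng` parameter are hypothetical names introduced for illustration.

```python
import random


def batch_drop_mask(h, w, h_ratio, w_ratio, rng=random):
    """Build an h x w mask of ones with one rh x rw rectangle set to zero."""
    rh = round(h_ratio * h)
    rw = round(w_ratio * w)
    # Top-left corner, chosen so that the rectangle stays inside the map.
    sx = rng.randint(0, h - rh)
    sy = rng.randint(0, w - rw)
    mask = [[1] * w for _ in range(h)]
    for r in range(sx, sx + rh):
        for c in range(sy, sy + rw):
            mask[r][c] = 0
    return mask


# With h_ratio=0.33 and w_ratio=1.0 on a 24 x 8 map, a full-width band of
# 8 rows is dropped.
mask = batch_drop_mask(24, 8, 0.33, 1.0, rng=random.Random(0))
```

In the module itself the mask is only applied while `self.training` is true, so evaluation sees the unmasked feature map.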
diff --git a/model/c.py b/model/c.py
index cb3ee31ea..7b59381d0 100644
--- a/model/c.py
+++ b/model/c.py
@@ -6,71 +6,12 @@
 import random
 import math
 from .osnet import osnet_x1_0, OSBlock
-from .attention import PAM_Module, CAM_Module, SE_Module, Dual_Module
+from .attention import BatchDrop, BatchRandomErasing, PAM_Module, CAM_Module, SE_Module, Dual_Module
 from .bnneck import BNNeck, BNNeck3
 
 from torch.autograd import Variable
 
 
-class BatchDrop(nn.Module):
-    def __init__(self, h_ratio, w_ratio):
-        super(BatchDrop, self).__init__()
-        self.h_ratio = h_ratio
-        self.w_ratio = w_ratio
-
-    def forward(self, x):
-        if self.training:
-            h, w = x.size()[-2:]
-            rh = round(self.h_ratio * h)
-            rw = round(self.w_ratio * w)
-            sx = random.randint(0, h - rh)
-            sy = random.randint(0, w - rw)
-            mask = x.new_ones(x.size())
-            mask[:, :, sx:sx + rh, sy:sy + rw] = 0
-            x = x * mask
-        return x
-
-
-class BatchRandomErasing(nn.Module):
-    def __init__(self, probability=0.5, sl=0.02, sh=0.4, r1=0.3, mean=[0.4914, 0.4822, 0.4465]):
-        super(BatchRandomErasing, self).__init__()
-
-        self.probability = probability
-        self.mean = mean
-        self.sl = sl
-        self.sh = sh
-        self.r1 = r1
-
-    def forward(self, img):
-        if self.training:
-
-            if random.uniform(0, 1) > self.probability:
-                return img
-
-            for attempt in range(100):
-
-                area = img.size()[2] * img.size()[3]
-
-                target_area = random.uniform(self.sl, self.sh) * area
-                aspect_ratio = random.uniform(self.r1, 1 / self.r1)
-
-                h = int(round(math.sqrt(target_area * aspect_ratio)))
-                w = int(round(math.sqrt(target_area / aspect_ratio)))
-
-                if w < img.size()[3] and h < img.size()[2]:
-                    x1 = random.randint(0, img.size()[2] - h)
-                    y1 = random.randint(0, img.size()[3] - w)
-                    if img.size()[1] == 3:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                        img[:, 1, x1:x1 + h, y1:y1 + w] = self.mean[1]
-                        img[:, 2, x1:x1 + h, y1:y1 + w] = self.mean[2]
-                    else:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                    return img
-
-        return img
-
-
 class C(nn.Module):
     def __init__(self, args):
         super(C, self).__init__()
@@ -79,14 +20,11 @@ def __init__(self, args):
         self.chs = 512 // self.n_ch
 
         osnet = osnet_x1_0(pretrained=True)
-        # attention = CAM_Module(256)
-        # attention = SE_Module(256)
 
         self.backone = nn.Sequential(
             osnet.conv1,
             osnet.maxpool,
             osnet.conv2,
-            # attention,
             osnet.conv3[0]
         )
 
diff --git a/model/g_c.py b/model/g_c.py
index b7c3ec157..08829fbf8 100644
--- a/model/g_c.py
+++ b/model/g_c.py
@@ -6,71 +6,12 @@
 import random
 import math
 from .osnet import osnet_x1_0, OSBlock
-from .attention import PAM_Module, CAM_Module, SE_Module, Dual_Module
+from .attention import BatchDrop, BatchRandomErasing, PAM_Module, CAM_Module, SE_Module, Dual_Module
 from .bnneck import BNNeck, BNNeck3
 
 from torch.autograd import Variable
 
 
-class BatchDrop(nn.Module):
-    def __init__(self, h_ratio, w_ratio):
-        super(BatchDrop, self).__init__()
-        self.h_ratio = h_ratio
-        self.w_ratio = w_ratio
-
-    def forward(self, x):
-        if self.training:
-            h, w = x.size()[-2:]
-            rh = round(self.h_ratio * h)
-            rw = round(self.w_ratio * w)
-            sx = random.randint(0, h - rh)
-            sy = random.randint(0, w - rw)
-            mask = x.new_ones(x.size())
-            mask[:, :, sx:sx + rh, sy:sy + rw] = 0
-            x = x * mask
-        return x
-
-
-class BatchRandomErasing(nn.Module):
-    def __init__(self, probability=0.5, sl=0.02, sh=0.4, r1=0.3, mean=[0.4914, 0.4822, 0.4465]):
-        super(BatchRandomErasing, self).__init__()
-
-        self.probability = probability
-        self.mean = mean
-        self.sl = sl
-        self.sh = sh
-        self.r1 = r1
-
-    def forward(self, img):
-        if self.training:
-
-            if random.uniform(0, 1) > self.probability:
-                return img
-
-            for attempt in range(100):
-
-                area = img.size()[2] * img.size()[3]
-
-                target_area = random.uniform(self.sl, self.sh) * area
-                aspect_ratio = random.uniform(self.r1, 1 / self.r1)
-
-                h = int(round(math.sqrt(target_area * aspect_ratio)))
-                w = int(round(math.sqrt(target_area / aspect_ratio)))
-
-                if w < img.size()[3] and h < img.size()[2]:
-                    x1 = random.randint(0, img.size()[2] - h)
-                    y1 = random.randint(0, img.size()[3] - w)
-                    if img.size()[1] == 3:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                        img[:, 1, x1:x1 + h, y1:y1 + w] = self.mean[1]
-                        img[:, 2, x1:x1 + h, y1:y1 + w] = self.mean[2]
-                    else:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                    return img
-
-        return img
-
-
 class G_C(nn.Module):
     def __init__(self, args):
         super(G_C, self).__init__()
@@ -79,14 +20,11 @@ def __init__(self, args):
         self.chs = 512 // self.n_ch
 
         osnet = osnet_x1_0(pretrained=True)
-        # attention = CAM_Module(256)
-        # attention = SE_Module(256)
 
         self.backone = nn.Sequential(
             osnet.conv1,
             osnet.maxpool,
             osnet.conv2,
-            # attention,
             osnet.conv3[0]
         )
 
diff --git a/model/g_p.py b/model/g_p.py
index 1c3134826..5c54627a7 100644
--- a/model/g_p.py
+++ b/model/g_p.py
@@ -6,87 +6,22 @@
 import random
 import math
 from .osnet import osnet_x1_0, OSBlock
-from .attention import PAM_Module, CAM_Module, SE_Module, Dual_Module
+from .attention import BatchDrop, BatchRandomErasing, PAM_Module, CAM_Module, SE_Module, Dual_Module
 from .bnneck import BNNeck, BNNeck3
 
 from torch.autograd import Variable
 
 
-class BatchDrop(nn.Module):
-    def __init__(self, h_ratio, w_ratio):
-        super(BatchDrop, self).__init__()
-        self.h_ratio = h_ratio
-        self.w_ratio = w_ratio
-
-    def forward(self, x):
-        if self.training:
-            h, w = x.size()[-2:]
-            rh = round(self.h_ratio * h)
-            rw = round(self.w_ratio * w)
-            sx = random.randint(0, h - rh)
-            sy = random.randint(0, w - rw)
-            mask = x.new_ones(x.size())
-            mask[:, :, sx:sx + rh, sy:sy + rw] = 0
-            x = x * mask
-        return x
-
-
-class BatchRandomErasing(nn.Module):
-    def __init__(self, probability=0.5, sl=0.02, sh=0.4, r1=0.3, mean=[0.4914, 0.4822, 0.4465]):
-        super(BatchRandomErasing, self).__init__()
-
-        self.probability = probability
-        self.mean = mean
-        self.sl = sl
-        self.sh = sh
-        self.r1 = r1
-
-    def forward(self, img):
-        if self.training:
-
-            if random.uniform(0, 1) > self.probability:
-                return img
-
-            for attempt in range(100):
-
-                area = img.size()[2] * img.size()[3]
-
-                target_area = random.uniform(self.sl, self.sh) * area
-                aspect_ratio = random.uniform(self.r1, 1 / self.r1)
-
-                h = int(round(math.sqrt(target_area * aspect_ratio)))
-                w = int(round(math.sqrt(target_area / aspect_ratio)))
-
-                if w < img.size()[3] and h < img.size()[2]:
-                    x1 = random.randint(0, img.size()[2] - h)
-                    y1 = random.randint(0, img.size()[3] - w)
-                    if img.size()[1] == 3:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                        img[:, 1, x1:x1 + h, y1:y1 + w] = self.mean[1]
-                        img[:, 2, x1:x1 + h, y1:y1 + w] = self.mean[2]
-                    else:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                    return img
-
-        return img
-
-
 class G_P(nn.Module):
     def __init__(self, args):
         super(G_P, self).__init__()
 
-        # self.n_ch = 2
-        # self.chs = 512 // self.n_ch
-
         osnet = osnet_x1_0(pretrained=True)
-        # attention = CAM_Module(256)
-        # attention = SE_Module(256)
 
         self.backone = nn.Sequential(
             osnet.conv1,
             osnet.maxpool,
             osnet.conv2,
-            # attention,
             osnet.conv3[0]
         )
 
diff --git a/model/mcmp_n_drop.py b/model/lmbn_n.py
similarity index 66%
rename from model/mcmp_n_drop.py
rename to model/lmbn_n.py
index bf38c21d7..c2550ae08 100644
--- a/model/mcmp_n_drop.py
+++ b/model/lmbn_n.py
@@ -1,89 +1,17 @@
 import copy
-
 import torch
 from torch import nn
 from .osnet import osnet_x1_0, OSBlock
-from .attention import BatchDrop, BatchRandomErasing, PAM_Module, CAM_Module, SE_Module, Dual_Module
+from .attention import BatchDrop, BatchFeatureErase_Top, PAM_Module, CAM_Module, SE_Module, Dual_Module
 from .bnneck import BNNeck, BNNeck3
 from torch.nn import functional as F
 
-
 from torch.autograd import Variable
 
 
-class BatchDropTop(nn.Module):
-    def __init__(self, h_ratio):
-        super(BatchDropTop, self).__init__()
-        self.h_ratio = h_ratio
-
-    def forward(self, x, visdrop=False):
-        if self.training or visdrop:
-            b, c, h, w = x.size()
-            rh = round(self.h_ratio * h)
-            act = (x**2).sum(1)
-            act = act.view(b, h * w)
-            act = F.normalize(act, p=2, dim=1)
-            act = act.view(b, h, w)
-            max_act, _ = act.max(2)
-            ind = torch.argsort(max_act, 1)
-            ind = ind[:, -rh:]
-            mask = []
-            for i in range(b):
-                rmask = torch.ones(h)
-                rmask[ind[i]] = 0
-                mask.append(rmask.unsqueeze(0))
-            mask = torch.cat(mask)
-            mask = torch.repeat_interleave(mask, w, 1).view(b, h, w)
-            mask = torch.repeat_interleave(mask, c, 0).view(b, c, h, w)
-            if x.is_cuda:
-                mask = mask.cuda()
-            if visdrop:
-                return mask
-            x = x * mask
-        return x
-
-
-class BatchFeatureErase_Top(nn.Module):
-    def __init__(self, channels, h_ratio=0.33, w_ratio=1., double_bottleneck=False):
-        super(BatchFeatureErase_Top, self).__init__()
-        # if double_bottleneck:
-        #     self.drop_batch_bottleneck = nn.Sequential(
-        #         Bottleneck(channels, 512),
-        #         Bottleneck(channels, 512)
-        #     )
-        # else:
-        #     self.drop_batch_bottleneck = Bottleneck(channels, 512)
-        if double_bottleneck:
-            self.drop_batch_bottleneck = nn.Sequential(
-                OSBlock(channels, 512),
-                OSBlock(channels, 512)
-            )
-        else:
-            self.drop_batch_bottleneck = OSBlock(channels, 512)
-
-        # self.drop_batch_drop_basic = BatchDrop(h_ratio, w_ratio)
-        self.drop_batch_drop_top = BatchDropTop(h_ratio)
-
-    def forward(self, x, drop_top=True, bottleneck_features=True, visdrop=False):
-        features = self.drop_batch_bottleneck(x)
-        if drop_top:
-            x = self.drop_batch_drop_top(features, visdrop=visdrop)
-
-        # if drop_top:
-        #     x = self.drop_batch_drop_top(x, visdrop=visdrop)
-        # else:
-        #     x = self.drop_batch_drop_basic(features, visdrop=visdrop)
-        if visdrop:
-            return x  # x is dropmask
-        if bottleneck_features:
-            return x, features
-        else:
-            return x
-
-
-class MCMP_n_drop(nn.Module):
+class LMBN_n(nn.Module):
     def __init__(self, args):
-        super(MCMP_n_drop, self).__init__()
+        super(LMBN_n, self).__init__()
 
         self.n_ch = 2
         self.chs = 512 // self.n_ch
@@ -94,7 +22,6 @@ def __init__(self, args):
             osnet.conv1,
             osnet.maxpool,
             osnet.conv2,
-            # attention,
             osnet.conv3[0]
         )
 
@@ -115,8 +42,7 @@ def __init__(self, args):
 
         reduction = BNNeck3(512, args.num_classes,
                             args.feats, return_f=True)
-        # reduction = BNNeck(
-        #     args.feats, args.num_classes, return_f=True)
+
         self.reduction_0 = copy.deepcopy(reduction)
         self.reduction_1 = copy.deepcopy(reduction)
         self.reduction_2 = copy.deepcopy(reduction)
@@ -138,7 +64,7 @@ def __init__(self, args):
         # print('Using batch drop block.')
         # self.batch_drop_block = BatchDrop(
         #     h_ratio=args.h_ratio, w_ratio=args.w_ratio)
-        self.batch_drop_block = BatchFeatureErase_Top(512)
+        self.batch_drop_block = BatchFeatureErase_Top(512, OSBlock)
 
         self.activation_map = args.activation_map
 
@@ -166,15 +92,15 @@ def forward(self, x):
             fmap_p1 = par[:, :, h_par // 2:, :]
             fmap_c0 = cha[:, :self.chs, :, :]
             fmap_c1 = cha[:, self.chs:, :, :]
-            print('activation_map')
+            print('Generating activation maps...')
 
             return glo, glo_, fmap_c0, fmap_c1, fmap_p0, fmap_p1
 
         glo_drop = self.global_pooling(glo_drop)
         glo = self.channel_pooling(glo)  # shape:(batchsize, 512,1,1)
         g_par = self.global_pooling(par)  # shape:(batchsize, 512,1,1)
-        p_par = self.partial_pooling(par)  # shape:(batchsize, 512,3,1)
-        cha = self.channel_pooling(cha)
+        p_par = self.partial_pooling(par)  # shape:(batchsize, 512,2,1)
+        cha = self.channel_pooling(cha)  # shape:(batchsize, 512,1,1), split into 2 x 256 below
 
         p0 = p_par[:, :, 0:1, :]
         p1 = p_par[:, :, 1:2, :]
@@ -196,14 +122,13 @@ def forward(self, x):
 
         ################
 
-        fea = [f_glo[-1], f_p0[-1], f_glo_drop[-1]]
+        fea = [f_glo[-1], f_glo_drop[-1], f_p0[-1]]
 
         if not self.training:
 
             return torch.stack([f_glo[0], f_glo_drop[0], f_p0[0], f_p1[0], f_p2[0], f_c0[0], f_c1[0]], dim=2)
             # return torch.stack([f_glo_drop[0], f_p0[0], f_p1[0], f_p2[0], f_c0[0], f_c1[0]], dim=2)
 
-
         return [f_glo[1], f_glo_drop[1], f_p0[1], f_p1[1], f_p2[1], f_c0[1], f_c1[1]], fea
 
     def weights_init_kaiming(self, m):
diff --git a/model/lmbn_n_drop_no_bnneck.py b/model/lmbn_n_drop_no_bnneck.py
new file mode 100644
index 000000000..8e171cd9a
--- /dev/null
+++ b/model/lmbn_n_drop_no_bnneck.py
@@ -0,0 +1,180 @@
+import copy
+
+import torch
+from torch import nn
+from .osnet import osnet_x1_0, OSBlock
+from .attention import BatchDrop, BatchFeatureErase_Top, PAM_Module, CAM_Module, SE_Module, Dual_Module
+from .bnneck import BNNeck, BNNeck3, ClassBlock
+from torch.nn import functional as F
+
+
+from torch.autograd import Variable
+
+
+class LMBN_n_drop_no_bnneck(nn.Module):
+    def __init__(self, args):
+        super(LMBN_n_drop_no_bnneck, self).__init__()
+
+        self.n_ch = 2
+        self.chs = 512 // self.n_ch
+
+        osnet = osnet_x1_0(pretrained=True)
+
+        self.backone = nn.Sequential(
+            osnet.conv1,
+            osnet.maxpool,
+            osnet.conv2,
+            osnet.conv3[0]
+        )
+
+        conv3 = osnet.conv3[1:]
+
+        self.global_branch = nn.Sequential(copy.deepcopy(
+            conv3), copy.deepcopy(osnet.conv4), copy.deepcopy(osnet.conv5))
+
+        self.partial_branch = nn.Sequential(copy.deepcopy(
+            conv3), copy.deepcopy(osnet.conv4), copy.deepcopy(osnet.conv5))
+
+        self.channel_branch = nn.Sequential(copy.deepcopy(
+            conv3), copy.deepcopy(osnet.conv4), copy.deepcopy(osnet.conv5))
+
+        self.global_pooling = nn.AdaptiveMaxPool2d((1, 1))
+        self.partial_pooling = nn.AdaptiveAvgPool2d((2, 1))
+        self.channel_pooling = nn.AdaptiveAvgPool2d((1, 1))
+
+        reduction = ClassBlock(512, args.num_classes,
+                               num_bottleneck=args.feats, return_f=True)
+
+        self.reduction_0 = copy.deepcopy(reduction)
+        self.reduction_1 = copy.deepcopy(reduction)
+        self.reduction_2 = copy.deepcopy(reduction)
+        self.reduction_3 = copy.deepcopy(reduction)
+        self.reduction_4 = copy.deepcopy(reduction)
+
+        self.shared = nn.Sequential(nn.Conv2d(
+            self.chs, args.feats, 1, bias=False), nn.BatchNorm2d(args.feats), nn.ReLU(True))
+        self.weights_init_kaiming(self.shared)
+
+        self.reduction_ch_0 = ClassBlock(
+            args.feats, args.num_classes, linear=False, return_f=True)
+        self.reduction_ch_1 = ClassBlock(
+            args.feats, args.num_classes, linear=False, return_f=True)
+
+        # if args.drop_block:
+        #     print('Using batch random erasing block.')
+        #     self.batch_drop_block = BatchRandomErasing()
+        # print('Using batch drop block.')
+        # self.batch_drop_block = BatchDrop(
+        #     h_ratio=args.h_ratio, w_ratio=args.w_ratio)
+        self.batch_drop_block = BatchFeatureErase_Top(512, OSBlock)
+
+        self.activation_map = args.activation_map
+
+    def forward(self, x):
+        # if self.batch_drop_block is not None:
+        #     x = self.batch_drop_block(x)
+
+        x = self.backone(x)
+
+        glo = self.global_branch(x)
+        par = self.partial_branch(x)
+        cha = self.channel_branch(x)
+
+        if self.activation_map:
+            glo_ = glo
+
+        if self.batch_drop_block is not None:
+            glo_drop, glo = self.batch_drop_block(glo)
+
+        if self.activation_map:
+
+            _, _, h_par, _ = par.size()
+
+            fmap_p0 = par[:, :, :h_par // 2, :]
+            fmap_p1 = par[:, :, h_par // 2:, :]
+            fmap_c0 = cha[:, :self.chs, :, :]
+            fmap_c1 = cha[:, self.chs:, :, :]
+            print('Generating activation maps...')
+
+            return glo, glo_, fmap_c0, fmap_c1, fmap_p0, fmap_p1
+
+        glo_drop = self.global_pooling(glo_drop)
+        glo = self.channel_pooling(glo)  # shape:(batchsize, 512,1,1)
+        g_par = self.global_pooling(par)  # shape:(batchsize, 512,1,1)
+        p_par = self.partial_pooling(par)  # shape:(batchsize, 512,2,1)
+        cha = self.channel_pooling(cha)  # shape:(batchsize, 512,1,1)
+
+        p0 = p_par[:, :, 0:1, :]
+        p1 = p_par[:, :, 1:2, :]
+
+        f_glo = self.reduction_0(glo)
+        f_p0 = self.reduction_1(g_par)
+        f_p1 = self.reduction_2(p0)
+        f_p2 = self.reduction_3(p1)
+        f_glo_drop = self.reduction_4(glo_drop)
+
+        ################
+
+        c0 = cha[:, :self.chs, :, :]
+        c1 = cha[:, self.chs:, :, :]
+        c0 = self.shared(c0)
+        c1 = self.shared(c1)
+        f_c0 = self.reduction_ch_0(c0)
+        f_c1 = self.reduction_ch_1(c1)
+
+        ################
+
+        fea = [f_glo[-1], f_p0[-1], f_glo_drop[-1]]
+
+        if not self.training:
+
+            return torch.stack([f_glo[0], f_glo_drop[0], f_p0[0], f_p1[0], f_p2[0], f_c0[0], f_c1[0]], dim=2)
+
+        return [f_glo[1], f_glo_drop[1], f_p0[1], f_p1[1], f_p2[1], f_c0[1], f_c1[1]], fea
+
+    def weights_init_kaiming(self, m):
+        classname = m.__class__.__name__
+        if classname.find('Linear') != -1:
+            nn.init.kaiming_normal_(m.weight, a=0, mode='fan_out')
+            nn.init.constant_(m.bias, 0.0)
+        elif classname.find('Conv') != -1:
+            nn.init.kaiming_normal_(m.weight, a=0, mode='fan_in')
+            if m.bias is not None:
+                nn.init.constant_(m.bias, 0.0)
+        elif classname.find('BatchNorm') != -1:
+            if m.affine:
+                nn.init.constant_(m.weight, 1.0)
+                nn.init.constant_(m.bias, 0.0)
+
+
+if __name__ == '__main__':
+    # Simple forward-pass smoke test; run it before training.
+    # Relative imports require module mode: python -m model.lmbn_n_drop_no_bnneck
+    import argparse
+
+    parser = argparse.ArgumentParser(description='MGN')
+    parser.add_argument('--num_classes', type=int, default=751, help='')
+    parser.add_argument('--bnneck', type=bool, default=True)
+    parser.add_argument('--pool', type=str, default='max')
+    parser.add_argument('--feats', type=int, default=512)
+    parser.add_argument('--drop_block', type=bool, default=True)
+    parser.add_argument('--activation_map', action='store_true', help='return feature maps instead of embeddings')
+
+    args = parser.parse_args()
+    net = LMBN_n_drop_no_bnneck(args)
+    # net.classifier = nn.Sequential()
+    # print([p for p in net.parameters()])
+    # a=filter(lambda p: p.requires_grad, net.parameters())
+    # print(a)
+
+    print(net)
+    input = torch.randn(8, 3, 384, 128)  # random batch; FloatTensor(...) is uninitialized
+    net.eval()
+    output = net(input)
+    print('net output size:')
+    print(output.shape)
+    # print(len(output))
+    # for k in output[0]:
+    #     print(k.shape)
+    # for k in output[1]:
+    #     print(k.shape)
diff --git a/model/mcmp_n.py b/model/lmbn_n_no_drop.py
similarity index 98%
rename from model/mcmp_n.py
rename to model/lmbn_n_no_drop.py
index 9636db7cc..afb7b3ada 100644
--- a/model/mcmp_n.py
+++ b/model/lmbn_n_no_drop.py
@@ -9,9 +9,9 @@
 from torch.autograd import Variable
 
 
-class MCMP_n(nn.Module):
+class LMBN_n_no_drop(nn.Module):
     def __init__(self, args):
-        super(MCMP_n, self).__init__()
+        super(LMBN_n_no_drop, self).__init__()
 
         self.n_ch = 2
         self.chs = 512 // self.n_ch
@@ -22,7 +22,6 @@ def __init__(self, args):
             osnet.conv1,
             osnet.maxpool,
             osnet.conv2,
-            # attention,
             osnet.conv3[0]
         )
 
diff --git a/model/mcmp_r_drop.py b/model/lmbn_r.py
similarity index 98%
rename from model/mcmp_r_drop.py
rename to model/lmbn_r.py
index d50bb4891..64a72e447 100644
--- a/model/mcmp_r_drop.py
+++ b/model/lmbn_r.py
@@ -16,9 +16,9 @@
 from torch.autograd import Variable
 
 
-class MCMP_r_drop(nn.Module):
+class LMBN_r(nn.Module):
     def __init__(self, args):
-        super(MCMP_r_drop, self).__init__()
+        super(LMBN_r, self).__init__()
 
         self.n_ch = 2
         self.chs = 2048 // self.n_ch
diff --git a/model/mcmp_r.py b/model/lmbn_r_no_drop.py
similarity index 98%
rename from model/mcmp_r.py
rename to model/lmbn_r_no_drop.py
index e38b3035e..e0ddb2b08 100644
--- a/model/mcmp_r.py
+++ b/model/lmbn_r_no_drop.py
@@ -16,9 +16,9 @@
 from torch.autograd import Variable
 
 
-class MCMP_r(nn.Module):
+class LMBN_r_no_drop(nn.Module):
     def __init__(self, args):
-        super(MCMP_r, self).__init__()
+        super(LMBN_r_no_drop, self).__init__()
 
         self.n_ch = 2
         self.chs = 2048 // self.n_ch
diff --git a/model/mcmp_n_tiny.py b/model/mcmp_n_tiny.py
deleted file mode 100644
index 774de9944..000000000
--- a/model/mcmp_n_tiny.py
+++ /dev/null
@@ -1,246 +0,0 @@
-import copy
-
-import torch
-from torch import nn
-import torch.nn.functional as F
-import random
-import math
-from .osnet import osnet_x1_0, osnet_x0_5, OSBlock
-from .attention import PAM_Module, CAM_Module, SE_Module, Dual_Module
-from .bnneck import BNNeck, BNNeck3
-
-from torch.autograd import Variable
-
-
-class BatchDrop(nn.Module):
-    def __init__(self, h_ratio, w_ratio):
-        super(BatchDrop, self).__init__()
-        self.h_ratio = h_ratio
-        self.w_ratio = w_ratio
-
-    def forward(self, x):
-        if self.training:
-            h, w = x.size()[-2:]
-            rh = round(self.h_ratio * h)
-            rw = round(self.w_ratio * w)
-            sx = random.randint(0, h - rh)
-            sy = random.randint(0, w - rw)
-            mask = x.new_ones(x.size())
-            mask[:, :, sx:sx + rh, sy:sy + rw] = 0
-            x = x * mask
-        return x
-
-
-class BatchRandomErasing(nn.Module):
-    def __init__(self, probability=0.5, sl=0.02, sh=0.4, r1=0.3, mean=[0.4914, 0.4822, 0.4465]):
-        super(BatchRandomErasing, self).__init__()
-
-        self.probability = probability
-        self.mean = mean
-        self.sl = sl
-        self.sh = sh
-        self.r1 = r1
-
-    def forward(self, img):
-        if self.training:
-            if random.uniform(0, 1) > self.probability:
-                return img
-
-            for attempt in range(100):
-
-                area = img.size()[2] * img.size()[3]
-
-                target_area = random.uniform(self.sl, self.sh) * area
-                aspect_ratio = random.uniform(self.r1, 1 / self.r1)
-
-                h = int(round(math.sqrt(target_area * aspect_ratio)))
-                w = int(round(math.sqrt(target_area / aspect_ratio)))
-
-                if w < img.size()[3] and h < img.size()[2]:
-                    x1 = random.randint(0, img.size()[2] - h)
-                    y1 = random.randint(0, img.size()[3] - w)
-                    if img.size()[1] == 3:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                        img[:, 1, x1:x1 + h, y1:y1 + w] = self.mean[1]
-                        img[:, 2, x1:x1 + h, y1:y1 + w] = self.mean[2]
-                    else:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                    return img
-
-        return img
-
-
-class MCMP_n_tiny(nn.Module):
-    def __init__(self, args):
-        super(MCMP_n_tiny, self).__init__()
-
-        self.n_ch = 2
-        self.chs = 256 // self.n_ch
-
-        osnet = osnet_x0_5(pretrained=True)
-        # attention = CAM_Module(256)
-        # attention = SE_Module(256)
-
-        self.backone = nn.Sequential(
-            osnet.conv1,
-            osnet.maxpool,
-            osnet.conv2,
-            # attention,
-            osnet.conv3[0]
-        )
-
-        conv3 = osnet.conv3[1:]
-
-        downsample_conv4 = osnet._make_layer(OSBlock, 2, 192, 256, True)
-        downsample_conv4[:2].load_state_dict(osnet.conv4[:2].state_dict())
-
-        self.global_branch = nn.Sequential(copy.deepcopy(
-            conv3), copy.deepcopy(downsample_conv4), copy.deepcopy(osnet.conv5))
-
-        self.partial_branch = nn.Sequential(copy.deepcopy(
-            conv3), copy.deepcopy(osnet.conv4), copy.deepcopy(osnet.conv5))
-
-        self.channel_branch = nn.Sequential(copy.deepcopy(
-            conv3), copy.deepcopy(osnet.conv4), copy.deepcopy(osnet.conv5))
-
-        if args.pool == 'max':
-            pool2d = nn.AdaptiveMaxPool2d
-        elif args.pool == 'avg':
-            pool2d = nn.AdaptiveAvgPool2d
-        else:
-            raise Exception()
-
-        self.global_pooling = pool2d((1, 1))
-        self.partial_pooling = pool2d((2, 1))
-        self.channel_pooling = pool2d((1, 1))
-
-        reduction = BNNeck3(256, args.num_classes,
-                            args.feats, return_f=True)
-        self.reduction_0 = copy.deepcopy(reduction)
-        self.reduction_1 = copy.deepcopy(reduction)
-        self.reduction_2 = copy.deepcopy(reduction)
-        self.reduction_3 = copy.deepcopy(reduction)
-
-        self.shared = nn.Sequential(nn.Conv2d(
-            self.chs, args.feats, 1, bias=False), nn.BatchNorm2d(args.feats), nn.ReLU(True))
-        self.weights_init_kaiming(self.shared)
-
-        self.reduction_ch_0 = BNNeck(
-            args.feats, args.num_classes, return_f=True)
-        self.reduction_ch_1 = BNNeck(
-            args.feats, args.num_classes, return_f=True)
-
-        # if args.drop_block:
-        #     print('Using batch random erasing block.')
-        #     self.batch_drop_block = BatchRandomErasing()
-        if args.drop_block:
-            print('Using batch drop block.')
-            self.batch_drop_block = BatchDrop(h_ratio=0.33, w_ratio=1)
-        else:
-            self.batch_drop_block = None
-
-        self.activation_map = args.activation_map
-
-    def forward(self, x):
-        # if self.batch_drop_block is not None:
-        #     x = self.batch_drop_block(x)
-
-        x = self.backone(x)
-
-        glo = self.global_branch(x)
-        par = self.partial_branch(x)
-        cha = self.channel_branch(x)
-
-        if self.activation_map:
-
-            _, _, h_par, _ = par.size()
-
-            fmap_p0 = par[:, :, :h_par // 2, :]
-            fmap_p1 = par[:, :, h_par // 2:, :]
-            fmap_c0 = cha[:, :self.chs, :, :]
-            fmap_c1 = cha[:, self.chs:, :, :]
-            print('activation_map')
-
-            return glo, fmap_c0, fmap_c1, fmap_p0, fmap_p1
-
-        if self.batch_drop_block is not None:
-            glo = self.batch_drop_block(glo)
-
-        glo = self.global_pooling(glo)  # shape:(batchsize, 2048,1,1)
-        g_par = self.global_pooling(par)  # shape:(batchsize, 2048,1,1)
-        p_par = self.partial_pooling(par)  # shape:(batchsize, 2048,3,1)
-        cha = self.channel_pooling(cha)
-
-        p0 = p_par[:, :, 0:1, :]
-        p1 = p_par[:, :, 1:2, :]
-
-        f_glo = self.reduction_0(glo)
-        f_p0 = self.reduction_1(g_par)
-        f_p1 = self.reduction_2(p0)
-        f_p2 = self.reduction_3(p1)
-
-        ################
-
-        c0 = cha[:, :self.chs, :, :]
-        c1 = cha[:, self.chs:, :, :]
-        c0 = self.shared(c0)
-        c1 = self.shared(c1)
-        f_c0 = self.reduction_ch_0(c0)
-        f_c1 = self.reduction_ch_1(c1)
-
-        ################
-
-        fea = [f_glo[-1], f_p0[-1]]
-
-        if not self.training:
-
-            return torch.stack([f_glo[0], f_p0[0], f_p1[0], f_p2[0], f_c0[0], f_c1[0]], dim=2)
-
-        return [f_glo[1], f_p0[1], f_p1[1], f_p2[1], f_c0[1], f_c1[1]], fea
-
-    def weights_init_kaiming(self, m):
-        classname = m.__class__.__name__
-        if classname.find('Linear') != -1:
-            nn.init.kaiming_normal_(m.weight, a=0, mode='fan_out')
-            nn.init.constant_(m.bias, 0.0)
-        elif classname.find('Conv') != -1:
-            nn.init.kaiming_normal_(m.weight, a=0, mode='fan_in')
-            if m.bias is not None:
-                nn.init.constant_(m.bias, 0.0)
-        elif classname.find('BatchNorm') != -1:
-            if m.affine:
-                nn.init.constant_(m.weight, 1.0)
-                nn.init.constant_(m.bias, 0.0)
-
-
-if __name__ == '__main__':
-    # Here I left a simple forward function.
-    # Test the model, before you train it.
-    import argparse
-
-    parser = argparse.ArgumentParser(description='MGN')
-    parser.add_argument('--num_classes', type=int, default=751, help='')
-    parser.add_argument('--bnneck', type=bool, default=True)
-    parser.add_argument('--pool', type=str, default='max')
-    parser.add_argument('--feats', type=int, default=512)
-    parser.add_argument('--drop_block', type=bool, default=True)
-    parser.add_argument('--w_ratio', type=float, default=1.0, help='')
-
-    args = parser.parse_args()
-    net = MCMP_n(args)
-    # net.classifier = nn.Sequential()
-    # print([p for p in net.parameters()])
-    # a=filter(lambda p: p.requires_grad, net.parameters())
-    # print(a)
-
-    print(net)
-    input = Variable(torch.FloatTensor(8, 3, 384, 128))
-    net.eval()
-    output = net(input)
-    print(output.shape)
-    print('net output size:')
-    # print(len(output))
-    # for k in output[0]:
-    #     print(k.shape)
-    # for k in output[1]:
-    #     print(k.shape)
diff --git a/model/osnet.py b/model/osnet.py
index 7d7b4fa36..7bd3d9995 100644
--- a/model/osnet.py
+++ b/model/osnet.py
@@ -353,7 +353,7 @@ def _get_torch_home():
     cached_file = os.path.join(model_dir, filename)
 
     if not os.path.exists(cached_file):
-        gdown.download(pretrained_urls[key], cached_file, quiet=False)
+        gdown.download(pretrained_urls[key], cached_file, quiet=True)
 
     state_dict = torch.load(cached_file)
     model_dict = model.state_dict()
diff --git a/model/p.py b/model/p.py
index f71110ca0..839fb5743 100644
--- a/model/p.py
+++ b/model/p.py
@@ -6,87 +6,22 @@
 import random
 import math
 from .osnet import osnet_x1_0, OSBlock
-from .attention import PAM_Module, CAM_Module, SE_Module, Dual_Module
+from .attention import BatchDrop, BatchRandomErasing, PAM_Module, CAM_Module, SE_Module, Dual_Module
 from .bnneck import BNNeck, BNNeck3
 
 from torch.autograd import Variable
 
 
-class BatchDrop(nn.Module):
-    def __init__(self, h_ratio, w_ratio):
-        super(BatchDrop, self).__init__()
-        self.h_ratio = h_ratio
-        self.w_ratio = w_ratio
-
-    def forward(self, x):
-        if self.training:
-            h, w = x.size()[-2:]
-            rh = round(self.h_ratio * h)
-            rw = round(self.w_ratio * w)
-            sx = random.randint(0, h - rh)
-            sy = random.randint(0, w - rw)
-            mask = x.new_ones(x.size())
-            mask[:, :, sx:sx + rh, sy:sy + rw] = 0
-            x = x * mask
-        return x
-
-
-class BatchRandomErasing(nn.Module):
-    def __init__(self, probability=0.5, sl=0.02, sh=0.4, r1=0.3, mean=[0.4914, 0.4822, 0.4465]):
-        super(BatchRandomErasing, self).__init__()
-
-        self.probability = probability
-        self.mean = mean
-        self.sl = sl
-        self.sh = sh
-        self.r1 = r1
-
-    def forward(self, img):
-        if self.training:
-
-            if random.uniform(0, 1) > self.probability:
-                return img
-
-            for attempt in range(100):
-
-                area = img.size()[2] * img.size()[3]
-
-                target_area = random.uniform(self.sl, self.sh) * area
-                aspect_ratio = random.uniform(self.r1, 1 / self.r1)
-
-                h = int(round(math.sqrt(target_area * aspect_ratio)))
-                w = int(round(math.sqrt(target_area / aspect_ratio)))
-
-                if w < img.size()[3] and h < img.size()[2]:
-                    x1 = random.randint(0, img.size()[2] - h)
-                    y1 = random.randint(0, img.size()[3] - w)
-                    if img.size()[1] == 3:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                        img[:, 1, x1:x1 + h, y1:y1 + w] = self.mean[1]
-                        img[:, 2, x1:x1 + h, y1:y1 + w] = self.mean[2]
-                    else:
-                        img[:, 0, x1:x1 + h, y1:y1 + w] = self.mean[0]
-                    return img
-
-        return img
-
-
 class P(nn.Module):
     def __init__(self, args):
         super(P, self).__init__()
 
-        # self.n_ch = 2
-        # self.chs = 512 // self.n_ch
-
         osnet = osnet_x1_0(pretrained=True)
-        # attention = CAM_Module(256)
-        # attention = SE_Module(256)
 
         self.backone = nn.Sequential(
             osnet.conv1,
             osnet.maxpool,
             osnet.conv2,
-            # attention,
             osnet.conv3[0]
         )
 
diff --git a/model/resnet50.py b/model/resnet50.py
index 63af294ac..9c5a227ab 100644
--- a/model/resnet50.py
+++ b/model/resnet50.py
@@ -60,13 +60,9 @@ def __init__(self, args, droprate=0.5, stride=1):
                                      return_f=True)
             # self.classifier = BNNeck3(2048, args.num_classes, feat_dim=512,
             #  return_f=True)
-            # self.classifier = new_BNNeck(2048, args.num_classes, 256, return_f=True)
-            # print(args.num_classes)
 
         else:
 
-            # self.classifier = ClassBlock(
-            #     2048, args.num_classes,relu=True, return_f=True)
             self.classifier = ClassBlock(
                 2048, args.num_classes, num_bottleneck=args.feats, return_f=True)
 
@@ -96,7 +92,7 @@ def forward(self, x):
             return x[0]
         # print(x[1].size())
         # print(x[-1].size())
-        return [x[1], x[-1]]
+        return [x[1]], [x[-1]]
 
 
 if __name__ == '__main__':
diff --git a/optim/__init__.py b/optim/__init__.py
index a711062c9..703e896fc 100644
--- a/optim/__init__.py
+++ b/optim/__init__.py
@@ -11,17 +11,9 @@ def make_optimizer(args, model):
         ignored_params = []
         for i in range(args.parts):
             name = 'classifier' + str(i)
-            c = getattr(model.model, name)
+            c = getattr(model, name)
             ignored_params = ignored_params + list(map(id, c.parameters()))
 
-        # ignored_params = (list(map(id, model.model.classifier0.parameters()))
-        #                   + list(map(id, model.model.classifier1.parameters()))
-        #                   # + list(map(id, model.model.classifier2.parameters()))
-        #                   # + list(map(id, model.model.classifier3.parameters()))
-        #                   # + list(map(id, model.model.classifier4.parameters()))
-        #                   # + list(map(id, model.model.classifier5.parameters())))
-        #                   )
-
         ignored_params = tuple(ignored_params)
 
         base_params = filter(lambda p: id(
@@ -108,7 +100,6 @@ def make_scheduler(args, optimizer, last_epoch):
 
         scheduler = WarmupCosineAnnealingLR(
             optimizer, multiplier=1, warmup_epoch=10, min_lr=args.lr / 1000, epochs=args.epochs, last_epoch=last_epoch)
-        # optimizer, multiplier=1, warmup_epoch=10, min_lr=3.5e-7, epochs=args.epochs, last_epoch=last_epoch)
 
         return scheduler
 
diff --git a/optim/warmup_cosine_scheduler.py b/optim/warmup_cosine_scheduler.py
index bd732f605..f163be510 100644
--- a/optim/warmup_cosine_scheduler.py
+++ b/optim/warmup_cosine_scheduler.py
@@ -40,85 +40,6 @@ def get_lr(self):
             return [base_lr * ((self.multiplier - 1.) * self.last_epoch / self.warmup_epoch + 1.) for base_lr in self.base_lrs]
 
 
-'''
-class WarmupCosineAnnealingLR(_LRScheduler):
-    """ Gradually warm-up(increasing) learning rate in optimizer.
-    Proposed in 'Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour'.
-    Args:
-        optimizer (Optimizer): Wrapped optimizer.
-        multiplier: target learning rate = base lr * multiplier if multiplier > 1.0. if multiplier = 1.0, lr starts from 0 and ends up with the base_lr.
-        warmup_epoch: target learning rate is reached at warmup_epoch, gradually
-        after_scheduler: after target_epoch, use this scheduler(eg. ReduceLROnPlateau)
-    """
-
-    def __init__(self, optimizer, multiplier, warmup_epoch, epochs, min_lr=3.5e-7, last_epoch=-1):
-        self.multiplier = multiplier
-        if self.multiplier < 1.:
-            raise ValueError(
-                'multiplier should be greater thant or equal to 1.')
-        self.warmup_epoch = warmup_epoch
-        self.last_epoch = last_epoch
-        if last_epoch < 10:
-            self.after_scheduler = lrs.CosineAnnealingLR(
-                optimizer, float(epochs - warmup_epoch), eta_min=min_lr, last_epoch=-1
-            )
-        else:
-            self.after_scheduler = lrs.CosineAnnealingLR(
-                optimizer, float(epochs - warmup_epoch), eta_min=min_lr, last_epoch=last_epoch - warmup_epoch - 1
-            )
-        # self.after_scheduler = after_scheduler
-        self.finished = False
-
-        super(WarmupCosineAnnealingLR, self).__init__(optimizer, last_epoch)
-
-    def get_lr(self):
-        if self.last_epoch > self.warmup_epoch - 1:
-            if self.after_scheduler:
-                if not self.finished:
-                    self.after_scheduler.base_lrs = [
-                        base_lr * self.multiplier for base_lr in self.base_lrs]
-                    self.finished = True
-                return self.after_scheduler.get_last_lr()
-            return [base_lr * self.multiplier for base_lr in self.base_lrs]
-
-        if self.multiplier == 1.0:
-            return [base_lr * (float(self.last_epoch + 1) / self.warmup_epoch) for base_lr in self.base_lrs]
-        else:
-            return [base_lr * ((self.multiplier - 1.) * self.last_epoch / self.warmup_epoch + 1.) for base_lr in self.base_lrs]
-
-    def step_ReduceLROnPlateau(self, metrics, epoch=None):
-        if epoch is None:
-            epoch = self.last_epoch + 1
-        # ReduceLROnPlateau is called at the end of epoch, whereas others are called at beginning
-        self.last_epoch = epoch if epoch != 0 else 1
-        if self.last_epoch <= self.warmup_epoch:
-            warmup_lr = [base_lr * ((self.multiplier - 1.) * self.last_epoch /
-                                    self.warmup_epoch + 1.) for base_lr in self.base_lrs]
-            for param_group, lr in zip(self.optimizer.param_groups, warmup_lr):
-                param_group['lr'] = lr
-        else:
-            if epoch is None:
-                self.after_scheduler.step(metrics, None)
-            else:
-                self.after_scheduler.step(metrics, epoch - self.warmup_epoch)
-
-    def step(self, epoch=None, metrics=None):
-        if type(self.after_scheduler) != ReduceLROnPlateau:
-            if self.finished and self.after_scheduler:
-                if epoch is None:
-                    # print('koooo',self.last_epoch)
-                    self.after_scheduler.step()
-                    self.last_epoch += 1
-
-                else:
-                    self.after_scheduler.step(epoch - self.warmup_epoch)
-                self._last_lr = self.after_scheduler.get_last_lr()
-                # self.last_epoch = self.after_scheduler.last_epoch + self.warmup_epoch
-            else:
-                return super(WarmupCosineAnnealingLR, self).step(epoch)
-        else:
-            self.step_ReduceLROnPlateau(metrics, epoch)
-'''
 
 if __name__ == '__main__':
     v = torch.zeros(10)
diff --git a/option.py b/option.py
index 05b88935b..0485d7e19 100755
--- a/option.py
+++ b/option.py
@@ -2,88 +2,136 @@
 
 parser = argparse.ArgumentParser(description='MGN')
 
-parser.add_argument('--nThread', type=int, default=4, help='number of threads for data loading')
-parser.add_argument('--cpu', action='store_true', help='if raise, use cpu only')
+parser.add_argument('--nThread', type=int, default=4,
+                    help='number of threads for data loading')
+parser.add_argument('--cpu', action='store_true',
+                    help='if raise, use cpu only')
 parser.add_argument('--nGPU', type=int, default=1, help='number of GPUs')
 
 parser.add_argument("--config", type=str, default="", help='config path')
 
-parser.add_argument("--datadir", type=str, default="Market-1501-v15.09.15", help='dataset directory root')
-parser.add_argument('--data_train', type=str, default='Market1501', help='train dataset name')
-parser.add_argument('--data_test', type=str, default='Market1501', help='test dataset name')
-
-parser.add_argument('--reset', action='store_true', help='reset the training')
-parser.add_argument("--epochs", type=int, default=80, help='number of epochs to train')
-parser.add_argument('--test_every', type=int, default=20, help='do test per every N epochs')
+parser.add_argument("--datadir", type=str,
+                    default="Market-1501-v15.09.15", help='dataset directory root')
+parser.add_argument('--data_train', type=str,
+                    default='Market1501', help='train dataset name')
+parser.add_argument('--data_test', type=str,
+                    default='Market1501', help='test dataset name')
+parser.add_argument('--cuhk03_labeled', action='store_true',
+                    help='if raise, use cuhk03-labeled dataset, otherwise cuhk03-detected dataset')
+
+parser.add_argument("--epochs", type=int, default=80,
+                    help='number of epochs to train')
+parser.add_argument('--test_every', type=int, default=20,
+                    help='run test every N epochs')
 parser.add_argument("--batchid", type=int, default=16, help='the batch for id')
-parser.add_argument("--batchimage", type=int, default=4, help='the batch of per id')
-parser.add_argument("--batchtest", type=int, default=32, help='input batch size for test')
-parser.add_argument('--test_only', action='store_true', help='set this option to test the model')
-parser.add_argument('--sampler', type=str,default='True',help='do use sampler in dataloader')
-
-
-parser.add_argument('--model', default='MGN', help='model name')
-parser.add_argument('--loss', type=str, default='1*CrossEntropy+1*Triplet', help='loss function configuration')
-parser.add_argument("--if_labelsmooth", action='store_true', help='Label Smooth Trick')
-parser.add_argument("--bnneck", action='store_true', help='Apply bnneck before classifier, refer to BoT paper')
-parser.add_argument("--feat_inference", type=str,default='after', help='Apply bnneck before classifier, refer to BoT paper')
-parser.add_argument("--drop_block", action='store_true', help='Apply batch drop block')
-parser.add_argument("--w_ratio", type=float, default=1.0, help='w_ratio of batch drop block')
-parser.add_argument("--h_ratio", type=float, default=0.3, help='w_ratio of batch drop block')
-
-
-
-parser.add_argument('--act', type=str, default='relu', help='activation function')
+parser.add_argument("--batchimage", type=int, default=4,
+                    help='number of images per id in a batch')
+parser.add_argument("--batchtest", type=int, default=32,
+                    help='input batch size for test')
+parser.add_argument('--test_only', action='store_true',
+                    help='set this option to test the model')
+parser.add_argument('--sampler', type=str, default='True',
+                    help='whether to use sampler in dataloader')
+
+
+parser.add_argument('--model', default='LMBN_n', help='model name')
+parser.add_argument('--loss', type=str, default='1*CrossEntropy+1*Triplet',
+                    help='loss function configuration')
+parser.add_argument("--if_labelsmooth", action='store_true',
+                    help='Label Smoothing Trick')
+parser.add_argument("--bnneck", action='store_true',
+                    help='Apply bnneck before classifier, refer to BoT paper')
+parser.add_argument("--feat_inference", type=str, default='after',
+                    help='use features before or after bnneck for inference, option: before, after')
+parser.add_argument("--drop_block", action='store_true',
+                    help='Apply batch drop block')
+parser.add_argument("--w_ratio", type=float, default=1.0,
+                    help='w_ratio of batch drop block')
+parser.add_argument("--h_ratio", type=float, default=0.3,
+                    help='h_ratio of batch drop block')
+
+
+parser.add_argument('--act', type=str, default='relu',
+                    help='activation function')
 parser.add_argument('--pool', type=str, default='avg', help='pool function')
-parser.add_argument('--feats', type=int, default=512, help='number of feature maps')
-parser.add_argument('--height', type=int, default=384, help='height of the input image')
-parser.add_argument('--width', type=int, default=128, help='width of the input image')
+parser.add_argument('--feats', type=int, default=512,
+                    help='number of feature maps')
+parser.add_argument('--height', type=int, default=384,
+                    help='height of the input image')
+parser.add_argument('--width', type=int, default=128,
+                    help='width of the input image')
 parser.add_argument('--num_classes', type=int, default=751, help='')
-parser.add_argument('--T', type=int, default=3, help='number of iterations of computing group loss')
-parser.add_argument('--num_anchors', type=int, default=1, help='number of iterations of computing group loss')
+parser.add_argument('--T', type=int, default=1,
+                    help='number of iterations of computing group loss')
+parser.add_argument('--num_anchors', type=int, default=2,
+                    help='number of iterations of computing group loss')
 
 
-parser.add_argument("--lr", type=float, default=2e-4, help='learning rate')
-parser.add_argument('--optimizer', default='ADAM', choices=('SGD','ADAM','NADAM','RMSprop'), help='optimizer to use (SGD | ADAM | NADAM | RMSprop)')
+parser.add_argument("--lr", type=float, default=6e-4, help='learning rate')
+parser.add_argument('--optimizer', default='ADAM', choices=('SGD', 'ADAM',
+                                                            'NADAM', 'RMSprop'), help='optimizer to use (SGD | ADAM | NADAM | RMSprop)')
 parser.add_argument('--momentum', type=float, default=0.9, help='SGD momentum')
 parser.add_argument('--dampening', type=float, default=0, help='SGD dampening')
 parser.add_argument('--nesterov', action='store_true', help='SGD nesterov')
 parser.add_argument('--beta1', type=float, default=0.9, help='ADAM beta1')
 parser.add_argument('--beta2', type=float, default=0.999, help='ADAM beta2')
 parser.add_argument('--amsgrad', action='store_true', help='ADAM amsgrad')
-parser.add_argument('--epsilon', type=float, default=1e-8, help='ADAM epsilon for numerical stability')
-parser.add_argument('--gamma', type=float, default=0.1, help='learning rate decay factor for step decay')
-parser.add_argument('--weight_decay', type=float, default=5e-4, help='weight decay')
-parser.add_argument('--decay_type', type=str, default='step', help='learning rate decay type')
-parser.add_argument('--lr_decay', type=int, default=60, help='learning rate decay per N epochs')
-parser.add_argument('--warmup', type=str, default='constant', help='warmup iteration, option: linear, constant')
-parser.add_argument('--pcb_different_lr', type=str,default='True', help='use different lr in pcb optimizer')
-parser.add_argument("--cosine_annealing", action='store_true', help='if raise, cosine_annealing')
-parser.add_argument("--w_cosine_annealing", action='store_true', help='if raise, warmup cosine_annealing')
+parser.add_argument('--epsilon', type=float, default=1e-8,
+                    help='ADAM epsilon for numerical stability')
+parser.add_argument('--gamma', type=float, default=0.1,
+                    help='learning rate decay factor for step decay')
+parser.add_argument('--weight_decay', type=float,
+                    default=5e-4, help='weight decay')
+parser.add_argument('--decay_type', type=str, default='step',
+                    help='learning rate decay type')
+parser.add_argument('--lr_decay', type=int, default=60,
+                    help='learning rate decay per N epochs')
+parser.add_argument('--warmup', type=str, default='constant', choices=['constant', 'linear'],
+                    help='warmup iteration, option: linear, constant')
+parser.add_argument('--pcb_different_lr', type=str,
+                    default='True', help='use different lr in pcb optimizer')
+parser.add_argument("--cosine_annealing", action='store_true',
+                    help='if raise, cosine_annealing')
+parser.add_argument("--w_cosine_annealing", action='store_true',
+                    help='if raise, warmup cosine_annealing')
 
 
 parser.add_argument('--parts', type=int, default=6, help='parts of PCB model')
 parser.add_argument("--margin", type=float, default=1.2, help='')
-parser.add_argument("--re_rank", action='store_true', help='if raise, use re-ranking')
-parser.add_argument("--cutout", action='store_true', help='if raise, use cutout augmentation')
+parser.add_argument("--re_rank", action='store_true',
+                    help='if raise, use re-ranking')
+parser.add_argument("--cutout", action='store_true',
+                    help='if raise, use cutout augmentation')
 
 parser.add_argument("--random_erasing", action='store_true', help='')
 parser.add_argument("--probability", type=float, default=0.5, help='')
 
+parser.add_argument('--save', type=str, default='test',
+                    help='folder name to save')
+parser.add_argument('--load', type=str, default='', help='folder name to load')
+parser.add_argument('--pre_train', type=str, default='',
+                    help='pre-trained model path')
+parser.add_argument("--activation_map", action='store_true',
+                    help='if raise, return feature activation map')
+# For Neptune
+parser.add_argument('--nep_token', '-n', type=str,
+                    default='', help='neptune_api_token')
+parser.add_argument('--nep_id', type=str,
+                    default='', help='neptune_experiment_id')
+parser.add_argument('--nep_name', type=str,
+                    default='x.ji/mcmp', help='neptune_project_name')
+
+parser.add_argument('--reset', action='store_true', help='reset the training')
 # parser.add_argument("--savedir", type=str, default='saved_models', help='directory name to save')
 # parser.add_argument("--outdir", type=str, default='out', help='')
 # parser.add_argument("--resume", action='store_true', help='whether resume training from specific checkpoint')
 # parser.add_argument('--save_models', action='store_true', help='save all intermediate models')
-parser.add_argument('--save', type=str, default='test', help='file name to save')
-parser.add_argument('--load', type=str, default='', help='file name to load')
-parser.add_argument('--pre_train', type=str, default='', help='pre-trained model directory')
-parser.add_argument("--activation_map", action='store_true', help='if raise, return feature activation map')
+
 
 args = parser.parse_args()
 
 for arg in vars(args):
-    if vars(args)[arg] == 'True':
-        vars(args)[arg] = True
-    elif vars(args)[arg] == 'False':
-        vars(args)[arg] = False
-
+    if vars(args)[arg] == 'True':
+        vars(args)[arg] = True
+    elif vars(args)[arg] == 'False':
+        vars(args)[arg] = False
diff --git a/utils/LightMBN.png b/utils/LightMBN.png
new file mode 100644
index 000000000..01503fea8
Binary files /dev/null and b/utils/LightMBN.png differ
diff --git a/utils/functions.py b/utils/functions.py
index 10c6b2e52..d94c60314 100755
--- a/utils/functions.py
+++ b/utils/functions.py
@@ -193,8 +193,10 @@ def cmc_baseline(distmat, query_ids=None, gallery_ids=None,
     return CMC, mAP
 
 
-def eval_liaoxingyu(distmat, q_pids, g_pids, q_camids, g_camids, max_rank):
-    """Evaluation with market1501 metric
+def evaluation(distmat, q_pids, g_pids, q_camids, g_camids, max_rank):
+    """
+    Written by Liao Xingyu
+    Evaluation with market1501 metric
     Key: for each query identity, its gallery images from the same camera view are discarded.
     """
     num_q, num_g = distmat.shape
diff --git a/utils/re_ranking.py b/utils/re_ranking.py
index e878476ae..ad20f3810 100755
--- a/utils/re_ranking.py
+++ b/utils/re_ranking.py
@@ -30,70 +30,291 @@
 
 
 import numpy as np
+import torch
+import time
+import gc
+from tqdm import tqdm
 
-def k_reciprocal_neigh( initial_rank, i, k1):
-    forward_k_neigh_index = initial_rank[i,:k1+1]
-    backward_k_neigh_index = initial_rank[forward_k_neigh_index,:k1+1]
-    fi = np.where(backward_k_neigh_index==i)[0]
+
+def k_reciprocal_neigh(initial_rank, i, k1):
+    forward_k_neigh_index = initial_rank[i, :k1 + 1]
+    backward_k_neigh_index = initial_rank[forward_k_neigh_index, :k1 + 1]
+    fi = np.where(backward_k_neigh_index == i)[0]
     return forward_k_neigh_index[fi]
 
+
 def re_ranking(q_g_dist, q_q_dist, g_g_dist, k1=20, k2=6, lambda_value=0.3):
     # The following naming, e.g. gallery_num, is different from outer scope.
     # Don't care about it.
     original_dist = np.concatenate(
-      [np.concatenate([q_q_dist, q_g_dist], axis=1),
-       np.concatenate([q_g_dist.T, g_g_dist], axis=1)],
-      axis=0)
-    original_dist = 2. - 2 * original_dist   #np.power(original_dist, 2).astype(np.float32)
-    original_dist = np.transpose(1. * original_dist/np.max(original_dist,axis = 0))
+        [np.concatenate([q_q_dist, q_g_dist], axis=1),
+         np.concatenate([q_g_dist.T, g_g_dist], axis=1)],
+        axis=0)
+    # np.power(original_dist, 2).astype(np.float32)
+    original_dist = 2. - 2 * original_dist
+    original_dist = np.transpose(
+        1. * original_dist / np.max(original_dist, axis=0))
     V = np.zeros_like(original_dist).astype(np.float32)
     #initial_rank = np.argsort(original_dist).astype(np.int32)
     # top K1+1
-    initial_rank = np.argpartition( original_dist, range(1,k1+1) )
+    initial_rank = np.argpartition(original_dist, range(1, k1 + 1))
 
     query_num = q_g_dist.shape[0]
     all_num = original_dist.shape[0]
 
     for i in range(all_num):
         # k-reciprocal neighbors
-        k_reciprocal_index = k_reciprocal_neigh( initial_rank, i, k1)
+        k_reciprocal_index = k_reciprocal_neigh(initial_rank, i, k1)
         k_reciprocal_expansion_index = k_reciprocal_index
         for j in range(len(k_reciprocal_index)):
             candidate = k_reciprocal_index[j]
-            candidate_k_reciprocal_index = k_reciprocal_neigh( initial_rank, candidate, int(np.around(k1/2)))
-            if len(np.intersect1d(candidate_k_reciprocal_index,k_reciprocal_index))> 2./3*len(candidate_k_reciprocal_index):
-                k_reciprocal_expansion_index = np.append(k_reciprocal_expansion_index,candidate_k_reciprocal_index)
+            candidate_k_reciprocal_index = k_reciprocal_neigh(
+                initial_rank, candidate, int(np.around(k1 / 2)))
+            if len(np.intersect1d(candidate_k_reciprocal_index, k_reciprocal_index)) > 2. / 3 * len(candidate_k_reciprocal_index):
+                k_reciprocal_expansion_index = np.append(
+                    k_reciprocal_expansion_index, candidate_k_reciprocal_index)
 
         k_reciprocal_expansion_index = np.unique(k_reciprocal_expansion_index)
-        weight = np.exp(-original_dist[i,k_reciprocal_expansion_index])
-        V[i,k_reciprocal_expansion_index] = 1.*weight/np.sum(weight)
+        weight = np.exp(-original_dist[i, k_reciprocal_expansion_index])
+        V[i, k_reciprocal_expansion_index] = 1. * weight / np.sum(weight)
 
-    original_dist = original_dist[:query_num,]
+    original_dist = original_dist[:query_num, ]
     if k2 != 1:
-        V_qe = np.zeros_like(V,dtype=np.float32)
+        V_qe = np.zeros_like(V, dtype=np.float32)
         for i in range(all_num):
-            V_qe[i,:] = np.mean(V[initial_rank[i,:k2],:],axis=0)
+            V_qe[i, :] = np.mean(V[initial_rank[i, :k2], :], axis=0)
         V = V_qe
         del V_qe
     del initial_rank
     invIndex = []
     for i in range(all_num):
-        invIndex.append(np.where(V[:,i] != 0)[0])
+        invIndex.append(np.where(V[:, i] != 0)[0])
 
-    jaccard_dist = np.zeros_like(original_dist,dtype = np.float32)
+    jaccard_dist = np.zeros_like(original_dist, dtype=np.float32)
 
     for i in range(query_num):
-        temp_min = np.zeros(shape=[1,all_num],dtype=np.float32)
-        indNonZero = np.where(V[i,:] != 0)[0]
+        temp_min = np.zeros(shape=[1, all_num], dtype=np.float32)
+        indNonZero = np.where(V[i, :] != 0)[0]
         indImages = []
         indImages = [invIndex[ind] for ind in indNonZero]
         for j in range(len(indNonZero)):
-            temp_min[0,indImages[j]] = temp_min[0,indImages[j]]+ np.minimum(V[i,indNonZero[j]],V[indImages[j],indNonZero[j]])
-        jaccard_dist[i] = 1-temp_min/(2.-temp_min)
+            temp_min[0, indImages[j]] = temp_min[0, indImages[j]] + \
+                np.minimum(V[i, indNonZero[j]], V[indImages[j], indNonZero[j]])
+        jaccard_dist[i] = 1 - temp_min / (2. - temp_min)
 
-    final_dist = jaccard_dist*(1-lambda_value) + original_dist*lambda_value
+    final_dist = jaccard_dist * (1 - lambda_value) + \
+        original_dist * lambda_value
     del original_dist
     del V
     del jaccard_dist
-    final_dist = final_dist[:query_num,query_num:]
+    final_dist = final_dist[:query_num, query_num:]
+    return final_dist
+
+# #!/usr/bin/env python3
+# # -*- coding: utf-8 -*-
+# """
+# Created on Dec, 25 May 2019 20:29:09
+# Faster version for kesci ReID challenge
+
+# @author: luohao
+# """
+
+# """
+# CVPR2017 paper:Zhong Z, Zheng L, Cao D, et al. Re-ranking Person Re-identification with k-reciprocal Encoding[J]. 2017.
+# url:http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhong_Re-Ranking_Person_Re-Identification_CVPR_2017_paper.pdf
+# Matlab version: https://github.com/zhunzhong07/person-re-ranking
+# """
+
+# """
+# API
+
+# probFea: all feature vectors of the query set (torch tensor)
+# galFea: all feature vectors of the gallery set (torch tensor)
+# k1,k2,lambda: parameters, the original paper is (k1=20,k2=6,lambda=0.3)
+# MemorySave: set to 'True' when using MemorySave mode
+# Minibatch: available when 'MemorySave' is 'True'
+# """
+
+# Save memory version
+
+def euclidean_distance(qf, gf):
+
+    m = qf.shape[0]
+    n = gf.shape[0]
+
+    # dist_mat = torch.pow(qf,2).sum(dim=1, keepdim=True).expand(m,n) +\
+    #     torch.pow(gf,2).sum(dim=1, keepdim=True).expand(n,m).t()
+    # dist_mat.addmm_(1,-2,qf,gf.t())
+
+    # for L2-norm feature
+    dist_mat = 2 - 2 * torch.matmul(qf, gf.t())
+    return dist_mat
+
+
+def batch_euclidean_distance(qf, gf, N=6000):
+    m = qf.shape[0]
+    n = gf.shape[0]
+
+    dist_mat = []
+    for j in range(n // N + 1):
+        temp_gf = gf[j * N:j * N + N]
+        temp_qd = []
+        for i in range(m // N + 1):
+            temp_qf = qf[i * N:i * N + N]
+            temp_d = euclidean_distance(temp_qf, temp_gf)
+            temp_qd.append(temp_d)
+        temp_qd = torch.cat(temp_qd, dim=0)
+        temp_qd = temp_qd / (torch.max(temp_qd, dim=0)[0])
+        dist_mat.append(temp_qd.t().cpu())
+    del temp_qd
+    del temp_gf
+    del temp_qf
+    del temp_d
+    torch.cuda.empty_cache()  # empty GPU memory
+    dist_mat = torch.cat(dist_mat, dim=0)
+    return dist_mat
+
+
+# Run the top-K sort on the GPU
+# and return only the (k1+1) nearest results
+def batch_torch_topk(qf, gf, k1, N=6000):
+    m = qf.shape[0]
+    n = gf.shape[0]
+
+    dist_mat = []
+    initial_rank = []
+    for j in range(n // N + 1):
+        temp_gf = gf[j * N:j * N + N]
+        temp_qd = []
+        for i in range(m // N + 1):
+            temp_qf = qf[i * N:i * N + N]
+            temp_d = euclidean_distance(temp_qf, temp_gf)
+            temp_qd.append(temp_d)
+        temp_qd = torch.cat(temp_qd, dim=0)
+        temp_qd = temp_qd / (torch.max(temp_qd, dim=0)[0])
+        temp_qd = temp_qd.t()
+        initial_rank.append(torch.topk(temp_qd, k=k1, dim=1,
+                                       largest=False, sorted=True)[1])
+
+    del temp_qd
+    del temp_gf
+    del temp_qf
+    del temp_d
+    torch.cuda.empty_cache()  # empty GPU memory
+    initial_rank = torch.cat(initial_rank, dim=0).cpu().numpy()
+    return initial_rank
+
+
+def batch_v(feat, R, all_num):
+    V = np.zeros((all_num, all_num), dtype=np.float32)
+    m = feat.shape[0]
+    for i in tqdm(range(m)):
+        temp_gf = feat[i].unsqueeze(0)
+        # temp_qd = []
+        temp_qd = euclidean_distance(temp_gf, feat)
+        temp_qd = temp_qd / (torch.max(temp_qd))
+        temp_qd = temp_qd.squeeze()
+        temp_qd = temp_qd[R[i]]
+        weight = torch.exp(-temp_qd)
+        weight = (weight / torch.sum(weight)).cpu().numpy()
+        V[i, R[i]] = weight.astype(np.float32)
+    return V
+
+
+def re_ranking_gpu(probFea, galFea, k1, k2, lambda_value):
+    # The following naming, e.g. gallery_num, is different from outer scope.
+    # Don't care about it.
+
+    t1 = time.time()
+    query_num = probFea.size(0)
+    all_num = query_num + galFea.size(0)
+    feat = torch.cat([probFea, galFea]).cuda()
+    initial_rank = batch_torch_topk(feat, feat, k1 + 1, N=6000)
+    # del feat
+    del probFea
+    del galFea
+    torch.cuda.empty_cache()  # empty GPU memory
+    gc.collect()  # empty memory
+    print('Using totally {:.2f}s to compute initial_rank'.format(
+        time.time() - t1))
+    print('starting re_ranking')
+
+    R = []
+    for i in tqdm(range(all_num)):
+        # k-reciprocal neighbors
+        k_reciprocal_index = k_reciprocal_neigh(initial_rank, i, k1)
+        k_reciprocal_expansion_index = k_reciprocal_index
+        for j in range(len(k_reciprocal_index)):
+            candidate = k_reciprocal_index[j]
+            candidate_k_reciprocal_index = k_reciprocal_neigh(
+                initial_rank, candidate, int(np.around(k1 / 2)))
+            if len(np.intersect1d(candidate_k_reciprocal_index, k_reciprocal_index)) > 2. / 3 * len(
+                    candidate_k_reciprocal_index):
+                k_reciprocal_expansion_index = np.append(
+                    k_reciprocal_expansion_index, candidate_k_reciprocal_index)
+        k_reciprocal_expansion_index = np.unique(k_reciprocal_expansion_index)
+        R.append(k_reciprocal_expansion_index)
+
+    gc.collect()  # empty memory
+    print('Using totally {:.2f}S to compute R'.format(time.time() - t1))
+    V = batch_v(feat, R, all_num)
+    del R
+    gc.collect()  # empty memory
+    print('Using totally {:.2f}S to compute V-1'.format(time.time() - t1))
+    initial_rank = initial_rank[:, :k2]
+
+    # Faster version: the loop below is quicker,
+    # but uses more memory than the low-memory version further down
+    if k2 != 1:
+        V_qe = np.zeros_like(V, dtype=np.float16)
+        for i in range(all_num):
+            V_qe[i, :] = np.mean(V[initial_rank[i], :], axis=0)
+        V = V_qe
+        del V_qe
+    del initial_rank
+
+    # Low-memory version (disabled): saves about 40% memory,
+    # but is slower
+    '''gc.collect()  # empty memory
+    N = 2000
+    for j in range(all_num // N + 1):
+        if k2 != 1:
+            V_qe = np.zeros_like(V[:, j * N:j * N + N], dtype=np.float32)
+            for i in range(all_num):
+                V_qe[i, :] = np.mean(V[initial_rank[i], j * N:j * N + N], axis=0)
+            V[:, j * N:j * N + N] = V_qe
+            del V_qe
+    del initial_rank'''
+
+    gc.collect()  # empty memory
+    print('Using totally {:.2f}S to compute V-2'.format(time.time() - t1))
+    invIndex = []
+
+    for i in range(all_num):
+        invIndex.append(np.where(V[:, i] != 0)[0])
+    print('Using totally {:.2f}S to compute invIndex'.format(time.time() - t1))
+
+    jaccard_dist = np.zeros((query_num, all_num), dtype=np.float32)
+    for i in tqdm(range(query_num)):
+        temp_min = np.zeros(shape=[1, all_num], dtype=np.float32)
+        indNonZero = np.where(V[i, :] != 0)[0]
+        indImages = [invIndex[ind] for ind in indNonZero]
+        for j in range(len(indNonZero)):
+            temp_min[0, indImages[j]] = temp_min[0, indImages[j]] + np.minimum(V[i, indNonZero[j]],
+                                                                               V[indImages[j], indNonZero[j]])
+        jaccard_dist[i] = 1 - temp_min / (2. - temp_min)
+    del V
+    gc.collect()  # empty memory
+    original_dist = batch_euclidean_distance(feat, feat[:query_num, :]).numpy()
+    final_dist = jaccard_dist * (1 - lambda_value) + \
+        original_dist * lambda_value
+    # print(jaccard_dist)
+    del original_dist
+
+    del jaccard_dist
+
+    final_dist = final_dist[:query_num, query_num:]
+    print(final_dist)
+    print('Using totally {:.2f}S to compute final_distance'.format(
+        time.time() - t1))
     return final_dist
diff --git a/utils/utility.py b/utils/utility.py
index ce0e2bf9c..b0147a205 100644
--- a/utils/utility.py
+++ b/utils/utility.py
@@ -13,6 +13,10 @@
 from shutil import copyfile, copytree
 import pickle
 import warnings
+try:
+    import neptune
+except Exception:
+    print('Neptune is not installed.')
 
 ROOT_PATH = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
 
@@ -24,80 +28,68 @@ def __init__(self, args):
         self.since = datetime.datetime.now()
         now = datetime.datetime.now().strftime('%Y-%m-%d-%H:%M:%S')
 
+        def _make_dir(path):
+            if not os.path.exists(path):
+                os.makedirs(path)
+
         if args.load == '':
             if args.save == '':
                 args.save = now
             self.dir = ROOT_PATH + '/experiment/' + args.save
-
-            # Only works if using google drive
-            if ROOT_PATH[:8] == '/content':
-                self.model_save_dir = osp.join(
-                    ROOT_PATH, '..', '..', 'experiment' + args.save)
-            else:
-                self.model_save_dir = 'none'
-
         else:
             self.dir = ROOT_PATH + '/experiment/' + args.load
             if not os.path.exists(self.dir):
                 args.load = ''
-            else:
-                # pass
-                # if args.resume != 0:
-                #     self.add_log(torch.tensor(
-                #         [args.resume, 0, 0, 0, 0, 0], dtype=torch.float32).reshape(1, 6))
-                # else:
-                #     self.log = torch.load(self.dir + '/map_log.pt')
-                if os.path.exists(self.dir + '/map_log.pt'):
-                    self.log = torch.load(self.dir + '/map_log.pt')
-                # print('Continue from epoch {}...'.format(
-                #     len(self.log) * args.test_every))
+            args.save = args.load
 
-        print('Experiment results will be saved in {} '.format(self.dir))
+        ##### Only works when using google drive and colab #####
+        self.local_dir = None
+        if ROOT_PATH[:8] == '/content':
 
-        if args.reset:
-            os.system('rm -rf ' + self.dir)
-            args.load = ''
-
-        def _make_dir(path):
-            if not os.path.exists(path):
-                os.makedirs(path)
+            self.dir = osp.join('/content/drive/Shareddrives/Colab',
+                                self.dir[self.dir.find('experiment'):])
+            self.local_dir = ROOT_PATH + \
+                '/experiment/' + self.dir.split('/')[-1]
+            _make_dir(self.local_dir)
+        ############################################
 
         _make_dir(self.dir)
 
-        if not args.test_only:
-
-            # _make_dir(self.dir + '/model')
-            _make_dir(self.dir + '/scripts')
+        if os.path.exists(self.dir + '/map_log.pt'):
+            self.log = torch.load(self.dir + '/map_log.pt')
 
-            # copytree(os.path.join(ROOT_PATH, 'model'), self.dir + '/scripts/model' +
-            #          datetime.datetime.now().strftime('%Y-%m-%d-%H:%M:%S'))
-            # copytree(os.path.join(ROOT_PATH, 'loss'), self.dir + '/scripts/loss' +
-            #          datetime.datetime.now().strftime('%Y-%m-%d-%H:%M:%S'))
+        print('Experiment results will be saved in {} '.format(self.dir))
 
         open_type = 'a' if os.path.exists(self.dir + '/log.txt') else 'w'
         self.log_file = open(self.dir + '/log.txt', open_type)
-        with open(self.dir + '/config.txt', open_type) as f:
-            f.write(now + '\n\n')
-            for arg in vars(args):
-                f.write('{}: {}\n'.format(arg, getattr(args, arg)))
-            f.write('\n')
+
+        ######### For Neptune: ############
+
+        try:
+            # replace with your project name and token
+            exp = neptune.init(args.nep_name, args.nep_token)
+            if args.load == '':
+                self.exp = exp.create_experiment(name=self.dir.split('/')[-1],
+                                                 # tags=['keras', 'vis'],
+                                                 # upload_source_files=['**/*.py', 'parameters.yaml'],
+                                                 params=vars(args))
+                args.nep_id = self.exp.id
+            else:
+                self.exp = exp.get_experiments(id=args.nep_id)[0]
+            print(self.exp.id)
+
+        except Exception:
+            pass
+
+        ###################################
 
         with open(self.dir + '/config.yaml', open_type) as fp:
             dic = vars(args).copy()
-            del dic['load'], dic['save'], dic['pre_train'], dic['test_only'], dic['re_rank'],dic['activation_map']
+            del dic['load'], dic['save'], dic['pre_train'], dic['test_only'], dic['re_rank'], dic['activation_map'], dic['nep_token']
             yaml.dump(dic, fp, default_flow_style=False)
 
-    # def save(self, trainer, epoch, is_best=False):
-    #     trainer.model.save(self.dir, epoch, is_best=is_best)
-    #     trainer.loss.save(self.dir)
-    #     # trainer.loss.plot_loss(self.dir, epoch)
-
-    #     self.plot_map_rank(epoch)
-    #     torch.save(self.log, os.path.join(self.dir, 'map_log.pt'))
-    #     torch.save({'state_dict': trainer.optimizer.state_dict(), 'epoch': epoch},
-    #                os.path.join(self.dir, 'model',
-    #                             'optimizer.pt')
-    #                )
+        if self.local_dir is not None:
+            copyfile(self.dir + '/config.yaml', self.local_dir + '/config.yaml')
 
     def add_log(self, log):
         self.log = torch.cat([self.log, log])
@@ -109,10 +101,31 @@ def write_log(self, log, refresh=False, end='\n'):
         print(log, end=end)
         if end != '':
             self.log_file.write(log + end)
+
+            ######### For Neptune: ############
+            try:
+                t = log.find('Total')
+                m = log.find('mAP')
+                r = log.find('rank1')
+
+                if t > -1:
+                    self.exp.log_metric('batch loss', float(log[t + 7:t + 12]))
+                if m > -1:
+                    self.exp.log_metric('mAP', float(log[m + 5:m + 11]))
+                if r > -1:
+                    self.exp.log_metric('rank1', float(log[r + 7:r + 13]))
+            except Exception:
+                pass
+            ###################################
+
         if refresh:
             self.log_file.close()
             self.log_file = open(self.dir + '/log.txt', 'a')
 
+            # For Google Drive
+            if self.local_dir is not None:
+                copyfile(self.dir + '/log.txt', self.local_dir + '/log.txt')
+
     def done(self):
         self.log_file.close()
 
@@ -129,7 +142,7 @@ def plot_map_rank(self, epoch):
         plt.xlabel('Epochs')
         plt.ylabel('mAP/rank')
         plt.grid(True)
-        plt.savefig('{}/test_{}.jpg'.format(self.dir, self.args.data_test))
+        plt.savefig('{}/result_{}.pdf'.format(self.dir, self.args.data_test), dpi=600)
         plt.close(fig)
 
     def save_results(self, filename, save_list, scale):
@@ -239,14 +252,14 @@ def load_pretrained_weights(self, model, weight_path):
             state_dict = checkpoint
 
         model_dict = model.state_dict()
-        # print(model_dict.keys())
         new_state_dict = OrderedDict()
         matched_layers, discarded_layers = [], []
         for k, v in state_dict.items():
-            # print(k)
+
             if k.startswith('module.'):
                 k = 'model.' + k[7:]  # discard module.
-
+            if k.startswith('model.'):
+                k = k[6:]  # discard model.
             if k in model_dict and model_dict[k].size() == v.size():
 
                 new_state_dict[k] = v
@@ -298,7 +311,8 @@ def resume_from_checkpoint(self, fpath, model, optimizer=None, scheduler=None):
         self.write_log('[INFO] Loading checkpoint from "{}"'.format(fpath))
         checkpoint = self.load_checkpoint(fpath)
 
-        model.load_state_dict(checkpoint['state_dict'])
+        # load via load_pretrained_weights, which skips layers whose name or size does not match
+        self.load_pretrained_weights(model, fpath)
         self.write_log('[INFO] Model weights loaded')
         if optimizer is not None and 'optimizer' in checkpoint.keys():
             optimizer.load_state_dict(checkpoint['optimizer'])
diff --git a/utils/visualize_rank.py b/utils/visualize_rank.py
index d43d2a80f..2660b98a0 100644
--- a/utils/visualize_rank.py
+++ b/utils/visualize_rank.py
@@ -94,7 +94,7 @@ def _cp_img_to(src, dst, rank, prefix, matched=False):
                 osp.basename(src)
             )
             shutil.copy(src, dst)
-    num_q =200
+    # num_q =200
 
 
     for q_idx in range(num_q):