Torch VS TRT layer-wise debug ? #310

Open
dedoogong opened this issue May 3, 2020 · 8 comments

@dedoogong

dedoogong commented May 3, 2020

Hello!

After a ton of trial and error, I finally managed to convert my model to TRT and run it successfully.

Even though it was just a plain FP32 conversion, the speed increased 3-4x!

But the resulting accuracy was a bit worse than I expected.

I guess the cause is layer fusion.

I would like to debug layer by layer, but some of my layers are packed into an nn.Sequential().

network.mark_output("each layer's output tensor") would be a workaround, but I don't think I can apply it when I use nn.Sequential(), because I can't get each inner layer's output (e.g., Conv, BN, ReLU); it is all processed implicitly.
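
On the PyTorch side, I can at least capture each inner layer's output with forward hooks, roughly like this (a minimal sketch; the encoder below is just a stand-in for my real nn.Sequential):

import torch
import torch.nn as nn

# Stand-in for the real model's nn.Sequential block.
encoder = nn.Sequential(nn.Conv1d(3, 32, 1),
                        nn.BatchNorm1d(32),
                        nn.ReLU())

captured = {}

def make_hook(name):
    # Store a detached copy of each submodule's output under its name.
    def hook(module, inputs, output):
        captured[name] = output.detach().clone()
    return hook

# Register a forward hook on every submodule inside the nn.Sequential.
for name, module in encoder.named_children():
    module.register_forward_hook(make_hook(name))

encoder.eval()
x = torch.randn(1, 3, 10)
encoder(x)

for name, tensor in captured.items():
    print(name, tuple(tensor.shape))

But I don't see an equivalent way to pull out the corresponding tensors on the TRT side.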

Could you please give me any hint on how to debug all layers between Torch and TRT?

Thank you!

@ma-siddiqui

In my case, the results are much worse. I also struggled a lot to convert and run the TRT model successfully, and I am looking for the same solution.

@jaybdub
Contributor

jaybdub commented May 3, 2020

Hi All,

Thanks for reaching out!

Are you able to share the PyTorch model you're attempting to convert?

Best,
John

@dedoogong
Author

dedoogong commented May 4, 2020

Yes, I converted the officially released model from https://github.com/dedoogong/SuperGluePretrainedNetwork.
This project has two models, "superpoint" and "superglue"; my current target is the second one, "superglue".

You can just clone the repo and run it (weight files and demo images are all included in the repo):
pip3 install numpy opencv-python torch matplotlib
python3 ./demo_superglue.py --input assets/freiburg_sequence/ --output_dir dump_demo_sequence --resize 320 240 --max_keypoints 350

I intentionally added all intermediate layers' outputs to the return values so that TRT recognizes them as outputs and will not fuse those layers.

You can see the TRT-related code in superglue.py and matching.py.

When I compared the results of the TRT and Torch models right after the conversion step (before saving the converted model), the two results were exactly the same. But after loading the converted model and running it again with different inputs, the two inference results start to differ!

Only when I added all intermediate outputs to the model's final output did the results get much better, so I guess layer fusion causes the accuracy drop.
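
For reference, my compare/save/load workflow looks roughly like this (a simplified sketch; the small nn.Sequential below is just a stand-in for the real SuperGlue model):

import torch
import torch.nn as nn
from torch2trt import torch2trt, TRTModule

# Stand-in for the real SuperGlue model.
model = nn.Sequential(nn.Conv1d(3, 32, 1),
                      nn.BatchNorm1d(32),
                      nn.ReLU()).cuda().eval()
x = torch.randn(1, 3, 100).cuda()

# Right after conversion the outputs match.
model_trt = torch2trt(model, [x])
print(torch.max(torch.abs(model(x) - model_trt(x))))

# Save the converted model and load it back.
torch.save(model_trt.state_dict(), 'model_trt.pth')
model_trt_loaded = TRTModule()
model_trt_loaded.load_state_dict(torch.load('model_trt.pth'))

# Run again with a different input of the same shape;
# this is where my results start to drift.
x2 = torch.randn(1, 3, 100).cuda()
print(torch.max(torch.abs(model(x2) - model_trt_loaded(x2))))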

@dedoogong
Author

I hope to find an easier way to manually keep certain layers from being fused, e.g., via an optional torch2trt argument (such as each layer's unique id or name).

@jaybdub
Contributor

jaybdub commented May 5, 2020

Hi dedoogong,

Issue 1 - Incorrect results after loading model

Could you share the following information:

  1. Version of TensorRT
  2. Version of PyTorch
  3. Commit hash of torch2trt (or just branch name, probably "master")

Also, does your model use an interpolation / upsampling layer? I've refactored the serialization of the interpolate plugin in this branch if you want to give it a try.

#307

The refactoring is intended to resolve a different issue but would be worth checking.

Issue 2 - Handling layer not to be fused

Do you mind sharing a small example of what you mean by a layer "not to be fused"? Do you mean to skip adding the layer (an identity operation)?

@dedoogong
Author

dedoogong commented May 7, 2020

Hello!

Issue 1 - Incorrect results after loading model

  1. TRT 7
  2. PyTorch 1.3
  3. master, built with the interpolation plugin (gcc 4.8, PyTorch 1.3, protobuf 3.11.4)
  4. My model doesn't use interpolation at all.

Issue 2 - Handling layer not to be fused

small example:

Original model:

class KeypointEncoder(nn.Module):

    def __init__(self, feature_dim, layers):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv1d(3, 32, 1, bias=True),
                                     nn.BatchNorm1d(32),
                                     nn.ReLU(inplace=False),
                                     nn.Conv1d(32, 64, 1, bias=True),
                                     nn.BatchNorm1d(64),
                                     nn.ReLU(inplace=False))

    def forward(self, inputs):
        return self.encoder(inputs)

"not to be fused" version (for being marked as outputs by TRT)

class KeypointEncoder(nn.Module):

    def __init__(self, feature_dim, layers):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv1d(3, 32, 1, bias=True),
                                     nn.BatchNorm1d(32),
                                     nn.ReLU(inplace=False),
                                     nn.Conv1d(32, 64, 1, bias=True),
                                     nn.BatchNorm1d(64),
                                     nn.ReLU(inplace=False))

    def forward(self, inputs):
        conv1 = self.encoder[0](inputs)
        bn1   = self.encoder[1](conv1)
        relu1 = self.encoder[2](bn1)
        conv2 = self.encoder[3](relu1)
        bn2   = self.encoder[4](conv2)
        relu2 = self.encoder[5](bn2)
        return conv1, bn1, relu1, conv2, bn2, relu2
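
With this version I can compare each intermediate tensor one by one, roughly like this (a rough sketch; feature_dim/layers are unused here, and the input shape is just an example):

import torch
from torch2trt import torch2trt

# Assumes the "not to be fused" KeypointEncoder defined above.
encoder = KeypointEncoder(feature_dim=64, layers=None).cuda().eval()
x = torch.randn(1, 3, 350).cuda()

encoder_trt = torch2trt(encoder, [x])

outs_torch = encoder(x)
outs_trt = encoder_trt(x)

# Print the max absolute difference per layer.
names = ['conv1', 'bn1', 'relu1', 'conv2', 'bn2', 'relu2']
for name, a, b in zip(names, outs_torch, outs_trt):
    print(name, torch.max(torch.abs(a - b)).item())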

If you need anything else, please let me know! I will try hard to help!

@poornimajd

@dedoogong and @ma-siddiqui, have you been able to solve this issue?

@amk2777

amk2777 commented May 5, 2022

@dedoogong

I am trying to convert the SuperGlue model [https://github.com/dedoogong/SuperGluePretrainedNetwork] to TensorRT for subsequent execution on a TX2. I am still struggling and haven't been able to get a breakthrough.

It seems you had some success back in 2020. Could you kindly share your converted model? If not, would it be possible to share the scripts that you used for the conversion?

Regards
