
add transformer support? (matmul, layernorm) #216

Open
bigprince97 opened this issue Dec 26, 2019 · 8 comments

Comments

@bigprince97

No description provided.

@bigprince97
Author

Hello, this is a very convenient repo for converting PyTorch models to TensorRT, but it supports CV models well and NLP models poorly. I tried to convert a GPT-2 model to TensorRT, but two operations that transformers (the very popular NLP block underlying GPT-2 and BERT) need are not supported: matmul and layernorm. The C++ API supports them; would you consider supporting them in the Python API? I would very much like to deploy GPT-2 with TensorRT.
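For context, here is what the two missing ops compute, in plain NumPy (a sketch of the math only, not the TensorRT implementation; the function names below are illustrative):

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # LayerNorm as used in GPT-2/BERT blocks: normalize over the last
    # (hidden) axis, then apply a learned scale (gamma) and shift (beta).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def attention_scores(q, k):
    # The batched matmul transformers need: Q @ K^T inside self-attention.
    return q @ np.swapaxes(k, -1, -2)

x = np.random.randn(2, 4, 8)                  # (batch, seq, hidden)
y = layer_norm(x, np.ones(8), np.zeros(8))    # normalized: mean ~0, std ~1
scores = attention_scores(x, x)               # shape (2, 4, 4)
```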

@bigprince97 changed the title to add transformer support? (matmul, layernorm) Dec 26, 2019
@bigprince97
Author

I converted the GPT-2 model to TensorRT successfully using this repo plus some operations I wrote myself, and the speed increased three times. Thanks for this good repo.

@mowayao

mowayao commented Jan 19, 2020

> I converted the GPT-2 model to TensorRT successfully using this repo plus some operations I wrote myself, and the speed increased three times. Thanks for this good repo.

Can you share your implementation of matmul, please?

@czs1886

czs1886 commented Jun 10, 2020

> I converted the GPT-2 model to TensorRT successfully using this repo plus some operations I wrote myself, and the speed increased three times. Thanks for this good repo.
>
> Can you share your implementation of matmul, please?

Just use addMatrixMultiply(), which is defined here.
You need to create a new converter in the converters directory: copy mul.py and change its addElementwise to addMatrixMultiply.

@czs1886 czs1886 mentioned this issue Jun 10, 2020
@q248953144

> I converted the GPT-2 model to TensorRT successfully using this repo plus some operations I wrote myself, and the speed increased three times. Thanks for this good repo.

Can you share your approach?

@Fan9

Fan9 commented Apr 7, 2021

@bigprince97 can you share your code? Thanks!

@bigprince97
Author

Sorry, I'm only seeing the replies now. I wrote that code last year and it no longer runs because of updates to the repo. FasterTransformer supports GPT-2 acceleration and has a PyTorch interface; you can try that.

@francescotaioli

For anyone looking for the matmul converter, here's the code:

import tensorrt as trt
# These helpers ship with torch2trt itself; converters in the repo import
# them from torch2trt.torch2trt.
from torch2trt.torch2trt import (tensorrt_converter, add_missing_trt_tensors,
                                 broadcast_trt_tensors)

@tensorrt_converter('torch.Tensor.__matmul__')
def convert_matmul(ctx):
    input_a = ctx.method_args[0]
    input_b = ctx.method_args[1]
    output = ctx.method_return
    input_a_trt, input_b_trt = add_missing_trt_tensors(ctx.network, [input_a, input_b])
    input_a_trt, input_b_trt = broadcast_trt_tensors(ctx.network, [input_a_trt, input_b_trt], len(output.shape) - 1)

    # A matrix product rather than the elementwise PROD used in mul.py:
    layer = ctx.network.add_matrix_multiply(input_a_trt, trt.MatrixOperation.NONE,
                                            input_b_trt, trt.MatrixOperation.NONE)
    output._trt = layer.get_output(0)

Documentation for the Python API: here and here


6 participants