
ValueError: __len__() should return >= 0 #521

Open
whcjb opened this issue Mar 3, 2021 · 20 comments
whcjb commented Mar 3, 2021

When using torch2trt to convert torch.eq, an error occurs:
mm = torch.eq(mm, 0.)
where mm is a tensor and mm.shape = [3136, 1, 3, 3].

File "/media/cfs/torch2trt-master/examples/inpainting/model.py", line 329, in forward
mm = torch.eq(mm, nn)
File "./torch2trt/torch2trt.py", line 285, in wrapper
converter"converter"
File "./torch2trt/converters/compare.py", line 26, in convert_gt
return convert_elementwise(ctx, trt.ElementWiseOperation.EQUAL)
File "./torch2trt/converters/compare.py", line 9, in convert_elementwise
input_a_trt, input_b_trt = broadcast_trt_tensors(ctx.network, [input_a_trt, input_b_trt], len(output.shape) - 1)
File "./torch2trt/torch2trt.py", line 170, in broadcast_trt_tensors
if len(t.shape) < broadcast_ndim:
ValueError: __len__() should return >= 0
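
A minimal sketch of the failing setup (module name is illustrative; assumes a CUDA device and a working torch2trt install):

```python
import torch
from torch2trt import torch2trt

class EqScalar(torch.nn.Module):
    def forward(self, mm):
        # comparing a tensor against a Python scalar triggers the error
        return torch.eq(mm, 0.)

model = EqScalar().cuda().eval()
mm = torch.randn(3136, 1, 3, 3).cuda()
model_trt = torch2trt(model, [mm])  # raises the ValueError above
```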


whcjb commented Mar 3, 2021

Can someone help?


jaybdub commented Mar 3, 2021

Hi @whcjb,

Thanks for reaching out!

I would guess this is because it is comparing against a scalar. We may need to update the converter to handle this case.

I will try to look into this soon.

Best,
John


pepinu commented Mar 20, 2021

Hey @jaybdub,

I stumbled across pretty much the same error as the OP, and I can verify that it comes from t.shape being a scalar. In my case, the shape was (32744).

I've tried adding a simple check to the if statement here (torch2trt.py#L174) with the condition not hasattr(t, '__len__'); however, I still cannot get shape = tuple([1] * diff + list(t.shape)) to work, and I get the same error as the OP.
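
For reference, here's a simplified paraphrase of the broadcast path plus the kind of guard I've been experimenting with (the exact upstream code differs; treating the misbehaving Dims as rank-1, and indexing around __len__, are assumptions on my part):

```python
# Simplified paraphrase of torch2trt's broadcast_trt_tensors, not the actual
# upstream code. The except branches sketch the guard under discussion.
def broadcast_trt_tensors(network, trt_tensors, broadcast_ndim):
    broadcasted = []
    for t in trt_tensors:
        try:
            ndim = len(t.shape)
        except ValueError:
            # trt.Dims reporting a negative __len__; assume rank 1
            ndim = 1
        if ndim < broadcast_ndim:
            diff = broadcast_ndim - ndim
            try:
                dims = list(t.shape)  # list() also consults __len__
            except ValueError:
                dims = [t.shape[i] for i in range(ndim)]  # index instead
            # prepend 1s and reshape via a shuffle layer
            layer = network.add_shuffle(t)
            layer.reshape_dims = tuple([1] * diff + dims)
            t = layer.get_output(0)
        broadcasted.append(t)
    return broadcasted
```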

How should I go about it? I can make a PR after I get it to work.


jaybdub commented Mar 20, 2021

Hi @pepinu,

Hmm, do you mind sharing the error you see at shape = tuple([1] * diff + list(t.shape))?

Also, thanks for your interest in addressing this!

It's difficult to tell exactly where the change should be applied without reproducing myself, but one other area of interest for this issue may be here

trt_tensor = network.add_constant(shape, scalar).get_output(0)

This is where constant tensors are added to the TensorRT network for primitive types. Let me know if you discover anything here, or if there's anything you'd like me to investigate.
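
As a rough sketch only (scalar_to_trt is a hypothetical helper, not existing torch2trt API), one direction would be to materialize the scalar with an explicit rank so that downstream len(t.shape) checks behave:

```python
import numpy as np
import tensorrt as trt

def scalar_to_trt(network, value, ndim):
    # Hypothetical: give the scalar an all-ones shape of the target rank so
    # broadcast_trt_tensors sees a well-defined, non-negative length.
    shape = (1,) * ndim
    weights = trt.Weights(np.full(shape, value, dtype=np.float32))
    return network.add_constant(shape, weights).get_output(0)
```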

As general contributing guidelines, before integrating any solution we'll have to see if there are adverse side effects that might affect other models. One way to do this is to add module test cases that address this failure, and ensure that the existing test cases still run.

Many of the converter files have examples of module test cases.

def test_gt_basic():
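
A test for this issue could follow the same pattern; here is a sketch (the module, test name, and shapes are illustrative, not existing tests):

```python
import torch
from torch2trt.module_test import add_module_test

class EqScalar(torch.nn.Module):
    def forward(self, x):
        # reproduces the reported pattern: tensor compared against a scalar
        return torch.eq(x, 0.)

@add_module_test(torch.float32, torch.device('cuda'), [(1, 3, 3, 3)])
def test_eq_scalar():
    return EqScalar()
```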

The test cases may be run by calling

python3 -m torch2trt.test --name=converters --tolerance=1e-2

This test script was created for torch2trt and performs cross-validation of the outputs against PyTorch. It will simply highlight high errors in yellow rather than hard-failing, and it might not cover all use cases. If the change requires a special type of test, let me know.

Please let me know if this helps / you have any questions or if there is any way I can help.

Best,
John

pepinu commented Mar 20, 2021

Hey @jaybdub,

Thanks for the pointers, I'll take a look at this over the weekend.

Here is the earlier-mentioned error in more depth:

  1. I've split shape = tuple([1] * diff + list(t.shape)) into 3 lines, as seen below:

[screenshot: the assignment split across three lines]

The error here is the same as the OP's, and happens when t.shape is put into the list:
[screenshot: traceback showing the ValueError raised by list(t.shape)]

  2. I've tried to get it to work with this (L177):

[screenshot: attempted workaround]

but the error is thrown a few lines later:
[screenshot: the resulting traceback]

I suspect it would just have to be unpacked within the shape reported in the error?
Hope this clarifies things a bit.

pepinu commented Mar 21, 2021

@jaybdub

Alright, so I did some testing. I think I've identified what the problem might be, but I'm not sure how to proceed.

Basically, the problem in my case is not that t is a scalar; it's that t.shape is a scalar. I've edited the last image in my earlier post because I had the wrong condition (if not hasattr(t, '__len__') will not catch this).

The Problem

if len(t.shape) < broadcast_ndim:
In the issue scenario, t.shape, which is of type trt.Dims, has dimension 1 and looks like (32548). It has the __len__ method, but invoking it throws the error. I tried to write a workaround with a lambda, but __len__ is read-only in this case, so no luck there.

However, even if all len() calls are rewritten and the length is set arbitrarily to 1, the problem still persists here:

shape = tuple([1] * diff + list(t.shape))

list() calls __len__() internally, which crashes the conversion.
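
A minimal, self-contained illustration of that failure mode (FakeDims is a hypothetical stand-in for the misbehaving trt.Dims object):

```python
class FakeDims:
    def __len__(self):
        return -1            # what TensorRT appears to report here
    def __getitem__(self, i):
        return 32548         # the single dimension from my traceback

dims = FakeDims()
try:
    list(dims)               # list() consults __len__ for a size hint
except ValueError as e:
    print(e)                 # __len__() should return >= 0
```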

I've tried using brackets to put the t.shape object into a list, but the results are not the same:
[screenshot: comparison of the bracketed list against the traceback representation]

I couldn't find a way to reproduce the same representation of the trt.Dims as in the traceback: list() makes it [32548], while tuple() makes it (32548,). I will look into finding a way to extract the t.shape value as represented when printed; maybe then I can somehow convert it inside.

I wonder if you have any pointers on where I could look; maybe 1-dim tensor conversion is buggy?

Also, I'll try to put together a minimal reproducible example for this.

Best regards

pepinu commented Mar 22, 2021

@jaybdub @whcjb

I found out the problem is that TRT is not able to process the Python slice() operator in the same fashion torch does.

The network I was trying to port crashed on a torch.add() operation between two tensors, while converting a minimal torch.add op worked like a charm.

My model was cutting spatial dimensions using the Python slice() operation instead of torch.narrow, which is recommended for tensors.

To check that this is the culprit, I wrote and tested two versions of a network that narrows dims and adds them together (sketched below):
[screenshots: conversion output for the torch.narrow and slice versions]
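
In rough outline (shapes and module names are illustrative; the gist below has the real code), the two versions looked like this:

```python
import torch
from torch2trt import torch2trt

class NarrowAdd(torch.nn.Module):
    def forward(self, x, y):
        # torch.narrow version: converted fine in my test
        x = torch.narrow(x, 2, 0, x.shape[2] - 1)
        y = torch.narrow(y, 2, 0, y.shape[2] - 1)
        return x + y

class SliceAdd(torch.nn.Module):
    def forward(self, x, y):
        # Python slicing version: this one hit the __len__ error
        return x[:, :, :-1, :] + y[:, :, :-1, :]

x = torch.randn(1, 3, 8, 8).cuda()
y = torch.randn(1, 3, 8, 8).cuda()
torch2trt(NarrowAdd().cuda().eval(), [x, y])  # works
torch2trt(SliceAdd().cuda().eval(), [x, y])   # fails as described
```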

I think the screenshots are self-explanatory; here's a gist to reproduce this.

I'm not sure where to go from here; there should probably be some type check for slice within the lib. Hope it helps.

Best Regards

EDIT:

I looked at the last screenshot and see that the tensors don't match between the TRT and normal models, which is weird? I was sure that they did while writing this...

Leerw commented Jun 3, 2021


For my case, in

t._trt = network.add_constant(shape, weight).get_output(0)

shape=(576, 960) and weight.shape=(1, 1, 576, 960).

After running this line, I printed t._trt and got:

[TensorRT] ERROR: [SHUFFLE #2] torch.Tensor.view(tensor(shape=[576], dtype=torch.float32), -1, 1): volume mismatch. Input dimensions [576] have volume 576 and output dimensions [1] have volume 1.
ValueError: __len__() should return >= 0
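
A quick illustration of the mismatch (my reading of the shapes above is an assumption): the volumes agree but the ranks don't, so flattening the weight to the declared shape first would keep them consistent:

```python
import numpy as np

weight = np.zeros((1, 1, 576, 960), dtype=np.float32)
shape = (576, 960)
assert weight.size == np.prod(shape)  # volumes agree: 552960
weight = weight.reshape(shape)        # now the ranks agree as well
```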

@DuyguSerbes

Guys, do you have a final solution to this issue?


pwais commented Jan 16, 2022

+1. I am seeing a case where something (perhaps a scalar) has a len of -1 according to TensorRT.

I also seem to run into similar errors if a tensor (or an argument to forward) is None (such arguments should probably just be pruned from the TRT conversion?).

@InfiniteLife

Same problem

@RaiAmanRai

Hey @jaybdub, can you give some input on how long it will take before this is fixed?


Tegala commented Jul 25, 2022

I'm hitting the same problem too; hoping for a solution @jaybdub 0.0

@kct22aws

Any ETA on this problem? Without a fix, torch2trt won't work for many of the models I've tried: Hugging Face Vision Transformer, Swin Transformer, ViViT, etc.


iariav commented Mar 27, 2023

@kct22aws
+1 on that question


dcming commented Jun 6, 2023

+1 on that question

@Emilon1928

+1 on that question

@shuyangsun

+1 on that question

@StanleyPain

2024 and no fix? Anyone get any traction on this?

@dragos98gl

+1 on that question
