
Future Hand Prediction: Is the mask multiplied by the prediction we submit? #19

Open · masashi-hatano opened this issue Sep 8, 2022 · 10 comments


masashi-hatano commented Sep 8, 2022

I tried submitting a JSON file that follows the specified format, and I obtained the following quantitative results.

{"L_MDisp": 211.9670281732144, "R_MDisp": 276.70152097706693, "L_CDisp": 207.87675657595398, "R_CDisp": 271.26115030262525, "Total": 967.8064560288606}

However, even though the results we measured on the validation dataset were better than the baseline, the results obtained from the actual submission show very large errors.
This is probably because the mask is not multiplied with the predictions we submit. The mask should set the error to zero on frames in which the hand is not visible.
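For reference, here is a minimal sketch of the masked error I had in mind (hypothetical array values, not the actual evaluation code):

    # Minimal sketch of the masked error I had in mind (hypothetical values,
    # not the actual evaluation code).
    import numpy as np

    pred = np.array([[120.7, 84.7], [118.0, 86.1], [125.5, 88.1]])  # predicted (x, y) per frame
    gt   = np.array([[118.2, 83.9], [  0.0,  0.0], [124.8, 87.5]])  # ground-truth (x, y) per frame
    mask = np.array([1.0, 0.0, 1.0])  # 0 where the hand is not visible in the frame

    disp = np.linalg.norm(pred - gt, axis=1)  # per-frame Euclidean displacement
    masked_disp = disp * mask                 # frames without a visible hand contribute zero error
    print(masked_disp.sum())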

To demonstrate that the quantitative results above are anomalous, here is a prediction list taken from my submission.json file, along with its visualization.

As you can see from these figures, the quantitative results obtained from the actual submission seem to be incorrect, and the likely reason is that the loss is calculated without multiplying the predictions by the masks.

@VJWQ
Could you please confirm that the loss calculation is done correctly? In particular, I would appreciate it if you could check whether the error is set to zero for frames in which the hands are not visible.

"2152_3837": [120.74557495117188, 84.73670959472656, 235.1125030517578, 93.4263687133789, 118.04257202148438, 86.06185150146484, 230.081787109375, 91.89846801757812, 125.53624725341797, 88.14488220214844, 230.46359252929688, 94.43958282470703, 122.34292602539062, 88.79545593261719, 225.5665740966797, 91.5564193725586, 122.0747299194336, 94.3060531616211, 217.82423400878906, 99.28343963623047]

[Figure 1, frame 003838: pre_45 frame]
[Figure 2, frame 003853: pre_30 frame]
[Figure 3, frame 003868: pre_15 frame]
[Figure 4, frame 003883: pre_frame]
[Figure 5, frame 003898: contact_frame]


VJWQ commented Sep 8, 2022

Hi @masashi-hatano, happy to have you as our participant!
Things run as expected on my side. For your information, our baseline also gives a 1×20 non-zero prediction for sample "2152_3837", which means it is fine to have non-zero predictions on frames without hands. As we mentioned on the challenge page, "Our evaluation script won't penalize your algorithm if it gives predictions on frames without hands."
I suggest you revisit our sample evaluation code to understand how our metrics work. Specifically, you'll see how we filter out the out-of-frame hand case at L80 so that it won't influence your submission score. Please also make sure to use our script generate_submission.py to generate the submission file, and don't forget the num_clips=30 argument.
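Roughly, the idea behind that filtering looks like this (a simplified sketch with made-up data, not the verbatim evaluation code):

    # Simplified sketch of the filtering idea (made-up data, not the verbatim
    # evaluation code): frames whose ground truth marks the hand as out of
    # frame are skipped, so predictions there neither help nor hurt the score.
    def mean_displacement(preds, gts, gt_visible):
        """preds, gts: lists of (x, y); gt_visible: 0/1 flags from the annotations."""
        errors = [
            ((px - gx) ** 2 + (py - gy) ** 2) ** 0.5
            for (px, py), (gx, gy), v in zip(preds, gts, gt_visible)
            if v  # out-of-frame hands are filtered out here
        ]
        return sum(errors) / len(errors) if errors else 0.0

    print(mean_displacement([(120.7, 84.7), (118.0, 86.1)], [(119.0, 85.0), (0.0, 0.0)], [1, 0]))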
Feel free to ask if you are still blocked! Happy to help :)


masashi-hatano commented Sep 9, 2022

@VJWQ
Thanks for your reply!
I solved this problem by using num_clips=30, and the evaluation was done correctly.

However, I don't really understand why num_clips is needed. According to the sample evaluation code, num_clips is used just for dividing the predicted values. I would appreciate it if you could explain it.


VJWQ commented Sep 9, 2022


Sure, here is the explanation: the number 30 comes from the line cfg.TEST.NUM_ENSEMBLE_VIEWS * cfg.TEST.NUM_SPATIAL_CROPS, which is an operation for better testing the robustness of the model over multiple views of each clip. In short, we need to divide by 30 when generating the submission file to obtain the average performance on each test clip.
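As a rough illustration with hypothetical numbers (assuming, for example, 10 temporal views × 3 spatial crops):

    # Rough illustration of multi-view testing (hypothetical values; assuming
    # NUM_ENSEMBLE_VIEWS = 10 temporal views and NUM_SPATIAL_CROPS = 3 crops).
    import numpy as np

    num_clips = 10 * 3  # cfg.TEST.NUM_ENSEMBLE_VIEWS * cfg.TEST.NUM_SPATIAL_CROPS

    rng = np.random.default_rng(0)
    per_view_preds = rng.normal(120.0, 2.0, size=(num_clips, 20))  # 30 views, 20 values each

    accumulated = per_view_preds.sum(axis=0)  # predictions accumulated across the 30 views
    clip_pred = accumulated / num_clips       # the /30 recovers the average 1x20 prediction
    print(clip_pred.shape)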


takfate commented Sep 11, 2022

@VJWQ
In generate_submission.py, num_clips does not seem to be used.


takfate commented Sep 12, 2022

@masashi-hatano
@VJWQ
Hello, I also have some questions.
I want to know what the validation loss is when training the baseline code.
I appended the following code

    for key in pred_dict:
        pred_dict[key] = pred_dict[key] / num_clips

after the multi-view accumulation.
But I also get evaluation results similar to those in your first comment.
I suspect there may be a problem with my data sampling.


masashi-hatano commented Sep 12, 2022

@takfate
Hello, this will probably help you.
If you evaluate your model using both generate_submission.py and eval.py, the predicted values will be divided by num_clips twice, so removing the division from either of them may solve your problem.
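As a quick illustration with made-up numbers:

    # Made-up numbers: dividing by num_clips in both scripts shrinks every
    # coordinate by an extra factor of 30.
    num_clips = 30
    accumulated_x = 30 * 120.0                     # sum of 30 per-view predictions of x = 120
    once = accumulated_x / num_clips               # 120.0 -> the intended average
    twice = accumulated_x / num_clips / num_clips  # 4.0   -> far from any real pixel position
    print(once, twice)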


takfate commented Sep 12, 2022

@masashi-hatano
I use generate_submission.py to generate a submission file for the test set and submit it to the EvalAI evaluation system.
Will the EvalAI evaluation system do another division by 30?


VJWQ commented Sep 12, 2022


Hi @takfate, your results will not be divided twice. In generate_submission.py, num_clips is just a placeholder and does not actually divide your results. That script sums all 30 prediction results for each clip, and the actual division happens in our evaluation script after you submit your results JSON file, where the /30 produces the average result for each clip. However, you still need to run python tools/generate_submission.py /path/to/output.pkl 30 to generate the submission file correctly.
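If it helps, here is a quick way to sanity-check a generated submission file (hypothetical path, not an official tool):

    # Quick sanity check on a generated submission file (hypothetical path, not
    # an official tool): every clip key should hold 20 values, and after the
    # server-side /30 described above the numbers should be on the scale of
    # pixel coordinates.
    import json

    with open("submission.json") as f:  # example path
        preds = json.load(f)

    for key, values in list(preds.items())[:3]:
        assert len(values) == 20, f"{key}: expected 20 values, got {len(values)}"
        print(key, [round(v / 30, 2) for v in values[:4]])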
@masashi-hatano @takfate
Do you mind posting the commands you use to generate the submission file? I can have a look at them to see why you are receiving similar results and adjust our guidance accordingly.

masashi-hatano commented

If so, it's fine on my side. Thanks anyway!


takfate commented Sep 13, 2022

@VJWQ @masashi-hatano
Our evaluation results are now normal. Thank you for your help.
