
Clarification on evaluation metric #18

Open
gunshi opened this issue Nov 6, 2023 · 1 comment


gunshi commented Nov 6, 2023

Hi,
Thanks for open-sourcing this framework! I'm trying to reproduce the baseline results reported in the RoboHive paper, and I wanted to ask which exact metric is averaged over 3 seeds in the Franka expert-data runs (here: https://github.com/facebookresearch/agenthive/tree/dev/scripts).
Is it the maximum success rate over a run, averaged over 3 seeds; the maximum of the success rate averaged over 3 seeds; or something else?
The paper doesn't seem to specify how the success rate of a run is determined across its many checkpoints.
Thanks!

ShahRutav (Contributor) commented:

We report the success rate averaged over three seeds × three camera angles (except for the Robel suite, where we use all camera angles), measured at the last checkpoint of each run.
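
For concreteness, a minimal sketch of that aggregation (the array shapes, values, and names here are illustrative assumptions, not the repository's actual evaluation code):

```python
import numpy as np

# Hypothetical per-run results: success_rate[seed, camera] holds the
# success rate of the *last* checkpoint for that (seed, camera) run.
# Values and shapes are made up for illustration.
success_rate = np.array([
    [0.80, 0.75, 0.78],  # seed 0, camera angles 0-2
    [0.82, 0.70, 0.76],  # seed 1
    [0.79, 0.74, 0.81],  # seed 2
])

# Reported metric: mean over all 3 seeds x 3 camera angles,
# using only the final checkpoint of each run (no max over checkpoints).
reported = success_rate.mean()
print(f"reported success rate: {reported:.3f}")
```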
