
Clarification on evaluation metric #18

Open
gunshi opened this issue Nov 6, 2023 · 1 comment


gunshi commented Nov 6, 2023

Hi,
Thanks for open-sourcing this framework! I'm trying to reproduce the baseline results reported in the RoboHive paper, and I wanted to ask which exact metric is averaged over 3 seeds in the Franka expert-data runs (here: https://github.com/facebookresearch/agenthive/tree/dev/scripts).
Is it the maximum success rate over a run, averaged over 3 seeds; the maximum of the success rate averaged over 3 seeds; or something else?
The paper doesn't seem to specify how the success rate of a run is determined across its many checkpoints.
Thanks!

ShahRutav (Contributor) commented:

We report the success rate averaged over three seeds × three camera angles (except for the Robel suite, where we use all camera angles), measured at the last checkpoint of each run.
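
For concreteness, a minimal sketch of that aggregation (the array shapes, values, and names here are illustrative assumptions, not the repository's actual evaluation code):

```python
import numpy as np

# Hypothetical per-run results: success_rate[seed, camera] holds the
# success rate of the *last* checkpoint for that (seed, camera) run.
# Values and shapes are made up for illustration.
success_rate = np.array([
    [0.80, 0.75, 0.78],  # seed 0, camera angles 0-2
    [0.82, 0.70, 0.76],  # seed 1
    [0.79, 0.74, 0.81],  # seed 2
])

# Reported metric: mean over all 3 seeds x 3 camera angles,
# using only the final checkpoint of each run (no max over checkpoints).
reported = success_rate.mean()
print(f"reported success rate: {reported:.3f}")
```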
