-
Notifications
You must be signed in to change notification settings - Fork 266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reconstructing scene from qpos and qvel #345
Comments
Hello! Yes,
|
Thanks for the fast response, @erikfrey.
This has not been the case for me unfortunately--at least in positional backend, using |
OK! It's our expectation that this would work, so please do share a repro and we'll take a look. |
Here's the minimal example I promised: https://colab.research.google.com/drive/1lAqGR4mpd4EX4eeXhrlR4CzQEoQncbZk?usp=sharing You will see that |
Can you provide info on exactly what data is reused?
|
Not to hijack Erwin's question--I further narrowed it down to Repro: https://colab.research.google.com/drive/1P3mtOCNTngZ6A_CJtjJdWzn4zX4TPcda?usp=sharing Seems that using anything but the default q and qd results in unreproducible resulst of |
@namheegordonkim I think you're onto something here. I agree the assert in your colab shouldn't fire. We'll take a closer look. @erwincoumans it varies by pipeline. In generalized, for example, we iteratively update the inverse mass matrix from the previous step instead of recalculating it from scratch. We can certainly improve the documentation to make this clearer. |
Great to hear that you'll be looking into this. I've wrestled with it for a few days and identified some sus lines. I hope these help! Lines 88 to 105 in c2cd14c
Here, the parent link and the child link are held together by the joint. Initially However, I did fool around with passing |
Oh whoops, you know, I lied to you. For spring and pbd pipelines, For spring and pbd, you would want to use either Sorry for leading you down a rabbit hole. Please let me know if some part of simulation still doesn't make sense for you. |
Well, FWIW the demo I shared is using generalized. According to your comment, if I disabled gravity, hung the character up in the air with absolutely no collision possibilities, |
Hi @namheegordonkim - no problem, let's get to the bottom of this. First off, just so I understand, where you say:
You mean this colab? As far as I can tell you're using the positional backend, right? Just to be sure I fired up your colab and switched it to 'generalized' and sure enough, the assert goes away:
That no longer throws after switching to generalized. |
I shared two colabs.
I was specifically referring to the first one. |
Oh got it, thanks for clearing that up. This is helping me sharpen how I talk about Brax. So the behavior in the generalized colab is expected (by me at least, haha!). We have not done any work to guarantee Brax is deterministic, that it will produce the same trajectory across diverse hardware (the typical case), or the exact same trajectory across states that were created via step vs. via init. I agree this would be a nice property for Brax and some engines go to through the trouble to make explicit claims here, so we'll add that to our TODO. One thing that I am certain of is that init vs step will produce slightly different mass matrices. If you would like to remove this difference, you can try setting matrix_inv_iterations to zero, as this will force brax to use the same matrix inverse operation for both step and init. This will slow down simulation though! For now I think your best bet may be to store the entire |
Also please do let me know if you have a repro for forward kinematics that shows it's broken! It's pretty well tested so I'd be surprised if it's incorrect. But I love surprises! |
I've mentioned this briefly in #157, but I believe this deserves a separate thread.
In certain applications like Go-Explore (https://www.nature.com/articles/s41586-020-03157-9), it's critical for the user to be able to reset the environment to the state corresponding to the gathered observation, i.e., qpos and qvel.
In my experiments with Brax v2, I haven't been able to successfully reconstruct the simulator state from qpos and qvel alone--I suspect the 2nd order information is also necessary. According to MuJoCo docs:
Ignoring
time
andudd_state
,act
needs to be taken into account. Not sure by what mechanism MuJoCo does this, but we can gather thatqpos
,qvel
, andact
should be necessary and sufficient for reconstructing the state.Current workaround for this is to actually store the states (i.e. in a batched instance of
State
class) gathered during exploration and to use that as the argument forstep()
. But this comes at a huge memory overhead.Any idea whether this will be made possible?
The text was updated successfully, but these errors were encountered: