Mujoco environments: qvel is different to computed velocity in the step method #1851

Closed
albertcthomas opened this issue Mar 26, 2020 · 3 comments

@albertcthomas

albertcthomas commented Mar 26, 2020

When looking at the code of the step method of the SwimmerEnv I noticed that the velocity along the x axis used in the reward is computed from the positions:

    def step(self, a):
        ctrl_cost_coeff = 0.0001
        xposbefore = self.sim.data.qpos[0]
        self.do_simulation(a, self.frame_skip)
        xposafter = self.sim.data.qpos[0]
        reward_fwd = (xposafter - xposbefore) / self.dt

However, I would have expected the velocity to be taken directly from self.sim.data.qvel[0]. When I look at the values of self.sim.data.qvel[0], they are not equal to reward_fwd. Is there a good reason to compute the velocity from the positions rather than use self.sim.data.qvel[0]?

There is a similar question on the MuJoCo forum.
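
One way to reproduce the comparison is to wrap a single step and read both quantities. The sketch below is only an illustration: `compare_velocities` is a hypothetical helper, and `StubEnv` is a made-up stand-in that exposes just the attributes used in the snippet above (`sim.data.qpos`, `sim.data.qvel`, `dt`, `step`), so it runs without MuJoCo. Its dynamics are invented and say nothing about the real SwimmerEnv values.

```python
from types import SimpleNamespace


def compare_velocities(env, action):
    """Return (finite-difference x velocity, qvel[0]) for one env.step call."""
    x_before = env.sim.data.qpos[0]
    env.step(action)
    x_after = env.sim.data.qpos[0]
    finite_diff = (x_after - x_before) / env.dt
    return finite_diff, env.sim.data.qvel[0]


class StubEnv:
    """Hypothetical stand-in exposing only the attributes read above."""

    def __init__(self):
        self.dt = 0.04  # illustrative value
        self.sim = SimpleNamespace(
            data=SimpleNamespace(qpos=[0.0], qvel=[0.0])
        )

    def step(self, action):
        # Made-up dynamics: the final velocity equals the action, while the
        # position only advances by half of action * dt.
        self.sim.data.qvel[0] = action
        self.sim.data.qpos[0] += 0.5 * action * self.dt


finite_diff, qvel = compare_velocities(StubEnv(), 2.0)
print(finite_diff, qvel)
```

With an actual environment, the same helper could presumably be called on the unwrapped env, as long as it exposes the same attributes.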

@albertcthomas
Author

albertcthomas commented Mar 30, 2020

I investigated this a bit more and I think the difference can be explained by the fact that

  1. According to the xml files, SwimmerEnv uses Runge-Kutta integration rather than Euler integration, which is what HalfCheetahEnv uses, for instance,
  2. frame_skip is strictly larger than 1, so one step in gym is actually several steps in MuJoCo (this is the case for both SwimmerEnv and HalfCheetahEnv).

That being said, I am still not sure why the velocity term of the reward is computed from the positions and not from qvel.
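
The effect of frame_skip in point 2 can be seen on a toy 1D point mass. This is a sketch under stated assumptions, not MuJoCo: the `simulate` function, the drag force, and all numeric values are hypothetical, and the integrator is a plain semi-implicit Euler update (velocity first, then position with the new velocity). Under that update, the finite-difference velocity over one gym-style step equals the mean of the substep velocities, so it matches the final velocity only when frame_skip is 1.

```python
def simulate(frame_skip, timestep=0.01, x=0.0, v=0.0):
    """One gym-style step of a toy 1D point mass (not MuJoCo)."""
    dt = timestep * frame_skip
    x_before = x
    substep_velocities = []
    for _ in range(frame_skip):
        a = 1.0 - 0.5 * v      # hypothetical: constant push with linear drag
        v = v + timestep * a   # semi-implicit Euler: update velocity first
        x = x + timestep * v   # then advance position with the new velocity
        substep_velocities.append(v)
    finite_diff = (x - x_before) / dt
    return finite_diff, v, substep_velocities


for frame_skip in (1, 4):
    finite_diff, final_v, vels = simulate(frame_skip)
    mean_v = sum(vels) / len(vels)
    print(frame_skip, finite_diff, final_v, mean_v)
```

A higher-order integrator such as Runge-Kutta (point 1) does not advance the position with the final velocity alone, so the two quantities would not be expected to agree even for a single substep.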

@TigerStone93

TigerStone93 commented Nov 4, 2020

I found that frame_skip affects dt:
dt = timestep * frame_skip
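
A minimal sketch of that relation and how it enters the forward reward from the first comment. `ToyMujocoEnv` and `forward_reward` are hypothetical names that mirror only the dt formula above, not gym's actual class, and the numbers are illustrative.

```python
class ToyMujocoEnv:
    """Hypothetical stand-in that mirrors only dt = timestep * frame_skip."""

    def __init__(self, timestep, frame_skip):
        self.timestep = timestep      # duration of one simulator substep
        self.frame_skip = frame_skip  # substeps per gym-style step

    @property
    def dt(self):
        return self.timestep * self.frame_skip


def forward_reward(x_before, x_after, env):
    """Finite-difference x velocity over one gym-style step."""
    return (x_after - x_before) / env.dt


env = ToyMujocoEnv(timestep=0.01, frame_skip=4)  # illustrative values
print(env.dt)
print(forward_reward(0.0, 0.02, env))
```

Because dt spans all frame_skip substeps, the reward is an average velocity over the whole step rather than the velocity at its end.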

@jkterry1
Collaborator

PR #2762 is about to be merged, introducing V4 MuJoCo environments using new bindings and a dramatically newer version of the engine. If this issue still persists with the V4 ones, please create a new issue for it.
