Mujoco environments: qvel is different to computed velocity in the step method #1851

albertcthomas · 2020-03-26T15:18:31Z

When looking at the code of the step method of SwimmerEnv, I noticed that the velocity along the x axis used in the reward is computed from the positions:

    def step(self, a):
        ctrl_cost_coeff = 0.0001
        xposbefore = self.sim.data.qpos[0]
        self.do_simulation(a, self.frame_skip)
        xposafter = self.sim.data.qpos[0]
        reward_fwd = (xposafter - xposbefore) / self.dt

However, I would have expected the velocity to be taken directly from self.sim.data.qvel[0]. When I look at the values of self.sim.data.qvel[0], they are not equal to reward_fwd. Is there a good reason to compute the velocity from the positions instead of using self.sim.data.qvel[0]?

There is a similar question on the MuJoCo forum.


albertcthomas · 2020-03-30T09:53:41Z

I investigated this a bit more and I think the difference can be explained by two facts:

- According to the XML files, SwimmerEnv uses Runge-Kutta integration, not Euler integration (which is what HalfCheetahEnv uses, for instance).
- frame_skip is strictly larger than 1, so one step in gym is actually several steps in MuJoCo (this is the case for both SwimmerEnv and HalfCheetahEnv).
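The frame_skip effect can be illustrated without MuJoCo. The following is only a toy sketch, not the engine's code: it assumes a single 1-D body under constant acceleration, integrated with a semi-implicit Euler update, and the function name `simulate` and all numeric values are made up for illustration. Under those assumptions, the finite-difference velocity over one gym step equals the mean of the substep velocities, while the final velocity (the analogue of qvel[0]) is generally different.

```python
def simulate(x, v, accel, timestep, frame_skip):
    """Advance a 1-D point mass by `frame_skip` substeps (hypothetical helper).

    Uses a semi-implicit Euler update: velocity first, then position
    with the updated velocity.
    """
    velocities = []
    for _ in range(frame_skip):
        v = v + accel * timestep   # update velocity
        x = x + v * timestep       # update position with the new velocity
        velocities.append(v)
    return x, v, velocities


# Illustrative values only (assumptions, not read from any XML file).
timestep = 0.01
frame_skip = 4
dt = timestep * frame_skip

x_before, v_before = 0.0, 1.0
x_after, v_after, velocities = simulate(x_before, v_before, 2.0, timestep, frame_skip)

# Velocity as computed in the step method: displacement over the whole gym step.
finite_diff = (x_after - x_before) / dt
# Mean of the velocities seen at each substep.
mean_substep = sum(velocities) / len(velocities)

print("finite difference velocity:", finite_diff)
print("mean substep velocity:     ", mean_substep)
print("final velocity (qvel-like):", v_after)
```

With frame_skip set to 1 in this toy model, the two quantities coincide, which suggests that the gap comes from averaging over several substeps. A Runge-Kutta integrator would add a further difference that this sketch does not model.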

That being said, I am still not sure why the velocity part of the reward is computed from the positions and not from qvel.

TigerStone93 · 2020-11-04T08:49:29Z

I found that frame_skip affects dt:

    dt = timestep * frame_skip
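As a small worked example of this relation: the helper name `env_dt` and the numbers below are assumptions chosen for illustration, not values read from a specific environment. It also shows why the displacement in the reward has to be divided by dt rather than by the engine timestep.

```python
def env_dt(timestep, frame_skip):
    """Duration of one gym step: engine timestep times number of substeps."""
    return timestep * frame_skip


# Hypothetical values for illustration.
timestep = 0.01    # engine timestep in seconds
frame_skip = 4     # engine steps per gym step

dt = env_dt(timestep, frame_skip)

# A displacement measured across one gym step.
displacement = 0.042
velocity_with_dt = displacement / dt              # average velocity over the gym step
velocity_with_timestep = displacement / timestep  # too large by a factor of frame_skip

print("dt:", dt)
print("velocity using dt:", velocity_with_dt)
print("velocity using timestep:", velocity_with_timestep)
```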

jkterry1 · 2022-05-23T18:55:42Z

PR #2762 is about to be merged, introducing V4 MuJoCo environments using new bindings and a dramatically newer version of the engine. If this issue still persists with the V4 ones, please create a new issue for it.

christopherhesse added the mujoco label Apr 10, 2020

Rohan138 mentioned this issue Oct 24, 2021

Update on Plans for the MuJoCo, Robotics and Box2d Environments and the Status of Brax and Hardware Accelerated Environments in Gym #2456

Closed

jkterry1 closed this as completed May 23, 2022
