Locomotion¶

Rewards mainly relevant for locomotion tasks on floating-base robots.

class gym_jiminy.common.compositions.locomotion.TrackingBaseHeightReward(env, cutoff, shape=KernelShape.SQUARED_EXPONENTIAL)[source]¶

Bases: TrackingQuantityReward

Reward the agent for tracking the height of the floating base of the robot relative to lowest contact point wrt some reference trajectory.

See also

See TrackingQuantityReward documentation for technical details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
cutoff (float) – Cutoff threshold for the RBF kernel transform.
shape (KernelShape) – Desired type of RBF kernel. Optional: KernelShape.SQUARED_EXPONENTIAL by default.

class gym_jiminy.common.compositions.locomotion.TrackingBaseOdometryVelocityReward(env, cutoff, shape=KernelShape.SQUARED_EXPONENTIAL)[source]¶

Bases: TrackingQuantityReward

Reward the agent for tracking the odometry velocity wrt some reference trajectory.

See also

See TrackingQuantityReward documentation for technical details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
cutoff (float) – Cutoff threshold for the RBF kernel transform.
shape (KernelShape) – Desired type of RBF kernel. Optional: KernelShape.SQUARED_EXPONENTIAL by default.

gym_jiminy.common.compositions.locomotion.l2_norm(vec)[source]¶

Compute the L2-norm of a vector.

Parameters:

array – Input array.
vec (ndarray)

Return type:

ndarray

class gym_jiminy.common.compositions.locomotion.DriftTrackingBaseOdometryPoseReward(env, cutoff, shape=KernelShape.SQUARED_EXPONENTIAL, *, horizon)[source]¶

Bases: TrackingQuantityReward

Reward the agent for tracking the drift of the odometry pose over a horizon wrt some reference trajectory.

See also

See DeltaBaseOdometryPosition, DeltaBaseOdometryOrientation and TrackingQuantityReward documentations for technical details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
cutoff (float) – Cutoff threshold for the RBF kernel transform.
shape (KernelShape) – Desired type of RBF kernel. Optional: KernelShape.SQUARED_EXPONENTIAL by default.
horizon (float) – Horizon over which values of the quantity will be stacked before computing the drift.

class gym_jiminy.common.compositions.locomotion.TrackingCapturePointReward(env, cutoff, shape=KernelShape.SQUARED_EXPONENTIAL)[source]¶

Bases: TrackingQuantityReward

Reward the agent for tracking the capture point wrt some reference trajectory.

See also

See TrackingQuantityReward documentation for technical details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
cutoff (float) – Cutoff threshold for the RBF kernel transform.
shape (KernelShape) – Desired type of RBF kernel. Optional: KernelShape.SQUARED_EXPONENTIAL by default.

class gym_jiminy.common.compositions.locomotion.TrackingFootPositionsReward(env, cutoff, shape=KernelShape.SQUARED_EXPONENTIAL, *, frame_names='auto')[source]¶

Bases: TrackingQuantityReward

Reward the agent for tracking the relative position of the feet wrt each other.

See also

See TrackingQuantityReward documentation for technical details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
cutoff (float) – Cutoff threshold for the RBF kernel transform.
shape (KernelShape) – Desired type of RBF kernel. Optional: KernelShape.SQUARED_EXPONENTIAL by default.
frame_names (Sequence[str] | Literal['auto']) – Name of the frames corresponding to the feet of the robot. ‘auto’ to automatically detect them from the set of contact and force sensors of the robot. Optional: ‘auto’ by default.

class gym_jiminy.common.compositions.locomotion.TrackingFootOrientationsReward(env, cutoff, shape=KernelShape.SQUARED_EXPONENTIAL, *, frame_names='auto')[source]¶

Bases: TrackingQuantityReward

Reward the agent for tracking the relative orientation of the feet wrt each other.

The error corresponds to the sum of squared total angles of the differences between the true and reference relative orientations for each foot.

See also

See TrackingQuantityReward documentation for technical details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
cutoff (float) – Cutoff threshold for the RBF kernel transform.
shape (KernelShape) – Desired type of RBF kernel. Optional: KernelShape.SQUARED_EXPONENTIAL by default.
frame_names (Sequence[str] | Literal['auto']) – Name of the frames corresponding to the feet of the robot. ‘auto’ to automatically detect them from the set of contact and force sensors of the robot. Optional: ‘auto’ by default.

class gym_jiminy.common.compositions.locomotion.TrackingFootForceDistributionReward(env, cutoff, shape=KernelShape.SQUARED_EXPONENTIAL, *, frame_names='auto')[source]¶

Bases: TrackingQuantityReward

Reward the agent for tracking the relative vertical force in world frame applied on each foot.

Note

The force is normalized by the weight of the robot rather than the total force applied on all feet. This is important as it not only takes into account the force distribution between the feet, but also the overall ground contact interact force. This way, building up momentum before jumping will be distinguished for standing still. Moreover, it ensures that the reward is always properly defined, even if the robot has no contact with the ground at all, which typically arises during the flying phase of running.

See also

See TrackingQuantityReward documentation for technical details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
cutoff (float) – Cutoff threshold for the RBF kernel transform.
shape (KernelShape) – Desired type of RBF kernel. Optional: KernelShape.SQUARED_EXPONENTIAL by default.
frame_names (Sequence[str] | Literal['auto']) – Name of the frames corresponding to the feet of the robot. ‘auto’ to automatically detect them from the set of contact and force sensors of the robot. Optional: ‘auto’ by default.

class gym_jiminy.common.compositions.locomotion.MinimizeAngularMomentumReward(env, cutoff, shape=KernelShape.SQUARED_EXPONENTIAL)[source]¶

Bases: QuantityReward

Reward the agent for minimizing the angular momentum in world plane.

The angular momentum along x- and y-axes in local odometry frame is transformed in a normalized reward to maximize by applying a given RBF kernel on the error. See TrackingQuantityReward documentation for technical details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
cutoff (float) – Cutoff threshold for the RBF kernel transform.
shape (KernelShape) – Desired type of RBF kernel. Optional: KernelShape.SQUARED_EXPONENTIAL by default.

class gym_jiminy.common.compositions.locomotion.MinimizeFrictionReward(env, cutoff, shape=KernelShape.SQUARED_EXPONENTIAL)[source]¶

Bases: QuantityReward

Reward the agent for minimizing the tangential forces at all the contact points and collision bodies, and to avoid jerky intermittent contact state.

The L^2-norm is used to aggregate all the local tangential forces. While the L^1-norm would be more natural in this specific cases, using the L^2- norm is preferable as it promotes space-time regularity, ie balancing the force distribution evenly between all the candidate contact points and avoiding jerky contact forces over time (high-frequency vibrations), phenomena to which the L^1-norm is completely insensitive.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
cutoff (float) – Cutoff threshold for the RBF kernel transform.
shape (KernelShape) – Desired type of RBF kernel. Optional: KernelShape.SQUARED_EXPONENTIAL by default.

class gym_jiminy.common.compositions.locomotion.BaseRollPitchTermination(env, low, high, grace_period=0.0, *, training_only=False)[source]¶

Bases: QuantityTermination

Encourages the agent to keep the floating base straight, ie its torso in case of a humanoid robot, by prohibiting excessive roll and pitch angles.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
low (ndarray | number | float | int | bool | complex | Sequence[float | int | bool | complex | number] | None) – Lower bound below which termination is triggered.
high (ndarray | number | float | int | bool | complex | Sequence[float | int | bool | complex | number] | None) – Upper bound above which termination is triggered.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.

class gym_jiminy.common.compositions.locomotion.FallingTermination(env, min_base_height, grace_period=0.0, *, training_only=False)[source]¶

Bases: QuantityTermination

Terminate the episode immediately if the floating base of the robot gets too close from the ground.

It is assumed that the state is no longer recoverable when its condition is triggered. As such, the episode is terminated on the spot as the situation is hopeless. Generally speaking, aborting an epsiode in anticipation of catastrophic failure is beneficial. Assuming the condition is on point, doing this improves the signal to noice ratio when estimating the gradient by avoiding cluterring the training batches with irrelevant information.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
min_base_height (float) – Minimum height of the floating base of the robot below which termination is triggered.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.

class gym_jiminy.common.compositions.locomotion.FootCollisionTermination(env, security_margin=0.0, grace_period=0.0, frame_names='auto', *, training_only=False)[source]¶

Bases: QuantityTermination

Terminate the episode immediately if some of the feet of the robot are getting too close from each other.

Self-collision must be avoided at all cost, as it can damage the hardware. Considering this condition as a dramatically failure urges the agent to do his best in this matter, to the point of becoming risk averse.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
security_margin (float) – Minimum signed distance below which termination is triggered. This can be interpreted as inflating or deflating the geometry objects by the safety margin depending on whether it is positive or negative. See MultiFootCollisionDetection for details.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
frame_names (Sequence[str] | Literal['auto']) – Name of the frames corresponding to the feet of the robot. ‘auto’ to automatically detect them from the set of contact and force sensors of the robot. Optional: ‘auto’ by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.

gym_jiminy.common.compositions.locomotion.depth_approx(positions, heights, normals)[source]¶

Approximate signed distance from the ground profile (positive if above the ground, negative otherwise) of a set of the query points.

Internally, it uses a first order approximation assuming zero local curvature around each query point.

Parameters:

positions (ndarray) – Position of all the query points from which to compute from the ground profile, as a 2D array whose first dimension gathers the 3 position coordinates (X, Y, Z) while the second correponds to the N individual query points.
heights (ndarray) – Vertical height wrt the ground profile of the N individual query points in world frame as 1D array.
normals (ndarray) – Normal of the ground profile for the projection in world plane of all the query points, as a 2D array whose first dimension gathers the 3 position coordinates (X, Y, Z) while the second correponds to the N individual query points.

Return type:

ndarray

class gym_jiminy.common.compositions.locomotion._MultiContactGroundDistanceAndNormal(env, parent)[source]¶

Bases: InterfaceQuantity[Tuple[ndarray, ndarray]]

Signed distance (positive if above the ground, negative otherwise) and normal from the ground profile of all the candidate contact points.

Note

Internally, it does not compute the exact shortest distance from the ground profile because it would be computionally too demanding for now. As a surrogate, it relies on a first order approximation assuming zero local curvature around all the contact points individually.

Warning

The set of contact points must not change over episodes. In addition, collision bodies are not supported for now.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
parent (InterfaceQuantity | None) – Higher-level quantity from which this quantity is a requirement if any, None otherwise.

initialize()[source]¶

Initialize internal buffers.

This is typically useful to refresh shared memory proxies or to re-initialize pre-allocated buffers.

Warning

Intermediary quantities ‘requirements’ are NOT initialized automatically because they can be initialized lazily in most cases, or are optional depending on the most efficient computation path at run-time. It is up to the developer implementing quantities to take care of it.

Note

This method must be called before starting a new episode.

Note

Lazy-initialization is used for efficiency, ie initialize will be called before the first time refresh has to be called, which may never be the case if cache is shared between multiple identical instances of the same quantity.

Return type:: None

refresh()[source]¶

Evaluate this quantity based on the agent state at the end of the current agent step.

Return type:: Tuple[ndarray, ndarray]

class gym_jiminy.common.compositions.locomotion.FlyingTermination(env, max_height, grace_period=0.0, *, training_only=False)[source]¶

Bases: QuantityTermination

Discourage the agent of jumping by terminating the episode immediately if the robot is flying too high above the ground.

This kind of behavior is unsually undesirable because it may be frightning for people nearby, damage the hardware, be difficult to predict and be hardly repeatable. Moreover, such dynamic motions tend to transfer poorly to reality because the simulation to real gap is worsening.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
max_height (float) – Maximum height of the lowest contact points wrt the groupd above which termination is triggered.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.

gym_jiminy.common.compositions.locomotion.compute_velocity_tangential(velocity, normal)[source]¶

Compute the norm of the velocity projected in the plan orthogonal to a given normal direction vector.

Warning

The normal direction vector used assumed to be normalized. It is up to the pratitioner to make sure this holds true.

Parameters:

velocity (ndarray) – Linear velocity in world-aligned reference frame.
normal (ndarray) – Normal direction vector in world reference frame.

Return type:

ndarray

gym_jiminy.common.compositions.locomotion._compute_max_velocity_tangential(velocities, depths, normals, height_thr)[source]¶

Compute the maximum norm of the tangential velocity wrt local curvature of the ground profile of all the frames that are considered in contact.

Parameters:

velocities (ndarray) – Linear velocity of each frame in local-world-aligned reference frame.
depths (ndarray) – Signed distance of each frames from the ground as a vector.
normals (ndarray) – Normal direction vector that fully specify the local curvature of the ground profile at the location of each frame as a 2D array whose first dimension gathers the position components (X, Y, Z) and the second corresponds to individual frames.
height_thr (float) – Distance threshold below which frames are considered in contact with the ground.

Return type:

float

class gym_jiminy.common.compositions.locomotion._MultiContactMaxVelocityTangential(env, parent, height_thr)[source]¶

Bases: AbstractQuantity[float]

Maximum norm of the tangential velocity wrt local curvature of the ground profile of all the candidate contact frames that are close enough from the ground.

Note

The maximum norm of the tangential velocity is considered to be 0.0 if none of the candidate contact frames are close enough from the ground.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
parent (InterfaceQuantity | None) – Higher-level quantity from which this quantity is a requirement if any, None otherwise.
height_thr (float) – Height threshold above which a candidate contact point is ignored for being too far from the ground.

height_thr: float¶: Height threshold above which a candidate contact point is deemed too far from the ground and is discarded from the set of frames being considered when looking for the maximum norm of the tangential velocity.

refresh()[source]¶

Evaluate this quantity based on the agent state at the end of the current agent step.

Return type:: float

class gym_jiminy.common.compositions.locomotion.SlippageTermination(env, height_thr, max_velocity, grace_period=0.0, *, training_only=False)[source]¶

Bases: QuantityTermination

Discourage the agent of sliding on the ground purposedly by terminating the episode immediately if some of the active contact points are slipping on the ground.

This kind of behavior is usually undesirable because they are hardly repeatable and tend to transfer poorly to reality. Moreover, it may cause a sense of poorly controlled motion to people nearby.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
height_thr (float) – Height threshold below which a candidate contact point is closed enough from the ground for its tangential velocity to be considered.
max_velocity (float) – Maximum norm of the tangential velocity wrt ground of the contact points that are close enough above which termination is triggered.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.

class gym_jiminy.common.compositions.locomotion.ImpactForceTermination(env, max_force_rel, grace_period=0.0, *, training_only=False)[source]¶

Bases: QuantityTermination

Terminate the episode immediately in case of violent impact on the ground.

Similarly to the jumping behavior, this kind of behavior is usually undesirable. See FlyingTermination documentation for details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
max_force_rel (float) – Maximum vertical force applied on any of the contact points or collision bodies above which termination is triggered.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.

class gym_jiminy.common.compositions.locomotion.DriftTrackingBaseOdometryPositionTermination(env, max_position_err, horizon, grace_period=0.0, *, training_only=False)[source]¶

Bases: DriftTrackingQuantityTermination

Terminate the episode if the current base odometry position is drifting too much over wrt some reference trajectory that is being tracked.

It is generally important to make sure that the robot is not deviating too much from some reference trajectory. It sounds appealing to make sure that the absolute error between the current and reference trajectory is bounded at all time. However, such a condition is very restrictive, especially for robots dealing with external disturbances or evolving on an uneven terrain. Moreover, when it comes to infinite-horizon trajectories in particular, eg periodic motions, avoiding drifting away over time involves being able to sense the absolute position of the robot in world frame via exteroceptive navigation sensors such as depth cameras or LIDARs. This kind of advanced sensor may not be able, thereby making the objective out of reach. Still, in the case of legged locomotion, what really matters is tracking accurately a nominal limit cycle as long as doing so does not compromise local stability. If it does, then the agent expected to make every effort to recover balance as fast as possible before going back to the nominal limit cycle, without trying to catch up with the ensuing drift since the exact absolute odometry pose in world frame is of little interest. See BaseOdometryPose and DriftTrackingQuantityTermination documentations for details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
max_position_err (float) – Maximum drift error in translation (X, Y) in world plane above which termination is triggered.
horizon (float) – Horizon over which values of the quantity will be stacked before computing the drift.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.

class gym_jiminy.common.compositions.locomotion.DriftTrackingBaseOdometryOrientationTermination(env, max_orientation_err, horizon, grace_period=0.0, *, training_only=False)[source]¶

Bases: DriftTrackingQuantityTermination

Terminate the episode if the current base odometry orientation is drifting too much over wrt some reference trajectory that is being tracked.

See BaseOdometryPose and DriftTrackingBaseOdometryPositionTermination documentations for details.

Note

It takes into account the number of turns of the yaw angle of the floating base over the whole span of the history.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
max_orientation_err (float) – Maximum drift error in orientation (yaw,) in world plane above which termination is triggered.
horizon (float) – Horizon over which values of the quantity will be stacked before computing the drift.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.

class gym_jiminy.common.compositions.locomotion.ShiftTrackingFootOdometryPositionsTermination(env, max_position_err, horizon, grace_period=0.0, frame_names='auto', *, training_only=False)[source]¶

Bases: ShiftTrackingQuantityTermination

Terminate the episode if the selected reference trajectory is not tracked with expected accuracy regarding the relative foot odometry positions, whatever the timestep being considered over some fixed-size sliding window.

See MultiFootRelativeXYZQuat and ShiftTrackingMotorPositionsTermination documentation for details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment.
max_position_err (float) – Maximum shift error in translation (X, Y) in world plane above which termination is triggered.
horizon (float) – Horizon over which values of the quantity will be stacked before computing the shift.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
frame_names (Sequence[str] | Literal['auto']) – Name of the frames corresponding to the feet of the robot. ‘auto’ to automatically detect them from the set of contact and force sensors of the robot. Optional: ‘auto’ by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.

gym_jiminy.common.compositions.locomotion.angle_distance(angle_left, angle_right)[source]¶

Compute the element-wise distance between two batches of angles.

The distance is defined as the smallest angle in absolute value between right and left angles (ignoring multi-turns).

See also

See angle_difference documentation for details.

Parameters:

angle_left (ndarray | number | float | int | bool | complex) – Left-hand side angles.
angle_right (ndarray | number | float | int | bool | complex) – Right-hand side angles.

Return type:

class gym_jiminy.common.compositions.locomotion.ShiftTrackingFootOdometryOrientationsTermination(env, max_orientation_err, horizon, grace_period=0.0, frame_names='auto', *, training_only=False)[source]¶

Bases: ShiftTrackingQuantityTermination

Terminate the episode if the selected reference trajectory is not tracked with expected accuracy regarding the relative foot odometry orientations, whatever the timestep being considered over some fixed-size sliding window.

See MultiFootRelativeXYZQuat and ShiftTrackingMotorPositionsTermination documentation for details.

Parameters:

env (InterfaceJiminyEnv) – Base or wrapped jiminy environment. Maximum shift error in orientation (yaw,) in world plane above which termination is triggered.
horizon (float) – Horizon over which values of the quantity will be stacked before computing the shift.
grace_period (float) – Grace period effective only at the very beginning of the episode, during which the latter is bound to continue whatever happens. Optional: 0.0 by default.
frame_names (Sequence[str] | Literal['auto']) – Name of the frames corresponding to the feet of the robot. ‘auto’ to automatically detect them from the set of contact and force sensors of the robot. Optional: ‘auto’ by default.
training_only (bool) – Whether the termination condition should be completely by-passed if the environment is in evaluation mode. Optional: False by default.
max_orientation_err (float)