Block

This module gathers base implementations for blocks to be used in pipeline control design.

It implements:

  • the concept of a block that can be connected to a BaseJiminyEnv environment through any level of indirection

  • the base controller block

  • the base observer block

class gym_jiminy.common.bases.block_bases.BlockInterface(env, update_ratio=1, **kwargs)[source]

Bases: object

Base class for blocks used for pipeline control design.

Blocks can be either observers or controllers. A block can be connected to any number of subsequent blocks, or directly to a BaseJiminyEnv environment.

Initialize the block interface.

It only allocates some attributes.

Parameters
  • env (gym_jiminy.common.envs.env_generic.BaseJiminyEnv) – Environment to ultimately control, i.e. completely unwrapped.

  • update_ratio (int) – Ratio between the update period of the high-level controller and the one of the subsequent lower-level controller.

  • kwargs (Any) – Extra keyword arguments that may be useful for combining several base classes through multiple inheritance.

Return type

None
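
As a rough illustration of update_ratio, the sketch below assumes that the update period of a block is the update period of the subsequent block (or of the environment) multiplied by the ratio. The helper block_update_period and the numeric values are hypothetical and not part of gym_jiminy.

```python
def block_update_period(next_dt: float, update_ratio: int = 1) -> float:
    """Update period of a block, given the period of the subsequent block.

    Hypothetical helper: it assumes that the ratio multiplies the update
    period of the subsequent lower-level block (or of the environment).
    """
    if update_ratio < 1:
        raise ValueError("'update_ratio' must be a strictly positive integer.")
    return update_ratio * next_dt


# Environment controlled at 1 kHz, low-level block at the same rate,
# high-level block running 10 times slower than the low-level one.
env_dt = 0.001
low_level_dt = block_update_period(env_dt, update_ratio=1)
high_level_dt = block_update_period(low_level_dt, update_ratio=10)
```

With these values, the high-level block would be updated every 10 ms.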

observation_space: Optional[gym.spaces.space.Space]
action_space: Optional[gym.spaces.space.Space]
_setup()[source]

Reset the internal state of the block.

Note

The environment itself is not necessarily directly connected to this block since it may actually be connected through another block instead.

Note

It is possible to update the configuration of the simulator, for example to register some extra variables to monitor the internal state of the block.

Return type

None

_refresh_observation_space()[source]

Configure the observation of the block.

Note

The observation space refers to the output of the system once connected with another block. For example, for a controller, it is the action from the next block.

Return type

None

_refresh_action_space()[source]

Configure the action of the block.

Note

The action space refers to the input of the block. It does not have to be an actual action. For example, for an observer, it is the observation from the previous block.

Return type

None

class gym_jiminy.common.bases.block_bases.BaseObserverBlock(*args, **kwargs)[source]

Bases: gym_jiminy.common.bases.generic_bases.ObserverInterface, gym_jiminy.common.bases.block_bases.BlockInterface

Base class to implement observers that can be used to compute observation features of a BaseJiminyEnv environment, through any number of lower-level observers.

../../../../_images/aafig-ae381d46b67db0cd87084c43666b3f08cf0ca8f8.svg

Formally, an observer is defined as a block mapping the observation space of the preceding observer, if any, or directly the one of the environment ‘obs_env’, to any observation space ‘features’. It is more generic than estimating the state of the robot.

The update period of the observer is the same as the simulation timestep of the environment for now.
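
The mapping from ‘obs_env’ to ‘features’ through successive observers can be pictured with plain functions. This is only a conceptual sketch: the observers add_speed_norm and add_is_moving, the helper observe, and the observation keys are hypothetical and do not use the gym_jiminy API.

```python
from typing import Callable, Dict, List

Observation = Dict[str, float]
Observer = Callable[[Observation], Observation]


def add_speed_norm(obs: Observation) -> Observation:
    """Lower-level observer: add the norm of a planar velocity."""
    features = dict(obs)
    features["speed"] = (obs["vx"] ** 2 + obs["vy"] ** 2) ** 0.5
    return features


def add_is_moving(obs: Observation) -> Observation:
    """Higher-level observer: relies on the feature computed just before."""
    features = dict(obs)
    features["is_moving"] = float(obs["speed"] > 0.1)
    return features


def observe(obs_env: Observation, observers: List[Observer]) -> Observation:
    """Apply each observer in order, starting from the environment observation."""
    features = obs_env
    for observer in observers:
        features = observer(features)
    return features


features = observe({"vx": 3.0, "vy": 4.0}, [add_speed_norm, add_is_moving])
```

Each observer only sees the output of the preceding one, which is why the order of the list matters here.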

Parameters
  • kwargs (Any) – Extra keyword arguments that may be useful for derived observers with multiple inheritance, and to allow automatic pipeline wrapper generation.

  • args (Any) –

Return type

None

_refresh_action_space()[source]

Configure the action space of the observer.

It does nothing but return the action space of the environment, since an observer only affects the observation space.

Warning

This method must not be overloaded. If one needs to overload it, then using BaseControllerBlock or BlockInterface directly is probably the way to go.

Return type

None

_setup()[source]

Reset the internal state of the block.

Note

The environment itself is not necessarily directly connected to this block since it may actually be connected through another block instead.

Note

It is possible to update the configuration of the simulator, for example to register some extra variables to monitor the internal state of the block.

Return type

None

refresh_observation(measure)[source]

Compute observed features based on the current simulation state and lower-level measure.

Parameters

measure (Union[Dict[str, StructNested], Sequence[StructNested], ValueType]) – Measure from the environment to process to get high-level observation.

Return type

None

_refresh_observation_space()

Configure the observation space.

Return type

None

get_observation()

Get post-processed observation.

By default, it does not perform any post-processing. One is responsible for clipping the observation if necessary to make sure it does not violate the lower and upper bounds. This can be done either by overloading this method, or in the case of pipeline design, by adding a clipping observation block at the very end.

Warning

In most cases, it is not necessary to overload this method, and doing so may lead to unexpected behavior if not done carefully.

Return type

Union[Dict[str, StructNested], Sequence[StructNested], ValueType]
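
The clipping mentioned above could look like the following minimal sketch, which assumes a flat observation stored as a list of floats. The function clip_observation and the bounds are hypothetical; with a gym.spaces.Box observation space, the bounds would typically come from its low and high attributes.

```python
from typing import List, Sequence


def clip_observation(obs: Sequence[float],
                     low: Sequence[float],
                     high: Sequence[float]) -> List[float]:
    """Clip each element of a flat observation to its [low, high] bounds."""
    return [min(max(value, lo), hi) for value, lo, hi in zip(obs, low, high)]


raw_obs = [-2.5, 0.3, 7.0]
clipped_obs = clip_observation(
    raw_obs, low=[-1.0, -1.0, -1.0], high=[1.0, 1.0, 5.0])
```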

observe_dt: float
observation_space: Optional[gym.spaces.space.Space]
_observation: Optional[Union[Dict[str, StructNested], Sequence[StructNested], ValueType]]
action_space: Optional[gym.Space]
class gym_jiminy.common.bases.block_bases.BaseControllerBlock(*args, **kwargs)[source]

Bases: gym_jiminy.common.bases.generic_bases.ControllerInterface, gym_jiminy.common.bases.block_bases.BlockInterface

Base class to implement controllers that can be used to compute targets to apply to the robot of a BaseJiminyEnv environment, through any number of lower-level controllers.

../../../../_images/aafig-506950d67946aa370f719ef7a824bb8f823f16ab.svg

Formally, a controller is defined as a block mapping any action space ‘act_ctrl’ to the action space of the subsequent controller ‘cmd_ctrl’, if any, and ultimately to the one of the associated environment ‘act_env’, i.e. the motor efforts to apply on the robot.

The update period of the controller must be higher than the control update period of the environment, but both can be infinite, i.e. time-continuous.

Note

The space in which the command must be contained is completely determined by the action space of the next block (another controller or the environment to ultimately control). Thus, it does not have to be defined explicitly.

On the contrary, the action space of the controller ‘act_ctrl’ is free and it is up to the user to define it.
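
To make the chain from ‘act_ctrl’ down to ‘act_env’ more concrete, here is a conceptual sketch with two stand-in controllers. The classes OffsetController and PDController, the helper compute_env_action, and the measure keys "q" and "v" are all hypothetical; they only mimic the compute_command(measure, action) signature documented below and do not derive from BaseControllerBlock.

```python
from typing import Dict, Sequence


class PDController:
    """Low-level controller: target position -> motor effort."""

    def __init__(self, kp: float, kd: float) -> None:
        self.kp = kp
        self.kd = kd

    def compute_command(self, measure: Dict[str, float], action: float) -> float:
        # 'action' is the target position, i.e. the input of this block
        return self.kp * (action - measure["q"]) - self.kd * measure["v"]


class OffsetController:
    """High-level controller: position offset -> target position."""

    def compute_command(self, measure: Dict[str, float], action: float) -> float:
        # 'action' is an offset with respect to the current position
        return measure["q"] + action


def compute_env_action(controllers: Sequence,
                       measure: Dict[str, float],
                       action: float) -> float:
    """Propagate a high-level action down to the environment.

    Controllers are ordered from the highest level to the lowest one.
    """
    command = action
    for controller in controllers:
        command = controller.compute_command(measure, command)
    return command


measure = {"q": 0.5, "v": 0.25}
effort = compute_env_action(
    [OffsetController(), PDController(kp=100.0, kd=4.0)], measure, 0.25)
```

In this sketch, the command space of OffsetController is dictated by what PDController accepts as input, while its own action (the offset) is chosen freely.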

Parameters
  • args (Any) – Extra arguments that may be useful for combining several base classes through multiple inheritance, and to allow automatic pipeline wrapper generation.

  • kwargs (Any) – Extra keyword arguments. See ‘args’.

Return type

None

_refresh_observation_space()[source]

Configure the observation space of the controller.

It does nothing but return the observation space of the environment, since a controller only affects the action space.

Warning

This method must not be overloaded. If one needs to overload it, then using BaseObserverBlock or BlockInterface directly is probably the way to go.

Return type

None

_refresh_action_space()

Configure the action space of the controller.

Note

This method is called right after _setup, so that both the environment to control and the controller itself should be already initialized.

Return type

None

_setup()[source]

Configure the controller.

It includes:

  • refreshing the action space of the controller

  • allocating memory of the controller’s internal state and initializing it

Note

The environment to ultimately control, env, has already been fully initialized at this point, so that each of its internal buffers is up-to-date, but the simulation is not running yet. As a result, it is still possible to update the configuration of the simulator, for example to register some extra variables to monitor the internal state of the controller.

Return type

None

compute_command(measure, action)

Compute the action to perform by the subsequent block, namely a lower-level controller, if any, or the environment to ultimately control, based on a given high-level action.

Note

The controller is supposed to be already fully configured whenever this method might be called. Thus it can only be called manually after reset. This method does not have to deal with the initialization of the internal state, since the _setup method does so.

Parameters
  • measure (Union[Dict[str, StructNested], Sequence[StructNested], ValueType]) – Observation of the environment.

  • action (Union[Dict[str, StructNested], Sequence[StructNested], ValueType]) – Target to achieve.

Returns

Action to perform.

Return type

Union[Dict[str, StructNested], Sequence[StructNested], ValueType]

compute_reward(*args, info, **kwargs)

Compute reward at current episode state.

See ControllerInterface.compute_reward for details.

Note

This method is called after updating the internal buffer ‘_num_steps_beyond_done’, which is None if the simulation is not done, 0 right after, and so on.

Parameters
  • args (Any) – Extra arguments that may be useful for derived environments, for example gym.GoalEnv.

  • info (Dict[str, Any]) – Dictionary of extra information for monitoring.

  • kwargs (Any) – Extra keyword arguments. See ‘args’.

Returns

Total reward.

Return type

float
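
The values taken by ‘_num_steps_beyond_done’ can be illustrated with a small stand-alone counter. The function update_num_steps_beyond_done is hypothetical and only reproduces the behavior described in the note above (None while the simulation is not done, 0 right after, then 1, 2, and so on); the actual update logic may differ.

```python
from typing import List, Optional


def update_num_steps_beyond_done(counter: Optional[int],
                                 done: bool) -> Optional[int]:
    """None while running, 0 on the first 'done' step, then 1, 2, ..."""
    if not done:
        return None
    return 0 if counter is None else counter + 1


counter: Optional[int] = None
history: List[Optional[int]] = []
for done in (False, False, True, True, True):
    counter = update_num_steps_beyond_done(counter, done)
    history.append(counter)
```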

compute_reward_terminal(*, info)

Compute terminal reward at current episode final state.

Note

Implementation is optional. For the sake of efficiency, no terminal reward is computed unless this method is overloaded by the user.

Warning

Similarly to compute_reward, ‘info’ can be updated by reference to log extra info for monitoring.

Parameters

info (Dict[str, Any]) – Dictionary of extra information for monitoring.

Returns

Terminal reward.

Return type

float
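
Updating ‘info’ by reference simply means mutating the dictionary in place, so that the caller sees the extra entries. The sketch below is a stand-alone function with the same keyword-only signature; the keys "goal_reached" and "reward_terminal" and the bonus value are hypothetical.

```python
from typing import Any, Dict


def compute_reward_terminal(*, info: Dict[str, Any]) -> float:
    """Hypothetical terminal reward: fixed bonus if the goal was reached."""
    reward = 10.0 if info.get("goal_reached", False) else 0.0
    # 'info' is mutated in place, so the caller sees the new entry
    info["reward_terminal"] = reward
    return reward


info: Dict[str, Any] = {"goal_reached": True}
reward = compute_reward_terminal(info=info)
```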

control_dt: float
action_space: Optional[gym.spaces.space.Space]
_action: Optional[Union[Dict[str, StructNested], Sequence[StructNested], ValueType]]
observation_space: Optional[gym.Space]
get_fieldnames()[source]

Get mapping between each scalar element of the action space of the controller and the associated fieldname for logging.

It is expected to return an object with the same structure as the action space, the difference being that numerical arrays are replaced by lists of strings.

By default, it generates generic fieldnames using the ‘Action’ prefix and the index as suffix for np.ndarray.

Note

This method is not supposed to be called before reset, so that the controller should be already initialized at this point.

Return type

Union[Dict[str, StructNested], Sequence[StructNested], ValueType]
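
The idea of mirroring the structure of the action space with lists of strings can be sketched as follows. This is not the gym_jiminy implementation: the function generic_fieldnames is hypothetical, the action space is reduced to nested dictionaries whose leaves are array sizes, and the dot-separated naming for nested keys is an assumption.

```python
from typing import Dict, List, Union

Shape = Union[int, Dict[str, "Shape"]]
FieldNames = Union[List[str], Dict[str, "FieldNames"]]


def generic_fieldnames(shape: Shape, prefix: str = "Action") -> FieldNames:
    """Build fieldnames with the same nesting as 'shape'.

    Leaves are sizes of 1D arrays; each one becomes a list of names made
    of the prefix followed by the element index.
    """
    if isinstance(shape, dict):
        return {key: generic_fieldnames(sub, f"{prefix}.{key}")
                for key, sub in shape.items()}
    return [f"{prefix}{i}" for i in range(shape)]


flat_names = generic_fieldnames(3)
nested_names = generic_fieldnames({"arm": 2, "gripper": 1})
```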