smarts.env.gymnasium.wrappers.parallel_env module

class smarts.env.gymnasium.wrappers.parallel_env.ParallelEnv(env_constructors: Sequence[Callable[[int], gymnasium.Env]], auto_reset: bool, seed: int = 42)

Batch together multiple environments and step them in parallel. Each environment is simulated in a separate external process (using multiprocessing) for lock-free parallelism, and pipes are used for communication.

Note

Simulation might slow down when the number of parallel environments requested exceeds the number of available CPUs.
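As a rough illustration of the env_constructors argument and the note above, the sketch below builds a list of constructors and caps its length at the CPU count. It assumes that the integer passed to each constructor is a seed (the signature only says int). StubEnv, make_env, capped_constructors, and the scenario names are hypothetical; StubEnv stands in for a real gymnasium.Env so that the snippet runs without SMARTS installed.

```python
import os
from functools import partial
from typing import Callable, List, Optional


class StubEnv:
    """Hypothetical stand-in for a gymnasium.Env; not part of SMARTS."""

    def __init__(self, scenario: str, seed: int) -> None:
        self.scenario = scenario
        self.seed = seed


def make_env(scenario: str, seed: int) -> StubEnv:
    # Assumption: the int handed to each constructor is that environment's seed.
    return StubEnv(scenario=scenario, seed=seed)


def capped_constructors(
    scenarios: List[str], max_envs: Optional[int] = None
) -> List[Callable[[int], StubEnv]]:
    """Build one constructor per scenario, keeping at most `max_envs` of them.

    `max_envs` defaults to the CPU count reported by the OS.
    """
    limit = max_envs if max_envs is not None else (os.cpu_count() or 1)
    return [partial(make_env, scenario) for scenario in scenarios[:limit]]


env_constructors = capped_constructors(["loop", "merge", "intersection"], max_envs=2)

# With SMARTS installed, the list would be passed on like this:
# env = ParallelEnv(env_constructors=env_constructors, auto_reset=True, seed=42)
```

Each constructor takes a single int, which matches the Callable[[int], gymnasium.Env] type in the signature. Capping at the CPU count is only one possible policy for avoiding the slowdown described in the note.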

property action_space: gymnasium.Space

The environment’s action space, as a gymnasium.Space.

property batch_size: int

The number of environments.

close(terminate=False)

Sends a close message to all external processes.

Parameters:

terminate (bool, optional) – If True, then the close operation is forced and all processes are terminated. Defaults to False.
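One way to make sure the external processes are always cleaned up is to wrap the environment in a context manager. The sketch below closes gracefully on normal exit and forces termination if the body raised. StubParallelEnv and closing_env are hypothetical names; the stub only mimics the close(terminate=False) signature documented above so that the snippet runs without SMARTS.

```python
from contextlib import contextmanager
from typing import Iterator, List


class StubParallelEnv:
    """Hypothetical stand-in exposing only close(terminate=False)."""

    def __init__(self) -> None:
        self.close_calls: List[bool] = []

    def close(self, terminate: bool = False) -> None:
        # Record how close() was called instead of stopping real processes.
        self.close_calls.append(terminate)


@contextmanager
def closing_env(env) -> Iterator:
    """Close `env` on exit; force termination if the body raised."""
    try:
        yield env
    except BaseException:
        env.close(terminate=True)
        raise
    else:
        env.close()


env = StubParallelEnv()
with closing_env(env):
    pass  # reset() and step() calls would go here
```

Forcing termination only on errors is a design choice of this sketch, not documented behaviour of ParallelEnv.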

property observation_space: gymnasium.Space

The environment’s observation space, as a gymnasium.Space.

reset() → Tuple[Sequence[Dict[str, Any]], Sequence[Dict[str, Any]]]

Reset all environments.

Returns:

A batch of observations and infos from the vectorized environment.

Return type:

Tuple[Sequence[Dict[str, Any]], Sequence[Dict[str, Any]]]
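The return type is a pair of sequences, each holding one dictionary per environment. A minimal sketch of walking such a batch is shown below. The data is made up, and treating the dictionary keys as agent IDs is an assumption, since the type only says str. iter_agent_items is a hypothetical helper.

```python
from typing import Any, Dict, Iterator, Sequence, Tuple


def iter_agent_items(
    batch: Sequence[Dict[str, Any]]
) -> Iterator[Tuple[int, str, Any]]:
    """Yield (environment index, key, value) for every entry in a batch."""
    for env_index, per_env in enumerate(batch):
        for key, value in per_env.items():
            yield env_index, key, value


# Made-up data shaped like the return value of reset() for two environments.
observations = [
    {"agent_0": [0.0, 1.0]},
    {"agent_0": [2.0, 3.0], "agent_1": [4.0, 5.0]},
]
infos = [{"agent_0": {}}, {"agent_0": {}, "agent_1": {}}]

flat = list(iter_agent_items(observations))
```

With a real environment, the two lists would come from observations, infos = env.reset().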

seed() → Sequence[int]

Retrieves the seed used in each environment.

Returns:

Seed of each environment.

Return type:

Sequence[int]

step(actions: Sequence[Dict[str, Any]]) → Tuple[Sequence[Dict[str, Any]], Sequence[Dict[str, float]], Sequence[Dict[str, bool]], Sequence[Dict[str, bool]], Sequence[Dict[str, Any]]]

Steps all environments.

Parameters:

actions (Sequence[Dict[str, Any]]) – Actions for each environment.

Returns:

A batch of (observations, rewards, terminateds, truncateds, infos) from the vectorized environment.

Return type:

Tuple[Sequence[Dict[str, Any]], Sequence[Dict[str, float]], Sequence[Dict[str, bool]], Sequence[Dict[str, bool]], Sequence[Dict[str, Any]]]
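The batched shapes used by step() can be handled with two small helpers, sketched below under stated assumptions: dictionary keys are taken to be agent IDs, and the data is made up. batch_actions and finished_envs are hypothetical helpers, not part of SMARTS.

```python
from typing import Any, Callable, Dict, List, Sequence


def batch_actions(
    observations: Sequence[Dict[str, Any]],
    policy: Callable[[Any], Any],
) -> List[Dict[str, Any]]:
    """Apply `policy` to each observation, keeping the per-environment layout."""
    return [
        {key: policy(obs) for key, obs in per_env.items()}
        for per_env in observations
    ]


def finished_envs(
    terminateds: Sequence[Dict[str, bool]],
    truncateds: Sequence[Dict[str, bool]],
) -> List[int]:
    """Return indices of environments whose every entry is terminated or truncated."""
    done = []
    for index, (term, trunc) in enumerate(zip(terminateds, truncateds)):
        keys = set(term) | set(trunc)
        if keys and all(term.get(k, False) or trunc.get(k, False) for k in keys):
            done.append(index)
    return done


# Made-up observations for two environments and a toy policy.
observations = [{"agent_0": 1.0}, {"agent_0": 2.0, "agent_1": 3.0}]
actions = batch_actions(observations, policy=lambda obs: obs * 2)

# Made-up flags shaped like the terminateds and truncateds returned by step().
terminateds = [{"agent_0": True}, {"agent_0": False, "agent_1": True}]
truncateds = [{"agent_0": False}, {"agent_0": False, "agent_1": False}]
done_indices = finished_envs(terminateds, truncateds)
```

With a real environment, actions would be passed to env.step(actions), and the returned terminateds and truncateds would replace the made-up flags.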