RLlib
RLlib is an open-source library for reinforcement learning that offers both high scalability and a unified API for a variety of applications. RLlib natively supports TensorFlow, TensorFlow Eager, and PyTorch, and most of its internals are agnostic to the underlying deep learning framework.
SMARTS contains two examples using Proximal Policy Optimization (PPO).
Proximal policy optimization
script: e12_rllib/ppo_example.py
Shows the basics of using RLlib with SMARTS through RLlibHiWayEnv.
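As a rough orientation before reading the script, a PPO training run over RLlibHiWayEnv might be configured as sketched below. This is a minimal sketch, not the actual example: the import path, scenario path, and env_config keys are assumptions, and e12_rllib/ppo_example.py is the authoritative version.

```python
import ray
from ray import tune

# Assumed import path; verify against your SMARTS version.
from smarts.env.rllib_hiway_env import RLlibHiWayEnv

if __name__ == "__main__":
    ray.init()
    tune.run(
        "PPO",
        stop={"timesteps_total": 100_000},
        config={
            "env": RLlibHiWayEnv,
            "env_config": {
                # Both keys below are assumptions; the example script
                # defines the real scenario list and agent specs.
                "scenarios": ["scenarios/sumo/loop"],
                "agent_specs": {},  # agent id -> AgentSpec mapping
                "headless": True,
            },
            "framework": "torch",
            "num_workers": 2,
        },
    )
```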
Proximal policy optimization with population based training
script: e12_rllib/ppo_pbt_example.py
Combines PPO with Population Based Training (PBT) scheduling.
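PBT is supplied to Tune as a trial scheduler. The configuration sketch below shows the general shape of that wiring; the mutated hyperparameters and their values are illustrative assumptions, not the ones used by e12_rllib/ppo_pbt_example.py.

```python
from ray import tune
from ray.tune.schedulers import PopulationBasedTraining

# Illustrative mutation space; see e12_rllib/ppo_pbt_example.py for the
# values the example actually uses.
pbt = PopulationBasedTraining(
    time_attr="time_total_s",
    metric="episode_reward_mean",
    mode="max",
    perturbation_interval=300,  # seconds between exploit/explore steps
    hyperparam_mutations={
        "lr": [1e-3, 5e-4, 1e-4],
        "train_batch_size": [1000, 2000, 4000],
    },
)

# Several PPO trials then train in parallel, periodically copying weights
# from better-performing trials and perturbing their hyperparameters:
# tune.run("PPO", scheduler=pbt, num_samples=4, config=...)
```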
Recommended reads
RLlib is implemented on top of Ray, a distributed computing framework designed with RL in mind. There are many docs about Ray and RLlib. We recommend reading the following pages first:
- RLlib in 60 seconds: Getting started with RLlib.
- Common Parameters: Configuring RLlib algorithms.
- Basic Python API: Basic Tune training.
- Logging to TensorBoard: How to use TensorBoard to visualize metrics.
- Built-in Models and Preprocessors: Built-in preprocessors, including how to deal with different observation spaces.
- Proximal Policy Optimization (PPO): RLlib's PPO implementation and PPO parameters.
- Tune Key Concepts: Tune key concepts.
- RLlib Examples: Get to know RLlib quickly through examples.
Resume training
With respect to the SMARTS/examples/e12_rllib examples, if you want to continue an aborted experiment, you can set resume_training=True. Note that resume_training=True will reuse the same configuration that was set in the original experiment. To make changes to an experiment that has already started, you can edit the latest experiment file in ./results.
If instead you want to start a new experiment but train from an existing checkpoint, look into How to Save and Load Trial Checkpoints.
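For the checkpoint case, a configuration sketch using the classic ray.tune.run API is given below. The checkpoint path is a placeholder, and newer Ray versions restore experiments via tune.Tuner.restore instead; consult the Ray documentation for your installed version.

```python
from ray import tune

# Start a fresh PPO experiment whose trainable is initialized from an
# existing trial checkpoint rather than from scratch.
tune.run(
    "PPO",
    config={},  # a new configuration may be supplied here
    restore="/path/to/checkpoint",  # placeholder checkpoint path
)
```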