RLlib
RLlib is an open-source library for reinforcement learning that offers both high scalability and a unified API for a variety of applications. RLlib natively supports TensorFlow, TensorFlow Eager, and PyTorch. Most of its internals are agnostic to such deep learning frameworks.
SMARTS contains two examples using Proximal Policy Optimization (PPO).
Proximal policy optimization
script: e12_rllib/ppo_example.py
Shows the basics of using RLlib with SMARTS through RLlibHiWayEnv.
Proximal policy optimization with population based training
script: e12_rllib/ppo_pbt_example.py
Combines Proximal Policy Optimization (PPO) with Population Based Training (PBT) scheduling.
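For readers new to PBT, the idea behind the scheduler can be sketched in a few lines of plain Python. This is a toy illustration of the exploit/explore step only, not the code used by ppo_pbt_example.py or by Ray Tune. The function name pbt_step, the trial dictionary layout, and the default quantile and perturbation factors are all assumptions made for this sketch.

```python
import copy
import random


def pbt_step(population, quantile=0.25, factors=(0.8, 1.2), rng=None):
    """Run one exploit/explore step over a list of trial dicts.

    Each trial is a dict with the (hypothetical) keys:
      "score":   latest evaluation result, higher is better
      "hparams": dict of numeric hyperparameters
      "weights": any object standing in for model parameters
    """
    rng = rng or random.Random()
    ranked = sorted(population, key=lambda t: t["score"], reverse=True)
    n = max(1, int(len(ranked) * quantile))
    top, bottom = ranked[:n], ranked[-n:]

    for trial in bottom:
        # Skip trials that are also in the top group (tiny populations).
        if any(trial is t for t in top):
            continue
        donor = rng.choice(top)
        # Exploit: copy the weights of a well-performing trial.
        trial["weights"] = copy.deepcopy(donor["weights"])
        # Explore: perturb each of the donor's hyperparameters.
        trial["hparams"] = {
            name: value * rng.choice(factors)
            for name, value in donor["hparams"].items()
        }
    return population
```

In a real run this step would be applied periodically between training intervals, so that weak trials restart from strong ones with slightly different hyperparameters.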
Recommended reads
RLlib is implemented on top of Ray, a distributed computing framework specifically designed with RL in mind. Ray and RLlib are extensively documented. We recommend reading the following pages first:
RLlib in 60 seconds: Getting started with RLlib.
Common Parameters: Configuring RLlib algorithms.
Basic Python API: Basic tune training.
Logging to TensorBoard: How to use TensorBoard to visualize metrics.
Built-in Models and Preprocessors: Built-in preprocessor, including how to deal with different observation spaces.
Proximal Policy Optimization (PPO): RLlib PPO implementation and PPO parameters.
Tune Key Concepts: Tune key concepts.
RLlib Examples: Get to know RLlib quickly through examples.
Resume training
For the examples in SMARTS/examples/e12_rllib, you can continue an aborted experiment by setting resume_training=True. Note that a resumed experiment keeps the same configuration that was set in the original experiment.
To make changes to a started experiment, you can edit the latest experiment file in ./results.
If you instead want to start a new experiment that trains from an existing checkpoint, see How to Save and Load Trial Checkpoints.
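Finding the latest experiment file in ./results by hand can be tedious when several runs share the directory. The helper below is a minimal sketch for locating it. The function name latest_experiment_file is hypothetical, and the default file name pattern is an assumption about how experiment state files are named; adjust it to match what your results directory actually contains.

```python
from pathlib import Path
from typing import Optional


def latest_experiment_file(
    results_dir: str = "./results",
    pattern: str = "experiment_state-*.json",  # assumed naming; adjust as needed
) -> Optional[Path]:
    """Return the most recently modified file matching `pattern`.

    Searches `results_dir` recursively and returns None if nothing matches.
    """
    candidates = list(Path(results_dir).rglob(pattern))
    if not candidates:
        return None
    # Pick the file with the newest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Calling latest_experiment_file() with no arguments searches ./results relative to the current working directory.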