Algorithms

We provide the following training algorithms:

PPO
PPO for Single Step RL
PPO for BiLevel Optimization

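All of these algorithms are designed for multi-agent systems; for single-agent experiments, set nagents to 1 in the environment. They all use Centralized Training with Decentralized Execution (CTDE): training can draw on information from every agent, while at execution time each agent acts only on its own observations.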
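To make the split between centralized training and decentralized execution concrete, here is a minimal sketch in plain Python. It is not this library's API: the names `N_AGENTS`, `OBS_DIM`, `actor_act`, and `central_critic_value` are hypothetical, and the linear "networks" stand in for whatever models the real algorithms use. It only illustrates which inputs each component is allowed to see.

```python
import random

# Hypothetical sizes for illustration; N_AGENTS = 1 gives the single-agent case.
N_AGENTS = 3
OBS_DIM = 4


def make_weights(n_in, rng):
    """Return a random weight vector standing in for a trained network."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n_in)]


def actor_act(weights, local_obs):
    """Decentralized execution: an actor sees only its own observation."""
    score = sum(w * o for w, o in zip(weights, local_obs))
    return 1 if score > 0 else 0  # toy binary action


def central_critic_value(weights, joint_obs):
    """Centralized training: the critic sees every agent's observation."""
    flat = [x for obs in joint_obs for x in obs]
    return sum(w * o for w, o in zip(weights, flat))


rng = random.Random(0)

# One actor per agent, each sized for a single local observation.
actors = [make_weights(OBS_DIM, rng) for _ in range(N_AGENTS)]
# One shared critic, sized for the concatenated (joint) observation.
critic = make_weights(OBS_DIM * N_AGENTS, rng)

joint_obs = [
    [rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)] for _ in range(N_AGENTS)
]

# Execution: each agent picks its action from its own observation only.
actions = [actor_act(w, obs) for w, obs in zip(actors, joint_obs)]

# Training: the critic scores the joint observation of all agents.
value = central_critic_value(critic, joint_obs)

print(len(actions), "actions; critic input size:", len(critic))
```

The critic is needed only during training, so at deployment each agent can run its actor without access to the other agents' observations.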


Last updated 4 years ago
