PFRL’s Pretrained Model Zoo

Prabhat Nagarajan
2 min read · Apr 6, 2021
An episode of a pre-trained Rainbow model.

In a previous post, I gave a high-level overview of PFRL (Preferred RL), an open source deep reinforcement learning (RL) library. In this short post, I’ll show how you can easily load PFRL’s pre-trained models.

Currently, PFRL has pre-trained models for every algorithm that has a reproducibility script, i.e., a script intended to reproduce the original algorithm's published results as closely as possible. In particular, as of this writing, PFRL has reproducibility scripts for 9 key deep RL algorithms: A3C, DQN, IQN, Rainbow, DDPG, TRPO, PPO, TD3, and SAC.

Depending on the algorithm, PFRL has up to two types of pre-trained models available: final and best. The final weights are the network weights at the end of training. The best weights are the weights that produced the highest scores during the intermediate evaluation phases run throughout training. Sometimes, especially in papers applying DRL algorithms to the Atari suite, the reported scores come from either the best evaluation over the entire training run or a re-evaluation of the network weights that produced that best intermediate evaluation.

PFRL makes it easy to load these pre-trained models in a reproducibility script. For example, to view a trained DQN model’s performance, you can simply go to PFRL’s example script for DQN and run the following:

python train_dqn.py --demo --load-pretrained --env BreakoutNoFrameskip-v4 --pretrained-type best --gpu -1

Breaking these flags down:

  • --demo : Indicates to the script that it should evaluate the agent instead of training it.
  • --load-pretrained : Rather than create a new agent with a randomly initialized network, load an agent with a pre-trained network.
  • --pretrained-type best : Specifically load the best pretrained model (as explained above).
  • --env BreakoutNoFrameskip-v4 : Use the Breakout environment. The script will load the correct model based on the environment.
  • --gpu -1 : Indicates not to use the GPU (e.g., if you don’t have one).

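If you prefer to load a pre-trained model from your own code rather than through the example script, PFRL's reproducibility scripts do this via pfrl.utils.download_model, which fetches the requested weights (caching them locally) and returns the directory containing them as its first element. The sketch below is a minimal, hedged example assuming PFRL is installed; load_pretrained_agent is a hypothetical helper name, not part of PFRL's API:

```python
def load_pretrained_agent(agent, algorithm, env_id, pretrained_type="best"):
    """Download pre-trained weights and load them into an existing agent.

    `agent` should be constructed exactly as in the corresponding
    reproducibility script, since the saved weights must match the
    network architecture. `pretrained_type` is "best" or "final".
    """
    from pfrl import utils  # assumes pfrl is installed

    # download_model fetches the weights (or reuses a cached copy);
    # its first return value is the directory holding the saved model.
    model_dir = utils.download_model(
        algorithm, env_id, model_type=pretrained_type
    )[0]
    agent.load(model_dir)
    return agent
```

For example, after building a DQN agent exactly as in train_dqn.py, calling load_pretrained_agent(agent, "DQN", "BreakoutNoFrameskip-v4") would restore the best Breakout weights before running evaluation episodes.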
And that’s it! Loading pre-trained models with PFRL is quite simple.

For more information on PFRL, check out the following links:
