Could PPO's implementation details be found using only the paper and the literature available at the time of publication?
Hi! I’ve tried to implement PPO for MuJoCo based only on the paper and resources available at the time of publication, without looking at any existing implementations of the algorithm. I have now compared my implementation to the relevant details listed in The 37 Implementation Details of Proximal Policy Optimization, and it turns out I missed most of them (see below). My question is: Were these details documented somewhere, or were they known implicitly in the community at […]