How do I improve this? (quadruped RL training)
I’m new to RL and new to MuJoCo, so I have no idea which variables I should tune. Here are the terms I’ve rewarded and penalized.

I’ve rewarded the following:

+ r_upright + r_height + r_vx + r_vy + r_yaw + r_still + r_energy + r_posture + r_slip

and I’ve placed penalties on:

p_vy = w_vy * vy^2

p_yaw = w_yaw * yaw_rate^2

p_still = w_still * ( (vx^2 + vy^2 + vz^2) + 0.05*(wx^2 + wy^2 + wz^2) )

[…]
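To make the three penalty formulas above easier to discuss, here is a minimal Python sketch of how they could be computed from the base velocities. The function name `penalty_terms`, the default weights of 1.0, and the choice to treat `yaw_rate` as the z component of the angular velocity are all assumptions for illustration, not my actual training code or values. How the velocities are read from the simulator, and how the terms are combined into the total reward, is not shown.

```python
def penalty_terms(lin_vel, ang_vel, w_vy=1.0, w_yaw=1.0, w_still=1.0):
    """Compute the three penalty terms from the base velocities.

    lin_vel: (vx, vy, vz) linear velocity of the base
    ang_vel: (wx, wy, wz) angular velocity of the base
    The weights are placeholders (hypothetical defaults), not tuned values.
    """
    vx, vy, vz = lin_vel
    wx, wy, wz = ang_vel

    # Assumption: the yaw rate is the z component of the angular velocity.
    yaw_rate = wz

    # Penalize sideways drift.
    p_vy = w_vy * vy**2

    # Penalize turning.
    p_yaw = w_yaw * yaw_rate**2

    # Penalize any base motion; angular terms are down-weighted by 0.05.
    p_still = w_still * (
        (vx**2 + vy**2 + vz**2) + 0.05 * (wx**2 + wy**2 + wz**2)
    )

    return {"p_vy": p_vy, "p_yaw": p_yaw, "p_still": p_still}


if __name__ == "__main__":
    # Example: moving forward and sideways while turning.
    terms = penalty_terms(lin_vel=(1.0, 2.0, 0.0), ang_vel=(0.0, 0.0, 2.0))
    for name, value in terms.items():
        print(f"{name} = {value:.3f}")
```

Printing each term separately like this (or logging it per episode) is one way to see which term dominates the total, which is usually the first thing to check before changing any weights.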