MPC-Net: A First Principles Guided Policy Search

Jan 20, 2020

My newest work has just been officially released:

MPC-Net: A First Principles Guided Policy Search.
Jan Carius, Farbod Farshidian, and Marco Hutter
IEEE Robotics and Automation Letters (RA-L), 2020

Here is the abstract of the publication:

We present an Imitation Learning approach for the control of dynamical systems with a known model. Our policy search method is guided by solutions from MPC. Typical policy search methods of this kind minimize a distance metric between the guiding demonstrations and the learned policy. Our loss function, however, corresponds to the minimization of the control Hamiltonian, which derives from the principle of optimality. Therefore, our algorithm directly attempts to solve the optimality conditions with a parameterized class of control laws. Additionally, the proposed loss function explicitly encodes the constraints of the optimal control problem and we provide numerical evidence that its minimization achieves improved constraint satisfaction. We train a mixture-of-expert neural network architecture for controlling a quadrupedal robot and show that this policy structure is well suited for such multimodal systems. The learned policy can successfully stabilize different gaits on the real walking robot from less than 10 min of demonstration data.
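
To make the idea of a Hamiltonian-based loss more concrete, here is a minimal sketch on a toy scalar linear-quadratic problem. It is not the implementation from the paper or the released code. The system, the cost weights, the linear policy u = -k*x, and all function names are assumptions chosen for illustration. The value-function gradient is computed in closed form here, whereas in the paper's setting that information is provided by the MPC solver.

```python
import numpy as np

# Toy problem; every number and name here is an illustrative assumption.
a, b = -0.5, 1.0   # dynamics: x_dot = a*x + b*u
q, r = 1.0, 0.1    # stage cost: 0.5*(q*x**2 + r*u**2)

# Value function V(x) = 0.5*p*x**2, with p the positive root of the
# scalar Riccati equation 2*a*p - (b*p)**2/r + q = 0.
# In the paper's setting this information comes from the MPC solver instead.
p = r * (a + np.sqrt(a**2 + b**2 * q / r)) / b**2

def hamiltonian(x, u):
    """Control Hamiltonian H = stage cost + dV/dx * dynamics."""
    return 0.5 * (q * x**2 + r * u**2) + p * x * (a * x + b * u)

def train_gain(states, steps=200, lr=5.0):
    """Fit u = -k*x by gradient descent on the mean Hamiltonian."""
    k = 0.0
    for _ in range(steps):
        u = -k * states
        dH_du = r * u + p * b * states        # partial derivative of H w.r.t. u
        grad_k = np.mean(dH_du * (-states))   # chain rule, since du/dk = -x
        k -= lr * grad_k
    return k

rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=256)  # sampled states stand in for MPC-visited states
k_learned = train_gain(states)
k_optimal = b * p / r                      # minimizer of H over u is u = -(b*p/r)*x
print(k_learned, k_optimal)
```

In this toy case the policy is never shown the optimal control as a regression target. It only descends the Hamiltonian, and the learned gain still approaches the optimal one, which is the contrast with distance-based imitation losses that the abstract draws.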

A preprint of the paper is available below. The published version can be found at https://doi.org/10.1109/LRA.2020.2974653. Our code is available on GitHub.

This work was supported by Intel Labs, the Swiss National Science Foundation (SNF) through projects 166232 and 188596, the National Centre of Competence in Research Robotics (NCCR Robotics), and the European Union’s Horizon 2020 research and innovation program under grant agreement No. 780883. This work was conducted as part of ANYmal Research, a community to advance legged robotics.