I recently defended my doctoral thesis with the title

Optimization and Learning Algorithms for Dynamic Locomotion of Walking Robots.


Below is the abstract of my thesis; the full version is hosted by the ETH library:

Robots hold the promise of becoming useful helpers for many dangerous, laborious, or unpleasant tasks. One significant barrier to the ubiquitous use of robotic assistants today is the difficulty of programming sufficiently intelligent behavior that would allow for autonomous operation. The present thesis focuses on the challenges relating to automatic motion planning and control, which is an essential aspect of the autonomous behavior of any mobile robot. We showcase our developed solutions primarily on quadrupedal walking robots, as legged locomotion offers the necessary versatility to accomplish tasks in many natural and human-made environments.

The complexity of walking robots calls for an automatic control approach that handles high-dimensional and nonlinearly coupled dynamics with minimal human involvement. In this work, we derive several theoretical insights and subsequently develop practical algorithms for legged locomotion that are applicable to a large variety of systems. Contrary to many state-of-the-art methods that separate the planning and control aspects, our approach takes a holistic view of the locomotion problem in order to use the entire dynamics of a system. We leverage both optimal control and machine learning techniques to efficiently generate whole-body walking motions in real-time without relying on preceding footstep planners.

In the first chapters of this thesis, we devise an optimal control approach for generating motion plans that utilize the system’s full dynamic capabilities and do not require a pre-specified contact sequence. We demonstrate that a hard contact model can be incorporated into an efficient optimization scheme to form a contact-implicit planning algorithm. The resulting motions are physically feasible and can exploit sliding contacts if it is expedient for the task. Expanding on the idea of creating trajectories under realistic dynamical constraints, we develop a second algorithm based on a stochastic system model. The ensuing method can discover stable walking motions in environments with only sparse footholds by sampling feasible trajectories. Unlike other stochastic optimal control approaches, our method can handle constraints like standard optimal control while also leveraging data-driven methods to improve sampling efficiency. Through hardware experiments, we verify the performance of both algorithms under realistic conditions.
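To illustrate the general flavor of sampling-based trajectory search (this is a minimal toy sketch on a 1D double integrator, not the thesis's algorithm; all names, dynamics, and costs here are assumptions for illustration), one can sample perturbed control sequences, discard those that violate hard constraints, and keep the best feasible rollout:

```python
import numpy as np

def rollout(x0, u_seq, dt=0.1):
    """Simulate a 1D double integrator (position, velocity) under controls u_seq."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for u in u_seq:
        x = x + dt * np.array([x[1], u])
        traj.append(x.copy())
    return np.array(traj)

def cost(traj, target=1.0):
    """Quadratic cost on distance to a target position plus terminal velocity."""
    return np.sum((traj[:, 0] - target) ** 2) + traj[-1, 1] ** 2

def sample_search(x0, horizon=20, n_samples=256, n_iters=10,
                  sigma=1.0, u_max=2.0, seed=0):
    """Refine a control sequence by sampling Gaussian perturbations and
    keeping the best rollout that satisfies the hard bound |u| <= u_max."""
    rng = np.random.default_rng(seed)
    u_mean = np.zeros(horizon)
    best_cost = cost(rollout(x0, u_mean))
    for _ in range(n_iters):
        samples = u_mean + sigma * rng.standard_normal((n_samples, horizon))
        # Enforce the constraint by rejection: only feasible samples survive.
        feasible = samples[np.all(np.abs(samples) <= u_max, axis=1)]
        for u in feasible:
            c = cost(rollout(x0, u))
            if c < best_cost:
                best_cost, u_mean = c, u
        sigma *= 0.9  # anneal exploration noise
    return u_mean, best_cost
```

Rejecting infeasible samples is the crude analogue of restricting the search to feasible trajectories; the actual method uses far more sophisticated constraint handling and data-driven proposal distributions.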

One of the practical challenges of our (stochastic) optimal control schemes is the computational complexity at runtime. The necessary powerful onboard computers can cause significant energy expenditure and reduce the payload of the system. To address this issue, the last part of this thesis presents a method for extracting the capabilities of an optimal control algorithm into a neural network. We show that optimal control theory naturally suggests a suitable loss function for policy search that can be minimized with standard machine learning tools. The procedure shifts the computational burden of optimal control into an offline training phase while online control is reduced to a mere evaluation of the learned policy. Our experiments show that the extracted policy exhibits similar performance as the original optimal controller.
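The offline/online split described above can be sketched in miniature (this is a hedged illustration, not the thesis's loss or network: the linear "teacher" controller, its gain, and the least-squares fit are all stand-ins for the real optimal controller and neural-network training):

```python
import numpy as np

# Teacher: a state-feedback controller u = -K x, standing in for the expensive
# online optimal controller. The gain K is an assumed example value.
K = np.array([1.0, 1.8])

def optimal_control(x):
    return -K @ x

# Offline phase: query the controller on sampled states to build a dataset,
# then fit a linear policy u = w @ x by least squares (a stand-in for
# minimizing a policy-search loss with standard machine learning tools).
rng = np.random.default_rng(0)
states = rng.standard_normal((500, 2))
controls = np.array([optimal_control(x) for x in states])
w, *_ = np.linalg.lstsq(states, controls, rcond=None)

# Online phase: control is reduced to a single cheap policy evaluation.
def policy(x):
    return w @ x
```

Because the teacher is exactly linear in the state, the fitted policy recovers it; with a nonlinear optimal controller the same pipeline trains a neural network instead, but the computational burden still shifts entirely into the offline phase.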


This work was supported by Intel Labs, the Swiss National Science Foundation (SNF) through projects 166232 and 188596, the National Centre of Competence in Research Robotics (NCCR Robotics), and the European Union’s Horizon 2020 research and innovation program under grant agreement No 780883. This work was conducted as part of ANYmal Research, a community to advance legged robotics.