Probabilistic models for data-efficient reinforcement learning


On our path toward fully autonomous systems, i.e., systems that operate in the real world without significant human intervention, reinforcement learning (RL) is a promising framework for learning to solve problems by trial and error. While RL has had many successes recently, a practical challenge is its data inefficiency: in real-world problems such as robotics, it is not always possible to conduct millions of experiments, for example because of time or hardware constraints. In this talk, I will outline three approaches that explicitly address the data-efficiency challenge in RL using probabilistic models. First, I will give a brief overview of a model-based RL algorithm that can learn from small datasets. Second, I will describe an idea based on model predictive control that allows us to learn even faster while respecting state or control constraints, which is important for safe exploration. Finally, I will introduce an idea for meta-learning in the context of model-based RL, which is based on latent variables.

Karlsruhe Institute of Technology (virtual)
Marc Deisenroth
DeepMind Chair in Artificial Intelligence