Machine Learning Lab PhD Student Seminar Series (Session 12) — Reinforcement Learning with Observational Data: A Causal Perspective
Speaker: Liangyu Zhang (PKU)
Time: 2021-08-18 15:10-16:10
Venue: First-floor conference room, Jingyuan Courtyard 6, Peking University & Tencent Meeting 914 9325 5784
Abstract: In recent years, Reinforcement Learning (RL) has achieved tremendous success in various scenarios, e.g., game playing and robotics. However, these successes usually require either that the agent can repeatedly interact with the environment or that a large dataset generated by randomized trials is available. In critical domains such as healthcare or education, such trial and error is often impossible due to high costs or ethical constraints. Therefore, we would like to study how to solve RL problems with observational data. When dealing with observational data, the biggest challenge is that the behavior policy generating the data may depend on unobserved confounders. Fortunately, we can effectively tackle this challenge by leveraging tools from Causal Inference.
In this talk, we will first introduce the problem setup of RL with observational data and some basic results in causal inference. Then we will review a class of confounding-robust RL algorithms. The main idea is to construct causal bounds from confounded observations and then incorporate these bounds into the learning process. In addition, we will briefly review some work on causal imitation learning.
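To make the "causal bounds from confounded observations" idea concrete, here is a minimal sketch of the classical Manski-style bounds on the interventional value E[Y | do(a)] for a single-step (bandit) setting with rewards assumed bounded in [0, 1] and an arbitrary unobserved confounder. The function name and the bandit framing are illustrative, not from the talk; confounding-robust algorithms would then act pessimistically or optimistically with respect to these intervals.

```python
import numpy as np

def manski_bounds(actions, rewards, n_actions):
    """Causal bounds on E[Y | do(a)] from confounded observational data.

    Assumes rewards lie in [0, 1]. Without further assumptions on the
    confounder, the interventional mean is only partially identified:
        E[Y | A=a] * P(A=a) <= E[Y | do(a)]
                            <= E[Y | A=a] * P(A=a) + (1 - P(A=a)),
    i.e., unobserved units could have had reward 0 (lower) or 1 (upper).
    """
    lower = np.zeros(n_actions)
    upper = np.zeros(n_actions)
    for a in range(n_actions):
        mask = actions == a
        p_a = mask.mean()                       # P(A = a) under behavior policy
        mean_y = rewards[mask].mean() if mask.any() else 0.0
        lower[a] = mean_y * p_a                 # worst case for unseen units
        upper[a] = mean_y * p_a + (1.0 - p_a)   # best case for unseen units
    return lower, upper

# Example: a pessimistic agent picks the action maximizing the lower bound.
actions = np.array([0, 0, 0, 1, 1, 1])
rewards = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
lo, hi = manski_bounds(actions, rewards, n_actions=2)
pessimistic_action = int(np.argmax(lo))
```

Note how the interval width for each action is 1 - P(A=a): actions rarely taken by the behavior policy remain highly uncertain, which is exactly the uncertainty a confounding-robust learner must respect.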