Machine Learning Lab PhD Student Forum Series (Session 13): Implicit bias and implicit acceleration in deep learning
Speaker: Chuhan Xie (PKU)
Time: 2021-09-01 15:10-16:10
Venue: First-floor Conference Room, Jingyuan Courtyard 6, Peking University & Tencent Meeting 914 9325 5784
Abstract: One of the major research directions in deep learning theory is to explain why a learned neural network model can generalize well even when it is highly overparametrized. Recent work sheds light on this question: the optimization algorithm (e.g. gradient descent) used in training is biased towards simple solutions with good generalization performance. This phenomenon is called implicit bias; more precisely, implicit bias means that the iterates converge to a solution that minimizes a regularization function Q(.) under certain restrictions. In this talk, we will first introduce the precise definition of implicit bias, and then review recent research characterizing it across different learning tasks (e.g. classification, matrix factorization) and different optimization methods (e.g. adaptive algorithms). In addition, we will introduce "implicit acceleration", a phenomenon in which overparametrization sometimes speeds up training. Taking deep linear networks as an example, we will illustrate that vanilla gradient descent on such an overparametrized model amounts to gradient descent with momentum and an adaptive learning rate on the original model.
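
A concrete instance of the Q(.) formulation above (a standard result for linear classification, recalled here only for illustration; the notation is ours, not from the talk): for logistic regression on linearly separable data {(x_i, y_i)}, gradient descent sends ||w(t)|| to infinity, yet the direction w(t)/||w(t)|| converges to that of the hard-margin solution

    argmin_w ||w||_2^2   subject to   y_i w^T x_i >= 1 for all i,

so the implicit regularizer is Q(w) = ||w||_2^2 and the "restriction" is the margin constraint.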
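
For the implicit-acceleration claim, a minimal sketch of the depth-2 scalar case (plain Python; the data point, step size, and initialization are assumptions made for this demo, not taken from the talk). Writing the weight as w = a * b and running vanilla gradient descent on (a, b) moves w, up to O(eta^2) terms, by -eta * (a^2 + b^2) * grad(w), i.e. a gradient step on the original model with an adaptive learning rate:

    x, y = 2.0, 3.0            # one data point: fit y ~ w * x
    eta, steps = 1e-3, 5       # assumed step size and iteration count
    a, b = 0.5, 0.5            # balanced initialization, w = a * b

    def grad(w):
        # gradient of the squared loss L(w) = 0.5 * (w * x - y) ** 2
        return (w * x - y) * x

    for t in range(steps):
        w = a * b
        g = grad(w)
        w_pred = w - eta * (a**2 + b**2) * g      # adaptive-rate prediction
        a, b = a - eta * b * g, b - eta * a * g   # plain GD on (a, b)
        print(f"step {t}: actual w = {a * b:.6f}, predicted = {w_pred:.6f}")

For deeper linear networks the induced preconditioner depends on the current end-to-end weights, which themselves accumulate past gradients; this memory of previous updates is the momentum-like effect the abstract refers to (cf. Arora, Cohen and Hazan, 2018).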