Academic Exchange Symposium between The Chinese University of Hong Kong, Shenzhen and Shanghai Jiao Tong University

Mathematical convergence analysis for the training of artificial neural networks

Speaker

JENTZEN, Arnulf, The Chinese University of Hong Kong, Shenzhen, China & University of Münster, Germany

Time

03 Apr, 11:00 - 11:20

Abstract

In this talk we present some recent mathematical results on the training of artificial neural networks (ANNs). In particular, we establish the existence of minimizers in the optimization landscapes associated with the training of ANNs with the popular non-differentiable rectified linear unit (ReLU) activation function. In contrast, we disprove the existence of minimizers in the optimization landscapes associated with the training of ANNs with other popular smooth activations such as the softplus and the sigmoid activations. Finally, in certain simplified ANN training scenarios we establish the convergence of gradient flow (GF) trajectories to good critical points whose risk values are close to the risk values of the global minimizers and, thereby, we establish overall convergence of GF trajectories in the training of such ANNs.
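For readers unfamiliar with the objects named in the abstract, the following is a minimal illustrative sketch, not the speaker's method or setting. It assumes a toy network with a single hidden ReLU neuron, a mean-squared-error risk on a small grid of points, and an explicit Euler discretization of the gradient flow (which is plain gradient descent with a small step size). All names (`risk`, `risk_grad`, `euler_gradient_flow`), the target function, the initial parameters, and the step size are hypothetical choices made only for illustration.

```python
import math


def relu(z):
    # Rectified linear unit: continuous, but not differentiable at z = 0.
    return max(z, 0.0)


def softplus(z):
    # Smooth activation log(1 + exp(z)), written in a numerically stable form.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def risk(theta, xs, ys):
    # Mean squared error of the toy network x -> v * relu(w * x + b) + c.
    w, b, v, c = theta
    return sum((v * relu(w * x + b) + c - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def risk_grad(theta, xs, ys):
    # Gradient of the risk; at the kink of the ReLU we use the
    # (assumed) convention that its derivative is 0.
    w, b, v, c = theta
    gw = gb = gv = gc = 0.0
    for x, y in zip(xs, ys):
        pre = w * x + b
        act = relu(pre)
        err = v * act + c - y
        ind = 1.0 if pre > 0 else 0.0
        gw += 2 * err * v * ind * x
        gb += 2 * err * v * ind
        gv += 2 * err * act
        gc += 2 * err
    n = len(xs)
    return [gw / n, gb / n, gv / n, gc / n]


def euler_gradient_flow(theta0, xs, ys, step=0.01, n_steps=2000):
    # Explicit Euler steps for the gradient flow d(theta)/dt = -grad risk(theta).
    theta = list(theta0)
    risks = [risk(theta, xs, ys)]
    for _ in range(n_steps):
        grad = risk_grad(theta, xs, ys)
        theta = [t - step * g for t, g in zip(theta, grad)]
        risks.append(risk(theta, xs, ys))
    return theta, risks


# Toy data: the target is itself a ReLU, so the risk has a global minimizer
# with risk value 0 in this (assumed) setting.
xs = [-1.0 + 0.05 * i for i in range(41)]
ys = [relu(x) for x in xs]
theta, risks = euler_gradient_flow([0.5, 0.1, 0.5, 0.0], xs, ys)
print("initial risk:", risks[0])
print("final risk:", risks[-1])
```

The sketch only makes the terms "risk", "activation", and "gradient flow trajectory" concrete. It says nothing about the existence or non-existence of minimizers, which is the subject of the results announced in the talk.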

Bio

Arnulf Jentzen (born November 1983) has been a presidential chair professor at The Chinese University of Hong Kong, Shenzhen since 2021 and a full professor at the University of Münster since 2019. He began his undergraduate studies in mathematics at Goethe University Frankfurt in Germany in 2004, received his diploma degree there in 2007, and completed his PhD in mathematics there in 2009. The core research topics of his research group are machine learning approximation algorithms, computational stochastics, numerical analysis for high-dimensional partial differential equations (PDEs), stochastic analysis, and computational finance. He currently serves on the editorial boards of several scientific journals, including the Annals of Applied Probability, Communications in Mathematical Sciences, the Journal of Machine Learning, the SIAM Journal on Scientific Computing, and the SIAM Journal on Numerical Analysis. In 2020 he received the Felix Klein Prize of the European Mathematical Society (EMS), and in 2022 he was awarded both an ERC Consolidator Grant from the European Research Council (ERC) and the Joseph F. Traub Prize for Achievement in Information-Based Complexity. Further details on the activities of his research group can be found at http://www.ajentzen.de.