Lei Li, Institute of Natural Sciences, Shanghai Jiao Tong University
Room 306, No.5 Science Building
I will introduce the main ideas of recent work by Rotskoff and Vanden-Eijnden. For a special class of neural networks, training the parameters can be viewed as the evolution of an interacting particle system.
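In symbols (the notation below is my own sketch of the mean-field setup, not taken verbatim from the talk): for a two-layer network with $n$ hidden units in the $1/n$ scaling,

```latex
f_n(x) \;=\; \frac{1}{n}\sum_{i=1}^{n} c_i\,\sigma(a_i \cdot x + b_i)
       \;=\; \int \Phi(\theta, x)\,\rho_n(\mathrm{d}\theta),
\qquad
\theta_i = (a_i, b_i, c_i), \quad
\rho_n = \frac{1}{n}\sum_{i=1}^{n}\delta_{\theta_i},
```

so the network depends on the parameters only through the empirical measure $\rho_n$, and gradient descent moves each particle $\theta_i$, coupled to the others through the common residual $f_n(x) - y(x)$.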
The dynamics of the distribution of the parameters can then be viewed as a gradient flow in Wasserstein-2 space of a convex functional. In other words, although the loss may not be convex in the parameters, it is convex as a functional of the parameter distribution, and global convergence of the distribution can therefore be ensured. Based on this observation, new training algorithms with birth-death dynamics can be introduced to train such networks. The talk will mainly be given on the board; some background in PDEs and probability (e.g., the notion of gradient flows) is helpful for following it.
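As a rough illustration (the target function, step sizes, and the particular birth-death rule below are my own simplifications, not the algorithm from the talk or the papers), here is a sketch of training a mean-field two-layer network by gradient descent on its particles, with an occasional birth-death resampling step:

```python
import numpy as np

# Illustrative sketch (not the authors' algorithm): a two-layer network
#   f(x) = (1/n) * sum_i c_i * tanh(a_i * x + b_i)
# trained by full-batch gradient descent, where each hidden unit
# theta_i = (a_i, b_i, c_i) is treated as a particle. Occasionally a crude
# birth-death move kills the particles that currently increase the loss
# the most and respawns them near the most useful ones.
rng = np.random.default_rng(0)

n = 100                                   # number of particles (hidden units)
a, b, c = (rng.normal(size=n) for _ in range(3))
x = rng.uniform(-np.pi, np.pi, size=256)  # toy regression task
y = np.sin(2.0 * x)

def loss_and_grads(a, b, c):
    h = np.tanh(np.outer(x, a) + b)       # shape (samples, particles)
    resid = h @ c / n - y                 # mean-field average over particles
    loss = 0.5 * np.mean(resid ** 2)
    gc = resid @ h / (len(x) * n)
    gz = resid[:, None] * c * (1.0 - h ** 2) / (len(x) * n)
    return loss, x @ gz, gz.sum(axis=0), gc

loss0 = loss_and_grads(a, b, c)[0]
lr = 2.0 * n                              # O(n) rate offsets O(1/n) gradients
for step in range(3000):
    _, ga, gb, gc = loss_and_grads(a, b, c)
    a -= lr * ga
    b -= lr * gb
    c -= lr * gc
    if step % 500 == 0 and 0 < step < 2500:
        # crude stand-in for the birth-death rates: score each particle by
        # its signed contribution to the loss gradient in c
        h = np.tanh(np.outer(x, a) + b)
        resid = h @ c / n - y
        score = c * (resid @ h) / len(x)
        worst, best = np.argsort(score)[-5:], np.argsort(score)[:5]
        for p in (a, b, c):               # respawn worst near best, plus noise
            p[worst] = p[best] + 0.1 * rng.normal(size=5)

loss1 = loss_and_grads(a, b, c)[0]
print(f"loss: {loss0:.4f} -> {loss1:.4f}")
```

The point of the sketch is the change of viewpoint: the update rule only ever uses the particles through empirical averages, so it is really a dynamics on the empirical distribution, and the birth-death move acts on that distribution directly.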