In this talk, we address two fundamental problems in deep learning: (a) how can one achieve robustness in deep learning, and (b) how can over-parameterization help find sparse neural networks that generalize well? For the first problem, we revisit Huber's contamination model in robust statistics from the perspective of generative adversarial networks (GANs). When the contamination distribution is fully agnostic, GANs can provide robust estimates at information-theoretically optimal rates, matching the statistical precision of Tukey's median estimate, which is nevertheless NP-hard to compute. GANs may also adapt more broadly than recently proposed polynomial-time algorithms based on moment methods. Hence, playing zero-sum differential games in GANs yields provable guarantees for robust statistical estimation. For the second problem, we introduce differential inclusion methods that couple the over-parameterized and sparse models within the same algorithm, enabling parsimonious training with global convergence guarantees.
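
As a hedged sketch of the first problem (the notation here is illustrative, not taken from the talk), Huber's $\epsilon$-contamination model assumes the data are drawn from a mixture of a parametric model and an arbitrary outlier distribution, and a GAN-style estimator recovers the parameter by solving a zero-sum game between a generator indexed by $\eta$ and a discriminator class $\mathcal{D}$:

$$X_1,\dots,X_n \overset{\text{i.i.d.}}{\sim} (1-\epsilon)\,P_\theta + \epsilon\,Q, \qquad Q \text{ arbitrary},$$

$$\hat\theta \in \arg\min_{\eta}\; \max_{D\in\mathcal{D}} \;\Big\{ \tfrac{1}{n}\textstyle\sum_{i=1}^{n} \log D(X_i) \;+\; \mathbb{E}_{X\sim P_\eta}\log\big(1-D(X)\big) \Big\},$$

where the choice of discriminator class $\mathcal{D}$ governs whether the resulting estimate attains the optimal rate.

For the second problem, one illustrative form of such a differential inclusion (again, my notation and an assumed formulation) couples a dense, over-parameterized parameter $\theta$ with a sparse companion $\Gamma$ through an $\ell_2$ penalty:

$$\dot\theta_t = -\nabla_\theta \bar L(\theta_t,\Gamma_t), \qquad \dot z_t = -\nabla_\Gamma \bar L(\theta_t,\Gamma_t), \qquad z_t \in \partial\|\Gamma_t\|_1,$$

$$\bar L(\theta,\Gamma) = \ell(\theta) + \tfrac{1}{2\nu}\,\|\theta-\Gamma\|_2^2,$$

so that $\theta$ follows ordinary gradient dynamics on the coupled loss while $\Gamma$ traces out a sparse regularization path within the same algorithm.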