Zhi-Qin John Xu is an associate professor at the Institute of Natural Sciences / School of Mathematical Sciences, Shanghai Jiao Tong University. He graduated from Zhiyuan College of Shanghai Jiao Tong University in 2012 and received his Ph.D. in applied mathematics from Shanghai Jiao Tong University in 2016. From 2016 to 2019, he was a postdoctoral fellow at NYU Abu Dhabi and the Courant Institute. He is the principal investigator of a Young Scientist project of the National Key R&D Program (Ministry of Science and Technology), an NSFC General Program project, and others. In language models, he identified that model complexity is critical to the memorization and reasoning capabilities of a language model. In deep learning theory, he and his collaborators discovered the frequency principle, parameter condensation, and the embedding principle of the loss landscape, and developed multi-scale neural networks. In AI for Science, mainly combustion, he and his collaborators developed deep-learning-based mechanism reduction (DeePMR) and a deep-learning-based surrogate model for accelerating the simulation of chemical kinetics (DeepCK). He has published papers as first or corresponding author in TPAMI, JMLR, AAAI, NeurIPS, SIMODS, CiCP, CSIAM Trans. Appl. Math., JCP, Combustion and Flame, Eur. J. Neurosci., etc. Currently, he is a Managing Editor of the Journal of Machine Learning.

Book (Chinese) in progress
Feedback of any kind is welcome! This book introduces some basics of deep learning in a phenomenon-driven way, together with ways to understand them. See GitHub.

Language Models
Language model research faces significant challenges, especially for academic research groups with constrained resources. These challenges include complex data structures, unknown target functions, high computational and memory costs, and a lack of interpretability in the inference process. Drawing a parallel to the use of simple models in scientific research, we propose the concept of an anchor function: a type of benchmark function designed for studying language models. A toy illustration follows the paper list below.
Slides on the anchor function and the initialization effect: ppt
[54] Towards Understanding How Transformer Perform Multi-step Reasoning with Matching Operation, arXiv 2405.15302 (2024), pdf, arXiv
[49] Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing, NeurIPS 2024, arXiv 2405.05409 (2024), pdf, arXiv
[48] Anchor function, arXiv 2401.08309 (2024), pdf, arXiv
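The papers above define anchor functions precisely; the sketch below is only a loose illustration of the idea, generating a toy addition-style task in which an "anchor" token selects which offset is added to the following "key" token. The token ranges, offsets, and sequence format here are assumptions for illustration, not the construction from arXiv 2401.08309.

```python
import random

# Hypothetical toy "anchor function" data generator (illustration only).
# Each anchor token a maps a key token x to x + ANCHOR_OFFSET[a].
ANCHOR_OFFSET = {101: 1, 102: 2, 103: 5}   # assumed anchor tokens and offsets
KEY_TOKENS = list(range(20, 80))           # assumed key-token range

def make_example(seq_len=8):
    """Build one sequence: filler keys plus one (anchor, key) pair, and the target."""
    anchor = random.choice(list(ANCHOR_OFFSET))
    key = random.choice(KEY_TOKENS)
    seq = [random.choice(KEY_TOKENS) for _ in range(seq_len - 2)]
    pos = random.randrange(len(seq) + 1)
    seq[pos:pos] = [anchor, key]           # insert the anchor-key pair
    target = key + ANCHOR_OFFSET[anchor]   # the "anchor function" value
    return seq, target

if __name__ == "__main__":
    for _ in range(3):
        seq, target = make_example()
        print(seq, "->", target)
```

Because the target function is known exactly, a benchmark of this kind lets one probe what a transformer learns (e.g., memorization versus compositional inference) under full control of the data.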
New Journal: Journal of Machine Learning
We launched a new journal: Journal of Machine Learning (JML). Submissions are welcome.
Editors-in-Chief: Prof. Weinan E, Prof. Jianfeng Lu
Scope: Journal of Machine Learning (JML) publishes high-quality research papers in all areas of machine learning, including innovative machine learning algorithms, theories of machine learning, and important applications of machine learning in AI, the natural sciences, the social sciences, and engineering. The journal emphasizes a balanced coverage of both theory and practice, and is published in a timely fashion in electronic form.
WHY: Although the world is generally over-populated with journals, the field of machine learning (ML) is one exception. In mathematics, we simply do not have a recognized venue (other than conference proceedings) for publishing our work on ML. In AI for Science, we would ideally publish our work in leading scientific journals such as Physical Review Letters, but this is difficult while we are still at the stage of developing methodologies. Although there are many conferences in ML-related areas, journal publication is still the preferred venue in many disciplines. The objective of the Journal of Machine Learning (JML) is to become a leading journal in all areas related to ML, including algorithms and theory for ML as well as applications to science and AI. JML will start as a quarterly publication. Since ML is a vast and fast-developing field, we will do our best to carry out a thorough and responsive review process. To this end, we have a group of young and active managing editors who handle the review process, and a large, interdisciplinary group of experienced board members who can offer quick opinions and suggest reviewers when needed.
Open access: yes. Fee: no.
Sponsors: Center for Machine Learning Research, Peking University & AI for Science Institute, Beijing
Publisher: Global Science Press, but the editorial board owns the journal.

Materials
A suggested notation for machine learning, published by BAAI (Beijing Academy of Artificial Intelligence): see the page on GitHub or the page at BAAI.
Slides from the First Conference on Machine Learning and Scientific Applications (CSML2022).
Slides from CSIAM 2020.

Contact
Email: xuzhiqin at sjtu dot edu dot cn, Tel: 021-54742760
Office: Room 326, No. 5 Science Building, Shanghai Jiao Tong University, 800 Dongchuan Road, Minhang District, Shanghai

Research Interest
I am interested in understanding deep learning through its training process, loss landscape, generalization, and applications. For example, we found a Frequency Principle (F-Principle): deep neural networks (DNNs) often capture target functions from low frequency to high frequency during training; a toy experiment is sketched below. An overview paper on the frequency principle is available at arXiv 2201.07395. We also found an embedding principle: the loss landscape of a DNN "contains" all the critical points of all narrower DNNs, made concrete after the sketch below. I am also interested in computational neuroscience, ranging from theoretical study and simulation to data analysis.
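One quick way to see the F-Principle empirically is a minimal sketch like the following (an assumed setup, not the protocol of any particular paper): fit a 1D target containing one low and one high frequency, and track the error at each frequency via the discrete Fourier transform. The network width, learning rate, and frequencies are arbitrary choices.

```python
import numpy as np
import torch

# Minimal F-Principle demo: fit y = sin(x) + sin(5x) sampled over one period
# and watch the low-frequency error component shrink before the high one.
xs = np.linspace(-np.pi, np.pi, 256, endpoint=False)
x = torch.tensor(xs, dtype=torch.float32).unsqueeze(1)
y = torch.sin(x) + torch.sin(5 * x)

net = torch.nn.Sequential(
    torch.nn.Linear(1, 200), torch.nn.Tanh(), torch.nn.Linear(200, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def freq_error():
    """Error magnitude at DFT bins 1 and 5, the two target frequencies."""
    with torch.no_grad():
        err = (net(x) - y).squeeze().numpy()
    spec = np.abs(np.fft.rfft(err))
    return spec[1], spec[5]  # 256 uniform points over one period: sin(kx) is bin k

for step in range(5001):
    opt.zero_grad()
    loss = ((net(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        low, high = freq_error()
        print(f"step {step:5d}  |err| at freq 1: {low:7.2f}  at freq 5: {high:7.2f}")
```

In a typical run, the bin-1 error drops well before the bin-5 error, which is the low-to-high-frequency ordering the F-Principle describes.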
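To make the embedding-principle statement concrete, here is a one-step "neuron splitting" embedding for a one-hidden-layer network, written out from the statement above (the alpha-splitting construction is a standard one in this line of work; the one-hidden-layer setting is chosen only for brevity):

```latex
% Neuron splitting for f_\theta(x) = \sum_{k=1}^{m} a_k \sigma(w_k x):
% duplicating neuron j and splitting its output weight leaves f unchanged,
\[
\sum_{k=1}^{m} a_k\,\sigma(w_k x)
  = \underbrace{\alpha a_j\,\sigma(w_j x) + (1-\alpha)\,a_j\,\sigma(w_j x)}_{\text{two copies of neuron } j}
  + \sum_{k \neq j} a_k\,\sigma(w_k x),
\qquad \alpha \in [0,1].
\]
% One can check that this map sends each critical point of the m-neuron loss
% to a one-parameter family of critical points of the (m+1)-neuron loss,
% which is the sense in which the wider landscape "contains" the narrower one's.
```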