Zhilin Pei
Head of Training Framework and Compiled Computing, Shanghai Artificial Intelligence Laboratory, Shanghai, China
He is responsible for training framework and compilation computing work at Shanghai Artificial Intelligence Laboratory. His main research areas include core frameworks for heterogeneous chips and compilation technology. He has participated in several deep learning projects as a core member and is currently engaged in research on foundational frameworks and compilation computing at the laboratory.
Topic
AI Compiler Exploration and Practice in Large Model Scenarios
In recent years, the rise of large language models has placed heavy demands on the compute and storage capacity of hardware. AI training has scaled from a single machine with a single card, to a single machine with multiple cards, to small multi-machine clusters, and has now entered the era of large-scale distributed clusters. This talk explores how an AI compiler technology stack, building on existing technologies, can support distributed training and inference systems, taking a systematic approach to ensure that increasingly large models run efficiently on distributed systems.
Outline:
1. Background: problems faced by large models
2. Technical exploration: technical directions explored by existing AI compilers
3. Solution: DICompiler overall architecture and process
4. Technology development: explanation of relevant technical details
5. Summary of results: the benefits observed in practice