Xinfeng Shi
Senior Technical Expert at Alibaba and Core Developer of the RTP-LLM Project
Joined Alibaba in 2013 and has worked on large model inference development since 2023, responsible for scheduling, distributed architecture, the inference pipeline, and performance optimization of RTP-LLM. RTP-LLM is a widely used inference engine within Alibaba, supporting large model inference across multiple business units, including Taobao, Tmall, Xianyu, Cainiao, Amap, Ele.me, AE, and Lazada.
Topic
RTP-LLM: Alibaba's Large Model Inference Engine
RTP-LLM is Alibaba's self-developed LLM inference engine. Built on high-performance kernels, efficient request scheduling, a distributed KV cache, and optimized decision-making at a central scheduling node, it delivers lower inference latency and higher throughput. It has been extensively applied and validated across a wide range of LLM scenarios.