Zhihong Wang

R&D Director, Foundation Model Division, SenseTime

Zhihong Wang is the R&D Director of the Foundation Model Business Unit at SenseTime. He previously led the development of SenseParrots, SenseTime’s in-house AI training framework. He currently oversees the delivery of SenseTime’s private-deployment AI solutions and leads the technical development and ecosystem building of the open-source project LazyLLM. He has deep expertise in bringing RAG and Agent systems to production in private environments and has helped dozens of enterprises adopt AI applications.

Topic

From Prototype to Production: The Three-Phase Architectural Evolution of LazyLLM

I. Background: Why Do We Need to Redesign LLM Frameworks?
  The gap between demos and production:
    Performance bottlenecks
    Limited scalability
    Hard-to-maintain architectures
  Strengths and limitations of the Python-first approach
  The origin of LazyLLM: building an evolvable framework rather than a "one-shot" engineering project

II. Functionality First: Building Modular Components for Agents
  Identifying the core functional modules of an agent
  Defining subsystems that can be tested, replaced, and evolved independently
  Benefits of modularization: faster validation, replaceability, and architectural controllability

III. Usability First: From Architectural Abstraction to Developer Experience
  Why usability is critical in LLM application development
  Using abstraction layers and API design to reduce learning costs
  Balancing usability and flexibility
  Absorbing "implicit complexity" within the framework

IV. Performance First: Evolving the Architecture from Python to C++
  Identifying performance-critical paths
  Which modules should be rewritten in C++ (compute-intensive / serialization / resource scheduling)
  Common approaches to Python/C++ hybrid development
  Performance gains vs. maintenance overhead after refactoring

V. Principles & Lessons from Architectural Evolution
  Why architectures should not aim for the "optimal design" from day one
  How to build architectures with long-term evolvability
  Trade-offs among functionality, usability, and performance
  Looking ahead: can the framework continue to evolve toward higher performance?
