Michael Wong
Expert of the ISO Artificial Intelligence Technical Committee, Chair of the C++ Standard Committee’s Machine Learning Group, and CTO of YetiWare
Michael Wong is an Expert of the ISO Artificial Intelligence Technical Committee, Chair of the C++ Standard Committee's Machine Learning Group, and CTO of YetiWare. He also serves as Chair of the C++ Embedded Development Committee (SG14) and the Machine Learning Committee (SG19), as well as Chair of the C++ Evolution Working Group. He is the former Vice President of R&D at Codeplay and former CEO of OpenMP, and currently leads the Canadian Delegation to the C++ Standard Committee. Michael has extensive experience in C++ parallel computing, high-performance computing, and machine learning. He led the development of SYCL, the C++-based heterogeneous programming standard for GPU application development, and also contributed to OpenCL. He has deep expertise in the low-level performance optimization of PyTorch and TensorFlow, with work spanning parallel programming, neural networks, computer vision, and autonomous driving. Michael previously served as a Senior Technical Expert at IBM, where he led the development of the IBM XL C++ and XL C compilers.
Topic
The C++ AI Computing Wars: Standardizing the Stack to Survive the Age of LLMs and AI Agents
Although the AI and LLM revolution is driven by Python scripts, the actual execution happens in C++ on specialized hardware. As we move from general-purpose CPUs to complex heterogeneous environments composed of GPUs, NPUs, and other AI accelerators, the foundational role of C++ in high-performance computing is more essential than ever, yet it also faces new challenges. Its long-standing dominance is being questioned. New specialized languages such as Mojo and Triton promise Python-like productivity while delivering C++-level performance. At the same time, the industry's reliance on the proprietary CUDA ecosystem has created a growing demand for open and portable alternatives such as ROCm. This talk will outline strategies for navigating this new landscape. We will demonstrate that C++ is not a "legacy tool" to be merely maintained, but a language that can continue to evolve, serving as a standardized, high-performance, and interoperable foundation across the entire AI stack.
The New AI Imperative: Standardizing the Full C++ Stack for the Age of Agents
For decades, C++ has been the undisputed language of high-performance computing, serving as the powerful backend for AI and ML frameworks. This revolution, however, was built on a fragmented ecosystem of proprietary CUDA kernels, custom tensor libraries, and non-standard C-style code. Today, the rise of generative AI, LLM-based coding agents, and new domain-specific languages like Mojo has created a new imperative: C++ must evolve from a mere implementation detail into a standardized, productive, and safe platform for the entire AI pipeline. This talk presents a cohesive vision for C++'s future, showing how recent and upcoming features are building a complete, first-class AI stack. We will explore this strategy in three layers:

Layer 1: The Foundation (Data Science). How SG19's work on std::statistics, together with the critical need for a std::data_frame, provides the standard, high-performance tools for the exploratory data analysis that begins every ML workflow.

Layer 2: The Core Data Structures (Tensors & Graphs). Moving beyond C++23's std::mdspan (the "tensor view"), we will discuss the essential next steps: C++26's std::linalg, the proposal for an owning std::mdarray, and SG19's major proposal for std::graph, the standard solution for GNNs, recommendation engines, and knowledge graphs.

Layer 3: The Execution (Performance & Parallelism). How we run it all. We will cover how C++26's std::execution provides the standard "plumbing" to offload these new libraries to GPUs and accelerators, while std::simd offers a portable, high-level path for vectorization, finally moving us beyond non-standard compiler intrinsics.

Finally, we will address the most critical factor for AI-generated code: the "garbage in, garbage out" problem. This talk introduces a forward-looking call to action: the creation of a standardized, domain-specific C++ training corpus, an "ImageNet for C++", to teach LLMs to generate idiomatic, modern, and safe code for low-latency systems.