


Huiqiang Jiang

Research and Development Engineer at Microsoft Research Asia

Huiqiang Jiang is a Research and Development Engineer in the Systems Group at Microsoft Research Asia (Shanghai) and a graduate of Peking University. His research focuses on efficient inference and training methods integrated with software systems, particularly for large language models. This includes dynamic sparse attention (MInference, RetrievalAttention), prompt compression (LLMLingua), KV cache compression, speculative decoding, model compression, sparse inference (PIT), neural architecture search, and efficient tuning. He has published dozens of papers at top conferences such as NeurIPS, ACL, EMNLP, and ICCV, and serves the community as a reviewer and Area Chair.
