Huiqiang Jiang
Research and Development Engineer at Microsoft Research Asia
Huiqiang Jiang is a Research and Development Engineer in the Systems Group at Microsoft Research Asia (Shanghai) and a graduate of Peking University. His research focuses on efficient inference and training methods integrated with software systems, particularly for large language models. His work covers dynamic sparse attention (MInference, RetrievalAttention), prompt compression (LLMLingua), KV cache compression, speculative decoding, model compression, sparse inference (PIT), neural architecture search, and efficient tuning. He has published dozens of papers at top conferences such as NeurIPS, ACL, EMNLP, and ICCV, and serves the community as a reviewer and Area Chair.
Topic