Zhangyu Chen

Zhangyu Chen is a researcher and engineer at Huawei Cloud, where he focuses on building efficient cloud infrastructure for large language models (LLMs). He received both his Ph.D. (advised by Prof. Yu Hua) and B.E. degrees from Huazhong University of Science and Technology (HUST).

His research interests include AI infrastructure, LLM inference, storage systems, and debugging.

News

Jul 02, 2026	Our paper TaiJi (Unifying Prefill-Decode Disaggregation and Aggregation for Goodput-Optimized LLM Serving) has been accepted by SC 2026! 🎉
May 01, 2026	Our paper TileSparse (Arithmetic-Intensity-Aware Sparse Attention for Compute-Bound LLM Decoding) has been accepted by ICML 2026! 🎉
Jan 16, 2026	Our paper DualMap (Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving) has been accepted by ICLR 2026! 🎉
Jul 10, 2025	Our paper on fail-slow hardware failure bugs received a Best Paper Award Nomination at USENIX ATC 2025! 🏆
Apr 25, 2025	Our paper on fail-slow hardware failure bugs has been accepted by USENIX ATC 2025! 🎉

Selected Publications

SC

TaiJi: Unifying Prefill-Decode Disaggregation and Aggregation for Goodput-Optimized LLM Serving

Chao Wang, Pengfei Zuo, Zhangyu Chen, and 3 more authors

In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2026
ICML

TileSparse: Arithmetic-Intensity-Aware Sparse Attention for Compute-Bound LLM Decoding

Chao Wang, Pengfei Zuo, Zhangyu Chen, and 3 more authors

In Proceedings of the 43rd International Conference on Machine Learning (ICML), 2026
ICLR

DualMap: Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving

Ying Yuan, Pengfei Zuo, Bo Wang, and 3 more authors

In Proceedings of the 14th International Conference on Learning Representations (ICLR), 2026
USENIX ATC

Understanding and Detecting Fail-Slow Hardware Failure Bugs in Cloud Systems

Gen Dong, Yu Hua, Yongle Zhang, and 2 more authors

In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2025

Best Paper Award Nomination
DATE

MPFS: A Scalable User-Space Persistent Memory File System for Multiple Processes

Bo Ding, Wei Tong, Yu Hua, and 6 more authors

In Proceedings of the Design, Automation & Test in Europe Conference (DATE), 2025
FAST

GPHash: An Efficient Hash Index for GPU with Byte-Granularity Persistent Memory

Menglei Chen, Yu Hua, Zhangyu Chen, and 2 more authors

In Proceedings of the 23rd USENIX Conference on File and Storage Technologies (FAST), 2025
FAST

ROLEX: A Scalable RDMA-oriented Learned Key-Value Store for Disaggregated Memory Systems

Pengfei Li, Yu Hua, Pengfei Zuo, and 2 more authors

In Proceedings of the 21st USENIX Conference on File and Storage Technologies (FAST), 2023

Best Paper Award
ICCD

RMMIO: Enabling Reliable Memory-Mapped I/O for Persistent Memory Systems

Bo Ding, Wei Tong, Yu Hua, and 3 more authors

In Proceedings of the 40th IEEE International Conference on Computer Design (ICCD), 2022
ASPLOS

Efficiently Detecting Concurrency Bugs in Persistent Memory Programs

Zhangyu Chen, Yu Hua, Yongle Zhang, and 1 more author

In Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022

DATE

Improving the Energy Efficiency of STT-MRAM Based Approximate Cache

Wei Zhao, Wei Tong, Dan Feng, and 6 more authors

In Proceedings of the 24th Design, Automation and Test in Europe Conference (DATE), 2021

USENIX ATC

Lock-free Concurrent Level Hashing for Persistent Memory

Zhangyu Chen, Yu Hua, Bo Ding, and 1 more author

In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2020

DAC

Reducing Bit Writes in Non-volatile Main Memory by Similarity-aware Compression

Zhangyu Chen, Yu Hua, Pengfei Zuo, and 2 more authors

In Proceedings of the 57th Design Automation Conference (DAC), 2020

USENIX ATC

Mitigating Asymmetric Read and Write Costs in Cuckoo Hashing for Storage Systems

Yuanyuan Sun, Yu Hua, Zhangyu Chen, and 1 more author

In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2019
