Zhangyu Chen

cs.zychen@gmail.com


Zhangyu Chen is a researcher and engineer at Huawei Cloud, where he focuses on building efficient cloud infrastructure for large language models (LLMs). He received both his Ph.D. (advised by Prof. Yu Hua) and B.E. degrees from Huazhong University of Science and Technology (HUST).

His research interests include AI infrastructure, LLM inference, storage systems, and debugging.

News

May 22, 2026 Our paper TileSparse (Arithmetic-Intensity-Aware Sparse Attention for Compute-Bound LLM Decoding) has been accepted by ICML 2026! 🎉
May 01, 2026 Our paper DualMap (Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving) has been accepted by ICLR 2026! 🎉
Jun 15, 2025 Our paper on fail-slow hardware failure bugs received a Best Paper Award nomination at USENIX ATC 2025! 🏆
Feb 15, 2025 Our paper GPHash has been accepted by FAST 2025!
Feb 20, 2023 Our paper ROLEX won the Best Paper Award at FAST 2023! 🏆

Selected Publications

  1. ICML
    TileSparse: Arithmetic-Intensity-Aware Sparse Attention for Compute-Bound LLM Decoding
    Chao Wang, Pengfei Zuo, Zhangyu Chen, and 3 more authors
    In Proceedings of the 43rd International Conference on Machine Learning (ICML), 2026
  2. ICLR
    DualMap: Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving
    Ying Yuan, Pengfei Zuo, Bo Wang, and 3 more authors
    In Proceedings of the 14th International Conference on Learning Representations (ICLR), 2026
  3. USENIX ATC
    Understanding and Detecting Fail-Slow Hardware Failure Bugs in Cloud Systems
    Gen Dong, Yu Hua, Yongle Zhang, and 2 more authors
    In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2025
  4. DATE
    MPFS: A Scalable User-Space Persistent Memory File System for Multiple Processes
    Bo Ding, Wei Tong, Yu Hua, and 6 more authors
    In Proceedings of the Design, Automation and Test in Europe Conference (DATE), 2025
  5. FAST
    GPHash: An Efficient Hash Index for GPU with Byte-Granularity Persistent Memory
    Menglei Chen, Yu Hua, Zhangyu Chen, and 2 more authors
    In Proceedings of the 23rd USENIX Conference on File and Storage Technologies (FAST), 2025
  6. FAST
    ROLEX: A Scalable RDMA-oriented Learned Key-Value Store for Disaggregated Memory Systems
    Pengfei Li, Yu Hua, Pengfei Zuo, and 2 more authors
    In Proceedings of the 21st USENIX Conference on File and Storage Technologies (FAST), 2023
  7. ICCD
    RMMIO: Enabling Reliable Memory-Mapped I/O for Persistent Memory Systems
    Bo Ding, Wei Tong, Yu Hua, and 3 more authors
    In Proceedings of the 40th IEEE International Conference on Computer Design (ICCD), 2022
  8. ASPLOS
    Efficiently Detecting Concurrency Bugs in Persistent Memory Programs
    Zhangyu Chen, Yu Hua, Yongle Zhang, and 1 more author
    In Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
  9. DATE
    Improving the Energy Efficiency of STT-MRAM Based Approximate Cache
    Wei Zhao, Wei Tong, Dan Feng, and 6 more authors
    In Proceedings of the 24th Design, Automation and Test in Europe Conference (DATE), 2021
  10. USENIX ATC
    Lock-free Concurrent Level Hashing for Persistent Memory
    Zhangyu Chen, Yu Hua, Bo Ding, and 1 more author
    In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2020
  11. DAC
    Reducing Bit Writes in Non-volatile Main Memory by Similarity-aware Compression
    Zhangyu Chen, Yu Hua, Pengfei Zuo, and 2 more authors
    In Proceedings of the 57th Design Automation Conference (DAC), 2020
  12. USENIX ATC
    Mitigating Asymmetric Read and Write Costs in Cuckoo Hashing for Storage Systems
    Yuanyuan Sun, Yu Hua, Zhangyu Chen, and 1 more author
    In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2019