Zhangyu Chen
cs.zychen@gmail.com
Zhangyu Chen is a researcher and engineer at Huawei Cloud, where he focuses on building efficient cloud infrastructure for large language models (LLMs). He received both his Ph.D. (advised by Prof. Yu Hua) and B.E. degrees from Huazhong University of Science and Technology (HUST).
His research interests include AI infrastructure, LLM inference, storage systems, and debugging.
News
| Date | News |
|---|---|
| May 22, 2026 | Our paper TileSparse (Arithmetic-Intensity-Aware Sparse Attention for Compute-Bound LLM Decoding) has been accepted by ICML 2026! 🎉 |
| May 01, 2026 | Our paper DualMap (Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving) has been accepted by ICLR 2026! 🎉 |
| Jun 15, 2025 | Our paper on fail-slow hardware failure bugs received a Best Paper Award nomination at USENIX ATC 2025! 🏆 |
| Feb 15, 2025 | Our paper GPHash has been accepted by FAST 2025! |
| Feb 20, 2023 | Our paper ROLEX won the Best Paper Award at FAST 2023! 🏆 |
Selected Publications
- [ICML] TileSparse: Arithmetic-Intensity-Aware Sparse Attention for Compute-Bound LLM Decoding. In Proceedings of the 43rd International Conference on Machine Learning (ICML), 2026
- [ICLR] DualMap: Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving. In Proceedings of the 14th International Conference on Learning Representations (ICLR), 2026
- [USENIX ATC] Understanding and Detecting Fail-Slow Hardware Failure Bugs in Cloud Systems. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2025
- [DATE] MPFS: A Scalable User-Space Persistent Memory File System for Multiple Processes. In 2025 Design, Automation & Test in Europe Conference (DATE), 2025
- [FAST] GPHash: An Efficient Hash Index for GPU with Byte-Granularity Persistent Memory. In Proceedings of the 23rd USENIX Conference on File and Storage Technologies (FAST), 2025
- [FAST] ROLEX: A Scalable RDMA-oriented Learned Key-Value Store for Disaggregated Memory Systems. In Proceedings of the 21st USENIX Conference on File and Storage Technologies (FAST), 2023
- [ICCD] RMMIO: Enabling Reliable Memory-Mapped I/O for Persistent Memory Systems. In Proceedings of the 40th IEEE International Conference on Computer Design (ICCD), 2022
- [DATE] Improving the Energy Efficiency of STT-MRAM Based Approximate Cache. In Proceedings of the 24th Design, Automation and Test in Europe Conference (DATE), 2021
- [DAC] Reducing Bit Writes in Non-volatile Main Memory by Similarity-aware Compression. In Proceedings of the 57th Design Automation Conference (DAC), 2020
- [USENIX ATC] Mitigating Asymmetric Read and Write Costs in Cuckoo Hashing for Storage Systems. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2019