Publications
Publications in reverse chronological order.
2026
- [ICML] TileSparse: Arithmetic-Intensity-Aware Sparse Attention for Compute-Bound LLM Decoding. In Proceedings of the 43rd International Conference on Machine Learning (ICML), 2026
- [ICLR] DualMap: Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving. In Proceedings of the 14th International Conference on Learning Representations (ICLR), 2026
2025
- [arXiv] Serving Large Language Models on Huawei CloudMatrix384. arXiv, 2025
- [arXiv] Injecting Adrenaline into LLM Serving: Boosting Resource Utilization and Throughput via Attention Disaggregation. arXiv, 2025
- [USENIX ATC] Understanding and Detecting Fail-Slow Hardware Failure Bugs in Cloud Systems. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2025
- [DATE] MPFS: A Scalable User-Space Persistent Memory File System for Multiple Processes. In 2025 Design, Automation & Test in Europe Conference (DATE), 2025
- [FAST] GPHash: An Efficient Hash Index for GPU with Byte-Granularity Persistent Memory. In Proceedings of the 23rd USENIX Conference on File and Storage Technologies (FAST), 2025
2024
- [TC] Enabling Reliable Memory-Mapped I/O With Auto-Snapshot for Persistent Memory Systems. IEEE Transactions on Computers (TC), 2024
2023
- [TOS] A High-performance RDMA-oriented Learned Key-value Store for Disaggregated Memory Systems. ACM Transactions on Storage (TOS), 2023
- [TCAD] APPcache+: An STT-MRAM-Based Approximate Cache System With Low Power and Long Lifetime. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2023
- [JCST] Approximate Similarity-Aware Compression for Non-Volatile Main Memory. Journal of Computer Science and Technology (JCST), 2023. Accepted and to appear
- [FAST] ROLEX: A Scalable RDMA-oriented Learned Key-Value Store for Disaggregated Memory Systems. In Proceedings of the 21st USENIX Conference on File and Storage Technologies (FAST), 2023
- [TACO] Lock-Free High-Performance Hashing for Persistent Memory via PM-Aware Holistic Optimization. ACM Transactions on Architecture and Code Optimization (TACO), 2023
2022
- [ICCD] RMMIO: Enabling Reliable Memory-Mapped I/O for Persistent Memory Systems. In Proceedings of the 40th IEEE International Conference on Computer Design (ICCD), 2022
2021
- [DATE] Improving the Energy Efficiency of STT-MRAM Based Approximate Cache. In Proceedings of the 24th Design, Automation and Test in Europe Conference (DATE), 2021
2020
- [DAC] Reducing Bit Writes in Non-volatile Main Memory by Similarity-aware Compression. In Proceedings of the 57th Design Automation Conference (DAC), 2020
2019
- [USENIX ATC] Mitigating Asymmetric Read and Write Costs in Cuckoo Hashing for Storage Systems. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2019