RDMA is becoming prevalent because of its low latency, high throughput and low CPU overhead. However, current RDMA remains a single path transport which is prone to failures and falls short to utilize the rich parallel paths in datacenters. Unlike previous multipath approaches, which mainly focus on TCP, this paper presents a multi-path transport for RDMA, i.e. MP-RDMA, which efficiently utilizes the rich network paths in datacenters.

MP-RDMA employs three novel techniques to address the challenge of limited RDMA NICs on-chip memory size: 1) a multi-path ACK-clocking mechanism to distribute traffic in a congestion-aware manner without incurring per-path states; 2) an out-oforder aware path selection mechanism to control the level of out-of-order delivered packets, thus minimizes the meta data required to them; 3) a synchronise mechanism to ensure in-order memory update whenever needed.

With all these techniques, MP-RDMA only adds 66B to each connection state compared to single-path RDMA. Our evaluation with an FPGA-based prototype demonstrates that compared with single-path RDMA, MP-RDMA can significantly improve the robustness under failures (2x~4x higher throughput under 0.5%∼10% link loss ratio) and improve the overall network utilization by up to 47%.

Publication

Multi-Path Transport for RDMA in Datacenters.
Yuanwei Lu, Guo Chen, Bojie Li, Kun Tan, Yongqiang Xiong, Peng Cheng, Jiansong Zhang, Enhong Chen and Thomas Moscibroda.
Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI ‘18). [PDF] [Slides]

I only do a small amount of discussion in this MP-RDMA project. Credits should go to Dr. Yuanwei Lu.

People

Comments

2018-04-09