Research paper presented at the ACM SIGMOD/PODS Conference that took place in Philadelphia from 12 to 17 June 2022.
Abstract
Apache Kafka is an open-source distributed publish-subscribe system, which is widely used in data centers for messaging between applications, log aggregation, and stream processing. The existing Kafka implementation uses TCP/IP for communication, which has various inefficiencies such as a high message dispatch cost due to OS involvement and excessive memory copies. Recently, the availability of cost-effective RDMA-capable network controllers within data centers and cloud infrastructures have encouraged many modern applications to adopt RDMA networking, which offers the potential to outperform classical TCP/IP. We introduce KafkaDirect, an extension to Apache Kafka, that uses RDMA to accelerate the three most network intensive datapaths: record production, record replication, and record consumption. In this work, we explore the design choices including which RDMA operations to use to take full advantage of offloaded communication. Our RDMA design relies on one-sided RDMA requests to attain true zero-copy communication completely avoiding the need for using intermediate buffers in Kafka servers, thereby ensuring low latency and high throughput communication. KafkaDirect can offer up to 9x increase in throughput for both Kafka producers and Kafka consumers, and can provide 4x and 50x reduction in latency for Kafka producers and Kafka consumers, respectively.
Authors
Konstantin Taranov, Steve Byan, Virandra Marathe, Torsten Hoefler