CC-NIC: a Cache-Coherent Interface to the NIC

2024 | Henry N. Schuh, Arvind Krishnamurthy, David Culler, Henry M. Levy, Luigi Rizzo, Samira Khan, Brent E. Stephens
CC-NIC is a cache-coherent host-NIC interface that leverages emerging coherent interconnects to improve performance. This work addresses the limitations of current PCIe NICs, whose designs prioritize CPU efficiency at the expense of latency. By redesigning the host-NIC interface, CC-NIC minimizes overhead and exploits the new interactions that coherence enables. The design is modeled using Intel's UPI interconnect on Ice Lake and Sapphire Rapids platforms. CC-NIC achieves a maximum packet rate of 1.5 Gpps and 980 Gbps throughput, with 77% lower minimum latency and 88% lower latency at 80% load than PCIe NICs, along with application-level core savings and benefits across a range of interconnect performance characteristics.

An analysis of the PCIe host-NIC interface reveals the tradeoffs between CPU efficiency and latency. PCIe imposes limitations on shared data structures and incurs CPU overhead for host-initiated operations. Current NIC designs minimize host PCIe overhead at the expense of transmission latency by introducing additional signaling trips and batching. The impact on packet latency is significant: host-NIC loopback latency on a Mellanox CX6 NIC is 2.1us at low load and 6.0us at 80% load, almost an order of magnitude higher than a switch traversal.

Coherent interconnects fundamentally change host-device communication, enabling new benefits while posing new challenges. CC-NIC takes advantage of them by redesigning the host-NIC interface around cache-coherent data paths and interactions, with optimized data structures, memory layouts, and signaling to reduce overhead and improve performance. CC-NIC demonstrates a 1.5 Gpps packet rate and a minimum TX-RX latency of 494ns, with latency under 80% load of 716ns, an even greater reduction relative to PCIe NICs.
Compared to an interface matching a current PCIe NIC on the same UPI link, the proposed design achieves a 3.3× throughput improvement and a 52% reduction in minimum latency, in addition to decreased latency under load and terabit bandwidth saturation. Evaluated on Intel's Ice Lake and Sapphire Rapids server platforms, CC-NIC maintains consistent relative improvement across varied interconnect performance characteristics, indicating that it is a promising approach to host-NIC communication over coherent interconnects.