RDMA (Remote Direct Memory Access) is the data-transfer mechanism at the heart of high-speed InfiniBand networks. It is widely used for intersystem communication and first became popular in high-performance computing environments.
RDMA is the modern commercial form of an idea that originated in work by Thorsten von Eicken, Anindya Basu, Vineet Buch, and Werner Vogels, whose 1995 SOSP paper showed that you could build a parallel supercomputer by running parallel computing code on normal servers, bypassing the operating system's virtualization of the network interface (which slows communication down).
RDMA is a hardware mechanism through which the network interface card (NIC) can directly access all or part of the main memory of a remote node without involving that node's processor. One-sided RDMA operations do not involve the remote CPU at all, and no data is copied between user space and the kernel in either direction.
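A quick way to observe one-sided RDMA writes is the perftest benchmarking suite (ib_write_bw and friends); a minimal sketch, assuming an RDMA device named mlx5_0 and a server reachable at 192.168.1.10 (both placeholders for your environment):
On the server, wait for a connection and act as the RDMA write target (its CPU stays out of the data path):
# ib_write_bw -d mlx5_0
On the client, stream RDMA writes into the server's registered memory and report bandwidth:
# ib_write_bw -d mlx5_0 192.168.1.10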
DMA is also used for intra-chip data transfer in multi-core processors. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without DMA channels. DMA can also be used for "memory to memory" copying or moving of data within memory.
iWARP is an IETF-standardized protocol for RDMA over Ethernet that uses the familiar TCP/IP stack as its foundation. High-performance iWARP implementations are available and compete directly with InfiniBand in real application benchmarks.
Remote Direct Memory Access, or RDMA, is a technology that provides a low-latency network connection between processes running on two servers, or between virtual machines in Azure.
InfiniBand is a specific network architecture that offers RDMA natively. RoCE, by contrast, is an RDMA implementation over (lossless data center) Ethernet; it competes with InfiniBand as a wire protocol while exposing the same verbs interface as its API.
RDMA over Converged Ethernet (RoCE) is a network protocol that allows remote direct memory access (RDMA) over an Ethernet network. It does this by encapsulating an InfiniBand transport packet in Ethernet. There are two RoCE versions: RoCE v1, which runs directly over Ethernet (Ethertype 0x8915) and is therefore confined to a single Layer 2 domain, and RoCE v2, which runs over UDP/IP (destination port 4791) and is routable.
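On Linux, one way to check which RoCE version a port's addresses use is the GID type entries in sysfs; a minimal sketch, assuming a device named mlx5_0 (a placeholder) and port 1:
# cat /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types/0
Each entry reports a type such as "IB/RoCE v1" or "RoCE v2" for the corresponding GID index.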
4. Configuring Guest RDMA
- Install Windows Server 2019.
- Install the Hyper-V Role and the Data Center Bridging (DCB) feature.
- Configure QoS (Quality of Service): DCB, PFC, and ETS.
- Configure Hyper-V SET (Switch Embedded Teaming).
- Test RDMA communication between the physical servers prior to configuring the VMs (a hedged PowerShell sketch of these steps follows this list).
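A minimal PowerShell sketch of the host-side steps, assuming two physical RDMA NICs named "NIC1" and "NIC2" and a switch named "SETswitch" (all placeholders; the QoS priority and bandwidth values are illustrative, not prescriptive):
Install the Hyper-V role and the DCB feature:
PS> Install-WindowsFeature -Name Hyper-V, Data-Center-Bridging -IncludeManagementTools -Restart
Tag SMB Direct traffic (port 445) with 802.1p priority 3, enable PFC only for that priority, and reserve bandwidth via ETS:
PS> New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3
PS> Enable-NetQosFlowControl -Priority 3
PS> Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7
PS> New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS
PS> Enable-NetAdapterQos -Name "NIC1","NIC2"
Create the SET switch over the two physical RDMA NICs:
PS> New-VMSwitch -Name "SETswitch" -NetAdapterName "NIC1","NIC2" -EnableEmbeddedTeaming $true
For the guest itself, Windows Server 2019 adds the Set-VMNetworkAdapterRdma cmdlet to expose RDMA to a VM's network adapter.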
The advantage of RDMA is low-latency transfer of information between compute nodes at the memory-to-memory level, without burdening the CPU. The transfer function is offloaded to the network adapter hardware, bypassing the operating system's software network stack.
NVMe over RDMA over Converged Ethernet (RoCE) encapsulates the InfiniBand transport packet in Ethernet. The solution relies on link-level flow control to ensure zero loss even when the network is saturated. The RoCE protocol allows lower latencies than the older iWARP protocol.
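On Linux, connecting to such a target is done with nvme-cli's rdma transport; a minimal sketch, assuming a target at 192.168.1.20, the default NVMe over Fabrics port 4420, and a subsystem NQN of nqn.2024-01.io.example:nvm1 (all placeholders):
# nvme connect -t rdma -a 192.168.1.20 -s 4420 -n nqn.2024-01.io.example:nvm1
# nvme list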
Higher performance, higher radix, and lower-latency solutions: today, Ethernet switches offer connectivity of up to 400 Gbps. In contrast, InfiniBand HDR offers connectivity of only up to 200 Gbps, and Fibre Channel offers up to 32 Gbps.
AWS ParallelCluster now supports NVIDIA GPUDirect RDMA. AWS ParallelCluster is a fully supported and maintained open-source cluster management tool that makes it easy for scientists, researchers, and IT administrators to deploy and manage High Performance Computing (HPC) clusters in the AWS cloud.
RDMA is a feature that enables network adapters to transfer data directly between each other without requiring the main processor of the system to be part of that transfer. This results in lower latency and lower processor utilization.
Installation (Soft-RoCE)
- Get the kernel sources (4.9.0). Run: # cd /usr/src and clone the kernel tree there.
- Configure the kernel. Run: # cd linux, then # make menuconfig.
- To enable Soft-RoCE: press "/" to open the search field, type "rxe", and press "OK"; then enable the RDMA_RXE option the search locates.
- Compile and install the kernel (for example: # make -j$(nproc), # make modules_install, # make install); the post-boot device setup is sketched below.
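Once the rxe-enabled kernel is booted, a Soft-RoCE device still has to be bound to an Ethernet interface; a minimal sketch using the modern iproute2 rdma tool, assuming the interface name eth0 (a placeholder; older setups used the rxe_cfg script instead):
# modprobe rdma_rxe
# rdma link add rxe0 type rxe netdev eth0
Verify that the device exists:
# rdma link show
# ibv_devices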
Because RDMA traffic bypasses the kernel, it cannot be monitored on the host with tcpdump, Wireshark, or similar tools; it can, however, be captured by mirroring a switch port in the network and sending the traffic to a designated monitoring server.
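On that monitoring server, RoCE v2 traffic can be isolated by its UDP port; a minimal sketch, assuming the mirrored traffic arrives on interface eth1 (a placeholder):
# tcpdump -i eth1 -w roce.pcap 'udp dst port 4791'
The capture can then be opened in Wireshark, which dissects RoCE v2 as InfiniBand over UDP.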
Mellanox OFED (MLNX_OFED) is a Mellanox-tested and packaged version of OFED that supports two interconnect types, InfiniBand and Ethernet, through the same RDMA (remote DMA) and kernel-bypass API known as OFED verbs.
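The verbs stack reports which link layer each device uses, so the same tooling covers both interconnects; a minimal check (output fields may vary by OFED version):
# ibv_devinfo | grep -E 'hca_id|link_layer'
Each port shows link_layer: InfiniBand or link_layer: Ethernet.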
SMB Direct is an extension of Microsoft's Server Message Block (SMB) technology used for file operations. The "Direct" part implies the use of various high-speed Remote Direct Memory Access (RDMA) methods to transfer large amounts of data with little CPU intervention.
Before enabling it, make sure that each network adapter connected to a network with the Storage traffic type supports RDMA. To enable or disable RDMA, use the toggle on the SETTINGS > Advanced settings > RDMA tab.
Deploy SMB Direct with Ethernet (iWARP) Network Adapters
- Overview.
- Configure the IP addresses.
- Configure Windows Firewall.
- Allow access across multiple subnets.
- Verify the configuration.
- Verify the network adapter configuration.
- Verify the SMB configuration.
- Verify the SMB connection (a hedged PowerShell sketch of the verification steps follows this list).
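A minimal PowerShell sketch of the verification steps (cmdlet output formats vary across Windows Server versions):
Confirm the iWARP adapters report RDMA capability:
PS> Get-NetAdapterRdma
Confirm SMB on the client and server sees RDMA-capable interfaces:
PS> Get-SmbClientNetworkInterface
PS> Get-SmbServerNetworkInterface
After opening a file on the remote share, confirm the connection is using RDMA:
PS> Get-SmbMultichannelConnection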