- Source: IWARP
- Source: IWarp
iWARP is a computer networking protocol that implements remote direct memory access (RDMA) for efficient data transfer over Internet Protocol networks. Contrary to some accounts, iWARP is not an acronym.
Because iWARP is layered on Internet Engineering Task Force (IETF)-standard congestion-aware protocols such as Transmission Control Protocol (TCP) and Stream Control Transmission Protocol (SCTP), it makes few requirements on the network, and can be successfully deployed in a broad range of environments.
History
In 2007, the IETF published five Request for Comments (RFCs) that define iWARP:
RFC 5040 A Remote Direct Memory Access Protocol Specification is layered over Direct Data Placement Protocol (DDP). It defines how RDMA Send, Read, and Write operations are encoded using DDP into headers on the network.
RFC 5041 Direct Data Placement over Reliable Transports is layered over MPA/TCP or SCTP. It defines how received data can be directly placed into an upper layer protocols receive buffer without intermediate buffers.
RFC 5042 Direct Data Placement Protocol (DDP) / Remote Direct Memory Access Protocol (RDMAP) Security analyzes security issues related to iWARP DDP and RDMAP protocol layers.
RFC 5043 Stream Control Transmission Protocol (SCTP) Direct Data Placement (DDP) Adaptation defines an adaptation layer that enables DDP over SCTP.
RFC 5044 Marker PDU Aligned Framing for TCP Specification defines an adaptation layer that enables preservation of DDP-level protocol record boundaries layered over the TCP reliable connected byte stream.
These RFCs are based on the RDMA Consortium's specifications for RDMA over TCP. The RDMA Consortium's specifications are influenced by earlier RDMA standards, including Virtual Interface Architecture (VIA) and InfiniBand (IB).
Since 2007, the IETF has published three additional RFCs that maintain and extend iWARP:
RFC 6580 IANA Registries for the Remote Direct Data Placement (RDDP) Protocols published in 2012 defines IANA registries for Remote Direct Data Placement (RDDP) error codes, operation codes, and function codes.
RFC 6581 Enhanced Remote Direct Memory Access (RDMA) Connection Establishment published in 2011 fixes shortcomings with iWARP connection setup.
RFC 7306 Remote Direct Memory Access (RDMA) Protocol Extensions published in 2014 extends RFC 5040 with atomic operations and RDMA Write with Immediate Data.
Protocol
The main component in the iWARP protocol is the Direct Data Placement Protocol (DDP), which permits the actual zero-copy transmission. DDP itself does not perform the transmission; the underlying protocol (TCP or SCTP) does.
However, TCP does not respect message boundaries; it sends data as a sequence of bytes without regard to protocol data units (PDU). In this regard, DDP itself may be better suited for SCTP, and indeed the IETF proposed a standard RDMA over SCTP. To run DDP over TCP requires a tweak known as marker PDU aligned (MPA) framing to guarantee boundaries of messages.
Furthermore, DDP is not intended to be accessed directly. Instead, a separate RDMA protocol (RDMAP) provides the services to read and write data. Therefore, the entire RDMA over TCP specification is really RDMAP over DDP over either MPA/TCP or SCTP. All of these protocols can be implemented in hardware.
Unlike IB, iWARP only has reliable connected communication, as this is the only service that TCP and SCTP provide. The iWARP specification omits other features of IB, such as Send with Immediate Data operations. With RFC 7306, the IETF is working to reduce these omissions.
Implementation
Because a kernel implementation of the TCP stack can be seen as a bottleneck, the protocol is typically implemented in hardware RDMA network interface controllers (rNICs). As simple data losses are rare in tightly coupled network environments, the error-correction mechanisms of TCP may be performed by software while the more frequently performed communications are handled strictly by logic embedded on the rNIC. Similarly, connections are often established entirely by software and then handed off to the hardware. Furthermore, the handling of iWARP specific protocol details is typically isolated from the TCP implementation, allowing rNICs to be used for both as RDMA offload and TCP offload (in support of traditional sockets based TCP/IP applications). The portion of the hardware implementation used for implementing the TCP protocol is known as the TCP Offload Engine (TOE).
TOE itself does not prevent copying on the reception side, and must be combined with RDMA hardware for zero-copy results. The RDMA / TCP specification is a set of different wire protocols intended to be implemented in hardware (though it seems feasible to emulate it in software for compatibility but without the performance benefits).
Interfaces
iWARP is a protocol, not an implementation, but defines protocol behavior in terms of the operations that are legal for the protocol, known as Verbs. As such, iWARP does not have any single standard programming interface. However, programming interfaces tend to very closely correspond to the Verbs.
Several programmatic interfaces have been proposed, including OpenFabrics Verbs, Network Direct, uDAPL, kDAPL, IT-API, and RNICPI. Implementations of some of these interfaces are available for different platforms, including Windows and Linux.
Services available
Networking services implemented over iWARP include those offered in the OpenFabrics Enterprise Distribution (OFED) by the OpenFabrics Alliance for Linux operating systems, and by Microsoft Windows via Network Direct.
NVMe over Fabrics (NVMEoF)
iSCSI Extensions for RDMA (iSER)
Server Message Block Direct (SMB Direct)
Sockets Direct Protocol (SDP)
SCSI RDMA Protocol (SRP)
Network File System over RDMA (NFS over RDMA)
GPUDirect
Vendors
Popular vendors of iWarp enabled equipment include:
Chelsio
Marvell
Bloombase
See also
RDMA over Converged Ethernet
References
External links
OpenFabrics Alliance at the University of New Hampshire InterOperability Laboratory — Testing on iWARP devices
Remote Direct Data Placement Charter (IETF)
MPI-SCTP: Using the Stream Control Transmission Protocol for parallel programs written using the Message Passing Interface Archived 2009-10-02 at the Wayback Machine (2008-09-01)
SMB2 Remote Direct Memory Access (RDMA) Transport Protocol (2017-06-01)
iWarp was an experimental parallel supercomputer architecture developed as a joint project by Intel and Carnegie Mellon University. The project started in 1988, as a follow-up to CMU's previous WARP research project, in order to explore building an entire parallel-computing "node" in a single microprocessor, complete with memory and communications links. In this respect the iWarp is very similar to the INMOS transputer and nCUBE.
Intel announced iWarp in 1989. The first iWarp prototype was delivered to Carnegie Mellon in summer of 1990, and in fall they received the first 64-cell production systems, followed by two more in 1991. With the creation of the Intel Supercomputing Systems Division in the summer of 1992, the iWarp was merged into the iPSC product line. Intel kept iWarp as a product but stopped actively marketing it.
Each iWarp CPU included a 32-bit ALU with a 64-bit FPU running at 20 MHz. It was purely scalar and completed one instruction per cycle, so the performance was 20 MIPS or 20 megaflops for single precision and 10 MFLOPS for double. The communications were handled by a separate unit on the CPU that drove four serial channels at 40 MB/s, and included networking support in hardware that allowed for up to 20 virtual channels (similar to the system added to the INMOS T9000).
iWarp processors were combined onto boards along with memory, but unlike other systems Intel chose the faster, but more expensive, static RAM for use on the iWarp. Boards typically included four CPUs and anywhere from 512 kB to 4 MB of SRAM.
Another difference in the iWarp was that the systems were connected together as a n-by-m torus, instead of the more common hypercube. A typical system included 64 CPUs connected as an 8×8 torus, which could deliver 1.2 gigaflops peak.
George Cox was the lead architect of the iWarp project. Steven McGeady (later an Intel Vice-president and witness in the Microsoft antitrust case) wrote an innovative development environment that allowed software to be written for the array before it was completed. Each node of the array was represented by a different Sun workstation on a LAN, with the iWarp's unique inter-node communication protocol simulated over sockets. Unlike the chip-level simulator, which could not simulate a multi-node array, and which ran very slowly, this environment allowed in-depth development of array software to begin.
The production compiler for iWarp was a C and Fortran compiler based on the AT&T pcc compiler for UNIX, ported under contract for Intel by the Canadian firm HCR Corporation and then extensively modified and extended by Intel.
See also
Systolic array
Notes
External links
iWarp Project at CMU
Kata Kunci Pencarian:
- SCSI
- IWARP
- IWarp
- RDMA over Converged Ethernet
- Remote direct memory access
- MVAPICH
- Systolic array
- George Tutuska
- WARP (systolic array)
- Intel
- OpenFabrics Alliance