Mellanox InfiniBand intelligent interconnect solutions increase data center efficiency by providing the highest throughput and lowest latency, delivering data faster to applications and unlocking system performance. Mellanox Technologies is a leading supplier of end-to-end Ethernet and InfiniBand intelligent interconnect solutions and services for servers, storage, and hyper-converged infrastructure.
Mellanox offers a choice of high-performance solutions, including network and multicore processors, network adapters, switches, cables, software, and silicon, that accelerate application runtime and maximize business results for markets ranging from high-performance computing and enterprise data centers to Web 2.0, cloud, storage, network security, telecom, and financial services.
In cluster computing, two of the key elements in running a program across multiple nodes are network bandwidth and latency: how much data can move between nodes, and how long each transaction takes. Mellanox currently offers multiple speeds of InfiniBand: FDR at 56 Gb/s, EDR at 100 Gb/s, and HDR at 200 Gb/s.
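To put these rates in perspective, here is a back-of-the-envelope sketch of how long moving 1 GiB takes at each generation's nominal rate (a simplification: it ignores encoding overhead, protocol overhead, and latency):

```python
# Back-of-the-envelope sketch: time to move 1 GiB at each InfiniBand
# generation's nominal rate (ignores encoding/protocol overhead and latency).
GIB_BITS = 8 * 2**30  # one gibibyte, in bits

for name, gbps in [("FDR", 56), ("EDR", 100), ("HDR", 200)]:
    seconds = GIB_BITS / (gbps * 1e9)
    print(f"{name} ({gbps} Gb/s): {seconds * 1000:.0f} ms per GiB")
```

Roughly 153 ms at FDR shrinks to about 43 ms at HDR, which is why link speed matters so much for communication-bound jobs.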
HDR is currently the fastest Mellanox InfiniBand product on the market, offering the highest bandwidth available. With Virtual Protocol Interconnect (VPI) technology, Mellanox cards not only provide InfiniBand connectivity but also allow up to 200 Gb/s of Ethernet connectivity.
| General Specs | ConnectX-3 VPI | ConnectX-4 VPI | ConnectX-5 VPI | ConnectX-6 VPI |
|---|---|---|---|---|
| Ports | Single, Dual | Single, Dual | Single, Dual | Single, Dual |
| Port Speed (Gb/s) | IB: FDR10, FDR; Eth: 10, 40, 56 | IB: FDR, EDR; Eth: 10, 25, 40, 50, 56, 100 | IB: FDR, EDR; Eth: 10, 25, 40, 50, 100 | IB: FDR, EDR, HDR200, HDR100; Eth: 10, 25, 40, 50, 100, 200 |
| PCIe | Gen3 x8 | Gen3 x8, Gen3 x16 | Gen3 x16, Gen4 x16 | Gen3 x16, Gen4 x16, 32 lanes as 2 x 16-lane PCIe |
| Message Rate (million msgs/sec) | 36 | 150 | 200 (ConnectX-5 Ex, Gen4 server); 165 (ConnectX-5, Gen3 server) | Contact Aspen Systems |
| Power (2 ports, max. speed) | 6.2 W | 16.3 W | 19.3 W (ConnectX-5 Ex, Gen4 server); 16.2 W (ConnectX-5, Gen3 server) | Contact Aspen Systems |
| InfiniBand | Line Rate | QSFP Port | Switch I/O | Ports/Switch |
|---|---|---|---|---|
| FDR | 14 Gb/s | 56 Gb/s | 2.0 Tb/s | 36 |
| EDR | 25 Gb/s | 100 Gb/s | 3.6 Tb/s | 36 |
| HDR | 50 Gb/s | 200 Gb/s | 8.0 Tb/s | 40 |
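The table's figures follow from simple arithmetic: a 4x QSFP port aggregates four lanes, and a switch's aggregate I/O is its port speed times its port count (the FDR entry is rounded from 2.016 Tb/s). A minimal sketch:

```python
# Sketch: reproduce the table arithmetic. Each 4x QSFP port aggregates
# four lanes; aggregate switch I/O is port speed times port count.
generations = {          # per-lane line rate (Gb/s), switch port count
    "FDR": (14, 36),
    "EDR": (25, 36),
    "HDR": (50, 40),
}

for gen, (lane_gbps, ports) in generations.items():
    port_gbps = lane_gbps * 4               # 4x QSFP port
    switch_tbps = port_gbps * ports / 1000  # aggregate switch I/O
    print(f"{gen}: {port_gbps} Gb/s per port, {switch_tbps:.1f} Tb/s per switch")
```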
When setting up an HPC network, it's important to ask yourself how much blocking, or oversubscription, you're willing to live with. Oversubscription occurs when the nodes on an edge switch share a smaller number of uplink ports to the core switch. For a cluster of 108 nodes, we can use four Edge switches and connect 27 nodes to each 36-port Edge switch. Then we can take one Core switch and run 9 uplinks from each Edge switch to it. With four Edge switches, that consumes all 36 ports of the Core switch for the 108 nodes. Because each Edge switch carries 27 nodes over 9 uplinks, we have a 27-to-9, or 3-to-1, oversubscription (Figure 1).
How will this affect job performance? As long as not all 27 nodes on an Edge switch are using the uplink bandwidth at once, the effective oversubscription is less than 3 to 1. For instance, if only 9 nodes on a single Edge switch are communicating with nodes on other Edge switches, the traffic is still non-blocking. Why is this important? Because a fully non-blocking fabric for 108 nodes requires either a 108-port Director switch, at a much higher cost than five 36-port switches, or six 36-port Edge switches plus another six 36-port Core switches, for a total of 12 switches (Figure 2).
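The ratios above are just downlink-to-uplink port counts; a small sketch (the helper name is illustrative, not from any Mellanox tool) makes the arithmetic explicit:

```python
# Sketch: oversubscription ratio of an edge/core layout, following the
# 108-node example in the text. "oversubscription" is a hypothetical helper.
def oversubscription(nodes_per_edge: int, uplinks_per_edge: int) -> float:
    """Ratio of node-facing (downlink) bandwidth to uplink bandwidth."""
    return nodes_per_edge / uplinks_per_edge

# Four 36-port edge switches: 27 nodes + 9 uplinks each.
print(oversubscription(27, 9))   # 3.0 -> 3:1 blocking
# Non-blocking alternative: six edge switches, 18 nodes + 18 uplinks each.
print(oversubscription(18, 18))  # 1.0 -> fully non-blocking
```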
Mellanox Technologies has two classes of switches: Edge switches and Director switches. Mellanox Edge switches come in 12- to 36-port configurations and are usually used as “top of rack” edge switches in larger systems. On systems that aren’t large enough to need a Director switch, they can also serve as “collector” or “core” switches, managing fabrics of up to 648 nodes in FDR configurations or up to 2048 nodes in an EDR configuration.
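As a rough sketch of where a figure like 648 comes from: in a two-tier fat tree built entirely from p-port switches, each edge switch splits its ports evenly between nodes and uplinks, giving p/2 nodes on each of p edge switches (the helper below is an illustration of that topology math, not a Mellanox formula):

```python
# Sketch: maximum non-blocking node count of a two-tier fat tree built
# from p-port switches. Each edge switch uses p/2 ports for nodes and
# p/2 for uplinks; p/2 core switches can each reach p edge switches.
def two_tier_max_nodes(ports_per_switch: int) -> int:
    half = ports_per_switch // 2
    return half * ports_per_switch  # (p/2 nodes per edge) x (p edges)

print(two_tier_max_nodes(36))  # 648 with 36-port switches
print(two_tier_max_nodes(40))  # 800 with 40-port HDR switches
```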
Mellanox’s Scalable HPC interconnect solutions are paving the road to Exascale computing by delivering the highest scalability, efficiency, and performance for HPC systems today and in the future. Mellanox Technologies Scalable HPC solutions are proven and certified for a large variety of market segments, clustering topologies and environments (Linux, Windows). Mellanox and Aspen Systems are active members of the HPC Advisory Council and contribute to high-performance computing outreach and education around the world.
Mellanox Director switches come in configurations from 108 ports up to 800 ports (at 200 Gb/s) or 1600 ports (at 100 Gb/s), and provide the highest bandwidth and lowest latency for clusters up to those sizes. If more nodes are needed, a Director switch serves as the “core” switch connecting the Edge switches.
| Director Specs | | | |
|---|---|---|---|
| Ports | 324 | 648 | 800 (200 Gb/s) / 1600 (100 Gb/s) |
| Management | 2048 nodes | 2048 nodes | 2048 nodes |
| Leaf Modules (max) | 9 | 18 | 20 |
| Redundancy | Yes (N+N) | Yes (N+N) | Yes (N+N) |
| Fan Redundancy | Yes | Yes | Liquid cooled |