HPC networking today depends heavily on high-bandwidth, low-latency interconnects to move data among the nodes of a cluster. As clusters grow, efficiency becomes a bigger concern, and low-overhead protocols that avoid wasting compute power become even more important. When selecting the main fabric for an HPC cluster network, the choice often comes down to two solutions: Mellanox InfiniBand or Intel Omni-Path Architecture (Intel OPA). To make the best decision, users should look at the communication patterns of the HPC application at hand, including the size of the messages being sent and the level of latency that is acceptable.
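To see why message size and acceptable latency both matter, a common first-order estimate treats transfer time as a fixed latency plus the message size divided by the bandwidth. The sketch below is only an illustration of that model: the function names are hypothetical, and the 1 microsecond latency is an assumed round number, not a measurement of any particular fabric.

```python
def transfer_time_us(message_bytes, latency_us, bandwidth_gbps):
    """Estimate one-way transfer time in microseconds.

    Simple model: fixed latency + serialization time (size / bandwidth).
    1 Gb/s moves 1000 bits per microsecond.
    """
    serialization_us = (message_bytes * 8) / (bandwidth_gbps * 1000.0)
    return latency_us + serialization_us


def latency_share(message_bytes, latency_us, bandwidth_gbps):
    """Fraction of the estimated transfer time spent on fixed latency."""
    return latency_us / transfer_time_us(message_bytes, latency_us, bandwidth_gbps)


if __name__ == "__main__":
    # Assumed illustrative values: 1 us latency on a 100 Gb/s link.
    for size in (8, 4096, 1_000_000):
        total = transfer_time_us(size, 1.0, 100.0)
        share = latency_share(size, 1.0, 100.0)
        print(f"{size:>9} bytes: {total:8.3f} us, latency share {share:.1%}")
```

Under these assumptions, small messages are dominated almost entirely by latency, while large messages are dominated by bandwidth, which is why the application's typical message size should guide the choice of fabric.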
InfiniBand by Mellanox is the choice for many general-purpose HPC clusters, thanks to its high throughput and low latency. Thanks also to the low-overhead encoding scheme used in Fourteen Data Rate (FDR) InfiniBand, very high data rates can be achieved while dedicating fewer CPU cycles to message copying, protocol handling, or checksum calculation.
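The effect of encoding overhead on usable bandwidth can be checked with a little arithmetic. The sketch below uses the commonly published link parameters for a 4x FDR link (14.0625 Gb/s per lane, 64b/66b encoding) and, for comparison, a 4x QDR link (10 Gb/s per lane, 8b/10b encoding). These figures are not stated in the text above, and the function name is hypothetical.

```python
def effective_rate_gbps(lane_rate_gbps, lanes, payload_bits, coded_bits):
    """Usable data rate after line-encoding overhead.

    payload_bits / coded_bits is the encoding efficiency,
    e.g. 64/66 for 64b/66b or 8/10 for 8b/10b.
    """
    return lane_rate_gbps * lanes * payload_bits / coded_bits


# 4x FDR: 14.0625 Gb/s per lane with 64b/66b encoding (about 3% overhead).
fdr = effective_rate_gbps(14.0625, 4, 64, 66)

# 4x QDR: 10 Gb/s per lane with 8b/10b encoding (20% overhead).
qdr = effective_rate_gbps(10.0, 4, 8, 10)

if __name__ == "__main__":
    print(f"FDR 4x effective rate: {fdr:.2f} Gb/s")
    print(f"QDR 4x effective rate: {qdr:.2f} Gb/s")
```

With these inputs, the FDR link loses about 3% of its signaling rate to encoding, compared with 20% for the older 8b/10b scheme.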
Intel’s Omni-Path Architecture, by contrast, is positioned to deliver the performance needed for future high-performance computing (HPC) workloads and to scale to tens of thousands of nodes at a price competitive with today’s fabrics.
There’s a lot going on in the networks of HPC clusters, and selecting the right network fabric, equipment, and topology is important to ensuring good performance for given applications. A “one size fits all” approach rarely works, and architects will do well to tailor the network to the needs of the application.
If your cluster design requires a high-speed interconnect, InfiniBand, Omni-Path, or Ethernet can each provide a network solution with high bandwidth and low latency. Currently, 100Gb/s technologies are available, with even faster ones on the way.
The most basic HPC cluster uses a single Gigabit Ethernet network for administrative traffic, data sharing, and application traffic. If your applications are bandwidth- or latency-sensitive, using only a high-speed interconnect such as InfiniBand or Omni-Path for your cluster network is preferred.
Often, HPC clusters are configured with two networks. The first is a Gigabit Ethernet interface on each node, identical to the single Ethernet network used by the basic HPC cluster. It is used for scheduling, node maintenance, basic logins, and perhaps data sharing, while the second, internal network is dedicated to computational traffic. This configuration ensures that critical computational traffic is not hampered by other traffic.
Choosing the right interconnect depends on balancing cost against bandwidth and latency. For more information about a specific interconnect, click on the appropriate link below: