Cluster Racking

How will your cluster be racked?

 

Aspen racks all clusters in a standardized way, but can customize your cluster's rack layout to meet your needs. Aspen is an APC™ partner and uses APC™ AR3100 series 42U (rack unit) racks for standard installations. A rack unit is 1.75” in height and is the standard measure of vertical space for rack-mounted equipment. A 42U rack contains 73 1/2” of inside vertical space available for installing equipment. These racks are quite strong, with a static load rating of 3000 pounds and a dynamic load rating of 2250 pounds, and are also among the most compact full-height racks available, which facilitates rack delivery and installation.
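The rack unit arithmetic above can be checked with a short Python sketch. The constant and function names are illustrative only and are not part of any Aspen or APC tool.

```python
from fractions import Fraction

# One rack unit (U) is 1.75 inches, as stated in the text.
RACK_UNIT_INCHES = Fraction(7, 4)

def rack_units_to_inches(units):
    """Return the vertical space, in inches, taken up by `units` rack units."""
    return units * RACK_UNIT_INCHES

print(float(rack_units_to_inches(42)))  # 73.5
```

The same function gives the inside height of a shorter rack, for example `rack_units_to_inches(25)` for a 25U rack.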

 

We can also provide racks in less than full-height sizes, such as 25U, but do not recommend them unless you have a specific requirement for a shorter rack. Floor space is becoming more and more valuable in facilities, and a shorter rack occupies the same floor space while offering less expandability for future needs. If your organization uses specific or specialized racks, we can accommodate those as well. Speak to your sales engineer about your requirements.

 

Your cluster may come in one or more racks. Each standard rack is designed for, and includes the hardware for, baying to other racks on either a 24” (U.S. standard) or a 600 mm (international standard) grid, allowing an exact one-to-one rack to floor tile placement in any computer room. Each standard rack has two side panels, a lower and an upper, on each side. These panels can be removed when baying racks together to allow for denser wiring bundles between racks or to allow rack-to-rack cooling flow.

 

Aspen places heavier hardware, such as master nodes, UPS systems, external RAID units, or storage nodes, low in the rack to remove any danger of tipping when units are extended from the rack. Nodes are normally installed above these systems. Racks are numbered left to right, and lower numbered nodes are placed lower in the rack. Node numbering sequences are always ordered from the front, left to right, bottom to top.
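As a rough illustration of this numbering order, the sketch below maps node numbers to rack positions. It assumes 1U nodes, the same number of free node slots in every rack, and a hypothetical first free rack unit above the heavier equipment; the function name and parameters are invented for the example.

```python
def place_nodes(node_count, slots_per_rack, first_unit=1):
    """Map node names to (rack, rack unit) positions.

    Racks are numbered left to right as seen from the front (R1, R2, ...),
    and each rack is filled bottom to top, so lower numbered nodes sit lower.
    Assumes 1U nodes and the same number of free slots in every rack.
    """
    placement = {}
    for node in range(1, node_count + 1):
        rack_index, slot = divmod(node - 1, slots_per_rack)
        placement[f"node{node}"] = (f"R{rack_index + 1}", first_unit + slot)
    return placement

# Hypothetical layout: 6 nodes, 4 node slots per rack, nodes start at U10.
layout = place_nodes(node_count=6, slots_per_rack=4, first_unit=10)
print(layout["node1"])  # ('R1', 10)
print(layout["node5"])  # ('R2', 10)
```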

 

If an integrated console (keyboard, mouse, and display) unit is configured for your rack, it is installed so that its bottom is at rack unit 22 or 23 (40 1/2” or 42 1/4”) whenever possible. This height allows most people to comfortably stand at the console and type, or to use an extended height office chair to sit at the keyboard. This location can be customized to your specific needs.

 

High speed interconnect switches such as InfiniBand or Myrinet are installed in such a way as to minimize cable length and simplify cable routing. In larger clusters, these switches will be located toward the center of each rack row, and toward the bottom of the rack if underfloor wiring is used, or toward the top of the rack if rack-top cable routing accessories or ceiling-mounted cable trays are used.

 

Gigabit Ethernet switches used for the administrative, computing, or Intelligent Platform Management Interface (IPMI) networks are normally installed in the top rear of racks, and are also positioned to reduce cable lengths. Many of these switches are not full depth, so the same rack space in the front of the rack can be used for additional units such as other network or Keyboard Video Mouse (KVM) switches.

 

All nodes, and every other system that allows it, are mounted on slide-out rails to allow for maintenance. The units slide out from the front of the rack, and have two or more attachment screws at the front of each unit that secure it for shipping and normal operation. Specific hardware such as UPS systems and other very heavy units are mounted in the racks using fixed rails for safety. In almost all cases, power, network, and other cables must be detached from a unit before it is extended from the rack. At additional expense, cable management solutions can be configured that allow your nodes to be extended from the rack while all cables remain connected.

 

Your cluster cabling is bundled between racks and formed for easy connection to the appropriate switch or KVM. Aspen uses maintenance loops on most cables, which are routed down the inside of the frame rails on each side of the rack. Cluster Ethernet cables are color-coded as follows:

 

  • Yellow: inside network #1 (1st Ethernet network, administrative and NFS network)
  • Blue: inside network #2 (2nd Ethernet network, data access or MPI)
  • Green: inside network #3 (3rd Ethernet network, if needed)
  • White: IPMI network (3rd LAN Ethernet network dedicated to IPMI)
  • Red: outside world connection (if this cable is supplied by Aspen)

 

In addition to color codes, each Ethernet cable inside your cluster is labeled at both ends with a letter-number combination which uniquely identifies the cable within its color group. For instance, an Ethernet cable connected to your single master node might be identified with an “M”, and the equivalent node1 Ethernet cable would be identified with a “1”.
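The color and label scheme can be summarized as a small lookup. This is only a sketch: the host names "master" and "nodeN", the helper names, and the rule that a node's label is its number are assumptions extrapolated from the two examples above, not a full statement of Aspen's labeling convention.

```python
# Color groups as listed above (descriptions shortened).
CABLE_COLORS = {
    "yellow": "inside network #1 (administrative and NFS)",
    "blue": "inside network #2 (data access or MPI)",
    "green": "inside network #3 (if needed)",
    "white": "IPMI network",
    "red": "outside world connection",
}

def cable_label(host):
    """Return the label for a host's cable within one color group.

    Follows the two examples in the text: the single master node is "M"
    and node1 is "1". Other host naming is outside this sketch.
    """
    if host == "master":
        return "M"
    if host.startswith("node") and host[4:].isdigit():
        return host[4:]
    raise ValueError(f"no labeling rule assumed for {host!r}")

def describe_cable(color, host):
    """Combine color group and label into a one-line description."""
    return f"{color} {cable_label(host)}: {CABLE_COLORS[color]}"

print(describe_cable("yellow", "master"))
# yellow M: inside network #1 (administrative and NFS)
```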

 

Power Distribution Units (PDUs) are mounted vertically in the rear of each rack (up to 6 in some cases), and can be reversed to allow connection to overhead or underfloor power. If switched or metered PDUs are used, their Ethernet connections will be color coded to match the network to which they are cabled.

 

Aspen's default rack front and rear doors are perforated to allow full cooling flow to the air intakes of all units, and free egress of rear exhaust air. Aspen can provide additional cooling options, such as raised floor helper fans, rear fan doors with routable plenums to control exhaust air direction and destination, front or rear self-contained rack cooling doors which are connected to building chilled water, in-row Computer Room Air Conditioning (CRAC) units, or even fully enclosed “cold aisle, hot aisle” solutions to meet your facility needs.

 

You may also request that nodes be racked with 1/3 rack unit of spacing between each node in the rack. This eliminates metal-to-metal contact between node cases, which minimizes heat transfer, and provides greater front-to-rear air flow within the rack. This may be desirable in facilities with air flow issues or marginal cooling.

 

You or your organization may have special racking or unit location preferences or requirements that are not reflected in our standard rack layout. In that case, your Aspen sales engineer will arrange a conference between you, the sales engineer, and our hardware engineers to determine your specific needs. Aspen will generate a custom rack layout diagram that reflects your requirements, then, after your approval and acceptance, rack your cluster according to that diagram.

 

Let's suppose that you have configured a 46-node InfiniBand cluster from Aspen, with 2 UPS systems, a 4U storage node and master, an external RAID system, and KVM capabilities. The following diagram illustrates a standard physical rack configuration for this example cluster.

 

 

In this example cluster, the InfiniBand switch selected can support a maximum of 48 ports, so connecting the master and storage nodes to the high speed interconnect left only 46 ports available for compute nodes. What advantage did we gain by connecting the master to the low latency interconnect in this case? Your sales engineer can answer that question. A larger 7U switch chassis placed in R2 might be a more cost-effective option for future expansion, as it can be populated with additional InfiniBand port cards when additional nodes are purchased, allowing the new nodes to be added to this cluster without any rewiring of current node connections. The 7U switch supports 96 ports, so a third rack could be procured at a later time with up to 44 additional identical nodes and bayed to your existing cluster as R3, expanding this cluster to 90 nodes.
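The port counts in this example can be tallied with a few lines of Python. The function is a hypothetical helper, and it assumes one InfiniBand port per node with the master and storage nodes counted as two service nodes.

```python
def free_ports(switch_ports, compute_nodes, service_nodes=2):
    """Ports left after cabling compute nodes plus the master and storage nodes."""
    return switch_ports - compute_nodes - service_nodes

print(free_ports(48, 46))       # 0: the 48-port switch is full
print(free_ports(96, 46))       # 48: room to grow on the 7U chassis
print(free_ports(96, 46 + 44))  # 4: ports still free after adding R3
```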

 

Your Ethernet infrastructure needs to be examined as well. If non-blocking Gigabit connectivity is not needed, a multi-port trunk between the existing switch and the new switch needed to connect the additional nodes might be sufficient. This would imply that data access on the compute nodes is over the InfiniBand interface, not the Ethernet. If your data mounts are to be done over Gigabit Ethernet, a stackable switch with a high speed stacking interface might be a cost-effective solution, depending on the number of nodes to be added, while a larger chassis-based Gigabit Ethernet switch could guarantee non-blocking connectivity. As your storage or master node data serving speed on Gigabit Ethernet would be limited to 2 Gb/s in any case (if 2 Ethernet interfaces were bonded), using the InfiniBand interface for data access would give better performance for the cost at this number of nodes.
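To see why a bonded 2 Gb/s link becomes the bottleneck, consider the even share of that bandwidth each node would get. The sketch below is a back-of-the-envelope estimate only: it assumes every node reads at once, the bond delivers its full line rate, and protocol overhead is ignored. The function name is illustrative.

```python
def per_node_share_mbps(server_links, node_count, link_gbps=1.0):
    """Even share of the server's bonded Ethernet bandwidth per node, in Mb/s.

    Assumes all nodes read simultaneously and the bond runs at full line
    rate, so this is an optimistic upper bound.
    """
    return server_links * link_gbps * 1000 / node_count

print(round(per_node_share_mbps(2, 46), 1))  # 43.5
print(round(per_node_share_mbps(2, 90), 1))  # 22.2
```

Under these assumptions, each node's share falls well below the 1 Gb/s its own Ethernet port can carry, so the server's uplink, not the switch fabric, limits data access.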

 

Density

 

Racking density is a major concern for many HPC users due to limited space availability and rising facility expenses. Aspen has several solutions for this environment, including blade server offerings. While more expensive, a blade server solution might be for you if space savings are the primary consideration for your cluster. However, blades often will not support the highest performance processors, and you will most likely need to make storage, access, and other compromises in order to use blade servers for your cluster. A more cost-effective alternative for your cluster might be “twin” systems.

 

Aspen offers twin systems, which combine 2 nodes into a 1U chassis sharing a common power supply. In the example cluster above, the nodes occupy 46U of rack space. Using twin systems, the same number of nodes would occupy only 23U of rack space, and unlike many blade solutions, no processor or storage compromises need be made. In denser clusters, it is critical that your facility offer adequate cooling or that we provide you with one of our additional cooling solutions. Additional rack space could be saved by using a 3U or 2U master and storage node, or by replacing the UPS systems with a single larger system. Your sales engineer can help you decide what options best fit your needs and budget.
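The space saving from twin systems is simple to reproduce. The helper below is an illustrative sketch that assumes 1U chassis and rounds up when the node count does not fill the last chassis.

```python
import math

def rack_units_needed(node_count, nodes_per_chassis=1, chassis_height_u=1):
    """Rack units occupied by `node_count` nodes in chassis of the given size."""
    return math.ceil(node_count / nodes_per_chassis) * chassis_height_u

print(rack_units_needed(46))                       # 46
print(rack_units_needed(46, nodes_per_chassis=2))  # 23
```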

 

If your facility has cooling issues or a history of overheating, Aspen can also rack all units with 1/3U of spacing between every node. In this case, three 1U nodes would require 4U of actual rack space, and no heat transfer between nodes due to metal-to-metal contact will occur.
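The effect of 1/3U spacing on rack space can be estimated as follows. This sketch counts one 1/3U gap per node, which matches the figure of 4U for three 1U nodes given above; the function name is illustrative.

```python
from fractions import Fraction

def spaced_rack_units(node_count, node_height_u=1, gap_u=Fraction(1, 3)):
    """Rack units used when each node is accompanied by a gap of `gap_u`."""
    return node_count * (node_height_u + gap_u)

print(spaced_rack_units(3))   # 4
print(spaced_rack_units(30))  # 40
```

Put another way, spaced racking fits three 1U nodes where four would otherwise go, so a rack holds about a quarter fewer nodes.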

 

