Nodes
What are your node configuration options?

Compute nodes are almost always considered replaceable, with little or no customization for any particular node. Using the Aspen utilities, Rocks, or Perceus / Warewulf, it is possible to define a particular group of nodes (a group could be one node) with unique properties, such as an extra external network connection, a different processor architecture, or a different software build. Such a group can then be used for specific functions your cluster needs.
These nodes are considered unique recovery targets. If you are using Aspen utilities, these nodes are backed up as a unique image which is tied to that particular node or nodes, just as a storage or fail-over master node is itself a unique image. In a Perceus / Warewulf system, another image of these nodes is kept on the master node, and in Rocks a separate class of node is created, which has its own unique properties.
This does not preclude another node of similar hardware configuration being used as a replacement for that node should it fail, although some physical intervention might be required. For instance, let's say that an external network connection as well as a SAN connection is attached to node1, and node1 fails. These connections could be serviced by node2 if:
- the external Ethernet connection on node1 is moved to node2
- node2 contains a fiber channel card identical to the one in node1, or
- the node1 fiber channel card is moved to node2
- node2 is re-imaged with a current node1 image
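The conditions above can be read as a checklist of manual steps. The following is a minimal sketch of that checklist, not part of the Aspen utilities, Rocks, or Perceus / Warewulf; the `Node` fields, the image names, and the `missing_steps` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical record of the state that matters when replacing a node."""
    name: str
    has_external_ethernet: bool  # external network cable attached
    has_fc_card: bool            # fiber channel card installed (identical model assumed)
    image: str                   # name of the image currently on the node


def missing_steps(failed: Node, spare: Node) -> List[str]:
    """Return the manual steps still needed before `spare` can replace `failed`."""
    steps = []
    if failed.has_external_ethernet and not spare.has_external_ethernet:
        steps.append(
            f"move external Ethernet connection from {failed.name} to {spare.name}"
        )
    if failed.has_fc_card and not spare.has_fc_card:
        steps.append(f"move fiber channel card from {failed.name} to {spare.name}")
    if spare.image != failed.image:
        steps.append(f"re-image {spare.name} with current {failed.name} image")
    return steps


if __name__ == "__main__":
    node1 = Node("node1", True, True, "node1-image")
    node2 = Node("node2", False, False, "compute-image")
    for step in missing_steps(node1, node2):
        print(step)
```

A spare that already has the card and the cable in place is left with only the re-imaging step, which is the step that needs no physical intervention.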
If you are going to utilize any of your compute nodes in unique hardware configurations for mission-critical tasks, Aspen recommends that you configure more than one node with that exact configuration in order to facilitate fail-over in the case of node failure. In the above example, adding a fiber channel card to node2 and attaching another external Ethernet connection to it, even though neither is actually configured during normal operation, would allow node1 functionality to be returned to the cluster in as little as 5 minutes. An even better option is to place this functionality on the master or another front-end node which itself is set up for high availability fail-over.
It is possible to utilize the same software build, and image, on nodes of different clock speeds and memory configurations with no modification. Nodes of different chip architectures will require different images to accommodate different hardware drivers and to take advantage of specific processor performance attributes.
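As a rough illustration of this rule, the sketch below groups a node inventory by chip architecture only, so that nodes differing in clock speed or memory end up sharing an image. The inventory tuples, node names, and the `images_needed` function are hypothetical and are not taken from any Aspen tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical inventory entry: (node name, architecture, clock in GHz, memory in GB).
NodeSpec = Tuple[str, str, float, int]


def images_needed(nodes: Iterable[NodeSpec]) -> Dict[str, List[str]]:
    """Group node names by architecture; each group can share one image."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, arch, _clock_ghz, _memory_gb in nodes:
        # Clock speed and memory size are deliberately ignored here.
        groups[arch].append(name)
    return dict(groups)


if __name__ == "__main__":
    inventory = [
        ("node1", "arch-a", 2.4, 16),
        ("node2", "arch-a", 3.0, 64),
        ("node3", "arch-b", 2.4, 16),
    ]
    print(images_needed(inventory))
```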
Your nodes' memory configurations should be based on your applications' memory requirements. Many customers have some additional applications that require more memory per core than might be economical, but do not run many instances of these applications compared to the other codes they run. In this case, configuring one or several nodes with more memory to accommodate these requirements is more economical, and you may utilize your scheduler or other utilities to route that particular application to that node or set of nodes.
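The routing decision can be sketched as follows. This is only an illustration of the idea, assuming two hypothetical node groups with made-up memory-per-core figures; in practice the scheduler's own queues or resource requests would do this job, and `pick_node_group` is not a real scheduler call.

```python
from typing import Dict, Optional

# Hypothetical node groups: group name -> memory available per core, in GB.
NODE_GROUPS: Dict[str, float] = {
    "standard": 2.0,
    "highmem": 8.0,
}


def pick_node_group(required_gb_per_core: float,
                    groups: Dict[str, float] = NODE_GROUPS) -> Optional[str]:
    """Return the smallest-memory group that meets the requirement, or None."""
    candidates = [
        (mem, name) for name, mem in groups.items() if mem >= required_gb_per_core
    ]
    if not candidates:
        return None  # no group in the cluster has enough memory per core
    # Choosing the smallest sufficient group keeps large-memory nodes free
    # for the few applications that really need them.
    return min(candidates)[1]


if __name__ == "__main__":
    print(pick_node_group(1.5))
    print(pick_node_group(4.0))
```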
While it is uncommon, nodes can fail in your cluster. There are two separate types of compute environments that affect your node hardware configuration choices.
- Non-critical codes: In many environments, applications are easily rerun in the case of a node failure. Applications are usually of limited duration and can be resubmitted or rerun if a node involved fails, or possibly the code has checkpoint and restart capabilities.
- Critical codes: In this environment, applications are not easily rerun, perhaps because of the number of applications run, duration of application execution, time sensitivity, input data storage requirements, or model preparation complexity.
In a non-critical job environment using disked clusters with our standard distributions, there is no need to have a hardware or software RAID 1 environment on compute nodes. Aspen provides command-line utilities, or a GUI when ABC is purchased, to restore any node very quickly, and the money that would otherwise go to hardware RAID cards and/or additional disks can be used to procure an entire spare compute node. Nodes do not need to have redundant power supplies either, and these cost savings can also be used to purchase additional compute nodes which can function as complete spares in the case of node failure.
In a critical code environment, node failures are taboo. Utilize software or hardware RAID 1 configurations on your disks, and ask for your nodes to be configured with redundant power supplies. In cases where 1U nodes with a high-speed interconnect card are used, your expansion slots may be limited, making software RAID 1 more attractive. If your nodes are configured with redundant power supplies, each supply should be powered by a different rack Power Distribution Unit (PDU) within the cluster, and those PDUs should in turn be connected to circuits on separate circuit breakers and, if possible, different electrical power panels within your facility.
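The power separation rule lends itself to a simple audit. The sketch below assumes a hand-maintained inventory that maps each node to the PDU and circuit breaker feeding each of its power supplies; the inventory layout, the labels, and the `nodes_sharing_power_path` function are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory: node name -> one (pdu, breaker) pair per power supply.
PowerFeeds = Dict[str, List[Tuple[str, str]]]


def nodes_sharing_power_path(feeds: PowerFeeds) -> List[str]:
    """Return nodes whose power supplies share a PDU or a circuit breaker."""
    problems = []
    for node, supplies in sorted(feeds.items()):
        pdus = [pdu for pdu, _ in supplies]
        breakers = [breaker for _, breaker in supplies]
        # A duplicate entry means two supplies depend on the same PDU or breaker.
        if len(set(pdus)) < len(pdus) or len(set(breakers)) < len(breakers):
            problems.append(node)
    return problems


if __name__ == "__main__":
    feeds = {
        "node1": [("pduA", "breaker1"), ("pduB", "breaker2")],
        "node2": [("pduA", "breaker1"), ("pduA", "breaker1")],
        "node3": [("pduA", "breaker1"), ("pduB", "breaker1")],
    }
    print(nodes_sharing_power_path(feeds))
```

In this made-up inventory, node2 has both supplies on one PDU and node3 has separate PDUs that share a breaker, so both are reported.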
Occasionally, extremely fast local scratch space is needed by your application(s). A single SATA II disk on a compute node can provide ~50 to ~70 MB/s sustained write throughput, depending on the disk model used; that might not be fast enough for some applications. In these cases, compute nodes can be configured with multiple disks and RAID 5 or RAID 0 sets to achieve the desired level of reliability and performance.
Many single 1U servers can be configured with four 3.5” disks. This allows for a RAID 5 set of 3 drives plus one hot spare disk, holding both the operating system build and scratch space (more reliable and faster than a single drive), or for a less reliable but quite fast layout with a single O.S. drive and a 3 disk RAID 0 (striping) set.
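To compare the two layouts, the sketch below computes idealized upper bounds only: it assumes sequential writes scale linearly with the number of data disks, which real arrays will not fully reach, especially RAID 5 with its parity overhead. The 500 GB disk size is a made-up example value, and `raid_estimate` is a hypothetical helper, not a measurement tool.

```python
from typing import Tuple


def raid_estimate(level: int, disks: int,
                  disk_mb_s: float, disk_gb: float) -> Tuple[float, float]:
    """Return (usable capacity in GB, ideal sequential write rate in MB/s).

    Hot spares are not counted in `disks`; they add no capacity or speed.
    """
    if level == 0:
        data_disks = disks          # striping: every disk holds data
    elif level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        data_disks = disks - 1      # one disk's worth of space holds parity
    else:
        raise ValueError("only RAID 0 and RAID 5 are covered by this sketch")
    return data_disks * disk_gb, data_disks * disk_mb_s


if __name__ == "__main__":
    # 3 disk RAID 0 at the low end of the single-disk range (50 MB/s).
    print(raid_estimate(0, 3, 50.0, 500.0))
    # 3 disk RAID 5 at the high end of the single-disk range (70 MB/s).
    print(raid_estimate(5, 3, 70.0, 500.0))
```

Under these assumptions a 3 disk RAID 0 set tops out at three times the single-disk rate with no redundancy, while a 3 disk RAID 5 set gives up one disk's worth of capacity and ideal throughput in exchange for surviving a single disk failure.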
Hardware or software RAID can be used in either of these scenarios, although when configuring with software RAID 5, the /boot partition is configured as RAID 1 to allow correct booting. Software RAID 5 will almost always incur higher overhead on your node and can slow down code execution, so if you opt for a RAID 5 compute node solution, Aspen recommends that you configure your nodes with a hardware RAID controller if possible. Your application requirements and node expansion slot availability will drive this choice.




