Managing an HPC cluster well starts with having the right HPC management software in place. This includes tools to deploy compute nodes, keep operating systems and other software up to date, and monitor the hardware. Full cluster managers are available: some are free, including Aspen Systems’ Aspen Cluster Management Environment (ACME) software, while others come with commercial support and require a license. Alternatively, you can assemble your own software stack from individual components, many of which are available as open source software.
The software stack is perhaps the most important part of your high performance computing solution. Starting with your choice of operating system, the software stack determines not only how your system operates, but also its performance.
Unlike most other HPC manufacturers, Aspen Systems offers a full selection of operating systems for you to choose from. Some operating systems are more user-friendly, while others may provide increased performance for your applications. You may also already be familiar with a particular Linux distribution, so sticking with it may be the best choice for you, depending on the hardware selected.
Just as important as the distribution are the HPC compilers and MPI implementations:
- Compiling your source code with a commercial compiler, such as Intel’s, can often deliver significant performance increases.
- If you have a high speed interconnect such as InfiniBand, then compiling your cluster’s MPIs with a commercial compiler and the performance communications libraries will be of great benefit.
- If your cluster contains GPUs or FPGAs, then using a custom compiler is essential to achieving optimal performance.
Aspen offers a full selection of performance software options, including compiling your choice of MPIs and other software with as many compilers as you wish before your system ships. Aspen requires all customers to fill out our online Statement of Work (SOW).
Aspen Systems Cluster Management
Cluster management and support is perhaps one of the most overlooked facets of operating a cluster. Two questions must be answered for a successful cluster deployment: What hardware and software capabilities will be installed on your cluster to facilitate HPC management and support? And what are your cluster management, warranty, and support options?
Aspen Systems Cluster Management software comes standard with all of our HPC Clusters, along with our Standard Service Package at no additional cost. Aspen Cluster HPC Management software is compatible with most Linux distributions and is supported for the life of the cluster.
- Node Provisioning – Aspen Cluster Maintenance Environment (ACME) is a network-bootable Linux environment that is independent of the environment installed on a cluster node. It is used for deploying images across your cluster, testing and pre-configuring cluster nodes, and stress testing. Images are created using ‘aspencopy’ and deployed through ACME using ‘aspenrestore’.
- Aspen Tools – Aspen provides command line tools on our clusters for imaging, remote power, sensor programs, and more. These are often used by more advanced cluster users to quickly check status on nodes, remotely power them on or off, or to re-image large groups of nodes.
- Scheduler – Aspen can install and configure several different resource manager and scheduler combinations on your cluster. Some are open source and free of charge, while others are commercial products that you must purchase. Aspen can procure and install these utilities on your cluster for you, or transfer licenses you already hold.
- Environment Modules – Aspen installs Environment Modules on all of our HPC Clusters. Modules allow users to dynamically modify their environment via modulefiles and are useful for managing different versions of applications (MPIs, compilers). Modules can also be bundled into metamodules that load an entire suite of different applications.
- Ganglia – Aspen normally installs and configures Ganglia on your cluster, and can make Ganglia externally available as a default web page for organizations that are used to seeing Ganglia as the front-end web page for their clusters. Ganglia is a popular, scalable distributed monitoring system for clusters and grids, and many HPC customers do not consider a cluster complete without it.
- Monitoring – Aspen can configure your cluster with several monitoring tools to help you and your support team get the most value out of your technology investment. Nearly all aspects of your cluster can be monitored, including performance and utilization, network saturation, power consumption, temperature, and more.
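To make the Environment Modules item above more concrete, here is a minimal Python sketch of the idea behind loading and unloading a modulefile: directories are prepended to path-like variables on load and removed again on unload, and a metamodule simply expands to several modules. This is a toy model for illustration only, not how Environment Modules is implemented, and every module name, path, and function name in it is a hypothetical assumption rather than something installed on an Aspen cluster.

```python
"""Toy model of what loading and unloading a modulefile does to an environment.

Illustration only: this is NOT the Environment Modules implementation.
All module names, paths, and function names here are hypothetical.
"""

# Hypothetical "modulefiles": directories to prepend to path-like variables.
MODULEFILES = {
    "gcc/12.2": {
        "PATH": ["/opt/gcc/12.2/bin"],
        "LD_LIBRARY_PATH": ["/opt/gcc/12.2/lib64"],
    },
    "openmpi/4.1": {
        "PATH": ["/opt/openmpi/4.1/bin"],
        "LD_LIBRARY_PATH": ["/opt/openmpi/4.1/lib"],
    },
}

# Hypothetical metamodule: one name that stands for a whole suite of modules.
METAMODULES = {"toolchain/gnu": ["gcc/12.2", "openmpi/4.1"]}


def _expand(name):
    """Return the list of plain modules that a module or metamodule refers to."""
    return METAMODULES.get(name, [name])


def load(env, name):
    """Return a copy of env with the module's directories prepended."""
    new_env = dict(env)
    for module in _expand(name):
        for var, dirs in MODULEFILES[module].items():
            current = [p for p in new_env.get(var, "").split(":") if p]
            # Insert in reverse so the module's own ordering is preserved.
            for directory in reversed(dirs):
                if directory not in current:
                    current.insert(0, directory)
            new_env[var] = ":".join(current)
    return new_env


def unload(env, name):
    """Return a copy of env with the module's directories removed."""
    new_env = dict(env)
    for module in _expand(name):
        for var, dirs in MODULEFILES[module].items():
            remaining = [
                p for p in new_env.get(var, "").split(":") if p and p not in dirs
            ]
            if remaining:
                new_env[var] = ":".join(remaining)
            else:
                # Drop variables that the module alone had populated.
                new_env.pop(var, None)
    return new_env


if __name__ == "__main__":
    base = {"PATH": "/usr/bin:/bin"}
    loaded = load(base, "toolchain/gnu")
    print("PATH after load:  ", loaded["PATH"])
    print("PATH after unload:", unload(loaded, "toolchain/gnu")["PATH"])
```

Because both functions return a new dictionary instead of changing the caller's environment, switching between two versions of a compiler or MPI in this model is just an unload followed by a load, which mirrors the version-management use described above.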
Bright Cluster Manager
Bright Computing is an industry leader in HPC middleware solutions for provisioning and managing HPC clusters, Hadoop clusters, and OpenStack private clouds in your data center or in the cloud. Bright Cluster Manager, the flagship product of Bright Computing, makes Linux clusters, big data, and cloud architectures easy to install, manage, and use, and it is designed to scale to thousands of nodes. It is designed to be a complete HPC management solution and includes everything a user or system administrator would expect from an advanced cluster management software stack. Contact one of our expert sales engineers today to learn how the HPC solutions from Bright Computing can help you streamline the installation and management of your HPC system.
HPC Simplified with Intel HPC Orchestrator
High-performance computing is driving new innovations across a wide range of industries, from biosciences to finance to cosmology and more. Intel HPC Orchestrator simplifies the installation, management, and ongoing maintenance of your system by reducing the amount of integration and validation effort required to run an HPC software stack. With Intel HPC Orchestrator, based on the OpenHPC system software stack, you can take advantage of the innovation driven by the open source community while also getting peace of mind from Intel support across the entire stack. Accelerate your time to results and value for your HPC initiatives through Orchestrator.
Intel HPC Orchestrator, part of the Intel Scalable System Framework, reduces the burden of integrating and validating an HPC software stack and greatly simplifies ongoing maintenance and support. It gives the software ecosystem the best of both worlds: community-driven innovation and the peace of mind of Intel expertise and support. The video touches on Intel HPC Orchestrator proofs of concept with Fujitsu, ANSYS and COMSOL. Intel HPC Orchestrator is helping close the gap between hardware and software on the path to exascale performance.