Remote Management

Ideally, in an HPC cluster, there is no need to log into an individual node. In the rare event that local access to a node becomes necessary for debugging or troubleshooting, console-level access to that particular node must be assured.

Enabling remote management for all the nodes in your cluster is recommended. However, traditional KVM (Keyboard, Video, Mouse) solutions can be problematic, as they require special cabling and have growth limitations. KVM switches remain a common way to manage multiple systems remotely from a single console. Aspen Systems node management takes many forms, from power controls and remote monitoring of systems to full remote KVM emulation.


IPMI, the Intelligent Platform Management Interface, is a standard commonly used to manage computers regardless of the system’s state. Common IPMI features include lights-out management, system thermal monitoring, and monitoring of CPU status without interrupting the system. IPMI systems used by Aspen Systems also integrate network-based KVM, providing a single-cable, Ethernet-based standard for monitoring and managing the system without the need for dedicated KVM switches.
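As a rough illustration of the lights-out access described above, the following minimal Python sketch queries a node's power state through the open-source `ipmitool` command-line utility. It is not part of Aspen Systems' software. The host name `node01-bmc` and user `admin` are hypothetical, and the sketch assumes `ipmitool` is installed, the node's management controller is reachable over the network, and the password is supplied in the `IPMI_PASSWORD` environment variable.

```python
from __future__ import annotations

import subprocess


def ipmi_command(host: str, user: str, *args: str) -> list[str]:
    """Build an ipmitool command line for a remote management controller.

    -I lanplus selects the IPMI v2.0 LAN interface; -E makes ipmitool read
    the password from the IPMI_PASSWORD environment variable.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]


def parse_power_status(output: str) -> bool:
    """Interpret the reply to `chassis power status`, e.g. 'Chassis Power is on'."""
    return output.strip().lower().endswith("is on")


def node_is_powered_on(host: str, user: str) -> bool:
    """Ask the node's management controller whether the chassis is powered on."""
    result = subprocess.run(
        ipmi_command(host, user, "chassis", "power", "status"),
        capture_output=True,
        text=True,
        check=True,   # raise if ipmitool reports an error
        timeout=30,   # do not hang on an unreachable controller
    )
    return parse_power_status(result.stdout)


# Hypothetical usage (requires a reachable management controller):
#   node_is_powered_on("node01-bmc", "admin")
```

The same `ipmi_command` helper could be reused for other `ipmitool` subcommands, such as `sdr type Temperature` for thermal readings, although sensor names and output layout vary between hardware vendors.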


With Aspen Beowulf Cluster Management (ABC) software, the full capability of the IPMI monitoring and KVM emulation systems can be integrated into a single management console. ABC combines monitoring systems with IPMI/KVM integration to help ensure that every node in your cluster is performing at peak efficiency. Effective monitoring shows CPU and communications workloads across the cluster and can help predict hardware failures. With ABC software, system downtime can be minimized by maintaining the entire cluster through a single SSL-enabled web browser from anywhere on the Internet. Integrated power controls provide an additional view into the management of the system, reporting current usage by each node and even controlling the power state if a node becomes unresponsive. Imagine having complete control over your cluster from anywhere!
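To make the idea of controlling the power state of an unresponsive node more concrete, here is a small Python sketch of one possible recovery policy. It does not describe how ABC works internally. The function name, the failure threshold, and the policy itself are assumptions made for illustration. The returned values are standard `ipmitool` subcommands (`chassis power on` and `chassis power cycle`) that a management script could pass to the node's management controller.

```python
from __future__ import annotations


def recovery_action(
    power_is_on: bool,
    failed_health_checks: int,
    max_failures: int = 3,  # hypothetical threshold, tune per site
) -> list[str] | None:
    """Decide which IPMI power command, if any, to send to a node.

    Returns the ipmitool subcommand as a list of arguments, or None when
    the node should be left alone.
    """
    if failed_health_checks < max_failures:
        # The node is healthy, or has not failed often enough to act on.
        return None
    if not power_is_on:
        # The node is simply off: switch it back on.
        return ["chassis", "power", "on"]
    # The node is powered but unresponsive: power-cycle it.
    return ["chassis", "power", "cycle"]
```

Separating the decision from the command execution keeps the policy easy to test and makes it simple to log or review an action before any power command is actually sent.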

 
