Intel Lustre: Making Parallel File Systems Simple
Intel Lustre is an open-source parallel file system used in some of the largest HPC clusters in the world because of its high performance and scalability. Lustre uses the following components to present a unified file system: Management Server (MGS), Management Target (MGT), Metadata Servers (MDS), Metadata Targets (MDT), Object Storage Servers (OSS), Object Storage Targets (OST), and Lustre clients.
The MDS makes metadata available to clients via MDTs. Each MDS manages names and directories in the Lustre file system and provides network connectivity for one or more MDTs, which are local to the MDS. MDTs store metadata (filenames, directories, permissions). A basic Lustre file system has a single MDT; the Distributed Namespace (DNE) feature mentioned below allows metadata to be spread across more than one.
The OSS provides I/O and network connectivity for one or more local OSTs. The OSTs store the actual file data. A single Lustre file system can have many OSTs, and you can stripe a file across many OSTs for performance using a Logical Object Volume (LOV). There is a lot of flexibility regarding where the MDT or OSTs are located, but normally, an OSS has four or more OSTs.
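To make the idea of striping concrete, here is a minimal sketch of how a byte offset in a file could map to an OST under simple round-robin striping. The class name, the 1 MiB stripe size, and the stripe count of 4 are assumptions chosen for illustration; this is not Lustre's own code, and real layouts are set and managed by Lustre itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StripeLayout:
    """Hypothetical, simplified stand-in for a striped file layout."""
    stripe_size: int   # bytes written to one OST before moving to the next
    stripe_count: int  # number of OSTs the file is striped across

    def locate(self, offset: int) -> Tuple[int, int]:
        """Map a file byte offset to (stripe index, offset inside that OST's object)."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        chunk = offset // self.stripe_size        # which stripe-sized chunk of the file
        stripe_index = chunk % self.stripe_count  # round-robin over the OSTs
        full_rounds = chunk // self.stripe_count  # chunks already placed on that OST
        object_offset = full_rounds * self.stripe_size + offset % self.stripe_size
        return stripe_index, object_offset


# Assumed example values: 1 MiB stripes spread over 4 OSTs.
layout = StripeLayout(stripe_size=1 << 20, stripe_count=4)
for off in (0, 1 << 20, 5 << 20, (5 << 20) + 123):
    print(off, layout.locate(off))
```

Because consecutive chunks land on different OSTs, a large sequential read or write can be served by several OSSs at once, which is where the performance benefit of striping comes from.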
Metadata performance is one of the most critical and limiting factors in a Lustre file system. Some configurations place the metadata on flash storage to increase speed and lower latency. Intel Lustre has high availability features, such as active/active OSSs with SAN connectivity to shared disks, and failover MDS systems. An interesting reliability feature is that the Lustre client does not write directly to the file system served by the OST. Instead, the OSS performs the file system modifications. This can isolate the file system from incorrectly configured or defective clients, and forms an additional layer of protection against file system corruption.
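The division of labor described above can be sketched as a toy in-memory model: the client asks the MDS where a file lives, then sends its request to the OSS, and only the OSS touches the data held on its OSTs. All class and method names here are hypothetical and exist only for this illustration; they are not Lustre APIs, and a real MDS allocates objects in a far more sophisticated way.

```python
class OSS:
    """Toy object storage server; only it modifies the data on its OSTs."""

    def __init__(self, ost_count):
        # One dict per OST, mapping object id -> bytes (stands in for the backing storage).
        self._osts = [dict() for _ in range(ost_count)]

    def write(self, ost, object_id, data):
        if not 0 <= ost < len(self._osts):
            raise ValueError("unknown OST index")  # the server rejects bad requests
        self._osts[ost][object_id] = bytes(data)

    def read(self, ost, object_id):
        return self._osts[ost][object_id]


class MDS:
    """Toy metadata server: records which OSS/OST/object holds each path."""

    def __init__(self, targets):
        self._targets = list(targets)  # available (oss name, ost index) pairs
        self._mdt = {}                 # path -> (oss name, ost index, object id)

    def create(self, path):
        # Assumption for the sketch: pick targets in simple rotation.
        oss_name, ost = self._targets[len(self._mdt) % len(self._targets)]
        self._mdt[path] = (oss_name, ost, "obj-%d" % len(self._mdt))
        return self._mdt[path]

    def lookup(self, path):
        return self._mdt[path]


class Client:
    """Toy client: talks to servers only, never to the storage itself."""

    def __init__(self, mds, servers):
        self._mds = mds
        self._servers = servers  # oss name -> OSS

    def write_file(self, path, data):
        oss_name, ost, object_id = self._mds.create(path)
        self._servers[oss_name].write(ost, object_id, data)

    def read_file(self, path):
        oss_name, ost, object_id = self._mds.lookup(path)
        return self._servers[oss_name].read(ost, object_id)


servers = {"oss0": OSS(ost_count=2), "oss1": OSS(ost_count=2)}
mds = MDS(targets=[("oss0", 0), ("oss1", 0), ("oss0", 1), ("oss1", 1)])
client = Client(mds, servers)
client.write_file("/scratch/a.dat", b"hello")
print(client.read_file("/scratch/a.dat"))
```

In this model a misbehaving client can at worst send a bad request, which the server is free to reject; it has no way to reach into the storage behind the OSS.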
Intel Enterprise Edition for Lustre software
Intel EE for Lustre software unleashes the performance and scalability of the Lustre parallel file system for HPC workloads, including technical applications common within today’s enterprises. It allows end users who need the benefits of large-scale, high-bandwidth storage to tap the power and scalability of Lustre, with the simplified installation, configuration, and management features provided by Intel Manager for Lustre software, a management solution purpose-built for the Lustre file system by the Lustre experts at Intel. Intel EE for Lustre software is backed by Intel.
Get the most widely used parallel file system for HPC, with the enterprise-ready reliability you expect from Intel. Lustre Editions from Intel provide the fast, massively scalable storage software needed to accelerate performance, even on complex workloads.
Intel Manager for Lustre
Intel Manager for Lustre software includes simple, but powerful, management tools that provide a unified, consistent view of Lustre storage systems and simplify the installation, configuration, monitoring, and overall management of Lustre. The manager consolidates all Lustre information in a central, browser-accessible location for ease of use.
Integrated Apache Hadoop Adapter
When organizations operate both Lustre and Apache Hadoop within a shared HPC infrastructure, there is a compelling use case for using Lustre as the file system for Hadoop analytics, as well as HPC storage.
Intel Enterprise Edition for Lustre includes an Intel-developed adapter which allows users to run MapReduce applications directly on Lustre. This optimizes the performance of MapReduce operations while delivering faster, more scalable, and easier to manage storage.
Performance and Affordability
Intel EE for Lustre software has been designed to enable fully parallel I/O throughput across thousands of clients, servers, and storage devices. Metadata and data are stored on separate servers so that each system can be optimized for the different workload it presents. Improved metadata scalability using the Distributed Namespace (DNE) feature is now integrated in Intel Manager for Lustre. Intel EE for Lustre can also scale down efficiently to provide fast parallel storage for smaller organizations.
Intel EE for Lustre software is based on the community release of Lustre software, and is hardware, server, and network fabric neutral. Enterprises can scale their storage deployments horizontally, yet continue to have simple-to-manage storage.
Why ROI is so high with Intel Lustre
Lustre powers 60% of the world’s 100 fastest computers. Unleash the Lustre parallel file system as an enterprise platform, offering higher throughput and helping to prevent bottlenecks.
- Built on the community release of Lustre software
- Intel Manager for Lustre simplifies install and configuration
- Enormous storage capacity and I/O
- Open, documented interfaces for deep integration
- Throughput in excess of 1 terabyte per second
- Resilient, highly available storage
- Centralized, GUI-based administration for management simplicity
- Integrated support for Hadoop MapReduce applications with Lustre storage
- Rigorously tested, stable software proven across diverse industries
- Flexible storage solution based on enhanced community release software
- Global 24/7 technical support
Lustre not only has the backing of the open source community, it also has the backing of one of the largest chipmakers in the world. Purchasing Intel EE for Lustre with Aspen Systems’ Lustre solution gives our customers engineering help from both of these world-class companies.
How it works: Lustre has five major component groups
Management Server (MGS): Lustre servers (MDS and OSS) provide information to the MGS, while the Lustre clients retrieve information from the MGS. In common setups, the MGS shares a server with the MDS for simplicity.
- Management Target (MGT): Stores information provided by the Lustre servers.
Metadata Server (MDS): A Lustre server that stores the metadata (the filenames and layout, permissions, and directory location) and provides this information to the clients.
- Metadata Target (MDT): Stores the metadata and is attached to one or more MDSs. This can be directly attached storage (DAS) but is commonly on shared storage for failover setups. As the MDS needs to provide metadata quickly to the clients, faster storage such as SSDs is commonly used for MDTs. It is also good practice to back up the MDT, or at least keep it in a RAID configuration that provides some protection against data loss.
Object Storage Server (OSS): A Lustre server that acts as the actual storage server for Lustre. It stores and provides the data to the clients. More OSSs can mean better performance, and OSSs can be added later to increase both performance and capacity.
- Object Storage Target (OST): Holds the data for Lustre. With larger OSTs, there is now often one OST per OSS, but multiple OSTs can be attached to the same OSS. OSTs can be added to an OSS later on to provide more storage.
Clients: In HPC, these are the compute nodes which use the Lustre file system to store and retrieve data. To do this, the Management Client (MGC), Metadata Client (MDC), and the Object Storage Clients (OSCs) are all included in the Lustre client software stack.
- Mounts Lustre
The Lustre Network (LNET) is the networking application program interface (API) used to carry all information and data across the network. Most commonly, InfiniBand and/or Ethernet is used, with InfiniBand being the stronger choice due to its low latency and high available bandwidth. Please note that LNET can use InfiniBand and Ethernet simultaneously, so clients which do not have InfiniBand can still mount the Lustre file system through Ethernet.
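As a rough illustration of the mixed-network case, the sketch below picks which server address a client would use when the server is reachable over both InfiniBand and Ethernet. It assumes identifiers written as address@network (for example an o2ib network for InfiniBand and a tcp network for Ethernet); the addresses, the preference order, and the function names are made up for this example. In a real deployment LNET itself makes this choice, so treat this only as a way to picture the behavior.

```python
# Assumed preference order for this sketch: InfiniBand first, then Ethernet.
PREFERRED_NETWORKS = ["o2ib", "tcp"]


def split_nid(nid):
    """Split an 'address@network' identifier into (address, network)."""
    address, sep, network = nid.partition("@")
    if not sep or not address or not network:
        raise ValueError("malformed NID: %r" % nid)
    return address, network


def pick_server_nid(server_nids, client_networks):
    """Return the server NID on the most preferred network the client also has."""
    by_network = {split_nid(nid)[1]: nid for nid in server_nids}
    for network in PREFERRED_NETWORKS:
        if network in client_networks and network in by_network:
            return by_network[network]
    return None  # no network in common, so the client cannot reach this server


# Hypothetical server reachable on both fabrics.
server_nids = ["10.10.0.1@o2ib", "192.168.0.1@tcp"]

print(pick_server_nid(server_nids, {"o2ib", "tcp"}))  # client with both fabrics
print(pick_server_nid(server_nids, {"tcp"}))          # Ethernet-only client
```

An Ethernet-only client still finds a usable address, which is the property the paragraph above describes.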
Ready for the performance of a parallel file system backed by the support of two world-class companies? Ask your Aspen Systems rep for a quote today!