BeeGFS

BeeGFS – An Open Source Parallel Cluster File System For High Performance Computing Applications


Formerly FhGFS (Fraunhofer Parallel File System), BeeGFS was created by the German research organization the Fraunhofer Society for the Advancement of Applied Research. It is an open-source, free-to-use file system optimized for high performance computing. Designed from the ground up for easy installation and management, BeeGFS is used on some of the fastest computer clusters in the world and has quickly become a leader in parallel cluster file systems.

At Aspen Systems, we aim to help our customers achieve the best performance by helping them select the right products and solutions and by offering the highest quality support for a wide variety of HPC systems.

Contact an Aspen Systems Expert to see if BeeGFS is the right choice for your system.

How BeeGFS Parallel Cluster File System Works


BeeGFS’ Major Components

BeeGFS spreads user data across multiple servers using file striping, so performance and capacity grow with the number of servers and hard disks in the system. New nodes can be added seamlessly, scaling performance and capacity up to whatever level is required. Whether 10 nodes or 10 thousand nodes, the system works the same.
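To make the striping idea concrete, here is a minimal sketch of round-robin striping: a file is cut into fixed-size chunks, and each chunk goes to the next storage target in turn. The chunk size, the target names, and the `locate` helper are illustrative assumptions, not BeeGFS internals or its API.

```python
from typing import NamedTuple


class ChunkLocation(NamedTuple):
    target: str        # storage target holding the chunk (hypothetical name)
    chunk_index: int   # index of the chunk within the whole file
    chunk_offset: int  # byte offset inside that chunk


def locate(offset: int, chunk_size: int, targets: list[str]) -> ChunkLocation:
    """Map a byte offset in a file to a storage target, round-robin."""
    if offset < 0 or chunk_size <= 0 or not targets:
        raise ValueError("need offset >= 0, chunk_size > 0, and at least one target")
    chunk_index = offset // chunk_size
    # Consecutive chunks land on consecutive targets, wrapping around.
    target = targets[chunk_index % len(targets)]
    return ChunkLocation(target, chunk_index, offset % chunk_size)


# Assumed values, for illustration only.
CHUNK_SIZE = 512 * 1024
TARGETS = ["storage01", "storage02", "storage03", "storage04"]

if __name__ == "__main__":
    for offset in (0, 600_000, 3_000_000):
        print(offset, locate(offset, CHUNK_SIZE, TARGETS))
```

Because neighboring chunks live on different targets, a large sequential read or write touches several servers at once, which is where the added throughput comes from in this simplified model.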

Services:

  • Client – an internal client registered with the Linux virtual file system interface
  • Storage – a service to store data chunk files
  • Metadata – a service to store metadata such as directory information and access rights. It can be scaled out to improve system performance, allowing you to address different types of workloads. Any time a new directory is created, the system automatically selects an available metadata server to handle the files in that directory. Sub-directories can be assigned to other servers for load balancing.
  • Management – the central meeting point for the metadata, storage, and client services
  • Admon – a Java-based GUI Administration and Monitoring System Tool
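The per-directory assignment described for the Metadata service can be sketched as a toy model in which every directory is owned by exactly one metadata server. The selection policy used here (pick the server that currently owns the fewest directories) and all names are assumptions for illustration; the text above only says that an available server is selected automatically.

```python
class MetadataDirectory:
    """Toy model: each directory is owned by one metadata server."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one metadata server is required")
        self.owner = {}                      # directory path -> server name
        self.load = {s: 0 for s in servers}  # server name -> directories owned

    def mkdir(self, path):
        """Assign a new directory to the server that owns the fewest (assumed policy)."""
        if path in self.owner:
            raise FileExistsError(path)
        server = min(self.load, key=self.load.get)
        self.owner[path] = server
        self.load[server] += 1
        return server


if __name__ == "__main__":
    meta = MetadataDirectory(["meta01", "meta02"])  # hypothetical server names
    for path in ("/projects", "/projects/run1", "/scratch"):
        print(path, "->", meta.mkdir(path))
```

In this model a sub-directory such as `/projects/run1` can end up on a different server than its parent, which is how metadata load gets spread across servers.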

Key features

  • Ease of use
    Unlike many similar parallel file systems, BeeGFS requires no kernel patches to run its services. The client service is a patchless kernel module, and the server components are userspace daemons. Simple, point-and-click graphical installation tools give you full control over your cluster. Add more clients and servers to expand the cluster with zero downtime.

  • HPC Focus
    Highly efficient. Multi-threaded core components. Native InfiniBand support. Built from the ground up to be scalable. Each file system node can serve both InfiniBand and Ethernet (or any other TCP-enabled network) connections simultaneously and switch on the fly to redundant connections if one fails.

  • High Availability
    BeeGFS offers optional redundancy and replication for metadata and file contents, so you can access all of your existing data even after hardware failures.

  • Distributed File Contents and Metadata
    BeeGFS strictly avoids architectural bottlenecks. File contents can be striped across multiple drives and storage servers, and file system metadata is distributed across multiple metadata servers, giving a significant performance boost to large clusters running metadata-intensive applications.

  • Client and Servers on any Machine
    No specific Linux distribution is required, and there are no special environmental requirements. BeeGFS clients and servers can even run on the same machine to boost performance for smaller clusters. BeeGFS requires no dedicated file system partition; it uses existing partitions formatted with xfs, ext4, or any of the standard Linux file systems. For larger clusters and networks, you can create several distinct BeeGFS partitions, each configured differently to handle a different kind of workload.


  • Robust, high performance concurrent access under extreme I/O loads
    Any changes to a file or directory by one client are always immediately visible to other clients. Multiple clients can seamlessly read from and write to the same shared file without data corruption. Simple remote file systems like NFS show serious performance problems with this kind of access, which can even lead to corrupt data. BeeGFS, on the other hand, was designed specifically with it in mind, since shared-file access is a common use case for advanced HPC clusters.

Would you like to learn more? Contact the team at Aspen Systems to see if BeeGFS is the right choice for you.
