NAS and SAN Technology



NAS and SAN storage solutions can serve large data volumes faster than a single master or storage node disk configuration can. A Storage Area Network (SAN) configuration is often required to implement a parallel file system, because more than one host needs direct access to the disks. Network Attached Storage (NAS) solutions can serve NFS shares at performance levels that are not possible using master or storage node local disk configurations, and some NAS solutions can be used to serve parallel file systems as well.

 

Both SAN and NAS solutions use RAID configurations, just as a master or storage node local disk share does, but they are specifically designed to do a single job: share their data with other computers. NAS solutions are most often used to present remote file systems to clients (the client mounts the file system from the NAS system and does not see the disks), while SANs are most often used to provide remote block-level storage to their clients (the client manages the file system on top of the exported disks). Hybrid NAS/SAN systems exist as well; they combine the technologies to provide particular performance advantages or reliability characteristics not provided by a single technology.

 

Network Attached Storage (NAS)

 

A NAS system is self-contained and combines RAID, the local file system, network interfaces, and data sharing functionality into a single unit. NAS systems have customized user interfaces that provide simplified management and diagnostics, and they will almost always outperform local file system based solutions. Optimized for high speed data access, they use custom hardware and software designs to serve data faster than a general-purpose server can.

 

NAS systems are almost always easy to expand, and often incorporate customized file systems which make online expansion possible and quite easy to accomplish. Data replication between units is often supported for business continuance and disaster recovery, and they interface with standard back-up systems. NAS systems almost always support more than one protocol for the data share. NFS and Microsoft CIFS support is de rigueur on all NAS solutions, and some implement additional parallel file system access as well. This multi-protocol support allows a single NAS to support Windows systems, normal NFS clients, and high speed parallel file system clients all at the same time, presenting a unified data space across all the resources. This is a very powerful feature, with obvious advantages.

 

Clustered NAS systems combine more than one unit in order to increase performance and reliability. They can use a backend high speed network, or sometimes the data delivery network itself, to service data sharing needs between the units. Virtual IP or re-direct technologies can be used to load balance target systems across the different components of the NAS and remove or ameliorate network performance bottlenecks.
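
As a rough illustration of the load balancing idea described above, the following Python sketch spreads client mounts across the nodes of a clustered NAS in round-robin order. The node addresses, client names, and the `assign_clients` helper are all hypothetical and chosen only for illustration; real clustered NAS products implement virtual IP and re-direct schemes in their own ways, which this sketch does not attempt to reproduce.

```python
from collections import Counter
from itertools import cycle


def assign_clients(clients, nas_nodes):
    """Hand each client one NAS node address, cycling through the nodes."""
    node_cycle = cycle(nas_nodes)
    return {client: next(node_cycle) for client in clients}


# Hypothetical node addresses and client names, for illustration only.
nas_nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
clients = [f"compute{i:02d}" for i in range(1, 10)]

assignment = assign_clients(clients, nas_nodes)

# Count how many clients landed on each node.
load = Counter(assignment.values())
for node in nas_nodes:
    print(node, load[node])
```

With nine clients and three nodes, each node ends up serving three clients, so no single network interface carries all of the traffic.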

 

Aspen integrates and recommends several NAS solutions as part of our portfolio. Panasas provides a clustered NAS solution which scales extremely well, supports NFS and CIFS access, and also acts as a parallel NFS server to deliver high speed data access to your HPC solution. Terascala implements a hardened Lustre server appliance that not only gives superior performance, but also removes most of the pain from implementing a high performance Lustre solution. Isilon and BlueArc both implement extremely high performance NFS and CIFS serving, which scales extremely well in most HPC solutions.

 

NAS technology, and especially clustered NAS systems, combines extreme scalability, performance, high reliability, enterprise fail-over features, and ease of management to provide a very compelling answer to high performance and large scale HPC data access needs. Which NAS system is best for your specific needs can be quite a complex question. Your sales engineer will study your requirements and environment carefully before making a recommendation for your solution.

 

Storage Area Networks (SAN)


SANs are used to attach remote storage devices to computers. These devices could be disk arrays (the most common usage), tape libraries, or even optical back-up systems. Even though the devices are available to more than one client system, each one appears to the operating system on an attached server as a local device. This contrasts with master or storage node disk configurations, which are classed as direct attached storage (DAS) and are available to only one system at any given time.

 

Storage Area Networks can be implemented via several different protocols. The most common protocol used in SAN implementations today is Fibre Channel. Fibre Channel was originally designed for HPC, and performs many of the same functions as HIPPI, which was commonly used by supercomputers in the 1980s to connect storage devices and network systems together. Fibre Channel uses the Fibre Channel Protocol to transport other commands, such as SCSI, across the fabric to access and manipulate the storage device. Today, Fibre Channel is most often implemented using 850 nm multimode fiber optic cables, although single mode fiber (for long distances) and even copper twisted pair media are available. Common Fibre Channel speeds are currently 2 Gb/s, 4 Gb/s, or 8 Gb/s. Fibre Channel can be deployed as point-to-point, where two devices are connected directly together; as an arbitrated loop, where all the devices are in a ring; or as a switched fabric built from Fibre Channel switches. Fibre Channel is quite flexible, and a SAN that uses Fibre Channel can be deployed in quite a few different ways.

 

Because a SAN attached disk can be accessed by more than one server on its SAN at the same time, parallel file systems are often installed on SAN attached RAID devices. Parallel file systems allow more than one server to access the same file system at the same time, as opposed to local file systems, which allow only one system to use the file system at any given time. SANs can be used to store local file systems for single hosts as well. This can provide flexibility and ease of administration, but the cost can be prohibitive for all but larger enterprise deployments.

 

 

SAN implementations are normally accomplished by connecting one or more storage devices, usually RAID systems, to one or more Fibre Channel switches. Each host that will access the shared disks is also connected to a switch using a Fibre Channel host bus adapter, or HBA. Depending on the number of hosts and disk sub-systems in the design, this configuration can add significant cost. The most common small Fibre Channel switch has eight ports, and larger switches cost more. Full path redundancy is also available: additional Fibre Channel switches are purchased, and multiple ports on the disk sub-systems and hosts are connected to different switches. This configuration uses multipath technology to present the same storage device over more than one link, so that the failure of a single Fibre Channel switch, HBA, or fiber cable will not interrupt operations. Needless to say, redundancy can significantly increase the solution cost.
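
The redundancy argument above can be checked with a small model. The Python sketch below is a simplified illustration, not a description of any vendor's multipath software: it treats a path as one HBA, one switch, and one RAID controller port, and tests whether at least one path survives the loss of any single component. The component names (`hba0`, `switchA`, `ctrl0`, and so on) and both helper functions are hypothetical. A cable failure is treated as equivalent to losing the HBA or controller port it plugs into.

```python
def enumerate_paths(hba_links, array_links):
    """List every (hba, switch, array_port) path through the fabric.

    hba_links maps each host HBA to the switch it is cabled to.
    array_links maps each RAID controller port to its switch.
    """
    return [
        (hba, switch, port)
        for hba, switch in hba_links.items()
        for port, port_switch in array_links.items()
        if switch == port_switch
    ]


def survives_any_single_failure(hba_links, array_links):
    """True if removing any one component still leaves a working path."""
    all_paths = enumerate_paths(hba_links, array_links)
    if not all_paths:
        return False
    components = {part for path in all_paths for part in path}
    return all(
        any(failed not in path for path in all_paths)
        for failed in components
    )


# Hypothetical single-switch design: one HBA, one switch, one array port.
single = survives_any_single_failure(
    {"hba0": "switchA"},
    {"ctrl0": "switchA"},
)

# Hypothetical fully redundant design: two of everything.
dual = survives_any_single_failure(
    {"hba0": "switchA", "hba1": "switchB"},
    {"ctrl0": "switchA", "ctrl1": "switchB"},
)

print("single fabric survives one failure:", single)
print("dual fabric survives one failure:", dual)
```

The single-switch layout fails the check because every component sits on the only path, while the dual layout passes because the two paths share no hardware. That independence is what the extra switches, HBAs, and controller ports pay for.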

 

 

Some RAID systems provide direct Fibre Channel ports on the unit. These systems typically support four connections, allowing you to connect four hosts directly to the RAID system itself. This is a very useful configuration for implementing a parallel file system such as GFS, and it enables each server connected to the RAID to be used as an NFS server to the rest of your cluster.

 

SANs are extremely scalable. Exabyte-sized storage systems are possible, and other interconnects, such as InfiniBand or Ethernet, can be used as the SAN interconnect instead of Fibre Channel. InfiniBand is quite common in the HPC environment but is not deployed very widely in enterprise markets, so many SAN connected systems do not yet support InfiniBand as their SAN interconnect. Gigabit Ethernet and 10 Gigabit Ethernet can also be used as SAN backbone interconnects. Gigabit Ethernet is too slow to support high performance SAN deployments, but 10 Gigabit Ethernet can match current Fibre Channel performance and is becoming less unusual in SAN designs.
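
To put the interconnect speeds mentioned above side by side, the following Python sketch computes a best-case time to move a data set over each link at its nominal line rate. The 1 TB data set size and the `transfer_seconds` helper are assumptions made for illustration. The calculation ignores encoding, protocol, and file system overhead, so real transfers will take longer than these figures.

```python
# Nominal line rates in gigabits per second, as quoted in the text.
NOMINAL_GBPS = {
    "Gigabit Ethernet": 1,
    "2 Gb/s Fibre Channel": 2,
    "4 Gb/s Fibre Channel": 4,
    "8 Gb/s Fibre Channel": 8,
    "10 Gigabit Ethernet": 10,
}


def transfer_seconds(num_bytes, gbps):
    """Best-case seconds to move num_bytes over a link of gbps gigabits/s."""
    return (num_bytes * 8) / (gbps * 1e9)


DATASET_BYTES = 10**12  # hypothetical 1 TB data set

for name, gbps in NOMINAL_GBPS.items():
    minutes = transfer_seconds(DATASET_BYTES, gbps) / 60
    print(f"{name:22s} {minutes:7.1f} min")
```

Under these assumptions, Gigabit Ethernet needs roughly 133 minutes for the transfer, while 8 Gb/s Fibre Channel and 10 Gigabit Ethernet need roughly 17 and 13 minutes respectively, which is consistent with the comparison made above.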

 




 
