NAS and SAN Technology
NAS and SAN storage solutions can serve large data volumes at higher speeds than a single master or storage node disk configuration can.
A Storage Area Network (SAN) configuration is often required to implement a parallel file system, because more than one host needs direct access to the disks.
Network Attached Storage (NAS) solutions can serve NFS shares at performance levels that are not possible using master or storage node local disk configurations, and some NAS solutions can be used to serve parallel file systems as well.
Both SAN and NAS solutions use RAID configurations, just as a master or storage node local disk share does, but they are specifically designed to do a single job: share their data with other computers.
NAS solutions are most often used to present remote file systems to clients (the client mounts the file system from the NAS system and does not see the disks), while SANs are most often used to provide remote block level storage to their clients (the client manages the file system on top of the exported disks). Hybrid NAS/SAN systems exist as well, which combine the technologies to provide particular performance advantages or reliability characteristics not provided by a single technology.
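The difference between file level and block level access can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a temporary directory stands in for a mounted NAS export, an ordinary image file stands in for a disk exported by a SAN, and the names `read_file`, `read_block`, and `BLOCK_SIZE` are hypothetical helpers invented for this example.

```python
import os
import tempfile

BLOCK_SIZE = 512  # bytes per block; an assumed value for this sketch


def read_file(mount_point, relative_path):
    """File level access (NAS style): ask for a file by name under a mount."""
    with open(os.path.join(mount_point, relative_path), "rb") as f:
        return f.read()


def read_block(device_path, block_number, block_size=BLOCK_SIZE):
    """Block level access (SAN style): fetch one fixed-size block by number."""
    with open(device_path, "rb") as dev:
        dev.seek(block_number * block_size)
        return dev.read(block_size)


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        # Stand-in for a NAS export: the client only sees files and names.
        mount = os.path.join(workdir, "nas_mount")
        os.mkdir(mount)
        with open(os.path.join(mount, "results.dat"), "wb") as f:
            f.write(b"simulation output")

        # Stand-in for a SAN disk: four raw blocks, each filled with its index.
        lun = os.path.join(workdir, "san_lun.img")
        with open(lun, "wb") as f:
            for i in range(4):
                f.write(bytes([i]) * BLOCK_SIZE)

        return read_file(mount, "results.dat"), read_block(lun, 2)


file_data, block_data = demo()
```

In the file level case the storage system decides where the bytes live; in the block level case the client receives raw blocks and needs its own file system to give them meaning.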
Network Attached Storage (NAS)
A NAS system is self-contained: it combines RAID, the local file system, network interfaces, and data sharing functionality into a single unit. NAS systems have customized user interfaces that simplify management and diagnostics, and they will almost always outperform local file based solutions. They are optimized for high speed data access, using custom hardware and software designs to serve data faster than a general-purpose server.
NAS systems are almost always easy to expand, and often incorporate customized file systems which make online expansion possible and quite easy to accomplish. Data replication between units is often supported for business continuance and disaster recovery, and NAS systems interface with standard backup systems. They almost always support more than one protocol for the data share. NFS and Microsoft CIFS support is de rigueur on all NAS solutions, and some implement additional parallel file system access as well. This multi-protocol support allows a single NAS to serve Windows systems, normal NFS clients, and high speed parallel file system clients all at the same time, presenting a unified data space across all the resources. This is a powerful feature, because every type of client works against the same data rather than against separate copies.
Clustered NAS systems combine more than one unit in order to increase performance and reliability. They can use a backend high speed network, or sometimes the data delivery network itself, to service data sharing needs between the units. Virtual IP or re-direct technologies can be used to load balance target systems across the different components of the NAS and remove or ameliorate network performance bottlenecks.
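As a rough illustration of the load balancing idea, the sketch below spreads clients across the units of a clustered NAS in round-robin order. It is not how any particular product implements virtual IP or redirect handling; the node addresses, client names, and the `assign_clients` helper are hypothetical and exist only for this example.

```python
from collections import Counter
from itertools import cycle


def assign_clients(clients, nodes):
    """Round-robin each client onto a NAS node; returns {client: node}."""
    if not nodes:
        raise ValueError("at least one NAS node is required")
    rotation = cycle(nodes)
    return {client: next(rotation) for client in clients}


# Hypothetical addresses for three units of a clustered NAS.
nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

# Hypothetical compute nodes that mount the NAS.
clients = [f"compute{n:03d}" for n in range(1, 10)]

assignment = assign_clients(clients, nodes)
load = Counter(assignment.values())  # how many clients landed on each unit
```

With nine clients and three units, each unit ends up serving three clients, so no single network interface carries all of the traffic.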
Aspen integrates and recommends several NAS solutions as part of our portfolio. Panasas provides a clustered NAS solution which scales extremely well, supports NFS and CIFS access, and also acts as a parallel NFS server to deliver high speed data access to your HPC solution. Terascala implements a hardened Lustre server appliance that not only gives superior performance, but also removes most of the pain from implementing a high performance Lustre solution. Isilon and BlueArc both implement extremely high performance NFS and CIFS serving, which scales extremely well in most HPC solutions.
NAS technology, and especially clustered NAS systems, combines extreme scalability, performance, high reliability, enterprise fail-over features, and ease of management to provide a very compelling answer to high performance and large scale HPC data access needs. Which NAS system is best for your specific needs can be quite a complex question. Your sales engineer will study your requirements and environment carefully before recommending a solution.
Storage Area Networks (SAN)
SANs are used to attach remote storage devices to computers. These devices could be disk arrays (the most common usage), tape libraries, or even optical backup systems. Even though the devices are available to more than one client system, each one appears to the operating system on the server as a local device. This contrasts with master or storage node disk configurations, which are classed as direct attached storage (DAS) and are available to only one system at any given time.
Storage Area Networks can be implemented via several different protocols. The most common protocol used in SAN implementations today is Fibre Channel. Fibre Channel was originally designed for HPC, and performs many of the same functions as HIPPI, which was commonly used by supercomputers in the late 1980s and 1990s to connect storage devices and network systems together. Fibre Channel uses the Fibre Channel Protocol to transport commands from other protocols, such as SCSI, across the fabric to access and manipulate the storage device. Today, Fibre Channel is most often implemented using 850 nm multimode fiber optic cables, although single mode fiber (for long distances) and even copper twisted pair media are available. Common Fibre Channel speeds are currently 2 Gb/s, 4 Gb/s, or 8 Gb/s. Fibre Channel can be deployed as point-to-point, where two devices are connected directly together; as an arbitrated loop, where all the devices are in a ring; or as a switched fabric, using Fibre Channel switches. Fibre Channel is quite flexible, and a SAN that uses it can be deployed in quite a few different ways.
Because a SAN attached disk can be accessed at the same time by more than one server on the SAN, parallel file systems are often installed on SAN attached RAID devices. Parallel file systems allow more than one server to access the same file system at the same time, as opposed to local file systems, which allow only one system to use the file system at any given time. SANs can also be used to store local file systems for single hosts. This can provide flexibility and ease of administration, but the cost can be prohibitive for all but larger enterprise deployments.

SAN implementations are normally built by connecting one or more storage devices, usually RAID systems, to one or more Fibre Channel switches. Each host that will access the shared disks is also connected to the switch using Fibre Channel host bus adapters (HBAs). Depending on the number of hosts and disk subsystems in the design, this configuration can add significant cost. The most common small Fibre Channel switch has eight ports, and larger switches cost more. Full path redundancy is also available: additional Fibre Channel switches are purchased, and multiple ports on the disk subsystems and hosts are connected to different switches. This configuration uses multipath technology to present the same storage device over more than one link, so that the failure of a single Fibre Channel switch, HBA, or fiber cable will not interrupt operations. Redundancy of this kind can significantly increase the cost of the solution.
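One way to reason about path redundancy is to list every path from a host to the storage device and ask which components appear on all of them. The sketch below does exactly that; it is a simplified model, not a multipath driver, and the component names (`hba0`, `switchA`, `ctrl0`, and so on) and both helper functions are hypothetical.

```python
def single_points_of_failure(paths):
    """Return the components that every path depends on.

    Each path is a tuple of component names, for example
    (HBA, switch, storage controller port).
    """
    if not paths:
        return set()
    return set.intersection(*(set(path) for path in paths))


def survives_any_single_failure(paths):
    """True if losing any one component still leaves a working path."""
    return bool(paths) and not single_points_of_failure(paths)


# One HBA, one switch, one controller port: no redundancy at all.
single_path = [("hba0", "switchA", "ctrl0")]

# Two HBAs and two controller ports, but both links share one switch.
shared_switch = [("hba0", "switchA", "ctrl0"), ("hba1", "switchA", "ctrl1")]

# Fully separate paths through two different switches.
full_redundancy = [("hba0", "switchA", "ctrl0"), ("hba1", "switchB", "ctrl1")]

results = {
    "single_path": survives_any_single_failure(single_path),
    "shared_switch": survives_any_single_failure(shared_switch),
    "full_redundancy": survives_any_single_failure(full_redundancy),
}
```

Only the layout with two independent switches survives every single failure; in the shared switch layout, `single_points_of_failure` reports the switch as the remaining weak point, which is why the additional switch is part of the cost of redundancy.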

Some RAID systems provide Fibre Channel ports directly on the unit. These systems typically support four connections, allowing you to connect four hosts directly to the RAID system itself. This is a very useful configuration for implementing a parallel file system such as GFS, and it enables each server connected to the RAID to be used as an NFS server for the rest of your cluster.
SANs are extremely scalable. Exabyte-sized storage systems are possible, and other interconnects, such as InfiniBand or Ethernet, can be used as the SAN interconnect instead of Fibre Channel. InfiniBand is quite common in HPC environments but not widely deployed in enterprise markets, so many SAN connected systems do not yet support it as their SAN interconnect. Gigabit Ethernet and 10 Gigabit Ethernet can also be used as SAN backbone interconnects. Gigabit Ethernet is too slow to support high performance SAN deployments, but 10 Gigabit Ethernet can match current Fibre Channel performance and is becoming more common in SAN designs.
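A back-of-the-envelope calculation shows why the choice of interconnect matters. The sketch below estimates how long it would take to move a dataset over a single link at the nominal rates named above. It assumes the link runs at its full nominal rate and ignores encoding, protocol, and disk overhead, so real transfers will be slower; the `transfer_seconds` helper and the 1 TB dataset size are assumptions made for this example.

```python
def transfer_seconds(size_bytes, link_gbps):
    """Ideal time to move size_bytes over a link of link_gbps gigabits/s."""
    if link_gbps <= 0:
        raise ValueError("link rate must be positive")
    return size_bytes * 8 / (link_gbps * 1e9)


# Nominal link rates in Gb/s, as named in the text above.
links = {
    "Gigabit Ethernet": 1.0,
    "2 Gb/s Fibre Channel": 2.0,
    "4 Gb/s Fibre Channel": 4.0,
    "8 Gb/s Fibre Channel": 8.0,
    "10 Gigabit Ethernet": 10.0,
}

dataset_bytes = 10**12  # 1 TB, an arbitrary example size

# Ideal transfer time in seconds for each interconnect.
estimates = {name: transfer_seconds(dataset_bytes, rate)
             for name, rate in links.items()}

for name, seconds in estimates.items():
    print(f"{name:22s} {seconds / 60:7.1f} minutes")
```

Under these idealized assumptions a single Gigabit Ethernet link needs roughly ten times as long as 8 Gb/s Fibre Channel or 10 Gigabit Ethernet to move the same data.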