Delivering capacity and performance
Centralized storage caching and global
namespaces work together
by Gary Orenstein
Network-attached storage (NAS)
environments have evolved significantly from
the early deployments for limited
departmental file sharing. Today, NAS has
grown to support key enterprise applications
such as databases, financial analytics,
design automation, simulation, business
intelligence and the majority of scalable
Web applications.
The rapid evolution of NAS to support
enterprise-wide solutions requires
consolidating multiple file systems across
storage devices. This technique is known as
a global namespace: multiple NAS devices are
linked together so that servers need to
access only one file system, which may be
distributed across multiple devices.
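As a rough illustration of the idea, a global namespace can be pictured as a table that maps parts of one logical directory tree onto exports held by separate NAS devices. The sketch below is hypothetical: the device names, export paths and the `resolve` helper are invented for this example and do not describe any particular product.

```python
# Hypothetical mount table: one logical tree, three physical NAS devices.
# All names and paths here are assumptions made for illustration.
MOUNT_TABLE = {
    "/projects": ("nas1", "/export/projects"),
    "/projects/archive": ("nas2", "/export/archive"),
    "/home": ("nas3", "/export/home"),
}


def resolve(path):
    """Map a logical path to (device, physical_path) by longest matching prefix."""
    best = None
    for prefix in MOUNT_TABLE:
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise FileNotFoundError(path)
    device, export = MOUNT_TABLE[best]
    # Keep the part of the path below the matched prefix unchanged.
    return device, export + path[len(best):]


print(resolve("/projects/chip/a.v"))
print(resolve("/projects/archive/2008/a.dat"))
```

The client sees only paths such as /projects/archive/2008/a.dat; which device actually holds the file is decided by the table.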
Global namespaces can help NAS users
conquer capacity management, but they do not
directly improve performance for storage
systems. In fact, some global namespace
implementations can hamper performance due
to ballooning directory structures. New
solutions based on centralized storage
caching, however, can improve the
performance of global namespaces with
additional I/O operations per second and low
latency response. The result is the capacity
advantages of a consolidated file system
without the performance penalties.
The flexibility of a centralized file
system enables the rapid addition of new
clients. Early NAS devices, however, could
only expand to a finite capacity, and
additional storage requirements mandated the
deployment of a new device. That device, in
turn, had to have its own unique file system,
requiring separate management and
administration. For many IT managers, the
proliferation of unique NAS devices led to
an unwieldy number of file systems and a
delicate balancing act for storage
management. This process is akin to
operating a computer with a dozen or more
individual disk drives, requiring a search
through every drive each time a user wants
to find a file.
Understandably, larger-scale NAS
solutions were held back by an “island-like”
management approach. This dilemma led to the
development of global namespaces, which
provide an abstraction layer to aggregate
multiple unique file systems into a single,
coherent, shared file system. Global
namespaces can be implemented through
appliances within a network environment or
as part of the NAS storage layer. Typically,
parallel or clustered file systems use
global namespaces to aggregate large amounts
of storage capacity into an easily managed
pool.
Performance Challenges
By eliminating the need to micromanage
individual file systems, a global namespace
removes previous limitations on adding new
clients and NAS devices. This provides an
unimpeded growth path to expand the client
and storage infrastructure.

Global namespaces can
simplify NAS client and
server expansion.
Global namespaces solve a capacity
management issue, but they are not directly
responsible for improving I/O performance.
Aggregating multiple NAS devices would
appear to deliver such a boost, yet several
factors create the opposite effect.
As global namespaces grow, the directory
information grows. In fact, large
directories present a performance challenge
in their own right. For example, finding a
file now means searching through a larger
file system, often referred to as “walking
the directory tree,” which adds significant
latency. Specifically, additional NFS
operations are required at each stage of the
process.
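To make the cost of a directory walk concrete, the sketch below estimates the lookup count for a path. It assumes one NFS lookup per path component, no client-side caching of directory entries, and a fixed network round-trip time per lookup. The function names and the sample path are hypothetical.

```python
def lookups_needed(path):
    """Count path components, assuming one NFS lookup per component."""
    return len([c for c in path.split("/") if c])


def walk_latency_ms(path, round_trip_ms):
    """Estimate walk latency, assuming every lookup costs one full round trip."""
    return lookups_needed(path) * round_trip_ms


deep_path = "/projects/chip/rev2/sim/run.log"
print(lookups_needed(deep_path), "lookups")
print(walk_latency_ms(deep_path, 0.5), "ms at 0.5 ms per round trip")
```

Under these assumptions, every extra directory level adds one more round trip before the first byte of file data is read.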
Global namespaces also impact
performance because they are primarily
disk-based. While aggregating disk drives
together can increase throughput (or
bandwidth), this architecture cannot
directly improve two other critical measures
of storage system performance: I/O
operations per second (IOPS) and latency.
Disks provide the greatest amount of
capacity, but due to the mechanical nature
of disk spindles, they are limited in the
overall amount of IOPS they can deliver.
Further, because each request includes head
seek time and the rotation of the magnetic
media, latency for disk-based requests can
be significant.
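A back-of-the-envelope calculation shows why a spindle caps IOPS. The sketch below assumes that each random request pays one average seek plus half a revolution of rotational delay, and it ignores transfer time and queuing. The seek time and spindle speed are illustrative values, not measurements from any specific drive.

```python
def rotational_latency_ms(rpm):
    """Average rotational delay: half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def random_iops(avg_seek_ms, rpm):
    """Approximate random IOPS for one disk under the assumptions above."""
    service_ms = avg_seek_ms + rotational_latency_ms(rpm)
    return 1000 / service_ms


# Illustrative figures only: 3.5 ms average seek, 15,000 rpm spindle.
print(round(random_iops(3.5, 15_000)), "random IOPS per disk (approx.)")
```

With these inputs a single disk lands in the low hundreds of random operations per second, and each request takes several milliseconds.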
Caching, on the other hand, makes use of
memory to deliver not only throughput, but
more importantly, high IOPS and ultra-low
latency. For I/O constrained applications,
this combination delivers application
performance improvements by significantly
increasing the number of transactions and
dramatically reducing the processing time.
The implementation of a global namespace
can provide relief from capacity management
headaches, but can also result in the need
for performance improvements.
Caching Improves Performance
New centralized storage caching
solutions directly boost I/O operations per
second and reduce access latency,
complementing the capacity-management
features of global namespaces. This
combination is ideal for customers who have
large data sets that require simplified
management and who need to access data
frequently with ultra-low latency in such
applications as databases, financial
analytics and simulations.
Centralized caching is emerging as the
high-performance component of the global
namespace.
Implementation of this solution involves
deploying one or more scalable caching
appliances that serve data from high-speed
RAM, offloading the conventional access to a
slower, mechanical disk. By implementing a
solution with caching, all data remains
protected on the persistent storage, and IT
managers can retain existing storage
management, backup, recovery, snapshot,
replication and provisioning features.
Most traffic to the application servers
is delivered from the caching appliance. For
data that has yet to be cached, the
appliance will retrieve it from the
persistent storage layer upon first request,
and then continue to serve the data from
cache.
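The behavior described here is commonly called a read-through cache. The sketch below is a minimal illustration under simplifying assumptions: a Python dictionary stands in for the persistent storage layer, and the class and attribute names are hypothetical rather than taken from any caching appliance.

```python
class ReadThroughCache:
    """Serve reads from memory, fetching from storage only on a miss."""

    def __init__(self, backend):
        self.backend = backend    # stand-in for persistent NAS storage
        self.cache = {}           # stand-in for the appliance's RAM
        self.backend_reads = 0    # how often storage had to be touched

    def read(self, key):
        if key not in self.cache:
            # First request: fetch from persistent storage and keep a copy.
            self.backend_reads += 1
            self.cache[key] = self.backend[key]
        return self.cache[key]


storage = {"/data/report.db": b"contents"}
appliance = ReadThroughCache(storage)
appliance.read("/data/report.db")   # miss: goes to storage
appliance.read("/data/report.db")   # hit: served from memory
print(appliance.backend_reads)
```

The data never leaves the persistent storage; the cache holds only a copy, which is why existing backup and replication features are unaffected.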
Caching by its very nature is dynamic,
and once installed, is a relatively
management-free process. Multiple
applications accessing different data sets
can benefit from a single caching appliance
because it continually makes the most
recently accessed data available. If the
active data set becomes larger than the
existing capacity of the appliance,
expansion can take place on the fly by
adding another appliance.
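One common way to keep recently accessed data available in a cache of fixed size is least-recently-used (LRU) eviction. The article does not state which policy caching appliances use, so the sketch below is an assumption for illustration, with hypothetical names and capacity measured in entries rather than bytes.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)   # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # drop the oldest entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.put("c", 3)    # cache is full, so "b" is evicted
print(list(cache.items))
```

When the active data set outgrows the capacity, entries are evicted before they are reused and the hit rate falls, which is the point at which more cache capacity pays off.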
Global namespaces provide a valuable
addition to large-scale NAS deployments by
streamlining capacity management. This helps
improve utilization, reduces manual data
movement and allows for easy expansion.
Centralized storage caching delivers the
performance boost on top of global
namespaces for data centers that are both
capacity- and performance-constrained.
Together, these two technologies maximize
the effectiveness and efficiency of
large-scale NAS deployments.
Gary Orenstein is vice president of
marketing at Gear6, Mountain View, Calif.