Clustered network storage.
SAN, NAS and Object-based storage (ObS) are the three categories of primary network storage. SAN systems support FC and/or iSCSI and are used for database, e-mail and file storage. NAS is typically used for file storage, file sharing, images and other unstructured data. ObS is a relatively new category of storage in open systems environments and often requires custom APIs for integration between applications and storage. ObS is predominantly used to implement unstructured data archives with the ability to prescribe and enforce data retention policies. Each of these storage categories has one or more vendors that offer solutions with clustered network architectures.
Clustered network storage (CNS) allows customers to add controller nodes, each of which adds bandwidth, cache memory, processors and in some cases capacity as well. Since it is still one logical system regardless of how many nodes are added to the cluster, management should not increase in complexity by any substantive measure. But is this scalability important? It depends on what your needs are. For some customers a 2-node cluster may be sufficient, while others may need to scale to 32 nodes. And there are several points in between. However, the advantage of being able to scale without disruption is compelling because it enables easy and cost-effective growth. And who among us is not going to grow?
Traditional storage systems have capacity limits that are rarely fully realized, since other resources, such as bandwidth and processing power, are likely to become the bottleneck first. This is one of the advantages of CNS: because each added node brings more of those resources, these solutions not only support large amounts of capacity but make it practical for you to use it. It is not uncommon for clustered network systems to support capacity levels matching enterprise-class storage systems.
Traditional NAS solutions support file systems of at most 16 TB today. Next-generation NAS solutions support file systems of 100 TB and beyond. Some clustered network solutions provide a single file system that spans multiple nodes within a cluster and can scale with no theoretical limit. Other solutions offer a hierarchy of aggregated file systems with a single top-level file system presented to system administrators. Both methods offer a scalable file system that simplifies management and improves performance.
ObS systems are designed to scale in capacity and, more importantly, in the number of objects archived. ObS is targeted at massive archives of data and as such should scale to hundreds of millions of objects, which translates to hundreds of TB of capacity.
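To see how object counts turn into capacity, here is a back-of-the-envelope calculation in Python. The average object size of 1 MB is purely an assumption chosen for illustration; real archives vary widely, and the function name is hypothetical.

```python
def archive_capacity_tb(object_count: int, avg_object_bytes: int) -> float:
    """Return the raw capacity, in decimal terabytes, an archive would need."""
    return object_count * avg_object_bytes / 1e12


# Hypothetical archive: 300 million objects averaging 1 MB each (assumed figures).
print(archive_capacity_tb(300_000_000, 1_000_000))  # 300.0
```

Under that assumption, 300 million objects come to roughly 300 TB before any protection or replication overhead is counted.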
Providing a coherent cache among all nodes within a cluster is extremely difficult, and just a handful of storage systems support it today. Each node in the cluster has its own memory, which temporarily stores data for faster access. Memory data accesses are hundreds of times faster than disk. The challenge in a clustered environment is making sure that cached data in memory can survive a node failure and that each node knows where the cached data resides. Enabling this cache coherency through a large cluster requires extremely efficient communication between all nodes. Cache coherency results in high performance that scales in a linear fashion.
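The bookkeeping behind cache coherency can be illustrated with a toy model. The sketch below is not any vendor's design; it assumes a simple write-invalidate scheme with a shared directory and write-through to disk, and the `Cluster` class and its methods are hypothetical names. It shows only the core idea: every node must know who holds a cached block, and stale copies must be discarded on a write.

```python
class Cluster:
    """Toy write-invalidate cache shared by several nodes (illustrative only)."""

    def __init__(self, node_count: int):
        self.disk = {}                                      # block id -> data
        self.caches = [dict() for _ in range(node_count)]   # one cache per node
        self.directory = {}                                 # block id -> nodes caching it

    def read(self, node: int, block: str):
        cache = self.caches[node]
        if block not in cache:                              # miss: fetch from disk
            cache[block] = self.disk.get(block)
            self.directory.setdefault(block, set()).add(node)
        return cache[block]

    def write(self, node: int, block: str, data) -> None:
        # Invalidate every other node's copy so none can serve stale data.
        for other in self.directory.get(block, set()) - {node}:
            self.caches[other].pop(block, None)
        self.disk[block] = data                             # write-through, so a node
        self.caches[node][block] = data                     # failure loses no data
        self.directory[block] = {node}


cluster = Cluster(3)
cluster.write(0, "b1", "v1")
print(cluster.read(1, "b1"))  # v1
cluster.write(2, "b1", "v2")
print(cluster.read(1, "b1"))  # v2
```

Even in this toy, every write touches the directory and potentially every other node, which hints at why efficient inter-node communication matters as the cluster grows.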
A single pool of storage means that data is stored across all of the drives within the storage system, unlike traditional storage systems, which confine data to the finite number of drives within a RAID parity group. Having a single pool enables greater performance since every hard drive in the storage system is working in concert to read and write data. Having a single pool essentially eliminates the need for performance tuning.
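The effect of pooling can be sketched with a simple round-robin placement model. The drive counts (8 in a RAID group, 96 in the pool) and the block count are assumptions picked for illustration, and real systems use more sophisticated layouts than the hypothetical `stripe` function below.

```python
from collections import Counter


def stripe(block_count: int, drive_count: int) -> Counter:
    """Place blocks round-robin and count how many land on each drive."""
    return Counter(block % drive_count for block in range(block_count))


raid_group = stripe(960, 8)     # volume confined to one 8-drive parity group
single_pool = stripe(960, 96)   # same volume spread over all 96 drives

# Blocks each drive must serve: fewer per drive means more parallelism.
print(max(raid_group.values()), max(single_pool.values()))  # 120 10
```

In this model each drive in the pool handles a twelfth of the work that a drive in the 8-drive group would, which is the intuition behind "every hard drive working in concert."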
Some storage systems have controllers that include processors, cache memory, host interfaces and internal hard disk drives (integrated). Other storage systems have controllers that are separate from the disk drives and use an external array to house them (external). The advantage of an integrated storage system is that every node comes with all hardware resources in one streamlined footprint. That is also its disadvantage. The number of drives you can fit into a single node does not scale as well as an external array. Some customers may not need as much performance scalability as they do capacity. The price per MB goes up with integrated solutions because whenever you acquire capacity you always acquire processors, memory, etc.
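The cost trade-off can be made concrete with a small comparison. All prices and capacities below are invented for illustration and do not reflect any real product; the function names are hypothetical, and the result is expressed per TB rather than per MB for readability.

```python
import math


def integrated_cost(capacity_tb: float, node_tb: float, node_price: float) -> float:
    """Every capacity increment is a full node (controller plus drives)."""
    nodes = math.ceil(capacity_tb / node_tb)
    return nodes * node_price


def external_cost(capacity_tb: float, controllers: int, controller_price: float,
                  shelf_tb: float, shelf_price: float) -> float:
    """Controllers are bought once; capacity grows by adding drive shelves."""
    shelves = math.ceil(capacity_tb / shelf_tb)
    return controllers * controller_price + shelves * shelf_price


# Assumed figures: 100 TB needed, 10 TB per node or per shelf.
integrated = integrated_cost(100, node_tb=10, node_price=50_000)
external = external_cost(100, controllers=2, controller_price=30_000,
                         shelf_tb=10, shelf_price=20_000)
print(integrated / 100, external / 100)  # 5000.0 2600.0
```

With these assumed numbers the integrated design costs nearly twice as much per TB, because ten sets of processors and memory were purchased to reach the capacity target. A customer who needs the extra performance would, of course, weigh that differently.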
Some CNS systems are designed to be future-proof by using commodity-based server technology for their controller nodes. Some solutions allow customers to upgrade to controllers with newer processors, memory and even disk drives when their next-generation platforms become available. They can add a new node with the latest advancements to the cluster and migrate data to it transparently. They can continue this process until the entire cluster has been upgraded, and the entire operation can be done on-line without disruption.
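The node-by-node upgrade described above follows a simple loop: join, migrate, retire. The sketch below models it with a dictionary mapping node names to the data blocks they hold; the function and the node names are hypothetical, and a real system would migrate data incrementally rather than in one step.

```python
def rolling_upgrade(cluster: dict[str, list[str]],
                    new_nodes: list[str]) -> dict[str, list[str]]:
    """Replace old nodes one at a time; every block stays on a live node."""
    cluster = {name: list(blocks) for name, blocks in cluster.items()}
    for old, new in zip(list(cluster), new_nodes):
        cluster[new] = []                    # 1. new node joins the cluster
        cluster[new].extend(cluster[old])    # 2. data migrates off the old node
        del cluster[old]                     # 3. old node is retired
    return cluster


before = {"old1": ["a"], "old2": ["b", "c"]}
print(rolling_upgrade(before, ["new1", "new2"]))
# {'new1': ['a'], 'new2': ['b', 'c']}
```

At no point in the loop is a block held only by a node that has already been removed, which is what allows the cluster to keep serving data throughout.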
Not every solution supports all of the concepts discussed in this article. Ask the questions and delve a bit deeper. Additionally, consider what solution fits your particular needs.
ESG believes that clustered network storage is an evolution beyond traditional active-active solutions. The value it brings is obvious and easy to quantify. Eventually every storage system dealing with primary storage will need to support some level of clustering beyond active-active. As RAID and caching have become requisite, one of the next core functions within storage systems will be N-way clustering.
Tony Asaro is a senior analyst for the Enterprise Strategy Group (Milford, MA).
Title Annotation: first in/first out
Publication: Computer Technology Review
Date: Jun 1, 2005