Global namespace: The future of distributed file server management, part 1. (Storage Networking).
Today's file systems were designed to take advantage of the storage architecture of the 1960s, which was far less distributed and characterized by static links between clients and storage volumes. In the 1990s, however, the world of file storage became distributed, making file system management more complex and costly. What is needed is a new approach to file management, one that simplifies file management for users as well as administrators.
File Management Challenges
The explosive growth of data storage and the massive proliferation of file servers and NAS appliances have created a management nightmare for company data centers and storage administrators. Every server and filer is a storage island, an independent file system that requires individual management on a regular basis. Companies are looking for ways to simplify file system management and reduce storage management costs.
Corporate users and applications find it difficult to navigate today's file systems--they must map (or mount, in the UNIX world) the shares (or exports) they access while carrying out their tasks (Figure 1). User issues with the traditional file system paradigm include the following:
* Users must know where files are located
* Users must map to multiple volumes in order to access required data
* Cross-volume data searches are difficult
* This configuration is not highly reliable, available, or scalable
* There are many single points of failure throughout the distributed file system
* If files are moved or storage reconfigured, user access may be interrupted, as their shortcuts and login scripts must be modified to access files in their new location
For administrators, the issues are even greater. The following everyday tasks make efficient file storage management difficult and drive up hardware and administrative costs:
Adding file servers. Adding the first server (whether NAS filer, DAS, or server connected to a SAN) is easy -- it contains its own file system and is easily implemented in the network. When a second NAS box is added, the administrator must once again set up network shares and inform users of its existence so that they can mount/map to it. Each successive NAS addition repeats this administrative setup and adds complexity for administrators and users.
Load balancing storage and migrating data. If an administrator determines that one file system is 100 percent utilized and another is underutilized, he or she cannot move data from one to the other without affecting users.
Preparing for disaster recovery. Administrators are tasked with making file systems highly available and recoverable without interrupting business, which is exceedingly difficult in a distributed environment.
Reconfiguring file systems. Users are aware of the name and location of the servers on which their data resides, as they must attach to each machine in order to access files. This complicates the administrator's job, as he or she cannot add, move, migrate, or rebalance storage without interrupting users' access to it. In addition, all such changes require some reconfiguration of the client machine. The hard dependencies between users and physical storage make effective, efficient file system management an impossible dream in today's environment.
The Answer: Global Namespace
A namespace is a logical layer that is inserted between clients (users and applications) and file systems. It provides a method of viewing and accessing files that is independent of the physical file locations. This is a powerful concept, as it means an administrator can use a namespace to logically arrange and present data to users, irrespective of where the data is located. It also gives administrators the ability to add, change, move, and reconfigure physical file storage without affecting how users view and access it.
The goals of a namespace are to: 1) shield users from the complexities of the storage architecture, and 2) enable the administrator to manage the physical layer without affecting how users access files. A namespace is a means of "pooling" multiple file systems into a single, global file system. Ideally, a global namespace can pool storage from multiple, heterogeneous storage types (DAS, SAN, or NAS), and across different storage platforms (Windows and UNIX).
With a global namespace in place, the administrator is able to distribute files in a way that achieves best performance and capacity utilization, and clients access them via the logical namespace. When storage is added or consolidated, and files are moved or renamed, clients are automatically redirected to the files in their new location without ever having to know (or care) that they were moved.
The global namespace approach dramatically simplifies file management, as it frees the administrator to add, move, change, and reconfigure physical storage without affecting how clients access it. Most important, it permanently eliminates any need for desktop reconfiguration, drive letter remapping, or login script modification when storage is reconfigured.
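The idea of a logical layer between clients and file systems can be sketched in a few lines of Python. This is only an illustrative model, not how any particular product works: the `GlobalNamespace` class, its method names, and the server and share names are all hypothetical. It shows a table of logical folders, each pointing at a physical share, and a lookup that translates a logical path into its current physical location.

```python
from typing import Dict, Optional


class GlobalNamespace:
    """Toy model of a namespace: logical folders mapped to physical shares."""

    def __init__(self) -> None:
        # logical folder -> physical share (all names are hypothetical)
        self._links: Dict[str, str] = {}

    def add_link(self, logical_folder: str, physical_target: str) -> None:
        """Publish a physical share under a logical folder."""
        self._links[logical_folder.rstrip("/")] = physical_target.rstrip("/")

    def resolve(self, logical_path: str) -> Optional[str]:
        """Translate a logical path to a physical one, or None if unmapped."""
        best = None
        for folder in self._links:
            inside = logical_path == folder or logical_path.startswith(folder + "/")
            # Prefer the longest matching logical folder.
            if inside and (best is None or len(folder) > len(best)):
                best = folder
        if best is None:
            return None
        return self._links[best] + logical_path[len(best):]


ns = GlobalNamespace()
ns.add_link("/marketing/brochures", "//nyc-server/brochures")
ns.add_link("/marketing/presentations", "//london-server/presentations")

print(ns.resolve("/marketing/brochures/2003/spring.pdf"))
# prints //nyc-server/brochures/2003/spring.pdf
```

Clients in this model only ever handle the logical path; the physical server name appears solely inside the link table that the administrator controls.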
Global Namespace Example
To illustrate the power of a global namespace, let's look at how the members of the Coastal Services marketing department currently access their marketing files. In Figure 2, you will note that every marketing user and marketing application is mapped to shares on drive letters E:, F:, G:, and H:, which represent file servers in NYC, London, and Houston. These users have become accustomed to finding presentation files on the London server, brochures on the NYC server, and so on.
Suppose Server 4 becomes full, and the Sales data has to be moved to a new file server (Server 5). The administrator must migrate the data outside business hours and remap every marketing user's desktop, modify user login scripts, and update every marketing application to access Sales information on the new server. In addition, any shortcuts users may have created to access Sales information will no longer work.
Now let's look at how a global namespace changes the whole file management paradigm. In Figure 3, the administrator has created a global namespace through which Coastal Services marketing users access their marketing data.
Users no longer need to know where the data physically resides, as they access it through an intuitive, logically arranged namespace. Better yet, when Server 4 becomes full, and the Sales data is moved to a new Server 5, users are not affected at all. They continue to access Sales files through the namespace without knowing they were moved. A namespace allows the administrator to easily manage distributed data, and scale, migrate and load-balance storage without affecting users, and without having to reconfigure desktops and applications. This translates into significant time and cost savings for organizations with large, distributed file systems. And it creates a better user experience as well.
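The Server 4 to Server 5 move can be modeled as a single change to the namespace's link table. The sketch below is a simplified illustration under assumed names: the `links` dictionary, the `physical_path` and `retarget` helpers, and the share paths are hypothetical, and the actual copying of data between servers is outside its scope.

```python
from typing import Dict

# Hypothetical link table: logical folder -> physical share.
links: Dict[str, str] = {
    "/marketing/sales": "//server4/sales",
    "/marketing/brochures": "//nyc-server/brochures",
}


def physical_path(table: Dict[str, str], logical_path: str) -> str:
    """Translate a logical path using the first link that contains it."""
    for folder, target in table.items():
        if logical_path.startswith(folder + "/"):
            return target + logical_path[len(folder):]
    raise KeyError(logical_path)


def retarget(table: Dict[str, str], folder: str, new_target: str) -> str:
    """Point an existing logical folder at a new share; return the old one."""
    old_target = table[folder]
    table[folder] = new_target
    return old_target


# The path a user or application keeps using before and after the move.
client_path = "/marketing/sales/q1/forecast.xls"

before = physical_path(links, client_path)
retarget(links, "/marketing/sales", "//server5/sales")  # after data is copied
after = physical_path(links, client_path)

print(before)  # //server4/sales/q1/forecast.xls
print(after)   # //server5/sales/q1/forecast.xls
```

The client-side path is identical in both lookups, which is why no drive mappings, login scripts, or shortcuts need to change in the namespace scenario.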
Global Namespace -- Distributed File Server Management for the Future
Using a global namespace to manage multiple file servers provides tremendous flexibility and delivers significant benefits to the organization, administrators, and users:
* Better utilize distributed file storage capacity
* Gain administrative flexibility by shielding users from changes to physical storage
* Easily support and administer heterogeneous storage devices
* Protect users from the complexity of the storage infrastructure
* Increase data availability
* Seamlessly accommodate growth and changes in storage requirements
* Decrease storage management costs
* Provide flexibility for administrators to move data and add, change, or consolidate storage devices without affecting users
* Give administrators the ability to manage and scale logical and physical storage components independently
* Add new storage to the file system without having to reconfigure the namespace or users' desktops
* Consolidate servers and rebalance data across devices easily and transparently to users
* Increase data availability, making it easier for the administrator to ensure the organization's readiness for disaster recovery
* Make it easier for users and applications to find and access data
* Make data location transparent to users and applications
* Improve user productivity, because file system changes do not interrupt user access or require desktop reconfiguration
In addition to the benefits mentioned above, a global namespace provides the infrastructure for critical enterprise storage solutions, including rapid disaster recovery, increased server availability, transparent data migration, rapid SAN deployment, and server/storage consolidation. Each of these solutions delivers significant, immediate benefits to an organization by increasing data availability and reducing the cost of storage.
Implementing a Global Namespace
A global namespace is a concept that clearly delivers significant benefits. So what does it take to implement one? The basic infrastructure needed to implement a global namespace is already included in the Windows and UNIX operating systems. That means an organization does not have to change its existing environment in order to deploy a namespace, but can make a seamless transition from its current environment to one using a global namespace. In Windows environments, a global namespace can be implemented using Microsoft DFS, and in UNIX environments, using Automounter and NIS+. What is still needed is a solution that allows creation of a common namespace across multiplatform environments.
As an organization considers implementing a global namespace, there are several other key considerations:
* What tools are available to enable namespace creation and population?
* How do you synchronize the namespace with the underlying physical storage?
* How will you monitor and manage the namespace?
* How do you back up and restore the namespace?
* If the namespace resides on a single server, it is potentially a single point of failure in the file system. How do you avoid single points of failure?
* Namespace design considerations: What is the organizing principle (e.g., job function, geographic, division, or other)? How large should the namespace be? How complex?
* When it becomes necessary to change or reconfigure the namespace, how will you manage it?
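The synchronization question in the list above can be made concrete with a small consistency check. The following is a minimal sketch under stated assumptions: the `audit` function, the link table, and the set of shares are hypothetical, and in practice the list of existing shares would have to come from querying the file servers themselves. It reports logical folders whose target share no longer exists and shares that are not published in the namespace.

```python
from typing import Dict, List, Set, Tuple


def audit(links: Dict[str, str], existing_shares: Set[str]) -> Tuple[List[str], List[str]]:
    """Compare the namespace with physical storage.

    Returns (dangling, unpublished):
      dangling    - logical folders whose target share is missing
      unpublished - shares that exist but are not linked in the namespace
    """
    dangling = sorted(f for f, target in links.items() if target not in existing_shares)
    unpublished = sorted(existing_shares - set(links.values()))
    return dangling, unpublished


# Hypothetical namespace contents and share inventory.
links = {
    "/marketing/sales": "//server5/sales",
    "/marketing/brochures": "//nyc-server/brochures",
    "/marketing/logos": "//server2/logos",
}
existing_shares = {
    "//server5/sales",
    "//nyc-server/brochures",
    "//server3/archive",
}

dangling, unpublished = audit(links, existing_shares)
print(dangling)     # ['/marketing/logos']
print(unpublished)  # ['//server3/archive']
```

A check of this kind could run on a schedule as part of namespace monitoring, flagging drift before users encounter a broken folder.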
The good news is that there are solutions available today that make it easy to create and populate, monitor and manage namespaces of any size. We'll discuss these solutions in "Global Namespace -- File System of the Future, Part 2."
Rahul Mehta is CEO of NuView Inc. (Houston, TX).
Publication: Computer Technology Review
Date: Feb 1, 2003