
Wide area file sharing across the WAN.


Distributed enterprises virtually cover the globe. Remote offices are everywhere and remote office workers now far outnumber those who work out of central office locations. With this distribution of resources, today's companies must manage development efforts across multiple remote locations, which means that they must also somehow enable all remote office workers and team members, worldwide, to collaborate on the same shared files and data at the same time. Add to this the fact that file sizes and data storage requirements are increasing year after year, and the efficient sharing of files across distributed enterprises over the wide area network (WAN) has become a Herculean task.

The problem is that although gigabytes of data can easily be shared over a local area network (LAN) using standard file server technology, they cannot so easily be shared across remote offices connected over the WAN. In truth, standard file server protocols provide unacceptably slow response times when opening and writing files over the WAN, and this forces remote office IT managers to make some unappealing choices. IT managers and network users must either live with reduced productivity due to poor network performance at remote offices or use replication schemes that waste storage and inhibit global collaboration.

Recently, however, a new class of product known as wide-area file services (WAFS) has shown remarkable results in solving the problem of remote office collaboration for distributed organizations. WAFS allows companies with remote offices to utilize the WAN to share files as if it were a virtual LAN, enabling real-time, read/write access to shared files while also guaranteeing the consistency and coherency of all file data.

The most successful WAFS systems address inherent WAN file sharing issues through a multi-layered technology approach. In designing this technology, WAFS vendors begin by carefully looking at the file sharing protocols used in today's enterprise infrastructures. This is a key starting point, which bears a closer look.

Problems With File Sharing Over the WAN

All major file sharing protocols, including NFS (Network File System, for Unix/Linux environments), CIFS (Common Internet File System, for Windows environments), and IPX/SPX (Internetwork Packet Exchange/Sequenced Packet Exchange, for Novell environments), were designed for LAN environments where clients and servers were located in the same building or campus.

The assumption that the client and the server would be in close proximity led to a number of design decisions that do not scale across WANs. For example, these file-sharing protocols tend to be rather "chatty", which means that they send many remote procedure calls (RPCs) across the network to perform operations.

Let's take a closer look at the NFS protocol for an example of this type of "chatty" behavior. For certain operations on a filesystem using NFS (such as a synchronization of a source code tree), almost 80% of the RPCs sent across the network can be access RPCs, while the actual read and write RPCs typically comprise only 8-10% of the RPCs. Thus, roughly 80% of the protocol's round trips are spent simply trying to determine whether the NFS client has the proper permissions to access a particular file on the NFS server, rather than actually moving data.
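To see why this RPC mix matters once round-trip time grows, a rough back-of-the-envelope model helps. The sketch below assumes every RPC is issued serially and costs one full round trip, and that payload moves at line rate; the RPC count, payload size, and LAN figures are hypothetical values chosen for illustration, not measurements from this article.

```python
def transfer_time_s(total_rpcs, rtt_s, payload_bytes, bandwidth_bps):
    """Serial-RPC estimate: each RPC waits one round trip; payload moves at line rate."""
    latency_cost = total_rpcs * rtt_s
    bandwidth_cost = payload_bytes * 8 / bandwidth_bps
    return latency_cost + bandwidth_cost

RPCS = 10_000          # hypothetical RPC count for one source-tree sync
PAYLOAD = 5 * 10**6    # hypothetical 5 MB of actual file data

# Assumed links: LAN = 0.5 ms round trip at 100 Mbit/s; WAN = 60 ms round trip on a T1.
lan = transfer_time_s(RPCS, 0.0005, PAYLOAD, 100_000_000)
wan = transfer_time_s(RPCS, 0.060, PAYLOAD, 1_544_000)
wan_2x = transfer_time_s(RPCS, 0.060, PAYLOAD, 2 * 1_544_000)  # same WAN, double the bandwidth

print(f"LAN {lan:.1f} s, WAN {wan:.1f} s, WAN with 2x bandwidth {wan_2x:.1f} s")
```

Under these assumptions the latency term dominates the WAN total, so doubling the bandwidth shortens the operation by only a few percent.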

In a LAN environment, these RPCs do not impact performance significantly, but when combined with the high latency typical of WANs, these RPCs can be deadly to performance. Worse, remote clients often end up timing out and retransmitting the RPCs, compounding the inefficiency. Furthermore, because data movement RPCs make up such a small percentage of the communication, increasing network bandwidth will make little difference to the aggravated end user. Like NFS, CIFS and IPX/SPX suffer from issues of "chattiness" that negatively impact performance over the WAN.

Workarounds and Attempted Solutions

Various solutions and workarounds have been proposed to the WAN file-sharing problem, including replicating file copies and implementing distributed file systems, but neither approach has provided a complete solution. Enterprise content delivery networks (eCDNs) tried to mitigate this problem by caching copies of files at each remote office. But eCDNs, like web caching infrastructure, only provide a read-only copy of data at the remote office. If remote office users wanted to modify the file, they either had to go across the WAN to access the original copy and incur a major performance penalty, or update the local copy and create multiple, out-of-sync versions of the same file.

Filesystems developed over the last 15 to 20 years, such as AFS (Andrew File System), attempted to solve the WAN file-sharing problem using a distributed filesystem architecture that unites disparate file servers at remote offices into a single logical filesystem. The problem with these technologies is that they require substantial changes in IT architecture to work properly and also require remote-office applications to use entirely new protocols, because they do not export data using industry-standard protocols such as NFS or CIFS. With over 1 billion computers deployed in the world that access data using either CIFS or NFS and billions of dollars invested in current file server and NAS infrastructure, solutions that demand a new filesystem protocol are clearly untenable.

The bottom line is that for any WAFS solution to gain traction, it must be able to integrate itself with existing infrastructure rather than requiring new infrastructure to be built.

A Whole New Option

In spite of the failures of both caching technologies like eCDNs and distributed filesystems to address the central issues in WAN file sharing, these technologies do provide important components for solving the WAN file-sharing problem. New WAFS products combine distributed filesystems with caching technology to allow real-time, read-write access to shared file storage from any location, while also providing interoperability with standard file sharing protocols such as NFS and CIFS.

WAFS products enable transparent worldwide design collaboration on the same data set, without complicated replication schemes or slow network performance. WAFS products will cache files in a read-write mode at remote locations, thus speeding up data access for remote users tremendously. WAFS enables LAN semantics for file access to be extended to the entire enterprise.

WAFS systems usually consist of edge file gateway (EFG) appliances, which are placed at remote offices, and one or more central server (CS) appliances that allow storage resources to be accessed by the EFGs (see Figure 1).

Each EFG appears as a local fileserver to remote office users. Together, the EFGs and CS implement a distributed filesystem and communicate using a WAN-optimized protocol. This protocol is translated back and forth to NFS and CIFS at either end, to communicate with centralized storage and remote user applications.

3 Key Design Questions

When building a WAFS system, three key design questions must be addressed:

* What are the features of the optimized protocol run between the EFGs and CSes across the WAN?

* What specific optimizations have to occur in the system design for reading files?

* What is the specific architecture for writing files and moving updates back to central storage resources?

The protocol used between the remote offices and the datacenter should incorporate file-aware differencing technology, data compression, streaming, and other technologies to improve performance and efficiency in moving data across the WAN. File-aware differencing is especially important because it can detect which parts of a file have changed and move only those parts across the WAN. Furthermore, if pieces of a file have been rearranged, only offset information will be sent, rather than the data itself. These techniques result in tremendous, order-of-magnitude bandwidth reduction across the WAN and time savings in accessing files by remote users.
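The differencing idea can be illustrated with a minimal sketch. This is not any vendor's algorithm: it assumes fixed-size blocks matched by SHA-256 hash, and the names `make_delta` and `apply_delta` are invented for the example. Blocks the receiver already holds are sent as (offset, length) references, so rearranged data costs only offset information, and only genuinely new blocks travel as literal bytes.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny for the demo; a real system would use far larger blocks

def split_blocks(data, size=BLOCK_SIZE):
    return [data[i:i + size] for i in range(0, len(data), size)]

def make_delta(old, new, size=BLOCK_SIZE):
    """Describe `new` as references into `old` plus literal data for unseen blocks."""
    known = {}
    for idx, block in enumerate(split_blocks(old, size)):
        known.setdefault(hashlib.sha256(block).digest(), idx * size)
    delta = []
    for block in split_blocks(new, size):
        offset = known.get(hashlib.sha256(block).digest())
        if offset is not None:
            delta.append(("copy", offset, len(block)))  # offset information only
        else:
            delta.append(("data", block))               # changed bytes must cross the WAN
    return delta

def apply_delta(old, delta):
    """Rebuild the new file on the receiving side from the old copy and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

old = b"AAAABBBBCCCC"
new = b"CCCCAAAAXXXX"           # two blocks rearranged, one block changed
delta = make_delta(old, new)
literal_bytes = sum(len(op[1]) for op in delta if op[0] == "data")
```

Here only 4 of the 12 bytes are sent as literal data. Fixed-size blocks are a simplification: an insertion that shifts every later byte would defeat them, which is why practical tools rely on rolling checksums or content-defined block boundaries instead.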

Read performance is governed by the ability of the EFG to cache files at the remote office, and by its ability to serve cached data to users while minimizing the overhead of expensive kernel-user communication and context switches, in effect enabling the cache to act just like a high-performance file server. If the WAFS system is architected correctly, the remote cache should mirror the data center exactly; only a few WAN round trips are required to check credentials and the availability of file updates, while the read requests themselves are satisfied from the local cache. Thus, regardless of how many NFS/CIFS read RPCs come into the EFG, they should translate into hardly any WAN traffic.
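A minimal sketch of that read path follows, assuming a simple per-file version number as the freshness check. `FakeOrigin` and `EdgeReadCache` are hypothetical stand-ins for the central server and the EFG, not a real product interface.

```python
class FakeOrigin:
    """Hypothetical stand-in for the central server: path -> (version, data)."""
    def __init__(self, files):
        self.files = files

    def version(self, path):
        return self.files[path][0]

    def fetch(self, path):
        return self.files[path][1]

class EdgeReadCache:
    """Hypothetical EFG read cache that counts the WAN round trips it makes."""
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}           # path -> (version, data)
        self.wan_round_trips = 0

    def open(self, path):
        # One round trip to check for updates; a second only if the copy is missing or stale.
        self.wan_round_trips += 1
        version = self.origin.version(path)
        cached = self.cache.get(path)
        if cached is None or cached[0] != version:
            self.wan_round_trips += 1
            self.cache[path] = (version, self.origin.fetch(path))

    def read(self, path, offset, length):
        # Served entirely from the local cache: no WAN traffic.
        _, data = self.cache[path]
        return data[offset:offset + length]

origin = FakeOrigin({"/specs/a.txt": (1, b"hello world")})
efg = EdgeReadCache(origin)
efg.open("/specs/a.txt")                                   # cold: version check + fetch
chunks = [efg.read("/specs/a.txt", i, 1) for i in range(11)]
cold_trips = efg.wan_round_trips
efg.open("/specs/a.txt")                                   # warm: version check only
warm_trips = efg.wan_round_trips - cold_trips
```

In this sketch eleven read calls add nothing to the round-trip counter; only the two open calls touch the WAN.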

Unlike read performance, write performance is governed by the write caching mechanism that is used in the WAFS system. The two main types of mechanisms are known as write-back and write-through.

In a write-through approach, data written to a file is sent immediately over the WAN to the datacenter, while in write-back the data is first written to the EFG and sent over the WAN afterward. Either approach, in isolation, has certain associated tradeoffs. Write-through is very safe, because all file updates are stored in the datacenter, but it suffers from poor performance and does not survive WAN disruptions. Write-back is very fast, but is riskier if the EFG fails before updates are sent to the datacenter.

The optimal combination involves using a write-back approach for maximum performance, coupled with synchronous logging of file updates to persistent storage, ensuring no data loss in case of filesystem crashes or WAN outages. Write-back caching with logging is typically very difficult to implement correctly, but it has superior performance and reliability characteristics.
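The combination of write-back caching with synchronous logging might look roughly like the sketch below. It is an illustration under simplifying assumptions (a single append-only journal file, JSON records, and a hypothetical `send_to_datacenter` callable), not a description of how any shipping product does it.

```python
import base64
import json
import os
import tempfile

class WriteBackCache:
    """Acknowledge a write once it is journaled locally; ship it to the datacenter later."""
    def __init__(self, log_path, send_to_datacenter):
        self.log_path = log_path
        self.send = send_to_datacenter      # hypothetical callable: (path, offset, data)
        self.pending = self._replay()       # recover writes journaled before a crash

    def _replay(self):
        if not os.path.exists(self.log_path):
            return []
        with open(self.log_path, "r", encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def write(self, path, offset, data):
        record = {"path": path, "offset": offset,
                  "data": base64.b64encode(data).decode("ascii")}
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())            # synchronous logging: on disk before we return
        self.pending.append(record)

    def flush(self):
        # If send() raises (for example during a WAN outage), the journal is left intact.
        while self.pending:
            rec = self.pending[0]
            self.send(rec["path"], rec["offset"], base64.b64decode(rec["data"]))
            self.pending.pop(0)
        open(self.log_path, "w").close()    # everything delivered: truncate the journal

# Usage: discarding the first object simulates an EFG crash before any flush.
log_path = os.path.join(tempfile.mkdtemp(), "efg-journal.log")
received = []
cache = WriteBackCache(log_path, lambda p, o, d: received.append((p, o, d)))
cache.write("/designs/part.cad", 0, b"rev2")
recovered = WriteBackCache(log_path, lambda p, o, d: received.append((p, o, d)))
recovered.flush()
```

A crash partway through a flush would resend some records on restart; that is harmless here because writing the same bytes at the same offset twice gives the same result.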

Figure 2 depicts representative performance for both read and write operations over a WAFS system and over a standard WAN. Opening a 5-MB file over the WAN takes about 122 seconds, while a high-performing WAFS system will fetch the file in 11 seconds the first time it is accessed, and in about 3 seconds, essentially LAN speed, on subsequent (warm) accesses because the file is cached locally. Writing a 2-MB file over the WAN takes 81 seconds. A write-back WAFS system achieves the same result in about 4 seconds.

In all the tests shown in Figure 2, the WAN latency was 60 ms (representative of actual conditions between San Francisco and Houston) and the bandwidth allotted was 1.544 Mbit/s (a T1 line). Clearly, WAFS products can enable near-LAN speed read-write access to data in a WAN environment.
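As a rough sanity check on those figures, the line-rate transfer time on a T1 (1.544 Mbit/s) can be worked out directly. The sketch assumes 1 MB = 10^6 bytes and ignores all protocol overhead; it is arithmetic on the published numbers, not a reconstruction of the test setup.

```python
T1_BPS = 1_544_000  # T1 line rate in bits per second

def line_rate_seconds(size_bytes, bandwidth_bps=T1_BPS):
    """Time to push the raw bytes through the link, with no latency or protocol cost."""
    return size_bytes * 8 / bandwidth_bps

open_floor = line_rate_seconds(5 * 10**6)   # 5 MB file: about 25.9 s
save_floor = line_rate_seconds(2 * 10**6)   # 2 MB file: about 10.4 s

print(f"5 MB at line rate: {open_floor:.1f} s; 2 MB at line rate: {save_floor:.1f} s")
```

On that basis, most of the 122 seconds and 81 seconds reported for the standard network share cannot be explained by bandwidth alone, which fits the round-trip argument made earlier. The 11-second cold fetch and 4-second save both come in under the raw line-rate time, which would be consistent with compression or differencing on the read side and local acknowledgment on the write side.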

[FIGURE 1 OMITTED]

Maintaining Data Coherency and Consistency

Data coherency and data consistency are important properties of WAFS implementations, because they ensure that file updates are safe (cannot be written over) and available throughout the network of edge devices--crucial features for supporting engineering collaboration.

Data coherency means that file updates (writes) from any one remote office are guaranteed never to conflict with updates from another remote office. Properly designed WAFS implementations guarantee this by maintaining a system of file leases. Leases are defined as a particular access privilege to a file from a remote office.

If a user at a remote office wants to write to a cached file, the EFG at that office must first obtain a "write lease", i.e., a right to modify the document. WAFS solutions guarantee that at any time only one remote office holds the write lease on a particular file, thus guaranteeing coherence. Also, when a user at another office tries to open the file, the EFG that has the write lease flushes its data first and optionally can give up the write lease if there are no active writers to the file. This mechanism ensures that writers at different offices do not collide with each other and that file updates are safe.
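The lease rules just described can be sketched as a small piece of central bookkeeping. `LeaseManager` and its flush hook are hypothetical names for illustration; lease expiry, timeouts, and failure handling, which a real system would need, are left out.

```python
class LeaseManager:
    """At most one office holds the write lease for a given file at any time."""
    def __init__(self, flush_callback):
        self.holders = {}            # path -> office currently holding the write lease
        self.flush = flush_callback  # hypothetical hook: (office, path) -> None

    def acquire_write(self, office, path):
        holder = self.holders.get(path)
        if holder is None or holder == office:
            self.holders[path] = office
            return True
        return False                 # another office already holds the lease

    def open_for_read(self, office, path, holder_has_active_writers=True):
        holder = self.holders.get(path)
        if holder is not None and holder != office:
            self.flush(holder, path)             # holder pushes its updates first
            if not holder_has_active_writers:
                del self.holders[path]           # optional: give up an idle lease

flushed = []
leases = LeaseManager(lambda office, path: flushed.append((office, path)))
got_houston = leases.acquire_write("Houston", "/cad/wing.dwg")
got_sf_early = leases.acquire_write("San Francisco", "/cad/wing.dwg")
leases.open_for_read("San Francisco", "/cad/wing.dwg", holder_has_active_writers=False)
got_sf_later = leases.acquire_write("San Francisco", "/cad/wing.dwg")
```

In this run Houston gets the lease, San Francisco is refused while Houston holds it, and San Francisco succeeds only after Houston has flushed its data and released the idle lease.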

Data consistency implies that file updates made at one office are always available enterprise-wide, and well-architected WAFS systems make them available immediately after the update is made. Again, for collaboration, this is supremely important because remote designers want to be sure they are working on the most current version of any file, no matter where it was worked on last.

Scalable Implementation

Any WAFS implementation should be capable of handling large files and large numbers of files (particularly important for CAD), as well as large numbers of concurrent users. A WAFS product that cannot scale beyond a few hundred MB of files or 10 users is not of much use.

The issue of write-through and write-back architectures mentioned earlier figures into how well a WAFS implementation scales. A write-through WAFS implementation cannot scale to enterprise levels, because synchronous data transfers across the WAN quickly become a bottleneck as the number of files and users increases. Systems that are based on write-back architectures and that incorporate differencing and compression technologies scale much better.

Additionally, scalable systems recognize temporary files that applications (e.g. Microsoft Word) may create during the normal course of operation and do not send these files over the WAN, instead only sending over the final revision, which accelerates performance. Changes are reflected back to the datacenter in a consistent and coherent manner.

Summary

For companies with remote offices, sharing data in real time between team members at distributed locations has been a challenge. The culprit has been the poor performance and reliability of file-sharing protocols such as NFS and CIFS when used over the WAN. To date, workarounds to the problem have only exacerbated the issues. WAFS technologies have entered the industry to "save the day" by enabling enterprises around the world to share data over the WAN with the same performance, reliability, and peace of mind that they would have over a local area network.

[FIGURE 2 OMITTED]

Figure 2: Diagram showing performance of a WAFS system

Note: Tables made from bar graph.

Time to Open a 5 MB File *

Configuration                Time
WAFS system, warm cache      3 seconds
WAFS system, cold cache      11 seconds
Standard network share       122 seconds

Time to Save a 2 MB File *

Configuration                Time
WAFS system                  4 seconds
Standard network share       81 seconds


www.tacitnetworks.com

Noah Breslow is vice president of marketing at Tacit Networks (South Plainfield, NJ)
COPYRIGHT 2004 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2004, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Storage Networking; wide area network
Author:Breslow, Noah
Publication:Computer Technology Review
Geographic Code:1USA
Date:Aug 1, 2004
Words:2286