Storage Networking And The Data Center Of The Future

Charting the emergence of the Direct Access File System

This is the first in a series of columns authored by members of the DAFS (Direct Access File System) Collaborative, a new industry group formed to create a protocol specification for direct, memory-to-memory data networking.

One of the daily battles we face in the IT industry is the data explosion. The need to provide increasing numbers of people with access to exponentially increasing amounts of data is driving significant changes in data center architectures and data access technologies. And this revolution is only just beginning, as ever-increasing processing power creates ever-increasing amounts of data.

The four basic drivers of this data explosion are computer-generated data, digital imaging, database expansion, and an increasingly web-connected population. Applications such as ECAD (electronic computer-aided design) and Hollywood special effects generate massive amounts of data. Digital sampling of the real world is exploding in the consumer space with digital photography, digital audio (think of all the MP3 files on the web), and the emergence of streaming video.
And that's in addition to the increasing use of high-resolution applications such as medical, satellite, and seismic imaging; a single image of the earth used in the oil prospecting business, for instance, can consume more than 500GB. Finally, sophisticated and high-powered database applications are creating a need to store massive amounts of information for and about enterprises. All these sources of data need to be made accessible to increasing numbers of people via intranets or the Internet. And those millions of Internet-connected people generate enormous amounts of data through email messages and web pages.

Consequently, the requirements on data centers, on the architecture of application software, and on the supporting data infrastructure are changing dramatically. This is best illustrated today by e-business data centers, where a three-tier architecture is being adopted: front-end web servers, application servers in the middle, and database servers at the back end, all connected in secure high-speed switched networks. This topology makes it much easier to modify, replace, or upgrade one tier without risking disruption of the others. It also facilitates scalability, load balancing, and performance tuning.

Not surprisingly, however, the focus of attention is not so much on the compute infrastructure as on the supporting data management infrastructure. The old days of choosing a server vendor and then automatically buying that vendor's storage solution are disappearing.
Today, data storage, access, and management are the strategic issue.

Storage Networking

So how do you manage all that data? A recent development to address this problem is storage networking. The promise of storage networks is that they can provide independently scalable data storage, centralized data consolidation, reliability and fault resilience, and simplified data management. Two approaches have emerged: Storage Area Networks (SANs) and Network Attached Storage (NAS).

With traditional direct-attached storage, the server moves disk blocks over a SCSI interconnect to a storage device (disk drive or RAID box). Storage Area Networks extend this paradigm to a switched Fibre Channel fabric, where encapsulated SCSI disk blocks are moved from the server to the storage pool over a Fibre Channel interconnect. The main advantage of this approach is fast, direct-to-memory access.
However, since they implement a block-mode architecture, SANs can be complex and inflexible, and they do not support heterogeneous data sharing.

In contrast, Network Attached Storage (NAS) uses standard file sharing protocols (NFS, CIFS, HTTP) accessed over a standard TCP/IP network. The storage in this case is not simply a block device; it is a special-purpose computer with an embedded file system. The advantages of this approach are multiprotocol and multiplatform data sharing, ease of use, and flexible data management.
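
The contrast between block-mode and file-mode access can be illustrated with a toy model. The following is a minimal sketch, not a real SAN or NAS implementation: the class names (BlockDevice, FileServer), the four-byte block size, and the in-memory allocation table are all assumptions made for illustration. It shows only that a block device hands back raw numbered blocks, while a file server keeps the mapping from a file name to its blocks.

```python
BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is easy to follow


class BlockDevice:
    """Hypothetical block-mode storage: numbered fixed-size blocks, no notion of files."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        # Pad short writes with zero bytes so every block has the same size.
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, index):
        return self.blocks[index]


class FileServer:
    """Hypothetical file-mode storage: an embedded file system over a block device."""

    def __init__(self, device):
        self.device = device
        self.next_free = 0
        self.table = {}  # file name -> (list of block indexes, length in bytes)

    def write_file(self, name, data):
        indexes = []
        for offset in range(0, len(data), BLOCK_SIZE):
            self.device.write_block(self.next_free, data[offset:offset + BLOCK_SIZE])
            indexes.append(self.next_free)
            self.next_free += 1
        self.table[name] = (indexes, len(data))

    def read_file(self, name):
        indexes, length = self.table[name]
        raw = b"".join(self.device.read_block(i) for i in indexes)
        return raw[:length]  # drop the zero padding of the last block


device = BlockDevice(num_blocks=8)
server = FileServer(device)
server.write_file("report.txt", b"hello world")

block_view = device.read_block(0)           # a raw block, meaningless without the layout
file_view = server.read_file("report.txt")  # the whole file, resolved by name
```

A client working in block mode has to know for itself that blocks 0 through 2 belong together; a client working in file mode asks for the file by name and leaves that bookkeeping to the server.
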
Historically, the limitations of NAS were associated with network speed and network overhead (such as TCP/IP packetization and operating system overhead). However, the industry's vast aggregate R&D investment in standard networking technology is rapidly removing those limitations. For instance, eight years ago a state-of-the-art SCSI disk interconnect was 100 times faster than an Ethernet connection. Six years ago, SCSI was 20 times faster than a state-of-the-art Ethernet connection. Four years ago, the fastest SCSI was four times faster than a switched Ethernet connection. Today, Fibre Channel runs about neck-and-neck with Gigabit Ethernet. Next year, 10 Gigabit Ethernet will surpass 2Gb Fibre Channel.

The distinctions between SAN and NAS are beginning to disappear as new technologies appear. An example is SCSI over IP (a block protocol over TCP/IP). Next year, the distinction between SAN and NAS will be further blurred by the emergence of a new protocol, the Direct Access File System (DAFS), which achieves file access over direct-to-memory interconnects (see figure).

Not SAN, Not NAS ... DAFS

Just as NFS maps file system semantics to standard TCP/IP network transports, DAFS maps file system semantics to next-generation system interconnects. DAFS uses the Virtual Interface (VI) architecture (or a VI-like architecture) as its underlying transport mechanism, enabling applications to access VI-capable hardware without operating system intervention and to carry out bulk data transfers directly to or from application buffers.
The result is low-latency, high-performance file access between clusters of application servers, database servers, and shared pools of storage in data center environments.

DAFS demonstrates that very soon the storage networking question will no longer be SAN versus NAS, but "what is the best data access method for my environment: block mode access or file mode access?" The members of the DAFS Collaborative believe that file mode access will become increasingly important in data center environments. Blocks simply contain data. Files contain data and file structure information (metadata). So, unlike block-mode storage, file systems know which blocks make up a file, which users have permission to access each file, which applications have locks on which blocks, and where a requested subset of the file is written to disk. Consequently, file mode access provides a much richer set of data management capabilities.

DAFS, we believe, will be an important underpinning for the data center of the future, enabling independently and massively scalable storage and compute power, resiliency to application server and file server failovers, multiprotocol data sharing, ease of management, and a heterogeneous, standards-based environment that promotes cost efficiency and vendor independence. What nobody can deny is that intelligent management of data storage will continue to be a strategic asset and a competitive weapon as big data and new applications create new models for delivering products and services that customers value.

David Dale is the co-chair of the DAFS Collaborative and industry evangelist at Network Appliance (Waltham, MA).

The DAFS Collaborative was formed in June 2000 by Network Appliance, Intel, and other leading systems and storage networking vendors, with the goal of making the Direct Access File System protocol available to the industry. The group is soliciting industry review and feedback before submitting the new file system to an appropriate standards body.
The DAFS Collaborative encourages broad industry participation. Additional participants are welcome and can join online.