Storage Over IP: Death Of The SAN, Or New Beginning?
Weather on the SAN has been choppy at best, with sudden squalls and a constant undertow by those vendors who believe in controlling the marketplace and getting standards neatly aligned behind them. Still, a strong prevailing wind constantly blows in the direction of more open and more productive technology: LAN-free backup becoming server-less then NDMP-standardized, virtualization increasing capacity utilization and vendor-neutral capacity assignment. Calmer seas are in sight--or were.
Another weather system has been expanding over in networking. Ethernet now reaches gigabit speeds, with 10 gigabits, then InfiniBand, looming. Network speeds and connectivity surpass internal bus speeds, at least nominally.
The collision of gigabit internetworking and the Fibre Channel SAN will happen this year, with a flurry of vendor announcements already started or planned. This has the potential to whip up a "Perfect Storm" in storage networks, as past investments and future blueprints get reconsidered: should you access your storage over Ethernet or Fibre Channel? Is the SAN going back to square one? Will the iSCSI gale sink your storage infrastructure, taking your job down in the process?
These coming months will test the seamanship of many a CIO. Granted that this is a sea change, it can be turned to your advantage.
First The Basics
We can look forward to widespread debate and ample confusion about protocols: iSCSI, SoIP, and other varieties will vie for attention. Let us take some time to unpack them.
The most basic level is a block I/O request. This request is issued by the file system or DBMS based on an application need for a file segment. The accepted low-level protocol between the server and its storage is SCSI, a well-known protocol especially adapted to moving large blocks of data. However, SCSI comes with a very weak transport layer. This layer allows addressing only of seven or fifteen devices, depending on the SCSI revision level. The solution is to wrap the SCSI messages in a routable, widely addressable transport layer. The issue: Which transport layer works best, Fibre Channel or IP?
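To make the wrapping idea concrete, here is a minimal Python sketch. It builds a standard 10-byte SCSI READ(10) command and places it, unchanged, inside a routable envelope. The `Envelope` class, its 32-bit `destination` field, and the sample address are hypothetical and purely illustrative; they are not the Fibre Channel or iSCSI wire format.

```python
import struct
from dataclasses import dataclass

READ_10_OPCODE = 0x28  # SCSI READ(10) operation code


def build_read10_cdb(lba: int, block_count: int) -> bytes:
    """Build a 10-byte SCSI READ(10) command descriptor block."""
    # Fields: opcode, flags, 4-byte LBA, group number,
    # 2-byte transfer length, control byte (all big-endian).
    return struct.pack(">BBIBHB", READ_10_OPCODE, 0, lba, 0, block_count, 0)


@dataclass(frozen=True)
class Envelope:
    """Hypothetical routable wrapper (not the FCP or iSCSI format)."""
    destination: int  # illustrative wide address of the storage target
    payload: bytes    # the SCSI command, carried unchanged

    def to_bytes(self) -> bytes:
        # 4-byte destination, 2-byte payload length, then the payload.
        header = struct.pack(">IH", self.destination, len(self.payload))
        return header + self.payload

    @classmethod
    def from_bytes(cls, data: bytes) -> "Envelope":
        destination, length = struct.unpack(">IH", data[:6])
        return cls(destination, data[6:6 + length])


# Usage: request 8 blocks starting at block 2048 from an assumed target.
cdb = build_read10_cdb(lba=2048, block_count=8)
wire = Envelope(destination=0x00ABCDEF, payload=cdb).to_bytes()
received = Envelope.from_bytes(wire)
```

The point of the sketch is that the SCSI command itself never changes; only the envelope around it decides how far, and to how many devices, it can travel.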
Fibre Channel 101
Let us first dispel a common misconception. Fibre Channel is not synonymous with fiber optics. You can use multimode or single-mode optical fiber, but also coaxial copper cable. Fibre Channel, as codified by its standards body, the Fibre Channel Industry Association (FCIA), is a transport protocol, or rather two protocols:
Fibre Channel Arbitrated Loops (FC/AL) allow connection of up to 126 devices on a single 1Gbps fibre loop. Addressability is much better than SCSI's fifteen-device limit, but still falls short of any major data center's needs. More importantly, the shared bandwidth discourages any sizeable configuration. For this reason, FC/AL is disappearing as a valid contender.
Switched fabrics make use of the full addressability of the Fibre Channel protocol, enabling connectivity of 2^24, or roughly 16 million, devices in the same network. Moreover, a switched fabric is designed to allow each device the full benefit of the gigabit-per-second pipe.
Broad connectivity and scalable bandwidth have driven the success of the Fibre Channel protocol. An additional strength is the efficiency with which it uses the bandwidth, allowing up to 90% utilization, or 900Mbps of effective data throughput.
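The addressing and throughput figures are simple arithmetic, sketched below in Python. The sketch assumes a 24-bit fabric address and a nominal rate of 1Gbps (1,000Mbps), in line with the gigabit speed this article attributes to Fibre Channel; the function names are illustrative.

```python
def address_space(bits: int) -> int:
    """Number of distinct addresses an address field of this width can hold."""
    return 2 ** bits


def effective_mbps(nominal_mbps: int, utilization_percent: int) -> float:
    """Usable throughput at a given utilization of the nominal rate."""
    return nominal_mbps * utilization_percent / 100


SCSI_MAX_DEVICES = 15        # widest SCSI bus limit cited in the text
FABRIC_ADDRESS_BITS = 24     # assumed switched-fabric address width
NOMINAL_MBPS = 1000          # assumed nominal 1Gbps link

fabric_devices = address_space(FABRIC_ADDRESS_BITS)
usable_mbps = effective_mbps(NOMINAL_MBPS, 90)

print(f"Fabric addresses: {fabric_devices:,}")   # 16,777,216
print(f"Effective throughput: {usable_mbps:.0f}Mbps at 90% utilization")
```

Set against a fifteen-device SCSI bus, the fabric's address space is larger by a factor of more than a million.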
Although very economical, Fibre Channel also carries a number of weaknesses, some academic, others more serious. The weakness of its security is not a practical deterrent as most connections occur over very short distances in a physically secure data center, and intrusion into an input/output stream would be very challenging and impractical to exploit. The greater challenge is the physical limitation to a very short distance: ten to thirty kilometers, further limited by FCC regulations preventing transmission beyond private boundaries.
The Ethernet Cometh
What is commonly called the Ethernet protocol (more specifically IEEE 802.3) uses broadcasting within a given domain, and routing across domains. It most commonly carries the Internet Protocol (IP). Above the IP layer, a choice of two main transport protocols (OSI Layer 4) has been developed: the basic User Datagram Protocol (UDP) and the more commonly used and reliable Transmission Control Protocol (TCP).
Ethernet-based protocols have long reigned over the LAN, and more recently over Wide Area Networks (WANs), for any peer-to-peer computer communications. Ethernet products serve a huge market, and contain most of the bells and whistles that are missing in the Fibre Channel world. Security, such as PGP (Pretty Good Privacy) public key encryption, and elaborate management tools are available, especially for TCP/IP.
Most importantly, Ethernet now offers nominal bandwidth at the same gigabit speed as Fibre Channel, with more aggressive plans to increase it over the next few years.
On the down side, TCP/IP does a poor job of moving large blocks of data, such as storage input/output messages: it breaks them down into numerous packets, all subject to collisions and retries, then recombines them once all packets have reached their destination. This activity creates a very high CPU overhead on participating computers, creates inordinate LAN congestion, and drives down throughput to a fraction of nominal bandwidth. These severe limitations still allow high-level, low-traffic data access activities such as network-attached storage, but rule out TCP/IP for storage I/O.
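A rough sense of the packet count comes from the short Python sketch below. It assumes a standard 1,500-byte Ethernet payload and minimum 20-byte IP and TCP headers with no options; the 64KB request size is an illustrative assumption, not a figure from any vendor. It counts packets only and does not model retries, collisions, or CPU cost.

```python
import math

ETHERNET_MTU = 1500   # standard Ethernet payload size in bytes
IP_HEADER = 20        # minimum IPv4 header, no options
TCP_HEADER = 20       # minimum TCP header, no options

# Bytes of storage data that fit in one packet: 1500 - 20 - 20 = 1460.
PAYLOAD_PER_PACKET = ETHERNET_MTU - IP_HEADER - TCP_HEADER


def packets_for_block(block_bytes: int) -> int:
    """Packets needed to carry one block I/O request's data."""
    return math.ceil(block_bytes / PAYLOAD_PER_PACKET)


io_size = 64 * 1024  # assumed size of a single block I/O request
print(f"{io_size} bytes -> {packets_for_block(io_size)} packets")  # 45 packets
```

Each of those packets must be built, checksummed, acknowledged, and reassembled by the hosts at both ends, which is where the CPU overhead described above originates.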
A Technology Smorgasbord
The situation described above has prevailed for the past few years, leading to an uneasy truce: storage would only utilize Fibre Channel, and would stay for the most part local, where the protocol's productivity benefits largely outweigh its limitations. In the past two years, widespread Web-based computing and economic pressure have created a demand for storage consolidation across sites, either "insourcing" to service centers, or "outsourcing" to application and storage service providers and co-locators.
As is often the case in computing, this technical problem, once identified, has been quickly addressed by an influx of multiple incompatible technologies. Cisco, Nishan, Gadzoox, INRANGE, and many others are offering their solution, or will do so soon.
All solutions share the same starting point: the two end points (the requesting computer and the storage resource, or the two interconnected storage resources) speak SCSI, and the command must somehow be carried remotely over the Ethernet without slowing down to the point of defeating its purpose. Each vendor takes a separate strategy to crack this nut.
Nishan Systems is going for technical but proprietary excellence. The iFCP protocol, proposed to the IETF as a target standard, passes traffic between Fibre Channel devices over WANs. iFCP maps Fibre Channel frames to a predetermined TCP connection for transport, with a good performance/bandwidth utilization compromise for remote connections. Industry critics point to potential incompatibility down the road with other IP standards. Nishan Systems also promotes mFCP, the Metro Fibre Channel Protocol, targeted at Metropolitan Area Networks, which maps FCP frames over IP using UDP rather than TCP as the transport layer. mFCP and iFCP attempt to standardize the means by which IP networks can provide to compatible storage devices the FCP-based fabric services that today are furnished by Fibre Channel switches.
The Fibre Channel over IP (FCIP) proposal to the Internet Engineering Task Force (IETF) seeks to standardize the means by which Fibre Channel can be tunneled over TCP/IP networks. The sponsors of this proposal include Lucent, Vixel, Brocade, McData, Qlogic, and Gadzoox. The FCIP proposal would allow Fibre Channel switch vendors to retain the core routing intelligence used for local and globally distributed storage networks.
Internet SCSI (iSCSI) calls for networking native SCSI-3 traffic over TCP/IP. This proposal calls for a block I/O infrastructure that leverages existing TCP/IP Ethernet management tools. If adopted, iSCSI could be an alternative to FCIP, iFCP, or mFCP in linking two or more storage networks over long distances. The goal of the iSCSI community is to make TCP/IP Ethernet the storage network transport of choice.
iSCSI is currently in pre-standard form and was first proposed to the IETF as a block storage I/O transport by NuSpeed (recently acquired by Cisco), IBM, and Cisco. This proposal maps SCSI to TCP/IP, with 1Gbps solutions now available and 10Gbps products in development. Using TCP/IP Ethernet as a block I/O transport now appears feasible.
Potential iSCSI products under development include I/O routing (from NetConvergence, Cisco, Lucent, Nortel, and Pirus), Gigabit-attached iSCSI RAID systems (IBM's new IPstorage 200i), and iSCSI adapters utilizing TCP/IP offload engines, or TOEs (from Adaptec, Agilent, Emulex, Qlogic, and Troika). TOEs are being used because TCP requires a great deal of processor overhead to transport data.
Pre-standard versions of iSCSI are likely to be introduced in the first half of 2001. The iSCSI standard is targeted for completion in the fall of 2001, with products being routinely available by late 2002.
Meanwhile, an InfiniBand fabric was demonstrated at the February 2001 Intel Developer Forum. InfiniBand technology is still two years away from productization and three to four years away from mass customer deployment.
Are you confused yet? Well, so is the storage community. While Gigabit Ethernet holds great promise for ubiquitous storage access, the reality remains that you should tread with extreme caution, asking yourself three questions:
* Which standard best serves your immediate needs?
* Is it compatible with your storage applications: mirroring, virtualization, backup, management? (The short answer: compatibility just isn't there yet.)
* Which standards are most likely to flourish?
If you run a large data or service center, or a single-location ASP or SSP, your solution is simple. Ignore this technology. You can get equal bandwidth with much better capacity utilization and fewer surprises with switched Fibre Channel.
If you are looking to interconnect your data centers for disaster recovery, "follow the sun" capacity replication, or to achieve a logical "single worldwide site", the by-now classical option of dark fibre with DWDM (Dense Wavelength Division Multiplexing) is supplemented with a more flexible, anywhere-to-anywhere MAN or WAN transport capability. Nishan Systems and INRANGE already offer a number of technical options, with Cisco sure to follow suit. We strongly recommend that your environment (operating system, storage, backup, and mirroring applications, etc.) be painstakingly assessed and tested for potential impact. A few consultancy groups such as SanOne already provide the required expertise and resources. These solutions can be deployed as early as this year without taking undue risks.
If you are looking to give remote (non-data center) servers access to your shared storage pools over LAN, MAN or WAN, there's good news and bad news for you. On the positive side, you can do it now using Cisco and in some cases Nishan products. On the negative side, less-than-optimal use of bandwidth, LAN congestion, application compatibility, and high cost of devices will limit the practicality of these solutions for the short term. If this is a high enough priority, you may want to consider a limited pilot project in the second half of this year. Most customers will find it much easier to pull another fibre cable within their facilities and implement a less adventurous Fibre Channel solution.
Great News In Disguise
At this point, I sense the most sensitive among you are starting to break down crying. Who wouldn't? Just when Fibre Channel SANs were becoming safe, a new layer of confusion is threatening to bring your plans tumbling back to square one.
Cheer up. This is good news.
Cisco, Nishan, Nortel, and other vendors are endorsing the viability of the SAN by announcing their intention to contribute to its expansion. The core SCSI protocol is reinforced as a standard, and so is Fibre Channel switching. You can proceed unfazed with your immediate plans, buying and deploying SAN solutions from the vendors who give you true value right now, in the form of virtualization, connectivity, cost-per-megabyte, and most importantly seamless integration.
Not only is Gigabit Ethernet not disruptive short term; it also opens exciting future perspectives. Networked storage solutions are expanding their span beyond the glass house, in step with business needs. Management and administrative control of the application network (LAN and WAN) and the data network (the SAN) will converge rapidly, eventually making your life easier.
For now, your short-term philosophy should be "lead, follow, or get out of the way". Lead if you have the infrastructure, resiliency, and absolute need to be a pioneer, complete with arrows in your back. Follow the advice of a vendor-neutral consultant you can trust if you are interested in exploring the possibilities yet cannot afford to jump blind. Or stay out of the way for now. Storage consolidation and improved capacity utilization through SAN and virtualization may still offer the best return on your investment dollar.
My advice to you: don't sail into this spring's Gigabit storm unless you must, and you have a first-class skipper and crew. But don't stay moored to your direct-attached storage either. The SAN promises much smoother and more enjoyable sailing ahead. Come on aboard!
Mike Flannery is the president of SanOne (Phoenix, AZ).