Continuous data access: the 99.999 challenge facing network administrators; Enterprise level high availability using iSCSI.
Before examining how iSCSI can enable high availability for an IP-SAN, let's review how an iSCSI storage target (volume) interfaces with the operating system and file system on the host. This is especially important to understand because, with iSCSI, the host can conceivably be in Florida while the data and the storage targets are in California. An IP-SAN and iSCSI work beneath the file system, at the block level. This allows iSCSI to support virtually any application, including databases, backup and restore applications, and Exchange.
[FIGURE 1 OMITTED]
With traditional DAS or FC-SAN attached disk drives or tape, a file system makes read and write requests via the server to the storage devices using a set of standard SCSI commands. The OS and file system have full control over these storage devices. iSCSI, like standard SCSI, is a block-based storage protocol layered underneath the file system. This means that an iSCSI volume appears as an additional disk drive when mounted by the OS. The iSCSI volume can be partitioned, named and formatted like a normal disk drive.
An iSCSI initiator driver is available for all popular operating systems. Standards for the iSCSI initiator are governed by the IETF (Internet Engineering Task Force). The iSCSI initiator driver responds to file system SCSI commands that are targeted to the disk drives the initiator represents. The iSCSI initiator driver encapsulates these SCSI commands and data into iSCSI packets that are, in turn, encapsulated into TCP/IP packets. The TCP/IP packets are then routed very quickly over the Ethernet network, where they are delivered to an iSCSI storage target. The iSCSI storage target can be located on the same network within the building or halfway around the world. This iSCSI target has all the same attributes as a standard SCSI storage system. Once the iSCSI packet arrives at the iSCSI storage system representing the targets, the SCSI commands and data are decapsulated from the iSCSI/TCP/IP packet and are executed on the storage system. Once executed, the results are encapsulated back into iSCSI/TCP/IP and returned to the iSCSI initiator driver on the server where they are decapsulated and delivered to the SCSI layer and then the file system.
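The encapsulation round trip described above can be sketched in a few lines of Python. This is a minimal illustration only: the 10-byte header layout, the `HEADER` format and the function names `encapsulate` and `decapsulate` are assumptions made for this example, not the header defined by the IETF standard, which is larger and carries more fields. In practice the resulting bytes would then be handed to a TCP/IP socket.

```python
import struct

# Hypothetical, simplified header (NOT the real iSCSI header):
# opcode (1 byte), LUN (1 byte), command sequence number (4 bytes),
# payload length (4 bytes), all in network byte order.
HEADER = struct.Struct("!BBII")


def encapsulate(opcode, lun, cmd_sn, scsi_payload):
    """Initiator side: wrap a SCSI command and its data in a toy packet."""
    return HEADER.pack(opcode, lun, cmd_sn, len(scsi_payload)) + scsi_payload


def decapsulate(packet):
    """Target side: recover the SCSI command fields and data."""
    opcode, lun, cmd_sn, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return opcode, lun, cmd_sn, payload
```

The same two functions would serve the return trip, with the target encapsulating the result and the initiator decapsulating it before passing it up to the SCSI layer.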
Automatic Multi-path Failover with the iSCSI Initiator
The iSCSI initiator comprises layers that are key to providing multiple data paths between servers and iSCSI storage targets, like those provided by the iSCSI intelligent storage switch. The two main layers within the iSCSI initiator are the session and connection layers.
The first layer is the "session" layer. The session layer is an upper layer and is responsible for maintaining the communication to the SCSI layer within the server. It also ensures proper order of SCSI commands and data to and from the server file system to the iSCSI storage target. SCSI commands are numbered in sequence as they are sent from the server. The iSCSI storage target arranges the SCSI commands according to their order, ensuring that commands are not lost, taken out of order or duplicated.
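How a target might use the sequence numbers to restore order and drop duplicates can be sketched as follows. The class name `CommandReorderer` and its methods are hypothetical, and the sketch ignores details a real target must handle, such as sequence-number wraparound and flow-control windows.

```python
class CommandReorderer:
    """Toy model: release commands strictly in sequence-number order."""

    def __init__(self, first_sn=1):
        self.expected_sn = first_sn  # next sequence number to execute
        self.pending = {}            # commands that arrived early

    def receive(self, cmd_sn, command):
        """Accept one command; return the commands now ready to execute."""
        if cmd_sn < self.expected_sn or cmd_sn in self.pending:
            return []  # duplicate: already executed or already buffered
        self.pending[cmd_sn] = command
        ready = []
        # Drain every command that is now contiguous with the last executed one.
        while self.expected_sn in self.pending:
            ready.append(self.pending.pop(self.expected_sn))
            self.expected_sn += 1
        return ready
```

A command that arrives ahead of its turn is held until the gap before it is filled, so nothing is executed out of order and a retransmitted command is not executed twice.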
Within every server there is usually only one iSCSI initiator but there can be more than one session established and running within a single initiator. For example, if there were two iSCSI storage targets being used by the server, then there would be one initiator with two sessions running.
The second layer is the "connection" layer. The connection layer is the TCP/IP connection between the server and the iSCSI storage target, which, in our case, is the iSCSI intelligent storage switch. The session layer can maintain several connections. In common applications there is only a primary iSCSI TCP/IP connection, but for 99.999% availability applications there is a primary iSCSI TCP/IP connection and an alternate connection. All iSCSI traffic between the server and storage system travels over the primary connection. In most IP-SAN deployments, this is a Gigabit Ethernet link, which is more than fast enough to handle average server traffic. With some application servers, a 10/100 Ethernet link (100 Mb/s) is adequate, since most servers don't generate more than 50 Mb/s of sustained storage traffic. Assuming 50% overhead, a 100 Mb/s link carries about 50 Mb/s of storage traffic, so a Gigabit Ethernet link (1,000 Mb/s) is recommended when I/O spikes produce bursts greater than 50 Mb/s.
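The link-sizing rule of thumb above is simple arithmetic, sketched here for convenience. The function names are hypothetical, and the 50% overhead figure is the article's own assumption rather than a measured value.

```python
def usable_storage_mbps(link_mbps, overhead_fraction=0.5):
    """Storage throughput left after protocol overhead, in Mb/s."""
    return link_mbps * (1.0 - overhead_fraction)


def link_is_adequate(link_mbps, peak_storage_mbps, overhead_fraction=0.5):
    """True if the link can carry the given peak storage traffic."""
    return peak_storage_mbps <= usable_storage_mbps(link_mbps, overhead_fraction)
```

Under these assumptions a 100 Mb/s link leaves 50 Mb/s for storage traffic, so a server whose bursts reach, say, 80 Mb/s would need the Gigabit link.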
Because the iSCSI session is aware of alternate TCP/IP paths to the iSCSI storage target, it will automatically transfer traffic through an alternate TCP/IP connection. So, if one TCP/IP connection between a server and iSCSI storage target fails, the traffic is automatically routed through the alternate TCP/IP connection. Because the SCSI commands are numbered, the iSCSI storage target is able to arrange the commands received across multiple connections.
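The failover behavior can be modeled with a small Python sketch. The `Connection` and `Session` classes are illustrative stand-ins, not the interface of any real initiator driver, and the `up` flag simply simulates a cable being connected or pulled.

```python
class Connection:
    """Stand-in for one TCP/IP connection; 'up' simulates link state."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.sent = []  # (sequence number, command) pairs delivered

    def send(self, cmd_sn, command):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.sent.append((cmd_sn, command))


class Session:
    """Numbers commands and fails over between connections."""

    def __init__(self, connections):
        self.connections = list(connections)  # primary first, then alternates
        self.next_sn = 1

    def submit(self, command):
        """Send a command over the first working path; return that path's name."""
        cmd_sn = self.next_sn
        for conn in self.connections:
            try:
                conn.send(cmd_sn, command)
            except ConnectionError:
                continue  # this path failed, try the next one
            self.next_sn += 1
            return conn.name
        raise ConnectionError("no path to the storage target")
```

Because the sequence number belongs to the session rather than to a connection, a command keeps its number when it moves to the alternate path, which is what lets the target put commands from both connections back in order.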
Demonstration of Server Multi-path Failover
To demonstrate how iSCSI failover functions, several companies participated in a third-party demonstration in which 5 videos were streamed from 5 iSCSI storage targets on an iSCSI intelligent storage switch to 5 Microsoft Windows 2003 hosts. Each host used the native Microsoft-supplied iSCSI initiator driver software. As reviewed earlier, the iSCSI session layer was responsible for maintaining the video stream to the video player application on the hosts. Each host had two TCP/IP Ethernet connections used by the iSCSI session. The primary connection was a 10/100 Ethernet CAT5 copper connection between the host and the iSCSI intelligent storage switch via a switched LAN. The second connection was a wireless Wi-Fi connection between the host and the iSCSI intelligent storage switch. The following statement reviews the failover test and results:
"We were able to disconnect the CAT5 cable from the hosts and the iSCSI session automatically routed the traffic over to the WI-FI connection. We were able to do video streaming to 5 hosts running iSCSI (wireless) going to an access point, then to a hub, then to the iSCSI intelligent storage switch (iSCSI to FCP-SCSI), then to a core-edge FC fabric, then to a virtual port, and finally mapped to an FC open-9 LUN on our enterprise class storage system. This forced failover test was extremely positive and fast enough to keep all 5 videos streaming."
iSCSI Intelligent Storage Switch IP Take-Over for Enhanced High Availability
IP take-over is key to enabling high availability and failover paths with the iSCSI intelligent storage switch. In the event an iSCSI intelligent storage switch is temporarily off-line, a second storage switch attached to the same storage and the same host network can take over the IP addresses and data communication for the off-line switch. Both iSCSI intelligent storage switches are "active," servicing their assigned hosts, but they can also provide a "passive" failover path for other hosts within the network. This is possible because each iSCSI intelligent storage switch maintains the configuration information of the other storage switches within the cluster and monitors the heartbeat of its designated partner or partners. When a site or iSCSI intelligent storage switch goes offline, iSCSI terminates the host connections with the problematic site but maintains the iSCSI session within the host while waiting for the storage IP addresses to be re-exposed. The other storage switch then exposes the IP addresses from the down site or iSCSI intelligent storage switch. The iSCSI initiator discovers the re-exposed IP addresses and creates a new connection, enabling the hosts to resume communication with the storage systems through a different iSCSI intelligent storage switch.
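The heartbeat-driven take-over can be sketched as a toy model. Everything here is an assumption made for illustration: the `StorageSwitch` class, the 3-second timeout and the IP addresses do not describe any vendor's implementation, and timestamps are passed in explicitly so the logic is easy to follow.

```python
class StorageSwitch:
    """Toy model of one switch in an active/active pair."""

    def __init__(self, name, ip_addresses, heartbeat_timeout=3.0):
        self.name = name
        self.own_ips = set(ip_addresses)
        self.exposed_ips = set(ip_addresses)  # addresses hosts can reach here
        self.heartbeat_timeout = heartbeat_timeout  # seconds (assumed value)
        self.partner_ips = set()
        self.last_heartbeat = None

    def pair_with(self, partner, now):
        """Keep a copy of the partner's configuration and start monitoring."""
        self.partner_ips = set(partner.own_ips)
        self.last_heartbeat = now

    def heartbeat_received(self, now):
        """Record that the partner was alive at time 'now'."""
        self.last_heartbeat = now

    def check_partner(self, now):
        """Expose the partner's IPs if its heartbeat has timed out."""
        if now - self.last_heartbeat > self.heartbeat_timeout:
            self.exposed_ips |= self.partner_ips
            return True
        return False
```

For example, if switch A last heard from switch B at t = 2 s, a check at t = 6 s exceeds the 3-second timeout and A begins answering on B's addresses, which is the point at which the waiting initiators can reconnect.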
An IP-SAN can deliver 99.999% data storage availability. With the right products, any business can now build an IP-SAN that is simple, cost effective, highly scalable and fast enough to exceed the majority of its performance requirements.
Zophar Sante is VP of Market Development at SANRAD (Silicon Valley, CA).
|Title Annotation:||Disaster Recovery & Backup/Restore|
|Publication:||Computer Technology Review|
|Date:||Oct 1, 2005|