Recently ratified by the IETF, the iSCSI standard is about to make storage area networks (SANs) much more attractive to small- and mid-sized businesses.
Although its name harkens back to its humble beginnings as a peripheral interconnect for direct-attached storage, the Small Computer Systems Interface (SCSI) is one of the most important tools in large, high-performance storage area networks (SANs).
Since ANSI approved the use of the SCSI protocol in Fibre Channel networks in 1996, IT managers have been able to consolidate their storage in one location, giving all of their users access to reliable, low-latency storage and backup capabilities. Yet, not everyone’s been able to benefit from such SANs due to Fibre Channel’s high cost of entry. Last year, SAN equipment vendor QLogic estimated that the cost of an entry-level, enterprise SAN was about $250,000. Although industry efforts like the Affordable SAN Initiative are trying to reduce that amount, most small- and mid-sized businesses still can’t justify allocating hundreds of thousands of dollars from their already depressed budgets.
Fortunately, an update to the way SCSI is transported is changing all that. Developed by the Internet Engineering Task Force (IETF) and ratified this past February, the new Internet SCSI (iSCSI) standard allows computers to transfer blocks of SCSI-encapsulated data over traditional IP networks. For organizations wary of the cost and complexity of Fibre Channel, iSCSI provides an efficient and affordable storage solution.
From DAS to SANs to iSCSI in a Hurry
The simplest storage architecture is direct attached storage, or DAS, where a disk drive is attached directly to a computer. Working over a parallel cable, data is transferred from the computer to the disk using SCSI commands, which transfer data between the endpoints in blocks, the low-level, granular units used by storage devices (as opposed to files).
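A block request names a starting logical block address and a block count rather than a file. As a hedged illustration (the helper function is ours, but the field layout follows the standard 10-byte READ(10) command descriptor block), here is how such a command might be built in Python:

```python
import struct

def read10_cdb(lba: int, num_blocks: int) -> bytes:
    """Build a 10-byte SCSI READ(10) command descriptor block.

    Byte 0: opcode (0x28); bytes 2-5: logical block address,
    big-endian; bytes 7-8: transfer length in blocks.
    """
    return struct.pack(">BBIBHB",
                       0x28,        # READ(10) opcode
                       0,           # flags
                       lba,         # starting logical block address
                       0,           # group number
                       num_blocks,  # transfer length (blocks)
                       0)           # control byte

cdb = read10_cdb(lba=2048, num_blocks=8)
assert len(cdb) == 10 and cdb[0] == 0x28
```

Note that nothing in the request mentions a file or a directory; the drive simply reads eight blocks starting at block 2,048, which is exactly the low-level granularity that distinguishes block storage from file storage.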
Almost all computers use DAS for internal or local storage, but DAS is extremely limiting: attached storage is expensive to manage, expensive to expand, and requires colocation of all hardware (say, server and disk drives), because cable lengths cannot exceed about 40 feet.
The failings of DAS have driven the need for network attached storage, or NAS. Unlike DAS, NAS is a file-based storage architecture, where devices are connected via a LAN, and storage traffic traverses the LAN as well.
NAS improves on DAS because it leverages existing LAN technology and is therefore easy to manage. NAS is also flexible: if a site needs more storage, it can simply attach more devices to the LAN. But NAS has disadvantages, too: in particular, as storage traffic increases, the performance of the LAN can suffer.
The next variant is the storage area network, or SAN, which combines the best features of DAS with the flexibility of NAS into a powerful, hybrid storage architecture. A SAN functions like DAS: blocks are transferred from computer to device. And a SAN is conceptually similar to NAS — a network connects servers and storage devices. But where NAS uses the enterprise LAN for storage traffic, a SAN uses a dedicated (often high-speed) network, separating storage traffic from other LAN traffic.
SANs generally provide excellent performance, are readily scaled to meet increasing demand, and are extremely reliable. So, what’s not to like?
SAN interconnect technology such as Fibre Channel can be expensive to deploy and maintain. Moreover, special staff may be needed to perform installation and maintenance. And although the theoretical cabling limit for Fibre Channel is six miles or so, practical limitations may keep devices very close by, perhaps only 800 to 1600 feet away from each other. If you want your SAN far removed from your LAN (say, if your servers are co-located in earthquake country), the practical and theoretical limits of an interconnect like Fibre Channel may be unacceptable.
And that’s where iSCSI comes in: iSCSI can be used to build IP-based SANs. iSCSI is a standard protocol that encapsulates SCSI commands into TCP/IP packets and transports block data over IP networks. A simple, yet powerful technology — imagine your hard disk tethered to a transatlantic-length parallel cable — iSCSI offers high-speed, low-cost, and long-distance storage solutions.
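The encapsulation idea can be sketched in a few lines. The layout below follows the general shape of an iSCSI SCSI Command PDU's 48-byte basic header segment, but it is deliberately simplified (several fields are zeroed out), so treat it as an illustration of the layering rather than a wire-compatible implementation:

```python
import struct

OPCODE_SCSI_COMMAND = 0x01  # initiator-to-target SCSI Command PDU

def scsi_command_pdu(cdb: bytes, lun: int, task_tag: int,
                     expected_len: int) -> bytes:
    """Pack a SCSI CDB into a simplified 48-byte iSCSI-style header.
    Real PDUs also carry flags, segment lengths, sequence numbers,
    and optional digests; this keeps only enough to show the layering."""
    cdb16 = cdb.ljust(16, b"\x00")      # the CDB field is 16 bytes
    return struct.pack(">B7xQII8x16s",
                       OPCODE_SCSI_COMMAND,
                       lun,             # logical unit number
                       task_tag,        # initiator task tag
                       expected_len,    # expected data transfer length
                       cdb16)           # the SCSI command itself

# The resulting 48-byte header is simply written to a TCP connection,
# which is what lets "disk traffic" cross ordinary IP networks.
pdu = scsi_command_pdu(b"\x28" + b"\x00" * 9, lun=0, task_tag=7,
                       expected_len=4096)
assert len(pdu) == 48
```

Once the bytes are in a TCP stream, every piece of commodity IP gear between initiator and target can carry them, which is the entire trick behind iSCSI.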
The Building Blocks
The groups most likely to see the benefits of iSCSI right away are universities and enterprises with widespread campuses and multiple departments.
Because using Fibre Channel to bridge the distance between departments requires expensive hardware, and because most of the departments are unlikely to have adequate Fibre Channel training, storage management is often as fractured and different as the groups themselves. Each department maintains its own servers and direct-attached storage, making it difficult for the organization as a whole to ensure that each division has enough storage to meet future needs, and that all important data is being backed up regularly.
With iSCSI, though, the organization can use its existing Ethernet network — or build a new network with relatively inexpensive Ethernet hardware — to connect disparate and far-flung departments to a central storage system. iSCSI saves on equipment costs and on training costs.
“You’re [simply] not going to bring Fibre Channel in to all of the desktops and departmental servers, because there’s just not enough Fibre Channel training,” explains John Hufferd, Senior Technical Staff Member with IBM’s Systems Group. “But iSCSI gives you the opportunity to use SANs with [common] technologies that the people at each location already understand.”
|Figure One: A traditional software initiator|
The common technologies in question are the standard Ethernet NICs, switches, and routers that any company with an IP network has already invested in. Because iSCSI essentially binds SCSI to TCP/IP, all that’s needed to send an iSCSI command is the appropriate software driver on the client system, called the initiator. On the other end of the network, iSCSI targets — whether connected to tape drives, RAID arrays, or other storage devices — receive and carry out the requests.
Figure One shows the various protocols and drivers needed to create a software-based iSCSI client. Using a software-only initiator, the processing of all TCP/IP traffic and iSCSI blocks is performed by the CPU. (IBM currently offers iSCSI initiators for Linux and Windows NT, and both are available under the GPL. Cisco, too, has an open source Linux initiator, which it’s made available through a SourceForge project. Registered Cisco customers can also download initiators for AIX, HP-UX, Linux, Solaris, and Windows 2000 from the Cisco site.)
|Figure Two: A TCP offload engine accelerates processing|
While using these software-based drivers to implement a SAN solution is certainly economical and easy to manage, there is one drawback: performance degradation. Processing TCP/IP traffic can be resource intensive, and handling iSCSI traffic via software can bog down a system if enough blocks are transferred at once. As a solution, some IT managers are beginning to replace their standard NICs with TCP/IP offload engines, or TOEs.
As shown in Figure Two, the offload engines move TCP/IP processing from the operating system to a separate card, so that the workstation CPU is free to run other applications. Adding a TOE can dramatically increase throughput, and there are already a number of vendors with TOEs on the market, including Adaptec, Alacritech, and QLogic.
But even with a TOE, the CPU still has to deal with the iSCSI traffic once the blocks have been stripped out of the TCP/IP envelopes. A third option, shown in Figure Three, frees up the CPU even more by pushing the iSCSI processing layer itself down onto the PC card. These host bus adapters (HBAs) take the place of the NIC, the TOE, and the iSCSI driver. In addition to the vendors mentioned above, hardware vendors Emulex, Intel, JNI, and LSI Logic have all released iSCSI HBAs.
|Figure Three: A host bus adapter frees the CPU|
The Real Advantages
Critics are quick to point out that most IT departments would rather use iSCSI HBAs than software initiators, thereby negating any cost savings over Fibre Channel. That’s true: iSCSI HBAs are priced in the same $600-$1000 range as Fibre Channel HBAs. But it’s important to note that Fibre Channel switches can cost 10 times as much as a typical Ethernet switch.
In an Adaptec study, researchers estimated the total hardware cost of a low-end, Fibre Channel SAN to be over $61,400, whereas a similar iSCSI SAN cost approximately $42,500. The estimate for a high-end Fibre Channel SAN was $206,300, while its iSCSI counterpart only cost $156,200.
To focus solely on hardware, though, is to ignore the real cost savings that iSCSI can provide. IT departments today typically deal with a vast number of interconnects on a regular basis, from Ethernet, Fibre Channel, and WiFi, to SCSI, Serial ATA, and even InfiniBand. Those with fewer kinds of interconnects obviously have it easier: they don’t have to train their existing staff or hire specialized technicians to install and maintain yet another network. Similarly, using iSCSI to transfer data over Ethernet instead of one of those other interconnection technologies reduces the total cost of ownership of a storage area network. It also allows IT departments to continue building on their investments in Ethernet, a technology that isn’t going away any time soon.
“You can’t not adopt Ethernet,” says Hufferd, who’s also a member of IETF’s IP Storage working group. “So, if you have to deal with it anyway, then is there a way that you can deal only with it and not any other network technology?”
Other cost savings come from the fact that companies don’t have to lay new cable to use iSCSI. Organizations also don’t have to worry about distance limitations inherent to the technology. For example, rather than being held back by Fibre Channel’s 10-kilometer range, data centers can extend their SAN as far as their network can reach, all without having to install specialized hardware.
To be fair, not everyone thinks iSCSI is the perfect solution for storage area networks. In November 2001, Gartner analyst Robert Passmore told News.com (http://www.news.com) that Fibre Channel was the best choice for high-performance SANs. Two years later, Passmore believes that advice still holds true.
“We will see iSCSI products announced through the rest of the year, but most will be directed to the low end, or to limited extensions of Fibre Channel SANs,” predicts Passmore. “And all will be lower performance implementations compared to Fibre Channel. If iSCSI ever matches Fibre Channel performance, it will likely occur in 2005 or later.”
Indeed, iSCSI can only perform as well as the Ethernet network it rides on. So far, Gigabit Ethernet tends to be the fastest implementation. Fibre Channel, in the meantime, is already shipping at two-gigabit-per-second speeds.
And although some iSCSI proponents believe the balance will shift as organizations implement 10 Gigabit Ethernet, high equipment costs are so far keeping many small- and mid-sized companies from investing in the standard.
An additional concern is that, because of its asynchronous nature, IP is not best suited to handle synchronous applications like certain disaster recovery packages. “IP was designed for sharing facilities among multiple applications,” wrote Thomas Hammond-Doel, a board member of the Fibre Channel Industry Association, in a December 2002 InfoStor piece. “Synchronous applications, including some business continuance applications, do not run well on iSCSI. High-performance TCP applications can require as much as 50 percent to 75 percent over-allocation of bandwidth when adequate quality of service (QoS) is not available.” Although TCP/IP offload engines can reduce the latency in these cases, sites may need to set up separate networks for their iSCSI SANs to preserve bandwidth in the existing IP network.
Finally, while there are many TCP/IP management utilities available today, there are still relatively few for the iSCSI protocol itself. Fibre Channel, on the other hand, has a large set of available utilities. It’s certain that iSCSI will catch up as companies begin to identify management needs during adoption phases, but companies that need advanced management capabilities right away will have to look to custom or vendor-specific solutions.
If you’re considering a networked storage solution for your organization, how do you know what to choose? Deciding whether to implement a storage area network (SAN) or to install network-attached storage (NAS) can be difficult. Both SAN and NAS provide enterprises with a way to centralize storage, both reduce the number of daily management tasks that an IT department must perform, and now, both run over IP, thanks to the availability of iSCSI SANs. The trick to keeping the two separate is to remember that NAS devices serve files while SANs serve blocks. While the difference may seem subtle, it can make a huge difference for your network, depending on what you need to do.
A NAS device serves files over IP by using NFS, CIFS, and other file protocols. This makes NAS ideal in situations where your users need to share files between clients or platforms.
|Figure Four: The topology of NAS|
SANs, on the other hand, simply use SCSI — originally over Fibre Channel, but now also over Ethernet — to serve blocks of data to requesting applications. This makes SANs more suitable for streaming multimedia or hosting database applications that require block-level access, like Microsoft Exchange.
|Figure Five: The topology of SAN.|
Additionally, SANs and NAS devices have a number of physical differences that must be factored in to a purchasing decision. When a NAS device is plugged into your LAN, your existing clients and servers see it as an additional file server (see Figure Four). With a SAN, though, each server on your LAN must specifically be connected to your SAN to make use of the extra storage (see Figure Five). When Fibre Channel was the only option for doing this, servers required Fibre Channel host bus adapters and would be connected to a separate network where the SAN resided. Today, with iSCSI, IT departments have the option of installing a SAN on the existing Ethernet network, or using standard Ethernet equipment to build a new network.
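The block-versus-file distinction is visible in application code, too. Once an initiator logs in, an iSCSI disk appears to the server as a local device, and software addresses it by absolute block offset rather than by file path. A minimal sketch (the 512-byte block size is the classic sector size; for illustration, a temporary file stands in for the raw device an iSCSI login would expose):

```python
import os
import tempfile

BLOCK_SIZE = 512  # classic disk sector size

def read_blocks(dev_path: str, lba: int, count: int) -> bytes:
    """Block-style access: seek to an absolute offset on the raw
    device and read whole blocks, the way a SAN-attached disk is used."""
    fd = os.open(dev_path, os.O_RDONLY)
    try:
        os.lseek(fd, lba * BLOCK_SIZE, os.SEEK_SET)
        return os.read(fd, count * BLOCK_SIZE)
    finally:
        os.close(fd)

# A temp file stands in for the block device here; no file names or
# directory lookups are involved in the read itself, only offsets.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\x00" * BLOCK_SIZE + b"A" * BLOCK_SIZE)
    path = f.name
assert read_blocks(path, lba=1, count=1) == b"A" * BLOCK_SIZE
os.remove(path)
```

A NAS client, by contrast, never sees offsets on a raw device; it asks a file server for a named file over NFS or CIFS and lets the server manage the underlying blocks.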
John Hufferd, Senior Technical Staff Member with IBM’s Systems Group, believes many companies can get away with just having a SAN because of the low rate of file sharing he’s seen in the field. “When we’re out meeting with customers, we often ask people if they need to share files,” explains Hufferd. “Usually, fewer than five percent of the people raise their hands, and they’re all engineers.” When people do share files, says Hufferd, it’s more often through email than a file server. Most other information is traded through database applications. This makes block I/O a better option for companies in these situations.
For organizations that do need a file sharing option, but still want the scalability and capacity of a SAN, Hufferd suggests using a NAS device as a front-end to a SAN. That way, the NAS device handles all the necessary file protocols but lets the SAN deal with the raw data. IT managers can reallocate storage to the NAS as needed while still allotting space for other directly-connected servers.
And with iSCSI, there’s no need to deal with multiple networking technologies. The SAN can connect to the NAS the same way that the NAS connects to its host: over Ethernet.
Some may view security as an additional tradeoff in adopting iSCSI, especially since Fibre Channel is inherently secure. But iSCSI comes with a number of security features that make it as strong as — and sometimes even stronger than — a closed network like Fibre Channel.
The first of these “enhanced” security features is the IP Security (IPSec) protocol. Widely deployed in virtual private networks, IPSec provides authentication, encryption, and verification of communications. Vendors who choose to implement the iSCSI protocol must offer IPSec, although customers don’t necessarily have to use it. This gives customers the flexibility of applying IPSec to only the most vulnerable channels in their networks.
iSCSI also supports the Challenge-Handshake Authentication Protocol (CHAP) for approving communications between an initiator and a target. More likely, though, Kerberos will become the de facto standard for authentication, given the push it’s getting lately from Microsoft and other software vendors.
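The CHAP exchange itself is straightforward: the target sends an identifier and a random challenge, and the initiator proves it knows the shared secret by returning a one-way hash over both, per RFC 1994. A minimal sketch (function and variable names are ours; MD5 was the usual digest choice of the era):

```python
import hashlib
import os

def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """RFC 1994 CHAP: response = MD5(identifier || secret || challenge).
    The secret itself never crosses the wire; only the hash does."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

# Target side: issue a fresh random challenge, then check the answer.
ident, challenge = 1, os.urandom(16)
secret = b"shared-secret"
resp = chap_response(ident, secret, challenge)          # initiator computes
assert resp == chap_response(ident, secret, challenge)  # target verifies
```

Because the challenge is random on every login, a captured response can’t simply be replayed later, which is what makes the scheme useful on an open IP network.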
While these security features are handy, it’s important to note that employing the latest encryption tools and configuration practices is just one part of proper SAN security. There is still the physical problem of protecting the boxes in which the data is stored, and the logistical problem of keeping track of important files on the network and in backups.
“Anytime you work with storage and backups, you should know that you’re creating another copy of a file,” warns Aberdeen’s Tanner. And while that may seem like an obvious statement, it becomes incredibly important when your organization suddenly has hundreds of confidential and time-sensitive files that need to be tracked in various locations where they may stay alive indefinitely.

As of this writing, the IETF is still discussing how it will add remote boot capabilities to iSCSI. Because computers must load and configure the appropriate drivers and network stacks before they can use iSCSI, attempting to use iSCSI to boot from remote storage creates a tricky paradox: how do you access and load an iSCSI driver into a computer when the driver itself resides somewhere across an iSCSI connection?
Fortunately, there are some workarounds to this seemingly “chicken and egg” scenario. IBM’s solution is called iBoot, which is essentially a ROM image containing iSCSI client code, a TCP/IP stack, and BIOS interrupt code that allows the computer to talk directly to the iSCSI target upon boot. IT managers can use iBoot to load Linux or Windows.
Cisco also has its own Network Boot tool based on Intel’s Preboot Execution Environment (PXE) standard. Part of the code closely corresponds to work that the IETF is doing toward creating a standard for iSCSI boot. It’s expected that the details behind the standard will be worked out by the time this article goes to press.
A Bright Future
Besides Fibre Channel, the other major competitor to iSCSI is InfiniBand. But many experts don’t believe InfiniBand is making much headway in the storage market, partly because of its high cost of entry, and partly because it’s gradually losing support from major players.
“Intel convinced people that products would have native support for InfiniBand, but then Intel walked away,” recalls Hufferd. “When Microsoft walked away, it was clear that InfiniBand wasn’t going to get any cheaper, and that it wasn’t going to have the momentum to replace other interconnects.”
Hufferd, like many storage experts, expects InfiniBand will end up as an interconnect for high-performance clusters rather than an interconnect in storage area networks. But that isn’t stopping the IETF from borrowing certain InfiniBand features — most notably, remote direct memory access (RDMA) — for possible inclusion in an upcoming version of iSCSI. It’s unclear when that may happen, though. In an attempt to maintain the stability of the current protocol version, the IETF has said it isn’t planning to release another major version of iSCSI for at least a year. The move is certain to be welcomed by IT managers who have been burned by fast-changing specifications that often render new equipment obsolete and incompatible.
So, with little standing in the way of adoption, the future looks bright for iSCSI. Of course, mainstream adoption of the protocol will still take time, just as it does with any new technology. iSCSI is not likely to surpass Fibre Channel for a number of years. This is due mainly to the fact that organizations with existing Fibre Channel investments will phase in iSCSI over time rather than ripping and replacing infrastructure.
In the meantime, more and more companies are beginning to see the merits of iSCSI. The Fortune 500’s Komatsu America is already using it in conjunction with a Fibre Channel SAN for data recovery. In Pasadena, CA, NASA’s Jet Propulsion Laboratory is reportedly installing an iSCSI storage network for backing up 150 Sun Solaris servers. These early installations are sure to be followed by more widespread adoption, as the benefits of iSCSI are too good to pass up.
Amit Asaravala is an independent journalist based in San Francisco, CA. Prior to becoming a full-time writer, he founded New Architect magazine and was Editor-in-Chief of Web Techniques.