The SuperComputing Conference is THE international conference and expo for all things HPC (High Performance Computing). The astute attendee could see that storage was a big part of this year's show, with two major trends standing out: really fast storage and really dense storage.
The Supercomputing Conference 2010, commonly called SC10, is the conference/show for everything surrounding HPC (High Performance Computing). The conference draws over 10,000 attendees and the expo floor is very large, making it difficult to see and learn everything, but your intrepid author did his best to explore the expo floor in search of the latest HPC storage trends.
One question that a few people may ask is how storage trends from HPC relate to their own storage needs, including those of home users. The answer is fairly simple and perhaps more important than you might expect. HPC storage requirements are extremely demanding in many respects including performance, scalability, reliability, management, and file systems. This means that HPC usually sees problems, or develops solutions to them, many years before they make their way into the mainstream. So what HPC storage is doing today may affect you tomorrow. Not all HPC solutions become mainstream, but many of them do make it into everyday use in a number of areas.
There are several examples of HPC moving into the mainstream, but one simple one, while it isn’t exactly storage related, is cloud computing. For many years HPC has been doing distributed computing, including something called Grid Computing. Grid Computing can be considered the forerunner to Cloud Computing or it can even be considered the same thing depending upon your perspective. The idea is to run your application on a system that you may or may not control, with your data likely having to follow your application. So the HPC world has worked out tools and techniques for accomplishing this task, which sounds remarkably like Cloud Computing.
Examples in the HPC storage world include scalable storage solutions, which let you scale both performance and capacity and which first appeared in HPC many, many years ago. High performance storage, including SSDs, started in HPC many years ago as well. Another example is the concept of storage tiering, where data is moved to faster or slower “tiers” of storage. So you can see that HPC is pushing the proverbial envelope, developing solutions that can also help the mainstream storage world. It behooves us to watch what’s going on in HPC because in several years it could be common.
This year I saw two storage trends at SC10. The first is the development of technology related to really fast storage tiers and the second is an increase in the number of companies offering denser storage technologies. The resulting solutions in both of these trends are interesting, and the technologies carry some important implications.
One of the basic tenets of HPC storage is density. More storage per rack unit (TB/U) means fewer racks, shorter cable lengths, and possibly reduced costs since the number of chassis is reduced. Several companies have high density storage chassis including DDN, NexSAN, and Scalable Informatics. At SC10, LSI announced a new high density storage unit, the Engenio 2600-HD (previously known as Wembley). All of these storage devices increase the storage density by creating new chassis designs. Some of them mount the drives vertically through the top of the chassis, which allows more drives to fit in the same space. This also means that you have to pull the chassis out of the rack to replace a specific drive, but these units are usually designed so this can be done while they continue to operate.
On the other hand, the new LSI Engenio 2600-HD uses drive trays that keep the drives horizontal by having 5 trays of 12 drives each (60 drives in total). You pull out a tray to pull out a specific drive which can be done while the system is running.
Table 1 below is a quick high-level comparison of the major high density storage units that are offered today. High density storage units from other vendors are not listed since they are typically OEM’ed versions of these major units. The units are categorized by their configuration, either JBOD (Just a Bunch Of Drives) or RBOD (RAID Bunch Of Drives). Typically a JBOD does not contain a controller, while an RBOD contains a RAID controller.
Table 1 – Major High Density Storage Units – Main Features
Notice that there is some variation in the size of the units (number of rack units), the number of drives, the depth of the units, and the weight of the units. This variation indicates the differences in philosophy in the design of the units.
If we take the most dense units with 60 drives in 4U, that means in a typical 42U rack we could get 10 chassis for a total of 600 drives. If we assume 2TB drives for the moment, that means 1.2PB of raw storage in a 42U rack! With the upcoming 3TB drives that means 1.8PB in a 42U rack.
Of course, these numbers are all theoretical and depend upon the exact storage configuration (don’t forget the storage servers!), but they give you an idea of the density one can achieve using these storage units.
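The rack arithmetic above is easy to reproduce for other chassis sizes or drive capacities. The short Python sketch below is only an illustration: the function name and its defaults are made up for this article, it assumes decimal units (1 PB = 1000 TB), and it assumes the whole rack is filled with storage chassis, which ignores the storage servers and switches a real configuration would need.

```python
def raw_capacity_per_rack(rack_units=42, chassis_units=4,
                          drives_per_chassis=60, drive_tb=2.0):
    """Return (chassis count, drive count, raw capacity in PB) for one rack.

    Illustrative helper: assumes the rack holds nothing but storage chassis.
    """
    chassis = rack_units // chassis_units      # only whole chassis fit
    drives = chassis * drives_per_chassis
    raw_pb = drives * drive_tb / 1000.0        # decimal units: 1 PB = 1000 TB
    return chassis, drives, raw_pb


for size_tb in (2.0, 3.0):
    chassis, drives, raw_pb = raw_capacity_per_rack(drive_tb=size_tb)
    print(f"{size_tb:.0f}TB drives: {chassis} chassis, "
          f"{drives} drives, {raw_pb:.1f} PB raw")
```

With the defaults this reproduces the 10 chassis, 600 drives, and 1.2PB (2TB drives) or 1.8PB (3TB drives) quoted above. Note that 10 chassis of 4U each use only 40U, leaving 2U free in the rack.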
Really Fast Storage
One question that is always insightful to ponder is, “does all of my data have to be accessed at the same speed?” To help answer this question, you should take some time to examine your data, specifically to examine the “age” of the data on your storage. Examine when it was last accessed, when it was last modified, or when it was created (there are tools that can help). I think you will find that a great deal of your data has not been accessed or modified in quite some while. (Note: My favorite acronym that applies in this case is WORN: Write Once, Read Never.) After the examination ask yourself a simple question, “Does data that hasn’t been accessed in a long time need much performance?” Do you really need to have data that hasn’t been accessed in years on storage media that is really fast (and correspondingly expensive)? Or could that data perhaps be archived on slower and hopefully less expensive storage?
By periodically performing an age profile of your data you can get a feel for the I/O profile of your storage. In particular you can estimate how much data you are creating and the rate at which you are creating it. You can also perform a statistical analysis of age profile to understand the average and standard deviation of the data as a function of time to see how fast, or slow, your data is “aging.” In the High Performance Computing world, understanding how much data you create and how quickly is vital to the success of the storage subsystem.
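To make the idea of an age profile concrete, here is a minimal Python sketch. It is not one of the tools referred to earlier; the function names and the 30/90/365 day buckets are arbitrary choices for illustration. It walks a directory tree, computes each file's age from a timestamp in its metadata, and reports the mean, the standard deviation, and a simple histogram. Keep in mind that many file systems are mounted with options such as noatime or relatime, so access times may be less reliable than modification times.

```python
import os
import stat
import statistics
import time


def file_ages_days(root, attr="st_mtime", now=None):
    """Return the age in days of every regular file under root.

    attr selects the timestamp: "st_atime" (last access) or
    "st_mtime" (last modification).
    """
    now = time.time() if now is None else now
    ages = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)          # do not follow symlinks
            except OSError:
                continue                       # file vanished or unreadable
            if not stat.S_ISREG(info.st_mode):
                continue                       # skip links, sockets, etc.
            ages.append(max(0.0, (now - getattr(info, attr)) / 86400.0))
    return ages


def age_profile(ages, edges=(30, 90, 365)):
    """Count files per age bucket; edges are upper bounds in days."""
    counts = [0] * (len(edges) + 1)
    for age in ages:
        for i, edge in enumerate(edges):
            if age < edge:
                counts[i] += 1
                break
        else:
            counts[-1] += 1                    # older than the last edge
    return counts


def summarize(ages):
    """Return (mean, population standard deviation) of ages in days."""
    if not ages:
        return 0.0, 0.0
    return statistics.mean(ages), statistics.pstdev(ages)
```

Running `file_ages_days` on a project directory with `attr="st_atime"` and passing the result to `age_profile` and `summarize` gives a first look at how much data is going cold. Repeating the run periodically and comparing the results shows how quickly the data is aging.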
Since HPC works with so much data, long ago people discovered that it would be best to put data on multiple tiers. You may be familiar with the concept of tiering, but just in case, what tiering means is that you have various “levels” of storage hardware where each one has a different level of performance and capacity. Typically we start with tier-1 which is usually the fastest performing tier with a small capacity (to reduce costs), followed by tier-2 storage which is slower but has more capacity than tier-1, followed by tier-3 which is slower yet but has the largest capacity. You are not restricted to three tiers and can have as many tiers as you want or as few as you want, but the essence of the concept is that there is some performance and capacity difference between tiers.
The tier differences also mean that there is a cost difference between the tiers. Tier-1 storage is the fastest so it is almost always the most expensive, but it is used for applications that need a great deal of I/O. Consequently, tier-1 is used for I/O while the application is actually running. Once the application is done and its I/O requirements are finished, the data can be moved to tier-2 storage which is slower but has more capacity. On tier-2 storage, the data can be chopped, sliced, and diced to find useful information or insight. Since this process usually has lower I/O requirements than the initial application, tier-2 storage can be slower than tier-1. Finally, once the data has been dissected, it can be stored on a long-term storage system that is tier-3 (typically tape or very inexpensive disk). This storage tier is much larger than tier-2 but has very low performance, since the data is unlikely to be retrieved but has to be retained for a long period.
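One simple way to picture a tiering policy is as a rule that maps each piece of data to a tier. The Python sketch below uses days since last access as a stand-in for the workflow stages described above (running, analysis, archive). That substitution, the function names, and the 30 and 365 day thresholds are all assumptions made for illustration; real tiering software uses its own policies.

```python
def choose_tier(days_since_access, tier1_max_days=30, tier2_max_days=365):
    """Pick a tier name from the number of days since last access.

    The thresholds are hypothetical; tune them to your own age profile.
    """
    if days_since_access < 0:
        raise ValueError("days_since_access must be non-negative")
    if days_since_access <= tier1_max_days:
        return "tier-1"        # hot data: fastest, smallest, most expensive
    if days_since_access <= tier2_max_days:
        return "tier-2"        # warm data: slower, larger
    return "tier-3"            # cold data: tape or very inexpensive disk


def plan_moves(files):
    """Return (name, current_tier, target_tier) for files on the wrong tier.

    files is an iterable of (name, current_tier, days_since_access).
    """
    moves = []
    for name, current_tier, days in files:
        target = choose_tier(days)
        if target != current_tier:
            moves.append((name, current_tier, target))
    return moves


inventory = [
    ("run42.out", "tier-1", 400),   # finished long ago, still on fast storage
    ("mesh.dat", "tier-2", 100),    # already where the policy wants it
    ("restart.chk", "tier-1", 5),   # in active use
]
for name, src, dst in plan_moves(inventory):
    print(f"move {name}: {src} -> {dst}")
```

In this made-up inventory only `run42.out` is flagged, moving from tier-1 to tier-3. The sketch only plans the moves; actually migrating the data is left to whatever tools the storage system provides.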
In addition, tier numbering is fairly arbitrary, meaning that one person’s tier-1 might be the same as another person’s tier-2. Moreover, the tiering scheme has arbitrarily started with tier-1 which is normally considered the fastest tier. However, an “über-fast” tier is sometimes created that is typically referred to as tier-0. Again, the distinction and labeling of tiers is arbitrary so the purpose of calling certain storage “tier-0” is to emphasize that this storage tier is extremely fast, faster than the classic tier-1.
Tier-0 storage is very useful in HPC because there are applications that do a huge amount of I/O, to the point where I/O is the driving factor in the performance of the application. Having the fastest storage tier possible helps such applications run efficiently. However, since this tier is so fast, it is also usually very expensive. So the capacity of tier-0 is fairly small relative to other tiers, even tier-1 storage.
At SC10, the number of vendors discussing tier-0 storage technology was much greater than in the last few years. This trend is significant because tier-0 storage is fairly expensive and is usually the purview of vertical markets, such as the financial industry, that need the fastest performance regardless of cost, or of large HPC centers that have very large applications and need high performance I/O to keep storage from becoming a bottleneck that hurts application performance.
In the past, the tier-0 market has been fairly small. However, with the rise in the number of cores in typical systems coupled with the rise of fast co-processors (GPUs), the I/O performance and capacity requirements have risen correspondingly. This has led to an increased demand for tier-0 storage technologies which was evident on the SC10 expo floor. Let’s examine some of the vendors who are offering tier-0 storage and the various technologies behind them.
Tier-0 storage has actually been around the HPC world for a while and has become more widespread than one might think. For example, Texas Memory Systems has been offering both ramdisk and SSD based storage solutions for many years. To show you the breadth of solutions, Table 2 lists their current products and major features.
Table 2 – Texas Memory Systems
There are other tier-0 manufacturers as well, including Fusion-IO, which introduced a new 5TB PCIe based Flash storage device at SC10 called the ioDrive Octal. It is a double-wide x16 PCIe Gen2 card that is capable of about 6.2 GB/s throughput with over 1,000,000 IOPS. There is a 5.12TB card that is MLC (Multi-Level Cell) based and a 2.56TB card that is SLC (Single-Level Cell) based, with the SLC card having about 40% better write throughput (6.0 GB/s vs. 4.4 GB/s).
Violin Memory has also developed a tier-0 storage device. At SC10, they introduced a new NFS caching device called the vCACHE that uses their tier-0 storage as a cache for NFS. The vCACHE device uses one of their flash memory devices, either the Violin 3200 or the Violin 3140. The Violin 3200 is a 3U device that uses small flash devices that plug into DIMM slots. The 3U chassis has a capacity of 512GB to 10TB with over 220,000 random write IOPS. The Violin 3140 is a 3U unit with up to 40TB of flash memory (it is intended to be more of a capacity unit than a performance unit). It achieves a little over 100,000 IOPS.
The SC10 expo floor had an area devoted to “disruptive technologies”. One of the vendors in that area, Virident, was showing their new PCIe based SSD card called tachIOn. This card follows the latest trend of new SSD devices in that it has skipped the conventional SATA/SAS interface and is using a PCIe interface to increase performance. It has capacities of 300, 400, 600, and 800 GB with a peak read performance of 1.44 GB/s, a peak write performance of 1.2 GB/s, and 300,000 IOPS (4K blocks, 75% read and 25% write). One of the interesting things about the tachIOn is that it has field replaceable flash modules. If something happens to the flash on other PCIe based SSDs, you have to replace the entire card. With the tachIOn card, you just replace the specific flash module. Perhaps even better, you can add modules to add capacity to the card.
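When vendors quote both IOPS and throughput, a quick conversion helps relate the two numbers. The Python sketch below multiplies an IOPS figure by the block size. The function name is made up for this article, and it assumes that “4K” means 4,096 bytes and that GB/s is decimal (10^9 bytes per second).

```python
def iops_to_throughput_gb_s(iops, block_bytes=4096):
    """Convert an IOPS figure at a fixed block size to GB/s (decimal)."""
    return iops * block_bytes / 1e9


# The tachIOn figure quoted above: 300,000 IOPS with 4K blocks.
print(f"{iops_to_throughput_gb_s(300_000):.2f} GB/s")
```

For 300,000 IOPS at 4K blocks this works out to roughly 1.23 GB/s, which falls between the quoted peak write (1.2 GB/s) and peak read (1.44 GB/s) figures. Peak throughput and IOPS are often measured under different conditions, so treat this as a sanity check rather than a prediction.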