SandForce 1222 SSD Testing, Part 1: Initial Throughput Results

SandForce has developed a very interesting and unique SSD controller that uses real-time data compression. Compression can improve performance (throughput) and extend the life of the SSD, but the benefit hinges on the compressibility of your data. This article is the first in a series that examines the performance of a SandForce 1222-based SSD and the impact of data compressibility.

Introduction

SandForce is a relatively new company that has developed a unique SSD controller. It’s fairly well known that their controllers perform real-time data compression inside the controller itself. While this doesn’t increase the capacity of the SSD, it can increase throughput and the number of rewrite cycles the drive can sustain (i.e., the “life” of the SSD). What makes it even more interesting is that the performance and longevity of the SSD depend on your data, which is a genuinely novel and exciting concept. If your data is highly compressible, you get a performance boost and a longer-lasting SSD. If your data is essentially incompressible, you get no performance boost, and the SSD will last about as long as any other drive with the same type of cells.

Having this flexibility in your SSD creates the opportunity for what I like to think of as a “re-design” of your application to take advantage of the real-time compression. If you write your own application(s), you can structure your data so that it is easier to compress. For example, you could switch from binary output to plain-text output. Text is fairly easy to compress, so this can improve performance and longevity, but it can also increase the amount of space your data consumes.
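
To make that trade-off concrete, here is a minimal sketch (my own illustration, not part of this article’s test setup) that zlib-compresses the same random numeric data in binary and in text form. The text encoding is larger on disk but typically compresses much further; zlib is only a stand-in, since SandForce’s actual compression method is proprietary.

```python
import array
import random
import zlib

# Sample numeric data; random doubles stand in for application output.
random.seed(42)
values = [random.uniform(0.0, 1000.0) for _ in range(100_000)]

# Binary encoding: 8 bytes per value (IEEE 754 doubles).
binary_blob = array.array("d", values).tobytes()

# Text encoding: one value per line at fixed precision.
text_blob = "\n".join(f"{v:.6f}" for v in values).encode("ascii")

for name, blob in (("binary", binary_blob), ("text", text_blob)):
    compressed = zlib.compress(blob, 6)
    print(f"{name:>6}: {len(blob):>9} bytes raw -> "
          f"{len(compressed):>9} bytes compressed "
          f"({len(compressed) / len(blob):.0%} of original)")
```

On data like this, the binary blob barely compresses at all while the text version shrinks substantially, which is exactly the behavior a compressing controller can exploit.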

The obvious key to using SandForce-based SSDs is the compressibility of your data. When I discuss this issue with people, I almost always get the same immediate response: “my data is very incompressible.” However, some simple experiments I have done show that quite a few data types have some level of compressibility (admittedly, I looked primarily at HPC data types). Some of the data I examined was remarkably compressible despite being “binary.” So before you reflexively answer that your data is incompressible, it behooves you to examine your data and to test a SandForce-based SSD. There will always be exceptions, such as heavily encrypted data or data that has already been compressed with gzip, but in the majority of situations you should measure the compressibility of your data. (I will also note that I know of one compression company that can compress even already-compressed data files and heavily encrypted data.)
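
If you want a quick first look at your own data before testing a drive, a rough sketch like the one below will do. It zlib-compresses sampled chunks of a file and reports the overall ratio; the script is my own illustration, and zlib is not SandForce’s (proprietary) algorithm, so treat the number only as a rough indicator of compressibility.

```python
import sys
import zlib

def estimate_compressibility(path, chunk_size=1 << 20, max_chunks=64):
    """Return the compressed/raw size ratio over sampled chunks of a file."""
    raw = compressed = 0
    with open(path, "rb") as f:
        for _ in range(max_chunks):
            chunk = f.read(chunk_size)
            if not chunk:
                break
            raw += len(chunk)
            compressed += len(zlib.compress(chunk, 6))
    return compressed / raw if raw else 1.0

if __name__ == "__main__":
    # Usage: python compressibility.py file1 [file2 ...]
    for path in sys.argv[1:]:
        print(f"{path}: compressed size is {estimate_compressibility(path):.0%} of raw")
```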

This article examines the performance of a 64GB Micro Center SSD that uses a SandForce 1222 controller. It is a very inexpensive drive that gives you about 60GB of usable space for just under $100 (about $1.67/GB). The specifications on the website state that the drive has a SATA 3.0 Gbps (SATA II) interface and performance of up to 270 MB/s for writes, 280 MB/s for reads, and up to 50,000 IOPS.

To examine performance, this article uses IOzone to test the throughput of the SandForce SSD. IOzone has the capability of specifying the “dedup” level of the data used in the tests. This “dedupability” is essentially the same thing as compressibility, so it allows me to control the level of data compression in the testing. The details of the test configuration are below, as are the details of the benchmarks and tests run. It’s been a while since I’ve presented benchmarks, but we always need to use good benchmarking techniques, so now would be a good time to review them.
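
As a preview of what such a run looks like, here is a hedged sketch of driving IOzone from Python. The -+w option (percent of dedupable data in the write buffers) is the knob that controls the data’s compressibility; the flags reflect IOzone’s documented options, and the path and sizes are placeholders rather than the actual configuration used in this series.

```python
import subprocess

# Illustrative IOzone invocation; verify the flags with `iozone -h`.
cmd = [
    "iozone",
    "-i", "0",                    # test 0: write/re-write
    "-i", "1",                    # test 1: read/re-read
    "-e",                         # include flush (fsync) in the timing
    "-s", "32g",                  # file size; larger than RAM limits cache effects
    "-r", "4k",                   # record size
    "-+w", "50",                  # percent of dedupable data in the buffers
    "-f", "/mnt/ssd/iozone.tmp",  # test file on the drive under test
]
subprocess.run(cmd, check=True)
```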

IOzone

IOzone is one of the most popular throughput benchmarks. It’s open source and written in very plain ANSI C (a compliment, not an insult). It is capable of single-threaded, multi-threaded, and multi-client testing. The basic concept of IOzone is to break up a file of a given size into records. Records are written or read in some fashion until the file size is reached. Using this concept, IOzone has a number of tests that can be performed:

  • Write
    This is a fairly simple test that simulates writing a new file. Because new metadata must be created, writing a new file is often slower than rewriting an existing one. The file is written using records of a specific length (either specified by the user or chosen automatically by IOzone) until the total file length has been reached.
  • Re-write
    This test is similar to the write test but measures the performance of writing to a file that already exists. Since the file and its metadata already exist, re-write performance is commonly expected to be greater than write performance. The test opens the file, places the file pointer at the beginning, and writes to the open file descriptor using records of a specified length until the total file size is reached. It then closes the file, which updates the metadata.
  • Read
    This test reads an existing file. It reads the entire file, one record at a time.
  • Re-read
    This test reads a file that was recently read. This test is useful because operating systems and file systems will maintain parts of a recently read file in cache. Consequently, re-read performance should be better than read performance because of the cache effects. However, sometimes the cache effect can be mitigated by making the file much larger than the amount of memory in the system.
  • Random Read
    This test reads a file with the accesses made to random locations within the file. The reads are done in record-size units until the total amount read equals the file size. The performance of this test is impacted by many factors, including the OS cache(s), the number of disks and their configuration, disk seek latency, and disk cache, among others.
  • Random Write
    The random write test measures the performance of writing a file with the accesses made to random locations within the file. The file is opened to the total file size, and then the data is written in record-size units to random locations within the file.
  • Backwards Read
    This is a unique file system test that reads a file backwards. There are several applications, notably MSC Nastran, that read files backwards, and there are some file systems and even OSes that can detect this access pattern and enhance its performance. In this test, a file is opened, the file pointer is moved one record forward, and the file is read backward one record. Then the file pointer is moved two records forward, and the process continues. (A simplified sketch of this pattern appears after this list.)
  • Record Rewrite
    This test measures the performance of writing and re-writing a particular spot within a file. The test is interesting because it can highlight “hot spot” capabilities within a file system and/or an OS. If the spot is small enough to fit into the various caches (CPU data cache, TLB, OS cache, file system cache, etc.), the performance will be very good.
  • Strided Read
    This test reads a file in what is called a strided manner. For example, you could read at a file offset of zero for a length of 4 Kbytes, then seek 200 Kbytes forward, then read for 4 Kbytes, then seek 200 Kbytes, and so on. The constant pattern is important, and the “distance” between the reads is called the stride (in this case, 200 Kbytes). This access pattern is used by many applications that read certain data structures. The test can highlight interesting issues in file systems and storage because the stride could cause the accesses to miss any striping in a RAID configuration, resulting in poor performance. (A simplified sketch of this pattern also appears after this list.)
  • Fwrite
    This test measures the performance of writing a file using the library function fwrite(). It is a binary stream function (examine the man pages on your system to learn more). Equally important, the routine performs buffered writes, with the buffer in user space (i.e., not part of the system caches). The test creates a record-length buffer in user space and writes it to the file, repeating until the entire file is created. Like the “write” test, it creates a new file, possibly stressing the metadata performance.
  • Refwrite
    This test is similar to the “re-write” test but uses the fwrite() library function. Ideally the performance should be better than that of the fwrite test because it writes to an existing file, so the metadata performance is not stressed in this case.
  • Fread
    This test uses the fread() library function to read a file. It opens a file and reads it in record lengths into a buffer in user space, continuing until the entire file is read.
  • Refread
    This test is similar to the “re-read” test but uses the fread() library function. It reads a recently read file, which may allow file system or OS caches to be used, improving performance.
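
To make the less common access patterns concrete, here is a simplified sketch of the backward and strided read patterns in Python. It is my own illustration (IOzone’s internal bookkeeping differs), and the record and stride sizes are simply the values from the strided-read example above.

```python
import os

RECORD = 4096          # record size in bytes (illustrative value)
STRIDE = 200 * 1024    # distance between read start offsets, per the example

def read_backwards(path, record=RECORD):
    """Read a file one record at a time, from the end toward the beginning."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = size - record
        while offset >= 0:
            f.seek(offset)
            f.read(record)
            offset -= record

def read_strided(path, record=RECORD, stride=STRIDE):
    """Read one record, then start the next read `stride` bytes later."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset + record <= size:
            f.seek(offset)
            f.read(record)
            offset += stride
```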

There are other tests that can be run, but this exploration examines only those discussed above. Even this list of tests is fairly extensive and covers a large number of the application access patterns you are likely to see (though not all of them).

There are a large number of command line options available for IOzone, far more than will be covered here. The next sections will present the test system as well as the specific IOzone commands used.


Test System

Comments on "SandForce 1222 SSD Testing, Part 1: Initial Throughput Results"

solanum

Can you rerun the performance tests with a 2.6.37.x kernel? They made a number of changes to the kernel in the block device layer, and I am wondering how much that impacts the performance. :)

Reply
    laytonjb

    Stay tuned! That is my plan for the last part in this series.

    The next part will cover initial IOPS performance. Part 3 will cover a more in-depth throughput study (and comparison to an Intel SSD). Part 4 will do the same more in-depth study and comparison but for IOPS. Then Part 5 will compare the 2.6.32 kernel to the latest kernel (probably 2.6.37 but maybe 2.6.38 if it comes out).

    Jeff

    Reply
pjwelsh

Ahh, forget 2.6.37! Add the mainline kernel tracker repo with version 2.6.38 (currently) from the GREAT folks at ElRepo.

Reply
sdanpo

Excellent article!

I liked the thoroughness of the test and the great data derived from it.
Looking forward to the coming parts.

Disclaimer: This comment is written by an Anobit employee.
Anobit is an enterprise SSD vendor whose drives are data-pattern agnostic.

Reply
storm

I’ve been reading about and testing SSDs for years and am finally leaving my first comment. I’m doing so because none of the benchmarks I’ve read test the Achilles heel of SSDs, which happens to be our production workload.

I would suggest doing a mixed random read/write workload with a 64GB file (the full extent of the drive) with a 4k write size that runs for a long time, e.g., a day, to arrive at steady-state behavior. When I was working with FusionIO’s engineers while beta testing the ioDrive, they said this is the most torturous workload they’ve ever seen, and they had to make a number of changes to the driver for us as a result. Caches get quickly overwhelmed; wear leveling/grooming quickly gets pinned shuffling blocks around and can lead to huge periodic drops in performance unless they are amortized over time (SSDs are over-provisioned under the hood to help with this); block aligning/elevator algorithms don’t help due to the randomness; the small IO size kills throughput; the mixed nature of the r/w IOPS (especially when done in parallel) can cause havoc with the rewrite algorithm; etc. The dirty little secret in the industry is to quote inflated random IOPS performance using a file that is 1/4-1/3 the size of the drive.

Another surprise that we’ve found during testing is how drives perform as you increase the number of parallel read/write threads. With Fusion, for instance, it doesn’t make much of a difference positively or negatively. Virident’s tachIOn drive, however, tripled in performance! We were blown away. FYI this is the best SSD we’ve tested to date.

Ok, that was cathartic :) Thanks for letting me rant.

Thanks for the great article and I look forward to the rest.

Reply
detroitgeek

I have been looking at an SSD to put my OS on, and I plan on having my home directory on a standard drive. I worry about the lifetime of the SSD under these conditions because of all the writing the OS does. My /var directory would also be on a standard drive. Is my concern realistic?

Reply
eoverton

I had issues with my drive going off-line randomly. Seems it was a BIOS issue. But which BIOS? See http://ssdtechnologyforum.com/threads/835-Sandforce-SSD-Firmware-Version-Confusion. So I upgraded my BIOS from the Adata site. The drive does not have the issue anymore.

Reply
    laytonjb

    Was this the same MicroCenter drive that I tested? I was told it was an Adata drive but I haven’t been able to confirm that.

    What were the symptoms of the drive going off-line? What distro/kernel were you using?

    Thanks!

    Jeff

    Reply
      eoverton

      I was using windose at the time :(. The drive would go off-line and I would get a BSOD, or sometimes it would reboot and halt at “Could not find bootable drive.” I would power off, wait, then power on, and everything would be OK. I confirmed mine was an Adata from the small manual the drive came with and from googling. The issue did not look like it was OS related.

      Reply
venugopalan

This article gives a very nice heads-up on IOPS and SSD controllers.

Reply
