The goal of this process is threefold:
Thus, Dell's approach to flash tiering succeeds in leveraging the best that SLC and MLC devices bring to the table while avoiding the sweeping compromises made by single-tier deployments of "mixed-use SLC" (SLC with less wear-leveling capacity) and "eMLC" (MLC with added wear-leveling capacity). Said another way, it's much more like having a tiered 15K SAS/7.2K NL-SAS spinning-disk array that can give you the benefit of both types of media versus having a single-tier 10K SAS spinning-disk array that gives you something in between.
Yes, there's a catch
It's rare that an engineering decision doesn't have some kind of drawback. In this case, the catch is found in the creation of those snapshots that are so vital to the tiered-flash model. If data is immediately moved from the write-optimized SLC tier to the read-optimized MLC tier upon creation of a snapshot, there's an obvious cost to doing that. The load on the SLC tier will increase as data is read out of it and written into the MLC tier, and this can't help but impact performance on the SLC tier whenever host I/O is driving those SSDs to their limits. Worse yet, pages that are migrated from one tier to the next have to be locked during the operation, and this can cause contention in very high-I/O situations given that committing data to the MLC tier takes three to five times longer than reading it from the SLC tier.
To test the impact of this, I created a worst-case scenario in the lab. I set up a series of volumes and started directing a breakneck read and write load at all of them. In my case, it was a stream of randomized 4K I/Os with a 70/30 mix of reads versus writes (very roughly approximating an OLTP workload). This workload was isolated to a fairly small footprint on the array (about 80GB in total).
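For readers who want to reason about this kind of load, the access pattern described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the tool used in the lab test. The helper name `generate_ops`, the seed, and reading "80GB" as 80 GiB are all assumptions made for the example.

```python
import random

BLOCK_SIZE = 4 * 1024       # 4K I/O size, as in the test
FOOTPRINT = 80 * 1024**3    # ~80GB working set (assumed here to mean 80 GiB)
READ_RATIO = 0.70           # 70/30 mix of reads versus writes


def generate_ops(count, seed=0):
    """Yield (kind, byte_offset) tuples for a randomized 4K workload.

    Hypothetical helper: it only describes the I/O pattern and never
    touches a storage device.
    """
    rng = random.Random(seed)
    blocks = FOOTPRINT // BLOCK_SIZE
    for _ in range(count):
        kind = "read" if rng.random() < READ_RATIO else "write"
        offset = rng.randrange(blocks) * BLOCK_SIZE  # 4K-aligned offset
        yield kind, offset


if __name__ == "__main__":
    ops = list(generate_ops(100_000))
    reads = sum(1 for kind, _ in ops if kind == "read")
    print(f"share of reads: {reads / len(ops):.1%}")
```

Because every offset is drawn uniformly from a small footprint, a pattern like this keeps rewriting the same set of blocks, which is why the snapshot test that follows counts as a worst case.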
[Screenshot: Enterprise Manager will also help you keep an eye on the health and wear of the SSDs.]
At first, the entry-level "6+6" (SLC+MLC) configuration handled this workload entirely with the SLC tier and clocked in at more than 70,000 IOPS with sub-5ms latencies -- truly impressive considering a similarly priced spinning-disk array would be hard-pressed to serve up a third of those IOPS with three times the latency. However, things took a turn for the worse when I created a snapshot that simultaneously impacted all the volumes I was throwing my workload against. The I/O stream came to a screeching halt -- immediately dropping to about 3,500 IOPS and slowly crawling back up to its previous speed over a period of a few minutes.
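To put those figures in perspective, a quick back-of-the-envelope calculation helps. The sketch below uses the rounded numbers quoted above (70,000 and 3,500 IOPS at 4K per I/O), so the results are approximations rather than measurements. The helper name `throughput_mib_s` is hypothetical.

```python
IO_SIZE = 4 * 1024  # bytes per I/O (4K, as in the test workload)


def throughput_mib_s(iops, io_size=IO_SIZE):
    """Convert an IOPS figure into approximate throughput in MiB/s."""
    return iops * io_size / 1024**2


def relative_drop(before, after):
    """Fraction of performance lost when going from `before` to `after`."""
    return 1 - after / before


if __name__ == "__main__":
    before, after = 70_000, 3_500  # rounded IOPS figures from the article
    print(f"before snapshot: {throughput_mib_s(before):.1f} MiB/s")
    print(f"after snapshot:  {throughput_mib_s(after):.1f} MiB/s")
    print(f"drop:            {relative_drop(before, after):.0%}")
```

In other words, the array went from roughly 273 MiB/s of small-block traffic to under 14 MiB/s, a 95 percent drop, before recovering over the following minutes.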