Flash storage, most often deployed as solid-state drives (SSDs), is a promising technology that will be deployed in nearly every data center over the next decade. The primary downside is price, which, despite vendor claims, is 3X-10X that of spinning media (HDDs). Here are two ways storage architects can go about analyzing their current and future requirements to understand which workloads will benefit from flash storage.
One of the best ways to understand your deployment requirements is to build an accurate model of your current storage I/O profiles. This model can be used to test new architectures, products and approaches. The goal is a workload model realistic enough to compare the different technologies, devices, configurations and even software/firmware versions that would be deployed in your infrastructure.
The first step in modeling workloads effectively is to identify the storage traffic characteristics with the biggest potential impact on performance. For any deployment, it is critical to understand the peak workloads, specialized workloads such as backups and end-of-month/end-of-year patterns, and impactful events such as login/logout storms.
There are three basic areas to consider when characterizing a workload. The first is the description of the size, scope and configuration of the environment itself. The second is to understand the access patterns for how frequently and in what ways the data is accessed. The third is to understand the load patterns over time.
File (NAS) and block (SAN) storage environments each have unique characteristics that must be understood in order to create an accurate workload model. For example, in NAS environments you need to determine the number of relevant clients and servers, the number of clients per server, the file size distribution, the sub-directory distribution and the directory tree depths.
For SAN environments, you need to determine the number of physical initiators (HBAs/NICs), the average number of virtual initiators per physical initiator, the average number of active virtual initiators per physical port, the number of logical units (LUNs) per HBA and the queue depth settings for the server HBAs or iSCSI LUNs.
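As a rough illustration, the SAN parameters listed above could be captured in a small record so that different environments can be compared side by side. The sketch below is in Python; the class name, field names and sample numbers are hypothetical, and the derived ceiling on outstanding I/Os assumes the queue depth applies per LUN, which may not match how your HBAs or iSCSI initiators are configured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SanEnvironment:
    """Hypothetical record of the SAN characteristics named in the text."""
    physical_initiators: int          # HBAs/NICs
    virtual_per_physical: float       # average virtual initiators per physical initiator
    active_virtual_per_port: float    # average active virtual initiators per physical port
    luns_per_hba: int                 # logical units per HBA
    queue_depth: int                  # queue depth setting (assumed here to be per LUN)

    def total_virtual_initiators(self) -> float:
        # Average count of virtual initiators across the whole environment.
        return self.physical_initiators * self.virtual_per_physical

    def max_outstanding_ios(self) -> int:
        # Upper bound on in-flight commands, ASSUMING queue depth is per LUN.
        return self.physical_initiators * self.luns_per_hba * self.queue_depth


# Illustrative numbers only, not measurements from any real environment.
env = SanEnvironment(
    physical_initiators=8,
    virtual_per_physical=4.0,
    active_virtual_per_port=2.5,
    luns_per_hba=16,
    queue_depth=32,
)
print(env.total_virtual_initiators(), env.max_outstanding_ios())
```

A record like this is only a starting point; the access and load patterns still have to be layered on top of it.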
Access patterns are the key to understanding how frequently and by what means storage is accessed. It is important to consider several use cases, such as average load, peak load and special business events. Proper characterization of access patterns also differs between file and block storage.
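To make the average and peak cases concrete, a simple summary of sampled load might look like the following Python sketch. The function name `summarize_load` and the sample values are assumptions made for illustration; in practice the samples would come from whatever monitoring your arrays or hosts already provide.

```python
from statistics import mean


def summarize_load(iops_samples):
    """Return the average, peak and peak-to-average ratio of IOPS samples."""
    if not iops_samples:
        raise ValueError("at least one sample is required")
    average = mean(iops_samples)
    peak = max(iops_samples)
    # A high ratio suggests bursty load that an average-only model would miss.
    ratio = peak / average if average else float("inf")
    return {"average": average, "peak": peak, "peak_to_average": ratio}


# Hypothetical IOPS samples taken at fixed intervals; the spike stands in
# for an event such as a login storm.
samples = [1000, 1200, 900, 6000, 1100, 1800]
summary = summarize_load(samples)
print(summary)
```

Running the same summary separately over ordinary days, backup windows and month-end periods gives one profile per use case.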
In NAS environments, information about each file is tied to the file, directory and computer, including data such as file name, location, creation date, last written date, access rights and backup state. This information, called metadata, often makes up the bulk of all file access commands and storage traffic.
Some application access patterns consist of over 90% metadata commands, with less than 10% devoted to reads and writes. For file access it is important to know the percentage breakdown for each command. Freeware tools like Iometer, which many flash vendors use for IOPS claims, are of little use in file storage environments because Iometer can't model metadata commands. Including these metadata commands shows how an application stresses the storage infrastructure and the processing that occurs in each computer, not just the file system.
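The per-command percentage breakdown can be sketched in a few lines of Python. The trace below is invented for illustration, the operation names follow NFS procedure names such as GETATTR and LOOKUP, and treating everything other than READ and WRITE as metadata is a simplifying assumption rather than a definition taken from the article.

```python
from collections import Counter

# Assumption: only READ and WRITE count as data operations.
DATA_OPS = {"READ", "WRITE"}


def command_mix(ops):
    """Return the percentage share of each command in a list of operations."""
    counts = Counter(op.upper() for op in ops)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {op: 100.0 * n / total for op, n in counts.items()}


def metadata_share(ops):
    """Return the percentage of operations that are not data reads or writes."""
    mix = command_mix(ops)
    return sum(pct for op, pct in mix.items() if op not in DATA_OPS)


# Hypothetical trace; a real one would be captured from the file server or network.
trace = (["GETATTR"] * 50 + ["LOOKUP"] * 25 + ["ACCESS"] * 15
         + ["READ"] * 6 + ["WRITE"] * 4)
mix = command_mix(trace)
print(mix, metadata_share(trace))
```

A breakdown like this is what a workload model needs as input if it is to reproduce the metadata-heavy behavior described above.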