For more than 15 years, Matt Henderson has been a database and systems architect specializing in Sybase and SQL Server platforms, with extensive experience in high-volume transactional systems, large data warehouses and user applications in the telecommunications and insurance industries. Matt is currently an engineer at Violin Memory.
IT departments and data centers are on a constant treadmill, expected to deliver faster and cheaper while always operating at greater scale. The requirements never end: Faster. Larger. Cheaper. Quicker. Features versus speed. Cost versus performance. Ease versus flexibility. Solution architecting can be the difference between long-term success and failure.
Feeding the need for continuous optimization is a constant stream of new technologies. New technologies can be broken down into two categories: those that modernize and those that revolutionize. Modernizing is basically doing the same thing as before, just a little bit faster or a little bit easier. Revolutionizing is either eliminating or vastly changing how something is accomplished. Typically, it takes a whole new technology to be fully revolutionary.
Taking Storage to a New Level
In the case of storage, a new technology has come to market relatively recently: NAND flash. Instead of storing 1s and 0s in magnetic sectors on spinning platters read by a swing-arm head, it stores data on silicon chips that are dynamically addressable. While this allows for Random Access Storage (RAS) in the persistent tier, it does not come without complexity and cost.
To sell this new technology, the industry chose the fastest and easiest route to market, and the one most likely to sell: the SSD (Solid State Drive). SSDs are specifically designed as a drop-in refresh of the bootable hard disk drive (HDD). The designers took the 2.5” hard disk drive form factor, removed the spinning platters and replaced them with NAND flash chips.
SSDs have the same physical size and the same physical connectors as HDDs, which makes it easy to plug them right into existing infrastructures. This allows vendors to sell to anyone in the enterprise or consumer market immediately, without having to spend and risk a lot of capital building custom flash-optimized storage devices. SSDs were designed to make flash easy to sell. They were not designed to be the optimum deployment of flash. SSDs are a modernization, not a revolution.
SSDs, while faster than HDDs, qualify as a modernization because they leave the infrastructure, architecture and management of storage entirely in place while just making the individual storage components faster. Aggregation and Segregation (A&S) has long been the standard model for deploying enterprise data over many atomic parts. Left standing are all the typical issues and challenges of this type of A&S architecture:
- Must define each workload and its I/O profile
- Must determine how many units to allocate to each workload
- Must determine RAID factor for each workload
- Must choose which unit type to deploy in each LUN group (SATA, 10k, 15k, SSD, etc.)
- Must assess the consequences when the I/O profile changes over time
- Must accept that segregating units strands performance
- Must consider that legacy chassis controllers are not designed for NAND flash specific issues (wear leveling, error correction, write cliff mitigation, etc.)
- Must consider legacy chassis or shelf engines and controllers (designed for hard disk drive speeds)
- Must consider administrator time spent managing data locality issues
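The stranded-performance item in the list above can be illustrated with a rough sketch. Everything in it is an assumption made for illustration: the `Group` type, the workload names, the per-device ceiling `DEVICE_IOPS` and the demand figures are hypothetical, and real arrays are affected by RAID overhead, caching and controller limits that are ignored here.

```python
from dataclasses import dataclass

# Hypothetical per-device ceiling, chosen only to make the arithmetic easy.
DEVICE_IOPS = 200


@dataclass
class Group:
    name: str          # workload served by this LUN group (illustrative)
    devices: int       # number of devices allocated to the group
    demand_iops: int   # IOPS the workload is currently asking for


def segregated(groups):
    """Each workload may only use the devices in its own group."""
    served = {}
    stranded = 0
    for g in groups:
        capacity = g.devices * DEVICE_IOPS
        served[g.name] = min(g.demand_iops, capacity)
        # Capacity an idle group cannot lend to a busy one.
        stranded += max(capacity - g.demand_iops, 0)
    return served, stranded


def pooled(groups):
    """All workloads share every device; scale down only if oversubscribed."""
    capacity = sum(g.devices for g in groups) * DEVICE_IOPS
    demand = sum(g.demand_iops for g in groups)
    scale = min(1.0, capacity / demand) if demand else 0.0
    return {g.name: int(g.demand_iops * scale) for g in groups}


groups = [
    Group("oltp", devices=8, demand_iops=3000),
    Group("reports", devices=16, demand_iops=400),
    Group("archive", devices=8, demand_iops=100),
]

served, stranded = segregated(groups)
print(served, stranded)
print(pooled(groups))
```

With these made-up numbers, the segregated layout caps the busy workload at its own group's 1,600 IOPS while 4,300 IOPS sit unused in the other two groups. The same 32 devices treated as one pool could serve all three workloads in full.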
If SSDs are just a modernization, what then would be a revolution? An All Flash Array (AFA) is a distributed-block, flash-as-a-chassis persistent storage appliance that deploys and integrates terabytes of flash into one device.
Flash deployed as one integrated device allows the technology-specific issues (wear leveling, error correction and write cliff mitigation) to be handled across larger quantities of chips, and allows every I/O to run at the maximum speed of the whole storage appliance. Workloads no longer need to be segregated, so speed is no longer stranded in underutilized groupings and hot spots no longer need to be managed. It also allows solutions architects to work with much larger blocks of storage. Imagine having 40-100 TB of usable storage that all works at the same speed with no tuning or advance planning. Three characteristics define this approach:
- Random Access Storage (RAS). A memory-like architecture where every storage address is equally accessible, at the same speed, all the time. Any workload using any data will perform the same at any time. When sequential and random access become the same, any number of workloads can be active at the same time, allowing for scale (parallelization) without performance degradation.
- Distributed Block Architecture (DBA). With every I/O hitting every component every time, parallelization of flash is at its maximum, delivering the best possible speeds to every I/O, every time. Segregation of units into LUNs decreases parallelization instead of maximizing it.
- Parallelization. As the CPU manufacturers have migrated from delivering faster cores to delivering more cores, the model for application processing has turned into massively parallel workloads. This has driven the need for storage that can be dynamically accessed with a high rate of random requests. While SSDs deliver random storage access, they do it over a small footprint. The more flash chips that are engaged per I/O, the more parallel the processing of each request can be. Only terabyte-sized, chassis-aware appliances can deploy enough flash chips to sustain hundreds of thousands to millions of I/Os per second with bit-level striping.
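The idea of every I/O engaging every component can be sketched as a simple round-robin striping map. This is a minimal illustration, not a description of any vendor's implementation: the function name `stripe`, the chip counts and the stripe unit are all hypothetical, and the sketch stripes at byte granularity for readability rather than at the bit level mentioned above.

```python
def stripe(offset, length, num_chips, stripe_unit):
    """Map one I/O onto chips under round-robin striping.

    Returns {chip_index: bytes_served}. All parameters are illustrative:
    offset and length are in bytes, stripe_unit is the number of
    consecutive bytes placed on one chip before moving to the next.
    """
    per_chip = {}
    pos = offset
    end = offset + length
    while pos < end:
        unit = pos // stripe_unit            # which stripe unit holds pos
        chip = unit % num_chips              # round-robin chip assignment
        unit_end = (unit + 1) * stripe_unit  # first byte of the next unit
        chunk = min(end, unit_end) - pos     # bytes of this I/O in the unit
        per_chip[chip] = per_chip.get(chip, 0) + chunk
        pos += chunk
    return per_chip


# One 4 KB I/O against a small 4-chip group versus a 64-chip pool.
small_group = stripe(0, 4096, num_chips=4, stripe_unit=64)
large_pool = stripe(0, 4096, num_chips=64, stripe_unit=64)
print(len(small_group), max(small_group.values()))
print(len(large_pool), max(large_pool.values()))
```

In this toy model the same 4 KB request is split across 4 chips handling 1,024 bytes each in the small group, but across 64 chips handling 64 bytes each in the large pool. That is the sense in which a larger integrated footprint raises the parallelism available to a single I/O.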
Only the invention of a new technology can allow something to quickly become cheaper, faster and simpler. AFAs are that new technology deployed in its proper form. The future of storage is the simple-to-deploy, large-footprint, ultra-low-latency, chassis-aware storage appliance that uses NAND flash to deliver a distributed block architecture, allowing applications to take advantage of random access storage.
Storage purchases will usually have a production life of 3 to 5 years. What do you want your data residing on 3 years from now: something modern or something revolutionary?
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating. View previously published Industry Perspectives in our Knowledge Library.