Increase the IQ, release the IOPS

By Edward Lee, Architect, Tintri.


It’s no news to anyone in our industry that flash is revolutionising storage. The decades-long bottleneck in storage, the disk spindle, is being obliterated! A single commodity SSD is 400 times faster than a hard disk. In comparison, the speed of sound is "only" 250 times faster than walking! Moreover, flash will continue to scale with rapid improvements in semiconductor technology.


While flash provides extraordinarily high IOPS, it brings a whole new set of problems: write amplification, latency spikes, limited write endurance, and, last but not least, very high cost per GB. Today, commodity MLC SSDs cost about $1/GB, fifteen to twenty times more than SATA hard disks. At that price, running many mainstream applications on SSD is too expensive. To leverage the high IOPS while compensating for the high $/GB, flash storage systems employ a variety of techniques such as caching, tiering, and inline compression and dedupe.


Existing storage vendors and new entrants have attempted to exploit flash in different ways. These products can be grouped into two broad categories, based on their impact on latency:
· Disk-based products with flash as a cache.
· Flash-based products (with or without HDDs for expanded capacity).
Disk-based products are fundamentally designed to optimise the use of hard disk drives, with flash bolted on as a cache to accelerate read performance. Flash as a cache is relatively easy to implement, so it is not surprising that legacy storage vendors have taken this path. Many flash-as-cache implementations are non-persistent and non-redundant, so performance plummets after crashes or component failures. Since the “master” copy remains on hard disk, reads benefit from flash, but writes do not. Therefore, overall performance will not scale proportionately with improvements in flash technology.

Disk-based architectures suffer from high latency even with flash added to them. Hit rates of 50% are typical for disk-based products with flash as a cache. Even with hit rates as high as 67%, average read latencies are ten times higher than those of flash-based products.
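To see why a cache hit rate that sounds respectable still leaves latency dominated by disk, consider a rough sketch of the weighted-average arithmetic. The device latencies below (0.2 ms for flash, 5 ms for disk) are illustrative assumptions, not figures from this article, and the function name is hypothetical.

```python
def average_read_latency_ms(hit_rate, flash_ms, disk_ms):
    """Expected read latency when a fraction `hit_rate` of reads is served from flash."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Hits are served at flash latency, misses fall through to disk.
    return hit_rate * flash_ms + (1.0 - hit_rate) * disk_ms


# Illustrative device latencies: assumptions, not measurements from the article.
FLASH_MS = 0.2
DISK_MS = 5.0

for hit_rate in (0.50, 0.67, 0.90, 0.99):
    avg = average_read_latency_ms(hit_rate, FLASH_MS, DISK_MS)
    print(f"hit rate {hit_rate:.0%}: {avg:.2f} ms average, "
          f"{avg / FLASH_MS:.1f}x flash latency")
```

Under these assumed latencies, a 67% hit rate works out to roughly 1.8 ms, about nine times the assumed flash latency, which is the same order of magnitude as the gap described above. The misses, not the hits, set the average.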


Because disk-based products rely on disks as a key part of their basic data path, they have difficulty achieving flash-level latency and will be left behind by rapid improvements in flash performance.


In contrast, flash-based products are designed specifically for flash rather than mechanical disk drives, delivering dramatically lower latency. The key distinction is that their basic data paths do not require accessing disk. Often using low-cost MLC technology, flash-based products sometimes incorporate new techniques such as inline dedupe and compression to reduce the high $/GB of flash. Most flash-based products are flash-only, while some integrate hard disks to expand capacity and simplify management.


Initial flash-only products are basic arrays. Focused on getting the highest possible IOPS, they generally have very high $/GB, and are missing enterprise features such as HA, snapshots, and clones. Even with inline dedupe and compression, flash-only arrays are currently too expensive for running the majority of applications in an enterprise. Even very aggressive estimates for these advanced techniques cannot overcome the >15x cost/GB gap between low-cost SATA HDDs and MLC flash devices. Consequently, flash-only arrays require separate low-cost disk-based storage systems for storing snapshots, replicas, infrequently accessed data, and the data of less IO-intensive applications.
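The cost argument can be checked with simple arithmetic. The sketch below takes the $1/GB flash price and the low end of the fifteen-to-twenty-times gap quoted earlier, and applies a range of hypothetical data reduction ratios. The ratios and the function name are assumptions for illustration, and disk data is assumed to be stored without any reduction.

```python
def effective_cost_per_gb(raw_cost_per_gb, reduction_ratio):
    """Cost per logical GB after dedupe/compression shrinks data by `reduction_ratio`."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


FLASH_RAW = 1.00            # $/GB for commodity MLC SSD, as quoted in the article
DISK_RAW = FLASH_RAW / 15   # SATA HDD at the low end of the quoted 15x to 20x gap

for ratio in (2, 5, 10, 15):  # hypothetical data reduction ratios
    flash = effective_cost_per_gb(FLASH_RAW, ratio)
    print(f"{ratio}x reduction: flash ${flash:.3f}/GB, "
          f"{flash / DISK_RAW:.1f}x the cost of disk")
```

On these assumptions, flash only reaches parity with unreduced disk when the reduction ratio equals the raw price gap. A 5x reduction, for instance, still leaves flash at about three times the cost of disk per GB.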

As a result, flash-only arrays require significant additional work to stage and de-stage data and applications between flash and disk. Combined with their high $/GB and lack of enterprise features, this means that flash-only arrays are much better suited for very high-performance applications without significant data management requirements. Using flash-only products for enterprise applications will require extensive planning, monitoring and additional supporting infrastructure.


I have already discussed disk-based products that use bolt-on flash as a cache. These systems still access disk in their basic data paths and cannot take full advantage of flash performance. Flash-based products, on the other hand, are designed to fully leverage flash. Flash-only products have very good IOPS but are expensive and lack management features, making them incomplete solutions for enterprise applications. Fully exploiting flash performance, while eliminating the limitations and high costs of flash-only products, requires a more intelligent approach.


Intelligent flash-based products use a combination of flash and hard disk, but apply techniques such as inline deduplication, compression and working set analysis to service nearly all IO from flash. Most data evicted from flash is snapshots, replicas, unused applications, powered-off VMs and other very cold data. Unlike flash-only products, you can fill 100 percent of the useable flash without worrying about running out of space and having your applications come to a screeching halt. Intelligent flash-based products achieve sub-millisecond flash latencies, and are operationally far simpler and more cost-effective than flash-only products.


Given that many application management problems today originate from storage, flash combined with application-awareness allows intelligent storage systems to simplify not only storage management but also applications and the overall IT infrastructure. So why hasn’t this been done before? Prior to the advent of flash, mechanical disk-based systems were too complex to support a high level of intelligence. It would be like trying to build a personal computer using vacuum tubes. The huge leap in performance that flash delivers at last puts intelligent storage within reach.

Flash-only architectures deliver performance but still require a disk tier for cold data. Intelligent flash architectures are built around flash, but can use disk as an integral part of the system to provide cost-effective data and performance management.


Although it is easy to think of flash as simply faster storage, it can offer far more. We have eliminated a key mechanical barrier to scaling computing systems. Computation, communication and, finally, storage will now scale with improvements in semiconductor technology. Consider that when transistors replaced vacuum tubes, we got much more than merely compact radios. We got more powerful and more intelligent systems. Similarly, flash is a technology with potentially profound impact when properly harnessed: it enables intelligent products that are far simpler and far more powerful. Such products can automate away many tough but tedious problems, such as configuration, management, efficiency and performance barriers, that waste enormous amounts of system administrator effort.
Simple, intelligent, fast: This is the future of enterprise storage.
 
