UPDATED 12:00 EDT / JUNE 21 2021


A new path for unstructured data in a hybrid world

Today, data is arguably more critical to business success than traditional physical assets. As computing faces the pressures of a data-driven era, theCUBE, SiliconANGLE Media’s livestreaming studio, reviews some of the milestone moments in enterprise data storage and explores the evolution of today’s emerging hybrid solutions.

A story of storage and speed

Enterprise data storage as we know it started in the late 1950s, when IBM released the first hard disk drive. Retroactively named directly attached storage, or DAS, these devices marked the dawn of enterprise storage. But they were bulky and expensive. Each IBM 350 disk storage unit was the size of a medium refrigerator and cost $34,500. That’s $300,000 in today’s money, or $81,500 per megabyte. Imagine if that cost-to-capacity ratio held today.

IBM 350 unit being loaded for shipping (Image: Pinterest / Dawn of the Computer Era)

Despite these shortcomings, DAS ruled until the 1980s when the invention of local area networks led to network-attached storage, which enabled many computers to access a centralized storage device. The 1990s brought the next iteration in the form of storage area networks, known as SAN, although it wasn’t until the 2000s that pre-built SAN became widely used.

Physical size and expense dropped dramatically over the years of hard disk development, while storage capacity grew. The last hard disk drives IBM produced measured less than six inches across and held 180 gigabytes of data.

As compute technology went into high-speed evolution, the mechanical nature of disk storage became a stumbling block.

“Disks became smaller, heads moved across the magnetic media from one track to another, but nothing has had much impact on the time to access data. Magnetic disks are limited by rotation speed, and access times are measured in milliseconds while compute cycles are measured in nanoseconds,” stated David Floyer, research analyst at The Wikibon Project, in a 2015 analysis of the evolution of all-flash array architectures. Wikibon is a SiliconANGLE sister company.

The bottom line: Enterprises demanded access speeds that disk-based storage just couldn’t deliver, creating an urgent need for a non-mechanical storage solution.

Driving change for a new millennium

By the start of the 21st century, flash memory, named for its ability to allow fast electrical erasure, had revolutionized the consumer electronics market.

“I wanted to make a chip that would one day replace all other memory technologies on the market,” Fujio Masuoka of Toshiba Technologies told Bloomberg Businessweek in 2006. “Going after [the memory storage] market was the obvious thing to do for me.”

Flash was mass-produced, readily available and cheap. Innovators saw the potential of NAND architecture for mass storage, and Masuoka’s goal to take over the memory market made the leap from consumer to enterprise grade.

All-flash arrays started to arrive in the 2010s, with a flurry of investment marking excitement over the development. But the competition was brutal, and the majority of players fell by the wayside. Two notable stories are those of Violin Systems LLC (originally Violin Memory) and Pure Storage Inc.

Early market leader Violin had a record-breaking initial public offering in 2012, but its roller-coaster history ended with its 2020 sale to StorCentric Inc. Pure Storage, however, is very much still in the game. Chief Executive Officer Charles Giancarlo told investors this year that Pure experienced “great strength and growth, setting new revenue and sales records” in 2020.

“Flash has killed what we call ‘performance by spindles.’ In other words, the practice of adding more disk drives to keep performance from tanking,” said David Vellante, co-CEO of SiliconANGLE Media and co-founder and chief analyst at The Wikibon Project, in a recent analysis of the storage market.

Flash hasn’t been the only driver of change, according to Vellante. “Cloud has been the most disruptive force in storage over the past 10 years,” he said in his analysis.

Amazon Web Services Inc. introduced its S3 storage service in 2006, kicking off the cloud adoption that has culminated in today’s COVID-era mandate to “transform or die.” Departing from the block and file paradigms of traditional storage, S3 uses object storage, a method better suited to managing the massive amounts of unstructured data the cloud was collecting.
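Object storage flattens the hierarchy: rather than paths in a filesystem tree, data lives as whole objects under keys in a bucket, with metadata attached. A minimal in-memory sketch of those semantics (purely illustrative; the `Bucket` class is hypothetical, not a real S3 client):

```python
# Illustrative sketch of object-storage semantics: a flat key namespace,
# whole-object puts and gets, and per-object metadata. Not a real S3 client.

class Bucket:
    def __init__(self, name):
        self.name = name
        self._objects = {}  # key -> (bytes, metadata), one flat namespace

    def put_object(self, key, data, metadata=None):
        # An object is always written as a whole; there is no partial update.
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get_object(self, key):
        data, metadata = self._objects[key]
        return data, metadata

    def list_objects(self, prefix=""):
        # "Directories" are just shared key prefixes, not a real hierarchy.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = Bucket("demo")
bucket.put_object("logs/2021/06/21.json", b'{"event": "update"}',
                  metadata={"content-type": "application/json"})
bucket.put_object("logs/2021/06/22.json", b"{}")
print(bucket.list_objects(prefix="logs/2021/"))
```

Because every object is addressed by a key and written whole, an object store can scale horizontally without the locking and directory bookkeeping a filesystem requires, which is what made the model a fit for unstructured data at cloud scale.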

As the decade advanced, so did cloud adoption. As multicloud and hybrid approaches became more common, information technology teams had to choose where to store data. Often the answer depended on the workload. This led to technical silos, where database application data was stored on-premises in blocks, files stayed in file stores, and data associated with newer web applications went to the cloud and object storage.

Cloud culture hits storage

As with everything it touches, cloud brought the need for agility to the storage world. Faced with the fear of obsolescence, executives who had previously left the IT department to its own devices came looking for data insights to streamline operations and fulfill customer needs. In turn, chief data officers sought storage solutions to address the demand for faster analysis of growing data in an increasingly dispersed environment.

“We want to give customers the choice, whether they want to run on-premises or in the cloud,” Rob Lee, vice president and chief architect at Pure Storage, told theCUBE as the company expanded its hybrid reach in a 2019 partnership with AWS. “We don’t want to put customers in a position where they feel they have to make that choice and feel trapped in one location or another because of lack of features, lack of capabilities, or economics.”

For Pure, there’s been a growing shift in modern analytics platforms away from traditional architectures to hybrid solutions designed to scale the compute and storage pieces independently.

2020s bring file and object unity

Pure took the visionary crown in 2020’s Gartner Magic Quadrant for distributed file systems and object storage, most recently delving into unified fast file and object storage, or UFFO.

UFFO is designed to meet five drivers of storage innovation:

  1. The rise in fast-object popularity
  2. Increasing reuse of data across applications
  3. Growth of machine-generated data
  4. Risk of disruption from ransomware attacks
  5. Mission-critical demand for reliable and consistent performance

Traditionally, distributed file systems and object storage haven’t had much in common. Distributed file systems use hierarchical directories, while objects sit in flat buckets; file systems expose portable operating system interface, or POSIX, file operations, while object storage is accessed through REST APIs; a file system allows random writes anywhere in a file, while an object store atomically replaces whole objects … and so on. But, critically, both are suited to storing large amounts of unstructured data.
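The write-semantics gap is the easiest difference to see side by side: POSIX lets a program seek to an arbitrary offset and overwrite bytes in place, while an object store only swaps a complete new object in under the same key. A rough sketch of the contrast, using in-memory stand-ins rather than real storage APIs:

```python
import io

# POSIX-style file: seek to an arbitrary offset and overwrite in place.
f = io.BytesIO(b"hello world")   # stands in for an open file
f.seek(6)
f.write(b"there")                # random write inside the existing file
posix_result = f.getvalue()

# Object-style store: no in-place edits; a "write" replaces the whole object.
store = {"greeting": b"hello world"}
store["greeting"] = b"hello there"   # atomic full-object replacement
object_result = store["greeting"]

print(posix_result, object_result)  # both now read b"hello there"
```

A unified fast file and object platform has to honor both contracts over the same underlying data, which is why the two worlds were historically served by separate products.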

As stated earlier, breaking down silos and simplifying access to data assets regardless of location is key to gaining the insights so critical to business advantage.

“A modern data experience, whereby data is easily accessible, commutable and delivered where it is needed instantly does not respect the technical silos that exist between different environments,” stated Wes van den Berg, vice president and general manager of Pure Storage United Kingdom and Ireland, in a November 2020 post on the impact of data applications on storage technology. Pure bills its FlashBlade as the world’s first UFFO platform, modernizing storage for the hybrid era.

Of course, Pure isn’t the only innovator in the market, according to Gartner’s Magic Quadrant for distributed file systems and object storage. NetApp Inc.’s Keystone is designed for hybrid cloud and offers a flex subscription option, while Red Hat’s Ceph makes artificial intelligence and machine learning workload needs top priority. Long-haul players Dell Technologies Inc. and IBM head up the leaders’ quadrant, while Pure trails newly minted unicorn Qumulo Inc. on ability to execute; Qumulo and Scality Inc. round out the four leaders.

Qumulo’s “secret sauce” is its ability to deliver radical simplicity, President and Chief Executive Officer Bill Richter told theCUBE, describing the company as “laser-focused on building solutions that simplify the increasingly complex task of managing massive amounts of file data.” Before building its file data platform Qumulo Core, the company conducted thousands of interviews with enterprise data storage users to learn their pain points in order to remove them.

“We’re dissatisfied with the status quo of spending hours on infrastructure challenges that should take seconds,” Richter said. “If you have a time-sensitive research project like the COVID vaccine and you run out of storage space, it should be simple to add a node and scale up.”

From Dell’s PowerScale to Caringo Inc.’s Swarm, the most popular storage solutions focus on managing unstructured data and catering to hybrid environments. It’s not surprising: Gartner reports unstructured data growth of 30% to 60% year over year and states that 40% of infrastructure and operations leaders plan to implement at least one hybrid cloud storage architecture by 2024.

“You shouldn’t use a teaspoon to measure a gallon,” said Richter, describing the impossibility of managing today’s vast amounts of unstructured data with legacy hardware systems. With 70% of Qumulo’s customers already over the petabyte mark, Richter predicts file data accumulation rates will soon become an avalanche.

“We’ll have to be in the hybrid cloud to be ready for it,” he said.

Vellante asserts that the world is hybrid. “The days of selling storage controllers that mask the deficiencies of spinning disk, or add embedded hardware functions, or easily picking off a legacy installed base with flash … well, those days are gone.”

Image: Vector Fusion Art
