Multi-Petabyte Databases Demand Hyperscale Distributed Technologies

Multi-petabyte and exabyte-sized databases, until now largely the province of large government agencies and cloud/social media service providers like Facebook and Shutterfly, are quickly coming to large and even medium-sized companies, Russ Kennedy, VP of Product Strategy at Cleversafe, told the Wikibon Peer Incite community on January 22. Growing numbers of companies are seeing data volumes expand rapidly as part of a major shift toward a more networked, social media style of operation. Databases that once held mainly written documents, such as e-mail repositories, now contain an increasing percentage of very large audio, video, and still image files, even as the total volume of files increases.

These large, fast-growing databases, whose contents are often written once and then accessed only occasionally, demand a new technical architecture, he says. Above about 2 petabytes, traditional databases and backup technologies start experiencing performance problems, and by the time they reach 10 petabytes they become impractical in both performance and financial terms. Exabyte databases are now becoming, if not common, at least not exotic, and many companies either already have multi-petabyte databases or expect to reach that size over the next two to five years. This implies a major shift in database technology.

In business, Kennedy says, it is increasingly common for teams spread around the world to collaborate on product and service development and other projects that involve large visual and other files and create large amounts of interaction and digital information. These environments create large numbers of large files in various formats, and members want to share these worldwide and preserve them for future reference. That global collaborative approach is driving private cloud adoption, as organizations want to add more services to support high-productivity approaches while controlling costs. And they want to do this internally rather than moving it to a public cloud service provider like AWS, in part to control privacy and security.

Software-Led, Commodity Hardware Architecture

This demands a new approach to infrastructure supporting a new architecture, Kennedy says. Rather than building on traditional, expensive, custom hardware with services built in, it demands the Facebook/Google/Yahoo approach of building on commodity hardware that comes with no maintenance and is simply replaced when it breaks, and putting all the services in software. Traditional backup becomes impossible for databases this size, and traditional replication becomes impractically expensive. Instead, companies need to adopt the erasure coding approach of breaking up these large objects and distributing the pieces to different parts of the architecture, often in separate locations, and reconstructing the file when needed, based on having access to a certain number of pieces. When a storage system breaks, the data can still be reconstructed with the remaining pieces.
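The idea of splitting an object into pieces and rebuilding it after a loss can be illustrated with the simplest possible erasure code: k data pieces plus one XOR parity piece. This is only a sketch under stated assumptions. The function names `encode` and `decode` are hypothetical, and the article does not describe the specific code Cleversafe uses; production systems generally rely on more capable codes that tolerate the loss of several pieces at once.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int):
    """Split data into k equal-size pieces and append one XOR parity piece.

    Returns (pieces, original_length). Each piece could be stored on a
    different node or site. Illustrative single-parity scheme only.
    """
    piece_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(piece_len * k, b"\0")  # pad so pieces are equal size
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(k)]
    parity = reduce(xor_bytes, pieces)
    return pieces + [parity], len(data)


def decode(pieces, original_len: int) -> bytes:
    """Rebuild the object from k+1 pieces, at most one of which is None."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("single-parity code can recover only one lost piece")
    pieces = list(pieces)
    if missing:
        # XOR of all surviving pieces equals the lost piece.
        survivors = [p for p in pieces if p is not None]
        pieces[missing[0]] = reduce(xor_bytes, survivors)
    # Drop the parity piece and the padding added during encoding.
    return b"".join(pieces[:-1])[:original_len]
```

With k data pieces and one parity piece, the raw storage consumed is (k+1)/k times the object size, for example 1.25x when k is 4, whereas keeping three full replicas (an illustrative comparison, not a figure from the talk) consumes 3x. That gap is the cost argument against traditional replication at this scale.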

Metadata becomes vital in this model for finding the right pieces quickly and reconstructing the full object correctly. In the Cleversafe model, both objects and metadata are stored in the same architecture and distributed across physical systems and locations. However, some Cleversafe clients store their metadata separately, notably Shutterfly, which stores exabytes of personal photos for customers permanently and needs relatively fast recovery of individual photos on demand. Keeping the metadata separate allows faster reconstruction of individual objects. Another approach, which Cleversafe and presumably other providers of this new technology are experimenting with, is to place a large, flash-based cache in front of the database to hold objects that are in demand at a given time.
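The caching idea can be sketched as a read-through cache with least-recently-used eviction, placed in front of a slow object store. This is an assumption-laden illustration: the class name `ReadThroughCache`, the `fetch` callback, and the LRU policy are all hypothetical choices, and the article does not say how Cleversafe's experimental cache decides what to keep.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Small LRU cache in front of a slow object store (illustrative only)."""

    def __init__(self, fetch, capacity):
        self.fetch = fetch          # callable that reconstructs an object by key
        self.capacity = capacity    # maximum number of objects held in the cache
        self._items = OrderedDict() # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            # Cache hit: mark as most recently used and skip reconstruction.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]
        # Cache miss: pay the reconstruction cost once, then remember the result.
        self.misses += 1
        value = self.fetch(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used
        return value
```

In use, `fetch` would be whatever routine gathers the pieces and rebuilds the object, so only the first request for a popular object pays the full reconstruction latency.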

This technology is not adequate for all use cases, and high-performance Tier 1 applications are a poor fit, Kennedy warns. Reconstructing data from multiple pieces scattered over a large infrastructure comes with a time tax. Recovery times are typically in the 100-to-200 millisecond range rather than the microsecond response times demanded by many transactional systems. And that latency depends on several factors, such as the speed of the network, the processing speed available for putting the object together, and the geographic distances involved.

As a result, he says, the traditional hardware vendors are not going to see their present markets erode significantly. However, the main growth in the future will be in these software-led, commodity-hardware architectures. And, he says, the vendors are obviously aware of this and are either creating partnerships or acquiring companies to help them move into the commodity hardware market, while working to abstract the management software and services to a higher layer. The vendors can certainly create a viable market by selling bare-bones hardware that carries no maintenance contract and will be replaced every three years, while selling their management and other software either bundled with the hardware or separately. But doing so will require that they migrate to a new financial model, which will involve some pain.

Cleversafe itself sells both complete systems, built on third-party commodity hardware, and its storage management software, including erasure coding, separately to run on the customer’s hardware. As long as the hardware meets basic technical specifications, Kennedy said, the Cleversafe software can turn virtually any hardware infrastructure into a distributed, very large database architecture.

Peer Incite meetings are free to members of the Wikibon community. To join, simply register at Wikibon. Membership is free and provides access to all Wikibon research and the ability to comment on and contribute new Alerts and longer reports to the community.