Dave Vellante sat down with Alistair Veitch, the Director of Storage and Information Management Platforms at HP Labs, at HP Discover 2012, held this week in Frankfurt, Germany (full video below).
Over the last decade, the way organizations operate has changed: because of the risks of research and development (R&D), many now take over smaller groups instead. HP Labs has been criticized for not bringing enough innovation to the commercialization of the acquisitions HP has made over the last several years. Two projects relevant to that criticism are Express Query and StoreOnce, both overseen by Alistair.
StoreOnce was first introduced publicly three years ago, but work began three years before that announcement, with the team talking to partners in the storage business to figure out where storage stood and where it needed to go to be most effective. StoreOnce is the result of that work. It is a means of storing large amounts of data while eliminating duplicate parts, and the team looked at how to apply that, with the best algorithms, to virtual tape library systems. They wanted to de-duplicate the data and, when putting it back together, to do so as quickly as possible. Retrieving the information quickly depends on a combination of architecture and algorithms.
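The de-duplicate-then-reassemble idea described above can be sketched in a few lines of Python. This is a generic illustration only, not StoreOnce's actual algorithm, which the interview does not detail. The fixed chunk size, the use of SHA-256 as the fingerprint, and the `DedupStore` class name are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow (assumption)


class DedupStore:
    """Hypothetical chunk store: each unique chunk is kept only once."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def put(self, data):
        """Split data into fixed-size chunks and return the list of fingerprints."""
        recipe = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if this fingerprint has not been seen before.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        return recipe

    def get(self, recipe):
        """Reassemble the original data from its list of fingerprints."""
        return b"".join(self.chunks[fingerprint] for fingerprint in recipe)


store = DedupStore()
original = b"ABCDABCDABCDWXYZ"
recipe = store.put(original)
restored = store.get(recipe)
```

Here the 16-byte input is described by four fingerprints but only two unique chunks are stored, and reading it back is a sequence of dictionary lookups. Real systems face the harder problem the article alludes to: choosing chunk boundaries and data layout so that both the savings and the restore speed hold up at scale.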
What HP is touting, Vellante says, is “the ability to take that technology and put it into a lot of different places”: “you can use it for backup on hardware devices… but also you can put it into software, in theory, you can put it into primary storage.”
When users back up data, they repeatedly back up the same information, so de-duplication offers a higher value proposition for backup than for primary storage, where the data doesn’t reduce as much because repeated copies haven’t yet built up in storage. Backed-up data can be compressed, broken down, and de-duplicated at a much higher rate than primary storage. De-duplication can still be applied to primary storage, though, so the lower rate shouldn’t deter a potential user. Used well, StoreOnce can be extremely efficient, which is what organizations want.
Express Query has just been announced at HP Discover 2012. It is a capability in the StoreAll product line that addresses metadata management in storage systems. Metadata is information about the data: knowing exactly what files you have, who touched them, when they were touched, and so forth. Express Query stores all of that and makes it possible to query it in a database. HP quickly realized that conventional approaches were not suited to data at that scale, and that the constant updates and changes left existing systems falling short.
HP then developed its own technology from the ground up and integrated it into the product group. It now gives users the ability to take that data, hold onto all of it efficiently, and query it extremely fast. For instance, you might be looking for all the files that changed within a certain time, or you might be a systems administrator who wants to find all the files bigger than a certain size, or all the files written by a certain user. Getting a response to such a query used to take days. The goal of Express Query is to answer those queries 100,000 times faster, returning results in mere seconds.
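The three example questions above (files changed in a time window, files over a certain size, files written by a certain user) map naturally onto queries against an indexed metadata table. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in. It is not Express Query's interface, and the `file_meta` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for a metadata store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_meta ("
    "path TEXT PRIMARY KEY, owner TEXT, size_bytes INTEGER, mtime INTEGER)"
)
conn.executemany(
    "INSERT INTO file_meta VALUES (?, ?, ?, ?)",
    [
        ("/data/a.log", "alice", 1200, 100),
        ("/data/b.iso", "bob", 9000000, 250),
        ("/data/c.txt", "alice", 300, 400),
    ],
)
# Indexes let each query avoid walking every row, let alone every file.
conn.execute("CREATE INDEX idx_size ON file_meta(size_bytes)")
conn.execute("CREATE INDEX idx_owner ON file_meta(owner)")
conn.execute("CREATE INDEX idx_mtime ON file_meta(mtime)")


def paths(sql, params):
    """Run a query and return the matching paths in sorted order."""
    return sorted(row[0] for row in conn.execute(sql, params))


changed = paths("SELECT path FROM file_meta WHERE mtime BETWEEN ? AND ?", (200, 500))
large = paths("SELECT path FROM file_meta WHERE size_bytes > ?", (1000,))
by_alice = paths("SELECT path FROM file_meta WHERE owner = ?", ("alice",))
```

The point of the illustration is the contrast with the old approach: answering from a metadata index instead of scanning the file system itself is what turns a days-long job into a lookup.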
To enable this, HP developed its own database system and changed how such a database works. Alistair explains that when a query is made to a conventional database, it is processed right then and there. The new database was designed for rapidly arriving information. Express Query moves that processing into the background, where it does not affect the foreground workload or any other operations occurring in the file system. Everything is updated in the background and checked for consistency, and only then is it made available for queries.
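The background-update pattern described above can be illustrated with a queue and a worker thread. This is a minimal sketch of the general technique, not HP's implementation. The `BackgroundIndex` class and its method names are hypothetical, and a real system would persist the index and handle failures.

```python
import queue
import threading


class BackgroundIndex:
    """Hypothetical metadata index that is updated off the foreground path."""

    def __init__(self):
        self._events = queue.Queue()
        self._index = {}  # path -> size in bytes
        self._lock = threading.Lock()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def record_write(self, path, size_bytes):
        """Foreground path: enqueue the change and return immediately."""
        self._events.put((path, size_bytes))

    def _run(self):
        """Background worker: apply queued changes to the index in order."""
        while True:
            path, size_bytes = self._events.get()
            with self._lock:
                self._index[path] = size_bytes
            self._events.task_done()

    def wait_until_consistent(self):
        """Block until every queued change has been applied."""
        self._events.join()

    def query_larger_than(self, min_bytes):
        """Return the paths whose latest recorded size exceeds min_bytes."""
        with self._lock:
            return sorted(p for p, s in self._index.items() if s > min_bytes)


index = BackgroundIndex()
index.record_write("/a", 10)
index.record_write("/b", 5000)
index.record_write("/a", 9000)  # a later write to the same file
index.wait_until_consistent()
result = index.query_larger_than(1000)
```

The design choice is that a file write costs only a queue insertion, while the index catches up separately. Queries run after `wait_until_consistent()` see every change recorded before that call.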
Vellante asks Alistair about applying Express Query to Hadoop, an environment built around big batch jobs where bringing in real-time capability is difficult. Alistair admits that the connection is tenuous at best. StoreOnce could be carried over to Hadoop with some success, but Express Query would be much more difficult to fit into that system.
Vellante pushes a little further, saying, “Once I find my nuggets, I have all this data, I wanna search on this data, so why wouldn’t it do it,” to which Alistair replies that you could. One of the good things about Express Query is that you can ask it to bring you information of a certain sort. In that sense Hadoop could be used, with its output stored directly in Express Query and then searched.
Critics might say that HP Labs hasn’t been doing enough in the realm of innovation, but StoreOnce and Express Query make it clear that innovation has been an area of diligence for HP. Led by Alistair, HP Labs has introduced new products that should excite users and quiet the critics.