When Michael Stonebraker founded Vertica in 2005, you might have expected him to base his new analytics database on Postgres, the open source relational database he invented over 20 years earlier.
But that’s not what happened.
“He invented it and chose not to use it,” said Colin Mahoney, Vertica’s Vice President of Products and Business Development.
Instead, Stonebraker built a new column-oriented, massively parallel analytic database from the ground up. That’s in stark contrast to Vertica competitors like Greenplum, which modified the existing row-oriented Postgres architecture to create their data warehouse offerings, Mahoney said.
Vertica’s native approach “allowed us to do a lot of things that nobody will ever be able to do when they add on to an existing database,” Mahoney said. “For example, we can operate on encoded and compressed data. We don’t have to decompress our data to operate on that, something that Oracle can never do, something that all of these Postgres pretenders, as we like to call them, have a really hard time doing.”
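To make the idea of operating on encoded data concrete, here is a minimal sketch using run-length encoding, a common column-store compression scheme. It is an illustration only, not Vertica's implementation: the function names and the sample column are hypothetical, and the article does not say which encodings Vertica uses.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_sum(runs):
    """Sum the column directly on the runs, without expanding them."""
    return sum(value * length for value, length in runs)


def rle_count_equal(runs, target):
    """Count rows equal to target by adding up matching run lengths."""
    return sum(length for value, length in runs if value == target)


# Hypothetical sorted column of order quantities (illustrative data only).
quantities = [5, 5, 5, 5, 7, 7, 9, 9, 9]

runs = rle_encode(quantities)
total = rle_sum(runs)
sevens = rle_count_equal(runs, 7)
```

Both aggregates touch one entry per run rather than one per row, which is why a sorted, compressed column can be cheaper to scan than its raw form.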
HP apparently also saw the value in Vertica’s approach to Big Data Analytics and acquired the company earlier this year. Vertica now sits inside HP’s Office of Technology and Strategy, but is allowed to operate in much the same way as it always has, Mahoney said.
Speaking live inside theCube with Wikibon Chief Analyst Dave Vellante and SiliconANGLE Founder John Furrier at HP Discover in Las Vegas, Mahoney also gave his take on Hadoop and why Vertica has no plans to offer its own commercial distribution of the open source Big Data framework.
“We see a lot of value in HDFS [Hadoop Distributed File System] as a file system to store all different types of data. You don’t have to define a schema in advance, it can be unstructured, it can be structured,” Mahoney said. “I think there is virtue to a simple, distributed system that can store a lot of information with the flexibility of programming.”
But Hadoop is batch-oriented and not capable of the powerful, real-time analytics that Vertica specializes in. There is also a dearth of data scientists with the skills to deploy and manage Hadoop distributions, according to Mahoney.
So rather than come out with its own Hadoop distribution as EMC Greenplum did earlier this year, Vertica developed an open source connector to HDFS. “And then what we do with our connector is very easily allow you to move that metadata into Vertica to do the real-time analytics,” Mahoney explained.
“Frankly, I think what Greenplum is doing and what Aster is also doing is very much confusing the market” by contributing to the many Hadoop forks being created, Mahoney added. Competing approaches to Hadoop, he said, slow down development of the framework, as enterprises don’t know which fork to take.
“Somebody comes out with a proprietary platform, you write your code for one, it doesn’t run on the other. That’s not what customers want,” Mahoney said. “We will not create another fork.”