After a brief hiatus, Hadoop returned to the center of attention last week when private equity giant IVP and three other investors injected $30 million into the coffers of Qubole Inc. to aid its mission of simplifying the data crunching framework. The startup’s namesake service provides a graphical interface that enables business users to manipulate information in their cloud-based analytics clusters without writing any code.
Analysis may be carried out using one of several processing engines, depending on the specific goal at hand. Small ad hoc requests that have to be fulfilled quickly, for example, can be executed using HBase to keep latency at a minimum, while Spark and MapReduce are available to handle more complicated work. The Qubole Data Service automatically provisions the necessary resources from the cloud platform on which an organization’s Hadoop implementation is deployed and releases the infrastructure immediately after an operation is complete.
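The provision-run-release cycle described above can be pictured with a short Python sketch. It is purely illustrative and not Qubole's actual API: `pick_engine`, `provisioned_cluster`, and `run_job` are hypothetical names, the engine choice simply mirrors the examples in this article, and the "cluster" is a stand-in dictionary rather than real cloud infrastructure.

```python
from contextlib import contextmanager


def pick_engine(ad_hoc: bool) -> str:
    """Hypothetical rule: quick ad hoc requests go to HBase, heavier work to Spark."""
    return "HBase" if ad_hoc else "Spark"


@contextmanager
def provisioned_cluster(nodes: int, log: list):
    """Stand-in for cloud provisioning: acquire on entry, release on exit."""
    log.append(f"provision {nodes} nodes")
    try:
        yield {"nodes": nodes}
    finally:
        # Resources are released as soon as the operation ends, even on failure.
        log.append(f"release {nodes} nodes")


def run_job(query: str, ad_hoc: bool, nodes: int, log: list) -> str:
    """Pick an engine, then run the query inside a short-lived cluster."""
    engine = pick_engine(ad_hoc)
    with provisioned_cluster(nodes, log):
        log.append(f"run on {engine}: {query}")
    return engine
```

Wrapping the job in a context manager is one way to guarantee that the release step always follows the operation, which is the behavior the service is said to automate.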
It’s the same functionality Cloudera Inc. offers in the new iteration of its rival automation software that debuted last week. The release ups the ante against Qubole with a high-availability mode that enables cloud deployments to quickly recover from node failures, along with row-level access controls that make it possible to regulate who can manipulate what information. The additions aim to make the company’s Hadoop distribution more viable for sensitive workloads such as financial records and healthcare information, but the framework has a long way to go before it is ready to become a one-stop data lake.
For the time being, organizations are handling much of their information using simpler tools like Popily Inc.’s namesake visualization engine, which hit general availability against the backdrop of Cloudera’s update. Built-in machine learning algorithms scan user data to provide suggestions on how to best organize it and thereby eliminate most of the time-consuming tinkering involved in creating charts, according to the startup.