Can the enterprise afford to be in the ‘ignore state’ with its data lake?


The data lake is evolving, and one company is nudging the enterprise into extracting value from data that’s stored in this increasingly complex paradigm. 

“Time to value is critical. So how do you reduce the time to insight — from the time the data is produced to the time the data is available to the data consumers and for downstream use cases?” asked Ben Sharma (pictured, right), founder and chief executive officer of Zaloni Inc.

Sharma and Tony Fisher (pictured, left), senior vice president of business development and strategy at Zaloni, answered these questions when they joined John Furrier (@furrier) and George Gilbert (@ggilbert41), co-hosts of theCUBE, SiliconANGLE Media’s mobile live streaming studio, during the BigData SV event in San Jose, California. (*Disclosure below.)

The company finds missed opportunities all around it as it visits prospects and customers. “The vast majority [of people he meets] … are in the ignore state. About 50 percent plus of the organizations are in the research stage,” Fisher said.

Approximately 25 percent of these companies are in the data store phase, and Zaloni is trying to move them to the managed data lake environment, he added.

What is Data Lake in a Box?

The company hopes to speed up the process with its latest solution, Data Lake in a Box. The offering brings the full stack together for customers by supplying all the software, integration and data ingestion.

The product has unique data management and data governance capabilities in place to help companies leapfrog into a managed data lake environment in eight weeks, according to Zaloni. The solution offers managed ingestion from various sources through the integrated platform with a common metadata layer, allowing for data validation, data quality and lifecycle management. Companies can now derive the benefits of analytics and insight into their data much more quickly, Sharma explained.
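To make the idea of a common metadata layer driving validation concrete, here is a minimal sketch. Zaloni has not published this interface; every name here (`FieldMeta`, `validate_record`, the sample schema) is hypothetical and only illustrates how metadata about each field can enforce data quality rules at ingestion time.

```python
# Illustrative only -- not Zaloni's actual API. A metadata entry per field
# drives type checks and optional quality rules on each incoming record.
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class FieldMeta:
    name: str
    dtype: type
    required: bool = True
    check: Optional[Callable[[Any], bool]] = None  # optional quality rule


def validate_record(record: dict, schema: list) -> list:
    """Return a list of data-quality errors for one incoming record."""
    errors = []
    for field in schema:
        value = record.get(field.name)
        if value is None:
            if field.required:
                errors.append(f"missing required field: {field.name}")
            continue
        if not isinstance(value, field.dtype):
            errors.append(f"{field.name}: expected {field.dtype.__name__}")
        elif field.check and not field.check(value):
            errors.append(f"{field.name}: failed quality rule")
    return errors


# Hypothetical schema for a customer feed.
schema = [
    FieldMeta("customer_id", int),
    FieldMeta("email", str, check=lambda v: "@" in v),
]

print(validate_record({"customer_id": 42, "email": "a@b.com"}, schema))  # []
print(validate_record({"email": "not-an-email"}, schema))
```

Because the rules live in metadata rather than in pipeline code, the same validation logic can be reused across every source feeding the lake, which is the point Sharma makes about the metadata layer below.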

“The metadata layer is key to being able to generate these insights about the data itself. You can use that [layer] effectively for data science or downstream applications and use cases. This is critical in our experience of taking a POC [proof of concept] pilot into the production phase,” explained Sharma.

Data Lake in a Box uses a rich catalog that organizations can use as a data marketplace or portal, which has a permissions-based shopping cart. Fisher described it as an on-ramp that lets end users make sure they have the right data and puts data scientists in a good position to create applications.

The catalog allows enterprise customers to create a data marketplace or portal within the organization that covers not just the data lake but other data stores as well, providing a single unified view of the data sets so data scientists can see whether the data will work for the applications they are developing, according to Sharma and Fisher.
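A rough sketch of the permissions-based “shopping cart” idea follows. The class and role names are invented for illustration and do not reflect Zaloni’s implementation; the point is simply that everyone can browse what exists in the catalog, but checking a data set out is gated by entitlements.

```python
# Hypothetical sketch of a permissions-aware data catalog. Users can browse
# all registered data sets, but checkout is restricted by role entitlements.
class Catalog:
    def __init__(self):
        self._datasets = {}  # data set name -> set of roles allowed to access

    def register(self, name, allowed_roles):
        self._datasets[name] = set(allowed_roles)

    def browse(self):
        # Anyone may see which data sets exist (the unified view).
        return sorted(self._datasets)

    def checkout(self, name, role):
        # The "shopping cart" gate: grant access only to entitled roles.
        if role in self._datasets.get(name, set()):
            return f"granted: {name}"
        raise PermissionError(f"{role} may not access {name}")


catalog = Catalog()
catalog.register("clickstream_raw", {"data_engineer"})
catalog.register("sales_curated", {"data_engineer", "analyst"})

print(catalog.browse())
print(catalog.checkout("sales_curated", "analyst"))
```

Separating visibility (browse) from access (checkout) is what lets a catalog double as an internal marketplace while still enforcing governance.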

In a multi-cloud world, companies need to move from a single architecture to a distributed architecture, and Data Lake in a Box integrates with legacy databases and hybrid architectures while offering the same controls, state and governance as companies build out the environment, they said.

The company is also researching machine learning and artificial intelligence as part of the managed layer of the data lake.

“Moving forward with machine learning and some of the advanced algorithms, some of the research we are doing now is using machine learning to manage the data lake, which is a new concept. So when we get to the optimized phase of our maturity model, a lot of that has to do with self-correcting and self-automating,” said Fisher.

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of BigData SV 2017. (*Disclosure: Some segments on SiliconANGLE Media’s theCUBE are sponsored. Sponsors have no editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE