UPDATED 16:43 EDT / FEBRUARY 02 2022

BIG DATA

Managed data lakehouse startup Onehouse launches with $8M in funding

Onehouse, a new data lakehouse startup, today launched from stealth mode after raising an $8 million seed funding round co-led by Greylock and Addition.

A lakehouse is a new type of software solution that helps enterprises more efficiently extract insights from their data. It combines the features of data warehouses and data lakes in a single platform.

Menlo Park, California-based Onehouse provides a managed, cloud-based lakehouse service designed for ease of use. Setting up a lakehouse environment usually requires months of work and specialized technical know-how. Onehouse says that its service reduces the setup process from months to just a few minutes. 

Onehouse’s service is based on the open-source Apache Hudi platform, which was created by founder and Chief Executive Officer Vinoth Chandar while working at Uber Technologies Inc. as a data architect. Uber uses the platform in production to process about 500 billion records every day. Other users include Amazon.com Inc., Walmart Inc. and General Electric Co.’s GE Aviation division.

“The data lake house is the future of data lakes, providing customers the ease of use of a data warehouse with the cost and scale advantages of a data lake,” said Greylock Partner Jerry Chen. “Apache Hudi is already the de facto starting point for modern data lakes and today Onehouse makes data lakes easily accessible and usable by all customers.”

One of the flagship features of Onehouse’s lakehouse service is a technology called incremental processing. It allows companies to start analyzing their data soon after it’s generated, which is difficult when using traditional technologies.

Before a company can start analyzing its business records for insights, it has to move the records to a data processing environment. This task is usually performed with ETL, or extract, transform and load, software. 

Moving records to a data processing environment using ETL software often takes hours, which means that by the time the information arrives, it's no longer fresh and therefore less useful. That's a particular challenge for real-time analytics use cases, which depend on the ability to process data soon after it's generated.

Onehouse’s incremental processing technology allows companies to ingest data every few minutes rather than every few hours as ETL tools do. The result: Enterprises can run analyses on the data while it’s still fresh.
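The general idea behind incremental processing can be illustrated with a minimal sketch. It is not Onehouse's or Apache Hudi's implementation: the `Record` type, the `updated_at` field and both helper functions are hypothetical names chosen for illustration. The sketch assumes each source record carries a change timestamp, so that every run pulls only the records changed since the last checkpoint instead of reloading the full dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    value: int
    updated_at: int  # hypothetical change timestamp, in seconds


def incremental_pull(source, checkpoint):
    """Return the records changed after `checkpoint` and the new checkpoint."""
    changed = [r for r in source if r.updated_at > checkpoint]
    new_checkpoint = max((r.updated_at for r in changed), default=checkpoint)
    return changed, new_checkpoint


def apply_upserts(table, changed):
    """Upsert changed records into the target table, oldest change first."""
    for r in sorted(changed, key=lambda r: r.updated_at):
        table[r.key] = r
    return table


source = [
    Record("order-1", 10, updated_at=100),
    Record("order-2", 20, updated_at=160),
]
table = {}
checkpoint = 0

# First run: everything is new, so both records are ingested.
batch1, checkpoint = incremental_pull(source, checkpoint)
apply_upserts(table, batch1)

# A few minutes later one order changes; only that record is pulled.
source.append(Record("order-1", 15, updated_at=220))
batch2, checkpoint = incremental_pull(source, checkpoint)
apply_upserts(table, batch2)

print(len(batch1), len(batch2), table["order-1"].value)
```

Because each run touches only the changed records, it can be scheduled every few minutes without reprocessing data that has already been ingested.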

The company’s lakehouse service automatically optimizes customers’ data ingestion workflows to improve performance, the startup says. Because the service is delivered via the cloud on a fully managed basis, customers don’t have to manage the underlying infrastructure. 

Onehouse also provides a raft of other features to ease day-to-day operations for users. One of those features is a capability dubbed small file compaction. It allows companies to consolidate multiple small files into a single, larger file in order to optimize query performance. Merging many small files into fewer large ones reduces the total number of files that an application has to open and scan while reading data, which speeds up processing.
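As a rough illustration of how small file compaction can work, the sketch below plans which small files to merge. It is an assumption-laden example rather than Onehouse's or Apache Hudi's actual algorithm: the function name `plan_compaction`, the 32 MB "small file" limit and the 128 MB target size are all hypothetical. Files under the limit are greedily grouped into batches that stay at or below the target size, and each batch would then be rewritten as one larger file.

```python
def plan_compaction(file_sizes_mb, small_limit_mb=32, target_mb=128):
    """Group small files into merge batches no larger than `target_mb`.

    Returns (groups, untouched): `groups` is a list of batches of small-file
    sizes to merge, `untouched` lists files already large enough to leave alone.
    """
    small = sorted((s for s in file_sizes_mb if s < small_limit_mb), reverse=True)
    untouched = [s for s in file_sizes_mb if s >= small_limit_mb]

    groups = []
    current, current_size = [], 0
    for size in small:
        # Close the current batch when adding this file would exceed the target.
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups, untouched


sizes = [4, 8, 120, 16, 2, 30, 30, 30, 30, 30]  # hypothetical file sizes in MB
groups, untouched = plan_compaction(sizes)

files_before = len(sizes)
files_after = len(groups) + len(untouched)  # one output file per merge batch
print(files_before, files_after)
```

A query engine reading the compacted layout opens `files_after` files instead of `files_before`, which is where the performance gain comes from.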

Chandar said in a statement that “while a warehouse can just be ‘used,’ a lakehouse still needs to be ‘built.’ Having worked with many organizations on that journey for four years in the Apache Hudi community, we believe Onehouse will enable easy adoption of data lakes and future-proof the data architecture for machine learning/data science down the line.”

Onehouse will use its $8 million seed funding round to expand research and development activities. 

Image: Onehouse
