UPDATED 14:42 EDT / JANUARY 23 2018

BIG DATA

Apache Arrow-based self-service analytics startup Dremio hauls in $25M

Self-service data analytics startup Dremio Corp. today said it has closed a $25 million funding round to build and market the Apache Arrow-based platform that the company says can democratize access to big-data repositories.

The Series B round brings Dremio’s total funding to $40 million. It was led by new investor Norwest Venture Partners along with existing investors Lightspeed Venture Partners and Redpoint Ventures.

Founded in 2015 by a group of big-data veterans that includes a co-developer of Arrow, Dremio launched last summer, promising that its product can work with any business intelligence or data science tool to eliminate the need for messy big-data processes such as extract/transform/load procedures, data warehouses, aggregation tables and extracts.

“Most companies don’t have the requisite skills in-house to use Hadoop effectively,” said Kelly Stirman, chief marketing officer. “It’s easy to get lots of data into Hadoop, but it’s hard to open it up to people who aren’t proficient with a language like Java and understand the 30-plus projects that comprise the Hadoop ecosystem.”

Dremio is based upon Apache Arrow, a columnar in-memory data format, and the company says its distributed query engine uses columnar in-memory analytics to boost speeds up to 100-fold. Its technology is similar to the one Google LLC uses to deliver sub-second responses to search queries, but Dremio is optimized for analytical operations, said Tomer Shiran (pictured), co-founder and chief executive.
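To illustrate the general idea of a columnar in-memory layout, here is a minimal Python sketch using only the standard library. It is not Arrow's or Dremio's actual implementation; the sample records, the field names and the `to_columns` helper are all hypothetical and exist only for illustration.

```python
from array import array

# Hypothetical sample records; the field names are illustrative only.
rows = [
    {"region": "east", "units": 3, "price": 2.5},
    {"region": "west", "units": 5, "price": 4.0},
    {"region": "east", "units": 2, "price": 1.0},
]


def to_columns(records):
    """Pivot row-oriented records into one sequence per field."""
    return {
        "region": [r["region"] for r in records],
        # Typed arrays store numeric values contiguously in memory.
        "units": array("q", (r["units"] for r in records)),
        "price": array("d", (r["price"] for r in records)),
    }


columns = to_columns(rows)

# An aggregate over one field scans a single array instead of
# visiting every field of every record.
total_units = sum(columns["units"])
print(total_units)
```

Analytical queries often read a few fields across many records, so storing each field together means a scan touches only the data it needs.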

“When a query is submitted to us, we compile it and leverage optimized data structures that are a combination of indexing and partitioning,” he said. “This happens regardless of where your data lives. Traditionally, you have to make 12 different copies with IT [information technology] in the middle.”
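As a rough illustration of why partitioning can speed up a query, the following Python sketch groups records by a key ahead of time so that a filtered aggregate reads only the matching group. This is an assumption-laden toy, not a description of Dremio's internals; the records and the `build_partitions` and `sum_where` helpers are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"region": "east", "units": 3},
    {"region": "west", "units": 5},
    {"region": "east", "units": 2},
    {"region": "north", "units": 7},
]


def build_partitions(rows, key):
    """Group rows by the value of `key`, once, ahead of query time."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return dict(parts)


def sum_where(partitions, key_value, field):
    """Sum `field` over one partition, skipping all other partitions."""
    return sum(row[field] for row in partitions.get(key_value, []))


partitions = build_partitions(records, "region")
east_units = sum_where(partitions, "east", "units")
print(east_units)
```

The cost of grouping is paid once up front, and each later query on that key avoids scanning records it would only discard.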

Dremio said it gives users the flexibility to use their preferred business intelligence front-ends, such as Tableau, Looker, Qlik, PowerBI and even Excel on data in a data lake, a storage repository for raw data.

The company also provides an open-source community edition of its product as well as a licensed enterprise version. Designed for use both on-premises and in the cloud, the software can tap into elastic compute resources as well as object storage such as Amazon Web Services Inc.’s S3 to populate its home-grown indexing, sorting, partitioning and aggregation optimizer called “Data Reflection.”

“We sit between the tools you want to run on the desktop and where the data lives today, whether in one data lake or across several systems,” Shiran said. “We do everything in between those two layers.”

Image: Dremio
