UPDATED 12:00 EST / MAY 25 2021

BIG DATA

Open-source data pipeline startup Airbyte raises $26M round

Open-source data integration platform startup Airbyte got the backing of several prominent investors in a $26 million round of funding announced today.

Benchmark led the Series A round, which also saw the participation of 8VC, Accel, SV Angel and Y Combinator, as well as individual investors including MongoDB Inc. Chief Executive Dev Ittycheria, Elastic N.V. co-founder and CEO Shay Banon and LiveRamp Holdings Inc. co-founder Auren Hoffman.

Today’s round comes hot on the heels of a seed funding round of $5.2 million in March 2021 led by Accel, bringing Airbyte’s total amount raised so far to $31.2 million.

Airbyte is using open-source technology to try to solve the tricky problem of moving data from applications, application programming interfaces and databases to a data warehouse or a data lake, where it can be analyzed more easily.

It’s a data integration firm that wants to rival companies such as Fivetran Inc. and Talend SA, which provide prebuilt data connectors to enable the smooth transfer of information from a range of sources to a cloud data warehouse.

Like those rivals, Airbyte’s platform does away with the need to build a data connector for each individual data source as it automates the entire process. The company offers prebuilt data connectors that can automatically adjust to schema and API changes, helping ensure the continuity of data projects. Airbyte’s connectors run in Docker containers, which means they can be deployed in minutes on any cloud platform.

Airbyte CEO Michel Tricot told SiliconANGLE the key difference is that Airbyte’s platform is open-source, which means the data connectors it provides are built and maintained by the community, rather than by the company alone. That’s important because building and maintaining connectors requires a ton of resources, and while connectors exist for popular data sources such as Salesforce and Marketo and common destinations such as Snowflake and BigQuery, a lot of smaller services aren’t supported by traditional data integration platforms.

In that case, Tricot said, companies must resort to building their own custom connectors internally.

“Closed-source solutions will never be able to address the long tail of integrations, as they will always have a ROI consideration to building and maintaining a data connector in-house,” Tricot explained. “With open source, you no longer have the ROI consideration.”

Another problem with proprietary solutions is that the connectors on offer are usually designed for specific use cases and cannot easily be edited by end users, Tricot said. With open-source connectors, he argued, that is not a problem.

“Closed-source data integration solutions are great if you have exactly the same use cases that the solution was built to support,” Tricot said. “But once you stray away from these simple and limited use cases, you start hitting walls. And everybody strays away in our increasingly complex multicloud, multitool, fragmented world we live in.”

Airbyte’s goal is to foster a community of users that it says will eventually build and maintain thousands of open-source data connectors that can then be used by everyone. To that end, it has created a Connector Development Kit, or CDK, that any data engineer can use to build a connector from a specific source to a specific destination in as little as two hours.

Tricot said that this task usually takes two days or more, so the CDK is a big help. For now, the CDK is limited to REST API source connectors, but the company is planning to expand it to include all types of sources and destinations.

Airbyte reckons that by commoditizing data integration in this way, it will be able to establish a new standard for moving and consolidating data from different sources to data warehouses and data lakes. It’s moving fast in that direction already: The company said its user base has grown more than eightfold in the last four months, to more than 2,000.

The company’s open-source advantage will also enable it to disrupt the current pricing model around extract, transform and load services. Tricot said Airbyte is planning to release a hosted version of its platform, called Airbyte Cloud, later this year, and that it will have major repercussions for the industry. Although the company is keeping its future pricing strategy confidential for now, it said there are a number of drawbacks to the existing volume-based pricing models in use today.

Tricot said one of the main issues is that some companies’ databases are already huge, and that tables with millions of rows are fairly common. One danger is that it’s easy for an employee to replicate an entire multimillion-row database with a single click. With volume-based pricing, that single click could cost several thousand dollars, he explained.
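
To make the arithmetic behind that concern concrete, here is a minimal sketch. The row count and the per-row rate are hypothetical figures chosen only for illustration; they are not Airbyte’s or any rival’s actual prices, and the function name is likewise an assumption.

```python
def replication_cost(rows: int, price_per_million_rows: float) -> float:
    """Cost of replicating `rows` rows under a purely volume-based price."""
    return rows / 1_000_000 * price_per_million_rows


# Hypothetical figures, for illustration only (not any vendor's real pricing).
ROWS = 5_000_000             # a "multimillion-row" database
PRICE_PER_MILLION = 1_000.0  # assumed dollars charged per million rows

cost = replication_cost(ROWS, PRICE_PER_MILLION)
print(f"One full replication: ${cost:,.2f}")
```

Under these assumed numbers, a single full copy comes to $5,000, and the bill scales linearly with row count regardless of how much insight the data yields.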

A second problem with volume-based pricing is that it’s not really aligned with the value provided to customers, Tricot said. “The insights you get from your data have almost the same value whatever your data volume is, as long as the data is complete,” he said. “But it takes as much time for a data engineer to maintain a data pipeline, whatever the volume going through it.”

As a result, volume-based pricing can be extremely unpredictable. Airbyte is determined to change this with more “disruptive pricing” as it moves forward, Tricot promised.

Analyst Holger Mueller of Constellation Research Inc. told SiliconANGLE that Airbyte looks to be a promising startup because data integration is a very hairy problem for enterprises. He said data integration becomes more challenging by the day as companies move different automation assets to the cloud, and the cloud-based software they use gets frequently updated.

“Enterprises are in a constant state of data integration, so it’s good to see a fresh set of eyes and ears looking to take on the problem,” Mueller said. “Data integration work needs to move away from manual, declarative processes and become more automated.”

Tricot said there are currently more than 70 connectors available on Airbyte’s platform, but he’s hoping to more than double that number by the end of the year. “Our goal is to reach 200 and become the platform with the most connectors within 18 months from inception,” he said.

Photo: Amigos3D/pixabay
