UPDATED 12:00 EDT / JULY 07 2021

Airbyte enables easy data integration from multiple sources with Amazon S3 storage

Open-source data integration startup Airbyte Inc. has come up with an easy way for Amazon Web Services Inc. customers to replicate data from dozens of popular sources to their Amazon Simple Storage Service accounts.

The new capability announced today is said to be the industry’s first-ever open-source integration for a data lake. Companies can now use one of Airbyte’s more than 75 pre-built connectors to transfer data from a number of widely used databases to their Amazon S3 accounts.

Airbyte is a newly emerged data integration startup that’s trying to solve the tricky problem of moving data from applications, databases and application programming interfaces to data warehouses and data lakes, which are used to analyze that information. It rivals firms such as Fivetran Inc. and Talend SA, and does away with the need for companies to build their own data connector for each individual data source.

Airbyte’s open-source connectors automatically adjust to any schema and API changes, ensuring continuity with data projects, the company says. They run in Docker containers too, allowing them to be deployed on any kind of infrastructure.

Whereas rivals such as Fivetran and Talend build their own proprietary connectors, Airbyte’s are all built and maintained by the open-source community. When the company raised $26 million in a Series A funding round in May, Airbyte Chief Executive Michel Tricot told SiliconANGLE that the open-source approach has significant advantages, such as making connectors easier to edit.

Moreover, by building an open-source community, Airbyte believes it will eventually be able to create and maintain many more connectors than its rivals, supporting numerous smaller services, Tricot said. The company’s vision is to foster a community that will eventually build and maintain thousands of connectors.

Airbyte already offers connectors from sources such as PostgreSQL, MySQL, Facebook Ads, Salesforce and Stripe. Users can connect those to data warehouses, including Redshift, Snowflake and BigQuery.

Airbyte said that Amazon S3 is the first data lake destination it offers. That matters to some users, because data lakes and data warehouses serve different purposes.

Whereas data warehouses are filled with structured data that has already been processed and filtered for a specific purpose, data lakes are vast pools of raw data that has no predefined purpose. Data lakes are therefore more difficult to work with, but the information within them potentially holds much greater value.

Asked why Airbyte built its first data lake connector for Amazon S3, Tricot told SiliconANGLE that it was the “most popular and also the most requested” by the company’s users. Other data lakes are not being neglected, though: The company plans to add more destinations for its connectors in the near future. Its targets include “the data lakes of other cloud providers” and the open-source Delta Lake project started by Databricks Inc., it said.

“Airbyte is moving forward with its mission to commoditize all data integration and will start supporting all the other data lakes,” Tricot said in prepared remarks today. Referring to processes known as extract/transform/load or extract/load/transform, he added, “Airbyte is becoming the new de facto standard for open-source ETL/ELT.”
