UPDATED 13:30 EDT / NOVEMBER 17 2021


Data lake transformation fueled by ChaosSearch approach to scalability, automation and multiple tools

There is a new era in data management and analytics, and it’s being driven by enterprise solutions implemented through firms such as ChaosSearch Inc.

In a previous paradigm, data lived in a cloud object store and had to be extracted into a new location for cleansing and structuring through a schema for analysis. This could often be complicated and expensive.

ChaosSearch has opened a new window into data management and analytics by offering a much different solution. Now enterprises can leave data where it resides and then leverage multilingual tooling to extract insights.

The ChaosSearch story is one of scale, automation, fine-grained capabilities and a growing customer ecosystem that can innovate on top of data and extract value. In doing so, the company is redefining the ultimate mission of a data lake.

“What if you could have the easy ‘in’ and the value ‘out’?” asked Thomas Hazel (pictured, left), founder and chief technology officer of ChaosSearch. “What if you had that simple ‘in’ with a unique architecture and index technology to make it virtually accessible and publishable dynamically at petabyte scale? We’re getting the benefits on both sides — schema on read/write performance with schema on write/read performance. That’s the true promise of a data lake.”

Hazel spoke with Dave Vellante, host of theCUBE, SiliconANGLE Media’s livestreaming studio, during the ChaosSearch: Make Your Data Lake Deliver event. He was joined in the interview by Ed Walsh (pictured, right), chief executive officer of ChaosSearch. Vellante also spoke with Kevin Miller, vice president and general manager of S3 at Amazon Web Services Inc., and Mark Hill, senior director of IT operations at Digital River Inc., in separate interviews during the program. They discussed how ChaosSearch provides tools for refining stored information, data as a product for consumption, innovation on top of the data lake platform, and how one customer is leveraging the solution for business insight. (* Disclosure below.)

Rise of the data mesh

ChaosSearch’s approach highlights a key enterprise computing trend. The focus on the centralized data warehouse is shifting, and the data mesh is emerging.

The concept of a data mesh was outlined by Zhamak Dehghani, director of emerging technologies at Thoughtworks Inc., as part of a blog post in 2019. She made the case that as data becomes more valuable, it must also become more distributed, counter to the existing notion that it must be stored in a centralized, monolithic location such as a data warehouse.

In a data mesh, those who create information also own it, and therefore they must also own the tools necessary to do something valuable with it for the enterprise. In this model, data becomes the product and every employee is a consumer.

“I think she’s right on with her philosophy,” Hazel said. “We’re mesh-ready; we can participate in a way that very few products can. My argument with the data mesh is that producers and consumers have the same rights; I want the consumer to be able to choose how they want to consume the data. Our data refinery is that answer.”

Analysis for S3 storage

The Chaos Refinery is designed to perform the virtual transformation of data needed by each user. It is the software platform that cleans and prepares data, allowing the user to visually interact with the information as necessary.

“We get rid of the physical extract, transform and load, which is 80% of the work, but the last 20% is done by this refinery where you can do virtual views,” Walsh explained. “You can give that as a role-based access service to your end users. You don’t have to be a data scientist or database analyst. In fact, all of our employees, regardless of seniority, if they’re in finance or in sales, they go through and learn how to do this.”
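To make the idea of a virtual view concrete, here is a minimal Python sketch of transformation at read time. It is an illustration of the general schema-on-read pattern only, not ChaosSearch's implementation or API: the sample log records, the field names and the helper functions (`virtual_view`, `typed`, `is_server_error`) are all hypothetical. The raw records stay as they are, and typing and filtering happen only when a consumer reads through the view.

```python
import json
from typing import Callable, Iterable, Iterator

# Hypothetical raw log lines, standing in for objects left untouched in a bucket.
RAW_LOGS = [
    '{"ts": "2021-11-17T13:30:00Z", "status": "500", "path": "/checkout", "ms": "812"}',
    '{"ts": "2021-11-17T13:30:01Z", "status": "200", "path": "/home", "ms": "35"}',
    '{"ts": "2021-11-17T13:30:02Z", "status": "502", "path": "/checkout", "ms": "1040"}',
]


def virtual_view(
    raw: Iterable[str],
    transform: Callable[[dict], dict],
    keep: Callable[[dict], bool],
) -> Iterator[dict]:
    """Lazily parse, reshape and filter records at read time.

    Nothing is copied or rewritten: the source lines stay as stored,
    and rows are produced only when a consumer iterates the view.
    """
    for line in raw:
        row = transform(json.loads(line))
        if keep(row):
            yield row


def typed(record: dict) -> dict:
    """Apply a schema on read: pick fields and cast strings to integers."""
    return {
        "path": record["path"],
        "status": int(record["status"]),
        "latency_ms": int(record["ms"]),
    }


def is_server_error(row: dict) -> bool:
    """Keep only rows with a 5xx status code."""
    return row["status"] >= 500


# One consumer's view of the data: typed rows for server errors only.
errors = list(virtual_view(RAW_LOGS, typed, is_server_error))
```

A different consumer could pass a different `transform` and `keep` over the same raw lines, which is the sense in which each user gets a separate view without a physical extract, transform and load step.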

ChaosSearch’s SaaS solution transforms S3 storage on AWS or storage in Google Cloud into a log and event analytics platform. In addition to querying through the Elasticsearch engine on AWS, users can analyze the data with applications such as Looker, Tableau or Grafana.

The rise of cloud native platforms has increased the amount of log data stored in S3 buckets, and a great deal of that information is being driven by mission-critical applications. Organizations need to analyze this data to see how applications are performing or being used by customers, and this is where ChaosSearch brings value.

“ChaosSearch is a good example of the kind of software that helps go up-stack, automate data management, and help customers focus on the things they want to accomplish for their business,” said AWS’ Miller. “What ChaosSearch is doing here is automatic indexing, being able to take the data as it is in their bucket, index it, keep it fresh, and allow for customers to innovate on top of that.”

Support for APIs

Digital River Inc. is an example of a company seeking to innovate on top of the ChaosSearch platform. Founded 27 years ago, Digital River grew its business rapidly as a provider of global e-commerce, payments and marketing services.

As the company expanded, its datacenter footprint and the associated costs of running IT operations grew as well. Digital River began moving to AWS several years ago and has adopted a cloud-first vision, which included leveraging the capabilities of ChaosSearch’s platform.

“We moved away from the time-consuming operational tasks and put our resources into revenue generating products, like pivoting to an API offering,” said Digital River’s Hill. “ChaosSearch has offered us a manageable and cost-effective opportunity to store months or even years of data that we can use for operations. It’s allowed us to use new technologies that make sense for business solutions.”

ChaosSearch has seen its business grow rapidly since emerging from stealth in 2020. It increased revenues 611% year-over-year and has tripled its customer base.

“We have data lake 3.0 now and we’re just stretching our legs,” Walsh said.

Here’s the complete program, part of SiliconANGLE’s and theCUBE’s coverage of the ChaosSearch: Make Your Data Lake Deliver event. (* Disclosure: TheCUBE is a paid media partner for the ChaosSearch: Make Your Data Lake Deliver event. Neither ChaosSearch, the sponsor for theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
