UPDATED 17:14 EDT / NOVEMBER 29 2021

BIG DATA

5 insights you might have missed from the ChaosSearch: Make Your Data Lake Deliver event

ChaosSearch Inc. set out to revolutionize the data management and analytics market. The startup has opened a new window into how enterprises leverage the data lake by leaving information where it resides and allowing users to employ multiple tools to derive insight.

The company’s quest for a new data paradigm was the focus of the ChaosSearch: Make Your Data Lake Deliver event, hosted by theCUBE, SiliconANGLE Media’s livestreaming studio.

Here are five insights you might have missed from theCUBE’s interviews with ChaosSearch executives, partners and customers. (* Disclosure below.)

1. The enterprise world is moving past the ‘Hadoop hangover’

Barely 10 years ago, Hadoop was taking the enterprise world by storm. Its framework for distributed processing of large datasets led numerous companies to embrace the technology for big data management. However, with the rise of tools such as S3 for storage and Spark for processing, the bloom on Hadoop’s rose began fading not long after 2015. Headlines such as “Hadoop Has Failed Us, Tech Experts Say” and “Hadoop’s Star Dims in the Era of Cloud Object Data Storage” captured the technology’s fall out of favor.

During the aftermath, there was a realization that Hadoop could no longer deliver on the two most attractive features that made it a hit 10 years ago: performance and cost. Transformation of data lakes, through tools such as ChaosSearch, combined with cloud native technologies equipped to handle petabytes of data as a service, shifted the ground under Hadoop, although it remains in use today for large dataset processing.

“We had that Hadoop ‘hangover,’” said Thomas Hazel, founder and chief technology officer of ChaosSearch, in his conversation with theCUBE during the event. “We were using that platform too many varieties of ways. The Hadoop Compatible File System wasn’t really a service. Cloud Object Storage is a service.”

2. ChaosSearch and its customers are a major part of the open API movement

One of the key drivers behind ChaosSearch’s technology is the application programming interface. The API allows a web application to access services and information from other sources, and ChaosSearch leverages this to run its administrative interface. It was originally modeled as an extension to the API for S3 storage.

APIs opened the door to third-party integration of data and services across numerous platforms and devices. In ChaosSearch’s own model, its API is a key element in being able to deliver views of information contained in massive datasets.

“There are a lot of challenges for how to get analytics at scale,” said Ed Walsh, chief executive officer of ChaosSearch, during an interview for the event. “Point us at the data and we index it, and make it available in a data representation that can give virtual views to end users. Those virtual views are available immediately over petabytes of data, and it gets presented to an end user as an open API.”

ChaosSearch’s own customers are leveraging the firm’s technology to build API-driven businesses as well. At Digital River Inc., a provider of global e-commerce payments and marketing services, the company is using APIs to retool its model after nearly three decades of operation.

“Twenty-seven years ago, there wasn’t a cloud, there wasn’t any public infrastructure; we stood our own datacenter up in a warehouse,” said Mark Hill, senior director of IT operations at Digital River, in an interview with theCUBE. “We’ve pivoted to a more focused API offering specializing in our global seller services.”

3. AI has become a key ingredient in the AWS storage recipe

During its Storage Day event in September, AWS added intelligent tiering to both Elastic File System and S3. The firm is continuing to automate its storage portfolio, leveraging AI for cost optimization by moving some objects automatically to less expensive storage tiers.

Operating in the cloud costs money, and enterprises are also dealing with a steadily increasing volume of data. The opportunity to cut costs without manually tracking data across different storage classes is a plus for AWS customers, who can also use ChaosSearch’s SaaS solution to transform S3 storage.

AWS views its latest intelligent storage offering as a key differentiator in a highly competitive market.

“This is the only one in the cloud at this point that delivers automatic storage cost savings for the customer where the data access patterns change,” said Kevin Miller, vice president and general manager of S3 at AWS, during an interview with theCUBE. “With intelligent tiering, we’re automatically monitoring data for customers and there’s no retrieval cost and no tiering charges. We’re automatically moving the data into an access tier that reduces their cost when that data is not being accessed.”
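To make the idea of access-based tiering concrete, here is a toy sketch in Python. It is not AWS code and does not call any AWS API. The tier names, the day thresholds and the `StoredObject` type are all assumptions chosen for illustration, not the policy S3 actually applies.

```python
from dataclasses import dataclass

# Assumed thresholds for illustration only; not AWS's documented policy.
INFREQUENT_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 90


@dataclass
class StoredObject:
    """Hypothetical stand-in for an object whose access pattern is monitored."""
    key: str
    days_since_last_access: int


def choose_tier(obj: StoredObject) -> str:
    """Pick a tier name from how long the object has gone without being read."""
    if obj.days_since_last_access >= ARCHIVE_AFTER_DAYS:
        return "archive"
    if obj.days_since_last_access >= INFREQUENT_AFTER_DAYS:
        return "infrequent"
    return "frequent"


def plan_tiers(objects):
    """Map each object key to the tier the toy policy would place it in."""
    return {obj.key: choose_tier(obj) for obj in objects}


# Invented sample objects.
sample = [
    StoredObject("logs/today.json", 0),
    StoredObject("logs/last-quarter.json", 45),
    StoredObject("logs/last-year.json", 400),
]
plan = plan_tiers(sample)
```

The point of the sketch is only that placement follows observed access, so the customer never has to assign a storage class by hand.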

4. ChaosSearch is building its own case for data analytics standards

In its quest to democratize data handling, ChaosSearch is interested in creating industry standards by focusing on commonly used developer tools.

A prime example of this can be seen in the company’s most recent announcement. In mid-November, ChaosSearch announced a scalable, cloud native solution that it described as the “first to unlock JSON files for analytics at scale.”

JSON, or JavaScript Object Notation, is a human-readable data format derived from the JavaScript programming language. The JSON file format has become a standard for logging, and ChaosSearch has introduced JSON Flex, which allows users to store content and analyze it without requiring a lot of preparation or loss of insights.
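For readers unfamiliar with JSON logs, the short Python sketch below shows what analyzing nested log records can look like. It uses only the standard library and is not ChaosSearch’s JSON Flex. The log lines, field names and helper functions are hypothetical, invented for illustration.

```python
import json
from collections import Counter

# Hypothetical log lines; the field names are invented for illustration.
LOG_LINES = [
    '{"level": "error", "service": "checkout", "http": {"status": 500, "path": "/pay"}}',
    '{"level": "info", "service": "checkout", "http": {"status": 200, "path": "/pay"}}',
    '{"level": "error", "service": "search", "http": {"status": 503, "path": "/q"}}',
]


def flatten(record, prefix=""):
    """Turn nested objects into dotted keys, e.g. {"http": {"status": 500}} becomes {"http.status": 500}."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def count_by(lines, field):
    """Count the values of one (possibly nested) field across JSON log lines."""
    counts = Counter()
    for line in lines:
        flat = flatten(json.loads(line))
        if field in flat:
            counts[flat[field]] += 1
    return counts


# Which services logged errors, and which HTTP statuses appeared overall?
error_lines = [line for line in LOG_LINES if json.loads(line)["level"] == "error"]
errors_by_service = count_by(error_lines, "service")
status_counts = count_by(LOG_LINES, "http.status")
```

Flattening at read time keeps the original records untouched, which loosely mirrors the article’s point about analyzing JSON without heavy up-front preparation.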

The company’s interest in standards also extends to the data mesh, a growing option in enterprise computing that advocates for distributed solutions rather than storing data in a centralized warehouse. The need for data standards was a common theme highlighted by ChaosSearch’s Hazel in his conversation with theCUBE.

“I think the problem with today is the lack of standards,” Hazel said. “When you draw the conceptual diagrams, you’ve got a lot of lollipops which are at the eyes, but they are all unique primitives. There aren’t standards by which the consumer can take the data the way he or she wants it and build their own data products.”

5. There is no sign yet of a slowdown in demand for data scientists and database administrators

The success of ChaosSearch and other companies specializing in new automated tools for database administration and analytics raises the possibility that the need for these skills will diminish.

The firm’s solutions are clearly aimed at putting many of the functions performed by database administrators and data scientists in the hands of front-line data consumers. ChaosSearch CEO Walsh made this point during his interview with theCUBE.

“It takes about 60% of the data scientists and database analysts to do this work,” Walsh said. “You can give that as a role-based access service to your end users. You don’t have to be a data scientist or DBA.”

This raises a pertinent question: Will the industry see the demand for these jobs start to decline?

So far, the answer is no. Although 2020 was the first time in four years that data scientist was not the number one job in the U.S., data science jobs remain open five days longer than the average for all other positions, according to Glassdoor. And the U.S. Bureau of Labor Statistics continues to forecast expansion in the data science field, with 28% growth predicted in the number of jobs through 2026.

Likewise, there has been no sign of a slowdown in demand for database administrators. U.S. News & World Report ranked both DBAs and data scientists in the top 10 “Best Technology Jobs,” with a projected 23,000 job openings combined.

(* Disclosure: TheCUBE is a paid media partner for the ChaosSearch: Make Your Data Lake Deliver event. Neither ChaosSearch Inc., the sponsor for theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Image by Nepool
