UPDATED 11:50 EDT / DECEMBER 14 2023

Gen AI integration and orchestrating the cloud: A blueprint for success

In the fast-evolving landscape of cloud computing, the battle for supremacy is reaching new heights. It’s being fought on many fronts, with artificial intelligence one of the latest battlegrounds.

The industry has come a long way from the challenges of the Hadoop era to today’s innovation in data platforms. The significance of data production, and of orchestrating the flow of data throughout organizations, can’t be overstated.

“There was that cycle of going through new data platforms that I think now has arrived with a whole infrastructure that may still be very complicated, but is so much easier to produce data products,” said Steve Hillion (pictured), senior vice president of data and AI at Astronomer Inc. “And that’s what it’s about. It’s about data production — it’s about things that are meaningful for driving business.”

Hillion spoke with theCUBE industry analyst John Furrier at the “Supercloud 5: The Battle for AI Supremacy” event, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed the latest trends shaping the cloud industry and the role of generative AI and data in shaping the future of cloud computing.

The evolution of data engineering

For enterprises, refining data has become both a crucial task and a nuanced art. Organizations need to produce data and to orchestrate its flow across teams. Astronomer, the commercial developer behind Apache Airflow, has become a key player in providing a cloud service for managing data pipelines.

“Building that platform has now become what’s foundational rather than just the individual data sets,” Hillion explained. “It’s about the operational aspects of that — it’s about data ops. Data engineers, I think, have long enjoyed generating new data sets, but now they’re enjoying creating frameworks that allow everybody to build data sets.”

Data has become the lifeblood of today’s gen AI revolution. With the industry pushing for larger and more intricate language models, there’s a soaring demand for pre-trained foundational models and the tools that facilitate the production of meaningful results with less effort.

Orchestration is key, as companies build coherent operations around their data in a forest of vastly different tooling, according to Hillion.

“What I think is the impact is that you have more tooling, but now you have to make a decision about what the right architecture is,” he noted. “And that’s, in a sense, where orchestration comes in because it’s the one thing that ties together all of these different technologies. One of the most powerful things we did as a company was to produce a registry of connectors from the Airflow pipeline.”

Data engineering in the modern stack

The last two decades have seen companies elevate data ops to a chief priority. This has created a vital role for data engineers, who not only generate new datasets but also create frameworks for building datasets across organizations.

Data ops is here to stay, and reliability, scalability and operational efficiency take center stage. Astronomer provides a stable platform for data engineers, making the process of orchestrating data pipelines more accessible, Hillion added.

“If you want to be using large language models, there’s a whole host of different toolkits you can use and different providers of LLMs,” he said. “But these are the ones that we’ve pointed to, whether that’s Cohere or OpenAI or Weaviate, we have off-the-shelf connectors for those that make life a lot easier for the data engineer.”

Integrating gen AI into the modern stack pays off on more than one level: It provides tooling that yields richer results with less effort, and it eases the challenge of choosing the right architecture. By simplifying the deployment of data pipelines, Astronomer addresses real efficiency pain points for companies. Hillion pointed to an example from professional services.

“My favorite [example] is Laurel.ai, they used to be called Time by Ping,” Hillion explained. “Their job is to automate the process of creating the time sheets that lawyers and accountants have to submit at the end of the day — it’s an incredibly tedious process. Well if you think about it, you can just look at what they’re doing on their screens to explain and document what it is that they did and how they used their time on a 15-minute basis. And then you can summarize that with these large language models to submit that to the client.”

The orchestrator plays a crucial role in monitoring various components, including the accuracy of models and the execution of large language models, according to Hillion. Integration with tools like LangSmith, for instance, is a mechanism for observing the usage and execution of large language models, he added.

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the “Supercloud 5: The Battle for AI Supremacy” event:

(* Disclosure: Astronomer Inc. sponsored this segment of theCUBE. Neither Astronomer nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
