UPDATED 12:47 EST / AUGUST 12 2023

Cloud vs. on-premises showdown: The future battlefield for generative AI dominance

BREAKING ANALYSIS by Dave Vellante

The data from enterprise customers is clear but conflicted: While 94% of customers say they’re spending more on artificial intelligence this year, they’re doing so with budget constraints that will steal from other initiatives.

In addition, customers’ plans for where to run generative AI are split almost exactly down the middle between public cloud and on-premises or edge. Further complicating matters, developers report that their experiences in the public cloud with respect to feature richness and velocity of innovation have been outstanding. At the same time, organizations express valid concerns about intellectual property leakage, compliance, legal risks and cost that will limit their use of the public cloud.

In this Breaking Analysis, we’ll share the most recent data and thinking around the adoption of large language models and address the factors to consider when thinking about how the market will evolve. As always, we’ll share the latest Enterprise Technology Research data to shed new light on key issues customers face balancing risk with time to value.

Enterprise IT spending remains tight

The chart below is from the latest July ETR spending snapshot. The N of 1,777 comprises senior information technology decision-makers representing more than $750 billion in spending power.

Senior IT decision-makers exited 2022 with an expectation that their budgets would increase between 4% and 5%. By January that figure was down to 4.1% and despite small sequential increases throughout the year, currently stands at 2.9%, well below initial expectations.

Budget constraints force tradeoffs

The rush to generative AI has caused organizations to reprioritize in a climate where discretionary budgets are not plentiful. As we shared in Breaking Analysis with Andy Thurai late last year, the return on AI investments has been elusive. But the ChatGPT craze forced a top-down mandate from boardrooms and as such has shifted the spending priorities in enterprise tech.

The chart above shows the sectors ETR tracks. Net Score or spending momentum is on the vertical axis and pervasiveness in the survey on the horizontal axis. Although all sectors felt the pinch of budget constraints in 2022, AI, which was leading all segments, was suppressed to the point where by October 2022, it fell below the 40% red dotted line – the high-water mark for spending velocity. ChatGPT was introduced to the market in November and since then AI spending has accelerated. However, budgets haven’t changed dramatically.

As a result, we’re seeing compression in other sectors suggesting that in the near term, funding for gen AI will be somewhat dilutive to other segments of the market.

Spending on AI is outpacing other initiatives

As we mentioned at the top, the data below shows that, among customers spending actively on generative AI, a huge majority, 94%, report accelerating their AI spend in 2023.

While most customers are reporting a modest spend increase of 10% or less, 36% say their spending will increase by double digits.

Mandate from the C-suite conflicts with risk appetites

The top-down pressure from the corner office to “figure out” generative AI is urgent. But the actual doing is much more challenging. The chart below shows what customers are doing with gen AI in production environments. Although 34% say they’re not evaluating, that number is way down from last quarter. And while you may think that 34% is very high, we believe there is a difference in the minds of respondents between playing with gen AI and “actively evaluating.”

Regardless, when you look at what’s actually happening in production environments, two things stand out: 1) Most people are still in eval mode and 2) The use cases are pretty straightforward with chatbots at the top of the list followed by code generation, summarizing text and writing marketing copy as the main areas of interest today.

We believe it’s critical for organizations to truly understand the business case and identify return on investment. The big ROI driver is going to come down to minimizing labor costs. You can put this in the productivity bucket, but at the end of the day it’s going to be about lessening the need for humans.

This doesn’t necessarily mean unemployment will rise – it simply means that the No. 1 driver of value is going to be reducing headcount requirements. And that will most certainly change the skills required for employment.

Organizations must evaluate the risks of gen AI

A key challenge facing organizations is that, while top-down momentum is real, deployment opens a can of risk worms. The slide below is from a recently released study by Technalysis, an independent analyst firm run by analyst Bob O’Donnell. It shares results from 1,000 IT decision-makers on their top concerns around gen AI: compliance, IP leakage, legal concerns such as copyright infringement and bias, data and tools quality, and the like.

These are legitimate reasons for being careful with generative AI and how it’s used.

Rethinking the cloud-vs.-on-prem balance

Much of the concern regarding gen AI risk is leading organizations to say they’re going to do gen AI on-premises.

Below is some data from ETR that shows organizations report an identical mix of public and private infrastructure – that is, public cloud or on-prem/edge deployments. The allure of the cloud is that it has the best tooling. But for the reasons mentioned in the Technalysis survey, private infrastructure is expected to be a popular deployment option.

But the cloud continues to have advantages. There’s now lots of data in the cloud – we think 40% to 45% of workloads are running in the cloud today – perhaps as high as 50% by next year. As we’ve reported in previous research, the cloud and on-prem are coming more into balance – cloud is still growing much faster – but the business case for cloud migration is not as robust for many legacy applications. We believe much of the cloud growth is new apps or features on top of existing cloud workloads.

On-premises workloads are ripe for AI injection, and incumbents such as Cisco Systems Inc., IBM Corp., Dell Technologies Inc. and Hewlett Packard Enterprise Co. are eyeing opportunities and aggressively investing.

Cloud still has a massive advantage

The fact is, in speaking with developers, the cloud is exceedingly capable when it comes to AI. Below are eight points we’ve highlighted that devs tell us the public cloud is delivering on. We believe these points can serve as guideposts for customers when considering the tradeoff in functionality between cloud and on-premises gen AI offerings.

One is the pace of innovation in AI, building on previous tooling such as Amazon SageMaker. The simplicity of integration and the productivity it drives are allowing developers to get to an outcome very quickly.

We’ve encouraged our community to check out thecubeai.com as an example and sign up for our private beta. Our team built this very quickly – in a matter of weeks using tools readily available on Amazon Web Services, including open-source large language models, MongoDB, Milvus as our vector database and other cloud tools.

It’s now taking more time to train the model based on the queries we’re getting, but the time to minimum viable product was one-10th of a normal software product development cycle.

Our experience underscores the importance of No. 4 above: model optionality and diversity, not only from the cloud vendor but also from third parties.

The points in No. 5 and No. 6 are also critical – the ability to fence off inference requests such that the LLM vendor can’t access any customer data. The richness of security offerings is a key factor as well, along with capabilities such as ensuring data stays in region and encryption for data in flight.

The cloud offers tools that are first-rate from silicon all the way through AI tool chains, maximum database optionality, governance choices, identity access, availability of open-source tools and a rich ecosystem of partners.

So one has to ask of the likes of HPE with GreenLake and Dell with APEX, even though they’re talking about offering LLMs and, in the case of GreenLake, HPE has announced LLMs as a service: How capable are these offerings and how truly integrated are they into a seamless as-a-service experience?

That being said, the advantages the traditional on-premises firms have are their relationships with customers, strong service organizations and physics. The speed of light and latency will dictate many of the deployment choices. This is something to watch closely. Although doing work on-prem can reduce risk and makes a lot of sense, much work needs to be done for incumbent firms to build out offerings and a full stack of ecosystem partners.

Comparing customer spending momentum of cloud versus incumbents

The cloud players have stronger business momentum than incumbent enterprise infrastructure players. Despite all the talk of cloud optimization, repatriation and slowing growth, the numbers still dramatically favor the cloud players.

Above is ETR data showing the Net Score breakdown for several aspiring LLM leaders. Net Score is a measure of spending velocity. It tracks the percent of customers that are new logos – that’s the lime green. The forest green represents customers spending 6% or more relative to last year. The gray is flat spending, the pink is spending 6% less or worse and the bright red is churn. Subtract the reds from the greens and you get Net Score as shown in the column to the right of the bars.

To the right of Net Score, we show the number of responses in the survey, which is a proxy for market presence. So as you can see, AWS, Microsoft Corp. and Google LLC have Net Scores of 51%, 49% and 34% respectively and Ns near or over 1,000.

Compare this to Dell and HPE with Net Scores of 18% and 9%, respectively. Dell has a large market presence with an N over 800 and HPE a respectable 483. But the cloud still has meaningfully higher momentum from a spending standpoint.
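The Net Score arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the subtraction as the text describes it, not ETR's actual methodology or tooling; the function name, the requirement that the five buckets be whole percentages summing to 100, and the example breakdown are all assumptions made for the sketch.

```python
def net_score(new: int, more: int, flat: int, less: int, churn: int) -> int:
    """Return Net Score in percentage points from the five response buckets.

    Each argument is the percent of respondents in that bucket:
    new logos, spending 6% or more, flat, spending 6% less or worse, churn.
    """
    buckets = (new, more, flat, less, churn)
    if any(b < 0 for b in buckets):
        raise ValueError("bucket percentages must be non-negative")
    # Assumption for this sketch: whole percentages that sum to exactly 100.
    if sum(buckets) != 100:
        raise ValueError("bucket percentages must sum to 100")
    # Greens minus reds; the flat (gray) bucket does not affect the result.
    return (new + more) - (less + churn)


# Hypothetical breakdown, not taken from the ETR survey.
example = net_score(new=10, more=45, flat=37, less=6, churn=2)
print(f"Net Score: {example}%")
```

Note that the flat bucket never enters the subtraction, so two vendors with the same Net Score can have very different shares of customers holding spending steady. Published survey figures may also be rounded, in which case the strict sum-to-100 check would need to be relaxed.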

Tracking some key data players

Below we show the same data for Databricks Inc., Snowflake Inc., IBM and Oracle Corp. – some of the key data platform names. Databricks, with a very solid Net Score of 60%, has taken over the top spot from Snowflake at 47%, although Snowflake has a bigger market presence. But clearly Databricks is converging on the traditional domain of Snowflake. IBM and Oracle, as you see, have lower Net Scores of 10% and -1% respectively, both with large Ns in the data set.

When will gen AI show up in the income statement?

We expect spending on AI generally and gen AI specifically will begin to have a visible impact in the second half of 2023.

Using AWS as a proxy, the chart below shows AWS’ revenue growth rates going back to Q1 2022. We think the deceleration will stabilize in Q3 and our current forecast calls for a re-acceleration of growth in Q4 thanks to AI as a tailwind and Q4 seasonality. In particular, we see gen AI driving more compute and storage as well as ancillary spend in data platforms and associated tooling.

There are risks to this scenario, including the macro environment and the law of large numbers kicking in, as well as competition, but our current thinking is that we’re at the tail end of cloud optimization and shifting to new workload enablement.

The power distribution of gen AI

John Furrier often talks on theCUBE about power laws. We’re going to close by looking at how we see a modified power law distribution of large language models.

A power law distribution is a statistical relationship between two quantities in which one varies as a power of the other. The simple way to think of a power law distribution is the 80/20 rule. For example, 80% of our sales come from 20% of the products in our portfolio. On the chart below, we’re taking liberties with the concept and saying few companies will build the largest language models. Most LLMs will live in the long tail along the X axis; they will be very specific to an industry and smaller in size.

Moreover, edge deployments will be plentiful, highly sensitive to latency, economics and power consumption.
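To make the 80/20 framing concrete, here is a minimal Python sketch, assuming an idealized Pareto distribution, which is one standard form of power law. For a Pareto distribution with shape parameter alpha greater than 1, the share of the total held by the top fraction p of items is p raised to the power (1 - 1/alpha). The function name and the sample alpha values are illustrative choices and are not drawn from the chart or from any survey data.

```python
import math


def top_share(top_fraction: float, alpha: float) -> float:
    """Share of the total held by the largest `top_fraction` of items
    under an idealized Pareto distribution with shape `alpha` > 1."""
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1 for the total to be finite")
    return top_fraction ** (1.0 - 1.0 / alpha)


# The shape parameter at which the top 20% hold 80% of the total.
ALPHA_80_20 = math.log(5) / math.log(4)

share = top_share(0.20, ALPHA_80_20)
print(f"alpha = {ALPHA_80_20:.3f}: top 20% hold {share:.1%} of the total")

# A smaller alpha (hypothetical value) concentrates even more in the head,
# which leaves a thinner torso and tail.
steeper = top_share(0.20, 1.05)
print(f"alpha = 1.050: top 20% hold {steeper:.1%} of the total")
```

The closer alpha gets to 1, the more the total concentrates in a handful of items; larger values spread more of it into the middle and the tail.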

Several points we’d like to make here:

First, we believe that enterprise tech innovation continues to be driven by consumer volumes. PC chips, data prowess from search and social media, flash storage and, more recently, gaming with Nvidia… all found their way into the enterprise via the consumer route.

The big cloud and consumer brands we believe will dominate the largest model space and the sustained running of models, whereas inference will happen on-prem and at the edge.

What’s different here from, for example, the web, where the power law curve is like a wall straight down with no torso (the orange line following the Y axis), is that the LLM curve, we believe, will be pulled up and to the right, as shown by the red dotted line. In this area, we believe open source and third-party tools will fill the gap, along with cloud partners such as Snowflake and Databricks.

The on-prem incumbents such as Dell, HPE and IBM will succeed to the extent that they’re able to leverage LLM diversity and deploy it in their go-to-market models… in a manner that is as simple as the cloud and that is more controlled and cost-effective for their specific use cases. Importantly, we believe that enterprise AI will demand clear ROI and economic value, or projects will die on the vine.

As we said earlier, we believe that primary value will come from headcount reductions.

Meanwhile, we believe that inference at the edge will be dominated by architectures built on low-cost, low-power, high-performance systems – very often Arm-based designs – which have massive volume. Think Tesla Inc. and Apple Inc. We believe that the economics at the edge will eventually find their way into the enterprise and be a disruptive force.

It may take the better part of the decade, but the economics of enterprise IT, since the PC disrupted the mainframe, have been driven by consumer volumes and we think this wave will be no different. AI + data + volume economics will determine the fundamental structure of the industry in the coming years.

That’s a bet we think is worth making in whatever industry you’re in. Applying it, however, will require careful thought and deep thinking… not AI washing.

Keep in touch

Many thanks to Andy Thurai for his insights and for participating in this Breaking Analysis segment. Alex Myerson and Ken Shifman are on production, podcasts and media workflows for Breaking Analysis. Special thanks to Kristen Martin and Cheryl Knight, who help us keep our community informed and get the word out, and to Rob Hof, our editor in chief at SiliconANGLE.

Remember we publish each week on Wikibon and SiliconANGLE. These episodes are all available as podcasts wherever you listen.

Email david.vellante@siliconangle.com, DM @dvellante on Twitter and comment on our LinkedIn posts.

Also, check out this ETR Tutorial we created, which explains the spending methodology in more detail. Note: ETR is a separate company from Wikibon and SiliconANGLE. If you would like to cite or republish any of the company’s data, or inquire about its services, please contact ETR at legal@etr.ai.

All statements made regarding companies or securities are strictly beliefs, points of view and opinions held by SiliconANGLE Media, Enterprise Technology Research, other guests on theCUBE and guest writers. Such statements are not recommendations by these individuals to buy, sell or hold any security. The content presented does not constitute investment advice and should not be used as the basis for any investment decision. You and only you are responsible for your investment decisions.

Disclosure: Many of the companies cited in Breaking Analysis are sponsors of theCUBE and/or clients of Wikibon. None of these firms or other companies have any editorial control over or advanced viewing of what’s published in Breaking Analysis.

Image: Maksym Yemelyanov/Adobe Stock
