UPDATED 12:43 EDT / AUGUST 03 2020

CLOUD

How cloud-native observability is transforming enterprise technology

In the world of information technology operations, observability extends the principles of IT monitoring by pulling together data from logs, metrics, traces and events to empower operators to identify root causes of issues and resolve them quickly. Cloud-native observability, in turn, extends these capabilities to Kubernetes and, by extension, the full gamut of multicloud hybrid IT.

Cloud-native observability is not only relevant to organizations that are implementing Kubernetes, however. As cloud-native computing represents a paradigm shift in enterprise IT, the observability part of the story also reflects new ways of leveraging technology to manage increasingly complex IT infrastructure.

To illustrate these new approaches, I spoke with 10 vendors that are leading the charge with cloud-native observability innovation. These innovations follow three main themes, each of which represents an aspect of the cloud-native paradigm shift that is changing everything about how we run technology in our organizations. (* Disclosure below.)

Theme No. 1: Real-time visibility into root causes that accelerates the work of DevOps teams

Vendors of traditional monitoring tools focus on the needs of operators, giving them dashboards that leverage sampled data that may be minutes or even hours old. In contrast, some cloud-native observability tools provide near real-time visibility into incidents as well as their causes.

Not only does this additional speed reduce the operators’ mean time to resolution, it also gives developers insight into the impacts of the code they’re working on at the moment – either in development or test environments, or in the production environment during canary testing.

Vendors that bring this real-time capability to cloud-native observability include Instana Inc., which provides feedback for continuous integration and continuous deployment (CI/CD) activities, as well as root-cause identification, analytics and insight into the context of service dependencies.

Humio Ltd. also provides an intuitive tool for developers that offers visibility into operational behavior at the time of user interaction. In fact, Humio focuses on support for human interaction with data, both for operators and developers. A third vendor, Logz.io, focuses in particular on cloud observability for engineers.

Three incumbent offerings also provide real-time visibility that supports DevOps activities: VMware Tanzu Observability (formerly Wavefront), Splunk SignalFx and New Relic. In fact, Splunk Inc. and New Relic Inc. take support for developers one step further by offering programmability in their observability platforms. New Relic enables engineers to craft custom dashboards, while Splunk’s SignalFx platform is fully programmable: every capability that it exposes to users is available via an application programming interface.

Theme No. 2: Automated, AI-driven root-cause detection

AIOps – leveraging AI (in particular, machine learning) to uncover anomalies in operational data and determine their root causes – is now a burgeoning market in its own right. Many cloud-native observability vendors also offer AIOps capabilities, with a cloud-native twist.

Zebrium Inc., for example, offers a log manager that provides autonomous incident and root cause detection. By “autonomous,” the company means that its tool features unsupervised machine learning that leverages common patterns of software failure. Zebrium can then find hotspots of abnormally correlated anomalous patterns automatically, giving operators exceptional insight into root causes of issues.
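To make the idea of "hotspots" of correlated anomalies concrete, here is a minimal sketch. It is not Zebrium's algorithm, which the company does not detail here; it simply assumes that a log line is unusual if its shape (its "template") appears at most once, and that a hotspot is a time window in which two or more distinct unusual templates show up together. The function names, thresholds and log lines are all hypothetical.

```python
import re
from collections import Counter, defaultdict

def to_template(line: str) -> str:
    """Collapse digit runs so lines with the same shape share one template."""
    return re.sub(r"\d+", "<N>", line.lower())

def find_hotspots(events, window=60, rare_max=1, min_rare=2):
    """events: iterable of (timestamp_seconds, log_line).
    Returns [(window_start, [rare templates])] for windows in which at
    least `min_rare` distinct rare templates occur together."""
    events = list(events)
    counts = Counter(to_template(line) for _, line in events)
    rare_by_window = defaultdict(set)
    for ts, line in events:
        tpl = to_template(line)
        if counts[tpl] <= rare_max:
            # Bucket each rare line by the start of its time window.
            rare_by_window[(ts // window) * window].add(tpl)
    return sorted(
        (start, sorted(tpls))
        for start, tpls in rare_by_window.items()
        if len(tpls) >= min_rare
    )

# Synthetic log: routine requests plus two unusual lines seven seconds apart.
log = [(i * 10, f"GET /api/items 200 {i}ms") for i in range(50)]
log += [(305, "disk latency 950 ms on volume 7"),
        (312, "OOM killer terminated pid 4411")]

hotspots = find_hotspots(log)
print(hotspots)
```

No labels or training examples are supplied, which is the sense in which such an approach is unsupervised: the routine request lines collapse into one common template, and only the window containing both rare lines is reported.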

VMware Inc. also combines AIOps and cloud-native observability. VMware Tanzu Observability enables operators to troubleshoot across heterogeneous technology stacks with AI-driven root cause analytics.

The standout vendor bringing AI to the cloud-native observability story, however, is Carbon Relay. It offers automatic ML-powered optimization for Kubernetes applications.

In other words, Carbon Relay is proactive, since it continually assesses all relevant factors to determine the best set of deployment choices and then automatically implements them. It also recalculates on the fly to maintain top performance as conditions change.

It could be argued that Carbon Relay doesn’t really offer observability at all, as it focuses more on optimization. But given that cloud-native observability includes empowering operators to fix issues, what better fix is there than a proactive optimization that prevents issues in the first place?

Theme No. 3: All the data, all the time

Operational telemetry has always been big data – all the logs, events and other streams of information coming off of every application and infrastructure component every second of every day.

Historically, processing and storing such vast quantities of information was cost prohibitive, so IT operations technologies had to work on samples – small subsets of all available data that statistically represented the behavior of the environment as a whole.

Today, the situation has worsened, as the number of data sources has exploded with the diversity of technologies and environments in the modern IT landscape. Combine this explosion with the fact that much of that technology is dynamic or even ephemeral, and sampling becomes increasingly impractical and ineffective.

Fortunately, the cost of storing and processing such data has also dropped, enabling some cloud-native observability vendors to process all available operational data, all the time, cost-effectively.
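A small, synthetic illustration shows why sampling struggles with ephemeral components. The sketch below assumes a stream of 10,000 events in which a short-lived pod emits just three errors, and compares a 1-in-100 systematic sample with a scan of the full stream. The component names, event counts and sampling rate are made up for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    index: int      # position in the telemetry stream
    source: str     # emitting component (hypothetical names)
    level: str      # "INFO" or "ERROR"

def build_stream(total: int = 10_000) -> list[Event]:
    """Synthetic stream: steady INFO traffic plus a short ERROR burst
    from a pod that only lives for three events."""
    burst = {4217, 4218, 4219}
    return [
        Event(i, "ephemeral-pod-7", "ERROR") if i in burst
        else Event(i, "web-frontend", "INFO")
        for i in range(total)
    ]

def systematic_sample(events: list[Event], every: int) -> list[Event]:
    """Keep one event out of every `every` (a 1-in-N systematic sample)."""
    return events[::every]

def error_sources(events: list[Event]) -> set[str]:
    """Names of components that emitted at least one ERROR."""
    return {e.source for e in events if e.level == "ERROR"}

stream = build_stream()
sampled = systematic_sample(stream, every=100)

print(len(sampled), error_sources(sampled))   # the sample contains no errors
print(len(stream), error_sources(stream))     # the full stream reveals ephemeral-pod-7
```

The sample is a fair picture of the steady-state traffic, yet the pod that failed never appears in it. A random sample would catch the burst occasionally, but the shorter-lived the component, the lower those odds become.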

Epsagon Inc., for example, provides full visibility for containers, virtual machines, serverless functions and other elements of modern, cloud-native infrastructure. By leveraging all these data, it can automate detection, troubleshooting and resolution of issues with instant data correlation, payload visualization and full-depth tracing.

In fact, Epsagon automatically discovers the components of the entire application stack, allowing operators to see performance metrics for any production resource across the cloud-native landscape.

Sharing this automated discovery functionality is StackState B.V. It offers full-stack observability, with a single platform for on-premises, microservices and multicloud IT deployments – in other words, full hybrid IT support.

StackState automatically discovers the elements of the infrastructure so that it can track every change. In effect, it maintains the entire state of the enterprise topology and can play back that state as of any point in time, a capability the company calls “time travel.”
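One common way to support this kind of playback is to store every change in an append-only log and rebuild the state by replaying the log up to the requested moment. The sketch below illustrates that general pattern only; it is not StackState's implementation, and the class, method and component names are hypothetical.

```python
from bisect import bisect_right

class TopologyHistory:
    """Append-only log of component changes that can be replayed to any time."""

    def __init__(self):
        self._times = []     # change timestamps, kept in ascending order
        self._changes = []   # (component, state) pairs; state None = removed

    def record(self, ts, component, state):
        if self._times and ts < self._times[-1]:
            raise ValueError("changes must be recorded in time order")
        self._times.append(ts)
        self._changes.append((component, state))

    def state_at(self, ts):
        """Rebuild the topology as it stood at time `ts` (inclusive)."""
        topology = {}
        # bisect_right finds how many recorded changes happened at or before ts.
        for component, state in self._changes[:bisect_right(self._times, ts)]:
            if state is None:
                topology.pop(component, None)
            else:
                topology[component] = state
        return topology

history = TopologyHistory()
history.record(100, "checkout-svc", {"version": "1.4", "depends_on": ["db"]})
history.record(160, "cache", {"version": "6.0", "depends_on": []})
history.record(220, "checkout-svc", {"version": "1.5", "depends_on": ["db", "cache"]})
history.record(300, "cache", None)

print(history.state_at(200)["checkout-svc"]["version"])
print(sorted(history.state_at(250)))
print(sorted(history.state_at(300)))
```

Replaying from the beginning is the simplest possible design; a production system would more plausibly combine periodic snapshots with the change log so that a lookup does not have to scan the whole history.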

Splunk, VMware and New Relic also offer the ability to analyze all the operational data all the time. Splunk SignalFx provides a real-time streaming analytics engine that gives operators and developers the ability to monitor the entire infrastructure in seconds, not minutes. VMware also offers full-stack visibility across the full production environment including VMs, Kubernetes, serverless, cloud services and infrastructure.

New Relic lets operators search across all entities, including apps, hosts, containers, Kubernetes clusters, cloud services, databases and VMs. Because all the data are in one place, New Relic customers can see all the relationships and dependencies among their infrastructure entities and understand the context of those data within the cloud-native infrastructure. New Relic is also lowering its prices, making it more cost-effective to leverage all the data.

Making the right choice about cloud-native observability

The more established vendors in this article predictably have a broader, more complete offering than the startups. The startups, however, are driving innovation in their particular areas of focus.

More complete offerings may make more sense generally, but they may also be more difficult to implement and leverage to their full extent. The younger offerings may not have as many features, but their time to value is generally shorter than that of the large vendors’ products.

As has always been true with IT operations tooling, organizations will never have just one tool. True, too many tools can also cause issues, but the best-run shops leverage a carefully selected set of complementary tools. Many things are different about cloud-native computing, but this basic fact will remain true for the foreseeable future.

Jason Bloomberg is founder and president of Intellyx, which publishes the Cloud-Native Computing Poster and advises business leaders and technology vendors on their digital transformation strategies. He wrote this article for SiliconANGLE. (* Disclosure: At the time of writing, New Relic is an Intellyx customer and VMware is a former Intellyx customer. None of the other organizations mentioned in this article is an Intellyx customer.)

Photo: Kate Ter Haar/Flickr

