INFRA
American Express Co. is deepening its relationship with site reliability engineering startup Traversal Inc., making a strategic investment through Amex Ventures while deploying Traversal’s artificial intelligence-driven platform across its global technology infrastructure.
The partnership reflects growing interest among large financial institutions in using AI to automate the complex work of diagnosing and resolving technology outages, a task that has traditionally required teams of engineers to manually comb through system logs and monitor dashboards.
Traversal, founded by researchers from MIT, Columbia and Cornell, is building what it describes as an AI-powered site reliability engineer. Its software analyzes massive volumes of operational telemetry data, such as logs, metrics and traces, to identify the root causes of incidents and help engineering teams restore services more quickly.
The collaboration with American Express includes both a commercial deployment and a $5 million strategic investment from the credit card company’s venture capital arm.
“American Express operates at a massive scale; reliability and performance are foundational to delivering a seamless customer experience,” said Kevin Weber, managing director at Amex Ventures. “In such a complex, distributed infrastructure environment, the focus is always on advancing how operational events are detected, understood and resolved.”
Large financial institutions often have particularly difficult operational challenges because their technology environments span thousands of applications and multiple infrastructure platforms. Troubleshooting outages may require dozens of engineers from different teams to collaborate in what are sometimes called “war rooms.”
Traversal says its technology aims to automate much of that work. The challenge isn’t collecting data, since most large enterprises already have extensive observability tools for that, but interpreting it fast enough to find the underlying cause of problems, said co-founder and Chief Executive Anish Agarwal.
“Observability helps you visualize the data, but finding the root cause is still very labor-intensive,” Agarwal said. “At Fortune 100 enterprises, you may have 50 or 100 engineers jumping into a war room to figure out what happened.”
Part of the difficulty comes from fragmentation in the observability market. Large organizations often run multiple monitoring platforms simultaneously, and those platforms don’t integrate well with each other.
“Splunk will never give you insight on data stored on Datadog, and Datadog will never give you insight on data stored on Splunk,” Agarwal said. “You need to be able to look through all of the data to give you a deep root cause of an incident.”
Traversal’s system uses large language models, AI agents and causal machine learning techniques to analyze telemetry data across those systems. Rather than merely correlating irregularities in performance data, the platform attempts to infer cause-and-effect relationships within complex software environments.
“What typical correlation engines pick up are spikes,” Agarwal said. “But understanding which is the root cause versus something that happened because something else broke requires causal reasoning.”
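To make that distinction concrete, here is a toy sketch in Python. It is not Traversal’s method, which the company has not detailed here. The service names, the dependency map and the `root_cause_candidates` function are all hypothetical, and checking only direct dependencies is a deliberate simplification. The idea is that a service which is misbehaving while everything it depends on looks healthy is a more likely origin than one sitting downstream of another failure.

```python
# Hypothetical dependency map: each service lists the services it calls.
DEPENDS_ON = {
    "checkout": ["payments", "inventory"],
    "payments": ["database"],
    "inventory": ["database"],
    "database": [],
}


def root_cause_candidates(anomalous, depends_on):
    """Return anomalous services whose direct dependencies are all healthy.

    A service that is spiking while one of its dependencies is also
    spiking is treated as a likely symptom rather than the origin.
    """
    anomalous = set(anomalous)
    return sorted(
        svc
        for svc in anomalous
        if not any(dep in anomalous for dep in depends_on.get(svc, []))
    )


# A correlation-style alert would flag all four services at once.
spiking = ["checkout", "payments", "inventory", "database"]
candidates = root_cause_candidates(spiking, DEPENDS_ON)
print(candidates)  # ['database']
```

All four services show spikes, but only the database has no misbehaving dependency of its own, so it is the one worth investigating first.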
Causal inference, AI agents and a security-focused architecture designed for highly regulated industries are among the factors that made Traversal stand out, Weber said.
“Traversal’s approach reflects an evolution in observability — moving from detecting patterns to understanding root causes with greater precision,” he said.
The investment also reflects a broader effort to explore how AI can improve operational resilience in large-scale technology environments.
“Interest in Traversal was driven by a forward-looking opportunity to enhance infrastructure operations through next-generation capabilities,” Weber said. “As AI-driven SRE becomes increasingly mission-critical, there is growing recognition across the industry that traditional observability approaches can be further strengthened.”
Traversal has raised roughly $53 million in funding to date and is positioning its platform as a foundational layer for what Agarwal calls “agentic incident response,” in which AI agents work alongside engineers to automatically diagnose and eventually remediate system failures.