CLOUD
Cloudflare Inc. experienced an hours-long outage today that took several popular services offline.
OpenAI Group PBC’s ChatGPT and Sora were among the impacted applications. Claude, Shopify and the website of New Jersey’s public transportation system reportedly experienced issues as well. The downtime lasted for about five and a half hours.
Cloudflare operates a content delivery network that powers about 20% of the world’s websites. The platform works by creating numerous copies of a website’s content and scattering them across data centers worldwide. When a user visits the webpage, Cloudflare loads its contents from the data center closest to that user. The company says that this arrangement enables it to provide latency of 50 milliseconds or less for 95% of the world’s population.
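The idea of serving each visitor from the nearest copy can be illustrated with a short sketch. This is not how Cloudflare actually routes traffic; production CDNs generally rely on network-level routing rather than geographic lookups. The data center names, coordinates and the `nearest_data_center` helper below are hypothetical and exist only to show the principle using great-circle distance.

```python
import math

# Hypothetical data center locations (latitude, longitude), for illustration only.
DATA_CENTERS = {
    "london": (51.5074, -0.1278),
    "new_york": (40.7128, -74.0060),
    "singapore": (1.3521, 103.8198),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_data_center(user_location, data_centers=DATA_CENTERS):
    """Return the name of the data center geographically closest to the user."""
    return min(
        data_centers,
        key=lambda name: haversine_km(user_location, data_centers[name]),
    )
```

Under these assumptions, a visitor in Paris, passed in as `(48.8566, 2.3522)`, would be matched to the hypothetical London site, since it is the shortest distance away of the three.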
Cloudflare’s platform also serves other purposes besides speeding up websites. Offloading traffic processing tasks to a CDN reduces the load on a website operator’s servers, which can boost operational efficiency. Additionally, Cloudflare provides cybersecurity features that filter malicious bots and other threats.
The company’s bot traffic filtering capability caused today’s outage, Chief Technology Officer Dane Knecht disclosed in a post on X. “A latent bug in a service underpinning our bot mitigation capability started to crash after a routine configuration change we made,” the executive wrote. “That cascaded into a broad degradation to our network and other services.”
Cloudflare first started observing unusual traffic on its platform around 5:20 a.m. EST. About an hour and a half later, the company updated its status page with a memo informing customers of the outage. The service disruption took the form of errors and elevated latency levels.
The company’s CDN service for websites wasn’t the only product affected by the outage. The malfunction also impacted its Application Services product suite, which provides CDN features for cloud and on-premises workloads. Additionally, the suite protects those workloads’ application programming interfaces from malicious traffic.
The outage impacted at least two other services. During the troubleshooting process, Cloudflare engineers disabled the company’s WARP virtual private network service in London. Additionally, some users had trouble accessing the company’s Cloudflare Access zero trust network access, or ZTNA, tool. ZTNA products serve a similar purpose to a VPN but can provide better security and performance.
Cloudflare switched WARP back on in London around 8:13 a.m. At 9:42 a.m., the company announced on its status page that its engineers had fixed the root cause of the outage. Cloudflare spent the next few hours monitoring the recovery process and “looking for ways to accelerate full recovery.” The service disruption ended at 11:44 a.m.
The company last experienced a major outage in June, when more than a half-dozen of its services went offline for about two and a half hours. The outage was caused by a malfunction in its Workers KV data storage platform. In a blog post published shortly after the incident, Cloudflare pledged to make infrastructure resilience improvements and enhance the software tooling it uses to recover from outages.