UPDATED 14:57 EDT / JULY 24 2024

SECURITY

CrowdStrike reveals cause of faulty update that led to Windows crashes

CrowdStrike Holdings Inc. has shared new details about the faulty update that it rolled out to its Falcon cybersecurity platform last week.

In a preliminary incident report released today, the company revealed that the update caused a type of error known as an out-of-bounds memory read. That error, in turn, crashed the Windows devices on which the affected Falcon installations were deployed. CrowdStrike plans to release a full incident report at a later date, along with reliability enhancements to the systems it uses to roll out updates.

Last week’s faulty update caused one of the largest information technology outages on the books. Millions of Windows machines running Falcon experienced crashes, disrupting the operations of hospitals, government agencies, airlines and numerous other organizations worldwide. Some enterprises still haven’t fully restored their systems.

Insurer Parametrix Insurance Inc. estimates that the incident will cost members of the Fortune 500 alone $5.4 billion. That sum doesn’t include the expenses Microsoft Corp. may incur. Companies in the financial services, healthcare and air travel sectors are expected to be affected the most.

Nasdaq-listed CrowdStrike is one of the world’s largest cybersecurity providers, with more than 29,000 customers worldwide. Its Falcon platform is used to protect employee devices and other systems from hackers. The platform fends off malware by installing a sensor, or lightweight monitoring program, on the computers it protects and using it to scan for malicious activity. 

CrowdStrike regularly enhances Falcon’s sensor with so-called Rapid Response Content updates, which contain data on newly identified hacking tactics. The Falcon sensor uses this data to scan the device on which it’s installed for breach indicators. According to the preliminary incident report, one of the company’s most recent Rapid Response Content updates caused last week’s Windows crashes.

The update was one of two that the company rolled out on Friday morning. According to CrowdStrike, both went through an internal system known as the Content Validator that is designed to scan new Rapid Response Content for bugs automatically. The system failed to detect the faulty update and consequently didn’t block its release.

The Falcon sensors that received the update attempted to run it using an internal component known as the Content Interpreter. This caused an out-of-bounds memory read, a type of error that occurs when a program attempts to read data from a memory address outside the region it is supposed to access, such as past the end of a buffer. The out-of-bounds memory read is what caused affected Windows machines to crash.
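To make the idea of an out-of-bounds read concrete, here is a minimal Python sketch of the kind of bounds check that guards against one. It is an illustration only, not CrowdStrike's code: the `read_field` function and the `payload` list are hypothetical names invented for this example. Python itself raises an error on such a read, whereas in lower-level languages without automatic bounds checking the same mistake can touch unrelated or invalid memory.

```python
from typing import Optional, Sequence


def read_field(fields: Sequence[str], index: int) -> Optional[str]:
    """Return fields[index], or None when index falls outside the sequence."""
    # Explicit bounds check: refuse to read past either end of the data.
    if 0 <= index < len(fields):
        return fields[index]
    return None


# Hypothetical update payload that supplies three values.
payload = ["alpha", "beta", "gamma"]

in_bounds = read_field(payload, 2)      # valid index, returns the last value
out_of_bounds = read_field(payload, 5)  # index past the end, rejected
```

The second call asks for a value the payload never supplied. With the check in place the request is refused instead of being carried out.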

“Systems in scope include Windows hosts running sensor version 7.11 and above that were online between Friday, July 19, 2024 04:09 UTC and Friday, July 19, 2024 05:27 UTC and received the update,” CrowdStrike detailed in its incident report. “The defect in the content update was reverted on Friday, July 19, 2024 at 05:27 UTC. Systems coming online after this time, or that did not connect during the window, were not impacted.”

To prevent similar incidents from happening in the future, CrowdStrike will start more thoroughly scanning Rapid Response Content updates for errors. The company detailed that its developers will use more than a half dozen different software testing methods to that end. One of the techniques the company will adopt is fault injection, which involves deliberately introducing errors into a program to check if it can recover reliably.
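As a rough illustration of fault injection, the Python sketch below deliberately damages a known-good input and checks that a parser either accepts it or rejects it in a controlled way. Everything here is an assumption made for the example: the payload format, `parse_update`, `inject_fault` and `UpdateRejected` are hypothetical and do not describe CrowdStrike's tooling. The faults are injected into the input data, which is only one of several forms the technique can take.

```python
import random
from typing import List


class UpdateRejected(Exception):
    """Raised when a payload fails validation (the safe failure mode)."""


def parse_update(payload: bytes) -> List[bytes]:
    """Parse a hypothetical format: one count byte, then that many 4-byte records."""
    if len(payload) < 1:
        raise UpdateRejected("empty payload")
    count = payload[0]
    body = payload[1:]
    if len(body) != count * 4:
        raise UpdateRejected("record count does not match payload length")
    return [body[i * 4:(i + 1) * 4] for i in range(count)]


def inject_fault(payload: bytes, rng: random.Random) -> bytes:
    """Return a damaged copy: either truncated or with one byte flipped."""
    if rng.random() < 0.5:
        return payload[:rng.randrange(len(payload))]
    data = bytearray(payload)
    pos = rng.randrange(len(data))
    data[pos] ^= 0xFF
    return bytes(data)


def run_fault_injection(trials: int = 200, seed: int = 0) -> int:
    """Count trials in which the parser failed in an uncontrolled way."""
    rng = random.Random(seed)
    good = bytes([2]) + b"AAAABBBB"  # two valid 4-byte records
    crashes = 0
    for _ in range(trials):
        try:
            parse_update(inject_fault(good, rng))
        except UpdateRejected:
            pass  # controlled rejection is the desired outcome
        except Exception:
            crashes += 1  # anything else is an uncontrolled failure
    return crashes


uncontrolled_failures = run_fault_injection()
```

A result of zero uncontrolled failures means every damaged payload was either parsed or cleanly rejected. Any other value would point to an input the parser cannot handle safely.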

The company will also upgrade the systems it uses to distribute updates. Content Validator, the backend platform it uses to check the reliability of Rapid Response Content updates before release, will receive additional “validation” features for detecting errors. One of the features is specifically designed to detect faults of the kind that caused last week’s Windows crashes.

CrowdStrike will also enhance other parts of its update management infrastructure. Going forward, the company plans to roll out Rapid Response Content gradually rather than to its entire installed base at once. After rolling out an update to an initial “canary” collection of devices, CrowdStrike developers will check for errors before releasing it more broadly.
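The general shape of a canary rollout can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of CrowdStrike's deployment system: `staged_rollout`, the `deploy` callback, the 10% canary fraction and the `host-N` device names are all hypothetical.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    devices: Sequence[str],
    deploy: Callable[[str], bool],
    canary_fraction: float = 0.1,
) -> List[str]:
    """Deploy to a canary group first; continue only if every canary is healthy.

    Returns the list of devices that received the update.
    """
    canary_count = max(1, int(len(devices) * canary_fraction))
    canaries, rest = devices[:canary_count], devices[canary_count:]

    updated = []
    for device in canaries:
        healthy = deploy(device)
        updated.append(device)
        if not healthy:
            return updated  # halt: the wider fleet never receives the update

    for device in rest:
        deploy(device)
        updated.append(device)
    return updated


fleet = [f"host-{i}" for i in range(20)]

# Hypothetical deploy callbacks that report device health after the update.
good_update = staged_rollout(fleet, deploy=lambda device: True)
bad_update = staged_rollout(fleet, deploy=lambda device: False)
```

With the healthy callback, all 20 devices receive the update. With the failing one, the rollout stops at the first unhealthy canary, so a defect reaches one machine rather than the whole fleet.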

The Falcon sensor, the lightweight program that the platform installs on customer computers, will be upgraded too as part of the initiative. CrowdStrike plans to equip the sensor with new features for recovering from faulty updates. Moreover, Falcon customers will receive the option to customize how and when they wish to download updates.

