UPDATED 16:14 EDT / JULY 20 2024

SECURITY

CrowdStrike incident sounds an alarm on critical infrastructure

The chaos caused by yesterday’s content update by CrowdStrike Holdings Inc. shows that even the most successful cybersecurity firms, with great management, award-winning products and a growing business, are exposed to unexpected events.

What’s even more important is that it underscores the fragility of our connected world and the critical infrastructure that makes it run. Virtually every industry, and who knows how many people, were affected by the update CrowdStrike pushed out. The focus right now is on getting customers back online. For many, it’s an on-the-job training session in disaster recovery and business resiliency.

In this Breaking Analysis, we give you our assessment of the events of July 19, related to the CrowdStrike content update — what we know and what the potential exposure could be to the company. We’ll share new data from an Enterprise Technology Research “Flash Survey” and talk about what’s next in this saga.

Day zero data shows major business disruptions across industry

In a survey of 100 CrowdStrike customers, on the day of the incident, 96% said they were affected by system crashes. More than half indicated they’re rethinking their consolidation plans with CrowdStrike or looking at reducing reliance on CrowdStrike as a direct result of this incident. Mind you, ETR took this survey at the absolute worst time, when organizations were dealing with the business impact on a Friday. But this data gives us a sentiment snapshot that we can measure over time as customers assess the damage and work with CrowdStrike to ensure this won’t happen again.

We know from other incidents that some end up being more benign and fade over time, while others cause more lasting reputational and business damage. The broad industry impact of this situation is cause for concern. Though it’s too early to draw any conclusions with respect to the long-term impact on CrowdStrike, suffice it to say that the short-term reaction from customers is quite negative and likely reflects a fair dose of emotional bias.

The flash crash was a ‘glitch,’ this wasn’t?

Remember the Flash Crash of May 2010? It was an event in which, for no clear reason, the market dropped 600 points in a matter of minutes. Blue-chip stocks such as Accenture traded at a penny for a few minutes before the market rebounded. In his book “Flash Boys,” Michael Lewis cited the opaque nature of so-called software glitches and shared the growing concerns over the lack of transparency stemming from the Securities and Exchange Commission’s obscure response.

Though the Flash Crash shone a light on the risks of our increasingly digital world — mind you, this was 14 years ago — the recent CrowdStrike incident, while hugely impactful, was not a “black box.”

CrowdStrike CEO started communicating immediately

George Kurtz, chief executive of CrowdStrike, appeared on several TV programs this morning and took responsibility for the incident. He was humble, as was Todd McKinnon, CEO of Okta Inc., a couple of years ago when he had to take the shrapnel for an Okta breach.

We knew early on this was not a breach. Rather, it was caused by a CrowdStrike push initiated to update a content file. Our understanding is that generally, an update in this context refers to a content push that includes new data or rules used by security sensors to detect and respond to threats. A report on Hacker News suggests that CrowdStrike pushed a new driver to the kernel of Windows clients to fix an issue with an earlier version of its Falcon sensor.

According to Kurtz, the issue was identified and the update was rolled back. Before it could be fully contained, it hit a number of Windows systems (our understanding is both Windows Server and PCs), causing the “blue screen of death,” a continuous loop of impossible-to-eliminate blue screens.

As we said, virtually every industry was affected. Though some fixes could be initiated via a reboot, because there are so many permutations and combinations of Windows OS configs, many systems need to be recovered manually.

To be clear, other than the fact that companies are running many versions of Windows, this was not a Windows problem. To our understanding, it certainly was not a security failure of Windows, a process failure on Microsoft’s end or a failure on the part of CrowdStrike’s information technology customers. It appears that this was an automated push initiated by CrowdStrike, but we’re still trying to get the details there.

Many questions to be addressed

There are several issues this incident brings to light. Let’s explore some.

The first question one would ask is: Why wasn’t this staged? Kurtz explained to Jim Cramer on TV that it was staged. The follow-up question we have that wasn’t asked is: Were customers given the option of holding off on this push and timing it on their own? If not, why not? Is that standard procedure or was this a process fail?
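To make the staging question concrete, here is a minimal sketch of what a ring-based rollout with a customer hold option could look like. It is a generic illustration only and does not describe CrowdStrike’s actual process. The `Host` class, the `hold` flag, the ring numbers and the `is_healthy` check are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Host:
    name: str
    ring: int              # 0 = canary ring, higher numbers = broader rings
    hold: bool = False     # hypothetical customer-controlled deferral flag
    version: str = "old"


def staged_rollout(
    hosts: List[Host],
    new_version: str,
    is_healthy: Callable[[Host], bool],
    max_failure_rate: float = 0.0,
) -> Optional[int]:
    """Push new_version one ring at a time.

    Returns the ring at which the rollout halted, or None if every ring passed.
    """
    for ring in sorted({h.ring for h in hosts}):
        # Hosts whose owners asked to hold the update are skipped.
        targets = [h for h in hosts if h.ring == ring and not h.hold]
        if not targets:
            continue
        previous = {h.name: h.version for h in targets}
        for h in targets:
            h.version = new_version
        failed = sum(1 for h in targets if not is_healthy(h))
        if failed / len(targets) > max_failure_rate:
            # Revert this ring and stop before later rings are touched.
            # Assumption: hosts are still reachable, which was not true for
            # machines stuck in a crash loop in the incident described here.
            for h in targets:
                h.version = previous[h.name]
            return ring
    return None


fleet = [
    Host("canary-1", ring=0),
    Host("canary-2", ring=0),
    Host("prod-1", ring=1),
    Host("prod-2", ring=1, hold=True),
]

# Simulated health check: any host running the "bad" update fails.
halted_at = staged_rollout(fleet, "bad", lambda h: h.version != "bad")
print(halted_at, [h.version for h in fleet])
```

In this toy run the faulty update stops at the canary ring and never reaches the broader ring, and the host flagged `hold` is never touched. Whether customers had any equivalent control in the real push is exactly the open question.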

Why were only Windows systems affected? Was this a Windows issue? No – this appears to be a CrowdStrike issue in an update that was designed to address Windows systems only. Mac and Linux systems were not a target of this content push. [Note: In the accompanying video, the author mistakenly implied that the push was staged and as such avoided going out to Mac and Linux users. This is incorrect.]

Why are some Windows systems back online and others not? It’s because there are many permutations of Windows. There are so many configurations that automation and reboots alone might not fix every system.

How do systems get fixed? Some can be fixed with a reboot, but many are still down and need manual intervention. IT departments are scrambling, working with cloud vendors to recover. According to one thread on Hacker News:

Most of our nodes are boot looping with blue screens which in the cloud is not something you can just hit F8 and remove the driver. We have to literally take each node down, attach the disk to a working node, delete the .sys file and bring it up. Either that or bring up a new node entirely from a snapshot.
This is fine but EC2 is rammed with people doing this now so it’s taking forever. Storage latency is through the roof.

Hacker News Thread

Who is responsible for this mistake? CrowdStrike is taking responsibility for this – at least so far. And it appears it was CrowdStrike’s automated push that potentially bypassed any customers’ ability to stage the push themselves. This is probably a common occurrence. What isn’t common is a bug in the software that takes the internet down. Our colleague Alex from our production group asked, “Is this what was supposed to happen on January 1st, 2000? (Y2K)?” Yeah, that’s right.

So the other question is: Does CrowdStrike really have kernel-level access to all these devices? Yes, it does. Falcon has privileged access. It has earned that privilege, but many customers will be revisiting processes.

And you may be asking: Are companies really still running their business on Win Server and older versions of Windows on their PCs? Yes, absolutely.

The big question for investors is: What is the damage to the economy? Who knows? It was expensive. Pick your cost of downtime and multiply it by the number of businesses affected. And what is the damage, both reputational and financial, to CrowdStrike? It’s TBD.

Assessing the exposure to CrowdStrike

Below, we’ve taken a look at the stock market reaction with some CrowdStrike key performance indicators to assess the potential exposure. This chart shows the Nasdaq in the yellow and CrowdStrike in the blue line. CrowdStrike stock opened Friday morning down 15%, traded below $300 for most of the afternoon, then recovered a bit at the end of the day to close down 11%.

CrowdStrike is on track for $4 billion in revenue in fiscal 2025. It had a market cap of more than $80 billion at Thursday’s close. It’s now at $74 billion, down around $8 billion for the day. SentinelOne Inc. closed up almost 8% Friday and Palo Alto Networks Inc. was up 2%. The Nasdaq closed down almost 1%. Again, it’s unclear what type of legal exposure CrowdStrike has, but it’s very clear that the industries affected were truly across the board.

CrowdStrike’s spending profile prior to the incident

Let’s take a look at some of the ETR data now from the most recent July survey before getting into the flash survey. Below we’re showing CrowdStrike’s Net Score granularity.

Net Score is ETR’s proprietary methodology that breaks down a company’s spending momentum and is measured over time. There are more than 400 CrowdStrike customers in this sample. The lime green indicates the percentage of those customers in the survey adding CrowdStrike new. The forest green, at 44%, represents the percentage of customers planning to spend 6% or more relative to last year. The gray at 38% is flat spend, indicating plus or minus 5%. The pinkish area is spending down 6% or worse. The percentage leaving the platform or putting it into containment is the bright red at 4%.

Subtract the reds from the greens and you get Net Score of 48%. That’s that blue line, which relative to October 2023 has popped up. You can see relative to July 2023, it’s about the same. So it’s back up at very high levels. Note: A Net Score over 40% is considered highly elevated.
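The subtraction described above can be sketched in a few lines of Python. This is our own illustration of the arithmetic, not ETR’s code. The text gives only three of the five segments (44%, 38% and 4%) plus the 48% result, so the lime green and pinkish values below are inferred by assuming the five segments sum to 100%; rounding in the published figures could shift them by a point.

```python
def net_score(adding_new, spending_more, flat, spending_less, leaving):
    """Greens minus reds, in percentage points. Flat spend does not count."""
    total = adding_new + spending_more + flat + spending_less + leaving
    if total != 100:
        raise ValueError(f"segments should sum to 100, got {total}")
    return (adding_new + spending_more) - (spending_less + leaving)


# Figures stated in the text.
stated = {"spending_more": 44, "flat": 38, "leaving": 4}
reported_net_score = 48

# Inferred, not reported: the two segments the text does not quantify.
# adding_new + spending_less = what is left of 100%
# adding_new - spending_less = net score minus the stated green, plus the stated red
remainder = 100 - sum(stated.values())
gap = reported_net_score - stated["spending_more"] + stated["leaving"]
adding_new = (remainder + gap) // 2
spending_less = (remainder - gap) // 2

print(adding_new, spending_less)
print(net_score(adding_new, stated["spending_more"], stated["flat"],
                spending_less, stated["leaving"]))
```

Under those assumptions, roughly 11% of the sample is adding CrowdStrike new and about 3% is cutting spend, which reproduces the 48% Net Score.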

CrowdStrike is a clear leader relative to peers

The chart below compares CrowdStrike to a selected group of security peers. It shows Net Score on the vertical axis and Pervasion or the presence in the data set on the horizontal axis. Anything over that 40% red line is considered highly elevated.

You can see CrowdStrike is way up there. Microsoft, of course, is up there as well and way to the right. Microsoft, we would say, is CrowdStrike’s biggest competitor. In fact, Kurtz is very often quoted as saying, “Good enough security is not good enough.” He’s talking about Microsoft. There’s another joke in the CrowdStrike community that “Patch Tuesday means Hack Wednesday.”

CrowdStrike is all about speed. The premise that Kurtz puts forth is: You’ve got to be fast, you’ve got to be faster than the bad guys. He drives race cars. He emphasizes the importance of speed and finding things and remediating things quickly, which probably led to this culture of fast pushes.

It probably wanted to do the best for its customers and, as a result, pushed things out quickly, potentially without giving customers an opportunity to slow them down. But in this case, it appears to have backfired. Again, we’re speculating here, but it seems that’s the case.

You can see in the data how well CrowdStrike is executing. Palo Alto is right there as well, and so are Zscaler Inc., SentinelOne and Fortinet Inc. And you can see a number of others in the mix — Sophos Group Inc., Trend Micro Inc. and the like.

The data underscores CrowdStrike’s strong market presence and it’s how they’ve achieved such a high valuation relative to most others.

Customer survey shows broad impacts and concerning reactions

Let’s take a look at the flash survey results from ETR. ETR asked 100 CrowdStrike customers the question, “Were you impacted by this incident?” Ninety-six of those 100 said they were impacted. Below you can see the degree of impact.

Forty percent cited extremely or very significant business disruption. That is a lot of money down the drain. Six percent said their business had to shut down nearly all essential operations. Twenty-two percent said somewhat significant. So around 68% of customers had a significant disruption, with nearly half (46%) reporting that it was highly disruptive.
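For readers who want to check the tally, here is a small sketch of how the figures above add up. It assumes the three response buckets are mutually exclusive and that, with 100 respondents, percentages equal respondent counts. The dictionary keys are our own labels, not ETR’s survey wording.

```python
# Percent of the 100 surveyed customers in each bucket, as stated in the text.
impact = {
    "shut_down_nearly_all_operations": 6,
    "extremely_or_very_significant": 40,
    "somewhat_significant": 22,
}
affected = 96  # percent who said they were impacted at all

significant = sum(impact.values())
highly_disruptive = (impact["shut_down_nearly_all_operations"]
                     + impact["extremely_or_very_significant"])
# Inferred, not reported: impacted customers below the "significant" threshold.
lesser_impact = affected - significant

print(significant, highly_disruptive, lesser_impact)
```

The 68% and “nearly half” figures fall out directly. The remainder of the 96% would be customers who were affected but reported less than a significant disruption.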

Instant reaction is very negative

Not surprisingly, the majority of customers (nearly 60%) indicated that this incident will change the way they think about their CrowdStrike investments. We caution that the emotional and recency bias of this survey are in play, so think of this as a snapshot in time. Our view is CrowdStrike’s damage control will be in full swing and it will do what’s necessary to dramatically calm the base. But the data below are concerning.

More than 50% of the surveyed customers say they will reconsider their situation with CrowdStrike. This includes either rethinking how they’re consolidating their stack around CrowdStrike, which is a big theme of CrowdStrike’s value prop, or reducing their exposure to CrowdStrike.

Major migrations are not easy

The negative reaction to this incident is understandable. But the reality is CrowdStrike is strategic to many accounts and it’s not a trivial exercise to migrate off a platform such as Falcon. Contractual commitments, processes, skill sets, relationships will all come into play. And we believe that CrowdStrike’s efforts toward transparency and remediation will calm the customer base over time. We’ll see and will measure the impact as we go forward.

The chart below shows the degree of difficulty as the survey respondents see it.

Overlap with Microsoft accounts

Let’s take a look at the final chart and share with you the overlap in the accounts. What the data below shows is just Microsoft accounts, more than 1,500 in the most recent ETR survey. The vertical axis here is shared Net Score, or spend momentum, and the horizontal axis is Overlap within those 1,538 accounts.

You can see CrowdStrike has a 26% overlap — that is, 26% of those Microsoft accounts also have CrowdStrike — and 15% have SentinelOne. We chose SentinelOne because it’s probably one of the closest comparisons, although it’s more down-market.

Palo Alto Networks is highly penetrated as well. You can see Palo Alto has the biggest presence, with an N of 478 in the table at the upper right, followed by CrowdStrike with 401. You can also see that CrowdStrike’s Net Score, at 47%, is well over the 40% marker, with Palo Alto right at the 40% mark, and you can see how the others are in play there. We’ve also selected Tanium Inc., Sophos, Malwarebytes Inc., Trellix, which used to be McAfee and FireEye, Carbon Black and Trend Micro, all of which have endpoint security products.

The point is CrowdStrike really has a big relative presence in these accounts, and it’s going to be really important for us to gauge how that initial negative reaction that we saw from the ETR survey plays out in the next round of surveys.

Broad-based impact humanizes cybersecurity

Let’s close with what we’re calling a wake-up call for organizations, because it underscores the perils of digital transformation and automation.

Sometimes we use the term GRS, that is, getting rid of stuff. GRS in IT is always a challenge. People are running their business on Windows and Windows Server. There’s still, as we said earlier, a lot of .NET out there and this all creates technical debt and a lot of legacy infrastructure complexity.

It means more permutations and so many different ways to configure Windows. Third-party firms such as CrowdStrike spend a lot of time and money making sure that they can manage all those different configurations. But in this case, as we’ve seen, it has turned into a real nightmare for the company and its customers. So far, CrowdStrike’s communications have been humble and open. We’ll give them props for that. But we don’t know everything yet, so we encourage continued transparency.

We’ll see how litigious customer organizations get. The economic impact could be huge. So at some point, CrowdStrike will have to limit its liability, but it’s premature right now. So much is unknown, again, including how aggressive these affected organizations are going to be toward CrowdStrike.

We don’t want to speculate too much, but there’s a lot of information floating around in threads, on Twitter, in Hacker News, in Reddit, in other places. We’ve reached out to several of our IT contacts. Many have said things like “If you were Windows-based, you couldn’t work yesterday.” They’ve told us they’re in recovery hell right now, but we will be unpacking the anatomy of this incident as time goes on.

Let us know how you were affected by this incident, how CrowdStrike has responded and what it means for your relationship going forward.
