The world experienced a digital pandemic of systems going offline and displaying the dreaded Windows Blue Screen of Death (BSOD), a catastrophic failure caused by a flawed file in an update pushed to CrowdStrike cybersecurity customers. The impacts have been obscenely widespread, hitting banks, airlines, train stations, financial exchanges, news agencies, supermarkets, and health care providers, to name a few.
CrowdStrike is used by almost 60% of Fortune 500 companies and over half of the Fortune 1,000. It is popular in the financial sector, with deployments in eight of the top 10 financial services firms. Many of the biggest technology, healthcare, and manufacturing companies are also customers.
So far, the faulty CrowdStrike update is not attributed to malicious activities, but the impacts have been massive, prompting social media to unofficially designate today as BSOD day!
Implications
This outage, which hit CrowdStrike customers running Windows systems, reinforces three important points.
First, cybersecurity solutions need deep and privileged access to systems, which makes the consequences more severe if those solutions are hijacked or malfunction. This access is necessary to make preventative defensive changes before attacks occur, to monitor for stealthy attacks, and to coordinate system-level remediation actions when necessary. But when things go wrong, those same permissions can cause equally severe damage.
The computing stack is like a layer cake, with data at the top, followed by applications, virtual machines, operating systems, VM managers, firmware, and finally hardware at the bottom. The deeper you go, the more potential there is for problems to be impactful and difficult to remedy. Cyber attackers try to get as far down the stack as possible because they can avoid detection from any layer above and are more difficult to evict. The same logic applies to errors: the deeper in the stack they occur, the wider the damage and the harder the fix.
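To make the layer-cake idea concrete, here is a minimal sketch in Python. It assumes a strictly linear stack in which every layer depends on all the layers beneath it, which is a simplification of real systems. The names `STACK` and `blast_radius` are hypothetical and exist only for illustration.

```python
# Layers listed top to bottom, in the order described above.
# Assumption: each layer depends on every layer beneath it.
STACK = [
    "data",
    "applications",
    "virtual machines",
    "operating systems",
    "VM managers",
    "firmware",
    "hardware",
]


def blast_radius(fault_layer: str) -> list:
    """Return the faulty layer plus every layer stacked on top of it."""
    depth = STACK.index(fault_layer)  # raises ValueError for unknown layers
    return STACK[: depth + 1]


if __name__ == "__main__":
    for layer in STACK:
        affected = blast_radius(layer)
        print(f"fault in {layer!r} affects {len(affected)} layer(s)")
```

In this toy model, a fault at the application layer touches two layers, while a fault at the operating system layer already touches four, which is why a misbehaving component with deep permissions can take everything above it down at once.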
Second, the risk of supply chain attacks is real, and depending on the vendor, they could be catastrophic. CrowdStrike is one of the biggest cybersecurity players in the industry. An accidental or malicious problem in its flagship product, as we have seen, can have widespread impacts on the most important sectors. Let’s be glad that this was simply a technical glitch. A malicious package inserted into an update could completely take over systems or permanently destroy them.
Third, bad updates, code bugs, and misconfigurations happen all the time. No software, firmware, or hardware company is immune. More effort is needed in development and quality assurance, but even the best organizations can make a series of mistakes. That is why it is important to not only invest in defense and prevention but also architect ways to securely recover and resolve issues when they arise.
A Perfect Storm
This event combines several attributes that amplify the damage: the issue causes catastrophic system failures (i.e., the dreaded BSOD) across a large number of systems in critical infrastructure sectors, and the offending code possesses deep permissions within the computing stack.
This is exactly the case we are seeing with CrowdStrike.
This outage reinforces the fact that cybersecurity solutions mitigate risks but also can become a source of risk. Mistakes were made. Trust was lost. The entire cybersecurity industry will be scrutinized, and that is probably the only good outcome of this mess.