CrowdStrike uncovers the root cause of global system disruptions.

The cybersecurity company CrowdStrike has published its root cause analysis of the faulty Falcon Sensor software update that crashed millions of Windows devices worldwide. The “Channel File 291” incident, as detailed in the company’s Post-Incident Review (PIR), stemmed from a content validation issue that arose after CrowdStrike introduced a new Template Type for detecting attack techniques that abuse Windows named pipes and other Inter-Process Communication (IPC) mechanisms.

The Cause of the Problem
CrowdStrike attributed the crash triggered by the problematic content update, which was delivered from the cloud, to a “confluence” of several deficiencies. The most prominent was a mismatch between the 21 inputs passed to the Content Validator through the IPC Template Type and the 20 inputs supplied to the Content Interpreter. The mismatch slipped through “multiple layers” of testing because wildcard matching criteria were used for the 21st input, so the missing input never actually had to be read.

The Impact and Solution of the Problem
CrowdStrike explained that sensors that received the new version of Channel File 291 were exposed to a latent out-of-bounds read issue in the Content Interpreter. At the next IPC notification from the operating system, the new IPC Template Instances were evaluated and specified a comparison against the 21st input value. Because the Content Interpreter had been given only 20 inputs, the comparison triggered an out-of-bounds memory read and a system crash.
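The way a wildcard can hide a missing input, and a concrete value can expose it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `WILDCARD` and `evaluate`, and the list-based data layout, are hypothetical and are not taken from CrowdStrike’s code. Python raises an `IndexError` where native code would read past the end of the array.

```python
WILDCARD = "*"  # hypothetical marker meaning "match anything"

VALIDATOR_FIELD_COUNT = 21    # fields defined by the template type
INTERPRETER_INPUT_COUNT = 20  # inputs actually supplied at runtime


def evaluate(criteria, inputs):
    """Return True if every non-wildcard criterion equals its input."""
    for index, criterion in enumerate(criteria):
        if criterion == WILDCARD:
            continue  # the input is never read, so a missing one goes unnoticed
        if inputs[index] != criterion:  # no bounds check before the read
            return False
    return True


inputs = [f"value{i}" for i in range(INTERPRETER_INPUT_COUNT)]

# Instance like those used in testing: the 21st criterion is a wildcard.
tested = list(inputs) + [WILDCARD]
assert len(tested) == VALIDATOR_FIELD_COUNT
tested_result = evaluate(tested, inputs)

# Later instance: the 21st criterion is a concrete value to compare against.
deployed = list(inputs) + ["some-pipe-name"]
try:
    evaluate(deployed, inputs)
    crashed = False
except IndexError:
    crashed = True  # stand-in for the out-of-bounds read in native code
```

In this sketch the first instance evaluates cleanly even though only 20 inputs exist, while the second fails as soon as the 21st criterion requires a real comparison.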

To address this issue, CrowdStrike has implemented several measures, including validating the number of input fields in the Template Type when the sensor is built and adding runtime bounds checks on the input array to the Content Interpreter. Together, these measures are intended to prevent out-of-bounds memory reads of this kind.
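A minimal Python sketch of these two safeguards follows. It is not CrowdStrike’s implementation: the function names are hypothetical, and treating an out-of-range field as “no match” is an assumption made purely for illustration.

```python
WILDCARD = "*"  # hypothetical marker meaning "match anything"


def check_field_counts(template_field_count, interpreter_input_count):
    """Build-time style check: fail fast if the two counts disagree."""
    if template_field_count != interpreter_input_count:
        raise ValueError(
            f"template type defines {template_field_count} fields but the "
            f"interpreter supplies {interpreter_input_count} inputs"
        )


def evaluate_checked(criteria, inputs):
    """Evaluate criteria, refusing to read past the end of the inputs."""
    for index, criterion in enumerate(criteria):
        if criterion == WILDCARD:
            continue
        if index >= len(inputs):
            return False  # assumption: an out-of-range field counts as no match
        if inputs[index] != criterion:
            return False
    return True


try:
    check_field_counts(21, 20)
    build_ok = True
except ValueError:
    build_ok = False  # the 21-versus-20 mismatch is caught before release
```

The build-time check stops the mismatch from shipping at all, and the runtime check limits the damage if a mismatch slips through anyway.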

Additional Improvements and Independent Review
CrowdStrike also plans to increase test coverage during Template Type development to include test cases for non-wildcard matching criteria for each field in all future Template Types, since wildcard criteria are what allowed the mismatch to go unnoticed. Additionally, the following modifications have been made:
– The Content Validator now includes new checks to ensure content in template instances does not exceed the number of fields provided to the Content Interpreter.
– The content configuration system has been updated with new testing procedures and additional layers of deployment and verification.
– The Falcon platform has been upgraded to provide customers with more control over the delivery of rapid response content.
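The validator check and the broader test coverage described above might look roughly like the following Python sketch. All names here are hypothetical, and the real Content Validator is certainly more involved; the point is only to show a count check combined with test cases that place a concrete (non-wildcard) value in each field in turn.

```python
WILDCARD = "*"  # hypothetical marker meaning "match anything"


def validate_instance(criteria, interpreter_input_count):
    """Return a list of problems; an empty list means the instance passes."""
    problems = []
    if len(criteria) > interpreter_input_count:
        problems.append(
            f"instance has {len(criteria)} criteria but the interpreter "
            f"receives only {interpreter_input_count} inputs"
        )
    return problems


def single_field_cases(field_count):
    """Yield one test instance per field, concrete in that field only."""
    for position in range(field_count):
        criteria = [WILDCARD] * field_count
        criteria[position] = "concrete"
        yield position, criteria


# 21 template fields against 20 interpreter inputs: every case is rejected.
rejected = [pos for pos, c in single_field_cases(21) if validate_instance(c, 20)]

# 20 template fields against 20 interpreter inputs: every case is accepted.
accepted = [pos for pos, c in single_field_cases(20) if not validate_instance(c, 20)]
```

Generating one case per field means no field can be covered only by wildcard tests, which is the gap that let the 21st input go unexamined.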

Repercussions in the Industry
The root cause analysis arrives amid significant criticism, including from Delta Air Lines, which is seeking damages from CrowdStrike and Microsoft over massive disruptions and the additional costs of thousands of canceled flights. Both CrowdStrike and Microsoft have responded that they are not to blame for the disruption and have suggested that Delta’s issues may run deeper than the faulty security update.

CrowdStrike’s transparency in publishing its root cause analysis and the corrective measures implemented underscore its commitment to security and reliability, while highlighting the ongoing challenges in managing security updates in complex and globally distributed systems.
