A minor CrowdStrike patch, released on Friday, caused widespread issues across Microsoft Windows systems, disrupting airports, healthcare facilities, and 911 call centers. Despite the company’s sophisticated DevOps pipeline, the faulty update slipped through, highlighting potential vulnerabilities in its release process.
Financial Impact and Company Response
The update’s fallout has been significant for CrowdStrike, with its stock price plunging from $345.10 on Thursday evening to $263.10 by Monday afternoon. Although the stock has since recovered slightly, the company has faced a severe blow to its reputation. CrowdStrike acknowledged the gravity of the situation in a statement and prioritized restoring affected systems while investigating the root cause of the issue.
Expert Insights on Software Deployment
Dan Rogers, CEO of LaunchDarkly, emphasized that the impact of software bugs can often be contained through careful deployment strategies, such as feature flags. These tools allow companies to control the rollout of new features and quickly disable problematic ones to prevent widespread issues. However, because the CrowdStrike problem affected the operating system kernel, it was more challenging to address than a typical web application issue. A more gradual deployment could have detected the problem earlier.
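To make the feature flag idea concrete, here is a minimal sketch of a flag with a percentage rollout and a kill switch. It does not reflect LaunchDarkly's product or API. The FeatureFlag class, the flag name "new-scanner", and the user ids are all hypothetical, and a real flag service would store this state remotely rather than in memory.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FeatureFlag:
    """Hypothetical in-memory flag, for illustration only."""
    name: str
    enabled: bool = True      # global kill switch
    rollout_percent: int = 0  # share of users who get the feature (0-100)

    def is_on_for(self, user_id: str) -> bool:
        # The kill switch overrides everything else.
        if not self.enabled:
            return False
        # Hash flag name + user id so each user lands in a stable bucket 0-99.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < self.rollout_percent


# Expose the (made-up) feature to roughly 10% of users.
flag = FeatureFlag(name="new-scanner", rollout_percent=10)
users = [f"user-{i}" for i in range(1000)]
exposed = [u for u in users if flag.is_on_for(u)]

# Something looks wrong: flip the kill switch and nobody gets the feature.
flag.enabled = False
exposed_after_kill = [u for u in users if flag.is_on_for(u)]
```

Hashing the user id, rather than picking users at random on each check, keeps every user in the same group from one request to the next, so raising rollout_percent only ever adds users.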
Causes and Prevention of Deployment Failures
According to Jyoti Bansal, CEO of Harness Labs, similar issues can arise at any software company, even those with solid release practices. Bansal highlighted that buggy code sometimes slips past thorough testing, especially in large engineering teams where varied practices and a lack of standardization increase the risk. Effective testing and adherence to a consistent DevOps pipeline are essential to prevent such issues.
Strategies to Minimize Risks
Both Rogers and Bansal recommend progressive rollouts and canary deployments to minimize the risk of deployment failures. Progressive rollouts involve releasing updates to a small subset of users first, allowing for early detection and rollback if issues arise. Canary deployments, named after the canaries once used to detect dangerous gases in coal mines, involve testing updates in a controlled environment before a broader release. These methods help identify problems before they affect a larger audience.
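A progressive rollout with a canary stage can be sketched as a loop over widening stages that stops and rolls back at the first sign of trouble. This is an illustration under stated assumptions, not CrowdStrike's or Harness's actual process. The function name, the stage percentages, and the deploy, is_healthy, and rollback callbacks are all hypothetical stand-ins for real deployment tooling.

```python
from typing import Callable, List, Sequence


def progressive_rollout(
    hosts: Sequence[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    stages: Sequence[int] = (1, 10, 50, 100),  # cumulative percent of hosts
) -> bool:
    """Deploy in widening stages; undo everything if any new host is unhealthy."""
    updated: List[str] = []
    for percent in stages:
        # Always include at least one host so the canary stage is never empty.
        target = max(1, len(hosts) * percent // 100)
        batch = hosts[len(updated):target]
        for host in batch:
            deploy(host)
            updated.append(host)
        if not all(is_healthy(h) for h in batch):
            for host in reversed(updated):
                rollback(host)
            return False
    return True


# Simulate a bad update across 200 made-up hosts.
hosts = [f"host-{i}" for i in range(200)]
installed = set()
touched = []


def fake_deploy(host: str) -> None:
    installed.add(host)
    touched.append(host)


ok = progressive_rollout(
    hosts,
    deploy=fake_deploy,
    is_healthy=lambda h: False,  # every updated host reports a failure
    rollback=installed.discard,
)
print(ok, len(touched))  # False 2
```

In this simulation the faulty update reaches only the 1% canary stage, two of the 200 hosts, before it is rolled back. The trade-off is speed: each stage needs time for health signals to arrive before the next one starts.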
Balancing Innovation and Security
Experts agree that while rigorous testing and deployment practices are crucial, it is also important not to overcomplicate or lengthen the testing process excessively. Such rigidity can stifle innovation and delay valuable software releases. A balanced approach that incorporates thoughtful automation and standardization without impeding development speed is key to managing risks effectively.