15,941 views
Old 19th July 2024, 15:55   #1
BHPian
 
Join Date: Sep 2019
Location: Bengaluru
Posts: 82
Thanked: 253 Times
Microsoft and Crowdstrike outage

Multiple (possibly related) failures involving Microsoft 365 and the CrowdStrike Falcon sensor have caused a major global IT outage, similar to the cyber attack in Live Free or Die Hard (except that here it was not intentional or malicious).

https://www.forbes.com/sites/kateofl...at-to-do-next/

Airports, airlines, banks and railways are reported to have suffered collateral damage. IndiGo and Changi airport desks are reported to have issued manual boarding passes to keep passengers moving.

It just shows that we have become so reliant on these systems and cloud apps that, if someone intended it, things could be brought to a standstill.
warp_10 is offline   (14) Thanks
Old 19th July 2024, 16:01   #2
BHPian
 
Join Date: Dec 2005
Location: Vijayawada-AP
Posts: 359
Thanked: 238 Times
Re: Microsoft and Crowdstrike outage

Please see the reply from CrowdStrike.
Attached Thumbnails
Microsoft and Crowdstrike outage-microsoft-outage-crowdstrike.jpg  

vvrchandra is offline   (8) Thanks
Old 19th July 2024, 16:04   #3
BHPian
 
Join Date: Sep 2019
Location: Bengaluru
Posts: 82
Thanked: 253 Times
Re: Microsoft and Crowdstrike outage

Quote:
Originally Posted by vvrchandra View Post
Please see the reply from CrowdStrike.
CrowdStrike will be under the gun on this for sure.

PS: They are down 15-20% in premarket on this. Short sellers must be making merry!
warp_10 is offline   (1) Thanks
Old 19th July 2024, 17:07   #4
BHPian
 
Join Date: Jun 2024
Location: Mumbai
Posts: 29
Thanked: 64 Times
Re: Microsoft and Crowdstrike outage

No matter how comfortable one is with using automated processes, everyone must have a plan B to continue work in case of unforeseen circumstances.

They should purposefully train people to work with manual systems, so that they are ready to face any challenging situation.
dgindia is offline  
Old 19th July 2024, 17:16   #5
Distinguished - BHPian
 
Join Date: Dec 2010
Location: --
Posts: 24,464
Thanked: 72,784 Times
Re: Microsoft and Crowdstrike outage

Tech disruptions worldwide have hit airlines, banks and businesses, which are scrambling to respond.

Major US carriers, including Delta, United and American Airlines, have had flights grounded, and airlines in Europe and the Asia-Pacific region have also seen disruptions. Banks in Australia, New Zealand, South Africa and Britain have been impacted, as have health services in Israel and the UK.

Multiple Asian airlines and airports have been affected by the global IT outage.
One of Europe's largest medical facilities affected by outage.

Link:

Last edited by volkman10 : 19th July 2024 at 17:17.
volkman10 is offline   (3) Thanks
Old 19th July 2024, 18:14   #6
BHPian
 
HillMan's Avatar
 
Join Date: Jul 2013
Location: Bangalore
Posts: 926
Thanked: 834 Times
Re: Microsoft and Crowdstrike outage

If someone is still facing the issue, here are some recommended workaround steps:

Option 1 - For Workstations
Reboot the workstation three times (through the BSOD) so that it starts working as expected

Option 2 - Workaround for Windows workstations
Boot Windows into Safe Mode or the Windows Recovery Environment
Navigate to the C:\Windows\System32\drivers\CrowdStrike directory
Locate the file matching “C-00000291*.sys”, and delete it.
Boot the host normally.
Note: BitLocker-encrypted hosts may require a recovery key. Please reach out to your local IT admins for the recovery key.

Option 3 - If users are experiencing a blue screen (BSOD)
With the workstation powered off, hold down the F5 key and power the workstation on
Hold the F5 key down until you see a message at the bottom of the screen saying it is diagnosing issues
Release the F5 key and let the workstation attempt repairs
When repairs are completed, it will reboot normally

Option 4:
Boot the workstation into Safe Mode and do a System Restore to the last known good configuration.
When the restore is complete, it will reboot normally
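
The file-matching step in Option 2 can be sketched in a few lines of Python. This is only an illustration of the wildcard match and delete; in practice the step is done by hand or from a recovery command prompt, where Python is usually not available. The function name and the dry_run flag are hypothetical, and only the directory and the "C-00000291*.sys" pattern come from the steps above.

```python
from pathlib import Path


def remove_channel_files(driver_dir, pattern="C-00000291*.sys", dry_run=True):
    """Find files in driver_dir that match pattern and optionally delete them.

    Returns the list of matching paths. With dry_run=True nothing is deleted,
    so the matches can be reviewed first.
    """
    directory = Path(driver_dir)
    # Only look at regular files directly inside the directory (no recursion).
    matches = sorted(p for p in directory.glob(pattern) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Directory taken from Option 2 above; the default is a dry run.
    target = r"C:\Windows\System32\drivers\CrowdStrike"
    for found in remove_channel_files(target):
        print("Would delete:", found)
```

Doing a dry run first is a deliberate choice: it shows exactly which files the wildcard picks up before anything is removed.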


Even though a solution has been found, the world woke up with a jolt today to realise how connected we all are and how easily someone can bring the entire infrastructure to its knees.

Last edited by HillMan : 19th July 2024 at 18:27.
HillMan is offline   (22) Thanks
Old 19th July 2024, 21:01   #7
BHPian
 
chiekennugget's Avatar
 
Join Date: Feb 2024
Location: Hyderabad
Posts: 158
Thanked: 799 Times
Re: Microsoft and Crowdstrike outage

https://www.motorsportweek.com/2024/...-hungarian-gp/

Even the Mercedes F1 team is facing the heat from the Windows outage.

Quote:
Several IT outages have wreaked havoc across the globe overnight with airlines, banks and shops all impacted alongside Mercedes’ current F1 weekend preparations.

CrowdStrike, which has supplied Mercedes since 2019, revealed that a “defect” with a content update has been behind the disruption that has become headline news.

Quote:
The disturbance is also thought to have hampered Mercedes’ three engine customers – Aston Martin, McLaren and Williams – but other teams haven’t been hampered.

Last edited by chiekennugget : 19th July 2024 at 21:05.
chiekennugget is offline   (2) Thanks
Old 19th July 2024, 21:16   #8
Senior - BHPian
 
amol4184's Avatar
 
Join Date: Oct 2007
Location: Seattle/Pune
Posts: 1,352
Thanked: 5,730 Times
Re: Microsoft and Crowdstrike outage

Quote:
Originally Posted by dgindia View Post

They should purposefully train people to work with manual systems, so that they are ready to face any challenging situation.
Unfortunately, there is nothing that can be done manually.
This is akin to trying to start a car without an engine in it.
amol4184 is offline   (4) Thanks
Old 20th July 2024, 06:46   #9
Senior - BHPian
 
Join Date: Sep 2019
Location: Pune
Posts: 2,635
Thanked: 8,055 Times
Re: Microsoft and Crowdstrike outage

It is confounding that there is such hyper-dependency on one software component, and that it is so tightly integrated with the Windows platform. It also raises the question of how much we suffer from a lack of choice in general computing platforms. Post-Covid, there was a lot of talk about reducing inter-dependencies. I think businesses need to seriously think about encouraging diverse IT platforms with different configurations for the sake of redundancy. Everything is a binary: MS vs Apple, Android vs iOS, Boeing vs Airbus!

PS: Crowdstrike literally lived up to its name

Last edited by fhdowntheline : 20th July 2024 at 06:48.
fhdowntheline is offline   (3) Thanks
Old 20th July 2024, 08:20   #10
Team-BHP Support
 
Samurai's Avatar
 
Join Date: Jan 2005
Location: Bangalore/Udupi
Posts: 25,976
Thanked: 47,754 Times
Re: Microsoft and Crowdstrike outage

Quote:
Originally Posted by fhdowntheline View Post
It is confounding that there is such hyper dependency on one software component and how tightly it is integrated with the Windows platform.
Windows is not dependent on it.

Windows PCs in my company don't have CrowdStrike installed on them. It is optional security software used by many companies. If you don't use it, there is no impact from this bug.
Samurai is offline   (21) Thanks
Old 20th July 2024, 10:48   #11
Team-BHP Support
 
Join Date: Feb 2004
Location: Bangalore
Posts: 15,075
Thanked: 29,816 Times
Re: Microsoft and Crowdstrike outage

My company had no issues, although some SaaS platforms did see some degradation in service.
ajmat is offline  
Old 20th July 2024, 11:00   #12
BHPian
 
vamsi.vadrevu's Avatar
 
Join Date: Aug 2020
Location: Hyderabad
Posts: 156
Thanked: 444 Times
Re: Microsoft and Crowdstrike outage

Looking at all this reaffirms my belief that business-critical systems should always be on Linux machines with multiple backups. I've read news about some pharmacies in the USA not being able to process prescription refills because their billing systems were impacted! It underlines the importance of a plan B and of business continuity plans. We regularly perform business continuity plan drills for the case of total system failure. We also account for 3rd party systems being out of action. I've noticed that some organizations do not even consider the fact that a 3rd party system on which they depend might go down.

Windows as a platform has really become bloated to the point that it is insufferably inefficient. With most applications now cloud-hosted, it is time organisations started to move off the Windows platform. There would be minimal impact on end users, because almost all apps are cloud-hosted and end users would just be using a browser to access them.

If the airline operations were to be considered as emergency services, they really should have had a backup plan for outages. It is expensive to maintain a second instance of your entire platform but that's the price to pay for zero downtime. Sadly I do not see any organisations doing this. I only know of a few sectors that actually take this seriously. Banking, insurance and defence domains tend to have robust alternative mechanisms in place.

Last edited by vamsi.vadrevu : 20th July 2024 at 11:05. Reason: Added a few more points
vamsi.vadrevu is offline   (2) Thanks
Old 20th July 2024, 11:40   #13
BHPian
 
shivshanker's Avatar
 
Join Date: Jul 2007
Location: Mumbai
Posts: 350
Thanked: 207 Times
Re: Microsoft and Crowdstrike outage

Quote:
Originally Posted by Samurai View Post
Windows is not dependent on it.

Windows PCs in my company don't have CrowdStrike installed on them. It is optional security software used by many companies. If you don't use it, there is no impact from this bug.
In my humble opinion, (I am old school, still doing Mainframes)

This is the fallout of the desire for "automation": integrating DevOps into a CI/CD pipeline with the ultimate aim of cutting costs.

An untested patch or module deployment is unheard of in my world.
shivshanker is offline   (1) Thanks
Old 20th July 2024, 11:50   #14
BHPian
 
Join Date: May 2008
Location: Bengaluru
Posts: 422
Thanked: 1,802 Times
Re: Microsoft and Crowdstrike outage

Failure to include a null check.
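
For anyone unfamiliar with the term: a null check simply means confirming that a reference actually points at something before using it. A minimal Python sketch of the idea follows, with None standing in for a null pointer. The rule table, keys and function names are made up for illustration and have nothing to do with CrowdStrike's actual driver code.

```python
def action_unchecked(rules, name):
    """Look up a rule and use it without checking that it exists."""
    rule = rules.get(name)   # returns None when the rule is missing
    return rule["action"]    # raises TypeError if rule is None


def action_checked(rules, name, default="ignore"):
    """Same lookup, but fall back to a safe default when the rule is missing."""
    rule = rules.get(name)
    if rule is None:         # the "null check"
        return default
    return rule.get("action", default)


if __name__ == "__main__":
    rules = {"block_pipe": {"action": "block"}}  # hypothetical rule table
    print(action_checked(rules, "block_pipe"))
    print(action_checked(rules, "missing_rule"))
```

In Python the unchecked version merely raises an exception that the caller can catch. In kernel-mode code, dereferencing an invalid pointer typically takes the whole machine down instead.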

[Attached image: BSOD.png]
AltoLXI is offline   (10) Thanks
Old 20th July 2024, 12:03   #15
BHPian
 
Join Date: Apr 2005
Location: Bangalore
Posts: 314
Thanked: 148 Times
Re: Microsoft and Crowdstrike outage

Let me share my thoughts:

CrowdStrike builds an Endpoint Detection & Response (EDR) product, and its EDR agents are deployed on all laptops, servers, etc. There are many other EDR companies; CrowdStrike is one of the popular vendors. The EDR agent runs in kernel mode, which is a very sensitive part of the OS. Bad programming in a kernel-mode device driver can lead to a BSOD (Blue Screen Of Death). The BSOD is well known in the context of Windows; it is not rare, but not that frequent either. All changes to a device driver MUST go through stringent testing.

Microsoft is not directly involved in this. That said, Microsoft should build methods to protect the OS from a BSOD and handle this scenario gracefully. This is long pending. Does that mean Linux is safe? No, it just doesn't show a Blue Screen Of Death! I guess Linux handles this more gracefully.
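
To give a rough feel for what "handling it gracefully" could mean, here is a loose analogy in Python using process isolation. It is not how Windows, Linux or any EDR agent actually works. It only shows that when risky code runs in a separate process, a crash there can be detected by the parent instead of taking the parent down with it. The function name run_isolated is hypothetical.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=10.0):
    """Run a Python snippet in a child process.

    Returns True if the child exited cleanly, False if it crashed or exited
    with a non-zero status. Raises subprocess.TimeoutExpired if the child
    does not finish within timeout seconds.
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        timeout=timeout,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # A well-behaved snippet, and one that aborts the child process outright.
    print("good snippet ok:", run_isolated("x = 1 + 1"))
    print("bad snippet ok:", run_isolated("import os; os.abort()"))
    print("parent process is still running")
```

A kernel-mode driver has no such safety net: it shares an address space with the rest of the kernel, which is why one bad memory access there stops the whole system.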

The biggest surprise (for me) is: how did this defect consistently escape pre-production validation across all these companies? This puts a question mark on the pre-production process at all of them. Given the scale at which this has spread, it is definitely not a corner case. If you look at the banking sector, not many have failed. I know many of them have stringent pre-production testing to catch such defects.
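
One common way to limit the blast radius of a bad update is a staged (canary) rollout: push to a small group first, check health, and widen only if all is well. Below is a minimal sketch of that idea in Python. It is my own illustration, not a description of any vendor's or bank's actual process; the function names, the stage fractions and the health check are all assumptions.

```python
import math


def staged_rollout(hosts, apply_update, is_healthy,
                   stage_fractions=(0.01, 0.10, 1.0)):
    """Push an update in widening waves and stop at the first unhealthy wave.

    hosts           -- ordered list of host names
    apply_update    -- callable that installs the update on one host
    is_healthy      -- callable returning True if a host is fine after the update
    stage_fractions -- cumulative share of the fleet covered after each wave

    Returns (hosts that received the update, True if the rollout completed).
    """
    updated = []
    for fraction in stage_fractions:
        # Cumulative number of hosts to cover by the end of this wave.
        target = max(1, math.ceil(len(hosts) * fraction))
        wave = hosts[len(updated):target]
        for host in wave:
            apply_update(host)
            updated.append(host)
        # Halt before widening if any host in this wave fails its check.
        if not all(is_healthy(host) for host in wave):
            return updated, False
    return updated, True


if __name__ == "__main__":
    fleet = ["host%d" % i for i in range(100)]
    # Simulate an update that breaks every host it touches.
    touched, completed = staged_rollout(fleet, lambda h: None, lambda h: False)
    print(len(touched), "hosts updated; completed:", completed)
```

With a faulty update and these fractions, only the first wave is affected and the remaining hosts never receive it.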

Last edited by Selective : 20th July 2024 at 12:12.
Selective is offline   (6) Thanks
Copyright ©2000 - 2024, Team-BHP.com
Proudly powered by E2E Networks