Microsoft has released a fix for a harebrained Exchange Server bug that shut down on-premises mail delivery around the world just as clocks were chiming in the new year.
The mass disruption stemmed from a date check failure in Exchange Server 2016 and 2019 that made it impossible for servers to accommodate the year 2022, prompting some to call it the Y2K22 bug. The mail programs stored dates and times as signed 32-bit integers, which max out at 2147483647, or 2^31 - 1. Microsoft uses the first two digits of an update version to denote the year it was released. As long as the year was 2021 or earlier, the version number stayed below that ceiling and everything worked fine.
“What in the absolute hell Microsoft?”
When Microsoft released version 2201010001 on New Year’s Eve, however, on-premises servers crashed because they were unable to interpret the date. Consequently, messages got stuck in transport queues. Admins around the world were left frantically trying to troubleshoot instead of ringing in the New Year with friends and family. All they had to go on were two cryptic log messages that looked like this:
Log Name: Application
Source: FIPFS
Logged: 1/1/2022 1:03:42 AM
Event ID: 5300
Level: Error
Computer: server1.contoso.com
Description: The FIP-FS "Microsoft" Scan Engine failed to load. PID: 23092, Error Code: 0x80004005. Error Description: Can't convert "2201010001" to long.

Log Name: Application
Source: FIPFS
Logged: 1/1/2022 11:47:16 AM
Event ID: 1106
Level: Error
Computer: server1.contoso.com
Description: The FIP-FS Scan Process failed initialization. Error: 0x80004005. Error Details: Unspecified error.
“What in the absolute hell Microsoft!?” one admin wrote in this Reddit thread, which was one of the first forums to report the mass failure. “On New Year’s Eve!? First place I check is Reddit and you guys save my life before we even get an engineer on the phone.”
The next day, Microsoft released a fix. It comes in two forms: an automated PowerShell script and a manual procedure for cases where the script doesn’t work properly, as some admins reported. In either case, the fix must be applied to every on-premises Exchange 2016 and Exchange 2019 server inside an affected organization. The automated script can run on multiple servers in parallel. The software maker said the automated script “might take some time to run” and urged admins to be patient.
The date and time check was performed when Exchange checked the version of the FIP-FS, a scanning engine that’s part of Exchange antimalware protections. Once FIP-FS versions began with the numbers 22, the check was unable to complete, and mail delivery was abruptly halted. The fix stops the Microsoft Filtering Management and Microsoft Exchange Transport services, deletes current AV engine files, and installs and starts a patched AV engine.
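To see why a version number beginning with 22 trips the check, here is a minimal Python sketch of the arithmetic. It is an illustration only, not Microsoft's code: the function names are hypothetical, the range check stands in for whatever conversion FIP-FS actually performs, and the 2021 version string is a made-up example.

```python
# Largest value a signed 32-bit integer can hold: 2147483647.
INT32_MAX = 2**31 - 1
INT32_MIN = -2**31


def parse_version_as_int32(version: str) -> int:
    """Hypothetical stand-in for the failing conversion (an assumption,
    not the real FIP-FS routine): reject values outside the int32 range."""
    value = int(version)
    if not INT32_MIN <= value <= INT32_MAX:
        # Wording borrowed from the event log entry quoted in the article.
        raise OverflowError(f'Can\'t convert "{version}" to long.')
    return value


def fits(version: str) -> bool:
    """Return True if the version string survives the int32 conversion."""
    try:
        parse_version_as_int32(version)
        return True
    except OverflowError:
        return False


# A made-up late-2021 version: 2,112,310,001 is below the ceiling.
print(fits("2112310001"))
# The version released on New Year's Eve: 2,201,010,001 is above it.
print(fits("2201010001"))
```

Any 10-digit version starting with 21 lands below 2147483647 as long as the remaining digits stay under 47483647, which every real month-and-day combination does. Every version starting with 22 lands above it.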
By Monday, things were getting back to normal for many affected organizations. It’s not clear how long the buggy date storage had been in place, but judging from the two affected versions, it was possibly introduced when Exchange Server 2016 was under development.
Author: Dan Goodin