Microsoft has issued a detailed mea culpa about the lengthy and ill-timed outage that affected its webmail services on Tuesday and Wednesday, an incident that undermines the company's push for its new Outlook.com as a better alternative to Gmail and Yahoo Mail.
"We do want to sincerely apologize to anyone that was unable to access their email during the interruption. Outages are something we take very seriously and invest a significant amount of our time and energy in doing our best to prevent," wrote Arthur de Haan, a Microsoft vice president, in a blog post.
While outages affect all providers of consumer online services at one point or another, this one was unusually long -- from Tuesday afternoon until Wednesday morning -- and came while Microsoft is fanning the marketing flames for Outlook.com, the Hotmail replacement that the company is touting as a reinvention of webmail services in general.
Microsoft announced Outlook.com with much fanfare in July 2012, launching it in preview mode and saying it represented a re-imagining of webmail from the data center to the user experience.
Last month, Microsoft removed the "preview" tag from Outlook.com, opening it up to all comers and setting in motion the upgrade process for Hotmail users, which should be completed by the summer.
At the same time, Microsoft has been waging an in-your-face, months-long public perception war on Gmail via its Scroogled campaign, in which Microsoft argues that Gmail disrespects its users' privacy by matching ads to the text of their messages.
The company positions Outlook.com as a more privacy-friendly alternative to Gmail, while touting its improvements over Hotmail, including a redesigned user interface, broad syncing capabilities, improved message sorting and native integration with Facebook, Twitter, LinkedIn, Google and other sites.
De Haan explained in his blog post that the Hotmail and Outlook.com outage began shortly after 4:30 p.m. U.S. Eastern Time on Tuesday, and that the problems were fully resolved at around 8:45 a.m. on Wednesday, roughly 16 hours later. The SkyDrive cloud storage and file sharing service was also down for several hours.
The problem arose after Microsoft updated the firmware on equipment in one of its data centers, a routine process that somehow went very wrong, causing "a rapid and substantial temperature spike" in the facility.
"This spike was significant enough before it was mitigated that it caused our safeguards to come in to place for a large number of servers in this part of the datacenter," he wrote.
As a result, access to mailboxes on these servers was shut down, and the failover process was blocked.
"Based on the failure scenario, there was a mix of infrastructure software and human intervention that was needed to bring the core infrastructure back online. Requiring this kind of human intervention is not the norm for our services and added significant time to the restoration," he wrote.