This all raises the question: Can you build applications that span multiple regions, failing over from Virginia to California if necessary?
Reuven Cohen, founder and CTO of Enomaly, a cloud software provider, goes even further. Customers should build applications to run simultaneously across multiple cloud platforms from different vendors, he said.
The fact that major websites "known to be running across multiple availability zones are down" is a sign that the zones aren't foolproof.
"Things go down. It's the nature of the Internet itself," Cohen said. "There's this idea that because you're Amazon you can achieve 100% uptime, and that's the wrong way to look at it."
If Amazon can go down, anyone can. Even Google has had problems with Gmail.
"Vendors may provide redundancy ... but it doesn't address the problem that what if the overall access to that vendor goes down," Cohen said.
Customers should contract with "multiple providers with multiple locations" to survive problems caused by a single vendor, he said.
But is that realistic? Reeves says no, at least for most customers. Cloud computing is supposed to simplify deployment and management of applications. Building an application to work across multiple vendors requires a lot of extra work.
"The reason we can't architect applications across multiple cloud providers is the lack of standards and interoperability," Reeves said. "If you're an application builder and you want to increase your capacity for storage or compute, how you allocate, charge and use that capacity is different for every provider. It's not that it can't be done, it's just very, very difficult."
The simpler idea of sticking just with Amazon and balancing applications across multiple regions isn't so simple either. Amazon doesn't provide the necessary tools to load-balance between regions, so customers have to use additional software on top of their Amazon instances, Reeves says. Amazon's load-balancing service works across availability zones -- the same ones that failed Thursday -- but not across regions.
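As a rough illustration of the "additional software" customers would need, the following minimal Python sketch picks the first healthy region from an ordered list and falls back to the next one when a health check fails. The region names, URLs, and the `pick_region` helper are hypothetical, invented for this example; they are not Amazon APIs, and a real deployment would also have to handle data replication and DNS or traffic routing.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical endpoints, listed in order of preference.
# These names and URLs are illustrative assumptions, not real services.
REGION_ENDPOINTS = [
    ("us-east", "https://app.us-east.example.com"),
    ("us-west", "https://app.us-west.example.com"),
]


def pick_region(
    endpoints: Sequence[Tuple[str, str]],
    is_healthy: Callable[[str], bool],
) -> Optional[Tuple[str, str]]:
    """Return the first (name, url) pair whose health check passes.

    A health check that raises is treated the same as an unhealthy
    region. Returns None when every region fails.
    """
    for name, url in endpoints:
        try:
            if is_healthy(url):
                return name, url
        except Exception:
            # An unreachable region counts as down; try the next one.
            continue
    return None


if __name__ == "__main__":
    # Simulate an outage in the preferred region with a fake checker.
    def fake_check(url: str) -> bool:
        return "us-east" not in url

    print(pick_region(REGION_ENDPOINTS, fake_check))
```

The health check is passed in as a function so the selection logic can be exercised without network access; in practice it might be an HTTP request with a short timeout.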
Any time there is a cloud outage, some will call all of cloud computing into question. That shouldn't be the case, Reeves said, noting "everybody has downtime." The difference with cloud computing is that risk is aggregated -- many companies run their sites on one platform, and when that platform goes down it's a lot more noticeable than when a single business's internal data center fails.
While a single cloud failure shouldn't be seen as an indictment of all cloud computing, Reeves says it does add a new wrinkle to the economic analysis that must be done before enterprises move services to the cloud. If companies run major businesses on top of Amazon, and suffer millions of dollars in lost revenue when there are outages, was the money saved by not building IT services internally worth the risk? Can customers buy insurance to recover lost dollars?
Service-level agreements may provide payments or credits, but if an outage "costs people tens of millions of dollars [in lost revenue] Amazon's not going to pay that back," Reeves said.