Looking at the second wave's traffic, we realized there were a few immediate mitigations that we could easily implement:
* This particular organization only serves U.S. customers, yet a lot of the traffic was coming from outside the country. We quickly implemented some firewall rules, battle-tested in our work with federal agencies, that would admit only U.S.-based traffic. This immediately stopped 40% of the traffic at the front door.
* We inserted a web application firewall behind our AWS Route 53 configuration and scaled up some HA Proxy servers, which would gather a lot of logging information for the FBI, who had by now become our new best friends, to analyze after the fact.
* We intentionally broke our auto scaling configuration. Auto scaling has triggers for scale-up and scale-down. We changed the scale-down trigger to make it much higher than the scale-up trigger. What that meant was the system would scale up properly as more traffic came in, but would never hit the scale-down threshold. As a result, every instance that was launched would stay in service permanently, leaving its logging information intact for harvesting by the FBI.
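To make the effect of the auto scaling change concrete, here is a minimal sketch in Python. It is a toy simulation, not the AWS configuration we used: the class and function names, the load figures, and the thresholds are all hypothetical, and the unreachable scale-down trigger is modeled simply as "no scale-down rule" rather than with the actual trigger values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScalingPolicy:
    # Hypothetical policy: thresholds are "load per instance" in arbitrary units.
    scale_up_at: float              # add an instance when load per instance exceeds this
    scale_down_at: Optional[float]  # remove one when it drops below this; None = never


def simulate(policy: ScalingPolicy, loads: List[float], start: int = 2,
             min_instances: int = 1) -> List[int]:
    """Return the fleet size after each load sample."""
    count = start
    history = []
    for load in loads:
        per_instance = load / count
        if per_instance > policy.scale_up_at:
            count += 1
        elif (policy.scale_down_at is not None
              and per_instance < policy.scale_down_at
              and count > min_instances):
            count -= 1
        history.append(count)
    return history


if __name__ == "__main__":
    loads = [100, 300, 500, 500, 200, 50, 50]  # made-up traffic samples
    normal = ScalingPolicy(scale_up_at=100, scale_down_at=40)
    pinned = ScalingPolicy(scale_up_at=100, scale_down_at=None)
    print(simulate(normal, loads))  # [2, 3, 4, 5, 5, 4, 3]
    print(simulate(pinned, loads))  # [2, 3, 4, 5, 5, 5, 5]
```

With the normal policy the fleet shrinks once traffic falls off. With the scale-down rule out of reach, every instance launched during the peak stays up, which is what kept the logs on those instances available.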
It was now 1:00 a.m. We put our game faces on. The arms race had begun.
DDoS Day Two
Our attackers scaled up. Amazon Web Services scaled up. Our attackers scaled up some more. AWS scaled up some more. This continued into Day Two. At this point we were providing hourly updates to our customer's board of directors.
At the height of the DDoS attack, we had 18 very large, compute-intensive HA Proxy servers deployed and almost 40 large web servers. The web server farm was so large because, even though we had excluded the non-U.S. traffic component, representing 40% of the overall load, the remaining 60% consisted of legitimate URLs originating from within the United States, most of which were accessing dynamic services that could not easily be cached. Traffic was hitting an extremely large, globally-distributed infrastructure. Our highly-scaled web server farm was deployed behind a very substantial HA Proxy firewall/load-balancer configuration. This in turn sat behind CloudFront, AWS' globally-distributed content delivery network, which itself was deployed behind Route 53, AWS' globally-redundant DNS platform. This was an infrastructure of very significant dimensions, scalable and secure at every tier.
At around 7 p.m. that evening, something fantastic happened. We scaled up... but the bad guys didn't scale up anymore. At this point we were sustaining 86 million concurrent connections from more than 100,000 hosts around the world. We measured the traffic, and were shocked to see that we were handling 20 gigabits per second of sustained traffic through the AWS infrastructure. This equates to 40 times the industry median as observed in DDoS attacks in 2014, according to Arbor Networks. We continued to serve the website at a response time of about 1-3 seconds per page.
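For a sense of scale, the figures above can be worked through in a few lines of Python. This is back-of-the-envelope arithmetic only: the median is the value implied by the "40 times" comparison rather than a number quoted from Arbor Networks, and the per-host figure is an upper bound because the host count is given as "more than 100,000". The variable names are illustrative.

```python
# Figures as stated in the text.
sustained_gbps = 20             # sustained traffic through the AWS infrastructure
multiple_of_median = 40         # "40 times the industry median"
concurrent_connections = 86_000_000
attacking_hosts = 100_000       # lower bound ("more than 100,000")

# Median attack size implied by the comparison.
implied_median_gbps = sustained_gbps / multiple_of_median

# Average concurrent connections per host (upper bound, since hosts is a lower bound).
connections_per_host = concurrent_connections / attacking_hosts

print(f"Implied median attack: {implied_median_gbps} Gbps")        # 0.5 Gbps
print(f"Connections per host:  at most ~{connections_per_host:.0f}")  # ~860
```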