When it comes to keeping its thousands of servers running smoothly, Facebook relies on the open source Chef configuration manager, modified slightly to handle the scale of the social networking giant's infrastructure.
"Chef's biggest advantage for us is its flexibility," said Phil Dibowitz, a Facebook systems engineer.
Chef is one of a number of open source configuration management tools that have grown in popularity over the last few years. As data centers grow ever larger, companies look to automate routine operations around deploying and upgrading servers, switches, OSes, databases and other components. Facebook's experience in managing its infrastructure may hold lessons for other organizations as well.
Facebook wanted "a new way to manage its systems," Dibowitz said. Although Facebook does not reveal the total number of servers it runs, industry observers estimate the figure to be at least in the tens of thousands. Surprisingly, the company has only a handful of employees -- four at last count -- on the core infrastructure team, of which Dibowitz is a member.
Prior to using Chef, Facebook had been using another open source configuration management package, called CFEngine. The deployment was growing increasingly unwieldy, however. Using CFEngine version 2, Facebook was experiencing a rapid and unchecked proliferation of system control files.
With CFEngine, users could not edit a configuration management file directly. Instead, each time operations engineers needed to make a change to a system, they would copy a similar system file, make the necessary changes, and then submit the file back to CFEngine. "In doing so, they added a couple hundred lines of stale configuration," Dibowitz said.
As a result, the infrastructure team did not know all the various configuration permutations it had on hand, or even which of the configuration settings were outdated. "It became really unsupportable," Dibowitz said.
To find a new open source configuration management system, the team ran a number of tests comparing Chef, Puppet, and Spine, which was developed by Ticketmaster.
Chef best fit the bill for a number of reasons, Dibowitz explained.
Chef offers great flexibility in how configuration changes are written, thanks in part to being based on Ruby, a full-fledged programming language that can be easy for administrators and engineers to learn.
"There's no limiting factor. You don't have to be in the Domain Specific Language [DSL] that CFEngine or Puppet gives you," Dibowitz said. "From there, we had a lot of power to do what we wanted."
Chef offered a number of other advantages as well. It could manage settings at a much more granular level. It also offered more flexibility in how to manage the configuration files themselves, Dibowitz said.