Data centre migrations can be painful, and a transnational migration (let alone several) to comply with data sovereignty regulations is an almost unthinkable nightmare. But for Jean-Manuel Becker, the director of IT services at Melbourne-based educational services company Pearson Research and Assessment, it's becoming a routine part of his job.
Pearson delivers NAPLAN testing to primary and high school students in all Australian states bar Queensland. But the Melbourne-based company also delivers testing services internationally, in Asia and a number of Middle Eastern countries, including the United Arab Emirates and Oman.
These countries frequently have regulations on the storage and processing of examination data, which often mean the data can't leave the country, ruling out any cloud-based processing of results.
Pearson creates exam papers with barcodes that identify which student the paper is intended for. Answers are scanned in and processed with a combination of OMR (optical mark recognition — which recognises answers for 'bubble'-style multiple choice questions) and clipping of images. The clippings are fed into a Web-based environment that lets teachers mark the exams. After exams are marked the results are compiled and sent to educational authorities.
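To make the OMR step concrete, here is a minimal Python sketch of how a single 'bubble' question might be read once the scanner has measured how dark each bubble is. It is an illustration only, not Pearson's implementation: the function name, the fill fractions and both thresholds are assumptions chosen for the example.

```python
from typing import Mapping, Optional


def read_bubble_answer(
    fill: Mapping[str, float],
    min_fill: float = 0.5,   # assumed: a bubble counts as marked above 50% dark
    min_gap: float = 0.2,    # assumed: the winner must clearly beat the runner-up
) -> Optional[str]:
    """Return the marked option label, or None if the question is blank or ambiguous.

    `fill` maps each option label (e.g. "A") to the fraction of dark pixels
    measured inside that bubble, between 0.0 and 1.0.
    """
    ranked = sorted(fill.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None

    best_label, best_fill = ranked[0]
    if best_fill < min_fill:
        return None  # nothing dark enough: treat as unanswered

    if len(ranked) > 1 and best_fill - ranked[1][1] < min_gap:
        return None  # two bubbles look marked: leave for a human to review

    return best_label
```

In a sketch like this, a question that comes back as None would be a natural candidate for the image-clipping route, where a person looks at the scanned answer instead.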
At the height of NAPLAN, which is held for students in grades 3, 5, 7 and 9, the company processes about 3.1 million exams in the space of three weeks. The online marking environment is used by some 1100 Australian teachers.
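For a rough sense of scale, the figures above can be turned into average processing rates. The Python sketch below assumes, purely for illustration, that the 3.1 million exams are spread evenly over 21 calendar days of round-the-clock processing; real workloads would be burstier, and the function name is hypothetical.

```python
def average_rates(exams: int, days: int) -> dict[str, float]:
    """Average exams per day, hour and second, assuming an even 24-hour spread."""
    per_day = exams / days
    per_hour = per_day / 24
    per_second = per_hour / 3600
    return {"per_day": per_day, "per_hour": per_hour, "per_second": per_second}


# Figures from the article: about 3.1 million exams in three weeks.
rates = average_rates(exams=3_100_000, days=3 * 7)
for name, value in rates.items():
    print(f"{name}: {value:,.2f}")
```

Under those assumptions the average works out to roughly 148,000 exams a day, or a little under two per second.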
In all of the countries that Pearson operates in, it's a very seasonal business, Becker said. "It will happen for two months during the exam or even one month and then nothing will happen there," he said.
In addition to dealing with a significant number of users and IO-intensive batch processing, the nature of the business means that it has to deal with sensitive personal data. In some countries, Becker said, this means exam data can't be stored outside the country, or even, in some cases, outside the ministry of education.
Becker's solution: If the data can't go to Pearson's data centre, Pearson's data centre will go to the data.
"What I've got in Melbourne to provide the national test environment [for NAPLAN] is a standard SAN system with virtualization, VMware and all that," Becker said.
"But what I had to develop for the Middle East project is a portable data centre."
Becker's data-centre-in-a-box is a single 2U unit running an array of virtual machines to provide Oracle 11g and other databases, a firewall and load balancers, 20 virtual Linux boxes running the Apache Web server and JBoss to deliver Pearson's online marking application, and some .NET applications running on Windows virtual machines that process exam paper images delivered by high-speed scanners.