Big data the NASA way

Sarah Putt | Oct. 31, 2012
When the Curiosity rover arrived on Mars two months ago, it was just about the best public relations exercise that NASA could have hoped for, short of actually landing a human on the red planet.

"Most of the open source work I do is through Apache, a lot of it has to do with the Apache licence being a very permissive licence," Mattman says. "It allows people downstream that leverage Apache based software to use that upstream open source component in arbitrary ways. It makes it so the software I build -- when we distribute it to customers, or others we collaborate with, we don't have to give them any surprises."

Mattmann says NASA has been an active user of open source software for around 15 years but only recently has it become active on the production side. For the past two years NASA has held open source summits, outlining its contribution to open source.

NASA categorises its data into different levels, and in the next generation earth science system satellite area where Mattmann works, it is publicly distributed via DAACs (Distributed Active Archive Centres). He says the programs and tools used to process data vary depending on the preferences of the scientists involved in the project. "A lot of times the software itself is coupled to the instrument."

Level zero data is raw data that comes off the instrument, and level one data is data that has started to be calibrated from raw voltages.

Mattmann says that the public can have access, through the DAACs, to level two data. This is data that is calibrated, geospatially identified and mapped to a physical model (measurements that can be mapped in space and time).

"It's so voluminous, because it's raw measurements in space and time from an instrument. You probably won't use that in your IT organisations, it might be too big for you," he says.

It's when you get to level three data, which is typically mapped or gridded information, that the user can really "crank on it" because the files are a lot smaller and more manageable, says Mattmann. This information is often used in discussions about temperature and climate change.
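To make the step from level two to level three concrete, here is a minimal sketch in Python of how scattered point measurements might be averaged into grid cells. It is not NASA's actual processing code: the sample records, the function name grid_measurements, the one-degree cell size and the choice of a simple mean are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical level-two records: (latitude, longitude, temperature in kelvin).
# These values are invented for illustration, not real instrument data.
LEVEL_TWO = [
    (10.2, 20.1, 290.0),
    (10.7, 20.9, 292.0),
    (11.3, 20.4, 288.0),
    (11.8, 21.6, 286.0),
    (10.1, 21.2, 291.0),
]


def grid_measurements(records, cell_degrees=1.0):
    """Average point measurements into latitude/longitude grid cells.

    The cell size and the use of a plain mean are assumptions; a real
    pipeline might weight, filter or interpolate instead.
    """
    cells = defaultdict(list)
    for lat, lon, value in records:
        # Floor division assigns each point to the cell that contains it.
        key = (int(lat // cell_degrees), int(lon // cell_degrees))
        cells[key].append(value)
    # One averaged value per occupied cell replaces many raw points.
    return {key: mean(values) for key, values in cells.items()}


level_three = grid_measurements(LEVEL_TWO)
coarse = grid_measurements(LEVEL_TWO, cell_degrees=2.0)
```

Widening the cells shrinks the output further, but the grid size and the averaging rule are decisions the end user did not make, which is one way scientific assumptions end up baked into higher-level products.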

"With each level of processing there are more assumptions that are codified into the data. More scientific assumptions that you didn't necessarily make," Mattmann points out.

His enthusiasm for big data projects is contagious, but when asked how he came to have a career as a NASA computer scientist, he says it's a "lame story".

He grew up playing video games, but it wasn't until his last year at high school, when he used Adobe Illustrator while working on the student yearbook, that he decided he needed to understand more about computers. So he followed some of his friends into the computer science department at college.

"I have no secret to being successful at computer science other than hard work and then sticking my head in the books and deciding I was going to do well," Mattmann says.
