President Obama targets US$200 million for big data boost

Michael Cooney | March 30, 2012
The U.S. government is the poster child for big data, and today President Obama is set to announce a $200 million research program to bolster the tools and techniques needed to access, organize, and glean discoveries from huge volumes of digital data.

• Department of Defense (DoD) - The Defense Advanced Research Projects Agency (DARPA) is beginning the XDATA program, which intends to invest approximately $25 million annually for four years to develop computational techniques and software tools for analyzing large volumes of semi-structured and unstructured data. Central challenges to be addressed include developing scalable algorithms for processing imperfect data in distributed data stores and creating effective human-computer interaction tools that facilitate rapidly customizable visual reasoning for diverse missions. The XDATA program will support open source software toolkits that enable flexible software development, so that users can process large volumes of data in timelines commensurate with the mission workflows of targeted defense applications.

• National Institutes of Health - The National Institutes of Health is announcing that the world's largest set of data on human genetic variation - produced by the international 1000 Genomes Project - is now freely available on the Amazon Web Services (AWS) cloud. At 200 terabytes - the equivalent of 16 million file cabinets filled with text, or more than 30,000 standard DVDs - the current 1000 Genomes Project data set is a prime example of big data, where data sets become so massive that few researchers have the computing power to make best use of them. AWS is storing the 1000 Genomes Project data as a publicly available data set for free, and researchers will pay only for the computing services that they use.

• Department of Energy - Scientific Discovery Through Advanced Computing: The Department of Energy will provide $25 million in funding to establish the Scalable Data Management, Analysis and Visualization (SDAV) Institute. Led by the Energy Department's Lawrence Berkeley National Laboratory, the SDAV Institute will bring together the expertise of six national laboratories and seven universities to develop new tools that help scientists manage and visualize data on the department's supercomputers. These tools are intended to further streamline the processes that lead to discoveries made by scientists using the department's research facilities. The need for them has grown as the simulations running on the department's supercomputers have increased in size and complexity.

• U.S. Geological Survey - USGS is announcing the latest awardees for grants it issues through its John Wesley Powell Center for Analysis and Synthesis. The center catalyzes innovative thinking in Earth system science by providing scientists a place and time for in-depth analysis, state-of-the-art computing capabilities, and collaborative tools invaluable for making sense of huge data sets. These Big Data projects will improve understanding of issues such as species response to climate change, earthquake recurrence rates, and the next generation of ecological indicators.


