Look at the layered disclosures from Edward Snowden, who it turns out is a finalist for the top humanitarian award in the European Union. Then look at America's mass shootings, most of which could have been avoided if the shooter's characteristics had identified him as a threat well in advance.
These unrelated events provide a template for what's at the core of the problem with big data: We focus on data, not analytics and results. When we do this, we build a solution backwards. The end result is more damaging than it is helpful.
Focus On Big Data, You May Get Little Results
A few months back, Harper Reed, the CTO of President Barack Obama's reelection campaign, spoke at an EMC event and argued that you have to start with data analytics and know what you're going to do with the information before you collect it.
Less is better, especially at first, Reed says. If you can't do much with a little data, then massively increasing the amount of data only makes things more complex and moves you away from your goal. (Interestingly, Reed's thesis is that Obama's election success was largely the result of doing the exact opposite of what his administration has done with personal information.)
We focus so much on data capture because we're fascinated with storage — specifically, the desire to capture as much information as we can. As a result, the problem we've focused on solving over the last two decades is storing, backing up and (once there's a catastrophe) restoring large amounts of data.
The Government Has a Big Data Problem
The government, for example, collects birth records, arrest records, education records, service records, health records, vehicle records, employment records and tax records (employment, property, sales and so on). In fact, federal, state and local governments likely know more about what you do than you do.
The problem, as we saw during and after 9/11, is that no one can analyze much of this information. The data sits in disparate systems that don't talk to or share with each other.
The 9/11 Commission Report clearly stated that this inability of systems to work together, either to identify the threat or to respond to it, was the core of the catastrophe. The collective "government" had enough data to foresee the attack long before it happened. The government had enough resources to stop the planes once the attack was identified, too.
In both cases, we couldn't translate the data into action. We were data rich but information poor.
NSA: We Can't Analyze Some Data, So We'll Just Collect All Data
Rather than try to fix this lack of organizational cooperation - which is admittedly a nasty problem of turf, authority and collaborative funding, as in who pays to get these databases to talk to each other - the National Security Agency instead embarked on a campaign to capture massive amounts of personal information.