Big data: There's signal in that noise

Andrew C. Oliver | Sept. 1, 2014
When the data you capture and crunch is large and disorderly, interesting bits may come along for the ride. Don't squeeze the life out of it -- explore it.

Fearmongers warn that capturing data means capturing noise, and that well-tended gardens are the only way to manage data. Well, guess what? Sometimes noise is the point.

Read any article by any not-very-technical journalist parroting those who sell fear — or any comment on one of my posts from a scared cube dweller who was hoping to ride PL/SQL to retirement — and you'll hear dire warnings that capturing all this data before you can interpret it will spell doom and disaster. Caution! Much of the data is noise! Noise is bad and you risk terrible error!

With any new approach you take risks. What if the noise turns out to be more valuable than the data you're trying to capture?

That's exactly what happened recently with Jawbone's Up, a popular activity-tracking wristband. Last year, Jawbone hired a vice president of data, Monica Rogati, to start mining the gobs of data accumulated so far. After the earthquake in Napa last weekend, Jawbone published a graph showing a large percentage of Up wearers awaking, with their wake time and the percentage of those awoken directly related to their proximity to the quake. Those in San Francisco, for example, woke up slightly later and in fewer numbers than those in Napa.

Not long ago the Wall Street Journal noted that Twitter found quakes faster than seismometers — so imagine how quickly Up might work in detecting disasters compared to Twitter. There are obviously limitations and problems. (Heck, I still wish Jawbone could tell me how much more reliable the Up24 is compared to the Up Gen 2, but it refuses to say.) But imagine the potential!

Consider wildlife tracking. For years people have been radio tagging catch-and-release animals. Wouldn't they run from the epicenter? Perhaps that noise is exactly what we need. I realize they might not be tagged in sufficient numbers or have sufficient range to make that feasible, but it's a thought.

We've seen other instances of noise being more interesting than the data. Recently, GPS stations set up to watch for plate movement indicated that the West Coast is rising. Why? Because all those people moved into the desert and planted palm trees and started drinking more water than could be piped in. Meanwhile, everyone was putting more carbon in the skies, which warmed the air and depleted the water further (not a shocker that there's a drought). The GPS readings indicate how water weighs down the land -- and the degree to which land rises when the water leaves. Now we have a new measure of true water depletion across the land.

Pure science revels in these side-effect numbers — and curious scientists or people paid to rationalize the data figure out why. This "noise" ... what does it mean? Is it significant?

 
