Big data collection makes it hard for you to remain anonymous

Taylor Armerding | March 31, 2015
Effective techniques exist to "de-identify" personal information in big data collection. But what really matters is how often they are applied. And most experts say that's not very often.

Heidi Wachs, special counsel in the Privacy and Information Governance Practice of Jenner & Block, agreed. "I think the word 'anonymous' gets thrown around a lot without a true understanding of how information is collected and shared," she said. "So much of what we do every day online can be traced back to an IP address or a device ID. Even when our names aren't being collected in conjunction with online activity, there is often some form of identifier that uniquely identifies us."
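Wachs's point, that an identifier can uniquely pick us out even when no name is collected, is the difference between pseudonymization and true anonymization. A minimal Python sketch (the salt and IP values are illustrative, not from any real system) shows why: hashing a device ID or IP address hides the raw value, but the same input always yields the same token, so a user's activity can still be linked together.

```python
import hashlib

def pseudonymize(identifier: str, salt: str = "site-salt") -> str:
    """Replace a raw identifier (IP address, device ID) with a hashed token.

    This is pseudonymization, not anonymization: the mapping is stable,
    so every visit from the same device produces the same token and the
    user's activity remains uniquely linkable, name or no name.
    """
    return hashlib.sha256((salt + identifier).encode()).hexdigest()[:16]

# Two visits from the same (hypothetical) device yield the same token:
visit1 = pseudonymize("203.0.113.7")
visit2 = pseudonymize("203.0.113.7")
assert visit1 == visit2  # still uniquely identifies the user across visits
```

The salt keeps outsiders from trivially reversing common values, but it does nothing to stop the collector itself from correlating a user's behavior over time, which is exactly the gap Wachs describes.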

Indeed, data collectors don't need names to treat individuals differently. In 2012, the travel website Orbitz generated headlines about pitching higher-priced hotel rooms to users of Macintosh computers, since the company's data collection showed them to be wealthier or willing to pay a premium.

And plenty of data collection doesn't come with even an implied promise of anonymity. Examples include highway toll collection readers and ubiquitous security cameras. Ortega notes that there is "face recognition, video that is taken without you knowing, and exercise tracker data about where you run or where you go to work out and when, etc. With the Internet of Things, there will be more data like this collected about us."

O'Neil notes that social media sites hold "the most precious datasets" for marketers, and that those datasets may not be rigorously protected. "Are those companies in the advertising initiatives following the same security best practices as others?" he asked. "Meanwhile, your personal data is traded and moved back and forth like high-frequency stocks between dozens of data aggregators."

Another conundrum is that the more data is de-identified, the less useful it becomes, and there are some cases where people don't expect anonymity, but they do expect their information to be protected.
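That trade-off can be seen in a common de-identification technique: generalizing quasi-identifiers such as age and ZIP code. The following Python sketch (the records and generalization levels are invented for illustration) coarsens data step by step; each level makes individuals harder to single out and the data less useful for analysis.

```python
def generalize(record, level):
    """Coarsen a (age, zip_code) record; higher level = more anonymity, less detail."""
    age, zip_code = record
    if level == 0:
        return (age, zip_code)                  # raw: most useful, least anonymous
    if level == 1:
        return (age // 10 * 10, zip_code[:3])   # age band + ZIP prefix
    return ("*", "*")                           # fully suppressed: useless for analysis

# Three hypothetical patient records
records = [(34, "90210"), (37, "90213"), (36, "90218")]

for level in range(3):
    print(level, [generalize(r, level) for r in records])
```

At level 1 all three records collapse to `(30, "902")`, so no individual can be singled out within the group, but any analysis needing exact ages or neighborhoods is no longer possible. Pushing further, to level 2, protects perfectly and destroys all value, echoing Ortega's safe-with-the-key-thrown-away point below.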

"There are times when the cost of making data anonymous may be outweighed by the benefits we can reap from higher quality data," Finch said. "We also have to consider situations where we don't want perfect anonymity -- if you're a patient in a clinical study, for example, and a researcher notices a potentially dangerous abnormality in your de-identified records, it would be important they have some way to re-identify you."

Ortega agrees. "You cannot protect data with a 100% chance of being completely secure unless you lock it up in a safe and throw away the key. That would not be good for analysis," he said.

If there is a consensus among experts, it is that most collectors can and should do a better job of protecting the privacy of individuals, either through rigorous anonymization or other privacy protections.

"Data minimization plays a part here," Wachs said. "In any given data set, were all of the data elements necessary to accomplish a specific goal? Or was data just being collected because it could, or people were willing to offer it?"

