OK, so maybe your average homegrown terrorist isn't actually "liking" a How to Make Better Bombs page on Facebook. And perhaps your typical terrorists aren't even active on Facebook, or are clever enough to cover their tracks and not connect with those of a similar bent. But before you scoff at the idea of Facebook (and to a lesser extent LinkedIn) having enough data on its users to help the government find potential terrorists (or just malcontents) — data that may very well already be public — consider the following.
A blog post rather mildly titled Using Metadata to Find Paul Revere outlines how easy it would have been for the British government in the 1770s to discover that Paul Revere was a "person of interest" if they had rudimentary tools of the 21st century: a list of colonists belonging to various groups, a cheap desktop computer and free statistical analysis software R. We're not talking supercomputer clusters here, but any PC or Mac that can do some matrix multiplication and a bit of network analysis with R.
The blogger, Kieran Healy, an associate professor of sociology at Duke University, posted his R code on GitHub so anyone can follow along. Basically, he creates a couple of tables showing each person and the organizations they belong to, both "people by groups" and "groups by people." Multiplying them out creates a rather massive table, what he calls a "person by person matrix," in which "the cells show the number of organizations any particular pair of people both belonged to." Create a network diagram and you can see who's in the center.
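To make the matrix trick concrete, here is a minimal sketch in plain Python rather than R. It is not Healy's code, and the people, groups and memberships are invented toy data, not the historical records. It multiplies a "people by groups" table by its transpose to get a person-by-person table of shared memberships, then uses a crude stand-in for network centrality: each person's total number of memberships shared with others.

```python
# Toy data only: these names and memberships are hypothetical, not the colonial dataset.
people = ["Ann", "Ben", "Cal", "Dee"]
groups = ["LodgeX", "ClubY", "CaucusZ"]

# "People by groups": membership[i][j] is 1 if person i belongs to group j.
membership = [
    [1, 1, 1],  # Ann
    [1, 0, 0],  # Ben
    [0, 1, 0],  # Cal
    [0, 1, 1],  # Dee
]


def transpose(matrix):
    """Swap rows and columns ("people by groups" becomes "groups by people")."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_columns = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns] for row in a]


# "Person by person": cell [i][j] counts the groups persons i and j both belong to.
co_membership = matmul(membership, transpose(membership))

# Crude centrality score: shared memberships with everyone else (skip the diagonal,
# which only counts how many groups a person belongs to).
shared_with_others = {
    person: sum(count for j, count in enumerate(co_membership[i]) if j != i)
    for i, person in enumerate(people)
}
most_connected = max(shared_with_others, key=shared_with_others.get)

for person, row in zip(people, co_membership):
    print(person, row)
print("Most connected:", most_connected)
```

On this toy data, Ann shares at least one group with everyone else and comes out on top. A real analysis would use a proper centrality measure from a network library, but the core step is the same single multiplication.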
You don't even need statistical analysis software like R (or the Excel plugin NodeXL); Google's Fusion Tables lets you do this as well.
(For those who want to delve into the concepts in more detail, Healy recommends a paper by Shin-Kap Han at Seoul National University in Korea; if you want to learn more about R, check out my Beginner's Guide to R.) I downloaded the data and ran the code, and sure enough generated a network diagram showing Paul Revere at its center.
Now, I don't mean to minimize revelations of the U.S. government's phone and Internet surveillance programs here. Clearly, trolling through phone and email records is a serious breach of both privacy and trust in a way that combing through public social media records is not.
My point here is to help people realize that data they give out willingly about themselves can be used by anyone with a computer and some knowledge of statistical analysis to find out a lot more than they might expect. This doesn't mean it's time to drop off the grid and fit yourself for a tin-foil hat, but it probably does mean you may want to make sure you truly "like" someone, some company or some cause before clicking the button on Facebook.