When British spies gave their Internet surveillance program the codename Karma Police, they may have given away a little too much about its epic purpose: "To build a web-browsing profile for every visible user on the Internet."
The system ultimately gathered trillions of metadata records about Internet users' browsing habits.
In official documents obtained by The Intercept, the intent of Karma Police stands out alongside more cryptically named projects such as Moose Milk (using data mining to detect suspicious use of telephone kiosks) or Salty Otter (a technique for detecting when one medium, such as a telephone call, is used to trigger another, such as a chat service).
The documents, from the cache leaked by former U.S. National Security Agency contractor Edward Snowden, show that a division of the U.K. Government Communications Headquarters (GCHQ) gave the go-ahead for more work on Karma Police. However, the Pull-Through Steering Group (PTSG), which evaluated prototype surveillance technologies for further development, noted in the minutes of its February 2008 meeting that "the legalities with respect to 'content' need to be cleared."
By the following year the tool was in production, and GCHQ staff used it to extract a sample of 6.68 million metadata records involving 224,000 unique IP addresses of Internet-radio listeners between August and October. Their goal was to explore whether streaming radio stations were used for Islamic radicalization.
As a report on that operation notes: "A wealth of datamining techniques could be applied on small closed groups of individuals, to look for potential covert communications channels for hostile intelligence agencies running agents in allied countries, terrorist cells or serious crime targets."
However, even with the data from Karma Police, the spies were able to say little more about one Egyptian Internet radio user caught up in the dragnet than that he also used "other Web 2.0" services such as Facebook, Yahoo webmail, YouTube and pornographic video sharing site RedTube.
That data, and more, was funnelled into a gigantic data repository called Black Hole -- perhaps GCHQ's equivalent of the NSA's data center in Bluffdale, Utah.
By March 2009 it already held over one trillion metadata records, and in May 2012, with records being added at the rate of 50 billion per day, work was underway to double its acquisition capacity, according to an internal GCHQ presentation on analytic cloud challenges obtained by The Intercept. The sheer volume of new data meant that GCHQ could store it for no more than six months, and sometimes even less.
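To put those figures in perspective, the following back-of-the-envelope sketch estimates how many records a six-month window would hold at the reported 2012 ingest rate. The 183-day window and the function name are assumptions made for illustration; the documents do not state the repository's actual size at that point.

```python
# Rough estimate only: the ingest rate comes from the article,
# the 183-day retention window is an assumed reading of "six months".
RECORDS_PER_DAY = 50_000_000_000  # 50 billion records per day (May 2012)
RETENTION_DAYS = 183              # assumption: roughly six months


def records_retained(records_per_day: int, retention_days: int) -> int:
    """Return the number of records held if every day's intake is kept
    for the full retention window at a constant ingest rate."""
    return records_per_day * retention_days


total = records_retained(RECORDS_PER_DAY, RETENTION_DAYS)
print(f"{total:,} records (~{total / 1e12:.2f} trillion)")
```

At a constant 50 billion records per day, a full six months would come to roughly nine trillion records, which helps explain why retention was sometimes shorter.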
GCHQ staff cross-referenced the IP address information from Karma Police with an analysis of cookies performed by another tool, Mutant Broth, to identify Internet users and build up a clearer profile of their online habits. Among the cookies used to identify people were those of Google, Yahoo, YouTube, Facebook, Reddit, WordPress, Amazon.com, CNN and the BBC.