Big data blues: The dangers of data mining

Cindy Waxer | Nov. 5, 2013
Big data might be big business, but overzealous data mining can seriously damage your brand. Will new ethical codes be enough to allay consumers' fears?

"Companies are starting to understand the danger of secondary uses of information and how people's personal data can be abused," says Walker. "Once they start to think about it, they're very much in favor of an ethical code."

13 commandments for data scientists
According to Michael Walker, a managing partner at systems integrator Rose Business Technologies, data scientists should be held to high ethical standards, just as doctors and lawyers are. Toward that end, he has created a set of commandments for number-crunchers — a list that aims to keep data scientists on the straight and narrow while preserving consumer privacy.

In Walker's view, data scientists shalt not do the following:

1. Fail to use scientific methods in performing data science.

2. Fail to rank the quality of evidence in a reasonable and understandable manner for the client.

3. Claim weak or uncertain evidence is strong evidence.

4. Misuse weak or uncertain evidence to communicate a false reality or promote an illusion of understanding.

5. Fail to rank the quality of data in a reasonable and understandable manner for the client.

6. Claim bad or uncertain data quality is good data quality.

7. Misuse bad or uncertain data quality to communicate a false reality or promote an illusion of understanding.

8. Fail to disclose any and all data science results or engage in cherry-picking.

9. Fail to attempt to replicate data science results.

10. Fail to disclose that data science results could not be replicated.

11. Misuse data science results to communicate a false reality or promote an illusion of understanding.

12. Fail to disclose failed experiments or disconfirming evidence known to the data scientist to be directly adverse to the position of the client.

13. Offer evidence that the data scientist knows to be false.

If a data scientist questions the quality of data or evidence, he must disclose this to the client. If a data scientist has offered material evidence and later comes to know that it is false, he shall take reasonable remedial measures, including disclosure to the client. A data scientist may disclose and label evidence he reasonably believes is false.

- Cindy Waxer

In fact, in an August 2013 survey conducted by statistical software company Revolution Analytics, 80% of the respondents said they agreed that there should be an ethical framework for collecting and using data. And more than half of data scientists surveyed agreed that ethics already play a big part in their research.

"My solution is to have some sort of code of professional conduct that data scientists would voluntarily agree to follow to protect people's private data," says Walker. He argues that a kind of Hippocratic Oath for analytics professionals would give data scientists both the moral and legal grounds for refusing to slice and dice numbers in ways that threaten to violate consumer privacy rights.

