I've said it before and I'll say it again: Data science is a team sport.
The gold rush has started and no one will question the wisdom of buying a random acre of land with a stream and searching for your very own gold nugget -- or in this case, a data scientist. Gosh, there are a lot of articles on what makes a good data scientist.
Enough of that: I'd rather focus on what makes a bad data scientist who has the potential to harm rather than help your organization:
1. Weak mathematical background
With very few exceptions, data scientists are, at their core, math geeks. They fall on a spectrum, from total math types who write terrible Python (and maybe R) to folks who can pop machine learning algorithms off the top of their heads. You may need both depending on what you're doing. But a data scientist with a weak mathematical background probably isn't a real data scientist. Maybe they're a data architect or data engineer, but they're more likely a consultant from a staffing firm. This person won't help you. A weak mathematical background can hurt in a lot of ways -- particularly in judging whether the results you're getting are useful.
2. Weak computing background
Data scientists who are mathematicians but don't really understand computers aren't terribly useful (in the same way an executive assistant who uses a typewriter isn't terribly useful in the modern world). In plenty of circumstances, the way you'd calculate something on paper isn't the same as how you'd calculate it using a distributed platform like Spark. Your data scientist needs to understand this.
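To make that concrete, here is a minimal sketch of the paper-versus-cluster gap, using variance as the example. On paper you compute the mean, then sum the squared deviations, which takes two passes over all the data in one place. On a distributed platform each worker sees only its own slice, so the usual trick is to have each one produce a small summary and then merge the summaries. The code below is plain Python standing in for a cluster, not Spark's API; the names `Summary`, `summarize`, `merge`, and `distributed_variance` are made up for illustration.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Summary:
    """Partial aggregate for one partition (hypothetical helper type)."""
    n: int = 0        # number of values seen
    mean: float = 0.0  # running mean
    m2: float = 0.0    # running sum of squared deviations from the mean


def summarize(values):
    """One pass over a single partition, using Welford's running update."""
    s = Summary()
    for x in values:
        s.n += 1
        delta = x - s.mean
        s.mean += delta / s.n
        s.m2 += delta * (x - s.mean)
    return s


def merge(a, b):
    """Combine two partial summaries without touching the raw data again."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Summary(n, mean, m2)


def distributed_variance(partitions):
    """Population variance from per-partition summaries.

    Each inner list stands in for the data held by one worker.
    """
    total = reduce(merge, (summarize(p) for p in partitions), Summary())
    return total.m2 / total.n if total.n else float("nan")


if __name__ == "__main__":
    # Three pretend workers, each holding part of the data set.
    partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
    print(distributed_variance(partitions))
```

The point is not this particular formula but the shape of the solution: the merge step has to give the same answer no matter how the data was split up, and knowing when that holds takes both the math and the computing.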
3. Too good to be true
At the same time, don't expect to find a data scientist who is a mathematician, statistician, and distributed computing developer, with an MBA and actual experience as a mathematician, distributed computing developer, business person, and so on. In the words of a friend: "How old are they -- 80?" This is why you need a team. When you see a data scientist who meets the "unicorn" definition, remember this simple rule: Unicorns do not exist!
4. Effete academics
Just like there are coders who don't code and architects with no actual technical expertise, there are data scientists with limited experience with actual, you know, data. Moreover, they don't want to get their hands dirty by digging in the code. We're talking practical application, not theory. You're not running a university.
5. Poor communication skills
Fundamentally, a data scientist is there to bring clarity to data. As a technology pro or business expert, you might not understand all of the math or be able to implement it yourself, but to trust the decision-making process you should understand it at least conceptually. Whether it's a clustering algorithm, probability calculations, or NLP, this stuff isn't hard to convey. If your data scientist isn't making that happen, your data scientist is doing a bad job. Your data scientist needs to be approachable and make the process approachable. Also, the ability to communicate clearly with multiple groups at an organization to get adjunct information, data, or access to data -- and details on how the data was developed -- makes the work go much more smoothly.