5 things you need to know about data exhaust

Katherine Noyes | May 16, 2016
Pay close attention or your data lake may turn into a data swamp.

"A lot of exhaust data isn’t immediately valuable," agreed Nik Rouda, senior analyst with Enterprise Strategy Group. "The trick is figuring what is or could be."

4. Beware the 'swamp' -- and the legal baggage.

There can be risks associated with data exhaust.

"This is generally stuff customers may or may not be willing to have given you," Rattenbury explained. "So there are potential legal, marketing, and public-relations risks around leveraging that data. You could end up alienating your customer base or partners by knowing stuff about them that they didn't want you to know."

The implications can be subtle. If an insurance company were to make use of the fact that it can see the GPS location of everywhere you've recently parked your car, for instance, it could raise rates for customers who routinely park in higher-crime areas. Without intending to do so, it might build an algorithm that ends up discriminating racially, he pointed out.

Another potential risk is saving data that will never be useful.

"CIOs need to balance the value of data exhaust against the waste of keeping tons of useless data forever," Shacklett said. "This is very difficult to do right now."

The goal is to save data exhaust that does more than add incremental insight and color, and that can actually transform business activities, Rouda said. "If there isn’t any business reason, this is where data lakes get a bad rap" and become data swamps.

5. You need to make some decisions.

The bottom line is that it's critical to be selective about what data exhaust gets saved.

"It is important to start making some executive decisions on what you are going to throw out," Shacklett said.

For instance, when it comes to smartphones and other devices, it's well known that much of the associated streaming data is "overhead" from device handshaking and extraneous "log data gibberish," she pointed out. "It is doubtful that this type of data will ever be useful."

Companies should also consult with lawyers, Rattenbury said.

In addition, they should get their employees closest to the core business in touch with the data. "They'll have immediate questions they can ask that will show the relevance right away," he explained.

From a technical perspective, companies need scalable storage technologies as well as tools for self-service data access.

One of the hardest parts of working with exhaust data is getting a single coherent view of it, Rattenbury said. Cleaning up and unifying that data can be a challenge.

"I might have signed up for service at one place and entered credit-card information at another," he explained. "You've recorded the same piece of data on me from a few different places."
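To make that unification problem concrete, here is a minimal Python sketch of one simple approach: merging records from different systems into one profile per customer by matching on a normalized email address. The field names (`source`, `email`, `name`, `card_last4`), the sample records, and the choice of email as the matching key are all assumptions made for illustration; none of them come from Rattenbury, and real-world matching is usually far messier.

```python
from collections import defaultdict


def normalize_email(email):
    """Lower-case and trim an email so trivially different spellings match."""
    return email.strip().lower()


def unify_records(records):
    """Merge records that share a normalized email into one profile each."""
    profiles = defaultdict(lambda: {"sources": []})
    for record in records:
        key = normalize_email(record["email"])
        profile = profiles[key]
        # Remember which system each piece of the profile came from.
        profile["sources"].append(record["source"])
        for field, value in record.items():
            if field in ("email", "source"):
                continue
            # Keep the first non-empty value seen for each field.
            if value and field not in profile:
                profile[field] = value
    return dict(profiles)


# Hypothetical records for the same customer captured by two systems.
records = [
    {"source": "signup", "email": "Pat@Example.com ", "name": "Pat Lee"},
    {"source": "billing", "email": "pat@example.com", "card_last4": "4242"},
]

unified = unify_records(records)
```

In this sketch the two records collapse into a single profile keyed by `pat@example.com`, with the name taken from the signup record and the card digits from the billing record. Matching on an exact key is the easy case; customers who use different email addresses in different places would need fuzzier matching than this.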

