The ability to read hundreds of articles in less time than it would take a human to read just one gives Terminal customers a distinct advantage, as Bloomberg Market Specialist Ian McFarlane explains.
"During the time you've got your nose buried in that piece, the stocks or bonds in your portfolio might have been mentioned in hundreds of social media posts and news articles. It's impossible for a human to keep up with that deluge of real-time data. That's where distilling sentiment from news and social media provides an advantage," he said.
This tool is being further developed to judge the reliability of tweets and other social media posts, says Mann.
"We know how to vet a news story and a news organisation. How do you vet that stuff on Twitter for accuracy?" he says.
In another project, conducted over the last two years, machine learning is being used to extract data from PDF financial reports and documents.
"Sometimes they're in a structured format like XML or XBRL but often they're in PDF and they have a huge amount of data. To extract the data, in the past we've had armies of data analysts typing stuff in looking into these reports. It's expensive, it's slow and we often don't have the recall that we want," Mann says. "So we've been mapping a fairly involved research effort to extract data from those documents."
As a next step to that work, the company is now researching how a machine can identify graphs and scatterplots to extract the numbers.
"It firstly looks at the scatterplot, then it identifies the axes of the scatterplot and the ticks on the scatterplot and then it registers each data point so that it can recover all of the data that was used to make up that scatterplot," Mann says. "All of this is an effort to give structure to all of the unstructured data."
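The last step Mann describes, turning located points back into numbers, can be illustrated with a minimal sketch. It assumes the axes and ticks have already been detected upstream, that both axes are linear, and that two labelled ticks per axis are known. The names `AxisCalibration` and `recover_points` are hypothetical and chosen for illustration; this is not Bloomberg's implementation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class AxisCalibration:
    """Two labelled ticks on one axis: pixel position and data value of each.

    Hypothetical helper; assumes a linear (not logarithmic) axis.
    """
    pixel_a: float
    value_a: float
    pixel_b: float
    value_b: float

    def to_value(self, pixel: float) -> float:
        # Linear interpolation between the two known ticks.
        scale = (self.value_b - self.value_a) / (self.pixel_b - self.pixel_a)
        return self.value_a + (pixel - self.pixel_a) * scale


def recover_points(
    pixel_points: Iterable[Tuple[float, float]],
    x_axis: AxisCalibration,
    y_axis: AxisCalibration,
) -> List[Tuple[float, float]]:
    """Map detected marker positions (in pixels) back to data coordinates."""
    return [(x_axis.to_value(px), y_axis.to_value(py)) for px, py in pixel_points]


# Example: image y pixels grow downward, so the tick for value 0 sits at a
# larger pixel row than the tick for value 100.
x_axis = AxisCalibration(pixel_a=100, value_a=0.0, pixel_b=500, value_b=10.0)
y_axis = AxisCalibration(pixel_a=400, value_a=0.0, pixel_b=0, value_b=100.0)
points = recover_points([(300, 200), (100, 400)], x_axis, y_axis)
```

Because each axis is calibrated from its own pair of ticks, the inverted vertical direction of image coordinates needs no special handling. A production system would also need to cope with logarithmic axes, overlapping markers and detection noise.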
Behind many of Bloomberg's recent builds is an open source ethic. Mann says there has been a sea change in the company's attitude towards open source.
"When the company started in 1981 there really wasn't a whole lot of open source. And so there was a mentality of if it's not invented here we're not interested," Mann says.
Gideon Mann, head of data science at Bloomberg
Indeed, Bloomberg once built networking gear for its clients, and had its own networking protocol. The company even produced its own keyboards before they became standardised.
"There's always this thread of 'well we'll build it on our own when it's not widely available and then when it becomes a commodity then we'll adopt'. And I think the same thing was true for open source," Mann says.