When the big data rubber hits the road, it's about more than just storing massive amounts of data or even analyzing and visualising a single stream. Gaining true insight from your data assets generally requires blending operational data and data from other big data sources together. Business analytics platform vendor Pentaho is striving to make that process easier than ever.
"True 'big picture' insights happen when operational data sources are blended with big data sources," says Quentin Gallivan, CEO of Pentaho. "Companies that compete largely on service, in industries like telecommunications and financial services, see big data blending's potential to help them gain market-share by providing the most personalized and interactive customer experience."
This week, Pentaho unveiled Pentaho Business Analytics 5.0, a complete redesign and overhaul of its data integration and analytics platform that addresses data blending from the ground up and offers a new interface intended to simplify the user experience.
"What we're seeing from our base is the need to make data more valuable by blending it with other data sources to provide insight," says Rosanne Saccone, CMO of Pentaho. "Customers want to blend data not just at the glass and the desktop, but at the source."
For instance, a telco might want to blend machine data about dropped calls with data from its data warehouse identifying its most valuable customers and their service level agreements (SLAs). The telco could then proactively target valuable customers who are not receiving agreed-upon service levels with promotions and discounts.
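To make the telco scenario concrete, here is a minimal sketch in Python of that kind of blend. It is illustrative only and does not use Pentaho's tools or API; the record layouts, the `tier` and `max_dropped_calls` fields, and the `customers_to_target` function are all assumptions invented for this example.

```python
from collections import Counter

# Hypothetical machine data: one record per dropped call.
dropped_calls = [
    {"customer_id": "C1"},
    {"customer_id": "C1"},
    {"customer_id": "C1"},
    {"customer_id": "C2"},
    {"customer_id": "C3"},
    {"customer_id": "C3"},
]

# Hypothetical warehouse data: customer value tier and the SLA limit
# on dropped calls for the reporting period.
customers = {
    "C1": {"tier": "gold", "max_dropped_calls": 2},
    "C2": {"tier": "gold", "max_dropped_calls": 2},
    "C3": {"tier": "basic", "max_dropped_calls": 1},
}


def customers_to_target(dropped_calls, customers, valuable_tiers=("gold",)):
    """Return IDs of valuable customers whose dropped calls exceed their SLA."""
    # Aggregate the machine data per customer.
    drops = Counter(record["customer_id"] for record in dropped_calls)
    targets = []
    for customer_id, profile in customers.items():
        is_valuable = profile["tier"] in valuable_tiers
        sla_breached = drops[customer_id] > profile["max_dropped_calls"]
        if is_valuable and sla_breached:
            targets.append(customer_id)
    return sorted(targets)


targets = customers_to_target(dropped_calls, customers)
print(targets)
```

In this toy data set only customer C1 is both in a valuable tier and over the SLA limit. At real scale the dropped-call records would come from a big data source rather than an in-memory list, which is where the integration challenge described below comes in.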
Blending Can Be a Significant Data Integration Challenge
As Matt Casters, Pentaho's chief of data integration, notes, data blending allows a data integration user to create a transformation capable of delivering data directly to other business analytics tools. Traditionally, data is delivered to these tools via a relational database. But that becomes challenging when dealing with massive volumes of data or when you just don't have the time to wait until database tables are updated.
Addressing this issue often leads to hugely complex big data architectures with many moving parts: Hadoop clusters, NoSQL and traditional RDBMS technologies, ETL tools, data marts, traditional BI tools and more.
Bringing it all together and giving users the capability to blend data with varying levels of data quality and granularity can be a significant challenge.
"The main problem we faced early on was that the default language used under the covers, in just about any business intelligence user-facing tool, is SQL," Casters explains. "At first glance, it seems that the worlds of data integration and SQL are not compatible."
Casters says that DI requires reading from a multitude of data sources, such as databases, spreadsheets, NoSQL and big data sources, XML and JSON files, web services and more.