Rethinking the status quo of data analytics
The mantra of data analytics has been the same for decades: Don't build a data warehouse until you know the questions you want to ask of your data. Data warehouses store precomputed answers intended to respond to questions relatively quickly — the limitation being that only predetermined questions can be answered.
If the questions change, business analysts cannot go back to the raw data to answer new questions or to explore data beyond predefined parameters. Adding new data sets to a data warehouse is also a challenge, as is changing an existing data set, for example by adjusting its level of granularity from days to hours. Seemingly minor alterations like these can take weeks, if not months, to execute.
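The granularity problem can be illustrated with a small sketch. It is not how any particular warehouse product works; the event data and the `aggregate` helper are hypothetical and exist only to show why a precomputed daily total cannot be broken back down into hours, while the raw events can answer either question.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw sales events: (timestamp, amount). Illustrative data only.
RAW_EVENTS = [
    (datetime(2024, 1, 1, 9, 15), 20.0),
    (datetime(2024, 1, 1, 9, 45), 5.0),
    (datetime(2024, 1, 1, 14, 5), 10.0),
    (datetime(2024, 1, 2, 9, 30), 7.5),
]


def aggregate(events, bucket_format):
    """Sum amounts into buckets keyed by the formatted timestamp."""
    totals = defaultdict(float)
    for timestamp, amount in events:
        totals[timestamp.strftime(bucket_format)] += amount
    return dict(totals)


# The precomputed answer a warehouse designed around a daily question would keep.
daily_totals = aggregate(RAW_EVENTS, "%Y-%m-%d")

# A new hourly question can only be answered from the raw events;
# the daily totals no longer contain the time of day.
hourly_totals = aggregate(RAW_EVENTS, "%Y-%m-%d %H:00")

print(daily_totals)
print(hourly_totals)
```

If only `daily_totals` had been stored, the hourly view would require reloading and reprocessing the source data, which is the kind of rework described above.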
Today's enterprises require a more flexible approach to performing big data analytics because:
- The variety and quantity of data are growing massively.
- Analysts can't know in advance what questions they'll need to ask of their data as the market, customers, and competitors change.
- To answer the full range of unanticipated questions, self-service access must be provided to all of an enterprise's raw data.
- To stay competitive, businesses need to use their data in more ways than ever before.
Moreover, business analysts need to be empowered to manipulate data so that it can be shared with other people in the organization. In short, they must play a direct role in fostering collaboration around business intelligence. After all, business intelligence provides value to the company only if it can be used for business decision-making — and only if those decisions are made at the right time and by the right people in the organization.
An approach to enabling fast, unbounded BI
Platfora has rethought the status quo of big data analytics. It begins with the Hadoop Data Reservoir (HDR), which is our vision for how Hadoop can be used effectively within the enterprise as a single, central repository where all enterprise data can reside. The HDR serves as both the storage and the source of data for what we refer to as self-service analytics. It provides processing for data preparation and advanced analytics, ultimately eliminating data silos and reducing costs.
The integrated platform we developed to support a new era of self-service analytics helps to remove the obstacles to business intelligence described earlier by enabling an "interest-driven pipeline" of data controlled by the end-user. The end-user, typically a business analyst, can access raw data directly from Hadoop, which is then transformed into interactive, in-memory business intelligence. There is no need for a data warehouse or for separate ETL (extract, transform, load) software, which avoids the headaches described above.