· Fraud Detection -- where systems need to analyse data immediately to actively detect fraudulent activity.
· Operational Data -- users need the ability to store logs for easy lookup and have reliable information for analytic model building.
Modern analytic solutions do exist for mutable data types. However, these may require yet another technology deployment or result in redundant storage. Existing solutions have also been plagued by common drawbacks, including poor analytic performance, complex application design, and the difficulty of enforcing security or policy across multiple access engines.
Mutable data types inside Hadoop have typically been handled by data stores like HBase, but often at the expense of analytic performance. This has forced developers to use both HDFS and HBase to strike a balance.
The good news is that we are making progress in this area. One of the new solutions now available is Kudu. Currently available as a public beta, Kudu is an updatable columnar store for Hadoop designed for fast analytic performance. It simplifies the architecture for building analytic applications on changing data, complementing the capabilities of HDFS and HBase.
Kudu offers a simpler architecture, providing superior performance in a single data store to support increasingly common real-time use cases. We expect it to greatly enhance the performance of Hadoop components like Impala, and it is helping to sustain Impala's performance leadership in the ecosystem.
Another benefit that Kudu brings is that it eliminates the need to explore tiering solutions that complicate Hadoop's unified design. Developers no longer have to make a choice between the scanning analytic capabilities of HDFS and the insert and update capabilities of Apache HBase.
With data, you cannot have performance without ensuring security. Another solution that we have recently made available for public beta download is RecordService. RecordService is the new role-based policy enforcement engine for Apache Hadoop. It is designed to provide centralised policy enforcement so that developers and users can continue to add new features to Hadoop with a core standard of policy management.
RecordService provides the controls that allow us to integrate sensitive data sources so that we are creating a better, full-fidelity view of data. This is crucial as more organisations are using big data systems to handle highly sensitive data.
With data analytics, you have control
There is no doubt that Apache Hadoop is advancing the state of modern analytic databases. Business leaders are saying that data analytics is going to be the future of everything, but we believe that what is more important is that data analytics offers more control now.
For most businesses, things can feel a little chaotic and overwhelming right now, with an onslaught of data growth and a fast-transforming competitive environment. It is crucial that data analytics allows decision makers to measure, monitor and predict so that they can make changes if necessary. Data analytics offers them a view into who their customers are, what they are buying, how their business operations are running, what their market outlook is like, and much more. With a clearer view into what is going on, what data analytics offers is better control.