Organisations have been rapidly adopting Hadoop and other big data technologies over the past several years, but that adoption has been accompanied by a steady undercurrent of concern about the state of enterprise-grade security.
While Hadoop distribution vendors and the open source community have been working to add security and governance features to Hadoop, Redwood City, Calif.-based startup BlueTalon has been developing a policy engine intended to span an organization's entire data infrastructure. The engine provides fine-grained access control and data masking for Hadoop clusters, relational database management systems (RDBMS), NoSQL data stores and more, whether on-premises, in the cloud or in hybrid cloud environments.
"We're the only company that's coming at this at an enterprise-wide basis," says Eric Tilenius, CEO of BlueTalon and formerly executive-in-residence at Scale Venture Partners. "Companies don't have one data system. Being able to have one consistent access control is really important to data-centric security. We work across all the various data sources."
BlueTalon made a splash at the Strata+Hadoop World conference in San Jose in February, where it was the Audience Choice winner of the Startup Showcase.
Today, BlueTalon announced that Hadoop distribution vendor Cloudera has certified the BlueTalon Policy Engine 2.0 with Impala or Hive as part of Cloudera Enterprise.
The BlueTalon Policy Engine integrates with Impala and Hive as part of Cloudera Enterprise to achieve the following results:
- Provide filtering with fine-grained access control at the row, column, cell or partial cell levels.
- Dynamically mask data and allow users to utilize sensitive data in queries without revealing it.
- Provision precise data access by enabling role- and purpose-based data access policies to be authored from a central, easy-to-use graphical user interface.
- Enforce consistent data access policies across users, applications and data repositories.
- Audit data access to ensure compliance with industry regulations such as HIPAA and PCI, and to quickly spot anomalous data requests before significant data leakage occurs.
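To make the first two capabilities concrete, here is a minimal Python sketch of row filtering, column hiding and partial-cell masking driven by a per-role policy. It is an illustration only and does not represent BlueTalon's engine or API: the `Policy` class, the `apply_policy` function, the role names and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Row = Dict[str, str]


def allow_all(row: Row) -> bool:
    """Default row filter: every row is visible."""
    return True


def mask_all_but_last4(value: str) -> str:
    """Partial-cell masking: hide everything except the final four characters."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


@dataclass
class Policy:
    # Hypothetical policy shape for illustration; not a real product's model.
    row_filter: Callable[[Row], bool] = allow_all
    hidden_columns: Set[str] = field(default_factory=set)
    masked_columns: Dict[str, Callable[[str], str]] = field(default_factory=dict)


def apply_policy(rows: List[Row], policy: Policy) -> List[Row]:
    """Return only the rows and cells the policy allows, masking where required."""
    result = []
    for row in rows:
        if not policy.row_filter(row):
            continue  # row-level filtering
        visible = {}
        for column, value in row.items():
            if column in policy.hidden_columns:
                continue  # column-level filtering
            masker = policy.masked_columns.get(column)
            visible[column] = masker(value) if masker else value
        result.append(visible)
    return result


# Hypothetical roles and sample data, invented for this example.
POLICIES = {
    "analyst": Policy(
        row_filter=lambda r: r["region"] == "EMEA",
        hidden_columns={"salary"},
        masked_columns={"ssn": mask_all_but_last4},
    ),
    "auditor": Policy(),
}

RECORDS = [
    {"name": "Ana", "region": "EMEA", "ssn": "123-45-6789", "salary": "90000"},
    {"name": "Bo", "region": "APAC", "ssn": "987-65-4321", "salary": "85000"},
]

if __name__ == "__main__":
    print(apply_policy(RECORDS, POLICIES["analyst"]))
```

In this sketch the analyst role sees only EMEA rows, never sees the salary column, and sees just the last four characters of each SSN, while the auditor role sees every record unchanged. A production policy engine would enforce such rules at the query layer rather than on results already fetched into application memory.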
Tilenius notes that organizations are increasingly putting their data in massive repositories like data lakes, and while there are tremendous potential benefits in doing so, it also increases their exposure to risk.
"Businesses nowadays run on data," he says. "It's not OK to just have one guy in the inner sanctum who tells you what the data is. People want direct access to the data. But Hadoop is among the least hardened systems in the enterprise."
"You say security and people think about things in black and white," he adds. "Authentication, Kerberos, encryption — people look at the perimeter. But when attacks come from compromised credentials, none of that protects you. It's not sufficient anymore. It's more important than ever to have a data-centric approach — what is the data and who should be access it and what can they see?"