Demand for Hadoop professionals falls into three broad categories: data analysts or data scientists; data engineers; and IT data management professionals, said Martin Hall, CEO of Karmasphere, which sells software products for Hadoop environments.
The data management professionals will be the ones who choose, install, manage, provision and scale Hadoop clusters, Hall said. These are the IT professionals who decide whether Hadoop runs in the cloud or on-premises, which vendors and which distribution of Hadoop to choose, how large the cluster should be, and whether it will be used for running production applications or for quality-testing purposes.
The skills required for this role are similar to those required for doing the same tasks in traditional relational database and data warehouse environments, he said.
Hadoop data engineers, meanwhile, are those responsible for creating the data processing jobs and building the distributed MapReduce algorithms for use by data analysts. Those with skills in areas such as Java and C++ could find more opportunities as enterprises begin deploying Hadoop, he said.
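To illustrate the kind of work those data engineers do, here is a minimal sketch of the MapReduce pattern (the classic word count) simulated in plain Java. This is not a real Hadoop job; an actual job would implement Hadoop's Mapper and Reducer APIs and run distributed across a cluster, but the map/shuffle/reduce flow is the same idea.

```java
import java.util.*;
import java.util.stream.*;

// Minimal, single-process sketch of the MapReduce pattern (word count).
// In a real Hadoop job, map() and reduce() would be Mapper/Reducer
// classes executed in parallel across the nodes of a cluster.
public class WordCountSketch {
    // Map step: emit a (word, 1) pair for each word in an input line.
    static Stream<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\s+"))
                     .filter(w -> !w.isEmpty())
                     .map(w -> Map.entry(w, 1));
    }

    // Shuffle + reduce step: group the pairs by key and sum the counts.
    static Map<String, Integer> reduce(Stream<Map.Entry<String, Integer>> pairs) {
        return pairs.collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.summingInt(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        List<String> input = List.of("hadoop stores big data",
                                     "hadoop processes big data");
        Map<String, Integer> counts =
                reduce(input.stream().flatMap(WordCountSketch::map));
        System.out.println(counts.get("hadoop")); // 2
        System.out.println(counts.get("data"));   // 2
    }
}
```

The value Hadoop adds is running the map and reduce steps in parallel over data too large for one machine, which is why the role demands distributed-systems skills beyond plain Java.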
The third category of professionals in demand is data scientists with experience in tools such as SAS and SPSS and programming languages such as R, Hall said. These are the professionals who will generate, analyze, share and integrate intelligence gathered and stored in Hadoop environments.
For the moment, the shortage of Hadoop manpower means companies need help from service providers to deploy the technology. One indication of this is that the revenues generated by professional consulting and systems integration firms working with Hadoop are significantly larger than the revenues from sales of Hadoop products, Kobielus said.
Companies such as Cloudera, MapR, Hortonworks and IBM today offer training courses in Hadoop that companies can take advantage of to build their own Hadoop centers of excellence, he said.