
How to get your mainframe's data for Hadoop analytics

Andrew C. Oliver | July 1, 2016
IT's mainframe managers don't want to give you access but do want the mainframe's data used. Here's how to square that circle

Orchestration and more

Virtually any of these techniques will require some kind of orchestration, which I've covered before. I've had more than one client require me to write that tool in shell scripts or, worse, Oozie (which is Hadoop's worst-written piece of software; every copy should be taken out to the desert and turned into a Burning Man statue). Seriously, though, use an orchestration tool rather than writing your own or leaving it implicit.
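To make concrete what that orchestration layer is doing for you, here's a minimal sketch in plain Python of dependency-ordered task execution. The task names are hypothetical stand-ins for a mainframe-to-Hadoop pipeline; a real tool like Airflow or Oozie manages exactly this kind of dependency graph (plus retries, scheduling, and logging) so you don't hand-roll it in shell scripts.

```python
# Sketch: the core of what an orchestration tool manages -- running
# pipeline steps in dependency order. Task names are hypothetical.

def run_pipeline(tasks, deps):
    """Run tasks so every prerequisite finishes first.

    deps maps task name -> set of prerequisite task names.
    Returns the execution order (here we just record names; a real
    pipeline would shell out to extract/convert/load commands).
    """
    done, order = set(), []

    def run(task):
        if task in done:
            return
        for prereq in deps.get(task, ()):
            run(prereq)          # recurse into prerequisites first
        order.append(task)       # stand-in for executing the step
        done.add(task)

    for t in tasks:
        run(t)
    return order

tasks = ["extract_vsam", "convert_ebcdic", "load_hdfs", "run_spark_job"]
deps = {
    "convert_ebcdic": {"extract_vsam"},
    "load_hdfs": {"convert_ebcdic"},
    "run_spark_job": {"load_hdfs"},
}
print(run_pipeline(tasks, deps))
# -> ['extract_vsam', 'convert_ebcdic', 'load_hdfs', 'run_spark_job']
```

Real orchestration tools add the parts this sketch skips: failure handling, backfills, and the middle-of-the-night scheduling the mainframe team will insist on.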

Just because there are patterns doesn't mean you should write this from scratch. There are certainly ETL tools that handle some or most of it.

To be fair, the configuration and mapping those tools require frequently makes you wish you had written it yourself after all. Still, check out offerings from Talend to Zaloni; they may work better than rolling your own.

The bottom line is that you can use mainframe data with Hadoop or Spark. There's no obstacle you can't overcome, whether with no-install, middle-of-the-night, or EBCDIC-conversion techniques.
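On the EBCDIC point, the text-conversion step is straightforward in most languages. Here's a minimal Python sketch assuming the common cp037 (US/Canada EBCDIC) code page; your shop's code page may differ, and this only covers character fields, not packed-decimal (COMP-3) fields, which need a copybook-aware decoder.

```python
# Sketch: converting EBCDIC character data to UTF-8, assuming the
# cp037 code page (a common US/Canada EBCDIC variant).

# Simulate a record as it would arrive off the host in EBCDIC bytes.
ebcdic_bytes = "HELLO MAINFRAME".encode("cp037")

# EBCDIC 'A' is 0xC1, nothing like ASCII -- a byte-for-byte copy
# into Hadoop would be garbage without this decode step.
text = ebcdic_bytes.decode("cp037")
print(text)  # HELLO MAINFRAME

# Re-encode as UTF-8 for Hadoop/Spark tooling.
utf8_bytes = text.encode("utf-8")
```

Note that binary record formats (packed decimal, binary integers, REDEFINES) can't be handled by a simple codec pass; that's where the copybook-driven ETL tools mentioned above earn their keep.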

As a result, you don't have to replace the mainframe just because you've decided to do more advanced analytics, whether via an enterprise data hub or by analyzing in place. The mainframe team should like that.

Source: InfoWorld
