So No. 1, we have machine data, and I'm using a term that Splunk popularized. Machine data includes log files, SNMP and WMI, and all of these data sources are largely unstructured. Splunk and others like it realized that enterprises are producing a lot of this unstructured machine data and not really doing anything with it. So they built platforms to index it, archive it, and analyze it to derive some intelligence from it.
I sometimes joke that it's been transformational in the same way that fracking has been in the energy market. What I mean by that is, the value was always there, but by applying new technology we can now access it and extract it. So I think one source of data in the IT environment is this unstructured machine data.
Another source is what I would call code-level instrumentation. And this is what traditional Application Performance Management is based upon. Wily (acquired by CA) really founded that market, but companies like DynaTrace and AppDynamics and even New Relic make use of code-level instrumentation. They have agents that instrument the Java Virtual Machine (JVM) or the .NET Common Language Runtime, and they can derive some intelligence and some performance metrics around how that service performs. Where are the hotspots and bottlenecks? What's it doing? These are very useful tools for developers who have intimate knowledge of the code and want to see how it runs in production.
The third source of data I call service checks. There are lots of facilities for doing this. If you're running some sort of synthetic transaction (basically a script mirroring common user actions), you can use internal checks, which is what HP's Mercury SiteScope and Nagios do today, or external service checks like Keynote or Compuware's Gomez. These give you a sense of whether your service or application is up or down and, to some degree, how it is performing. But there are some challenges with this approach: because these checks are periodic in nature, there's an inherent undersampling problem. That means that if you've got any sort of intermittent issue, you very well might miss it.
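To make the undersampling point concrete, here is a minimal sketch in Python. It is an illustration only, not how any of the products named above work: the function name `missed_outages`, the outage windows, and the check intervals are all hypothetical. It models a check that fires at fixed intervals and reports which outages fall entirely between two checks.

```python
def missed_outages(outages, check_interval, horizon):
    """Return the outages that no periodic check lands inside.

    outages        -- list of (start, end) times in seconds (hypothetical data)
    check_interval -- seconds between service checks
    horizon        -- total observation window in seconds
    """
    check_times = range(0, horizon, check_interval)
    missed = []
    for start, end in outages:
        # An outage is seen only if some check fires while it is in progress.
        if not any(start <= t < end for t in check_times):
            missed.append((start, end))
    return missed


# Hypothetical intermittent outages: a 25-second blip, a 10-second blip,
# and a 70-second outage, all within one hour.
outages = [(70, 95), (300, 310), (1190, 1260)]

# Checks every 5 minutes and every 1 minute both miss the 25-second blip.
print(missed_outages(outages, check_interval=300, horizon=3600))
print(missed_outages(outages, check_interval=60, horizon=3600))
```

In this toy model, an outage is caught only when a check happens to fire while it is in progress, so any issue shorter than the check interval can slip through unnoticed. Shortening the interval reduces the gap but never closes it.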
And finally the fourth fundamental source of data for intelligence is what we call wire data. That's everything on the network, from the packets to the payload of individual transactions. It is a very deep, very rich source of data. In fact, all indications are that wire data is at least one or two orders of magnitude larger than other sources of data, because there is just so much moving across our networks. And it's definitive. We know that a transaction completes if we can observe it completing on the wire and we can observe the peers in this conversation acknowledge that that transaction completed.