Although vendor-written, this contributed piece does not promote a product or service and has been edited and approved by Executive Networks Media editors.
I’m an aerospace engineer by degree and an IT executive by practice. Early in my career, I worked on missile hardware and simulators with some of the smartest minds at Marshall Space Flight Center in Huntsville, AL. An adage from those days still drives me today: “Better is the enemy of good enough.”
In rocket science, an astronaut’s life literally hangs in the balance with every engineering decision. Being perfect is mission critical. But along the way, NASA engineers realized that while perfection is important, it should not be pursued universally, for several key reasons: It is very expensive, it draws out timelines, and it can result in extreme over-engineering.
In IT network monitoring, where teams need to identify, resolve and ultimately prevent problems quickly, remnants of this perfectionist behavior persist. Teams eager to build the best, most complete and most comprehensive solution can fall into the trap of endless design, constantly adding a new metric, method or collection point to a system that has not even been deployed.
For IT leaders in today’s economy, this approach is impractical.
With today’s agile design and development practices, we are all pushed to get a minimum viable product out, to fail fast, to break then fix. Success is measured in days, even hours. The most cutting-edge development shops are in a continuous build, continuous test, DevOps mode, getting solutions to market at unheard-of speeds.
So how are today’s IT leaders, who are as intolerant of failure as rocket scientists, supposed to respond to demands for a fast, iterative, rapid-feedback monitoring solution? Here are three ideas.
Pick a platform, not a set of tools. NASA taught us the value of doing the hard work to build reliable launch vehicles that can be used in many creative ways. The now-retired space shuttle was a marvel of engineering. Over a lifespan of three decades, it deployed satellites and observatories such as the Hubble Space Telescope, and delivered astronauts, experiment materials and components to the space station.
Your monitoring solution needs to be based on a reliable, extensible platform that provides basic but essential capabilities: event and data collection, message queuing, scale, availability and extensibility, for example.
It is very tempting to whiteboard a set of available tools that together provide this capability and have your smart team put it all together. Ultimately, however, this approach fails.
Your team will begin to uncover the problems of getting all these pieces to work together seamlessly and reliably. You will make decisions based on a host of third-party technologies, each with its own roadmap. Some of those technologies will fail or disappear. Each will advance its capabilities at a different pace.