Cacti is an extensive performance graphing and trending tool that can be used to track nearly any monitored metric that can be plotted on a graph. It's also infinitely customizable, which means it can get complex in places.
Nagios is a mature network monitoring framework that's been in active development for many years. Written in C, it's almost everything that system and network administrators could ask for in a monitoring package. The Web GUI is fast and intuitive, and the back end is extremely robust.
As with Cacti, a very active community supports Nagios, and plug-ins exist for a massive array of hardware and software. From basic ping tests to integration with plug-ins like WebInject, you can constantly monitor the status of servers, services, network links, and basically anything that speaks IP. I use Nagios to monitor server disk space, RAM and CPU utilization, FLEXlm license utilization, server exhaust temperatures, and WAN and Internet link latency. It can be used to ensure that Web servers are not only answering HTTP queries, but that they're returning the expected pages and haven't been hijacked, for example.
Network and server monitoring is obviously incomplete without notifications. Nagios has a full email/SMS notification engine and an escalation layout that can be used to make intelligent decisions on who and when to notify, which can save plenty of sleep if used correctly. In addition, I've integrated Nagios notifications with Jabber, so the instant an exception is thrown, I get an IM from Nagios detailing the problem in addition to an SMS or email, depending on the escalation settings for that object. The Web GUI can be used to quickly suspend notifications or acknowledge problems when they occur, and it can even record notes entered by admins.
As if this wasn't enough, a mapping function displays all the monitored devices in a logical representation of their placement on the network, with color-coding to show problems as they occur.
The downside to Nagios is the configuration. The config is best done via command line and can present a significant learning curve for newbies, though folks who are comfortable with standard Linux/Unix config files will feel right at home. As with many tools, the capabilities of Nagios are immense, but the effort to take advantage of some of those capabilities is equally significant.
Don't let the complexity discourage you -- Nagios has saved my bacon more times than I can possibly recall. The benefits of the early-warning systems provided by this tool for so many different aspects of the network cannot be overstated. It's easily worth your time and effort.
Icinga started out as a fork of Nagios, but has recently been rewritten as Icinga 2. Both versions are under active development and available today, and Icinga 1.x is backward-compatible with Nagios plug-ins and configurations. Icinga 2 has been developed to be smaller and sleeker, and it offers distributed monitoring and multithreading frameworks that aren't present in Nagios or Icinga 1. You can migrate from Nagios to Icinga 1 and from Icinga 1 to Icinga 2.
Sign up for CIO Asia eNewsletters.