The Watcher Knows

It’s 3 a.m. Do you know what your servers are doing? Is your web server struggling to handle its load? Is one of your application servers running out of hard disk space or memory? Is that legacy Windows server infected with the worm of the week? ZABBIX knows.
Crashes, faults, and worms: it’s the stuff system administrators’ nightmares are made of. But open source has a cure for your insomnia: the ZABBIX network monitoring system. Sweet dreams are made of this.
If your network is typical, there are probably a couple dozen essential services spread across several servers. You probably host DNS, firewalls, routers, email, web, and database servers. You probably also have legacy applications, such as the one finance uses to do payroll. You probably have some home-made applications like ticket tracking software, a database-driven web site, or even applications that you sell and host. And let’s not forget those “under the radar” services installed by your users, like the print server installed on someone’s workstation to allow network printing to a 15-year-old laser printer, or a CVS server set up by one of your developers on her machine.
And if your job as a system administrator is typical, it’s hard to find time to check in on all those servers and services to make sure that things are running smoothly. Moreover, it’s much more fun to be building a cluster than monitoring servers. Luckily, the computer is good at boring repetitive tasks, and unlike a junior administrator, it rarely complains. Not only that, the computer is a few thousand times faster than any human.
If you go looking for monitoring software, you’ll find a huge array of options, but largely, all of them can be divided into two basic categories: real-time monitoring and historical monitoring. Real-time monitoring software is designed to tell you right away when something needs your attention. For example, a real-time system might send you an email when the disk is almost full on your NFS server. But most system administrators also want something more than just real-time alerts. They need to know how quickly that file server is filling up and therefore when to buy new disks. That’s the job of historical monitoring software.
But what if you want both sets of features? And why wouldn’t you? All that cluster hardware is gathering dust, waiting for you to assemble it.
Lucky for you, open source has a solution. Recently, the ZABBIX (http://www.zabbix.com) monitoring system reached Version 1.0. It offers SNMP support, native clients for a variety of operating systems, and both real-time and historical monitoring features. ZABBIX also provides other critical features, such as the ability to set up dependencies between services and a web interface to configure custom graphs and network maps. Once you download, install, and use ZABBIX, you’ll hardly be able to imagine life without it. [For a description of why the author chose ZABBIX over other solutions, see the sidebar “Why Humantech chose ZABBIX.”]

Installing the ZABBIX Server

Fundamentally, the ZABBIX server software consists of a server daemon and a web interface. Both services are attached to a single SQL database, and do not communicate with each other directly.
The zabbix_suckerd daemon is the central ZABBIX process. It periodically polls clients running zabbix_agentd for updated statistics and saves the information into a database. Beyond that, the zabbix_suckerd daemon also collects SNMP data and performs housekeeping functions such as purging old data.
The other core piece of ZABBIX is the PHP-based web interface. The web interface is used for configuration and administration, as well as for easy access to data about monitored services.
ZABBIX also has other components, such as a standalone (non-inetd) ZABBIX agent and other daemons, but you usually don’t need to use those. In fact, ZABBIX is much easier to use if you stick with the core services. To install the core services, you should download the ZABBIX tarballs from http://www.zabbix.com/download.php and compile and install the code.
But before you do that, you need to satisfy several dependencies, all of them very common and probably already available in your favorite Linux distribution. Before starting the ZABBIX install, check to see if your server has Apache 1.3.12 (or later), either MySQL 3.22 or PostgreSQL 7.0.2 (or later), PHP 4 (running as an Apache module), PHP-GD and NET-SNMP, and all of the development libraries required for the latter two packages. If you are missing any of the dependencies, download and install them before proceeding.
Once you have all the prerequisite packages installed, create a new user named zabbix for all of the ZABBIX services. It is highly recommended that you not run ZABBIX as root, bin, or any other account with special privileges. In fact, ZABBIX daemons started as root automatically switch to the zabbix account, and fail to start if it does not exist.
Now, create a database named zabbix, and a user account in the database system with full permissions to the zabbix database. (This article assumes that you’re using MySQL. If you prefer to use PostgreSQL, see the ZABBIX web site for specific instructions for that database.)
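The database setup described above can be scripted. What follows is only a minimal sketch under stated assumptions: a MySQL release of this era (which accepts the GRANT ... IDENTIFIED BY form), a database account named zabbix that connects from localhost, and a placeholder password. The function name zabbix_db_sql is hypothetical. The function only prints the SQL, so you can review it before handing it to mysql.

```shell
#!/bin/sh
# Print the SQL that creates the zabbix database and a database
# account with full permissions on it. Nothing is executed here.
# Assumptions: account 'zabbix'@'localhost', older MySQL GRANT syntax.
zabbix_db_sql() {
    db_password=$1
    cat <<EOF
CREATE DATABASE zabbix;
GRANT ALL PRIVILEGES ON zabbix.* TO 'zabbix'@'localhost' IDENTIFIED BY '$db_password';
FLUSH PRIVILEGES;
EOF
}

# Example: print the statements for review (replace the placeholder).
zabbix_db_sql 'change-me'
```

Once the statements look right, you could pipe them to the client as a MySQL administrator, for example with zabbix_db_sql 'change-me' | mysql -u root -p. On most Linux distributions, the matching system account can be created as root with useradd zabbix.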
Next, extract the ZABBIX tarball to a temporary directory. In that directory, you should find a directory named create that contains the schema and initial data for the zabbix database. To install the schema and data into your database, open a terminal window, go to the ZABBIX directory, and type:
$ mysql -u username -ppassword zabbix < create/mysql/schema.sql
$ mysql -u username -ppassword zabbix < create/data/data.sql
Once the database is set up you are ready to configure and compile the software:
$ ./configure --with-mysql --with-net-snmp
$ make
(If you want to force everything to be statically linked, say, because you’re going to install the resulting binaries on another box, add the --enable-static flag to the ./configure command.)
Once the compilation is complete, copy the binaries from bin (in your temporary ZABBIX directory) to /usr/local/bin.
Next, create an /etc/zabbix directory, and copy all of the .conf files from misc/conf in your ZABBIX directory to /etc/zabbix. Edit the zabbix_suckerd.conf and zabbix_trapperd.conf files to specify your database name and password.
Then, as root, start the two core ZABBIX services:
# cd /usr/local/bin
# ./zabbix_suckerd
# ./zabbix_trapperd
If you also want to monitor resources on the machine running the ZABBIX server itself, also start the ZABBIX agent with ./zabbix_agentd.
Finally, you’re now ready to configure ZABBIX’s web interface. Start by editing the frontends/php/includes/db.inc file to reflect your database login information:
$DB_SERVER = "localhost";
$DB_DATABASE = "zabbix";
$DB_PWD = "I ";
Then, copy all of the files in frontends/php to a directory in your Apache’s DocumentRoot. Depending on your Apache configuration, this could be /home/zabbix/public_html/ or /var/www/html/zabbix.
Now open up a web browser and type in http://localhost/zabbix (or your URL). If everything worked, you should be able to log in to the web-based ZABBIX administration tool.
The default login is Admin with a blank password. Once you log in, set up a secure password for the Admin account. To do this, click on configuration and then users. Click the change link next to the Admin user, and on the next page, enter a secure password and leave everything else the same. You should also click on media, and enter email server information so ZABBIX can send you email alerts.

Installing the ZABBIX Client on Linux and Unix

You can monitor the availability of simple services like HTTP, FTP, SMTP, IMAP, and NNTP directly from ZABBIX without installing the ZABBIX client. But if you want to track network I/O, processor utilization, free disk space, and other statistics of other Linux and Unix machines on your network, you should install the ZABBIX client software on each of those machines. The ZABBIX client agents are very efficient, using very few system resources.
The client install is quite simple. You don’t need Apache, PHP, MySQL, or even Net-SNMP, and there’s a good chance that pre-compiled binaries for whatever *nix system you’re running are already available. (As this issue of Linux Magazine goes to print, the download page of the ZABBIX site offers client software for Solaris 8.0 and 9, SUSE Linux 9.0, FreeBSD 4.x, HP-UX 11.00, Slackware 10.0, and Tru64 5.1.)
The first step is to create a non-privileged ZABBIX account on each client machine. Just like the server version, the standard practice is to use the name zabbix.
If your server and client run the same operating system and version, and you statically linked everything when setting up the server, you can just copy the zabbix_agentd and zabbix_agentd.conf files to /opt/zabbix/bin/ and /etc/zabbix/, respectively.
If there isn’t a pre-compiled agent available for your OS version, or you don’t want statically linked libraries, just download the same tarball as you did for the server install. Once you extract the tarball in a temporary directory, all you need to do to compile the ZABBIX client is to go to that directory and run:
$ ./configure && make
The make step leaves binaries in ZABBIX’s bin directory, ready to copy to /usr/local/bin/. Also copy the zabbix_agentd.conf file (from ZABBIX’s misc/conf directory) to /etc/zabbix/. Edit line 8 of the /etc/zabbix/zabbix_agentd.conf file to specify the IP address of your ZABBIX monitoring server.
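If you have many clients to set up, the configuration edit can be scripted rather than done by hand. The sketch below assumes that the line in question is the agent’s Server= setting; the function name set_agent_server is hypothetical, and the IP address in the usage note is only a placeholder.

```shell
#!/bin/sh
# Point a ZABBIX agent configuration file at a monitoring server.
# Usage: set_agent_server CONF_FILE SERVER_IP
# Assumption: the file already contains a line starting with "Server=".
# If no such line exists, the file is left unchanged.
set_agent_server() {
    conf=$1
    server_ip=$2
    tmp="$conf.tmp.$$"
    # Write the edited copy to a temporary file, then replace the
    # original only if sed succeeded.
    sed "s/^Server=.*/Server=$server_ip/" "$conf" > "$tmp" && mv "$tmp" "$conf"
}
```

For example, set_agent_server /etc/zabbix/zabbix_agentd.conf 192.168.1.10 would rewrite the Server= line and leave every other line as it was. Check the result by eye before starting the agent.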

Installing the ZABBIX client on Windows

Do some of your users depend on legacy applications running on Windows servers? No problem! ZABBIX has a native 32-bit Windows agent. In the standard ZABBIX tarball, the bin directory contains the file ZabbixW32.exe. Just copy that file, along with a sample zabbix_agentd.conf file, to the Windows server to be monitored.
Copy the configuration file to wherever you need it stored. Edit the configuration file to include the IP address of your ZABBIX server. Finally, open up a DOS prompt and type the following, giving the full path to your configuration file after --config:
C:\> ZabbixW32.exe --config  install
C:\> ZabbixW32.exe start
ZABBIX is painless, even on Windows.

Configuring Monitored Hosts on the ZABBIX server

With the software installed and running on the server and clients, it’s time to start monitoring. Log into the web interface, click Configure and then Hosts. (Figure One shows a sample configuration page.)
FIGURE ONE: Configuring methods to send messages to administrators

You can now add hosts by entering the necessary information in the form at the bottom of the page. It’s best to use the domain name of the host you are monitoring, or in the case of network equipment, to use the actual IP address of the machine. You can then select one or more groups, which will contain your new host. Select the port that the client is listening on, and one of the appropriate templates from the list. Make sure the status is monitored. When you’re finished, click Add.
If your client is configured correctly, and your ZABBIX server is able to connect to it over the network, it will show up as monitored. If there is a problem connecting to the client, the status changes to unreachable after a few seconds.
To add additional monitored hosts, simply repeat this process. To monitor hosts without installing the ZABBIX client, simply choose the Host.Standalone or SNMP templates.
To configure which items are monitored on a new host, either click the name of the host in the host list or click on items, and then choose your group and host from the dropdown lists at the top. You should see a complete list of the items set up to monitor on that host. Check the boxes you wish to activate, deactivate, or delete, and choose the appropriate button at the bottom of the list.

Triggers and Actions

Triggers are user-definable expressions that evaluate to true, false, or unknown. Actions can also be defined and can occur whenever a trigger goes from true to false, or from false to true. For example, a common action is to send an email alert to an individual or a group, such as the webmasters.
To configure triggers, click Configure and then Triggers and choose the host you want to configure from the host list. You can then enter an expression such as {rt.humantech.com:memory[free]}<16384, which sends notification immediately if the server ever has less than 16 MB of free memory. You can then click on the action link associated with that trigger and determine who is to receive email when the trigger goes from false to true, or true to false, and how often that email is to be repeated.
Another important field to fill out is dependencies. If you set up your dependencies properly, you won’t get alerts telling you that all your remote servers are down just because your VPN connection to the remote office went down.
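To make the trigger and action model concrete, here is a small shell sketch of the logic only. It is not ZABBIX code: the function names and the sample readings are invented for illustration. It evaluates a “free memory below 16384 KB” condition against a series of readings and reports only when the state changes, which is when an action would fire.

```shell
#!/bin/sh
# trigger_state FREE_KB THRESHOLD_KB
# Prints "true" when free memory is below the threshold, else "false".
trigger_state() {
    if [ "$1" -lt "$2" ]; then
        echo true
    else
        echo false
    fi
}

# action_needed PREVIOUS CURRENT
# Succeeds (exit status 0) only when the trigger state has changed.
action_needed() {
    [ "$1" != "$2" ]
}

# Walk through some made-up free-memory readings (in KB).
previous=false
for free_kb in 65536 20000 12000 9000 30000; do
    current=$(trigger_state "$free_kb" 16384)
    if action_needed "$previous" "$current"; then
        echo "free=${free_kb}KB: trigger went from $previous to $current, send alert"
    fi
    previous=$current
done
```

With these readings, an alert is reported when free memory first drops below the threshold and again when it recovers. The reading of 9000 KB produces nothing, because the trigger was already true.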

Using Graphs and Screens

With ZABBIX, it’s easy to set up graphs to monitor trends on your servers. To set up a new graph, click Configure and Graphs, and enter the name and size of your new graph in the fields at the bottom of the screen. Then click the name of the graph to update the parameters you want to graph. You can use graphs to trace memory usage (to look for leaks, for instance), to find performance bottlenecks, or to predict when a fileshare is going to fill up.
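Predicting when a fileshare fills up amounts to extending the line on the graph. As a rough sketch, assuming steady linear growth between two readings you take off a graph, the estimate can be worked out as follows. The function name days_until_full and the sample figures are hypothetical.

```shell
#!/bin/sh
# days_until_full TOTAL_KB USED_BEFORE_KB USED_NOW_KB DAYS_BETWEEN
# Estimate how many days remain until a filesystem is full, assuming
# usage keeps growing at the same average rate as between two samples.
days_until_full() {
    awk -v total="$1" -v before="$2" -v now="$3" -v days="$4" 'BEGIN {
        grown = now - before
        if (grown <= 0) { print "never"; exit }
        # remaining space divided by average growth per day
        printf "%d\n", (total - now) * days / grown
    }'
}

# Example: a 100 GB share that grew from 60 GB to 70 GB used in 30 days.
days_until_full 100000000 60000000 70000000 30
```

Real usage is rarely this regular, so treat the result as a prompt to look at the graph more closely, not as a purchase date.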
Another important time saver is the screens feature, shown in Figure Two. A ZABBIX screen allows you to group together related graphs and network maps on a single screen. The screens link allows you to set up a new screen with several rows and columns placing a graph or network map in each cell. You can set up screens to give you a quick overview of what is happening on your network, or to monitor several different parameters on a single server to determine if you have any bottlenecks.
FIGURE TWO: Condense many graphs into a single ZABBIX screen

Tuning the Demands of Monitoring

When monitoring another host is so simple, it can be tempting to monitor everything. And while you can monitor 100 different parameters on each machine on your network, if you do, you may start to run into performance problems.
A best practice is to monitor everything you need and no more. In practice, this generally means monitoring servers and important services, but not client machines. It also means not monitoring the same thing six different ways. While you can monitor and record the 1-, 5-, and 15-minute average CPU load on a server, you definitely don’t need to monitor all three of them every thirty seconds!
But what counts as a server? Should you only monitor the officially supported servers in your data center? Or should you include the Mac OS X file server shared by the graphic designers? What about a Windows file share on one of the VPs’ desktops? Opinions certainly vary, but a good rule of thumb is that any computer that provides a service for more than one user is a server. Of course, non-mission-critical servers can be monitored less frequently, but should be monitored nonetheless.
ZABBIX allows you to define new templates, and you can use this feature extensively to monitor the minimum information you need for each kind of server you manage. For example, you can build a template for MySQL, NFS, Samba, email, web, and Java application servers.
With ZABBIX, there are two likely performance bottlenecks: server resources (CPU time and database size) and bandwidth usage. For local network monitoring, you’ll likely run into server resource issues before you even begin to touch bandwidth limitations. However, when monitoring services over slower WAN connections, it’s important to remember that monitoring uses up bandwidth.
The ZABBIX manual says that a ZABBIX server (assuming a 1.5 GHz Pentium 4 with 256 MB RAM and a single IDE hard drive) can handle over 200 parameters per second. If you are using a less powerful box, running PostgreSQL rather than MySQL, or if your ZABBIX server is hosting other services, you’ll definitely need to limit the total number of parameters accordingly.
But let’s say you do have a server capable of running over 200 parameters per second. If each parameter is polled every 30 seconds (the interval these figures imply), you should theoretically be able to handle 600 servers when monitoring only 10 parameters per server. But if you leave all 100+ parameters in the Unix client template turned on, you’ll be lucky to keep up with monitoring 60 machines. So, again, if you want to monitor a lot of machines, be careful to monitor only the information you really need.
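The capacity figures above follow from simple arithmetic, sketched here in shell. The 30-second polling interval is an assumption inferred from the numbers (200 parameters per second over 30 seconds is 6,000 checks per cycle), and the function name max_hosts is hypothetical.

```shell
#!/bin/sh
# max_hosts PARAMS_PER_SECOND POLL_INTERVAL_SECONDS PARAMS_PER_HOST
# How many hosts can be polled if the server processes a given number
# of parameters per second and every parameter is checked once per
# polling interval.
max_hosts() {
    echo $(( $1 * $2 / $3 ))
}

max_hosts 200 30 10     # a lean template: 10 parameters per host
max_hosts 200 30 100    # a full template: 100 parameters per host
```

Polling less often raises the ceiling in proportion: at a 60-second interval the same server could, in theory, cover twice as many hosts.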
Equally important, anybody monitoring remote servers or services will want to tune the monitored parameters to fit their bandwidth limitations. The general rule of thumb is that you should never use more than 1% of your total link bandwidth for monitoring services. ZABBIX uses about 50 KBps when processing 150 metrics per second. At that rate, you probably shouldn’t monitor more than 5 metrics per second over a T1 line.
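The same rule of thumb can be applied to any link speed. This sketch makes several assumptions: that “KBps” means kilobytes per second, that a kilobyte is 1,000 bytes, and that a T1 carries 1,544 kilobits per second. The function name max_metrics_per_sec is hypothetical.

```shell
#!/bin/sh
# max_metrics_per_sec LINK_KBIT_PER_SEC [BUDGET_PERCENT]
# Metrics per second that fit within a share (default 1%) of a link,
# using the article's figure of 50 KB/s for 150 metrics per second.
max_metrics_per_sec() {
    awk -v link="$1" -v pct="${2:-1}" 'BEGIN {
        budget = link * 1000 / 8 * pct / 100   # bytes/s allowed for monitoring
        per_metric = 50 * 1000 / 150           # bytes needed per metric
        printf "%d\n", budget / per_metric     # fractions are truncated
    }'
}

# Example: a T1 line with the default 1% budget.
max_metrics_per_sec 1544
```

For a T1, the budget comes to roughly 1.9 KB per second against about a third of a kilobyte per metric, which is where the limit of 5 metrics per second comes from.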
If you’re monitoring web services remotely, you might be able to get away with monitoring just the response time of an HTTP request. This not only tells you that a specific web server is up, it also gives you an idea how well it is responding to user requests. Similarly, if you’re monitoring an email server at a remote location, you might want to regularly check SMTP response times, and check its free disk space, CPU utilization, and free memory less frequently.

Information Is Good

No monitoring software can replace your experience and judgement, but it can make important data more visible, and it can check on your systems far more often and regularly than you can. With ZABBIX, you can also gain insight into server activity, and identify trends, remove bottlenecks, and respond to potential problems with more speed and precision.

Mark Ramm is a Linux enthusiast, the IT Manager at Humantech, Inc., and the lead consultant at Pragmatic System Administration. He uses Linux and other free software to solve problems and create reliable infrastructure for small businesses.
