I'm having trouble compiling NetSaint - What can I do?
Compiling NetSaint on different OSes doesn't really seem to be much of a problem anymore, unless you're missing some string functions... If you're getting errors about the strncat(), strncpy(), or snprintf() functions, you probably don't have the glibc libraries installed on your system. This tends to happen most often on HP-UX and Solaris boxes. I've tried to prevent potential buffer overflows in NetSaint and the CGIs by using these functions, so they are all over the code. If you don't want to install the glibc libraries for some reason, you'll have to find some other way to get everything compiled. If all you're missing is the snprintf() function, you might want to try grabbing the snprintf.c file from http://www.ijs.si/software/snprintf/ and adding it to the Makefiles so that it gets included when you compile things. A few people have mentioned that this version of snprintf does not support the '%f' formatting flag, so you may be out of luck. Sorry.

The statusmap and trends CGIs don't work!
If you compile all the CGIs, but don't find the statusmap CGI or trends CGI (or can't get them to work), you probably don't have the following libraries installed correctly on your system:
The gd library is dependent upon the zlib and png libraries (along with a few others), so you'll have to have those libraries installed on your system before you can install the gd library. Newer versions of the gd library also require that the jpeg library be installed on your system. If you find that the CGIs have not been compiled or do not work properly, make sure you have the gd, png, zlib, and any other required libraries installed on your system, clean out old configuration information, and rerun the configure script as follows:

    make devclean
    ./configure --with-gd-lib=LIBDIR --with-gd-inc=INCDIR [other options...]

Replace LIBDIR with the directory in which the gd library is installed (usually /usr/lib or /usr/local/lib) and replace INCDIR with the directory in which the header files for the gd library are installed (usually /usr/include or /usr/local/include). After you rerun the configure script, make sure to recompile the CGIs and install them in their proper location.
If you're running RedHat Linux and are having a lot of trouble getting things working, I would recommend downloading and installing both the gd and gd-devel RPMs from www.rpmfind.net. Note that other applications that depend on the gd library (PHP, MRTG, etc.) may break when you upgrade, so they may need to be upgraded or rebuilt as well.

No hosts are displayed in the image or VRML world generated by the statusmap or statuswrl CGIs
The most likely cause of this problem is the fact that you haven't supplied any 2-D or 3-D drawing coordinates for the hosts. Although the statusmap CGI can generate an auto-layout of your hosts, it defaults to trying to use user-supplied coordinates. The statuswrl CGI (VRML) will not function if you do not specify 3-D drawing coordinates. Where do you specify drawing coordinates for hosts? In the hostextinfo[] definitions in the CGI configuration file. Note: If you've compiled the CGIs with database support for extended data, the coordinates should be stored in a database table, not the CGI config file. More information on DB support can be found here.

The installation didn't create a libexec/ directory. Where are all the plugins?
If you didn't read the installation documentation carefully and didn't notice the big note on the downloads page, you're most likely going to be wondering where all the plugins are. Quick and easy answer - they are distributed separately from NetSaint, so you'll have to grab them from the downloads page or directly from the SourceForge project page.

"NetSaint process may not be running" warnings in the CGIs
If you are getting erroneous messages about the NetSaint process not running while viewing the CGIs, it's probably due to one of the following items:
The CGIs will not allow you to submit any commands while they think the NetSaint process is not running. This is done primarily to prevent people from accidentally submitting multiple shutdown/restart commands that don't get processed until NetSaint is started at some future time.

Why do I get notifications when hosts are UNREACHABLE?
Easy answer. You enabled the notify_unreachable option in the host definition(s). If you don't want to get notified when a host becomes unreachable, disable this option in the host definition. I get lots of emails asking why NetSaint isn't smart enough to disable notifications for hosts that are unreachable. The fact of the matter is that NetSaint is smart enough to distinguish between DOWN and UNREACHABLE states - you just have to configure your notification options properly.

Hosts are incorrectly listed as being DOWN or UNREACHABLE
This seems to be one of the biggest issues for new users. 99.9% of the time this problem is due to an incorrect command definition for the host check command you specified in the host definition. A major cause of this problem was a syntax change to the command line arguments of the check_ping plugin. You need to make sure that the host check command is using the proper syntax for the version of the check_ping plugin that you have. You can check to see if the command works properly by executing it manually from the command line. Recent versions of the check_ping plugin require that a -p flag be used to specify the number of packets to send. Previous versions of the plugin did not require this flag - that's where the problem lies. Check your version of the plugin to find out what syntax you should be using and then check your host check command definition(s) to make sure they are using the proper syntax.
Important! Just because you have a service that is monitoring ping statistics for a host does not mean that the actual host status is being checked. The status of a host is only checked when a service check results in a non-OK state or if the host was previously down and a service check results in an OK state.
Some symptoms of incorrect host check commands include:

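Running the host check command by hand, as suggested in this answer, can be wrapped in a small helper that shows both the plugin output and the exit status the shell saw. This is only a sketch: run_check is a hypothetical name, and the plugin path in the usage comment assumes an install under /usr/local/netsaint.

```shell
#!/bin/sh
# run_check: hypothetical helper, not part of NetSaint.
# Runs a command exactly as given and reports its output and exit status.
run_check() {
    if output=$("$@" 2>&1); then
        status=0
    else
        status=$?
    fi
    printf 'output: %s\n' "$output"
    printf 'exit status: %s\n' "$status"
    return "$status"
}

# Usage (path and arguments are assumptions; copy them from your own
# host check command definition):
#   run_check /usr/local/netsaint/libexec/check_ping <arguments>
```

If the command fails here, it will fail the same way when NetSaint runs it, so fix the command definition before looking anywhere else.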
When hosts go down, I get notifications about services instead of hosts and the service notifications contain incorrect data
Several people have reported this problem and I spent hours trying to find the problem until I realized it wasn't a bug in the code. If you get service notifications when you should be getting host notifications (and the service notifications you get seem to contain bogus data), check your contact definitions in the host config file. They are most likely incorrect. Make sure that you are not using the same notification command for service and host notification commands. Service and host notifications are very different and make use of macros which are not transferable between each type. Look at the sample host config file provided with NetSaint to see what the contact definitions look like and how the service and host notification commands differ. If you're wondering what macros can be used in either type of notification, look at this table.

Can I monitor a host without defining any services for it?
No, not really. Although you can define a host and not assign any services to it, you will not get the results you are expecting. You must define at least one service for each host you want to monitor. NetSaint is primarily geared towards monitoring services - hosts are really only checked when there are problems or recoveries with services (as noted in the service check scheduling documentation).

Can I add host or service definitions without restarting NetSaint?
No. You must restart NetSaint every time you add hosts or services (or any other type of object definition found in the host config file). Important: If you make configuration changes without restarting NetSaint, you may notice irregularities in the CGIs. Some new hosts may appear, some services may disappear, etc. This does not mean that NetSaint has stopped monitoring the original services and hosts that it started with. This is simply a side effect of the authorization logic in the CGIs, which uses a combination of information stored in the status log and the host config file in deciding what to display.

How can I change the timeout values for service checks?
First you need to identify where the timeout is occurring. Most plugins time out after 10 seconds of not being able to contact a service (FTP, HTTP, etc). If the plugins are timing out after a short period of time, increase the timeout value for the plugin (use an appropriate command line argument). In addition to plugins having timeouts, NetSaint enforces its own timeout value on all service checks that run. By default, this is set to 30 seconds. If the plugin executes for more than 30 seconds, NetSaint will automatically kill it off and return a critical error for that service. If you see entries in the log file that say a service check timed out, this may be your problem. You can adjust the maximum timeout value for service checks by using the service_check_timeout directive. As a side note, there are also directives for setting the maximum timeout for host checks, notifications, event handlers, and the ocsp command.
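Before changing anything it helps to see what the directive is currently set to. A minimal sketch, assuming the main configuration file uses plain name=value lines with # for comments; get_directive is a hypothetical helper, and the file path in the usage comment is the example path used elsewhere in this FAQ.

```shell
#!/bin/sh
# get_directive: hypothetical helper, not part of NetSaint.
# Prints the value of a name=value directive from a config file.
# If the directive appears more than once, the last occurrence is printed.
#   $1 = config file, $2 = directive name
get_directive() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Usage:
#   get_directive /usr/local/netsaint/etc/netsaint.cfg service_check_timeout
```

An empty result means the directive is not set in that file, in which case the built-in default described above would apply.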

"Return code of x is out of bounds" errors
If the plugin output for a host or service check gives a "(Return code of x is out of bounds)" error it usually means one of two things:

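When chasing this error it can help to translate the exit status of a manually run check into words. The sketch below assumes the usual plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL) and the standard shell statuses 126 (command found but not executable) and 127 (command not found). describe_status is a hypothetical helper; any other value, including whatever your plugin version uses for UNKNOWN, is simply reported as-is.

```shell
#!/bin/sh
# describe_status: hypothetical helper, not part of NetSaint.
# Translates an exit status into a short description.
#   $1 = exit status of a manually run check command
describe_status() {
    case "$1" in
        0)   echo "OK" ;;
        1)   echo "WARNING" ;;
        2)   echo "CRITICAL" ;;
        126) echo "command found but not executable" ;;
        127) echo "command not found" ;;
        *)   echo "other return code: $1" ;;
    esac
}

# Usage: run the check by hand, then pass the shell's status along:
#   /path/to/plugin <arguments>; describe_status $?
```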
I get error messages when email notifications should get sent out
If you're seeing messages like "mail: Null names are not allowed", "You must specify direct recipients with -s, -c, or -b.", or something similar, you've probably got an error in your notification command definitions. Make sure that the syntax used to call /bin/mail (or whatever/wherever your mail program happens to be) in your notification command definitions is correct for your system.

Debugging "unknown variable" errors during configuration file verification or runtime
When trying to run NetSaint or verify your configuration file data using the -v argument, NetSaint may print out a message like "Error in configuration file 'xxxxxxx.cfg' - Line 34 (Unknown variable)". A few simple checks will usually resolve this problem...

How do I run multiple instances of NetSaint on the same machine?
You can run multiple instances of NetSaint on the same machine, if you ensure that the following variables are unique for each instance of NetSaint...
If you are using the web interface, you will have to set up separate directories to hold the CGIs for each instance of NetSaint and create appropriate script aliases in your web server configuration file. This is necessary because the CGI configuration file must be unique for each setup of CGIs, as it contains a reference to which main configuration file the CGIs should read. One last thing you should check is your init script (if you're using one). The init script should start, stop, restart, and reload all copies of NetSaint (if that's what you want it to do).

When I access the CGIs I don't see everything I should or I get authorization errors...
If you believe you are unable to see all the information in the CGIs or if you are getting authorization errors, you probably haven't configured the web server to require authentication or haven't set up authorization correctly. See the documentation on authentication and authorization in the CGIs here.

Where can I find the traceroute and daemonchk CGIs?
The traceroute and daemonchk CGIs are now included in the contrib/ subdirectory of the main NetSaint distribution.

How do I require users to authenticate before accessing the web interface?
See the documentation on authentication and authorization in the CGIs here.

How do I get those pretty pretty host icons to display in my CGIs?
If you want to associate images with particular hosts for use in the status, status map, status world, and extended information CGIs, you must define extended host information entries in your CGI configuration file.

I'm getting errors when attempting to commit commands to NetSaint via the command CGI
If you are getting 'Could not open command file somefile for update' errors when attempting to commit commands to NetSaint via the command CGI, the most likely problem is with directory and/or file permissions. Here is what you can do to fix it...

NetSaint shuts down with warnings about permissions on the command file
If NetSaint is shutting itself down after it processes external commands and you get warnings in the log file about incorrect permissions on the command file, make sure to read the directions found here.

How do I monitor remote host information?
Several people have asked how to use various plugins that check information on the local host to report information from remote hosts. Various methods for doing this are described below. If you need to actually execute a plugin on a remote host and get the results back, you can use one of the following methods...
If all you need is to check disk space, etc. on a remote host, you can use one of the methods below...

How can I monitor Windows NT servers?
Yes, you can monitor NT servers with NetSaint. There are basically two ways it can currently be done...

SNMP
The good news is that NT has a lot of performance data that you can monitor. The bad news is that it's difficult to do. Your best bet is probably going to be to install SNMP services on all your NT boxes. In order to expose NT performance counters for monitoring, you'll have to run the SNMP service on all servers you want to monitor. You'll also have to install any necessary performance MIBs for the services you want to monitor. I believe these can be found in the NT Resource Kit or in various server admin packages. If you're feeling extra lucky you can try to search the Microsoft site for the terms SNMP and MIB and maybe you'll find something... You can search the MRTG mailing list archives for more information on configuring NT servers to expose various performance counters via SNMP. I know this has been discussed in the past, as many people are graphing various NT performance statistics using MRTG. In fact, somebody from Microsoft is actually doing it - you can find their web page at http://snmpboy.rte.microsoft.com/. Once you've actually got the SNMP stuff working, you can use the check_snmp plugin to query your NT servers and generate alarms.

NSSERVICER Addon
Jan Christian Kaldestad and Hallstein Lohne have written the nsservicer addon for monitoring NT servers. The addon includes a service that runs on your NT servers and several plugins that run from the NetSaint host. The nsservicer addon is capable of monitoring the event log, disk usage, process usage, and other info. You can find the addon in the contrib section of the downloads page.

How can I monitor Novell Netware servers?
You can monitor basic stats on your Novell server like disk usage, user connections, LRU sitting time, cache buffers, long term cache hits, and processor load by using the check_nwstat plugin (which is included in the main plugin distribution). In order for the plugin to work, you have to install and run James Drew's MRTGEXT NLM on your Novell server. The NLM can be obtained here.

Can NetSaint send SNMP traps to management hosts?
Yes, but not directly. NetSaint relies on plugins to handle the gathering of service and host information and event handler scripts to handle events that occur with services and hosts. If you want to have NetSaint send an SNMP trap to a management host in the event that a particular service has a problem, you will have to write a service event handler script and add it to the event_handler option of the service definition. If you have the UCD-SNMP package installed on your host, you could have the script call the snmptrap command to actually send a trap message, depending on what type of service event occurred. Look at the example event handler script to get a better idea of how to write a script.

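A rough sketch of such a service event handler follows. It assumes the command definition passes the service state, the state type, and the host name as the first three arguments (for example via macros such as $SERVICESTATE$, $STATETYPE$ and $HOSTNAME$; check the macro table for the exact names in your version), and that the values look like CRITICAL, OK, HARD and SOFT. send_trap is a hypothetical stub: fill it in with the snmptrap invocation that matches your UCD-SNMP installation, since its arguments are not covered here.

```shell
#!/bin/sh
# Hypothetical stub: replace the echo with your real snmptrap command line.
send_trap() {
    echo "trap: $*"
}

# handle_service_event: hypothetical name for the handler logic.
#   $1 = service state, $2 = state type, $3 = host name (all assumptions)
handle_service_event() {
    case "$1" in
        CRITICAL)
            # Only act once the problem is confirmed (HARD state), so that
            # checks still being retried do not generate traps.
            if [ "$2" = "HARD" ]; then
                send_trap "service problem on $3"
            fi
            ;;
        OK)
            send_trap "service recovered on $3"
            ;;
        *)
            # Other states: nothing to do in this sketch.
            :
            ;;
    esac
    return 0
}

# When installed as a script, pass the arguments straight through:
#   handle_service_event "$@"
```

Keeping the trap command in one function means the decision logic can be tried out from the command line before any real traps are sent.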
Can NetSaint receive SNMP traps?
Yes, but not directly. NetSaint is not designed to be a replacement for a full-blown SNMP management system, but you can configure it to generate alerts based on SNMP traps that are received by some host on your network. If you have the UCD-SNMP (now called NET-SNMP) package installed on a host on your network, you can have the snmptrapd daemon route SNMP traps to NetSaint using passive checks. More information on doing this can be found here.

Can NetSaint log host and service events to an external database?
Not directly, but this can be done fairly easily. You'll probably want to define global host and service event handlers to do this. The global event handlers could call a script which inserts the appropriate event information into a database of your choosing. This would allow you to run queries and generate more detailed reports than what are available in the CGIs.

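As an illustration, the script called by a global service event handler might turn its arguments into an SQL statement and hand it to whatever database client you use. Everything specific here is hypothetical: the service_events table, its columns, and the argument order are assumptions made for the sketch, not something NetSaint defines.

```shell
#!/bin/sh
# sql_quote: escape single quotes for use inside an SQL string literal.
sql_quote() {
    printf '%s' "$1" | sed "s/'/''/g"
}

# service_event_sql: hypothetical helper that prints one INSERT statement.
#   $1 = host name, $2 = service description, $3 = state, $4 = plugin output
service_event_sql() {
    printf "INSERT INTO service_events (host, service, state, output) VALUES ('%s', '%s', '%s', '%s');\n" \
        "$(sql_quote "$1")" "$(sql_quote "$2")" "$(sql_quote "$3")" "$(sql_quote "$4")"
}

# Usage: pipe the statement into your database's command line client:
#   service_event_sql "$@" | <your database client>
```

Printing the statement rather than running it keeps the sketch independent of any particular database, and lets you check the generated SQL before anything is inserted.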
Something isn't working properly - How can I track down the problem?
I've worked in tech support for a few years and have spent my share of time on a helpdesk. Most people are vague when they report a problem and have no desire whatsoever to try and track down the problem - they just want you to fix it now. I hope you are not that type of person. NetSaint is relatively new and is probably chock full of bugs, so things will not always work properly. If you suspect that either the service check or notification routines are not working, here are a few things you can do to try and track down the problem...

The first thing you should do is verify your configuration data by running NetSaint with the -v option. Example:

    /usr/local/netsaint/bin/netsaint -v /usr/local/netsaint/etc/netsaint.cfg

If no errors are found, proceed to the next steps. If NetSaint reports some error, go back and fix your configuration files.

The next step will take more time, but will give you more information on what is going on inside of NetSaint. When I first developed NetSaint I added a lot of debugging code to help me track down problems. I still use that code when I add new features or track down bugs myself. Here is how to use the debugging code...

Reconfigure NetSaint and enable one or more debug options as follows, replacing the "--enable-DEBUGx" with one or more of the values from the table below:

    ./configure --prefix=/your/netsaint/directory --enable-DEBUGx

Debugging Options

Recompile NetSaint.

Verify your configuration data again - you'll see a lot more information this time if you have enabled the DEBUG1 option. Try redirecting output to a file so that you can view or print it at a later time.

If you have defined either the DEBUG3 or DEBUG4 options, run NetSaint as a foreground process and start monitoring your services. Example:

    /usr/local/netsaint/bin/netsaint /usr/local/netsaint/etc/netsaint.cfg

Kill NetSaint at an appropriate point (e.g. after a service check fails) and look through the output. It should help you track down where the problem is occurring. You may want to redirect the output to a file to make it easier to review it.

Some code tweaking may be necessary on your part in order to fix things. Let me know if you have to make any such alterations so I can include the fix in future releases.

If you are unable to determine or fix the problem on your own, email me the following items (give me some warning if you're planning on sending a large attachment):

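The advice in this answer about redirecting output to a file can be wrapped in a small helper so that verification and foreground runs leave a log behind. A sketch only: capture_run is a hypothetical name, the log file location is an arbitrary choice, and the NetSaint paths in the usage comment are the example paths from this FAQ.

```shell
#!/bin/sh
# capture_run: hypothetical helper, not part of NetSaint.
# Runs a command, saving both stdout and stderr to a log file.
# The exit status of the command is passed back to the caller.
#   $1 = log file, remaining arguments = command to run
capture_run() {
    log=$1
    shift
    "$@" > "$log" 2>&1
}

# Usage:
#   capture_run /tmp/netsaint-verify.log \
#       /usr/local/netsaint/bin/netsaint -v /usr/local/netsaint/etc/netsaint.cfg
```

Capturing stderr along with stdout matters here, since error messages would otherwise scroll past on the terminal instead of ending up in the file.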