Nagios plug-in development guidelines(4)
Last edited by monicazhang on 2015-10-30 21:22. Continued on 2015-10-30.
If temp files are needed, make sure that the plugin fails cleanly if the file can't be written (e.g., too few file handles, out of disk space, incorrect permissions) and deletes the temp file when processing is complete.
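The temp-file advice above can be sketched in C roughly as follows. This is a minimal illustration, not code from the Nagios plugins: the helper name write_scratch and the file name prefix are hypothetical, and the return values follow the standard Nagios exit codes (0 = OK, 3 = UNKNOWN).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define STATE_OK 0
#define STATE_UNKNOWN 3

/* Hypothetical helper: write `data` to a scratch file inside `dir`,
 * then remove the file. Returns a Nagios state code. */
int write_scratch(const char *dir, const char *data)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/check_demo.XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof path) {
        printf("UNKNOWN - temp path too long\n");
        return STATE_UNKNOWN;
    }

    /* mkstemp creates and opens a uniquely named file; it fails on
     * bad permissions, a missing directory, no free descriptors, etc. */
    int fd = mkstemp(path);
    if (fd < 0) {
        printf("UNKNOWN - cannot create temp file in %s\n", dir);
        return STATE_UNKNOWN;
    }

    int status = STATE_OK;
    size_t len = strlen(data);
    if (write(fd, data, len) != (ssize_t)len) {
        /* e.g. out of disk space: report it instead of carrying on */
        printf("UNKNOWN - cannot write temp file\n");
        status = STATE_UNKNOWN;
    }

    /* Always clean up, whether or not the write succeeded. */
    close(fd);
    unlink(path);
    return status;
}
```

A caller would typically pass the result straight to exit(), so that a failed write surfaces as UNKNOWN rather than as a misleading OK.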
3.4. Don't be tricked into following symlinks
If your plugin opens any files, take steps to ensure that you are not following a symlink to another location on the system.
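One possible way to refuse symlinks in a C plugin is the POSIX O_NOFOLLOW flag, sketched below. The helper name open_no_symlink is hypothetical. Note that O_NOFOLLOW only protects the final path component; symlinks in parent directories are still followed, so this is a partial safeguard rather than a complete one.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only, refusing to follow a
 * symlink in the last path component. Returns a file descriptor,
 * or -1 if the path is a symlink, is not a regular file, or cannot
 * be opened. */
int open_no_symlink(const char *path)
{
    /* With O_NOFOLLOW, open() fails if `path` itself is a symlink. */
    int fd = open(path, O_RDONLY | O_NOFOLLOW);
    if (fd < 0)
        return -1;

    /* Check the object we actually opened, not the name. */
    struct stat st;
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        return -1;
    }
    return fd;
}
```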
3.5. Validate all input
Use the routines in utils.c or utils.pm, and write more as needed.
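As an illustration of "write more as needed", here is a small validator of the kind a plugin might add for a port argument. It is a sketch only: parse_port is a hypothetical name and is not one of the routines in utils.c.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical validator: parse a TCP port number from `arg`.
 * Returns 1 and stores the value in *port on success, 0 otherwise. */
int parse_port(const char *arg, int *port)
{
    /* Reject NULL, empty strings, signs and leading whitespace. */
    if (arg == NULL || !isdigit((unsigned char)*arg))
        return 0;

    char *end = NULL;
    errno = 0;
    long value = strtol(arg, &end, 10);

    /* Reject overflow and trailing garbage such as "80x". */
    if (errno != 0 || *end != '\0')
        return 0;

    /* Reject values outside the valid port range. */
    if (value < 1 || value > 65535)
        return 0;

    *port = (int)value;
    return 1;
}
```

On a failed validation, a plugin would normally print its usage statement and exit with the UNKNOWN state rather than continue with a guessed value.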
4. Perl Plugins
Perl plugins are coded a little more defensively than other plugins because of embedded Perl. When configured as such, embedded Perl Nagios (ePN) requires stricter use of some of Perl's features. This section outlines some of the steps needed to use ePN effectively.
1. Do not use BEGIN and END blocks, since with embedded Perl (ePN) they are called only once (when Nagios starts and shuts down). In particular, do not use BEGIN blocks to initialize variables.
2. To use utils.pm, you need to provide a full path to the module in order for it to work, e.g.:
use lib "/usr/local/nagios/libexec";
use utils qw(...);
3. Perl scripts should be called with "-w".
4. All Perl plugins must compile cleanly under "use strict", i.e. at least use explicit package names, as in "$main::x", or predeclare every variable. Explicitly initialize each variable in use. Otherwise, with caching enabled, the plugin will not be recompiled each time, and therefore Perl will not reinitialize all the variables. All old variable values will still be in effect.
5. Do not use <DATA> handles (these simply do not compile under ePN).
6. Do not use global variables in named subroutines. This is bad practice anyway, but with ePN the compiler will report an error "<global_var> will not stay shared ..". Values used by subroutines should be passed in the argument list.
7. If writing to a file (perhaps recording performance data), explicitly close it. The plugin never calls exit; that is caught by p1.pl, so output streams are never closed.
8. As in Section 5, all plugins need to monitor their runtime, especially if they are using network resources. Use of alarm is recommended, noting that some Perl modules (e.g. LWP) manage timers, so that an alarm set by a plugin using such a module is overwritten by the module. (Workarounds are cunning (TM) or using the module's timer.) Plugins may import a default timeout ($TIMEOUT) from utils.pm.
9. Perl plugins should import %ERRORS from utils.pm and then "exit $ERRORS{'OK'}" rather than "exit 0".
5. Runtime Timeouts
Plugins have a very limited runtime - typically 10 sec. As a result, it is very important for plugins to maintain internal code to exit if runtime exceeds a threshold.
All plugins should time out gracefully, not just networking plugins. For instance, df may lock up if you have automounted drives and your network fails, but at first glance, who'd think df could lock up like that? In any case, a plugin that can time out is more error resistant than one that keeps consuming resources.
5.1. Use DEFAULT_SOCKET_TIMEOUT
All network plugins should use DEFAULT_SOCKET_TIMEOUT to time out.
5.2. Add alarms to network plugins
If you write a plugin which communicates with another networked host, you should make sure to set an alarm() in your code that prevents the plugin from hanging due to abnormal socket closures, etc. Nagios takes steps to protect itself against unruly plugins that time out, but any plugins you create should be well behaved on their own.
6. Plugin Options
A well written plugin should have --help as a way to get verbose help. Code and output should try to respect the 80x25 size of a CRT (remember when fixing stuff in the server room!).
6.1. Option Processing
For plugins written in C, we recommend the C standard getopt library for short options. getopt_long is always available.
For plugins written in Perl, we recommend the Getopt::Long module.
Positional arguments are strongly discouraged.
There are a few reserved options that should not be used for other purposes:
-V version (--version)
-h help (--help)
-t timeout (--timeout)
-w warning threshold (--warning)
-c critical threshold (--critical)
-H hostname (--hostname)
-v verbose (--verbose)
In addition to the reserved options above, some other standard options are:
-C SNMP community (--community)
-a authentication password (--authentication)
-l login name (--logname)
-p port or password (--port or --passwd/--password)
-u url or username (--url or --username)
Look at check_pgsql and check_procs to see how I currently think this can work. Standard options are:
The option -V or --version should be present in all plugins. For C plugins it should result in a call to print_revision, a function in utils.c which takes two character string arguments, the command name and the plugin revision.
The -? option, or any other unparsable set of options, should print out a short usage statement. Output should be at most 80 characters wide, and no more than 23 lines should be printed (it should display cleanly on a dumb terminal in a server room).
The option -h or --help should be present in all plugins. In C plugins, it should result in a call to print_help (or equivalent). The function print_help should call print_revision, then print_usage, then should provide detailed help. Help text should fit on an 80-character width display, but may run as many lines as needed.
The option -v or --verbose should be present in all plugins. The user should be allowed to specify -v multiple times to increase the verbosity level, as described in Table 1.
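The reserved options listed above could be wired up with getopt_long along these lines. This is a sketch, not the parser of any actual plugin: struct plugin_opts and parse_args are hypothetical names, the default timeout of 10 is an assumption based on the typical runtime mentioned in Section 5, and a real plugin would validate the numeric arguments and call print_revision, print_help or print_usage as described above.

```c
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical container for the reserved options. */
struct plugin_opts {
    int show_version;      /* -V, --version */
    int show_help;         /* -h, --help */
    int verbose;           /* -v, --verbose (may be repeated) */
    int timeout;           /* -t, --timeout */
    const char *hostname;  /* -H, --hostname */
    const char *warning;   /* -w, --warning */
    const char *critical;  /* -c, --critical */
};

/* Returns 0 on success, -1 on an unparsable option (including -?),
 * in which case the caller should print a short usage statement. */
int parse_args(int argc, char **argv, struct plugin_opts *o)
{
    static const struct option longopts[] = {
        {"version",  no_argument,       NULL, 'V'},
        {"help",     no_argument,       NULL, 'h'},
        {"timeout",  required_argument, NULL, 't'},
        {"warning",  required_argument, NULL, 'w'},
        {"critical", required_argument, NULL, 'c'},
        {"hostname", required_argument, NULL, 'H'},
        {"verbose",  no_argument,       NULL, 'v'},
        {NULL, 0, NULL, 0}
    };
    int c;

    *o = (struct plugin_opts){0};
    o->timeout = 10; /* assumption: default taken from Section 5 */
    optind = 1;      /* restart scanning if called more than once */

    while ((c = getopt_long(argc, argv, "Vht:w:c:H:v", longopts, NULL)) != -1) {
        switch (c) {
        case 'V': o->show_version = 1; break;
        case 'h': o->show_help = 1; break;
        case 't': o->timeout = atoi(optarg); break; /* validate in real code */
        case 'w': o->warning = optarg; break;
        case 'c': o->critical = optarg; break;
        case 'H': o->hostname = optarg; break;
        case 'v': o->verbose++; break; /* each -v raises verbosity */
        default:  return -1;
        }
    }
    return 0;
}
```

Counting -v instead of treating it as a flag is what lets the user write -vv or -v -v to raise the verbosity level.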
6.2. Plugins with more than one type of threshold, or with threshold ranges
Old style was to do things like -ct for critical time and -cv for critical value. That goes out the window with POSIX getopt. The allowable alternatives are:
1. long options like -critical-time (or -ct and -cv, I suppose).
To be continued: http://ITIL-foundation.cn/thread-53036-1-1.html