Check leak

Version 3 (Laurent Defert, 11/15/2009 12:14 am)

h1. Overview

check_leak is a "Nagios":http://www.nagios.org/ plugin written in Perl that monitors the memory usage of processes. It emits warning and critical notifications ahead of the time at which memory usage is predicted to reach a configured limit.

h1. Getting the source code

Clone the git repository with the command:

<pre>
git clone git://piggledy.org/check_leak
</pre>

h1. Documentation

Arguments:

<pre>
Syntax:

	check_leak -a
	check_leak -m <mem> -w <warning time> -c <critical time>

	Options:
	-a		Show all leaking processes
	-m		Memory limit of the system, in MB
	-w		Emit a warning notification this many hours before used memory is predicted to reach the memory limit
	-c		Emit a critical notification this many hours before used memory is predicted to reach the memory limit
</pre>
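
The way the -m, -w and -c options fit together can be sketched as follows. This is a minimal Python illustration, not the plugin's Perl code: the function names @hours_until_limit@ and @nagios_status@ are hypothetical, and it assumes that a single leak rate in bytes per second is already known and that the -m value counts units of 1024 * 1024 bytes. The return values 0, 1 and 2 are the standard Nagios plugin codes for OK, WARNING and CRITICAL.

```python
import math

MB = 1024 * 1024  # assumption: one unit of -m is 1024 * 1024 bytes


def hours_until_limit(used_bytes, leak_rate_bps, limit_mb):
    """Hours before used memory reaches the limit at a constant leak rate."""
    remaining = limit_mb * MB - used_bytes
    if remaining <= 0:
        return 0.0  # the limit has already been reached
    if leak_rate_bps <= 0:
        return math.inf  # memory is not growing, the limit is never reached
    return remaining / leak_rate_bps / 3600.0


def nagios_status(hours_left, warning_hours, critical_hours):
    """Map the predicted time left to a Nagios plugin status code."""
    if hours_left <= critical_hours:
        return 2  # CRITICAL
    if hours_left <= warning_hours:
        return 1  # WARNING
    return 0  # OK


# Illustrative values: 1536 MB in use, a 2048 MB limit, a 138 bytes/s leak.
hours = hours_until_limit(1536 * MB, 138, 2048)
status = nagios_status(hours, warning_hours=48, critical_hours=24)
```

With these illustrative numbers the limit is more than a thousand hours away, so the status is OK. A faster leak or a larger -w value would move it to WARNING.
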

To predict when memory will be exhausted, check_leak keeps a history of memory consumption in /tmp/cheak_leak_data.

h1. How it works

Every time Nagios runs the script, it records the current memory consumption of every process and computes the mean memory allocation rate of each process. These values can be displayed with the -a flag of the check_leak command (rates are given in o/s, octets per second, that is, bytes per second):

<pre>
# ./check_leak -a
Process 6367 is leaking 2 o/s (/usr/sbin/openvpn --config /etc/openvpn/shan.conf --writepid /var/run/openvpn.shan.pid --daemon --setenv SVCNAME openvpn.shan --cd /etc/openvpn --nobind --up-delay --up-restart --script-security 2 --up /etc/openvpn/up.sh --down-pre --down /etc/openvpn/down.sh)
Process 11980 is leaking 12 o/s (bash)
Process 7239 is leaking 21 o/s (ssh shan)
Process 5819 is leaking 22 o/s (/usr/bin/X :0 vt7)
Process 9610 is leaking 24 o/s (sshfs shan:/home/lids /home/lids/Distant/shan)
Process 7482 is leaking 138 o/s (/opt/firefox/firefox-bin)
</pre>

Because a process can legitimately allocate memory, only memory consumption that grows linearly, that is, at a roughly constant rate, is counted as a leak. This reduces false positives.
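
The bookkeeping described above (record memory at each run, then average the allocation rate) can be sketched in Python as follows. This is an illustration only: the sample layout, a list of (timestamp in seconds, memory in bytes) pairs for one process, and the function names are assumptions, since the format actually used by the plugin is not described here.

```python
def allocation_rates(samples):
    """Allocation rate (bytes/s) between each pair of consecutive samples.

    samples: list of (timestamp_seconds, memory_bytes), oldest first.
    """
    rates = []
    for (t0, m0), (t1, m1) in zip(samples, samples[1:]):
        if t1 > t0:  # skip duplicate or out-of-order timestamps
            rates.append((m1 - m0) / (t1 - t0))
    return rates


def mean_rate(samples):
    """Mean allocation rate over all intervals, 0.0 without enough data."""
    rates = allocation_rates(samples)
    return sum(rates) / len(rates) if rates else 0.0


# One record every 2 minutes, each 16560 bytes larger than the previous one.
history = [(0, 1_000_000), (120, 1_016_560), (240, 1_033_120)]
rate = mean_rate(history)  # 138.0 bytes/s
```

A process whose memory grows by the same amount between every two records has identical interval rates, which is the linear growth pattern treated as a leak.
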

+For example:+
This graph shows the memory consumption of a running Firefox session over 1 hour 30 minutes (one record every 2 minutes).
!attached_image!

The red curve shows memory usage (in bytes).
The green curve shows the memory allocation rate (in bytes/s).
The blue curve shows the change in the allocation rate (in bytes/s^2).
At about 2500 seconds the browser was being used, which caused a 15 MB allocation. At that point the change in the allocation rate reaches the threshold level (100), meaning the allocation rate was not constant. That value is therefore ignored when computing the mean leak rate.
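
The filtering step in this example can be sketched in Python as follows. It is one possible reading of the description, not the plugin's code: it assumes the threshold of 100 applies to the absolute change in allocation rate between consecutive intervals (bytes/s^2), and the name @steady_leak_rate@ and the sample layout are hypothetical.

```python
def steady_leak_rate(samples, threshold=100.0):
    """Mean allocation rate (bytes/s), ignoring intervals whose rate changed abruptly.

    samples: list of (timestamp_seconds, memory_bytes), oldest first.
    threshold: maximum accepted change in rate, in bytes/s^2 (assumed meaning).
    """
    # First derivative: allocation rate for each interval.
    rates = []
    for (t0, m0), (t1, m1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt > 0:
            rates.append(((m1 - m0) / dt, dt))

    # Second derivative: change in rate relative to the previous interval.
    kept = []
    previous = None
    for rate, dt in rates:
        change = 0.0 if previous is None else (rate - previous) / dt
        if abs(change) < threshold:
            kept.append(rate)
        previous = rate
    return sum(kept) / len(kept) if kept else 0.0


# A steady 138 bytes/s leak with one 15,000,000 byte burst in the third interval.
history = [
    (0, 1_000_000),
    (120, 1_016_560),
    (240, 1_033_120),
    (360, 16_049_680),
    (480, 16_066_240),
    (600, 16_082_800),
]
leak = steady_leak_rate(history)  # 138.0 bytes/s
```

In this sketch the interval right after the burst is dropped as well, because the rate falls back just as abruptly as it rose. A plain mean over the same five intervals would give 25138 bytes/s, which shows why the burst has to be excluded.
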