Change Request
Resolution: Fixed
Trivial
5.0.12
None
Linux
3
The filesystem free space checks do not function properly and need a serious revamp.
1- First and foremost, a bug prevents them from working:
https://support.zabbix.com/browse/ZBX-19502
To fix the bug reported above, a "free" space check should be added:
vfs.fs.size[{#FSNAME},free]
2- Use macros for the minimum free space WARN (10G) and CRIT (5G) values:
{$VFS.FS.FREE.MIN.CRIT:"{#FSNAME}"} -> 5G
{$VFS.FS.FREE.MIN.WARN:"{#FSNAME}"} -> 10G
3- Make the `timeleft()` prediction optional. Why? Because it does not work for filesystems that receive bursts of data. Take a filesystem where backups are stored: free space can drop very quickly and cause erroneous warnings.
For example, on a 1TB filesystem the free space warning trigger activates above 80% usage, which means there is still 200GB of free space. A backup process that takes 2 hours can trigger false warnings.
The same goes for a filesystem holding recordings. A recorder may be constantly writing data to disk, and the trigger activates even though some other process regularly cleans up data.
I am not sure what the best way to do this is. Perhaps an ON/OFF switch?
{$VFS.FS.PFULL.PREDICT:"{#FSNAME}"}=1
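To illustrate the burst problem from item 3, here is a simplified linear extrapolation in Python. It is only a sketch of the idea, not Zabbix's actual `timeleft()` implementation; the function name `time_left_seconds` and the sample values are made up for this example.

```python
def time_left_seconds(samples, threshold=100.0):
    """Estimate seconds until `threshold` is reached, using a least-squares line.

    samples: list of (timestamp_seconds, pused_percent), oldest first.
    Returns float("inf") when usage is flat or decreasing.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return float("inf")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return float("inf")
    intercept = mean_v - slope * mean_t
    t_hit = (threshold - intercept) / slope  # time at which the line crosses threshold
    return max(t_hit - samples[-1][0], 0.0)


# Hypothetical backup run: usage climbs from 80% to 81% during the last hour,
# sampled every 10 minutes.
backup_burst = [(t, 80.0 + t / 3600.0) for t in range(0, 3601, 600)]

# Same filesystem when nothing is being written.
idle = [(t, 80.0) for t in range(0, 3601, 600)]

hours_left = time_left_seconds(backup_burst) / 3600.0
print(f"burst: about {hours_left:.1f} h until full, fires (<1d): {hours_left < 24}")
print(f"idle:  {time_left_seconds(idle)}")
```

In this made-up case the backup only adds 1% per hour and will stop on its own, yet the extrapolation predicts a full disk in well under a day, which is the kind of false warning described above.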
With items 1, 2 and 3 applied, the resulting triggers would look like this:
{Template Module Linux filesystems by Zabbix agent:vfs.fs.size[{#FSNAME},pused].last()}>{$VFS.FS.PUSED.MAX.CRIT:"{#FSNAME}"} and ({Template Module Linux filesystems by Zabbix agent:vfs.fs.size[{#FSNAME},free].last()}<{$VFS.FS.FREE.MIN.CRIT:"{#FSNAME}"} or ({$VFS.FS.PFULL.PREDICT:"{#FSNAME}"}=1 and {Template Module Linux filesystems by Zabbix agent:vfs.fs.size[{#FSNAME},pused].timeleft(1h,,100)}<1d))

{Template Module Linux filesystems by Zabbix agent:vfs.fs.size[{#FSNAME},pused].last()}>{$VFS.FS.PUSED.MAX.WARN:"{#FSNAME}"} and ({Template Module Linux filesystems by Zabbix agent:vfs.fs.size[{#FSNAME},free].last()}<{$VFS.FS.FREE.MIN.WARN:"{#FSNAME}"} or ({$VFS.FS.PFULL.PREDICT:"{#FSNAME}"}=1 and {Template Module Linux filesystems by Zabbix agent:vfs.fs.size[{#FSNAME},pused].timeleft(1h,,100)}<1d))
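The boolean logic of the proposed triggers can be sketched in Python as follows. This only mirrors the structure of the expressions above so the behavior can be checked by hand; the function name, argument names and sample numbers are assumptions for illustration, and the 80% limit is the warning level mentioned in item 3.

```python
GIB = 1024 ** 3  # "10G" is read here as 10 * 1024**3 bytes
DAY = 24 * 3600


def trigger_fires(pused, free_bytes, time_left_s,
                  pused_max, free_min_bytes, predict_enabled):
    """pused > MAX and (free < MIN or (PREDICT = 1 and timeleft < 1d))."""
    over_percentage = pused > pused_max
    low_on_space = free_bytes < free_min_bytes
    filling_up_soon = predict_enabled and time_left_s < DAY
    return over_percentage and (low_on_space or filling_up_soon)


# Hypothetical 1TB backup filesystem in the middle of a backup run:
# 81% used, about 190G still free, extrapolation says "full in 19 hours".
backup_fs = dict(pused=81.0, free_bytes=190 * GIB, time_left_s=19 * 3600,
                 pused_max=80.0, free_min_bytes=10 * GIB)

print(trigger_fires(**backup_fs, predict_enabled=True))   # prediction on
print(trigger_fires(**backup_fs, predict_enabled=False))  # prediction off

# With prediction off, the trigger still fires once free space is really low.
nearly_full = dict(backup_fs, pused=99.0, free_bytes=8 * GIB)
print(trigger_fires(**nearly_full, predict_enabled=False))
```

With the switch set to 0 for the backup filesystem, only the absolute free space check remains, so the warning appears when space is actually running out and not merely because a burst of writes is in progress.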
4a- Update the graph prototype "{#FSNAME}: Disk space usage" so that it does not use "used space", because the chart never reaches 100% when the disk is full. Instead, calculate used space as Total - Free. "Used space" does NOT account for the reserved space, so it gives the false impression that there is still free space because Total - Used > 0!
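The effect of reserved space in 4a can be reproduced outside Zabbix with the fields returned by `statvfs`. The sketch below is in Python and assumes that "used" corresponds to `f_blocks - f_bfree` and "free" to `f_bavail` (space available to unprivileged users); whether the agent maps its `used` and `free` modes exactly this way is an assumption here, and the function name and block counts are invented for the example.

```python
import os
from types import SimpleNamespace


def space_report(st):
    """Compare two ways of computing 'percent used' from statvfs fields."""
    block = st.f_frsize
    total = st.f_blocks * block
    free_incl_reserved = st.f_bfree * block  # includes blocks reserved for root
    free_for_users = st.f_bavail * block     # what ordinary processes can use
    used = total - free_incl_reserved
    return {
        "reserved_bytes": free_incl_reserved - free_for_users,
        # Plotting "used" against "total": reserved blocks count as neither.
        "used_excluding_reserved_pct": 100 * used / total,
        # Plotting Total - Free: reserved blocks count as used.
        "used_as_total_minus_free_pct": 100 * (total - free_for_users) / total,
    }


# Hypothetical filesystem: 1000 blocks, 5% reserved for root, and ordinary
# users have already consumed everything else.
full_for_users = SimpleNamespace(f_frsize=4096, f_blocks=1000,
                                 f_bfree=50, f_bavail=0)
print(space_report(full_for_users))

# On a real Linux host the same function accepts os.statvfs output, e.g.:
# print(space_report(os.statvfs("/")))
```

For the invented numbers, the first percentage stops at 95 even though no unprivileged process can write anymore, while Total - Free gives 100.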
4b- Change the graph prototype "{#FSNAME}: Disk space usage" from "pie chart" to "normal". Why? Because the pie chart is not useful: it does not show the direction in which filesystem usage is moving. Was the filesystem getting fuller every month, or was it staying the same? It is impossible to tell from a pie chart.
Thank you!