[ZBX-2966] negative value vfs.fs.size amount of free space on partition Created: 2010 Aug 31 Updated: 2017 May 30 Resolved: 2015 Nov 13 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G) |
Affects Version/s: | 1.8.3 |
Fix Version/s: | 3.0.0alpha4 |
Type: | Incident report | Priority: | Major |
Reporter: | rootd | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 9 |
Labels: | agent, freebsd |
Remaining Estimate: | Not Specified |
Time Spent: | Not Specified |
Original Estimate: | Not Specified |
Environment: |
FreeBSD 8.1 RELEASE |
Attachments: | 2010-08-31_170544.jpg 2010-08-31_170614.jpg zabbix-2.4.6-negative-vfs-fs-size.patch |
Issue Links: |
|
Description |
When there is no space left on a partition, FreeBSD reports negative available space. See the screenshots. |
Comments |
Comment by Frank Wall [ 2011 Feb 22 ] |
I have the same problem. It does NOT appear on FreeBSD 7.1, but after upgrading to FreeBSD 7.3 I ran into the same issue. The Zabbix frontend shows 16 EB of free disk space, while it is actually 0 bytes (or, in FreeBSD notation: -40 MB). |
Comment by Ilyas [ 2011 Apr 19 ] |
The problem also affects vfs.fs.size[/somepath,pfree] and vfs.fs.size[/somepath,pused].

[root@zbxserv.local]# zabbix_get -s zbxagent.local -k 'vfs.fs.size["/mnt/disk",pused]'
[root@zbxagent.local ~]# df -h | grep /mnt/disk

In our cluster we have over 500 hard drives for the first time, and the count is continuously growing. It looks like there is one more problem (cut from zabbix_server.log): Agent and server both: (yes, we are using auto discovery for filesystems). The OS is FreeBSD 8.2-STABLE. |
Comment by Ilyas [ 2011 Apr 19 ] |
Oh, I forgot to write: we are using 2 TB hard drives. Yes, if you can provide patches, I will run Zabbix with them. |
Comment by Alexander Vladishev [ 2013 Feb 17 ] |
Comment by Sap [ 2013 Mar 15 ] |
Same problem FreeBSD 8.2-RELEASE #0: Fri Feb 18 02:24:46 UTC 2011
# zabbix_agent -V
Zabbix Agent v1.8.3 (revision 13928) (16 August 2010)
Compilation time: Dec 13 2010 22:12:16
# df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1a 4.5G 4.1G -27M 101% /
devfs 1.0K 1.0K 0B 100% /dev
# zabbix_agent -t 'vfs.fs.size[/,pfree]'
vfs.fs.size[] [d|854069308670196.500000]
After removing some files to free up space:
# df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1a 4.5G 3.7G 418M 90% /
devfs 1.0K 1.0K 0B 100% /dev
# zabbix_agent -t 'vfs.fs.size[/,pfree]'
vfs.fs.size[] [d|9.916499]
Comment by Riaan Olivier [ 2013 Jun 03 ] |
Got the same issue on FreeBSD 7.3 and 8.1:

# zabbix_agent -V
Zabbix agent v2.0.5 (revision 33558) (12 February 2013)
Compilation time: Apr 12 2013 11:05:51
# df -h
Filesystem            Size    Used   Avail Capacity  Mounted on
/dev/mirror/gm0s1f    1.9G    1.9G   -158M   109%    /var
# zabbix_agentd -t "vfs.fs.size[/var,free]"
vfs.fs.size[/var,free] [u|18446744073544177664]
# zabbix_agentd -t "vfs.fs.size[/var,pfree]"
vfs.fs.size[/var,pfree] [d|1979319602661605.750000]

Can we request that this be fixed as soon as possible? It causes triggers to be missed and downtime in production environments. |
Comment by Aleksandrs Saveljevs [ 2015 Mar 09 ] |
Taking FreeBSD code in src/libs/zbxsysinfo/freebsd/diskspace.c as an example, below is the status of the current implementation, which seems to have remained unchanged since Zabbix 1.7. There are no comments regarding the implementation, but it seems that it was written with the intention to mimic df output. Our code is as follows, supplemented with comments in each conditional describing the corresponding output in df, according to df source code at https://www.gitorious.org/freebsd/freebsd-head/source/214589d0d7e189b66514f6098f7c2a2c9b61dd87:bin/df/df.c#L239 :

#ifdef HAVE_SYS_STATVFS_H
#	define ZBX_STATFS	statvfs
#	define ZBX_BSIZE	f_frsize
#else
#	define ZBX_STATFS	statfs
#	define ZBX_BSIZE	f_bsize
#endif

	struct ZBX_STATFS	s;

	if (0 != ZBX_STATFS(fs, &s))
		return SYSINFO_RET_FAIL;

	// uint64_t   f_blocks;  /* total data blocks in filesystem */
	// uint64_t   f_bfree;   /* free blocks in filesystem */
	// (u)int64_t f_bavail;  /* free blocks avail to non-superuser */

	if (NULL != total)
	{
		// Size: f_blocks
		*total = (zbx_uint64_t)s.f_blocks * s.ZBX_BSIZE;
	}

	if (NULL != free)
	{
		// Avail: f_avail
		*free = (zbx_uint64_t)s.f_bavail * s.ZBX_BSIZE;
	}

	if (NULL != used)
	{
		// Used: f_blocks - f_bfree
		*used = (zbx_uint64_t)(s.f_blocks - s.f_bfree) * s.ZBX_BSIZE;
	}

	if (NULL != pfree)
	{
		if (0 != s.f_blocks - s.f_bfree + s.f_bavail)
			*pfree = (double)(100.0 * s.f_bavail) /
					(s.f_blocks - s.f_bfree + s.f_bavail);
		else
			*pfree = 0;
	}

	if (NULL != pused)
	{
		// Capacity: (f_blocks - f_bfree) / (f_blocks - f_bfree + f_avail)
		if (0 != s.f_blocks - s.f_bfree + s.f_bavail)
			*pused = 100.0 - (double)(100.0 * s.f_bavail) /
					(s.f_blocks - s.f_bfree + s.f_bavail);
		else
			*pused = 0;
	}

There are several things to note:
|
Comment by Aleksandrs Saveljevs [ 2015 Mar 09 ] |
Some of the complications with mimicking df are described above. Here is another one: if we wish "free" to return a negative value as df does, then we have to make it return a float when "f_bavail" is negative. However, we still have to return an unsigned integer when "f_bavail" is non-negative, because our floats are limited to 10^12, which is too low for modern drives. This would be good enough for "zabbix_get", but it is of no use for the Zabbix server: an item cannot accept both large unsigned integers and negative values. Therefore, an item would become unsupported whenever "free" is negative. One solution is to let it be and suggest that users trigger on "pfree" (which we shall also fix to return a negative percentage) instead of "free": it is a floating-point item and should always work. Item "free" would then remain unreliable for triggering. Another solution would be to abandon the idea of mimicking df behavior, but the changes in 1.7 were made specifically to look like df, so going back might not be an option. |
Comment by Aleksandrs Saveljevs [ 2015 Jun 04 ] |
Specification is necessary before proceeding with development. |
Comment by dimir [ 2015 Oct 28 ] |
From the tunefs man page:

     -m minfree
             Specify the percentage of space held back from normal users; the
             minimum free space threshold.  The default value used is 8%.
             Note that lowering the threshold can adversely affect
             performance:

             o   Settings of 5% and less force space optimization to always
                 be used which will greatly increase the overhead for file
                 writes.

             o   The file system's ability to avoid fragmentation will be
                 reduced when the total free space, including the reserve,
                 drops below 15%.  As free space approaches zero, throughput
                 can degrade by up to a factor of three over the performance
                 obtained at a 10% threshold.

             If the value is raised above the current usage level, users will
             be unable to allocate files until enough files have been deleted
             to get under the higher threshold.

So, there is reserved disk space that is available to root but not to others. And when this reserve is hit, the value of available space becomes negative:

[build@freebsd73 ~]$ df -h /
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/da0s1a    1.7G    1.6G    -12K   100%    /
[build@freebsd73 ~]$ whoami
build
[build@freebsd73 ~]$ pwd
/home/build
[build@freebsd73 ~]$ echo foo > foo
/: write failed, filesystem is full
-bash: echo: write error: No space left on device
[build@freebsd73 ~]$ su
Password:
[root@freebsd73 /home/build]# whoami
root
[root@freebsd73 /home/build]# pwd
/home/build
[root@freebsd73 /home/build]# echo foo > foo
[root@freebsd73 /home/build]# cat foo
foo
[root@freebsd73 /home/build]# uname -a
FreeBSD freebsd73 7.3-RELEASE FreeBSD 7.3-RELEASE #0: Sun Mar 21 06:15:01 UTC 2010 root@walker.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386 |
Comment by dimir [ 2015 Oct 30 ] |
For FreeBSD we have 2 options for getting filesystem data: statfs() and statvfs(). The decision is made at compile time, and statvfs() is preferred. Here is a snippet from the statvfs man page:

IMPLEMENTATION NOTES
     The statvfs() and fstatvfs() functions are implemented as wrappers around
     the statfs() and fstatfs() functions, respectively.  Not all the
     information provided by those functions is made available through this
     interface.

There is a difference between the statfs and statvfs structures which makes it impossible to detect a negative value when using statvfs:
struct statfs {
[...]
int64_t f_bavail; /* free blocks avail to non-superuser */
[...]
}
---------------------------------------------------------------------------------------------------------------
typedef __uint64_t __fsblkcnt_t;
typedef __fsblkcnt_t fsblkcnt_t;
struct statvfs {
[...]
fsblkcnt_t f_bavail;
[...]
}
So, it seems that when using statvfs we lose the signedness of filesystem sizes, which makes it impossible for us to detect a negative value. Using statfs we can get the negative value. |
Comment by dimir [ 2015 Oct 30 ] |
Too bad I missed that the same things had already been mentioned by asaveljevs. |
Comment by dimir [ 2015 Oct 30 ] |
We just had an internal discussion and decided that in this case we should detect the negative size and change it to 0 (zero). From a user perspective, we thought this would be the most convenient behavior. |
Comment by dimir [ 2015 Nov 04 ] |
Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-2966 . |
Comment by Andris Zeila [ 2015 Nov 06 ] |
Successfully tested, please review minor changes in r56581 |
Comment by dimir [ 2015 Nov 06 ] |
(1) [G] Moved the ZBX_IS_TOP_BIT_SET macro from sysinfo.h to zbxtypes.h, please check. wiper CLOSED, please review another improvement of the ZBX_IS_TOP_BIT_SET macro in r56583 |
Comment by dimir [ 2015 Nov 06 ] |
The fix will only be available for trunk (3.0). |
Comment by dimir [ 2015 Nov 06 ] |
Fixed in pre-3.0.0alpha4 (r56585). The fix is only available for 3.0 because the change might affect the returned value, which could cause a regression. E.g. AIX with a 32-bit stat[v]fs interface and big disks might report 0 if the available disk space is 16 TB (assuming a 4096-byte block size). This behavior wasn't observed but is possible in theory. The patch for 2.4.6 is attached. |
Comment by dimir [ 2015 Nov 09 ] |
(2) [D] Documented here. wiper CLOSED |
Comment by Alexander Vladishev [ 2015 Nov 12 ] |
(3) The code of vfs.fs.inode must also be reviewed and fixed. <dimir> Actually, I have checked and tested vfs.fs.inode and there is no issue there. The problem is with the stat[v]fs structure field f_bavail, which is not used by vfs.fs.inode. RESOLVED sasha Thanks! CLOSED |