[ZBX-10223] Free Disk Space incorrectly calculated Created: 2015 Dec 30  Updated: 2017 May 30  Resolved: 2015 Dec 30

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 3.0.0alpha5
Fix Version/s: None

Type: Incident report
Priority: Major
Reporter: Mathew
Assignee: Unassigned
Resolution: Won't fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian Linux (Jessie)
1x SSD attached storage to a KVM virtual machine (VirtIO)



 Description   

The default Linux template incorrectly calculates the used space on partition "/". The calculation appears to be based on the size of the underlying hard disk. When multiple partitions are involved, the result can be badly wrong.

In effect, it reports the used space on the partition plus the size of the other partitions.

# df -h | grep vda1
/dev/vda1                 30G  2.0G   26G   8% /

# fdisk -l

Disk /dev/vda: 150 GiB, 161061273600 bytes, 314572800 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x7891a01c

Device     Boot     Start       End   Sectors   Size Id Type
/dev/vda1  *         2048  62500863  62498816  29.8G 83 Linux
/dev/vda2       301754366 314570751  12816386   6.1G  5 Extended
/dev/vda3        62500864 301752319 239251456 114.1G 83 Linux
/dev/vda5       301754368 314570751  12816384   6.1G 82 Linux swap / Solaris

In Zabbix

Free disk space on /              = 37.07 GB
Free disk space on / (percentage) = 10.59 %
Free inodes on / (percentage)     = 99.96 %
Total disk space on /             = 350 GB
Used disk space on /              = 312.93 GB

I am unsure if this is a bug introduced in Zabbix 3.0 as we are testing the alpha version on a separate machine with a different hardware configuration.



 Comments   
Comment by Aleksandrs Saveljevs [ 2015 Dec 30 ]

Just to be a bit more confident that there is a bug, could you please run the following commands on the same machine where you ran "df" and "fdisk"?

# zabbix_agentd -t vfs.fs.size[/,free]
# zabbix_agentd -t vfs.fs.size[/,pfree]
# zabbix_agentd -t vfs.fs.inode[/,pfree]
# zabbix_agentd -t vfs.fs.size[/,total]
# zabbix_agentd -t vfs.fs.size[/,used]

Currently, it seems suspicious, because "Total disk space on /" is reported in Zabbix frontend as 350 GB, but the total disk size is "150 GiB" according to "fdisk", and these numbers do not seem to be related.
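The mismatch can be checked with simple arithmetic. The sketch below is illustrative only: it copies the figures from the report, ignores GB vs. GiB rounding, and uses a hypothetical helper name (`self_consistent`). It suggests that the frontend values describe one coherent 350 GB filesystem (free + used equals total, and the percentage matches), which is larger than the whole 150 GiB disk on this machine.

```python
# Figures copied from the report; GB vs GiB rounding is ignored in this sketch.
frontend = {"free_gb": 37.07, "used_gb": 312.93, "total_gb": 350.0, "pfree": 10.59}
partition_total_gb = 30.0   # "df -h" size of /dev/vda1
disk_total_gib = 150.0      # "fdisk -l" size of /dev/vda


def self_consistent(v, tol=0.01):
    """Return True if free + used == total and pfree == 100 * free / total (within tol)."""
    sums_ok = abs(v["free_gb"] + v["used_gb"] - v["total_gb"]) <= tol
    pct_ok = abs(100.0 * v["free_gb"] / v["total_gb"] - v["pfree"]) <= tol
    return sums_ok and pct_ok


print("frontend values self-consistent:", self_consistent(frontend))
print("reported total exceeds whole disk:", frontend["total_gb"] > disk_total_gib)
print("reported total exceeds partition:", frontend["total_gb"] > partition_total_gb)
```

If both checks hold, the numbers are unlikely to be a miscalculation of this partition and more likely to belong to a different filesystem.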

Comment by Mathew [ 2015 Dec 30 ]

Really strange.

I executed those commands and got the correct results

root@monitor:~# zabbix_agentd -t vfs.fs.size[/,pfree]
vfs.fs.size[/,pfree]                          [d|92.779216]
root@monitor:~# zabbix_agentd -t vfs.fs.size[/,free]
vfs.fs.size[/,free]                           [u|27598151680]
root@monitor:~# zabbix_agentd -t vfs.fs.size[/,pfree]
vfs.fs.size[/,pfree]                          [d|92.779065]
root@monitor:~# zabbix_agentd -t vfs.fs.inode[/,pfree]
vfs.fs.inode[/,pfree]                         [d|96.472582]
root@monitor:~# zabbix_agentd -t vfs.fs.size[/,total]
vfs.fs.size[/,total]                          [u|31362842624]
root@monitor:~# zabbix_agentd -t vfs.fs.size[/,used]
vfs.fs.size[/,used]                           [u|2147946496]
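As a rough cross-check of the agent output above, the values can be parsed and compared with each other. This is a minimal sketch, not Zabbix code: the parser and its names (`parse_agent_output`, `LINE_RE`) are hypothetical, and the formula pfree = 100 * free / (free + used) is an assumption that happens to agree with the reported numbers. The gap between total and used + free is presumably space reserved by the filesystem.

```python
import re

# Output lines copied from the transcript above.
AGENT_OUTPUT = """\
vfs.fs.size[/,free]                           [u|27598151680]
vfs.fs.size[/,pfree]                          [d|92.779065]
vfs.fs.size[/,total]                          [u|31362842624]
vfs.fs.size[/,used]                           [u|2147946496]
"""

# Matches "key   [t|value]" as printed by "zabbix_agentd -t".
LINE_RE = re.compile(r"^(\S+)\s+\[(\w)\|(.*)\]$")


def parse_agent_output(text):
    """Map each item key to its value as a float; lines that do not match are skipped."""
    values = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            key, _value_type, raw = match.groups()
            values[key] = float(raw)
    return values


vals = parse_agent_output(AGENT_OUTPUT)
free = vals["vfs.fs.size[/,free]"]
used = vals["vfs.fs.size[/,used]"]
total = vals["vfs.fs.size[/,total]"]

# Assumed formula: percentage of free space relative to used + free.
pfree_calc = 100.0 * free / (free + used)

GIB = 1024 ** 3
print(f"total {total / GIB:.1f} GiB, used {used / GIB:.1f} GiB, free {free / GIB:.1f} GiB")
print(f"pfree recomputed: {pfree_calc:.2f}, agent reported: {vals['vfs.fs.size[/,pfree]']:.2f}")
```

The recomputed percentage lands within a small fraction of a percent of the agent's value, and the sizes are in line with the "df" output for /dev/vda1, so the agent on this machine appears to report sensible data.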

So I went and took a look in the latest data, and it had updated: http://img.x4b.org/30-12-2015/19_25_49.png

I captured the screenshot including the difference indication. The only actions taken were an SSH login and the execution of the commands given.

Nothing was changed on my end. This server has been set up for almost a week now, and the items are updating every minute and have been doing so all along.

Comment by richlv [ 2015 Dec 30 ]

...and make sure you don't have an active zabbix agent on another system reporting data with that hostname

Comment by Aleksandrs Saveljevs [ 2015 Dec 30 ]

Could you please show a graph for one of these items (e.g., the "total" item) and an exported XML for that host? Could it be that Zabbix gets this data from multiple agents and/or zabbix_sender?

Comment by Mathew [ 2015 Dec 30 ]

Here's pfree, it's the most irritating of the lot:

http://img.x4b.org/30-12-2015/19_39_02.png

Despite the template having a different name ("Platform/Linux (Template)"), it's the same as your bundled Linux template, with only a few additional unrelated metrics. The discovery rule for mounted filesystems and its item prototypes are stock.

https://dl.dropboxusercontent.com/u/62365823/zbx_export_templates.xml

Comment by Aleksandrs Saveljevs [ 2015 Dec 30 ]

According to the XML, item prototypes for vfs.fs.size[] seem to be of type "Zabbix agent (active)". As richlv suggested, could you please check that there is no other active agent reporting data with the same host name? (Although, if there were two agents reporting data, I would expect the graph to fluctuate before 8:21. Have you enabled active checks on the host's agent recently? Maybe they were disabled before 8:21 or an agent was running with old configuration file contents?)

Comment by richlv [ 2015 Dec 30 ]

overall, this looks terribly like a support case and would be best pursued through the channels listed in http://zabbix.org/wiki/Getting_help

Comment by Mathew [ 2015 Dec 30 ]

Nope, no changes at that time.

I checked and there is only one agent running on the machine (3 listeners, 1 parent). And no other agent should have the same hostname.

I did a service stop, and then a killall -9 zabbix_agentd to be sure and then restarted the service. This does appear to have fixed the issue, although I am unsure why.

Perhaps value cache corruption of some kind?

Comment by Mathew [ 2015 Dec 30 ]

If it's not a bug that you can reproduce, it's not terribly concerning to me at this time.

Comment by Aleksandrs Saveljevs [ 2015 Dec 30 ]

It can be seen on http://img.x4b.org/30-12-2015/19_39_02.png that spikes are non-symmetric: the low values are reported at regular intervals, the high values are reported at regular intervals, but the interval between a low value and a high value is not the same as the interval between a high value and a low value. This supports the suspicion that there are two active agents reporting data with this hostname. One way to debug this would be to set DebugLevel=4 for trappers, see when they get this data, and use tcpdump or Wireshark to get the IPs of the incoming connections, because trappers do not log the IPs, as far as I remember.
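The asymmetry argument can be sketched in code. The example below is only an illustration under stated assumptions: the sample data is synthetic (two hypothetical reporters, each sending once per 60 seconds, offset by 20 seconds), the function names are invented for this sketch, and the 20 % tolerance is arbitrary. It flags a series as "two interleaved reporters" when the low values and the high values are each regularly spaced, but the low-to-high gap differs from the high-to-low gap.

```python
def gaps(timestamps):
    """Differences between consecutive timestamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def is_regular(intervals, tol=0.2):
    """True if every interval is within tol (relative) of the mean interval."""
    mean = sum(intervals) / len(intervals)
    return all(abs(x - mean) <= tol * mean for x in intervals)


def looks_like_two_reporters(samples, threshold, tol=0.2):
    """samples: list of (timestamp_seconds, value) sorted by time."""
    low = [t for t, v in samples if v < threshold]
    high = [t for t, v in samples if v >= threshold]
    if len(low) < 3 or len(high) < 3:
        return False  # not enough points in one of the two groups

    # Collect gaps for low->high and high->low transitions separately.
    low_to_high, high_to_low = [], []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 < threshold <= v1:
            low_to_high.append(t1 - t0)
        elif v1 < threshold <= v0:
            high_to_low.append(t1 - t0)
    if not low_to_high or not high_to_low:
        return False

    up = sum(low_to_high) / len(low_to_high)
    down = sum(high_to_low) / len(high_to_low)
    asymmetric = abs(up - down) > tol * max(up, down)
    return is_regular(gaps(low), tol) and is_regular(gaps(high), tol) and asymmetric


# Synthetic data: one source reports ~10.59 every 60 s, another ~92.78 every 60 s, 20 s later.
samples = sorted(
    [(60 * i, 10.59) for i in range(5)] + [(60 * i + 20, 92.78) for i in range(5)]
)
print(looks_like_two_reporters(samples, threshold=50.0))
```

A single agent reporting a genuinely fluctuating value would not normally produce two independently regular series, which is why this pattern points at a second data source rather than at a calculation bug.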

We shall close this issue for now. Please reopen if it turns out not to be a configuration problem, but a bug indeed.

Comment by Mathew [ 2016 Jan 26 ]

I am just going to leave this here: it's not only free disk space, it seems. It is also seen with CPU-related metrics.

http://img.x4b.org/26-01-2016/14_29_32.png

%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 99.5 id,  0.5 wa,  0.0 hi,  0.0 si,  0.0 st

There is no easy way to replicate this other than to use the Zabbix 3.0 agent; it has not been seen on any servers still using the 2.4 agent.

There is also the possibility that it is kernel or system related. So I will leave this here. Environment:

  • KVM
  • 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u1
  • Debian Jessie

Comment by Aleksandrs Saveljevs [ 2016 Jan 26 ]

This is a different problem and would be best discussed elsewhere (for instance, https://www.zabbix.org/wiki/Getting_help). Currently, since you are in a virtualized environment, a high CPU steal time is natural and it does not necessarily indicate a bug in Zabbix: see also http://stackoverflow.com/questions/32363957/htop-reports-100-cpu-steal-time-top-reports-0-after-virsh-restore .
