[ZBX-12164] zabbix-agent get wrong memory values in Linux Containers Created: 2017 May 11 Updated: 2024 Sep 23 Resolved: 2020 Oct 19 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G) |
Affects Version/s: | 4.0.24, 5.0.3, 5.2 (plan) |
Fix Version/s: | 5.2 (plan) |
Type: | Problem report | Priority: | Trivial |
Reporter: | Falk Hackenberger | Assignee: | Michael Veksler |
Resolution: | Won't fix | Votes: | 11 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Linux Container on proxmox |
Issue Links: |
Team: | |||||||||
Sprint: | Sprint 68 (Sep 2020), Sprint 69 (Oct 2020) | ||||||||
Story Points: | 3 |
Description |
The zabbix-agent gets the wrong memory values in a Linux Container that is limited with cgroups. A simple workaround is to detect whether you are in an LXC container (/proc/1/environ) and not use these values in the trigger. |
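The detection step of this workaround can be sketched roughly as follows. This is a minimal Python illustration, not code from the reporter: it assumes that PID 1 inside an LXC container has a `container=lxc` variable in its environment, and the function names `container_type` and `detect_container` are hypothetical.

```python
from pathlib import Path


def container_type(environ_bytes):
    """Return the value of the `container` variable from a NUL-separated
    environ blob (the format of /proc/<pid>/environ), or None if absent."""
    for entry in environ_bytes.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep and key == b"container":
            return value.decode("ascii", "replace")
    return None


def detect_container(path="/proc/1/environ"):
    """Read PID 1's environment and report the container type, if any.

    Assumption: LXC exports container=lxc to PID 1. Reading this file
    normally requires root, so any read failure is treated as "unknown".
    """
    try:
        return container_type(Path(path).read_bytes())
    except OSError:
        return None
```

Calling `detect_container()` as root inside the container would be expected to return `"lxc"` under the assumption above; a result of `None` means either "not in a container" or "could not read the file", so it should not be treated as proof of running on a physical host.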
Comments |
Comment by kvaps [ 2017 Nov 28 ] |
Agreed, I have the same problem. |
Comment by kvaps [ 2017 Nov 29 ] |
I created a custom UserParameter to solve it: |
Comment by Maksims Edelmans [ 2017 Dec 27 ] |
I have the same problem on LXC machines. Zabbix agent returns total host memory instead of total container memory. |
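To illustrate where the container's own memory total could come from, here is a hedged Python sketch that reads the cgroup memory limit. The two file paths are assumptions about common cgroup v2 and cgroup v1 mount points and may differ on a given system; the function names are hypothetical and this is not the UserParameter mentioned in the comments above.

```python
from pathlib import Path

# Assumed locations; the actual mount points vary by distribution and setup.
CGROUP_V2_LIMIT = "/sys/fs/cgroup/memory.max"
CGROUP_V1_LIMIT = "/sys/fs/cgroup/memory/memory.limit_in_bytes"


def parse_limit(text):
    """Convert the content of a cgroup limit file to bytes.

    cgroup v2 writes the literal string "max" when no limit is set; that
    case is returned as None. cgroup v1 reports "no limit" as a very
    large number, which is passed through unchanged.
    """
    text = text.strip()
    if text == "max":
        return None
    return int(text)


def container_memory_limit(candidates=(CGROUP_V2_LIMIT, CGROUP_V1_LIMIT)):
    """Return the limit from the first readable candidate file, else None."""
    for candidate in candidates:
        try:
            return parse_limit(Path(candidate).read_text())
        except (OSError, ValueError):
            continue
    return None
```

A script like this could be exposed to the agent through a UserParameter entry under a key of your own choosing. Note that a `None` result is ambiguous: it covers both "no limit configured" and "no limit file found".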
Comment by Daniel Berteaud [ 2018 Jan 10 ] |
While it's possible to create a custom UserParameter, we can't override default keys with a UserParameter, so we have to maintain separate templates for containers. It would be better to be able to use a generic template and get the correct values. |
Comment by kvaps [ 2018 Jan 25 ] |
Hi, I've found that CPU load is collected incorrectly too, so I've added a new UserParameter to my Zabbix template: |
Comment by kvaps [ 2018 Feb 14 ] |
It occurred to me that in some cases this behavior can work in my favor, for example when I want to monitor the physical host but run zabbix-agent inside a container. |
Comment by Maksims Edelmans [ 2018 Feb 28 ] |
Any comments from the Zabbix team? |
Comment by David Česal [ 2018 May 13 ] |
Zabbix team, please look at this issue. The fix is probably quite easy. |
Comment by Tomasz Kłoczko [ 2019 Jun 20 ] |
Zabbix agent monitors memory through the sysinfo() syscall. It looks like Linux namespaces still do not wrap what should be returned inside the container. Even if Zabbix switches from the sysinfo() syscall to what /proc/meminfo provides, it will still be an issue on the kernel side. I don't know how it is with the latest kernel versions, but at least Ubuntu 4.4.0-141-generic still has this issue unsolved. |
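For anyone who wants to check what /proc/meminfo reports inside their own container, a minimal Python sketch of parsing that file is shown below. It is only an illustration: the function name `parse_meminfo` is hypothetical, and it assumes the usual `Name:   value kB` line layout of the file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict.

    Values carrying a "kB" unit are converted to bytes; unitless counters
    (for example HugePages_Total) are kept as plain integers.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        amount = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            amount *= 1024
        values[name.strip()] = amount
    return values
```

Running `parse_meminfo(open("/proc/meminfo").read())["MemTotal"]` both on the host and inside the container shows whether the container sees its own limit or the host total.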
Comment by Enrique JEANNE [ 2019 Jul 16 ] |
Same problem here. |
Comment by Tomasz Kłoczko [ 2020 Sep 30 ] |
If someone at Zabbix is going to spend time on this issue, IMO it would be better to invest that time in fixing what the sysinfo() syscall returns and submit that patch to the kernel source tree. Nothing inside an LXC container, as with zones on Solaris, should present any data from the global zone. |
Comment by Dmitrijs Goloscapovs [ 2020 Sep 30 ] |
I researched similar behavior when Docker is used and drew some conclusions. Inside Docker containers, /proc/meminfo and sysinfo() return values from the host, not from the container. top and free also return host values, because they read from /proc/meminfo (and other files such as cpuinfo, diskstats, stat, swaps, uptime). Unlike LXCFS, Docker does not provide any substitutes for these files.
It would be undesirable to start fetching values from /proc/meminfo instead of sysinfo(), because it would introduce discrepancy and confusion:
My conclusion:
|
Comment by Tomasz Kłoczko [ 2020 Oct 01 ] |
@dgoloscapov As I wrote, IMO this is not a Zabbix bug per se, and it should be solved outside of Zabbix. I'm not sure who maintains the namespaces code in the Linux kernel, but this issue should be fixed there and only there. If it is fixed in the kernel, all types of containerisation will be possible to monitor OOTB. I would even abandon providing metrics based on sysfs content, because once the issue is solved at the kernel layer that code will only be legacy. |
Comment by Adrian Gattorno GIl [ 2020 Oct 25 ] |
I have the same problem. I hope there will be an official solution soon. Regards. |
Comment by Elbandi [ 2024 Sep 23 ] |
According to the GitHub issue, LXC can't/won't change the sysinfo() response, so "the bug should be solved outside" is not possible. We can call this situation a "development deadlock". |