[ZBX-12164] zabbix-agent get wrong memory values in Linux Containers Created: 2017 May 11  Updated: 2024 Sep 23  Resolved: 2020 Oct 19

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 4.0.24, 5.0.3, 5.2 (plan)
Fix Version/s: 5.2 (plan)

Type: Problem report Priority: Trivial
Reporter: Falk Hackenberger Assignee: Michael Veksler
Resolution: Won't fix Votes: 11
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux Container on proxmox


Issue Links:
Causes
causes ZBXNEXT-6228 Collecting containers metrics with re... Open
Team: Team C
Sprint: Sprint 68 (Sep 2020), Sprint 69 (Oct 2020)
Story Points: 3

 Description   

The zabbix-agent gets the wrong memory values in a Linux Container that is limited with cgroups.
It seems that the zabbix-agent reports the memory values of the host system and not those of the container, so the values are useless for triggers.
top in the container shows the correct values.
/proc/meminfo also shows the correct values.

A simple workaround is to detect whether you are in an LXC container (via /proc/1/environ) and not use these values in the trigger.
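
The detection step of this workaround could look roughly like the sketch below. It assumes that LXC sets `container=lxc` in the environment of PID 1, which is the usual convention but is not confirmed in this report. The function names are illustrative, and reading /proc/1/environ normally requires root.

```python
from pathlib import Path


def container_type(environ_bytes: bytes):
    """Return the value of the `container` variable from a raw environ blob, or None."""
    # /proc/<pid>/environ holds NUL-separated KEY=VALUE entries.
    for entry in environ_bytes.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep and key == b"container":
            return value.decode(errors="replace")
    return None


def running_in_lxc(environ_path: str = "/proc/1/environ") -> bool:
    """Best-effort check (assumption: LXC exports container=lxc to PID 1)."""
    try:
        data = Path(environ_path).read_bytes()
    except OSError:
        # Missing or unreadable file: report "not detected" rather than fail.
        return False
    return container_type(data) == "lxc"
```

A script wrapping `running_in_lxc()` could feed a custom item that a trigger then checks before trusting the memory values.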



 Comments   
Comment by kvaps [ 2017 Nov 28 ]

Agreed, I have the same problem.
I use Proxmox with LXC containers.

Comment by kvaps [ 2017 Nov 29 ]

I created a custom UserParameter to solve it:
https://medium.com/@kvapss/zabbix-solve-memory-monitoring-issue-inside-lxc-containers-98ddf191051c

Comment by Maksims Edelmans [ 2017 Dec 27 ]

I have the same problem on LXC machines. Zabbix agent returns total host memory instead of total container memory.

Comment by Daniel Berteaud [ 2018 Jan 10 ]

While it's possible to create a custom UserParameter, we can't override the default keys with a UserParameter, so we have to maintain separate templates for containers. It would be better to be able to use a generic template and get the correct value.

Comment by kvaps [ 2018 Jan 25 ]

Hi, I've found out that CPU load is collected incorrectly too, so I've added a new UserParameter to my Zabbix template:
https://github.com/kvaps/zabbix-linux-container-template
Please use it.

Comment by kvaps [ 2018 Feb 14 ]

It occurred to me that in some cases this behavior can play into my hands, for example when I want to monitor the physical host but run zabbix-agent inside a container.
In my opinion it would be a good idea to have an option in `zabbix-agentd.conf` to choose which data is collected: data from the container environment or data from the physical host.

Comment by Maksims Edelmans [ 2018 Feb 28 ]

Any comments from Zabbix Team?

Comment by David Česal [ 2018 May 13 ]

Zabbix Team, please look at this issue. The fix is probably quite easy. Thanks.

Comment by Tomasz Kłoczko [ 2019 Jun 20 ]

Zabbix agent monitors memory through the sysinfo() syscall. It looks like Linux namespaces still do not wrap what should be returned inside the container. Even if Zabbix switches from the sysinfo() syscall to what /proc/meminfo provides, it will still be an issue on the kernel side. I don't know how it is with the latest kernel versions, but at least Ubuntu 4.4.0-141-generic still has this issue unsolved.
IMO this problem needs to be sorted out on the kernel side, because the sysinfo() syscall should report correct values.

Comment by Enrique JEANNE [ 2019 Jul 16 ]

Same problem too

Comment by Tomasz Kłoczko [ 2020 Sep 30 ]

If someone at Zabbix is going to spend time on this issue, IMO it would be better to invest that time in fixing what the sysinfo() syscall returns and to submit that patch to the kernel source tree.

As with zones on Solaris, nothing inside an LXC container should present any data from the global zone.

Comment by Dmitrijs Goloscapovs [ 2020 Sep 30 ]

I researched similar behavior when Docker is used and reached some conclusions.

Inside Docker containers, /proc/meminfo and sysinfo() return values from the host, not from the container. top and free also return values from the host, as they fetch values from /proc/meminfo (and other files like cpuinfo, diskstats, stat, swaps, uptime). Docker does not provide any substitutes for these files the way LXCFS does.
A container's metrics (that respect its limits) can be fetched from the cgroups metrics in /sys/fs/cgroup/memory/ (memory.stat and other memory.* files). This path is valid for cgroups v1.
In the near future cgroups v2 will become mainstream (it's already used in Podman), and the cgroups sysfs file structure and hierarchy will be different.
Users can monitor an individual container's metrics from the host by using their containerization platform's APIs and tools.
On the other hand, agent2 with https://www.zabbix.com/integrations/docker exists.
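
As an illustration of reading limits from cgroups, here is a hedged Python sketch. The v1 file names (`memory.limit_in_bytes`, `memory.usage_in_bytes`) and v2 file names (`memory.max`, `memory.current`) are the standard kernel ones. The function names and the choice to pass the cgroup directory as a parameter are assumptions made for this example, not part of any Zabbix agent.

```python
from pathlib import Path

# (limit file, usage file) for each cgroup version.
CGROUP_FILES = {
    "v1": ("memory.limit_in_bytes", "memory.usage_in_bytes"),
    "v2": ("memory.max", "memory.current"),
}


def _read_value(path: Path):
    """Return the file's integer content, or None for the literal 'max' (no limit)."""
    text = path.read_text().strip()
    return None if text == "max" else int(text)


def cgroup_memory(cgroup_dir: str, version: str = "v1") -> dict:
    """Read the memory limit and current usage (bytes) from a cgroup directory."""
    limit_name, usage_name = CGROUP_FILES[version]
    base = Path(cgroup_dir)
    return {
        "limit": _read_value(base / limit_name),
        "usage": _read_value(base / usage_name),
    }
```

For cgroups v1 the directory would typically be /sys/fs/cgroup/memory/ as mentioned above. On v1, an unlimited cgroup reports a very large number instead of `max`, so callers may want to treat such values as "no limit" too.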

 

It would be undesirable to start fetching values from /proc/meminfo instead of sysinfo(), because it would introduce discrepancy and confusion:

  • LXCFS is a feature provided by LXC. Docker/Podman do not provide any substitutes for /proc/ files. Therefore, if the agent fetched values only from /proc/meminfo, it would behave differently across different container types.
  • A user may want to containerize the agent and collect metrics from the host (maybe even without using LXCFS). The user would have no choice if sysinfo() values were simply replaced with /proc/meminfo values.
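
For users who do want the LXCFS-backed values, a custom check could read /proc/meminfo directly. The following is a minimal Python sketch of such a parser. The function names are illustrative, and it assumes the usual `Name:   value kB` line layout.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into a {field: value} dict.

    Fields with a "kB" suffix are converted to bytes; others are kept as-is.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        amount = int(parts[0])
        # The kernel prints "kB" but means units of 1024 bytes.
        if len(parts) > 1 and parts[1] == "kB":
            amount *= 1024
        values[name.strip()] = amount
    return values


def read_meminfo(path: str = "/proc/meminfo") -> dict:
    """Read and parse the given meminfo file."""
    with open(path) as fh:
        return parse_meminfo(fh.read())
```

Inside an LXC container with LXCFS, `read_meminfo()["MemTotal"]` would then reflect the container's limit, while in Docker it would still show the host total, which is exactly the discrepancy described in the list above.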

 

My conclusion:

  • This is not a bug in the agent; it is just a small nuisance of containerization.
  • New agent metrics can be implemented to fetch a container's metrics using the cgroups sysfs. However, this would be a new feature and is out of this issue's scope.

 

 

Comment by Tomasz Kłoczko [ 2020 Oct 01 ]

@dgoloscapov As I wrote, IMO this is not per se a Zabbix bug, and it should be solved outside of Zabbix. I'm not sure who maintains the namespaces code in the Linux kernel, but this issue should be fixed there and only there. If it is fixed in the kernel, all types of containerisation can be monitored OOTB. I would even abandon providing metrics based on sysfs content, because once the issue is solved at the kernel layer that code will only be legacy.

Comment by Adrian Gattorno GIl [ 2020 Oct 25 ]

I have the same problem. I hope there will be an official solution soon.

Regards

Comment by Elbandi [ 2024 Sep 23 ]

According to the GitHub issue, LXC can't/won't change the sysinfo response, so "the bug should be solved outside" is not possible. We can call this situation a "development deadlock".

Generated at Wed Apr 30 06:37:59 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.