[ZBX-27156] Zabbix Agent 2 (7.0.20) spikes to 100% CPU and stays there
Created: 2025 Oct 29 Updated: 2026 Jan 30 Resolved: 2025 Nov 05 |
|
| Status: | Closed |
| Project: | ZABBIX BUGS AND ISSUES |
| Component/s: | None |
| Affects Version/s: | 7.0.20, 7.4.4 |
| Fix Version/s: | 7.0.21, 7.4.5, 8.0.0alpha2 |
| Type: | Problem report | Priority: | Critical |
| Reporter: | Marek Krolikowski | Assignee: | Eriks Sneiders |
| Resolution: | Fixed | Votes: | 12 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | 6h | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Zabbix Agent 2: 7.0.20 (upgrade from 7.0.19) |
||
| Story Points: | 0.125 | ||||||||||||
| Description |
|
Steps to Reproduce: Upgrade Zabbix Agent 2 from 7.0.19 to 7.0.20 using APT.
Expected Result:
Actual Result:
Evidence:
ps aux | grep zabbix_agent
zabbix 2052826 99.3 0.1 2142772 54332 ? Ssl 05:53 97:39 /usr/sbin/zabbix_agent2 -c /etc/zabbix/zabbix_agent2.conf |
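For anyone who needs to check many hosts for the symptom shown in the evidence above, here is a minimal sketch that parses `ps aux` output and flags `zabbix_agent2` processes with high CPU. The function names and the 90% threshold are illustrative assumptions, not part of any Zabbix tooling. Note that the %CPU column in `ps` is an average over the process lifetime, so a freshly spiking agent may take a while to cross the threshold.

```python
def parse_ps_line(line):
    """Parse one `ps aux` row into a dict; return None for rows that do not fit."""
    # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    parts = line.split(None, 10)
    if len(parts) < 11:
        return None
    try:
        return {
            "user": parts[0],
            "pid": int(parts[1]),
            "cpu": float(parts[2]),
            "command": parts[10],
        }
    except ValueError:
        # For example the header row, where PID is not a number.
        return None


def hot_agents(ps_output, threshold=90.0):
    """Return zabbix_agent2 rows whose %CPU is at or above `threshold` (assumed value)."""
    rows = (parse_ps_line(line) for line in ps_output.splitlines())
    return [
        row for row in rows
        if row
        and "zabbix_agent2" in row["command"]
        and not row["command"].startswith("grep")  # skip the grep process itself
        and row["cpu"] >= threshold
    ]
```

Feeding it the `ps` row from the evidence above returns a single entry for PID 2052826.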
| Comments |
| Comment by Marek Krolikowski [ 2025 Oct 29 ] |
*Environment recap*
*Symptom*
*What we traced (strace on hot process / threads)*
", 18756) = 18756 , ...) = EINPROGRESS
*Interpretation / hypothesis*
*What we already tried*
*Next diagnostics we can attach on request*
*Impact*
*Request*
*Representative strace snippets*
", 18756) = 18756 ], 128, 999, NULL, 0) = 1 |
| Comment by Marek Krolikowski [ 2025 Oct 29 ] |
|
The issue appears when using Zabbix Agent 2 together with the “MySQL by Zabbix agent 2” template. |
| Comment by Marek Krolikowski [ 2025 Oct 29 ] |
|
Another workaround confirmed: downgrading fixes the issue.
`apt-get install zabbix-agent2=1:7.0.19-1+debian13`
(The exact package suffix may vary by distro/architecture, e.g. …+debian12 or …arm64.) |
| Comment by Frank [ 2025 Oct 30 ] |
|
How is the issue resolved? We cannot stay forever on 7.0.19 as a workaround. Someone needs to fix the bug for an upcoming version of the agent |
| Comment by Eriks Sneiders [ 2025 Oct 30 ] |
|
Hi Frank Brandt! I understand your confusion. "Resolved" means that the issue has been found and fixed internally and is moving forward to be included in a release. |
| Comment by Frank [ 2025 Oct 30 ] |
|
OK thanks for the info. |
| Comment by Marek Krolikowski [ 2025 Oct 30 ] |
|
Just to confirm my understanding: the fix has been merged internally and is planned for Agent 2 v7.0.21. |
| Comment by Alex Kalimulin [ 2025 Oct 30 ] |
|
This is a high priority issue so 7.0.21 will be released ahead of the usual schedule. ETA early next week. |
| Comment by Fernando Viñan-Cano [ 2025 Oct 31 ] |
|
Came here from the forums, this is also happening for v7.4.4 which I see is not mentioned - my Fedora Server started giving my Proxmox host some CPU stress after I upgraded from v7.4.3, so I downgraded. |
| Comment by Marek Krolikowski [ 2025 Oct 31 ] |
|
Taomyn But they know about this issue in 7.4.x too look: |
| Comment by Eriks Sneiders [ 2025 Oct 31 ] |
|
Fixed in Zabbix agent 2
|
| Comment by Marek Krolikowski [ 2025 Nov 03 ] |
|
esneiders Tested packages:
root@taken:~# dpkg -l | grep zabbix
ii zabbix-agent2 1:7.0.21-1+debian13 amd64 Zabbix network monitoring solution - agent
ii zabbix-release 1:7.0-2+debian13 all Zabbix official repository configuration
Agent version:
root@taken:~# /usr/sbin/zabbix_agent2 -V
zabbix_agent2 (Zabbix) 7.0.21
Process list:
root@taken:~# ps aux | grep zabbix
zabbix 1837639 100.0 2.4 1921072 47852 ? Ssl 12:26 2:52 /usr/sbin/zabbix_agent2 -c /etc/zabbix/zabbix_agent2.conf
So even with *7.0.21* installed from the official repo on Debian, the agent still consumes 100% CPU in our setup. It looks like the issue is not fully resolved yet. |
| Comment by Stefan [ 2025 Nov 03 ] |
|
Sorry guys, but the bug is still there, and this time it has nothing to do with MySQL. I see a lot of:
futex(0x164bc20, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable) |
| Comment by Marek Krolikowski [ 2025 Nov 03 ] |
|
Thanks for confirming this, shad0w — I was starting to worry it was something specific to my setup only. Good to see you’re seeing the same futex loop pattern. |
| Comment by Eriks Sneiders [ 2025 Nov 03 ] |
|
Thank you for your input! We will keep investigating. shad0w, based on the stack trace I see you do log monitoring. If, as you mention, this is not connected to MySQL, is there any additional information you could provide? TaKeN, does the issue also persist for you regardless of MySQL monitoring? |
| Comment by Marek Krolikowski [ 2025 Nov 03 ] |
|
Yes esneiders, it still persists for me on the hosts where I monitor MySQL. To be precise:
So it’s not happening everywhere — only on the machines that previously had the MySQL-related monitoring and were upgraded from 7.0.19. |
| Comment by Stefan [ 2025 Nov 03 ] |
|
esneiders sorry, I was on two servers at the same time.. it looks like mysql related, because on servers where we don't monitor mysql everything is fine, this CPU usage is only where we monitor mysql.. |
| Comment by Marek Krolikowski [ 2025 Nov 03 ] |
|
*Observed behavior*
After upgrading to *Zabbix agent 2 7.0.21* the agent stays idle until the first *passive* MySQL check is requested by the server. At the moment the server sends a passive request for a MySQL item, the `zabbix_agent2` process jumps to *100% CPU* and stays there.
*Log excerpt from the agent*
2025/11/03 14:11:41.772059 received passive check request from "{\"request\":\"passive checks\",\"data\":[{\"key\":\"mysql.get_status_variables[\\\"tcp://127.0.0.1:3306\\\",\\\"zabbix\\\",\\\"XXXXXXXXXXXXXXXXXXXXXXXXXXXX\\\"]\",\"timeout\":30}]}": "1.1.1.1"
2025/11/03 14:11:41.772545 [1] created direct exporter task for plugin 'Mysql' ...
2025/11/03 14:11:41.772594 plugin Mysql: executing configurator task
2025/11/03 14:11:41.772912 plugin Mysql: executing starter task
2025/11/03 14:11:41.773132 [Mysql] creating new connection for host: 127.0.0.1
2025/11/03 14:11:41.773157 [Mysql] Created new connection: 127.0.0.1:3306
Right *after* this MySQL plugin initialization the agent process goes to 100% CPU.
*Important notes*
*So in short*
This looks consistent with the recent changes in the MySQL plugin and connection manager. |
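To correlate the CPU spike with the plugin initialization on other hosts, the log excerpt above can be scanned for the moment the MySQL plugin opens its connection. This is a minimal sketch that assumes the log line format shown in this comment; the function name is hypothetical and the pattern may need adjusting for other agent versions or debug levels.

```python
import re

# Matches lines such as:
# 2025/11/03 14:11:41.773157 [Mysql] Created new connection: 127.0.0.1:3306
# The format is taken from the excerpt in this thread and is an assumption otherwise.
_CONN_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"\[Mysql\] Created new connection: (?P<target>\S+)"
)


def mysql_connection_events(log_lines):
    """Return (timestamp, target) pairs for each MySQL connection creation line."""
    events = []
    for line in log_lines:
        match = _CONN_RE.match(line.strip())
        if match:
            events.append((match.group("ts"), match.group("target")))
    return events
```

Comparing the returned timestamps with the time the process CPU climbs (for example from `top` or `pidstat`) shows whether the spike follows the first passive MySQL check, as described above.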
| Comment by Tim Harman [ 2025 Nov 03 ] |
|
I've just upgraded to 7.4.5 and I'm still seeing this very high CPU usage. Was the fix deployed to 7.4.5? I've rolled this back to 7.4.3 to stop the CPU burn |
| Comment by Carlos Eduardo Commim [ 2025 Nov 04 ] |
|
7.4.5 without MySQL: no problems. With MySQL the same problem occurs. I reverted to 7.4.3, which shows no problems with or without MySQL. The problem only occurs in 7.4.4 and 7.4.5. OS: Ubuntu 22.04, 24.04 and 25.04 |
| Comment by Eddie Stassen [ 2025 Nov 04 ] |
|
Unfortunately the issue persists with 7.0.21 on Rocky Linux 9.6 |
| Comment by Marek Krolikowski [ 2025 Nov 04 ] |
|
I tested 7.0.21 again and the high CPU is still there. This time I profiled the agent with perf and it clearly shows that the CPU is burned inside the MySQL plugin housekeeping loop, not in the actual MySQL queries.
Top frames from perf report:
golang.zabbix.com/agent2/plugins/mysql.NewConnManager.gowrap1
golang.zabbix.com/agent2/plugins/mysql.(*ConnManager).housekeeper
golang.zabbix.com/agent2/plugins/mysql.(*ConnManager).closeUnused
a lot of time in Go timers / map iteration (runtime.selectgo, runtime.(*timer).maybeRunChan, etc.)
So the agent is basically spinning in the MySQL connection manager housekeeper goroutine and that’s what keeps one CPU at 100%. It still happens only on the hosts where I monitor MySQL (when I remove the “MySQL by Zabbix agent 2” template or when I downgrade the agent back to 7.0.19, the problem goes away). So it looks like the regression is in the MySQL plugin housekeeping logic introduced after 7.0.19, not in MySQL itself. |
| Comment by Marek Krolikowski [ 2025 Nov 04 ] |
|
I retested with 7.0.19 and the high CPU disappears. golang.zabbix.com/agent2/plugins/mysql.(*ConnManager).housekeeper golang.zabbix.com/agent2/plugins/mysql.(*ConnManager).closeUnused |
| Comment by Marek Krolikowski [ 2025 Nov 04 ] |
|
I re-checked on my side and with 7.0.21 the CPU burn is happening inside the MySQL plugin’s connection housekeeper. Details from `perf` (7.0.21):
When I downgrade the same host (same MySQL, same template) to:
apt-get install zabbix-agent2=1:7.0.19-1+debian13
the problem disappears immediately and CPU goes back to normal. So it looks like the new MySQL connection manager introduced in 7.0.20/7.0.21 creates/keeps more distinct connection entries, and the periodic cleanup (`housekeeper()`) becomes expensive and keeps the agent thread busy all the time. In short:
Please check `src/go/plugins/mysql/conn.go` around `ConnManager.housekeeper()` / `closeUnused()` and how the connection key is constructed — it’s likely iterating over too many connections every 10 seconds. |
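To make the discussion concrete, here is a language-neutral sketch (in Python, although the agent itself is written in Go) of the general pattern being described: a registry of connections keyed by connection details, and a housekeeper that periodically drops idle entries. This is not the Zabbix implementation, and it does not claim to show the actual bug. `ConnPool`, `close_unused`, `housekeeper`, the 60 second idle limit and the 10 second default interval are assumptions for illustration only.

```python
import threading
import time


class ConnPool:
    """Toy connection registry: key -> time the entry was last used (hypothetical)."""

    def __init__(self, max_idle=60.0):
        self.max_idle = max_idle
        self._last_used = {}
        self._lock = threading.Lock()

    def touch(self, key, now=None):
        """Record that the connection identified by `key` was just used."""
        with self._lock:
            self._last_used[key] = time.monotonic() if now is None else now

    def close_unused(self, now=None):
        """Drop entries idle for longer than max_idle; return the removed keys."""
        now = time.monotonic() if now is None else now
        with self._lock:
            stale = [k for k, t in self._last_used.items() if now - t > self.max_idle]
            for key in stale:
                del self._last_used[key]
        return stale

    def __len__(self):
        with self._lock:
            return len(self._last_used)


def housekeeper(pool, stop, interval=10.0):
    """Sweep the pool every `interval` seconds until `stop` is set."""
    if interval <= 0:
        raise ValueError("interval must be positive, or the loop would spin")
    # Event.wait() blocks for up to `interval` seconds and returns True once
    # `stop` is set, so the loop sleeps between sweeps instead of busy-waiting.
    while not stop.wait(interval):
        pool.close_unused()
```

Two properties of this pattern matter for the symptom in this thread: the cost of each sweep grows with the number of distinct keys, and the loop only stays cheap if it really blocks between sweeps. A loop of this shape whose wait returns immediately would occupy one CPU continuously.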
| Comment by Patrik Leifert [ 2025 Nov 04 ] |
|
Hello, just upgraded from 7.0.19 to 7.0.21 and I can confirm we have this issue too in our environment on Rocky Linux 9.6
[root@zabbix-proxy-88 ~]# ps aux | grep zabbix_agent
zabbix 2504612 95.9 1.6 1696636 30680 ? Ssl 10:09 24:35 /usr/sbin/zabbix_agent2 -c /etc/zabbix/zabbix_agent2.conf
Downgrading back to 7.0.19 helped
|
| Comment by Sergejs Maklakovs [ 2025 Nov 04 ] |
|
Hello! |
| Comment by Marek Krolikowski [ 2025 Nov 04 ] |
|
Hello smaklakovs I’ve just upgraded to the new build and can confirm the fix works. *Tested on:*
Packages now show e.g.:
zabbix-agent2 1:7.0.21-2+debian13
MySQL template is still attached to the host and the agent process no longer sticks at 100% CPU; it stays low and stable after several minutes of uptime. So the issue is resolved for 7.0.21-2 on these systems. |
| Comment by Stefan [ 2025 Nov 04 ] |
|
yep resolved |
| Comment by Carlos Eduardo Commim [ 2025 Nov 04 ] |
|
Hello! The new version 7.4.5-2 resolved the problem. |
| Comment by Patrik Leifert [ 2025 Nov 05 ] |
|
Also can confirm the new version 7.0.21-2 solved the problem in our RL9 environment. |
| Comment by Geoff Collins [ 2025 Nov 05 ] |
|
Also confirmed - problem appears resolved in 7.4.5-2 (on Amazon Linux 2023) |
| Comment by Fernando Viñan-Cano [ 2025 Nov 05 ] |
|
Confirmed for Fedora 42 once the release2 version was available |
| Comment by Antti Hurme [ 2025 Nov 05 ] |
|
7.0.21-2 confirmed to fix with rhel9 and mariadb 10.11. |
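Summarizing the reports in this thread, a small sketch can tell whether an installed package is one of the builds reported as affected. It assumes Debian-style version strings as shown in the `dpkg -l` output above (for example `1:7.0.21-1+debian13`), and it assumes the first 7.4.5 build carried package revision 1, which the thread implies but does not state. RPM-style version strings are not handled. The names `AFFECTED` and `is_affected` are hypothetical.

```python
import re

# Optional epoch ("1:"), upstream version ("7.0.21"), then package revision ("-1").
_VER_RE = re.compile(r"^(?:\d+:)?(?P<upstream>\d+\.\d+\.\d+)-(?P<rev>\d+)")

# Builds reported in this thread as stuck at 100% CPU with the MySQL template.
# None means every package revision of that upstream version; a number is the
# highest affected revision (7.0.21-2 and 7.4.5-2 were reported as fixed).
AFFECTED = {
    "7.0.20": None,
    "7.4.4": None,
    "7.0.21": 1,
    "7.4.5": 1,  # assumption: the first 7.4.5 build was revision 1
}


def is_affected(pkg_version):
    """Return True if a Debian-style zabbix-agent2 version is a reported-bad build."""
    match = _VER_RE.match(pkg_version)
    if not match:
        raise ValueError("unrecognized version string: %r" % pkg_version)
    upstream = match.group("upstream")
    revision = int(match.group("rev"))
    if upstream not in AFFECTED:
        return False
    max_bad_revision = AFFECTED[upstream]
    return max_bad_revision is None or revision <= max_bad_revision
```

A `False` result only means the version is not among those reported here; it is not a guarantee about versions the thread never discussed.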