[ZBX-18390] zabbix-agent2 "driver: bad connection" on MySQL check Created: 2020 Sep 18  Updated: 2021 Mar 05

Status: Need info
Project: ZABBIX BUGS AND ISSUES
Component/s: None
Affects Version/s: 5.0.3
Fix Version/s: None

Type: Problem report Priority: Trivial
Reporter: Dmitry Verkhoturov Assignee: Renats Valiahmetovs (Inactive)
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian 10
mariadb-server-10.3 1:10.3.23-0+deb10u1
zabbix-agent2 1:5.0.3-1+buster


Attachments: PNG File image-2020-09-18-00-27-49-997.png     PNG File image-2020-09-18-00-31-13-799.png     PNG File image-2021-03-05-01-23-33-375.png    

 Description   

I have a MariaDB server in which I created a user according to "Template DB MySQL by Zabbix agent 2" instruction.
Next to it, I have zabbix-agent2 with the following config:

$ egrep -v '^#|^$' /etc/zabbix/zabbix_agent2.conf
PidFile=/var/run/zabbix/zabbix_agent2.pid
LogFile=/var/log/zabbix/zabbix_agent2.log
LogFileSize=0
ServerActive=example.org,<server external IP>
Hostname=server.example.org
Include=/etc/zabbix/zabbix_agent2.d/*.conf
ControlSocket=/tmp/agent.sock

In the web-interface, I have "Template DB MySQL by Zabbix agent 2" added to that host. The only alteration I made was changing the items' type from "Zabbix agent" to "Zabbix agent (active)" so that it works without incoming connections, but the problem manifested itself before that change as well.

The problem is that MySQL data from the agent comes in a very unreliable manner. Here is the log of the agent:

$ tail -1000 /var/log/zabbix/zabbix_agent2.log | cut -d ' ' -f 3- | sort | uniq -c
    264 check 'mysql.db.discovery["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password"]' is not supported: driver: bad connection
    263 check 'mysql.db.size["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password","db_name"]' is not supported: driver: bad connection
    209 check 'mysql.db.size["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password","other_db_name"]' is not supported: driver: bad connection
    264 check 'mysql.get_status_variables["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password"]' is not supported: driver: bad connection

And here is what I see in the web-interface:

Data collection flaps with an unhelpful error message. MySQL error log:

$ grep zbx_monitor /var/log/mysql/error.log
<...>
2020-09-18  1:23:27 611815 [Warning] Aborted connection 611815 to db: 'unconnected' user: 'zbx_monitor' host: 'localhost' (Got an error writing communication packets)
2020-09-18  1:23:48 611818 [Warning] Aborted connection 611818 to db: 'unconnected' user: 'zbx_monitor' host: 'localhost' (Got timeout reading communication packets)
$ grep zbx_monitor /var/log/mysql/error.log | grep -c 'Got an error writing communication packets'
1
$ grep zbx_monitor /var/log/mysql/error.log | grep -c 'Got timeout reading communication packets'
129
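
The two counts above can also be produced in one pass. The following is a minimal sketch, assuming the "Aborted connection" lines always end with the reason in parentheses, as in the excerpt above; the function name `abort_reasons` is made up for illustration:

```shell
# Count "Aborted connection" warnings per reason for one MySQL user.
# Assumes the reason is the parenthesised text at the end of the line.
# Usage: abort_reasons USER < /var/log/mysql/error.log
abort_reasons() {
    user="$1"
    # Keep only lines for the given user, extract the text inside the
    # trailing parentheses, then count each distinct reason.
    grep -F "user: '${user}'" \
        | sed -n 's/.*(\(.*\))$/\1/p' \
        | sort | uniq -c | sort -rn
}
```

It could be run as `abort_reasons zbx_monitor < /var/log/mysql/error.log`, which lists every distinct reason with its count, most frequent first.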

The unhelpful message may originate in the MySQL driver library; however, the agent could wrap it with more context to make it useful.

Also if there is a way to help me figure out why exactly this is happening, I would be very grateful.



 Comments   
Comment by Aigars Kadikis [ 2020 Sep 21 ]

Also if there is a way to help me figure out why exactly this is happening, I would be very grateful.

This can also happen if you have 'Timeout=30' in zabbix_agentd.conf and the host has one heavy UserParameter installed (not related to MySQL monitoring) which consumes all 30 seconds to obtain one metric. It's a rare case, but still a possibility.

Please also try to simulate the check via the passive channel, just to rule out a problem with the DB engine:

time zabbix_get -s 127.0.0.1 -p 10050 -k 'mysql.db.discovery["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password"]'
time zabbix_get -s 127.0.0.1 -p 10050 -k 'mysql.db.size["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password","db_name"]'
time zabbix_get -s 127.0.0.1 -p 10050 -k 'mysql.db.size["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password","other_db_name"]'
time zabbix_get -s 127.0.0.1 -p 10050 -k 'mysql.get_status_variables["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password"]'

How quickly are the metrics obtained now?

For testing purposes, can you try leaving only the MySQL monitoring on this host? Is the behaviour the same?
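
Because the data collection flaps, a single zabbix_get run may not be representative. The following is a minimal sketch that repeats a check and counts how many runs come back as not supported. It assumes that a failed check prints a line starting with ZBX_NOTSUPPORTED; the function name `count_unsupported` is made up for illustration:

```shell
# Run a check command N times and report how many runs were not supported.
# Assumes a failed check prints output starting with "ZBX_NOTSUPPORTED".
# Usage: count_unsupported N COMMAND [ARGS...]
count_unsupported() {
    runs="$1"; shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture stdout and stderr of one run of the given command.
        out=$("$@" 2>&1)
        case "$out" in
            ZBX_NOTSUPPORTED*) failed=$((failed + 1)) ;;
        esac
        i=$((i + 1))
    done
    echo "${failed}/${runs} runs returned ZBX_NOTSUPPORTED"
}
```

For example: `count_unsupported 20 zabbix_get -s 127.0.0.1 -p 10050 -k 'mysql.db.discovery["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password"]'`.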

Comment by Dmitry Verkhoturov [ 2020 Sep 23 ]
root@kvmru05-19571:~# time zabbix_get -s 127.0.0.1 -p 10050 -k mysql.db.size["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password","db_name"]
ZBX_NOTSUPPORTED: driver: bad connection

real	0m0.012s
user	0m0.008s
sys	0m0.002s

The bad answer comes back right away. I can't remove the other templates, as I'd lose all history that way.

Comment by Dmitry Verkhoturov [ 2021 Feb 17 ]

I've tested once more with zabbix_agent2 (Zabbix) 5.2.3 Revision ae46273 21 December 2020, compilation time: Dec 30 2020 23:47:20, running inside a container. It gets all the MySQL information once after a restart and then spams the logs with this error:

2021-02-17T19:56:26.000954736Z 2021/02/17 19:56:26.000807 check 'mysql.get_status_variables["unix:/var/run/mysqld/mysqld.sock","zbx_monitor","password"]' is not supported: driver: bad connection
Comment by Dmitry Verkhoturov [ 2021 Mar 05 ]

I found the root cause of the issue: multiple servers in ServerActive (two in my case) were causing the agent's misbehavior. After leaving only one server in the list, everything works fine.

I believe I've seen this with a MariaDB version based on 5.7 and with PerconaDB 5.7 and 8.0. The latest change in behaviour, after leaving a single ServerActive entry, was observed on zabbix_agent2 (Zabbix) 5.2.5 Revision 1afd0de 22 February 2021, compilation time: Feb 22 2021 16:08:20.
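
To check whether a host is exposed to the same condition, the number of ServerActive entries can be counted from the agent configuration. The following is a minimal sketch, assuming the entries are comma-separated on a single ServerActive line as in the config shown in the description; the function name `server_active_count` is made up for illustration:

```shell
# Print the number of comma-separated entries in the ServerActive parameter.
# Commented-out lines are ignored. If the parameter appears more than once,
# this sketch reports the last occurrence (an assumption, not agent behaviour).
# Usage: server_active_count < /etc/zabbix/zabbix_agent2.conf
server_active_count() {
    awk -F= '
        /^[ \t]*ServerActive[ \t]*=/ {
            n = split($2, parts, ",")
        }
        END { print n + 0 }
    '
}
```

A result greater than 1 means the agent reports to several servers, which is the setup described above as triggering the error.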

 

Renats Valiahmetovs, would you be so kind as to test whether you can reproduce it?

Generated at Wed May 21 06:23:36 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.