Incident report
Resolution: Cannot Reproduce
Critical
3.2.3
We currently have 10 ODBC targets configured in /etc/odbc.ini like this:
[mysql.example.com]
Description=standalone MySQL server
Driver=MySQL
Server=mysql.example.com
Port=3306
User=monitoring
Password=foobar

[galera1.example.com]
Description=Galera node 1
Driver=MySQL
Server=galera1.example.com
Port=3306
User=monitoring
Password=foobar

[galera2.example.com]
Description=Galera node 2
Driver=MySQL
Server=galera2.example.com
Port=3306
User=monitoring
Password=foobar
For each ODBC configuration there is a host in Zabbix with the following item/trigger configuration:
Item Key: db.odbc.select[version,{HOST.HOST}]
SQL query: select VARIABLE_VALUE from information_schema.GLOBAL_VARIABLES where VARIABLE_NAME = 'version';
Trigger Expression: {Template App MySQL ODBC:db.odbc.select[version,{HOST.HOST}].nodata(1m)}=1
(Besides that, 50 other items are configured for each ODBC connection/host to run more interesting SQL queries. In total there are 500+ items configured that rely on these ODBC connections.)
Now if one of these MySQL hosts is no longer reachable (i.e. the connection attempt times out) and the trigger changes to PROBLEM state, all other hosts with this item/trigger will be reported as in PROBLEM state too.
It is interesting to note that the ODBC connections to the other hosts are not dead when this occurs: "Latest Data" still shows new values coming in for the other ODBC hosts.
The trigger for these other ODBC hosts may change its state to OK and back to PROBLEM once in a while. (I couldn't find a pattern.)
If I disable the offline ODBC host, the trigger for all other ODBC hosts will change back to OK state.
IMPORTANT:
It seems to be related to how the connection attempt to the MySQL server (ODBC target) is answered: in this case the connection attempt just timed out. Only this behaviour seems to cause the issue.
In other cases, for example when the connection attempt failed with "connection refused" or something similar, the trigger and all other ODBC hosts work as expected.
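To tell the two failure modes apart from the Zabbix server's shell, a small TCP probe like the one below may help. It is only a sketch: the function name classify_connect and the 3-second timeout are my own choices, and it checks plain TCP reachability on port 3306, not the ODBC driver or the MySQL handshake.

```python
import socket


def classify_connect(host, port, timeout=3.0):
    """Classify one TCP connect attempt as 'ok', 'refused', 'timeout' or 'error: ...'."""
    try:
        # create_connection applies the timeout to the connect() call itself.
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # The target answered with a reset: the fast-failing case.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all within the timeout: the case that triggers the issue.
        return "timeout"
    except OSError as exc:
        # DNS failure, unreachable network, and so on.
        return f"error: {exc}"


if __name__ == "__main__":
    # Host names taken from the odbc.ini excerpt in this report.
    for name in ("mysql.example.com", "galera1.example.com", "galera2.example.com"):
        print(name, classify_connect(name, 3306))
```

A host that reports "timeout" here would be the kind that appears to drag the other ODBC hosts into PROBLEM state, while a host that reports "refused" should fail fast and leave the others alone.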
This sounds like either a timing issue, an ODBC limitation, or too many SQL queries?
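To illustrate the timing hypothesis only, here is a toy model. It is not how Zabbix schedules checks; it assumes a single worker that runs every check sequentially, and every number in it (3 hosts, 5 items per host, 30-second interval, 30-second connect timeout, 0.1 seconds per healthy query) is made up. It measures the longest gap between two values delivered for one item on a healthy host.

```python
import heapq


def max_gap_for_healthy_host(hosts, items_per_host, interval, ok_cost,
                             timeout, offline, horizon):
    """Longest gap (seconds) between values for item 0 of host 1 in a toy model.

    Assumption: ONE worker runs all checks one after another. Host 0 is the
    offline host when `offline` is True, and each of its checks then blocks
    for `timeout` seconds. This is a hypothetical model, not Zabbix code.
    """
    # Queue entries are (due_time, host, item); everything is due at t=0.
    queue = [(0.0, h, i) for h in range(hosts) for i in range(items_per_host)]
    heapq.heapify(queue)
    clock = 0.0
    delivered = []  # completion times of the watched item
    while queue and clock < horizon:
        due, host, item = heapq.heappop(queue)
        clock = max(clock, due)  # idle until the check is due
        clock += timeout if (offline and host == 0) else ok_cost
        if host == 1 and item == 0:
            delivered.append(clock)
        # Next run is scheduled one interval after this one finished.
        heapq.heappush(queue, (clock + interval, host, item))
    gaps = [b - a for a, b in zip(delivered, delivered[1:])]
    return max(gaps) if gaps else float("inf")


if __name__ == "__main__":
    for offline in (False, True):
        gap = max_gap_for_healthy_host(hosts=3, items_per_host=5, interval=30,
                                       ok_cost=0.1, timeout=30,
                                       offline=offline, horizon=1000)
        print("offline host present:", offline, "max gap:", round(gap, 1))
```

In this model the healthy host keeps delivering values, but once the offline host is present the gaps between them grow past the 60-second nodata(1m) window. That would match "Latest Data" still showing new values while the trigger flaps. Whether the real pollers behave like this is exactly what I cannot tell.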