-
Incident report
-
Resolution: Won't fix
-
Blocker
-
None
-
4.0.2
We have repeatedly run into the same issue. Zabbix Server has not crashed, but data is no longer collected and the frontend is blocked. In the log we always see the same kind of queries failing, and always during the housekeeper execution.
Please note that we use LLD intensively (SNMP and external checks).
It seems to start shortly (roughly 30 seconds) after the housekeeper is executed.
We always see errors on updating item_discovery (see zabbix_server.log attached),
e.g.
22017:20190110:130525.841 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update item_discovery set lastcheck=1547121824 where (itemid between 616377 and 616391 or itemid between 669993 and 670010 or itemid between 733218 and 733223 or itemid between 735766 and 735771 or itemid between 774180 and 774221 or itemid in (616338,616339,616340,616359,616360,616361));
]
22017:20190110:130525.841 slow query: 101.629963 sec, "update item_discovery set lastcheck=1547121824 where (itemid between 616377 and 616391 or itemid between 669993 and 670010 or itemid between 733218 and 733223 or itemid between 735766 and 735771 or itemid between 774180 and 774221 or itemid in (616338,616339,616340,616359,616360,616361));
Zabbix stops collecting data and the frontend stops responding; we have to restart both MariaDB and Zabbix to get back to a stable state.
It is difficult to reproduce because it happens at random times (already 3 times in the last 3 weeks).
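Since the timing relative to the housekeeper is only an impression so far, a small script could help confirm it from the attached log. The following is a minimal sketch, not a tested tool: it assumes the usual `PID:YYYYMMDD:HHMMSS.mmm message` log line layout seen in the excerpt above, and it assumes that the housekeeper start is logged with the text `executing housekeeper` (adjust `housekeeper_marker` if the log uses a different message). The function names are hypothetical.

```python
import re
from datetime import datetime

# Assumed line layout: "PID:YYYYMMDD:HHMMSS.mmm message"
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6})\.(\d{3})\s+(.*)$")


def parse_line(line):
    """Return (pid, timestamp, message), or None for lines without a log prefix."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    pid, date, time, millis, message = m.groups()
    stamp = datetime.strptime(date + time, "%Y%m%d%H%M%S")
    stamp = stamp.replace(microsecond=int(millis) * 1000)
    return int(pid), stamp, message


def lock_timeouts_after_housekeeper(lines, housekeeper_marker="executing housekeeper"):
    """List (timestamp, seconds since last housekeeper start) for each
    "[Z3005] ... [1205]" lock wait timeout error.

    The delay is None if no housekeeper start was seen before the error.
    Note that the timestamp is when the query failed, not when it started.
    """
    last_start = None
    results = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # continuation lines such as "]" carry no prefix
        _pid, stamp, message = parsed
        if housekeeper_marker in message:
            last_start = stamp
        elif "[Z3005]" in message and "[1205]" in message:
            delay = None if last_start is None else (stamp - last_start).total_seconds()
            results.append((stamp, delay))
    return results
```

It could be run over the whole file, for example with `lock_timeouts_after_housekeeper(open("zabbix_server.log"))`, to see whether the failures consistently follow a housekeeper run by a similar delay.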