[ZBX-15438] Zabbix unresponsive after '[Z3005] query failed: [1205] Lock wait timeout exceeded' Created: 2019 Jan 11 Updated: 2019 Jan 11 Resolved: 2019 Jan 11 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Frontend (F), Server (S) |
Affects Version/s: | 4.0.2 |
Fix Version/s: | None |
Type: | Incident report | Priority: | Blocker |
Reporter: | Erik De Neve | Assignee: | Unassigned |
Resolution: | Won't fix | Votes: | 0 |
Labels: | database, lld | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Galera MariaDB cluster with 3 nodes (2 masters and 1 arbitrator). |
Attachments: | mariadb.conf zabbix_server.conf zabbix_server.log |
Description |
We have run into the same issue multiple times. Zabbix Server has not crashed, but data is no longer collected and the frontend is blocked. In the log we always see the same kind of queries failing, and always during housekeeper execution. Please note that we use LLD intensively (SNMP and external checks). The failures seem to start some seconds (roughly 30 seconds) after the housekeeper is executed, e.g.:
22017:20190110:130525.841 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update item_discovery set lastcheck=1547121824 where (itemid between 616377 and 616391 or itemid between 669993 and 670010 or itemid between 733218 and 733223 or itemid between 735766 and 735771 or itemid between 774180 and 774221 or itemid in (616338,616339,616340,616359,616360,616361));
Zabbix stops collecting data and the frontend does not respond; we have to restart MariaDB and Zabbix to get back to a stable situation. The issue is difficult to reproduce because it happens at random times (already 3 times in the last 3 weeks). |
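To check the reported timing (lock wait timeouts starting some seconds after the housekeeper runs), the server log can be scanned for both events. The following is a minimal sketch, not an official Zabbix tool. It assumes the log prefix has the form PID:YYYYMMDD:HHMMSS.mmm, as in the line quoted above, and that housekeeper runs are marked by a line containing "executing housekeeper". That marker text, the function names and the first sample line are assumptions made for illustration; adjust them to match the attached zabbix_server.log.

```python
import re
from datetime import datetime

# Log prefix as seen in the quoted line: PID:YYYYMMDD:HHMMSS.mmm <message>
LINE_RE = re.compile(
    r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3})\s+(?P<msg>.*)$"
)
# Error text copied from the quoted log line.
LOCK_TIMEOUT = "[Z3005] query failed: [1205]"


def parse_line(line):
    """Return (pid, timestamp, message), or None if the line has no log prefix."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m["date"] + m["time"], "%Y%m%d%H%M%S.%f")
    return int(m["pid"]), ts, m["msg"]


def seconds_since_marker(lines, marker="executing housekeeper"):
    """For each lock wait timeout, return (timestamp, seconds since the most
    recent marker line), with None when no marker line has been seen yet.

    The default marker text is an assumption; pass whatever the real log uses.
    """
    last_marker = None
    results = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _pid, ts, msg = parsed
        if marker in msg:
            last_marker = ts
        elif LOCK_TIMEOUT in msg:
            delta = (ts - last_marker).total_seconds() if last_marker else None
            results.append((ts, delta))
    return results


SAMPLE = [
    # Hypothetical housekeeper line, invented for illustration only.
    "22010:20190110:130455.000 executing housekeeper",
    # Start of the log line quoted in the description (SQL shortened here).
    "22017:20190110:130525.841 [Z3005] query failed: [1205] Lock wait "
    "timeout exceeded; try restarting transaction [update item_discovery ...",
]

if __name__ == "__main__":
    for ts, delta in seconds_since_marker(SAMPLE):
        print(ts.isoformat(), "seconds since housekeeper start:", delta)
```

Running the same function over the real log file (for example `seconds_since_marker(open("zabbix_server.log", errors="replace"))`) would show whether the delays cluster around a fixed value, which could help when asking for tuning advice.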
Comments |
Comment by Arturs Lontons [ 2019 Jan 11 ] |
Hi, the root cause of the issue seems to be related to database performance. There are some additional steps that you could take to improve DB query performance, for example enabling large pages, increasing the LLD update interval, or implementing DB partitioning. Feel free to use our forums at https://www.zabbix.com/forum to ask for database performance tuning advice for your specific environment. Since this is not a Zabbix bug report, I will be closing the ticket. |