[ZBX-5225] Frequent "Lock wait timeout exceeded; try restarting transaction" on zabbix_server. LLD related ? Created: 2012 Jun 21  Updated: 2017 May 30  Resolved: 2013 Apr 25

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 2.0.0, 2.0.1
Fix Version/s: 2.0.6rc1, 2.1.0

Type: Incident report Priority: Critical
Reporter: Alf Solli Assignee: Unassigned
Resolution: Fixed Votes: 11
Labels: database, deadlock, lld
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

MySQL 5.5.25 on Suse Enterprise linux, zabbix_server on OpenSUSE 12.1 (both 64Bit)


Attachments: Text File my.cnf.txt     Text File mysql_slow_query_log.txt     Text File zabbix_server.conf.txt    
Issue Links:
Duplicate
duplicates ZBX-6303 deadlocks in ORA-00060 in new zabbix ... Closed
is duplicated by ZBX-5272 Database deadlocks on PostgreSQL 9.1.... Closed
is duplicated by ZBX-5269 periodically in the logs: query faile... Closed
is duplicated by ZBX-6011 query failed: [0] PGRES_FATAL_ERROR:E... Closed

 Description   

On a fresh 2.0.0 installation using only built-in templates (SNMP devices), I frequently get messages like these in /tmp/zabbix_server.log:
11858:20120621:173328.785 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update items set lastclock=1340292671,lastns=654346890,prevvalue=lastvalue,lastvalue='ET-WLB01' where itemid=23471;
11859:20120621:173457.775 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update items set lastclock=1340290488,lastns=545936090,prevvalue=lastvalue,lastvalue='30.000000' where itemid=23328;

Even more often: 11787:20120621:173328.800 slow query: 104.092488 sec, "update items set name='Free inodes on $1 (percentage)',key_='vfs.fs.inode[/,pfree]',type=0,value_type=0,data_type=0,delay=60,delay_flex='',history=7,trends=365,trapper_hosts='',units='%',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=null,params='',ipmi_sensor='',snmp_community='',snmp_oid='',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='',interfaceid=7,flags=4 where itemid=23696;
update item_discovery set key_='vfs.fs.inode[{#FSNAME},pfree]',lastcheck=1340292704,ts_delete=0 where itemid=23696 and parent_itemid=23507;

All devices were added using Discovery on a few subnets, adding anything that responds and linking it to the SNMP Device template. Currently 271 devices are added.
(Note: I also disabled everything except IfAlias, which is updated only once per day, to lower traffic.)

Zabbix server runs on a VM with 6 2.5 GHz Xeon cores and 2 GB memory.
MySQL runs on dedicated iron with 8 2.5 GHz Xeon cores and 8 GB memory, with /var/lib/mysql mounted on a 4-disk RAID5 volume of 10k RPM SAS disks.
That should be able to do some damage, but I'm seeing close to 2k QPS on MySQL on average.

Attaching slow query log, zabbix_server.conf and my.cnf.



 Comments   
Comment by Alf Solli [ 2012 Jun 24 ]

Ok, update:
Tried downgrading to MySQL 5.0; no difference, but I kept it running there.

I noticed that a lot of the SQL queries causing timeouts were trying to update IfAdminStatus, which is an item I specifically disabled in the template.
The change didn't take effect on all my hosts though, so I removed all hosts added before I edited the template (SNMP interfaces), and voilà: problem solved.

Then I just waited for the discovery rule and actions to be applied for that particular subnet, and sure enough, all hosts were re-added, low-level discovered and my setup ran fine for ~12 hours.

So thinking all was good, I started adding a couple more subnets (~200 snmp devices), linking all discovered units to the same template.

All devices are found, added to groups and linked to the SNMP Device template, and the following is the first abnormal message in zabbix_server.log before the timeout messages reappear and all sorts of sh*t hit the fan:

update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937385;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937386;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937387;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937388;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937389;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937390;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937391;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937392;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937393;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937394;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937395;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937396;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937397;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937398;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937399;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937400;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937401;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937402;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937403;
update item_discovery set ts_delete=1343088313 where itemdiscoveryid=937404;

This is what's spamming my log from now on:

update item_discovery set key_='ifAdminStatus[{#SNMPVALUE}]',lastcheck=1340497450,ts_delete=0 where itemid=228102 and parent_itemid=227822;
update items set name='Admin status of interface $1',key_='ifAdminStatus[null0]',type=4,value_type=3,data_type=0,delay=60,delay_flex='50/1-7,00:00-24:00',history=7,trends=365,trapper_hosts='',units='',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=11,params='',ipmi_sensor='',snmp_community='{$SNMP_COMMUNITY}',snmp_oid='IF-MIB::ifAdminStatus.27',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='The desired state of the interface.',interfaceid=879,flags=4 where itemid=228104;
update item_discovery set key_='ifAdminStatus[{#SNMPVALUE}]',lastcheck=1340497450,ts_delete=0 where itemid=228104 and parent_itemid=227822;
update items set name='Admin status of interface $1',key_='ifAdminStatus[system0]',type=4,value_type=3,data_type=0,delay=60,delay_flex='50/1-7,00:00-24:00',history=7,trends=365,trapper_hosts='',units='',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=11,params='',ipmi_sensor='',snmp_community='{$SNMP_COMMUNITY}',snmp_oid='IF-MIB::ifAdminStatus.28',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='The desired state of the interface.',interfaceid=879,flags=4 where itemid=228105;
update item_discovery set key_='ifAdminStatus[{#SNMPVALUE}]',lastcheck=1340497450,ts_delete=0 where itemid=228105 and parent_itemid=227822;
update items set name='Admin status of interface $1',key_='ifAdminStatus[downstream0]',type=4,value_type=3,data_type=0,delay=60,delay_flex='50/1-7,00:00-24:00',history=7,trends=365,trapper_hosts='',units='',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=11,params='',ipmi_sensor='',snmp_community='{$SNMP_COMMUNITY}',snmp_oid='IF-MIB::ifAdminStatus.29',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='The desired state of the interface.',interfaceid=879,flags=4 where itemid=228106;
update item_discovery set key_='ifAdminStatus[{#SNMPVALUE}]',lastcheck=1340497450,ts_delete=0 where itemid=228106 and parent_itemid=227822;
update items set name='Admin status of interface $1',key_='ifAdminStatus[loopback0]',type=4,value_type=3,data_type=0,delay=60,delay_flex='50/1-7,00:00-24:00',history=7,trends=365,trapper_hosts='',units='',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=11,params='',ipmi_sensor='',snmp_community='{$SNMP_COMMUNITY}',snmp_oid='IF-MIB::ifAdminStatus.30',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='The desired state of the interface.',interfaceid=879,flags=4 where itemid=228107;
update item_discovery set key_='ifAdminStatus[{#SNMPVALUE}]',lastcheck=1340497450,ts_delete=0 where itemid=228107 and parent_itemid=227822;
"
1954:20120624:024807.146 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update ids set nextid=nextid+29 where nodeid=0 and table_name='items' and field_name='itemid']
1954:20120624:024807.146 slow query: 51.532908 sec, "update ids set nextid=nextid+29 where nodeid=0 and table_name='items' and field_name='itemid'"
zabbix_server [1954]: ERROR file:db.c,line:1412 Something impossible has just happened.

At this point, I have no clue what to do or what's causing it, other than suspecting it's related to Discovery.
Guess that wasn't much help, but if anyone has any tips, I'd appreciate hearing about them.

Comment by Renato Ramonda [ 2012 Jul 05 ]

As you can see from the duplicate bugs, this seems to affect PostgreSQL 9.1.4 too.

Comment by dimir [ 2012 Jul 06 ]

Would it be possible to also attach zabbix_server.log? DebugLevel=4 would be great, but if that's not possible, DebugLevel=3 would still be helpful.

Comment by Renato Ramonda [ 2012 Jul 16 ]

I can confirm that disabling discovery solves the problem.

Comment by Oleksii Zagorskyi [ 2012 Jul 16 ]

We know that the LLD engine performs a LOT of DB updates that are not strictly required, which I suppose is the root of the problem.
Maybe it could be improved? I.e. do not update DB records when it's not required? (A minimal sketch of the idea is shown below.)

<richlv> do we have a separate issue about that? if not, one should be created

zalex_ua I think this issue is very suitable for the problem I described. If the devs suggest creating a separate issue, I'll do that.
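
To illustrate the idea only (this is not what the server does in 2.0.x), here is a minimal SQL sketch based on one of the item updates from the log above; the actual 2.0.6 fix described further down skips issuing the UPDATE entirely when nothing changed:

-- Hypothetical sketch: compare before writing, so an item whose discovered
-- key is unchanged is not rewritten on every LLD run.
update items
   set key_='ifAdminStatus[null0]'
 where itemid=228104
   and key_<>'ifAdminStatus[null0]';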

Comment by Alexander Vladishev [ 2012 Aug 09 ]

Related issue: ZBX-4878

Comment by Alexey [ 2012 Dec 19 ]

Guys, what do you need to fix it? The bug has been open for a long time.

Comment by Guy Van Sanden [ 2012 Dec 20 ]

I did some tuning to MySQL and these settings made the problem go away:

max_connections = 400
query_cache_limit = 1M
query_cache_size = 128M
max_heap_table_size = 256M
innodb_lock_wait_timeout=500
innodb_buffer_pool_size=3000M
innodb_flush_method=O_DIRECT

Notably, innodb_lock_wait_timeout made the difference.
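
For reference, a quick way to verify and apply that key setting without a full restart (innodb_lock_wait_timeout should be a dynamic variable on MySQL 5.5; new connections pick up the GLOBAL value, the value 500 is just the one from the list above):

SHOW GLOBAL VARIABLES LIKE 'innodb_lock_wait_timeout';
SET GLOBAL innodb_lock_wait_timeout = 500;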

Comment by Chris Reeves [ 2013 Feb 05 ]

We are also seeing this issue with Zabbix 2.0.2 and PostgreSQL 8.4 (both running on Debian 6).

Recently we've started to see frequent deadlocks, which are making the system largely unusable - often these deadlocks involve item discovery, as noted by a previous commenter.

Here's a snippet from zabbix-server.log:

  1636:20130205:085708.902 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  deadlock detected
DETAIL:  Process 1659 waits for ShareLock on transaction 26507712; blocked by process 1729.
Process 1729 waits for ShareLock on transaction 26507984; blocked by process 1659.
HINT:  See server log for query details.
 [update items set name='Used disk space on $1',key_='vfs.fs.size[C:\\,used]',type=0,value_type=3,data_type=0,delay=60,delay_flex='',history=7,trends=365,trapper_hosts='',units='B',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=null,params='',ipmi_sensor='',snmp_community='',snmp_oid='',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='',interfaceid=73,flags=4 where itemid=28926;
update item_discovery set key_='vfs.fs.size[{#FSNAME},used]',lastcheck=1360054627,ts_delete=0 where itemid=28926 and parent_itemid=28632;
update items set name='Used disk space on $1',key_='vfs.fs.size[E:\\,used]',type=0,value_type=3,data_type=0,delay=60,delay_flex='',history=7,trends=365,trapper_hosts='',units='B',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=null,params='',ipmi_sensor='',snmp_community='',snmp_oid='',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='',interfaceid=73,flags=4 where itemid=29422;
update item_discovery set key_='vfs.fs.size[{#FSNAME},used]',lastcheck=1360054627,ts_delete=0 where itemid=29422 and parent_itemid=28632;
]

And the corresponding snippet from postgresql-8.4-main.log:

2013-02-05 08:57:08 GMT ERROR:  deadlock detected
2013-02-05 08:57:08 GMT DETAIL:  Process 1659 waits for ShareLock on transaction 26507712; blocked by process 1729.
        Process 1729 waits for ShareLock on transaction 26507984; blocked by process 1659.
        Process 1659: update items set name='Used disk space on $1',key_='vfs.fs.size[C:\\,used]',type=0,value_type=3,data_type=0,delay=60,delay_flex='',history=7,trends=365,trapper_hosts='',units='B',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=null,params='',ipmi_sensor='',snmp_community='',snmp_oid='',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='',interfaceid=73,flags=4 where itemid=28926;
        update item_discovery set key_='vfs.fs.size[{#FSNAME},used]',lastcheck=1360054627,ts_delete=0 where itemid=28926 and parent_itemid=28632;
        update items set name='Used disk space on $1',key_='vfs.fs.size[E:\\,used]',type=0,value_type=3,data_type=0,delay=60,delay_flex='',history=7,trends=365,trapper_hosts='',units='B',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=null,params='',ipmi_sensor='',snmp_community='',snmp_oid='',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpa
        Process 1729: update items set lastclock=1360054567,lastns=441864580,prevvalue=lastvalue,lastvalue='1' where itemid=23287;
        update items set lastclock=1360054571,lastns=326074725,prevvalue=lastvalue,lastvalue='1' where itemid=23291;
        update items set lastclock=1360054572,lastns=854848589,prevvalue=lastvalue,lastvalue='256' where itemid=23292;
        update items set lastclock=1360054574,lastns=459605438,prevorgvalue='506003581',prevvalue=lastvalue,lastvalue='2639' where itemid=23294;
        update items set lastclock=1360054575,lastns=339624171,prevvalue=lastvalue,lastvalue='1.151250' where itemid=23295;
        update items set lastclock=1360054576,lastns=973642146,prevvalue=lastvalue,lastvalue='1.095000' where itemid=23296;
        update items set lastclock=1360054577,lastns=61672013,prevvalue=lastvalue,lastvalue='1.112500' where itemid=23297;
        update items set lastclock=1360054578,lastns=36596477,prevorgvalue='1104252186',prevvalue=lastvalue,lastvalue='5508' where itemid=23298;
        update items set lastclock=1360054563,lastns=387421557,prevvalue=lastvalu
2013-02-05 08:57:08 GMT HINT:  See server log for query details.
2013-02-05 08:57:08 GMT STATEMENT:  update items set name='Used disk space on $1',key_='vfs.fs.size[C:\\,used]',type=0,value_type=3,data_type=0,delay=60,delay_flex='',history=7,trends=365,trapper_hosts='',units='B',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=null,params='',ipmi_sensor='',snmp_community='',snmp_oid='',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='',interfaceid=73,flags=4 where itemid=28926;
        update item_discovery set key_='vfs.fs.size[{#FSNAME},used]',lastcheck=1360054627,ts_delete=0 where itemid=28926 and parent_itemid=28632;
        update items set name='Used disk space on $1',key_='vfs.fs.size[E:\\,used]',type=0,value_type=3,data_type=0,delay=60,delay_flex='',history=7,trends=365,trapper_hosts='',units='B',multiplier=0,delta=0,formula='1',logtimefmt='',valuemapid=null,params='',ipmi_sensor='',snmp_community='',snmp_oid='',port='',snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',authtype=0,username='',password='',publickey='',privatekey='',description='',interfaceid=73,flags=4 where itemid=29422;
        update item_discovery set key_='vfs.fs.size[{#FSNAME},used]',lastcheck=1360054627,ts_delete=0 where itemid=29422 and parent_itemid=28632;

We can get an idea of how frequently this is happening by grepping the most recent postgres logfile for "deadlock":

2013-02-03 06:40:58 GMT ERROR:  deadlock detected
2013-02-03 07:34:48 GMT ERROR:  deadlock detected
2013-02-03 08:04:43 GMT ERROR:  deadlock detected
2013-02-03 08:34:59 GMT ERROR:  deadlock detected
2013-02-03 08:38:20 GMT ERROR:  deadlock detected
2013-02-03 09:34:04 GMT ERROR:  deadlock detected
2013-02-03 09:34:04 GMT ERROR:  deadlock detected
2013-02-03 11:01:33 GMT ERROR:  deadlock detected
2013-02-03 11:34:22 GMT ERROR:  deadlock detected
2013-02-03 12:47:05 GMT ERROR:  deadlock detected
2013-02-03 13:32:29 GMT ERROR:  deadlock detected
2013-02-03 14:40:10 GMT ERROR:  deadlock detected
2013-02-03 15:32:47 GMT ERROR:  deadlock detected
2013-02-03 16:32:09 GMT ERROR:  deadlock detected
2013-02-03 16:38:19 GMT ERROR:  deadlock detected
2013-02-03 22:36:28 GMT ERROR:  deadlock detected
2013-02-03 22:36:30 GMT ERROR:  deadlock detected
2013-02-03 23:39:20 GMT ERROR:  deadlock detected
2013-02-04 00:37:03 GMT ERROR:  deadlock detected
2013-02-04 00:56:51 GMT ERROR:  deadlock detected
2013-02-04 02:57:55 GMT ERROR:  deadlock detected
2013-02-04 03:34:08 GMT ERROR:  deadlock detected
2013-02-04 04:38:47 GMT ERROR:  deadlock detected
2013-02-04 05:38:57 GMT ERROR:  deadlock detected
2013-02-04 05:38:58 GMT ERROR:  deadlock detected
2013-02-04 09:38:02 GMT ERROR:  deadlock detected
2013-02-04 12:00:01 GMT ERROR:  deadlock detected
2013-02-04 12:33:59 GMT ERROR:  deadlock detected
2013-02-04 13:32:20 GMT ERROR:  deadlock detected
2013-02-04 13:37:33 GMT ERROR:  deadlock detected
2013-02-04 13:50:17 GMT ERROR:  deadlock detected
2013-02-04 14:39:38 GMT ERROR:  deadlock detected
2013-02-04 15:39:18 GMT ERROR:  deadlock detected
2013-02-04 16:33:42 GMT ERROR:  deadlock detected
2013-02-04 17:37:17 GMT ERROR:  deadlock detected
2013-02-04 19:37:54 GMT ERROR:  deadlock detected
2013-02-04 19:50:21 GMT ERROR:  deadlock detected
2013-02-04 20:37:14 GMT ERROR:  deadlock detected
2013-02-04 21:39:23 GMT ERROR:  deadlock detected
2013-02-04 21:41:38 GMT ERROR:  deadlock detected
2013-02-05 00:36:33 GMT ERROR:  deadlock detected
2013-02-05 05:58:43 GMT ERROR:  deadlock detected
2013-02-05 08:57:08 GMT ERROR:  deadlock detected
Comment by Anton Aksola [ 2013 Feb 14 ]

I'm also seeing this with MySQL 5.5.27 and Zabbix 2.0.4

I'll try Guy Van Sanden's workaround.

Comment by Anton Aksola [ 2013 Feb 15 ]

Here is some data from our environment:

Innotop
=======

When   Load  Cxns     QPS    Slow   Se/In/Up/De%  QCacheHit  KCacheHit  BpsIn    BpsOut 
Now    0.00  161      1.60k  0      69/ 0/29/ 0      22.45%    100.00%  281.01k  358.13k
Total  0.00    1.46k  1.87k  7.53k  41/ 0/58/ 0      90.84%    100.00%  337.20k    1.30M

Cmd    ID      State               User    Host           DB      Time   Query                                                    
Query   14283  Updating            zabbix  localhost      zabbix  00:49  update items set lastclock=1360917672,lastns=336481966,pr
Query   14280  Updating            zabbix  localhost      zabbix  00:46  update items set lastclock=1360917671,lastns=106638034,pr
Query   14282  Updating            zabbix  localhost      zabbix  00:46  update items set lastclock=1360917673,lastns=539945615,pr
Query   14286  Updating            zabbix  localhost      zabbix  00:46  update items set lastclock=1360917668,lastns=654048890,pr
Query   14287  Updating            zabbix  localhost      zabbix  00:46  update items set lastclock=1360917675,lastns=491710979,pr
Query   14288  Updating            zabbix  localhost      zabbix  00:46  update items set lastclock=1360917675,lastns=145399825,pr
Query   14289  Updating            zabbix  localhost      zabbix  00:46  update items set lastclock=1360917669,lastns=784594499,pr
Query   14281  Updating            zabbix  localhost      zabbix  00:44  update items set lastclock=1360917677,lastns=386003654,pr
Query   14274  Updating            zabbix  localhost      zabbix  00:43  update items set name='Administrative state uplink1, IP i
Query   14081  Writing to net      zabbix  localhost      zabbix  00:00  select i.itemid,i.hostid,h.proxy_hostid,i.type,i.data_typ
Query   14095  Copying to tmp tab  zabbix  localhost      zabbix  00:00  select distinct t.triggerid,t.expression from triggers t,
Query   14102  Writing to net      zabbix  localhost      zabbix  00:00  select distinct i.itemid from items i,item_discovery id w
Query   14215  Copying to tmp tab  zabbix  localhost      zabbix  00:00  select distinct i.itemid from items i,item_discovery id w
Query   14226  Copying to tmp tab  zabbix  localhost      zabbix  00:00  select distinct t.triggerid,t.expression from triggers t,
Query   14277  Copying to tmp tab  zabbix  localhost      zabbix  00:00  select distinct t.triggerid,t.expression from triggers t,
Press any key to continue


INNODB ENGINE STATUS
====================
...
---TRANSACTION 977AC509, ACTIVE 102 sec starting index read
mysql tables in use 3, locked 0
274 lock struct(s), heap size 31160, 22239 row lock(s), undo log entries 14816
MySQL thread id 14223, OS thread handle 0x7ffdf9c91700, query id 116167938 localhost zabbix Copying to tmp table
select distinct t.triggerid,t.expression from triggers t,functions f,items i where t.triggerid=f.triggerid and f.itemid=i.itemid and i.hostid=10126 and t.description='{HOST.NAME}: Interface OperState is DOWN while AdminState is up GigabitEthernet2/28 ({ITEM.VALUE})' and t.triggerid<>25672
Trx read view will not see trx with id >= 977AC50A, sees < 977ABC48
---TRANSACTION 977AC31F, ACTIVE 118 sec starting index read
mysql tables in use 3, locked 0
286 lock struct(s), heap size 47544, 22003 row lock(s), undo log entries 14656
MySQL thread id 14102, OS thread handle 0x7ffdf8d96700, query id 116167563 localhost zabbix Copying to tmp table
select distinct t.triggerid,t.expression from triggers t,functions f,items i where t.triggerid=f.triggerid and f.itemid=i.itemid and i.hostid=10124 and t.description='{HOST.NAME}: Interface OperState is DOWN while AdminState is up GigabitEthernet4/5 ({ITEM.VALUE})' and t.triggerid<>24456
Trx read view will not see trx with id >= 977AC320, sees < 977ABC48
---TRANSACTION 977AC182, ACTIVE 131 sec starting index read
mysql tables in use 3, locked 0
269 lock struct(s), heap size 31160, 22288 row lock(s), undo log entries 14848
MySQL thread id 14215, OS thread handle 0x7ffbbdd9d700, query id 116167845 localhost zabbix Copying to tmp table
select distinct t.triggerid,t.expression from triggers t,functions f,items i where t.triggerid=f.triggerid and f.itemid=i.itemid and i.hostid=10125 and t.description='{HOST.NAME}: Interface OperState is DOWN while AdminState is up Vlan206 ({ITEM.VALUE})' and t.triggerid<>120022
Trx read view will not see trx with id >= 977AC183, sees < 977ABC48
...

It seems that these transactions (SELECTs) are locking a considerable number of rows in the items table, thus blocking UPDATE queries from both the poller and parallel LLD processes. At least this has a very undesired effect with LLD: items are being marked for deletion.
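
For anyone who wants to confirm which transaction is blocking which, here is a diagnostic sketch (not part of Zabbix) using the InnoDB information_schema tables available in MySQL 5.5:

select r.trx_mysql_thread_id as waiting_thread,
       r.trx_query           as waiting_query,
       b.trx_mysql_thread_id as blocking_thread,
       b.trx_query           as blocking_query
  from information_schema.innodb_lock_waits w
  join information_schema.innodb_trx r on r.trx_id = w.requesting_trx_id
  join information_schema.innodb_trx b on b.trx_id = w.blocking_trx_id;
-- Each row shows a waiting statement (e.g. an items UPDATE from a poller)
-- together with the transaction currently holding the lock it needs.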

Comment by Anton Aksola [ 2013 Feb 21 ]

After disabling the LLD batch operation as a single transaction, the problem went away. So I basically commented out DBbegin() and DBcommit() in lld.c. This also gives better performance, at least on MySQL, as the history syncer process is no longer blocked by LLD holding locks on the items table. So basically we are committing every update operation right after the statement is executed (autocommit mode).

I will see if this has any undesired side effects. One would think there are none, as LLD is the only process that updates item parameters.

After 24 hours I haven't seen a single "Lock wait timeout exceeded" message in zabbix_server.log and no side effects so far. So at least in my case, LLD was the cause of the DB deadlocks.

Edit: my initial LLD problem (actually discovered items being marked for deletion) vanished as well.

Comment by Alf Solli [ 2013 Feb 22 ]

Care to give some details on which lines you commented out? All instances of DBbegin() and DBcommit()?

I adjusted the discovery frequency to the absolute maximum (864000 or something) to make it less of a problem. But now, with ~1000 devices, the slow queries are getting more and more frequent.

Still no negative impacts? I'm curious enough to try this on our production servers, or at least replicate the whole thing.

Anyone else had success with this?

- Alf
Comment by Anton Aksola [ 2013 Feb 25 ]

Alf,

diff -uNrp zabbix-2.0.4/src/libs/zbxdbhigh/lld.c zabbix-2.0.4nblpatches/src/libs/zbxdbhigh/lld.c 
--- zabbix-2.0.4/src/libs/zbxdbhigh/lld.c       2012-12-08 13:09:15.000000000 +0200
+++ zabbix-2.0.4nblpatches/src/libs/zbxdbhigh/lld.c     2013-02-20 10:17:40.000000000 +0200
@@ -2236,7 +2236,7 @@ void      DBlld_process_discovery_rule(zbx_ui
        if (0 == hostid)
                goto clean;
 
-       DBbegin();
+       //DBbegin();
 
        error = zbx_strdup(error, "");
 
@@ -2314,7 +2314,7 @@ error:
 
        DBexecute("%s", sql);
 
-       DBcommit();
+       //DBcommit();
 clean:
        zbx_free(error);
        zbx_free(db_error);

No negative effects so far but I don't think this will resolve the issue completely. According to my logs, deadlocks are still happening but much more rarely.

Comment by Dimitri Bellini [ 2013 Mar 07 ]

I have found the same issue on my installation:
RHEL 5.7 X64 - MySQL 5.5.28 - 64GB RAM - DB on SAN Storage (Dual Path) - DB using Partitioning Table for History

As most have noticed, this problem seems related to DB row locking during item updates.

Zabbix_server.log


5590:20130307:090652.499 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update items set lastclock=1362643489,lastns=849268000,prevvalue=lastvalue,lastvalue='0.001129' where itemid=23269;
update items set lastclock=1362643490,lastns=834822000,prevvalue=lastvalue,lastvalue='0.069760' where itemid=23270;
update items set lastclock=1362643489,lastns=847835000,prevorgvalue='0',prevvalue=lastvalue,lastvalue='0' where itemid=23449;
update items set lastclock=1362643490,lastns=410521000,prevorgvalue='0',prevvalue=lastvalue,lastvalue='0' where itemid=23450;
update items set lastclock=1362643489,lastns=849696000,prevvalue=lastvalue,lastvalue='486273024' where itemid=23509;
update items set lastclock=1362643490,lastns=481507000,prevvalue=lastvalue,lastvalue='enabled(1)' where itemid=62209;
update items set lastclock=1362643490,lastns=812825000,prevvalue=lastvalue,lastvalue='enabled(1)' where itemid=62210;


The strange thing is that in my case this happens nearly every hour, and the history syncer goes to 100% usage for a few minutes.
I have tried tweaking every parameter in the MySQL configuration file, but without result.
If anyone has a suggestion, please help.
Regards

Comment by Dimitri Bellini [ 2013 Mar 08 ]

Very strange: after I changed these two parameters in the MySQL configuration file (my.cnf), I have not seen the "lock wait" error on update.


max_allowed_packet = 256M
innodb_lock_wait_timeout = 500


But I still see a history syncer peak every hour.

Comment by Dimitri Bellini [ 2013 Mar 11 ]

The "Lock wait timeout on update" is not fixed... After some days they came back on my zabbix_server.log.

Comment by Alexander Vladishev [ 2013 Mar 22 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-5225

Comment by dimir [ 2013 Mar 26 ]

Successfully tested. Please review my small changes in r34653.

Comment by Alexander Vladishev [ 2013 Mar 27 ]

Fixed in versions pre-2.0.6 r34655 and pre-2.1.0 (trunk) r34664.

Comment by richlv [ 2013 Mar 27 ]

(1) let's mention this in whatsnew (one commit message said "each prototype will be processed in own transaction" - anything else?)

sasha RESOLVED

https://www.zabbix.com/documentation/2.0/manual/introduction/whatsnew206?&#daemon_improvements

<richlv> ooooh, that was a really great description. i reordered it and added some notes on the test runs from the comment here - please review

sasha Thanks! CLOSED

Comment by dimir [ 2013 Mar 28 ]

(2) [S] Split one big transaction for the whole discovery rule into a transaction for each item/trigger/graph prototype.

RESOLVED by me and sasha in r34667:r34694.

CLOSED

Comment by Alexander Vladishev [ 2013 Mar 28 ]

Processing of each prototype in its own transaction is available in versions pre-2.0.6 r34696 and pre-2.1.0 (trunk) r34697.

Comment by Alexander Vladishev [ 2013 Mar 28 ]

What's new:

  1. Each item, trigger or graph prototype is processed in its own transaction. No more deadlocks! (See the sketch below.)
  2. Better processing of item prototypes.
    • Already discovered items will be updated only if something changed.
    • Only changed fields will be updated.
    • Decreased number and size of SQL queries.
    • Better validation of fields after resolving LLD macros:
      • validity of UTF-8 sequences
      • validity of field lengths
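
In transaction terms, the first point boils down to something like the following simplified sketch (hypothetical item ids, not the SQL the server actually generates):

-- 2.0.5: one transaction spans the whole discovery rule, so row locks on
-- items/item_discovery are held until every prototype has been processed.
begin;
update items set key_='ifAdminStatus[eth0]' where itemid=228104;
update items set key_='ifOperStatus[eth0]' where itemid=228110;
update item_discovery set lastcheck=1340497450 where parent_itemid=227822;
commit;

-- 2.0.6: one transaction per item/trigger/graph prototype, so locks are
-- released between prototypes and pollers/history syncers can get in between.
begin;
update items set key_='ifAdminStatus[eth0]' where itemid=228104;
update item_discovery set lastcheck=1340497450 where itemid=228104 and parent_itemid=227822;
commit;
begin;
update items set key_='ifOperStatus[eth0]' where itemid=228110;
update item_discovery set lastcheck=1340497450 where itemid=228110 and parent_itemid=227822;
commit;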

Tests:

One discovery rule with 8 item prototypes. Received 256 new records.

2.0.5

Action                                             | Time, sec | Number of SQL statements | Total size of SQL statements, bytes
first discovery (creating of 8 * 256 = 2048 items) | 1.272079  | 8290                     | 1668647
next discovery (updating)                          | 3.657499  | 8204                     | 2445575
next discovery (changed key in item prototype)     | 10.474360 | 10250                    | 2703623

2.0.5 with the fix

Action                                             | Time, sec | Number of SQL statements | Total size of SQL statements, bytes
first discovery (creating of 8 * 256 = 2048 items) | 0.353674  | 129                      | 888988
next discovery (updating)                          | 0.112492  | 33                       | 6807
next discovery (changed key in item prototype)     | 0.731188  | 49                       | 350631
Comment by Alexander Vladishev [ 2013 Apr 25 ]

(3) Broken support of SNMPv3 SHA/AES authentication in trunk when creating items based on item prototypes.

sasha Fixed in r35273. CLOSED
