[ZBX-3771] Lock wait Timeout exceeded Created: 2011 Apr 29  Updated: 2017 May 30  Resolved: 2011 Nov 23

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: None
Affects Version/s: 1.9.4 (alpha)
Fix Version/s: 2.0.0

Type: Incident report Priority: Blocker
Reporter: Evan James Anderson Assignee: Unassigned
Resolution: Fixed Votes: 2
Labels: deadlock, lld
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Fedora 13, MySQL 5.1.52, rev 19183

 Description   

I'm no expert just so you know, but I believe this to be related to low level discovery. In our environment we have switch gear with 240+ ports and I believe that discovery is struggling for some reason. I got this from my zabbix server log:

4021:20110429:075938.614 [Z3005] Query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update ids set nextid=nextid+1 where nodeid=0 and table_name='items' and field_name='itemid']
4002:20110429:075955.631 [Z3005] Query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update ids set nextid=nextid+1 where nodeid=0 and table_name='items' and field_name='itemid']
4007:20110429:080011.648 [Z3005] Query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update ids set nextid=nextid+1 where nodeid=0 and table_name='items' and field_name='itemid']
4012:20110429:080027.666 [Z3005] Query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update ids set nextid=nextid+1 where nodeid=0 and table_name='items' and field_name='itemid']
4011:20110429:084341.457 [Z3005] Query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update ids set nextid=nextid+1 where nodeid=0 and table_name='functions' and field_name='functionid']

I've tweaked mysql with:
innodb_file_per_table
innodb_lock_wait_timeout=10000 (I think that's the max, from poking around on the net)
But in the long run it's still occurring.

I've got 6 items being discovered per interface and 7 triggers on top of those items...
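
For readers trying to picture the contention: every failing statement in the log above increments the same single row of the ids table. The sketch below is only an illustration of that access pattern, not Zabbix code and not necessarily what the eventual fix changed. It uses Python's sqlite3 as a stand-in for MySQL, so it does not reproduce InnoDB lock waits; it only counts how many times the hot row is written. The helper names (make_ids_table, reserve_one_by_one, reserve_block, current_nextid) and the starting nextid of 1000 are hypothetical; the 240 ports and 6 items per port come from this report.

```python
import sqlite3

WHERE = "where nodeid=0 and table_name='items' and field_name='itemid'"

def make_ids_table(conn):
    # Hypothetical minimal copy of the ids table named in the logged query.
    conn.execute(
        "create table ids "
        "(nodeid integer, table_name text, field_name text, nextid integer)"
    )
    conn.execute("insert into ids values (0, 'items', 'itemid', 1000)")
    conn.commit()

def reserve_one_by_one(conn, count):
    """Mimic the logged pattern: one UPDATE of the same row per new ID."""
    updates = 0
    for _ in range(count):
        conn.execute("update ids set nextid=nextid+1 " + WHERE)
        updates += 1
    conn.commit()
    return updates

def reserve_block(conn, count):
    """Assumed alternative: reserve the whole range with a single UPDATE."""
    conn.execute("update ids set nextid=nextid+? " + WHERE, (count,))
    conn.commit()
    return 1

def current_nextid(conn):
    return conn.execute("select nextid from ids " + WHERE).fetchone()[0]

conn = sqlite3.connect(":memory:")
make_ids_table(conn)

ports, items_per_port = 240, 6   # figures from this report
needed = ports * items_per_port  # 1440 item IDs for one switch

print("row updates, one by one:", reserve_one_by_one(conn, needed))
print("row updates, one block: ", reserve_block(conn, needed))
print("nextid now:", current_nextid(conn))
```

With one update per ID, every concurrent process that creates items queues on the same row for each item, which is consistent with the lock wait timeouts above once several discoveries overlap.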



 Comments   
Comment by Aleksandrs Saveljevs [ 2011 May 02 ]

Seems similar to ZBX-2494 and related issues.

Comment by Evan James Anderson [ 2011 Aug 30 ]

I've been busy with other things here, but I remember that I tracked this down to the update interval that I had configured for low level discovery. I think the interval I set was too low, and the previous discovery hadn't finished before a new one began.

Comment by Evan James Anderson [ 2011 Aug 30 ]

I had set it to 60 seconds (to get discovery started), but then forgot to change it back. I haven't run into it since.

Comment by Evan James Anderson [ 2011 Aug 30 ]

To note: I didn't know there was a separate discovery update interval parameter and it may be useful to note that this parameter shouldn't be set too low.

Comment by Evan James Anderson [ 2011 Aug 30 ]

I'm not sure what caused the deadlock, but I suspect discovery was running while the items that were discovered previously were being updated on their individual update intervals.

Comment by Evan James Anderson [ 2011 Aug 30 ]

I should also note that I'm now running pre-zabbix-1.9.6.x-21167.tar.gz, so it might have been fixed previously. It could also be that I pay more attention to the low level discovery update interval.

Comment by Evan James Anderson [ 2011 Aug 31 ]

What would happen if the update interval on discovery was set so low that discovery would start before a previous run of discovery had ended?

Comment by richlv [ 2011 Sep 05 ]

just to clarify, you don't have "normal" network discovery, just lld - and the lld interval is the one you set to 60 seconds initially ?

Comment by richlv [ 2011 Sep 09 ]

> What would happen if the update interval on discovery was set so low that discovery would start before a previous run of discovery had ended?

low level discovery or network discovery ?
network discovery would start "that interval" after the previous one.
lld... after some clarification: it would run as normal, but all the incoming data would be processed sequentially, and values from the same lld rule would never be processed at the same time (unless there's a Terrible Bug somewhere)
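
To make the guarantee described above concrete, here is a minimal sketch of "values from the same rule are never processed at the same time". It is an illustration only: the Zabbix server is written in C and this is not its implementation. The class RuleSerializer, the handler function, and the rule names are all hypothetical.

```python
import threading
import time
from collections import defaultdict

class RuleSerializer:
    """Allow at most one in-flight value per rule; rules run independently."""

    def __init__(self):
        self._guard = threading.Lock()            # protects the lock table
        self._locks = defaultdict(threading.Lock)  # one lock per rule id

    def process(self, rule_id, value, handler):
        with self._guard:
            lock = self._locks[rule_id]
        with lock:  # a second value for the same rule waits here
            return handler(rule_id, value)

# Bookkeeping to observe how many values per rule are processed at once.
active = defaultdict(int)
max_active = defaultdict(int)
stats_lock = threading.Lock()

def handler(rule_id, value):
    with stats_lock:
        active[rule_id] += 1
        max_active[rule_id] = max(max_active[rule_id], active[rule_id])
    time.sleep(0.01)  # stand-in for the database work of one discovery value
    with stats_lock:
        active[rule_id] -= 1

serializer = RuleSerializer()
threads = [
    threading.Thread(target=serializer.process, args=(rule, n, handler))
    for rule in ("rule-a", "rule-b")
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Highest number of simultaneously processed values seen for each rule.
print(dict(max_active))
```

Under this scheme a too-short interval only makes values queue up behind each other for a given rule; it does not make two runs of the same rule overlap.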

Comment by Evan James Anderson [ 2011 Sep 12 ]

>just to clarify, you don't have "normal" network discovery, just lld - and the lld interval is the one you set to 60 seconds initially ?

If my memory serves me correctly, this would have been "normal" network discovery, and during that network discovery an lld template was being linked to hosts via a discovery action.

From ZBX-3770:

>if you have network discovery that links to a template with lld, sometimes items get assigned to the wrong host ? i assume this happens only if network discovery finds more than one host ?

Q1: yes. I'm not exactly certain whether the items get assigned incorrectly with any predictability, though; I'd have to test it. Q2: yes. A discovery action that links a template with lld works flawlessly when only one host is discovered.

With that being said, perhaps having items linked via lld to the wrong host could cause the Lock wait timeout...

Comment by Alexander Vladishev [ 2011 Oct 26 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-3771

Comment by Aleksandrs Saveljevs [ 2011 Oct 27 ]

(1) Please review my changes in r22746.

<sasha> CLOSED

<asaveljevs> In some queries the item_discovery table is aliased as "id", in others just "d".

<sasha> RESOLVED

<asaveljevs> CLOSED

Comment by Aleksandrs Saveljevs [ 2011 Oct 27 ]

(2) Currently, when creating real items from item prototypes, SNMP community and other SNMP parameters are taken from discovery rules, even though it is possible to edit them for prototypes. SNMP parameters should be taken from item prototypes.

<sasha> RESOLVED

<asaveljevs> CLOSED

Comment by Aleksandrs Saveljevs [ 2011 Oct 27 ]

(3) Let's move the code for low-level discovery into a different module: src/libs/zbxdbhigh/proxy.c is not the best place for it.

<sasha> RESOLVED

<asaveljevs> Something tells me that leaving the file lld.c out of the repository was not intended.

<sasha> sorry, already added

<asaveljevs> CLOSED. I have only removed a trailing empty line from that file.

Comment by Alexander Vladishev [ 2011 Nov 23 ]

Available in version pre1.9.8, r23473.
