[ZBX-2494] deadlock between server and frontend Created: 2010 Jun 01  Updated: 2024 Apr 10  Resolved: 2019 Apr 09

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: API (A)
Affects Version/s: 1.8.2, 3.0.25, 4.0.4, 4.2.0alpha3
Fix Version/s: 3.0.27rc1, 4.0.7rc1, 4.2.1rc1, 4.4.0alpha1, 4.4 (plan)

Type: Problem report Priority: Critical
Reporter: Aleksandrs Saveljevs Assignee: Alexander Shubin
Resolution: Fixed Votes: 32
Labels: deadlock
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Zip Archive PS-480-debug-frontend.zip    
Issue Links:
Causes
Duplicate
is duplicated by ZBXNEXT-1476 Using Auto Increment as EventID, Audi... Closed
is duplicated by ZBX-6532 postgresql deadlock on items table wh... Open
is duplicated by ZBX-3785 When performing mass update operation... Closed
is duplicated by ZBX-4501 Zabbix server lockup when the same us... Closed
is duplicated by ZBX-4923 PostgreSQL deadlock while trying to d... Closed
is duplicated by ZBX-6118 Very low performance changing templat... Closed
is duplicated by ZBX-5594 mysql deadlock when trying to add man... Closed
is duplicated by ZBX-5787 Postgresql 9.1 ShareLocks when updati... Closed
is duplicated by ZBX-6485 Deadlocks when doing updates in frontend Closed
is duplicated by ZBX-6754 Deadlock when using "Unlink and clear... Closed
Team: Team D
Team: Team D
Sprint: Sprint 49 (Feb 2019), Sprint 50 (Mar 2019), Sprint 51 (Apr 2019)
Story Points: 0.5

 Description   

Server and frontend sometimes deadlock.

Relevant forum threads:



 Comments   
Comment by Aleksandrs Saveljevs [ 2010 Jun 01 ]

Relevant issues: ZBX-1401 and ZBX-1402 (server deadlocks by itself). Also ZBX-3680.

Comment by Pritpal [ 2010 Jun 28 ]

Hey Guys,

Is there a timeline or some sort of idea when this issue may get fixed?

Comment by Aleksandrs Saveljevs [ 2010 Jun 29 ]

My best guess would be 2.0, but this estimate might or might not change later.

Comment by richlv [ 2010 Aug 19 ]

ZBX-2897 also seems to have some issues with heavy server & frontend usage, although it might not be a dealock as such

Comment by Aleksandrs Saveljevs [ 2010 Sep 22 ]

No idea how to reproduce, but the following has just happened in my development environment:

15956:20100922:133443.239 [Z3005] Query failed: [1213] Deadlock found when trying to get lock; try restarting transaction [update triggers set value=1,lastchange=1285151681,error='' where triggerid=1401400000013838]
15958:20100922:133443.267 [Z3005] Query failed: [1213] Deadlock found when trying to get lock; try restarting transaction [update triggers set value=0,lastchange=1285151682,error='' where triggerid=1401400000013839]

Coincidentally, it happened exactly when a host was deleted through the GUI.

Comment by Aleksandrs Saveljevs [ 2010 Oct 05 ]

This is just to say that we now have a reliable way of reproducing deadlocks like those shown in the previous post. In these cases, server deadlocks by itself, without any cooperation from the GUI. So the problem also falls under ZBX-1402.

Comment by Alexander Vladishev [ 2010 Oct 06 ]

... and this in my environment:

26005 root dm.zabbix.com:10589 sasha18 Query 671 Locked update items set status=3,lastclock=1286345864,error='Type of received value [err] is not suitable f
26012 root dm.zabbix.com:10597 sasha18 Query 569 Locked update items set status=3,lastclock=1286345864,error='Type of received value [err] is not suitable f
26016 root dm.zabbix.com:10601 sasha18 Query 823 Updating update items set status=3,lastclock=1286345866,error='Not supported by Zabbix Agent' where itemid=28
26017 root dm.zabbix.com:10603 sasha18 Query 212 Locked update items set status=3,lastclock=1286345865,error='Type of received value [err] is not suitable f
26033 root dm.zabbix.com:10622 sasha18 Query 518 Locked update items set lastclock=1286345864,prevvalue=lastvalue,prevorgvalue=NULL,lastvalue='14.137645' wh
26034 root dm.zabbix.com:10623 sasha18 Query 773 Locked update items set lastclock=1286345850,prevvalue=lastvalue,prevorgvalue=NULL,lastvalue='0',status=0,e
26036 root dm.zabbix.com:10642 sasha18 Query 671 Locked update items set status=3,lastclock=1286345864,error='Type of received value [err] is not suitable f
26042 root dm.zabbix.com:10653 sasha18 Query 569 Locked update items set status=3,lastclock=1286345864,error='Type of received value [err] is not suitable f
Comment by Nick Tzaperas [ 2011 Mar 18 ]

Would you mind giving some info on how you are able to reproduce the deadlock? This issue is affected me on a weekly basis and any information as to what is going on would be appreciated.

Comment by Aleksandrs Saveljevs [ 2011 Mar 18 ]

Nick, you are right, we should have documented that. However, it is not something you would do in real life.

As far as I remember, you start by creating 10,000 hosts. You then create item "random.number" (updated each second) and trigger "

{Template_Linux:random.number.last(0)}

=1" on all of them. The "random.number" item is a user parameter that either returns 0 or 1, or becomes unsupported:

UserParameter=random.number[*],case $$(($$RANDOM%3)) in 0) echo 0 ;; 1) echo 1 ;; 2) echo "" ;; esac

This setup produces a lot of history and events, and causes SQL queries on the server to deadlock fairly quickly.

Comment by richlv [ 2012 Feb 14 ]

the server deadlocking itself might be a different issue.

most likely a server <-> frontend deadlock was just observed while updating a templated item (while server was running, ~2.4k nvps) :

ociexecute() [<a href='function.ociexecute'>function.ociexecute</a>]: ORA-00060: deadlock detected while waiting for resource [include/db.inc.php:557]

SQL error [ORA-00060: deadlock detected while waiting for resource] in [UPDATE items SET description='Version of zabbix_agent(d) running',key_='agent.version',hostid=100100000019092,delay=1800,history=7,type=7,snmp_community='',snmp_oid='',value_type=1,data_type=0,trapper_hosts='',snmp_port=161,units='',multiplier=0,delta=0,snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',formula='0',trends=0,logtimefmt='',valuemapid=0,delay_flex='',params='',ipmi_sensor='',templateid=100100000010774,authtype=0,username='',password='',publickey='',privatekey='' WHERE itemid=100100000248959]

Comment by Oleksii Zagorskyi [ 2012 Feb 14 ]

ZBXNEXT-822 could be considered (could help?)

Comment by Randall Svancara [ 2012 Nov 02 ]

I looked at this patch. I believe it would result in just moving the problem to another table at least for the backend. For the front end, it may help where the items table could be touched and reduce the risk of a lock.

The fact of the matter is the items table is pounded by the zabbix backend. It is very obvious that multiple concurrent updates to the same table by several backend threads in conjunction with changes from the front end would cause some sort of locking issue.

Comment by Eugene [ 2013 Feb 20 ]

When the problem will be solved? Because of this problem, it is impossible to make changes to the database without stopping the Zabbix server.

Comment by Jasper Spit [ 2013 Apr 13 ]

This affects 2.0.4, 2.0.5, 2.0.6rc1 as well.

Please fix soon, makes Zabbix pretty much unusable for me at the moment. Mass updates of a few items or updating a graph causes these deadlocks for me, the update I do affects +/- 200 hosts with about 80 items, so nothing out of the ordinary.

Comment by richlv [ 2013 Apr 13 ]

a more significant redesign is planned as a part of ZBXNEXT-1689

Comment by Iskra Zabbix [ 2013 Aug 14 ]

This affects 2.0.7 as well. Very annoying.

Comment by Anton Samets [ 2013 Oct 28 ]

+ to this task - right now I can't add new graph or update existent one with errors:

ERROR: Cannot add graph
Error in query [INSERT INTO graphs_items (gitemid,graphid,itemid,drawtype,sortorder,color,yaxisside,calc_fnc,type) VALUES ('558291','200912','427745','2','1','00C800','0','2','0')] [Deadlock found when trying to get lock; try restarting transaction]
SQL statement execution has failed "INSERT INTO graphs_items (gitemid,graphid,itemid,drawtype,sortorder,color,yaxisside,calc_fnc,type) VALUES ('558291','200912','427745','2','1','00C800','0','2','0')".

or:

ERROR: Cannot update graph
Error in query [INSERT INTO graphs_items (gitemid,graphid,itemid,drawtype,sortorder,color,yaxisside,calc_fnc,type) VALUES ('570943','197762','385000','2','1','EE0000','0','2','0')] [Got error -1 from storage engine]
SQL statement execution has failed "INSERT INTO graphs_items (gitemid,graphid,itemid,drawtype,sortorder,color,yaxisside,calc_fnc,type) VALUES ('570943','197762','385000','2','1','EE0000','0','2','0')".

mysql 5.6.10

Sometimes it saves the graph, but this issue is very annoying.
Zabbix 2.0.9

Comment by Rostislav Palivoda [ 2018 Jan 08 ]

Currently supported versions are 2.2, 3.0, 3.4 and 4.0
https://www.zabbix.com/life_cycle_and_release_policy

Comment by Oleksii Zagorskyi [ 2019 Feb 13 ]

Zabbix version 3.4.15
We try to update an item in a template, which linked to 17K hosts.
After 25 minutes of waiting we always get a deadlock error.
DB is postgresql and already tuned by professional DBA.
We tried to do it at night time - it does not help.
We cannot stop zabbix server for that.

Frontend side error - https://pastebin.com/bwYYNRGa
Server side error - https://pastebin.com/qwB3mTwu

One more case - both errors https://pastebin.com/0KbksUbK

What to do? ...

Comment by Vladislavs Sokurenko [ 2019 Feb 13 ]

Could it be due to hostid updates ?

What frontend could possibly do is select for update where itemid in (<sorted itemids>); then do updates, this is regarding zalex_ua comment

But I would also like to see frontend queries that failed (whole transaction) in case it was not sorted by itemid, perhaps it is sorted by hostid first.

Comment by Vladislavs Sokurenko [ 2019 Feb 13 ]

Reproduced on trunk Issue that zalex_ua mentioned

  1. Enable debug mode
  2. Create template with one item
  3. Create 2 hosts
  4. Link tempalate to second host
  5. Link template to first host
  6. Update item history storage period in template

Notice that update is not ordered by itemid:

SQL (0.000154): UPDATE items SET history='32d',itemid='28539',hostid='10263',type='2',name='trap1',key_='trap1',delay='0',master_itemid=NULL,query_fields='',headers='',templateid='28535' WHERE itemid='28539'
items.php:847 → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → CItem->update() → CItemGeneral->inherit() → CItemGeneral->_inherit() → CItem->updateReal() → DB::update() → DBexecute() in include/classes/db/DB.php:573

SQL (0.000218): UPDATE items SET history='32d',itemid='28538',hostid='10264',type='2',name='trap1',key_='trap1',delay='0',master_itemid=NULL,query_fields='',headers='',templateid='28535' WHERE itemid='28538'
items.php:847 → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → CItem->update() → CItemGeneral->inherit() → CItemGeneral->_inherit() → CItem->updateReal() → DB::update() → DBexecute() in include/classes/db/DB.php:573

When server will update lastlogsize it sometimes deadlock as per zalex_ua comment this can happen:

53918:20190213:125722.423 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  deadlock detected
DETAIL:  Process 73842 waits for ShareLock on transaction 314356128; blocked by process 79179.
Process 79179 waits for ShareLock on transaction 314358666; blocked by process 73842.
HINT:  See server log for query details.
CONTEXT:  while updating tuple (155702,41) in relation "items"
[update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059315;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059316;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059319;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059321;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059328;
update items set lastlogsize=1005195,mtime=1550051840 where itemid=3083837;
update items set lastlogsize=1143660,mtime=1550051821 where itemid=3174332;
update items set lastlogsize=1143660,mtime=1550051821 where itemid=3174333;
update items set lastlogsize=1143660,mtime=1550051821 where itemid=3174340;
update items set lastlogsize=18770462,mtime=1550051833 where itemid=5414135;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=9436439;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=10573903;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=10677857;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=10914849;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=10959567;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=10959737;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=10959907;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=10964381;
update items set lastlogsize=949171,mtime=1550051821 where itemid=15122616;
update items set lastlogsize=942205,mtime=1550051821 where itemid=15462490;
update items set lastlogsize=942205,mtime=1550051821 where itemid=15462491;
update items set lastlogsize=942205,mtime=1550051821 where itemid=15462492;
update items set lastlogsize=942205,mtime=1550051821 where itemid=15462494;
update items set lastlogsize=942205,mtime=1550051821 where itemid=15462495;
update items set lastlogsize=942205,mtime=1550051821 where itemid=15462496;
update items set lastlogsize=942205,mtime=1550051821 where itemid=15462497;
update items set lastlogsize=942205,mtime=1550051821 where itemid=15462498;
update items set lastlogsize=942205,mtime=1550051821 where itemid=15462499;
update items set lastlogsize=946380,mtime=1550051821 where itemid=15468103;
update items set lastlogsize=496972,mtime=1550051821 where itemid=15470823;
]

Example scenario:

transaction frontend:
begin;
UPDATE items SET history='32d',itemid='28539',hostid='10263',type='2',name='trap1',key_='trap1',delay='0',master_itemid=NULL,query_fields='',headers='',templateid='28535' WHERE itemid='28539';

transaction server:
begin;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=28538;
update items set lastlogsize=4787511,mtime=1550051820 where itemid=28539;

transaction frontend:
UPDATE items SET history='32d',itemid='28538',hostid='10264',type='2',name='trap1',key_='trap1',delay='0',master_itemid=NULL,query_fields='',headers='',templateid='28535' WHERE itemid='28538';

Result:
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Expected:
item updates are ordered by itemid

Comment by Vladislavs Sokurenko [ 2019 Feb 13 ]

In attached log it is seen that there is no sorting by itemid:

SQL (0.000729): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3805213',hostid='27411',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3805213'

SQL (0.000754): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='15455690',hostid='42864',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='15455690'

SQL (0.000732): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='4876374',hostid='31268',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='4876374'

SQL (0.007819): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3157101',hostid='24480',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3157101'

SQL (0.000854): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='15317491',hostid='42294',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='15317491'

SQL (0.000775): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='1042031',hostid='12159',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='1042031'

SQL (0.00064): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='1765330',hostid='17501',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='1765330'

SQL (0.000701): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='2108822',hostid='19667',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='2108822'

SQL (1.001411): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3094106',hostid='23886',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3094106'


SQL (0.000402): ROLLBACK
items.php:721 > DBend() > DBrollback() > DBexecute() in include/db.inc.php:319
Cannot update item
Details
pg_query(): Query failed: ERROR: deadlock detected
DETAIL: Process 52858 waits for ShareLock on transaction 188596064; blocked by process 61879.
Process 61879 waits for ShareLock on transaction 188591617; blocked by process 52858.
HINT: See server log for query details.
CONTEXT: while updating tuple (160487,16) in relation "items" [items.php:713 > CFrontendApiWrapper->update() > CApiWrapper->__call() > CFrontendApiWrapper->callMethod() > CApiWrapper->callMethod() > CFrontendApiWrapper->callClientMethod() > CLocalApiClient->callMethod() > call_user_func_array() > CItem->update() > CItem->inherit() > CItem->updateReal() > DB::update() > DBexecute() > pg_query() in include/db.inc.php:483]
Error in query [UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3094106',hostid='23886',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3094106'] [ERROR: deadlock detected
DETAIL: Process 52858 waits for ShareLock on transaction 188596064; blocked by process 61879.
Process 61879 waits for ShareLock on transaction 188591617; blocked by process 52858.
HINT: See server log for query details.
CONTEXT: while updating tuple (160487,16) in relation "items"]
SQL statement execution has failed "UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3094106',hostid='23886',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3094106'". [items.php:713 > CFrontendApiWrapper->update() > CApiWrapper->__call() > CFrontendApiWrapper->callMethod() > CApiWrapper->callMethod() > CFrontendApiWrapper->callClientMethod() > CLocalApiClient->callMethod() > call_user_func_array() > CItem->update() > CItem->inherit() > CItem->updateReal() > DB::update() > DB::exception() in include/classes/db/DB.php:574]
Comment by Oleksii Zagorskyi [ 2019 Feb 19 ]

For our issue I've described above, we have tested a patch, which performs sorting by itemid in the UPDATE sql.
We were able to update our template now, thanks.

Comment by Oleksii Zagorskyi [ 2019 Feb 19 ]

ZBXNEXT-5040 may improve it in future as well 

Comment by Alexander Shubin [ 2019 Feb 19 ]

Fixed in development branch:
svn://svn.zabbix.com/branches/dev/ZBX-2494 (for 4.0+)
svn://svn.zabbix.com/branches/dev/ZBX-2494-3.0 (for 3.0)

Comment by Alexander Shubin [ 2019 Apr 09 ]

Fixed in:

  • 3.0.27rc1 r92340
  • 4.0.7rc1 r92342
  • 4.2.1rc1 r92343
  • 4.4.0alpha1 r92344
Generated at Fri Apr 19 16:33:07 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.