[ZBX-2494] deadlock between server and frontend Created: 2010 Jun 01 Updated: 2024 Apr 10 Resolved: 2019 Apr 09 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | API (A) |
Affects Version/s: | 1.8.2, 3.0.25, 4.0.4, 4.2.0alpha3 |
Fix Version/s: | 3.0.27rc1, 4.0.7rc1, 4.2.1rc1, 4.4.0alpha1, 4.4 (plan) |
Type: | Problem report | Priority: | Critical |
Reporter: | Aleksandrs Saveljevs | Assignee: | Alexander Shubin |
Resolution: | Fixed | Votes: | 32 |
Labels: | deadlock | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Attachments: |
![]() |
||||||||||||||||||||||||||||||||||||||||||||||||
Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||
Team: | |||||||||||||||||||||||||||||||||||||||||||||||||
Sprint: | Sprint 49 (Feb 2019), Sprint 50 (Mar 2019), Sprint 51 (Apr 2019) | ||||||||||||||||||||||||||||||||||||||||||||||||
Story Points: | 0.5 |
Description |
Server and frontend sometimes deadlock. Relevant forum threads:
|
Comments |
Comment by Aleksandrs Saveljevs [ 2010 Jun 01 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Relevant issues: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Pritpal [ 2010 Jun 28 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Hey Guys, Is there a timeline or some sort of idea when this issue may get fixed? | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Aleksandrs Saveljevs [ 2010 Jun 29 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
My best guess would be 2.0, but this estimate might or might not change later. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by richlv [ 2010 Aug 19 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Aleksandrs Saveljevs [ 2010 Sep 22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
No idea how to reproduce, but the following has just happened in my development environment: 15956:20100922:133443.239 [Z3005] Query failed: [1213] Deadlock found when trying to get lock; try restarting transaction [update triggers set value=1,lastchange=1285151681,error='' where triggerid=1401400000013838] Coincidentally, it happened exactly when a host was deleted through the GUI. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Aleksandrs Saveljevs [ 2010 Oct 05 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
This is just to say that we now have a reliable way of reproducing deadlocks like those shown in the previous post. In these cases, server deadlocks by itself, without any cooperation from the GUI. So the problem also falls under | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Alexander Vladishev [ 2010 Oct 06 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
... and this in my environment:
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Nick Tzaperas [ 2011 Mar 18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Would you mind giving some info on how you are able to reproduce the deadlock? This issue is affected me on a weekly basis and any information as to what is going on would be appreciated. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Aleksandrs Saveljevs [ 2011 Mar 18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Nick, you are right, we should have documented that. However, it is not something you would do in real life. As far as I remember, you start by creating 10,000 hosts. You then create item "random.number" (updated each second) and trigger " {Template_Linux:random.number.last(0)}=1" on all of them. The "random.number" item is a user parameter that either returns 0 or 1, or becomes unsupported: UserParameter=random.number[*],case $$(($$RANDOM%3)) in 0) echo 0 ;; 1) echo 1 ;; 2) echo "" ;; esac This setup produces a lot of history and events, and causes SQL queries on the server to deadlock fairly quickly. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by richlv [ 2012 Feb 14 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
the server deadlocking itself might be a different issue. most likely a server <-> frontend deadlock was just observed while updating a templated item (while server was running, ~2.4k nvps) : ociexecute() [<a href='function.ociexecute'>function.ociexecute</a>]: ORA-00060: deadlock detected while waiting for resource [include/db.inc.php:557] SQL error [ORA-00060: deadlock detected while waiting for resource] in [UPDATE items SET description='Version of zabbix_agent(d) running',key_='agent.version',hostid=100100000019092,delay=1800,history=7,type=7,snmp_community='',snmp_oid='',value_type=1,data_type=0,trapper_hosts='',snmp_port=161,units='',multiplier=0,delta=0,snmpv3_securityname='',snmpv3_securitylevel=0,snmpv3_authpassphrase='',snmpv3_privpassphrase='',formula='0',trends=0,logtimefmt='',valuemapid=0,delay_flex='',params='',ipmi_sensor='',templateid=100100000010774,authtype=0,username='',password='',publickey='',privatekey='' WHERE itemid=100100000248959] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Oleksii Zagorskyi [ 2012 Feb 14 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ZBXNEXT-822 could be considered (could help?) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Randall Svancara [ 2012 Nov 02 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
I looked at this patch. I believe it would result in just moving the problem to another table at least for the backend. For the front end, it may help where the items table could be touched and reduce the risk of a lock. The fact of the matter is the items table is pounded by the zabbix backend. It is very obvious that multiple concurrent updates to the same table by several backend threads in conjunction with changes from the front end would cause some sort of locking issue. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Eugene [ 2013 Feb 20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
When the problem will be solved? Because of this problem, it is impossible to make changes to the database without stopping the Zabbix server. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Jasper Spit [ 2013 Apr 13 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
This affects 2.0.4, 2.0.5, 2.0.6rc1 as well. Please fix soon, makes Zabbix pretty much unusable for me at the moment. Mass updates of a few items or updating a graph causes these deadlocks for me, the update I do affects +/- 200 hosts with about 80 items, so nothing out of the ordinary. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by richlv [ 2013 Apr 13 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
a more significant redesign is planned as a part of | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Iskra Zabbix [ 2013 Aug 14 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
This affects 2.0.7 as well. Very annoying. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Anton Samets [ 2013 Oct 28 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+ to this task - right now I can't add new graph or update existent one with errors: ERROR: Cannot add graph Error in query [INSERT INTO graphs_items (gitemid,graphid,itemid,drawtype,sortorder,color,yaxisside,calc_fnc,type) VALUES ('558291','200912','427745','2','1','00C800','0','2','0')] [Deadlock found when trying to get lock; try restarting transaction] SQL statement execution has failed "INSERT INTO graphs_items (gitemid,graphid,itemid,drawtype,sortorder,color,yaxisside,calc_fnc,type) VALUES ('558291','200912','427745','2','1','00C800','0','2','0')". or: ERROR: Cannot update graph Error in query [INSERT INTO graphs_items (gitemid,graphid,itemid,drawtype,sortorder,color,yaxisside,calc_fnc,type) VALUES ('570943','197762','385000','2','1','EE0000','0','2','0')] [Got error -1 from storage engine] SQL statement execution has failed "INSERT INTO graphs_items (gitemid,graphid,itemid,drawtype,sortorder,color,yaxisside,calc_fnc,type) VALUES ('570943','197762','385000','2','1','EE0000','0','2','0')". mysql 5.6.10 Sometimes it saves the graph, but this issue is very annoying. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Rostislav Palivoda [ 2018 Jan 08 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Currently supported versions are 2.2, 3.0, 3.4 and 4.0 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Oleksii Zagorskyi [ 2019 Feb 13 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Zabbix version 3.4.15 Frontend side error - https://pastebin.com/bwYYNRGa One more case - both errors https://pastebin.com/0KbksUbK What to do? ... | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Vladislavs Sokurenko [ 2019 Feb 13 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Could it be due to hostid updates ? What frontend could possibly do is select for update where itemid in (<sorted itemids>); then do updates, this is regarding zalex_ua comment But I would also like to see frontend queries that failed (whole transaction) in case it was not sorted by itemid, perhaps it is sorted by hostid first. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Vladislavs Sokurenko [ 2019 Feb 13 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Reproduced on trunk Issue that zalex_ua mentioned
Notice that update is not ordered by itemid: SQL (0.000154): UPDATE items SET history='32d',itemid='28539',hostid='10263',type='2',name='trap1',key_='trap1',delay='0',master_itemid=NULL,query_fields='',headers='',templateid='28535' WHERE itemid='28539' items.php:847 → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → CItem->update() → CItemGeneral->inherit() → CItemGeneral->_inherit() → CItem->updateReal() → DB::update() → DBexecute() in include/classes/db/DB.php:573 SQL (0.000218): UPDATE items SET history='32d',itemid='28538',hostid='10264',type='2',name='trap1',key_='trap1',delay='0',master_itemid=NULL,query_fields='',headers='',templateid='28535' WHERE itemid='28538' items.php:847 → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → CItem->update() → CItemGeneral->inherit() → CItemGeneral->_inherit() → CItem->updateReal() → DB::update() → DBexecute() in include/classes/db/DB.php:573 When server will update lastlogsize it sometimes deadlock as per zalex_ua comment this can happen: 53918:20190213:125722.423 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: deadlock detected DETAIL: Process 73842 waits for ShareLock on transaction 314356128; blocked by process 79179. Process 79179 waits for ShareLock on transaction 314358666; blocked by process 73842. HINT: See server log for query details. CONTEXT: while updating tuple (155702,41) in relation "items" [update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059315; update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059316; update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059319; update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059321; update items set lastlogsize=4787511,mtime=1550051820 where itemid=3059328; update items set lastlogsize=1005195,mtime=1550051840 where itemid=3083837; update items set lastlogsize=1143660,mtime=1550051821 where itemid=3174332; update items set lastlogsize=1143660,mtime=1550051821 where itemid=3174333; update items set lastlogsize=1143660,mtime=1550051821 where itemid=3174340; update items set lastlogsize=18770462,mtime=1550051833 where itemid=5414135; update items set lastlogsize=4787511,mtime=1550051820 where itemid=9436439; update items set lastlogsize=4787511,mtime=1550051820 where itemid=10573903; update items set lastlogsize=4787511,mtime=1550051820 where itemid=10677857; update items set lastlogsize=4787511,mtime=1550051820 where itemid=10914849; update items set lastlogsize=4787511,mtime=1550051820 where itemid=10959567; update items set lastlogsize=4787511,mtime=1550051820 where itemid=10959737; update items set lastlogsize=4787511,mtime=1550051820 where itemid=10959907; update items set lastlogsize=4787511,mtime=1550051820 where itemid=10964381; update items set lastlogsize=949171,mtime=1550051821 where itemid=15122616; update items set lastlogsize=942205,mtime=1550051821 where itemid=15462490; update items set lastlogsize=942205,mtime=1550051821 where itemid=15462491; update items set lastlogsize=942205,mtime=1550051821 where itemid=15462492; update items set lastlogsize=942205,mtime=1550051821 where itemid=15462494; update items set lastlogsize=942205,mtime=1550051821 where itemid=15462495; update items set lastlogsize=942205,mtime=1550051821 where itemid=15462496; update items set lastlogsize=942205,mtime=1550051821 where itemid=15462497; update items set lastlogsize=942205,mtime=1550051821 where itemid=15462498; update items set lastlogsize=942205,mtime=1550051821 where itemid=15462499; update items set lastlogsize=946380,mtime=1550051821 where itemid=15468103; update items set lastlogsize=496972,mtime=1550051821 where itemid=15470823; ] Example scenario: transaction frontend: begin; UPDATE items SET history='32d',itemid='28539',hostid='10263',type='2',name='trap1',key_='trap1',delay='0',master_itemid=NULL,query_fields='',headers='',templateid='28535' WHERE itemid='28539'; transaction server: begin; update items set lastlogsize=4787511,mtime=1550051820 where itemid=28538; update items set lastlogsize=4787511,mtime=1550051820 where itemid=28539; transaction frontend: UPDATE items SET history='32d',itemid='28538',hostid='10264',type='2',name='trap1',key_='trap1',delay='0',master_itemid=NULL,query_fields='',headers='',templateid='28535' WHERE itemid='28538'; Result: ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction Expected: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Vladislavs Sokurenko [ 2019 Feb 13 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
In attached log it is seen that there is no sorting by itemid: SQL (0.000729): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3805213',hostid='27411',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3805213' SQL (0.000754): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='15455690',hostid='42864',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='15455690' SQL (0.000732): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='4876374',hostid='31268',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='4876374' SQL (0.007819): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3157101',hostid='24480',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3157101' SQL (0.000854): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='15317491',hostid='42294',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='15317491' SQL (0.000775): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='1042031',hostid='12159',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='1042031' SQL (0.00064): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='1765330',hostid='17501',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='1765330' SQL (0.000701): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='2108822',hostid='19667',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='2108822' SQL (1.001411): UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3094106',hostid='23886',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3094106' SQL (0.000402): ROLLBACK items.php:721 > DBend() > DBrollback() > DBexecute() in include/db.inc.php:319 Cannot update item Details pg_query(): Query failed: ERROR: deadlock detected DETAIL: Process 52858 waits for ShareLock on transaction 188596064; blocked by process 61879. Process 61879 waits for ShareLock on transaction 188591617; blocked by process 52858. HINT: See server log for query details. CONTEXT: while updating tuple (160487,16) in relation "items" [items.php:713 > CFrontendApiWrapper->update() > CApiWrapper->__call() > CFrontendApiWrapper->callMethod() > CApiWrapper->callMethod() > CFrontendApiWrapper->callClientMethod() > CLocalApiClient->callMethod() > call_user_func_array() > CItem->update() > CItem->inherit() > CItem->updateReal() > DB::update() > DBexecute() > pg_query() in include/db.inc.php:483] Error in query [UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3094106',hostid='23886',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3094106'] [ERROR: deadlock detected DETAIL: Process 52858 waits for ShareLock on transaction 188596064; blocked by process 61879. Process 61879 waits for ShareLock on transaction 188591617; blocked by process 52858. HINT: See server log for query details. CONTEXT: while updating tuple (160487,16) in relation "items"] SQL statement execution has failed "UPDATE items SET key_='logrt[/var/log/messages,(error(?!.*dev fd0)(?!.*ansible-command Invoked)),,,skip,,]',trends='0',itemid='3094106',hostid='23886',type='7',name='error',delay='60s',master_itemid=NULL,templateid='921754' WHERE itemid='3094106'". [items.php:713 > CFrontendApiWrapper->update() > CApiWrapper->__call() > CFrontendApiWrapper->callMethod() > CApiWrapper->callMethod() > CFrontendApiWrapper->callClientMethod() > CLocalApiClient->callMethod() > call_user_func_array() > CItem->update() > CItem->inherit() > CItem->updateReal() > DB::update() > DB::exception() in include/classes/db/DB.php:574] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Oleksii Zagorskyi [ 2019 Feb 19 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
For our issue I've described above, we have tested a patch, which performs sorting by itemid in the UPDATE sql. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Oleksii Zagorskyi [ 2019 Feb 19 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Alexander Shubin [ 2019 Feb 19 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Fixed in development branch: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Alexander Shubin [ 2019 Apr 09 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Fixed in:
|