[ZBX-8181] Possible lock itservices processing Created: 2014 May 07  Updated: 2017 May 30  Resolved: 2014 May 09

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 2.2.3
Fix Version/s: 2.0.12rc3, 2.2.4rc1, 2.3.0

Type: Incident report Priority: Major
Reporter: Alexey Pustovalov Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Oracle

Issue Links:
Duplicate

 Description   

May 6 22:55:15 zabbix_server[8887]: In DBupdate_itservices()
May 6 22:55:15 zabbix_server[8887]: In its_flush_updates()
May 6 22:55:15 zabbix_server[8887]: In its_load_services_by_triggerids()
May 6 22:55:15 zabbix_server[8887]: query [txnlev:1] [select serviceid,triggerid,status,algorithm from services where triggerid in (19407,1237656,1237662,1237959,1238091,1238207,1238264,1238409,1238434,1238679,1238705,1238733,1690422,1705162)]
May 6 22:55:15 zabbix_server[8887]: In its_itservices_load_parents()
May 6 22:55:15 zabbix_server[8887]: query [txnlev:1] [select s.serviceid,s.status,s.algorithm,sl.servicedownid from services s,services_links sl where s.serviceid=sl.serviceupid and sl.servicedownid in (28,29,30,32)]
May 6 22:55:15 zabbix_server[8887]: In its_itservices_load_parents()
May 6 22:55:15 zabbix_server[8887]: query [txnlev:1] [select s.serviceid,s.status,s.algorithm,sl.servicedownid from services s,services_links sl where s.serviceid=sl.serviceupid and sl.servicedownid in (13,14,15,31)]
May 6 22:55:15 zabbix_server[8887]: In its_itservices_load_parents()
May 6 22:55:15 zabbix_server[8887]: query [txnlev:1] [select s.serviceid,s.status,s.algorithm,sl.servicedownid from services s,services_links sl where s.serviceid=sl.serviceupid and sl.servicedownid=12]
May 6 22:55:15 zabbix_server[8887]: End of its_itservices_load_parents()
May 6 22:55:15 zabbix_server[8887]: End of its_itservices_load_parents()
May 6 22:55:15 zabbix_server[8887]: End of its_itservices_load_parents()
May 6 22:55:15 zabbix_server[8887]: In its_itservices_load_children()
May 6 22:55:15 zabbix_server[8887]: query [txnlev:1] [select s.serviceid,s.status,s.algorithm,sl.serviceupid from services s,services_links sl where s.serviceid=sl.servicedownid and sl.serviceupid in (12,13,14,15,31)]
May 6 22:55:15 zabbix_server[8887]: End of its_itservices_load_children()
May 6 22:55:15 zabbix_server[8887]: End of its_load_services_by_triggerids()

After that, nothing more was logged for a few hours, until the server was restarted.



 Comments   
Comment by Alexey Pustovalov [ 2014 May 07 ]
select serviceid,triggerid,status,algorithm from services where triggerid in (19407,1237656,1237662,1237959,1238091,1238207,1238264,1238409,1238434,1238679,1238705,1238733,1690422,1705162);
SERVICEID  TRIGGERID  STATUS  ALGORITHM
29         19407      0       1
28         19407      0       1
30         19407      0       1
32         19407      0       1

select s.serviceid,s.status,s.algorithm,sl.servicedownid from services s,services_links sl where s.serviceid=sl.serviceupid and sl.servicedownid in (28,29,30,32);

SERVICEID  STATUS  ALGORITHM  SERVICEDOWNID
13         0       1          28
14         0       1          30
15         0       1          29
31         0       1          32


select s.serviceid,s.status,s.algorithm,sl.servicedownid from services s,services_links sl where s.serviceid=sl.serviceupid and sl.servicedownid in (13,14,15,31);

SERVICEID  STATUS  ALGORITHM  SERVICEDOWNID
12         0       1          13
12         0       1          14
12         0       1          15
12         0       1          31

select s.serviceid,s.status,s.algorithm,sl.servicedownid from services s,services_links sl where s.serviceid=sl.serviceupid and sl.servicedownid=12;

 no data found 


select s.serviceid,s.status,s.algorithm,sl.serviceupid from services s,services_links sl where s.serviceid=sl.servicedownid and sl.serviceupid in (12,13,14,15,31);


SERVICEID  STATUS  ALGORITHM  SERVICEUPID
13         0       1          12
14         0       1          12
15         0       1          12
29         0       1          15
21         0       1          14
22         0       1          13
23         0       1          15
24         0       1          14
25         0       1          13
26         0       1          13
27         0       1          14
28         0       1          13
30         0       1          14
31         0       1          12
32         0       1          31
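
Read together, the result sets above describe a small service tree. As a minimal sketch (Python is used only for illustration; the helper names build_tree and path_to_root are hypothetical and not part of Zabbix), the rows of the last result can be turned into a parent map to check that the four services tied to trigger 19407 all roll up to service 12. The sketch assumes each service has a single parent, which holds for these rows. The seen set is a defensive guard in this sketch only and does not describe how Zabbix walks the tree.

```python
from collections import defaultdict

# (serviceid, serviceupid) pairs copied from the last result set above.
LINKS = [
    (13, 12), (14, 12), (15, 12), (29, 15), (21, 14),
    (22, 13), (23, 15), (24, 14), (25, 13), (26, 13),
    (27, 14), (28, 13), (30, 14), (31, 12), (32, 31),
]


def build_tree(links):
    """Return (children, parent) maps built from (child, parent) pairs."""
    children = defaultdict(list)
    parent = {}
    for child, up in links:
        children[up].append(child)
        parent[child] = up  # assumption: one parent per service
    for kids in children.values():
        kids.sort()
    return dict(children), parent


def path_to_root(serviceid, parent):
    """Follow parent links upward; stop at the root or on a repeated id."""
    path = [serviceid]
    seen = {serviceid}
    while path[-1] in parent:
        nxt = parent[path[-1]]
        if nxt in seen:  # defensive guard against cyclic links
            break
        path.append(nxt)
        seen.add(nxt)
    return path


if __name__ == "__main__":
    children, parent = build_tree(LINKS)
    for sid in (28, 29, 30, 32):  # services returned for trigger 19407
        print(sid, "->", path_to_root(sid, parent))
```

For these rows, path_to_root(28, parent) yields [28, 13, 12], and the other three trigger-linked services (29, 30, 32) also end at service 12.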

Comment by Alexey Pustovalov [ 2014 May 09 ]
 7206 ?        S     20:16 /usr/local/sbin/zabbix_server: configuration syncer [synced configuration in 19.088262 sec, idle 1800 sec]
 8882 ?        S    107:29 /usr/local/sbin/zabbix_server: history syncer #1 [synced 6 items in 0.235823 sec, syncing history]
 8883 ?        S     96:45 /usr/local/sbin/zabbix_server: history syncer #2 [synced 990 items in 6.397798 sec, syncing history]
 8884 ?        S    103:07 /usr/local/sbin/zabbix_server: history syncer #3 [synced 0 items in 0.000084 sec, syncing history]
 8885 ?        S    101:47 /usr/local/sbin/zabbix_server: history syncer #4 [synced 1 items in 0.002056 sec, syncing history]
 8886 ?        S     97:52 /usr/local/sbin/zabbix_server: history syncer #5 [synced 323 items in 3.941230 sec, syncing history]
 8887 ?        R    457:04 /usr/local/sbin/zabbix_server: history syncer #6 [synced 56 items in 4.075265 sec, syncing history]
 8888 ?        S    102:45 /usr/local/sbin/zabbix_server: history syncer #7 [synced 0 items in 0.000125 sec, syncing history]
 8889 ?        S    109:29 /usr/local/sbin/zabbix_server: history syncer #8 [synced 0 items in 0.000033 sec, syncing history]
 8890 ?        S    102:24 /usr/local/sbin/zabbix_server: history syncer #9 [synced 1000 items in 4.613279 sec, syncing history]
 8891 ?        S    102:12 /usr/local/sbin/zabbix_server: history syncer #10 [synced 996 items in 9.428713 sec, syncing history]
 8892 ?        S     99:40 /usr/local/sbin/zabbix_server: history syncer #11 [synced 0 items in 0.000889 sec, syncing history]
 8893 ?        S     99:44 /usr/local/sbin/zabbix_server: history syncer #12 [synced 101 items in 0.135954 sec, syncing history]
24243 pts/0    S+     0:00 grep sync
Comment by dimir [ 2014 May 09 ]

The problem happens when there are 2 child nodes with the same trigger:

Service      Status calculation                            Trigger
root
├ child11    Problem, if at least one child has a problem  trigger1
└ child12    Problem, if at least one child has a problem  trigger1

In this case, when the trigger fires, the syncer process enters an infinite loop. The fix will be available for 2.0, 2.2 and trunk.
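
On an unpatched server it may help to check whether any trigger is linked to more than one service, which is the condition described above. The following is a minimal sketch, not part of the fix: the query uses only the services table and the triggerid column seen in the logged queries, and it is run here against an in-memory SQLite database purely for illustration. It has not been tested on Oracle, and the function name shared_triggers is hypothetical.

```python
import sqlite3

# (serviceid, triggerid, status, algorithm) rows copied from the
# reporter's first result set in this issue.
ROWS = [
    (29, 19407, 0, 1),
    (28, 19407, 0, 1),
    (30, 19407, 0, 1),
    (32, 19407, 0, 1),
]

# Triggers that are linked to more than one service.
QUERY = """
select triggerid, count(*) as service_count
from services
where triggerid is not null
group by triggerid
having count(*) > 1
order by triggerid
"""


def shared_triggers(rows):
    """Load rows into a throwaway table and return (triggerid, count) pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        # Illustrative schema only: just the columns used in the logged queries.
        conn.execute(
            "create table services ("
            "serviceid integer primary key, triggerid integer, "
            "status integer, algorithm integer)"
        )
        conn.executemany("insert into services values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(shared_triggers(ROWS))
```

With the reporter's rows this returns a single pair, (19407, 4), matching the four services that share trigger 19407.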

Comment by dimir [ 2014 May 09 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-8181 (for 2.0)

Comment by Alexander Vladishev [ 2014 May 10 ]

Fixed in pre-2.0.12rc3 r45322, pre-2.2.4rc1 r45323 and pre-2.3.0 (trunk) r45324.
