[ZBX-21445] Long lock in dbsyncer Created: 2022 Aug 08  Updated: 2024 Apr 10  Resolved: 2022 Aug 24

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 5.0.26
Fix Version/s: 5.0.27rc1, 6.0.8rc1, 6.2.2rc1, 6.4.0alpha1, 6.4 (plan)

Type: Problem report Priority: Critical
Reporter: Alexey Pustovalov Assignee: Vladislavs Sokurenko
Resolution: Fixed Votes: 0
Labels: dbsyncer, futex, lock
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File ZBX-21445.diff    
Issue Links:
Duplicate
Team: Team A
Sprint: Sprint 91 (Aug 2022)
Story Points: 1

 Description   

The following futex locking takes long time quite often:

334161 1659187673.103607 sendto(7, "Q\0\0\0\fcommit;\0", 13, MSG_NOSIGNAL, NULL, 0) = 13 <0.000029>
334161 1659187673.103671 poll([{fd=7, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=7, revents=POLLIN}]) <0.027096>
334161 1659187673.130843 recvfrom(7, "C\0\0\0\vCOMMIT\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 18 <0.000018>
334161 1659187673.131029 futex(0x7fdea62d1214, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 3, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
334163 1659187682.067957 <... futex resumed>) = 0 <10.893623>
334164 1659187682.067996 <... futex resumed>) = 0 <10.812465>
334162 1659187682.068007 <... futex resumed>) = 0 <10.815846>
334165 1659187682.068018 <... futex resumed>) = 0 <10.807555>
334164 1659187682.068031 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334163 1659187682.068042 futex(0x7fdea62d1214, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 3, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
334165 1659187682.068053 futex(0x7fdea62d1214, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 3, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
334162 1659187682.068061 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334165 1659187682.068072 <... futex resumed>) = -1 EAGAIN (Resource temporarily unavailable) <0.000013>
334164 1659187682.068086 <... futex resumed>) = 1 <0.000046>
334160 1659187682.068097 <... futex resumed>) = 0 <10.802822>
334165 1659187682.068112 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334162 1659187682.068124 <... futex resumed>) = 1 <0.000056>
334155 1659187682.068135 <... futex resumed>) = 0 <10.795516>
334167 1659187682.068150 <... futex resumed>) = 0 <10.771625>
334165 1659187682.068162 <... futex resumed>) = 1 <0.000041>
334164 1659187682.068173 sendto(7, "Q\0\0\0\fcommit;\0", 13, MSG_NOSIGNAL, NULL, 0 <unfinished ...>
334160 1659187682.068217 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334167 1659187682.068246 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334165 1659187682.068257 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334164 1659187682.068268 <... sendto resumed>) = 13 <0.000056>
334160 1659187682.068287 <... futex resumed>) = 1 <0.000047>
334167 1659187682.068309 <... futex resumed>) = 1 <0.000055>
334164 1659187682.068318 poll([{fd=7, events=POLLIN|POLLERR}], 1, -1 <unfinished ...>
334160 1659187682.068334 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334167 1659187682.068344 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334159 1659187682.068354 <... futex resumed>) = 0 <10.735996>
334163 1659187682.068429 <... futex resumed>) = -1 EAGAIN (Resource temporarily unavailable) <0.000379>
334159 1659187682.068442 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334169 1659187682.068802 <... futex resumed>) = 0 <10.771944>
334163 1659187682.068823 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334162 1659187682.068832 futex(0x7fdea62d1028, FUTEX_WAKE, 1 <unfinished ...>
334159 1659187682.068839 <... futex resumed>) = 1 <0.000391>
334157 1659187682.068849 <... futex resumed>) = 0 <10.703599>
334155 1659187682.068865 futex(0x7fdea62d1214, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 3, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
334169 1659187682.068877 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334165 1659187682.068885 <... futex resumed>) = 0 <0.000621>
334162 1659187682.068894 <... futex resumed>) = 1 <0.000056>
334169 1659187682.068904 <... futex resumed>) = 1 <0.000021>
334165 1659187682.068912 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334157 1659187682.068921 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334158 1659187682.069051 <... futex resumed>) = 0 <10.513387>
334157 1659187682.069065 <... futex resumed>) = 1 <0.000138>
334158 1659187682.069089 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334157 1659187682.069106 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334158 1659187682.069127 <... futex resumed>) = 1 <0.000030>
334158 1659187682.069150 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334162 1659187682.069192 rt_sigprocmask(SIG_SETMASK, [],  <unfinished ...>
334169 1659187682.069208 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334162 1659187682.069217 <... rt_sigprocmask resumed>NULL, 8) = 0 <0.000011>
334162 1659187682.069252 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT USR1 USR2 TERM],  <unfinished ...>
334159 1659187682.069271 futex(0x7fdea62d1028, FUTEX_WAKE, 1 <unfinished ...>
334162 1659187682.069280 <... rt_sigprocmask resumed>[], 8) = 0 <0.000013>
334162 1659187682.069303 stat("/var/log/zabbix/zabbix_server.log",  <unfinished ...>
334159 1659187682.069321 <... futex resumed>) = 1 <0.000038>
334162 1659187682.069332 <... stat resumed>{st_mode=S_IFREG|0664, st_size=708905937, ...}) = 0 <0.000019>
334160 1659187682.069345 <... futex resumed>) = 0 <0.001004>
334162 1659187682.069355 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000039>
334162 1659187682.069450 stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=352, ...}) = 0 <0.000020>
334162 1659187682.069502 rt_sigprocmask(SIG_BLOCK, [INT QUIT USR1 USR2 TERM], [], 8) = 0 <0.000013>
334162 1659187682.069554 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334159 1659187682.069564 munmap(0x7fd91395b000, 34586624 <unfinished ...>
334160 1659187682.069773 futex(0x7fdea62d1028, FUTEX_WAKE, 1 <unfinished ...>
334159 1659187682.069786 <... munmap resumed>) = 0 <0.000215>
334160 1659187682.069796 <... futex resumed>) = 1 <0.000015>
334167 1659187682.069806 <... futex resumed>) = 0 <0.001455>
334159 1659187682.069826 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000015>
334159 1659187682.069878 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT USR1 USR2 TERM], [], 8) = 0 <0.000013>
334159 1659187682.069923 stat("/var/log/zabbix/zabbix_server.log", {st_mode=S_IFREG|0664, st_size=708905937, ...}) = 0 <0.000016>
334159 1659187682.069967 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000012>
334159 1659187682.070005 stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=352, ...}) = 0 <0.000014>
334159 1659187682.070049 rt_sigprocmask(SIG_BLOCK, [INT QUIT USR1 USR2 TERM], [], 8) = 0 <0.000012>
334160 1659187682.070104 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000012>
334160 1659187682.070199 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT USR1 USR2 TERM], [], 8) = 0 <0.000021>
334160 1659187682.070276 stat("/var/log/zabbix/zabbix_server.log", {st_mode=S_IFREG|0664, st_size=708905937, ...}) = 0 <0.000015>
334160 1659187682.070321 rt_sigprocmask(SIG_SETMASK, [],  <unfinished ...>
334167 1659187682.070330 futex(0x7fdea62d1028, FUTEX_WAKE, 1 <unfinished ...>
334160 1659187682.070340 <... rt_sigprocmask resumed>NULL, 8) = 0 <0.000011>
334154 1659187682.070350 <... futex resumed>) = 0 <10.144995>
334160 1659187682.070360 stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=352, ...}) = 0 <0.000014>
334154 1659187682.070390 futex(0x7fdea62d1210, FUTEX_WAKE, 2147483647 <unfinished ...>
334160 1659187682.070401 rt_sigprocmask(SIG_BLOCK, [INT QUIT USR1 USR2 TERM],  <unfinished ...>
334154 1659187682.070412 <... futex resumed>) = 2 <0.000015>
334160 1659187682.070422 <... rt_sigprocmask resumed>[], 8) = 0 <0.000014>
334154 1659187682.070443 futex(0x7fdea62d1214, FUTEX_WAKE, 1 <unfinished ...>
334160 1659187682.070466 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334154 1659187682.070476 <... futex resumed>) = 1 <0.000026>
334154 1659187682.070498 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334167 1659187682.070654 <... futex resumed>) = 1 <0.000317>
334156 1659187682.070797 <... futex resumed>) = 0 <9.983853>
334168 1659187682.070820 <... futex resumed>) = 0 <9.966691>
334168 1659187682.070896 futex(0x7fdea62d1214, FUTEX_WAKE, 1) = 1 <0.000022>
334166 1659187682.070932 <... futex resumed>) = 0 <9.564827>
334168 1659187682.070941 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334166 1659187682.070965 futex(0x7fdea62d1214, FUTEX_WAKE, 1) = 1 <0.000008>
334166 1659187682.070987 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334167 1659187682.071024 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000015>
334167 1659187682.071074 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT USR1 USR2 TERM], [], 8) = 0 <0.000014>
334167 1659187682.071155 stat("/var/log/zabbix/zabbix_server.log", {st_mode=S_IFREG|0664, st_size=708905937, ...}) = 0 <0.000019>
334167 1659187682.071203 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000019>
334167 1659187682.071247 stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=352, ...}) = 0 <0.000015>
334167 1659187682.071291 rt_sigprocmask(SIG_BLOCK, [INT QUIT USR1 USR2 TERM], [], 8) = 0 <0.000014>
334167 1659187682.071345 futex(0x7fdea62d1028, FUTEX_WAIT, 2, NULL <unfinished ...>
334159 1659187682.071428 futex(0x7fdea62d1028, FUTEX_WAKE, 1) = 1 <0.000022>
334165 1659187682.071464 <... futex resumed>) = 0 <0.002546>
334159 1659187682.071475 futex(0x7fdea62d1210, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 2, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
334165 1659187682.071981 futex(0x7fdea62d1028, FUTEX_WAKE, 1) = 1 <0.000025>
334157 1659187682.072022 <... futex resumed>) = 0 <0.002909>
334159 1659187682.072176 <... futex resumed>) = 0 <0.000694>
334165 1659187682.072305 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000015>
334165 1659187682.072360 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT USR1 USR2 TERM], [], 8) = 0 <0.000014>
334157 1659187682.072518 futex(0x7fdea62d1028, FUTEX_WAKE, 1 <unfinished ...>
334165 1659187682.072550 stat("/var/log/zabbix/zabbix_server.log",  <unfinished ...>
334157 1659187682.072561 <... futex resumed>) = 1 <0.000035>


 Comments   
Comment by Vladislavs Sokurenko [ 2022 Aug 08 ]

If it helped then it still remains a mystery what helped, inside this function there is zbx_handle_log() and update_resolver_conf(), it seems that update_resolver_conf() is more suspicious if that is the case then its caused by ZBXNEXT-1862

Comment by Vladislavs Sokurenko [ 2022 Aug 08 ]

Should check if this code is needed anymore since fix:
https://sourceware.org/bugzilla/show_bug.cgi?id=984
It says "Fixed in 2.26." available 5 years ago

Comment by Vladislavs Sokurenko [ 2022 Aug 12 ]

Tested on Centos 6 ldd (GNU libc) 2.12 and indeed res_init was required, but also tested on Ubuntu ldd (Ubuntu GLIBC 2.35-0ubuntu3.1) 2.35 and res_init is no longer required.

Comment by Vladislavs Sokurenko [ 2022 Aug 12 ]

res_init should be removed, see ZBX-21445.diff, as it was only used as a workaround for glibc issue that is fixed now. It could improve performance in history syncers, preprocessing workers and many others processes. Additionally it is better to remove log rotation in history syncers when going to idle for performance reasons.

Comment by Vladislavs Sokurenko [ 2022 Aug 16 ]

Fixed in pull request feature/ZBX-21445-5.0

Comment by Vladislavs Sokurenko [ 2022 Aug 19 ]

Fixed in:

Generated at Wed Jul 02 08:10:24 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.