[ZBX-19398] SQLite deadlock between history syncer and configuration syncer. Created: 2021 May 18  Updated: 2024 Apr 10  Resolved: 2021 Jun 14

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P)
Affects Version/s: 5.4.0
Fix Version/s: 5.4.1rc2, 6.0.0alpha1, 6.0 (plan)

Type: Problem report
Priority: Blocker
Reporter: Andrei Gushchin (Inactive)
Assignee: Vladislavs Sokurenko
Resolution: Fixed
Votes: 17
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Zip Archive ps_bt.zip    
Issue Links:
Causes
caused by ZBXNEXT-6292 Support of tags for configuration ent... Closed
Duplicate
is duplicated by ZBX-19489 Proxy 5.4.0 not response Closed
is duplicated by ZBX-19701 zabbix-proxy stops working Closed
Team: Team A
Sprint: Sprint 76 (May 2021), Sprint 77 (Jun 2021)
Story Points: 1

 Description   

Steps to reproduce:
Start the proxy with a fairly large configuration.
After some time a new configuration sync cycle starts, and all processes hang waiting for the SQLite mutex.

 1261:20210518:152725.569 == locks diagnostic information ==
  1261:20210518:152725.578 locks:
  1261:20210518:152725.587   ZBX_MUTEX_LOG:0x7fbb6fecd000
  1261:20210518:152725.596   ZBX_MUTEX_CACHE:0x7fbb6fecd028
  1261:20210518:152725.604   ZBX_MUTEX_TRENDS:0x7fbb6fecd050
  1261:20210518:152725.617   ZBX_MUTEX_CACHE_IDS:0x7fbb6fecd078
  1261:20210518:152725.626   ZBX_MUTEX_SELFMON:0x7fbb6fecd0a0
  1261:20210518:152725.640   ZBX_MUTEX_CPUSTATS:0x7fbb6fecd0c8
  1261:20210518:152725.652   ZBX_MUTEX_DISKSTATS:0x7fbb6fecd0f0
  1261:20210518:152725.661   ZBX_MUTEX_ITSERVICES:0x7fbb6fecd118
  1261:20210518:152725.671   ZBX_MUTEX_VALUECACHE:0x7fbb6fecd140
  1261:20210518:152725.680   ZBX_MUTEX_VMWARE:0x7fbb6fecd168
  1261:20210518:152725.689   ZBX_MUTEX_SQLITE3:0x7fbb6fecd190
  1261:20210518:152725.698   ZBX_MUTEX_PROCSTAT:0x7fbb6fecd1b8
  1261:20210518:152725.707   ZBX_MUTEX_PROXY_HISTORY:0x7fbb6fecd1e0
  1261:20210518:152725.717   ZBX_MUTEX_MODBUS:0x7fbb6fecd208
  1261:20210518:152725.725   ZBX_MUTEX_TREND_FUNC:0x7fbb6fecd230
  1261:20210518:152725.735   ZBX_RWLOCK_CONFIG:0x7fbb6fecd258
  1261:20210518:152725.744   ZBX_RWLOCK_VALUECACHE:0x7fbb6fecd290
  1261:20210518:152725.753 ==

root@s01:~$ ps ax | grep sync
 1263 ?        S      1:06 /usr/sbin/zabbix_proxy: configuration syncer [loading configuration]
 1332 ?        S      0:02 /usr/sbin/zabbix_proxy: history syncer #1 [processed 0 values in 0.000029 sec, syncing history]
 1333 ?        S      0:02 /usr/sbin/zabbix_proxy: history syncer #2 [processed 6090 values in 0.397928 sec, syncing history]
 1334 ?        S      0:02 /usr/sbin/zabbix_proxy: history syncer #3 [processed 0 values in 0.000036 sec, syncing history]
 1335 ?        S      0:02 /usr/sbin/zabbix_proxy: history syncer #4 [processed 7000 values in 0.397221 sec, syncing history]
13236 pts/4    S+     0:00 grep --color=auto sync
root@s01:~$ strace -p 1332
strace: Process 1332 attached
futex(0x7fbb6fecd190, FUTEX_WAIT, 2, NULL^Cstrace: Process 1332 detached
 <detached ...>
root@s01:~$ strace -p 1263
strace: Process 1263 attached
futex(0x7fbb6fecd190, FUTEX_WAIT, 2, NULL^Cstrace: Process 1263 detached
 <detached ...>

Result:
No history synchronisation and no configuration update. The proxy just sits idle.
Expected:
Processing data and configuration.
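
The futex address in the strace output above can be matched against the lock table from the diagnostic log to see which lock the stuck processes are waiting on. A minimal sketch of that lookup follows. The helper names `parse_locks` and `lock_for_futex` are hypothetical, the input lines are copied from this report, and it assumes the addresses printed by process 1261 are valid in the other proxy processes as well. It only does exact address matches.

```python
import re

# A few lines copied from the "locks diagnostic information" log above.
DIAG_LINES = [
    "1261:20210518:152725.680   ZBX_MUTEX_VMWARE:0x7fbb6fecd168",
    "1261:20210518:152725.689   ZBX_MUTEX_SQLITE3:0x7fbb6fecd190",
    "1261:20210518:152725.698   ZBX_MUTEX_PROCSTAT:0x7fbb6fecd1b8",
    "1261:20210518:152725.735   ZBX_RWLOCK_CONFIG:0x7fbb6fecd258",
]

# The futex call both strace sessions (PIDs 1332 and 1263) were blocked in.
STRACE_LINE = "futex(0x7fbb6fecd190, FUTEX_WAIT, 2, NULL"

LOCK_RE = re.compile(r"(ZBX_(?:MUTEX|RWLOCK)_\w+):(0x[0-9a-fA-F]+)")
FUTEX_RE = re.compile(r"futex\((0x[0-9a-fA-F]+)")


def parse_locks(lines):
    """Build an address -> lock name map from diagnostic log lines."""
    locks = {}
    for line in lines:
        match = LOCK_RE.search(line)
        if match:
            locks[int(match.group(2), 16)] = match.group(1)
    return locks


def lock_for_futex(strace_line, locks):
    """Return the lock name for the futex address, or None if unknown."""
    match = FUTEX_RE.search(strace_line)
    if match is None:
        return None
    return locks.get(int(match.group(1), 16))


if __name__ == "__main__":
    print(lock_for_futex(STRACE_LINE, parse_locks(DIAG_LINES)))
```

For the data in this report the lookup resolves to ZBX_MUTEX_SQLITE3 for both the history syncer and the configuration syncer.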



 Comments   
Comment by Adrian Kirchner [ 2021 May 21 ]

I'm experiencing this on two proxies. It happens roughly every 12 hours.

$ lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal


$ dpkg -l | grep zabbix
ii zabbix-agent 1:5.4.0-1+ubuntu20.04 amd64 Zabbix network monitoring solution - agent 
ii zabbix-proxy-sqlite3 1:5.4.0-1+ubuntu20.04 amd64 Zabbix network monitoring solution - proxy (SQLite3) 
ii zabbix-sender 1:5.4.0-1+ubuntu20.04 amd64 Zabbix network monitoring solution - sender

 

Comment by Max Ried [ 2021 May 25 ]

I think I'm seeing the same issue. It doesn't happen regularly, so I have avoided leaving the maximum debug level enabled all the time. While the proxy is in this state, running zabbix_proxy --runtime-control diaginfo produces no log output, and increasing the log level at runtime changes nothing either. When I send config_cache_reload, it displays "configuration cache reloading is already in progress". This is on Debian 10, amd64, zabbix-proxy-sqlite 5.4.0. systemd can't restart it unless you first stop it with --signal=SIGKILL.

Comment by Guilherme Xavier [ 2021 May 30 ]

I am also facing this problem. After upgrading from version 5.0 to 5.4, my Zabbix proxy is unable to send data to the Zabbix server. I use Debian 10, with the proxy, server and database installed on different servers. I tried recreating the proxy's SQLite database, but without success. Stopping the service or restarting the host takes a very long time. I have 350 vps on average.

 

 zabbix-proxy-sqlite3 1:5.4.0-2+debian10

 

zabbix_server.conf

AlertScriptsPath=/usr/lib/zabbix/alertscripts
CacheSize=512M
DBHost=IP
DBName=LOGIN
DBPassword=SENHA
DBUser=USER
DebugLevel=5
ExternalScripts=/usr/lib/zabbix/externalscripts
Fping6Location=/usr/bin/fping6
FpingLocation=/usr/bin/fping
LogFile=/var/log/zabbix/zabbix_server.log
LogSlowQueries=3000
MaxHousekeeperDelete=0
PidFile=/var/run/zabbix/zabbix_server.pid
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
SocketDir=/var/run/zabbix
StartDiscoverers=20
StartVMwareCollectors=5
StatsAllowedIP=127.0.0.1
ValueCacheSize=64M

 

zabbix_proxy.conf

CacheSize=512M
ConfigFrequency=120
DBName=/opt/zabbix-proxy-db/zbpx.db
DebugLevel=5
ExternalScripts=/usr/lib/zabbix/externalscripts
Fping6Location=/usr/bin/fping6
FpingLocation=/usr/bin/fping
Hostname=host
LogFileSize=0
LogFile=/var/log/zabbix/zabbix_proxy.log
LogSlowQueries=3000
PidFile=/run/zabbix/zabbix_proxy.pid
ProxyOfflineBuffer=8
Server=SERVER
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
SocketDir=/run/zabbix
StartDiscoverers=20
StartVMwareCollectors=5
StatsAllowedIP=127.0.0.1
Timeout=4
TLSAccept=psk
TLSConnect=psk
TLSPSKFile=/home/zbpx.psk
TLSPSKIdentity=ZBpx001

Comment by Ted Serreyn [ 2021 May 31 ]

It looks like I am also seeing this issue on zabbix-proxy with sqlite3 on Raspberry Pi. I have several of these proxies and most of them have experienced this problem. The restart issue is also a big deal, as a simple restart doesn't fix it.

 

Comment by Bruno Scota de Carvalho [ 2021 May 31 ]

Oh, so it's not happening only to me! It started after upgrading from 5.2 to 5.4, with Zabbix proxy running in containers. I have 3 proxies at separate locations, all with the same symptoms (200 vps each).

Comment by Max Ried [ 2021 Jun 01 ]

I only have 20 and 40 values per second respectively, so it does not seem to be related to load. The 40 vps proxy locks up more often, though.

Comment by Vladislavs Sokurenko [ 2021 Jun 01 ]

It seems to have been caused by ZBXNEXT-6292.
Backtrace provided by Kalimulin:

0x00007f747f7a829c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0  0x00007f747f7a829c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f747f7a1714 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x0000561e93929f4c in __zbx_mutex_lock ()
#3  0x0000561e9397667c in zbx_db_vselect ()
#4  0x0000561e9395e75a in DBselect ()
#5  0x0000561e938da887 in zbx_dbsync_compare_item_tags ()
#6  0x0000561e938b9727 in DCsync_configuration ()
#7  0x0000561e9396d27b in process_proxyconfig ()
#8  0x0000561e937d4fac in ?? ()
#9  0x0000561e937d5172 in proxyconfig_thread ()
#10 0x0000561e9392a274 in zbx_thread_start ()
#11 0x0000561e937cc793 in MAIN_ZABBIX_ENTRY ()
#12 0x0000561e93924c31 in daemon_start ()
#13 0x0000561e937cbda1 in main ()

As you can see, zbx_dbsync_compare_item_tags is called under the configuration cache lock: the configuration cache lock is held for writing, and then the SQLite mutex is locked for the database queries.

Expected:
No database queries should be performed under configuration cache lock.

The other process is the database syncer, which locks the SQLite3 mutex and then tries to lock the configuration cache. Each process holds the lock the other one is waiting for, so a deadlock occurs.

 0x00007f747f7a4025 in pthread_rwlock_wrlock () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) #0  0x00007f747f7a4025 in pthread_rwlock_wrlock () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x0000561e93929d48 in __zbx_rwlock_wrlock ()
#2  0x0000561e938cb23d in DCconfig_items_apply_changes ()
#3  0x0000561e938a5379 in ?? ()
#4  0x0000561e938a7099 in ?? ()
#5  0x0000561e938a8276 in zbx_sync_history_cache ()
#6  0x0000561e937cd7f5 in dbsyncer_thread ()
#7  0x0000561e9392a274 in zbx_thread_start ()
#8  0x0000561e937cca12 in MAIN_ZABBIX_ENTRY ()
#9  0x0000561e93924c31 in daemon_start ()
#10 0x0000561e937cbda1 in main ()
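
The lock-order inversion described above can be checked mechanically with a lock-order graph: add an edge from every lock a process already holds to each lock it takes next, and a cycle in the combined graph means a deadlock is possible. The sketch below is illustrative only. The traces are simplified models of the two backtraces, the `CONFIG_SYNCER_EXPECTED` trace models the "Expected" statement above rather than the actual code of the fix, and all function and variable names are hypothetical. Only the lock names come from the diagnostic log.

```python
# Simplified lock/unlock traces. Lock names are taken from the diagnostic log.
SQLITE = "ZBX_MUTEX_SQLITE3"
CONFIG = "ZBX_RWLOCK_CONFIG"

# History syncer: SQLite mutex first, then the configuration cache lock.
HISTORY_SYNCER = [("lock", SQLITE), ("lock", CONFIG),
                  ("unlock", CONFIG), ("unlock", SQLITE)]

# Configuration syncer as reported: DB query under the configuration cache lock.
CONFIG_SYNCER_REPORTED = [("lock", CONFIG), ("lock", SQLITE),
                          ("unlock", SQLITE), ("unlock", CONFIG)]

# Configuration syncer as expected: no DB query under the configuration cache lock.
CONFIG_SYNCER_EXPECTED = [("lock", SQLITE), ("unlock", SQLITE),
                          ("lock", CONFIG), ("unlock", CONFIG)]


def lock_order_edges(trace):
    """Return (held, wanted) pairs for one process's lock/unlock trace."""
    held, edges = [], set()
    for action, name in trace:
        if action == "lock":
            edges.update((h, name) for h in held)
            held.append(name)
        else:
            held.remove(name)
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed lock-order graph."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))


def deadlock_possible(traces):
    """True if the combined lock order of all traces contains a cycle."""
    edges = set()
    for trace in traces:
        edges |= lock_order_edges(trace)
    return has_cycle(edges)


if __name__ == "__main__":
    print(deadlock_possible([HISTORY_SYNCER, CONFIG_SYNCER_REPORTED]))
    print(deadlock_possible([HISTORY_SYNCER, CONFIG_SYNCER_EXPECTED]))
```

With the reported ordering the graph has edges in both directions between the two locks, so the check flags a possible deadlock. Once the configuration syncer no longer queries the database while holding the configuration cache lock, only the history syncer's edge remains and there is no cycle.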

Comment by Vladislavs Sokurenko [ 2021 Jun 01 ]

Fixed in pull request feature/ZBX-19398-5.4

Comment by Rostislav Palivoda (Inactive) [ 2021 Jun 02 ]

Releasing in 5.4.1rc2 today. 

Comment by Vladislavs Sokurenko [ 2021 Jun 02 ]

Fixed in:

Comment by Konstantīns Ošmjans [ 2021 Jun 09 ]

In the headers of this Jira issue:

Status: Done
Resolution: Unresolved
Fix Version/s: [ ...long list... ]

Is it really fixed or unresolved?

Comment by Alex Kalimulin [ 2021 Jun 09 ]

constantin.oshmyan, it's fixed and available in the recently released 5.4.1.

Generated at Tue Jun 17 07:33:14 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.