[ZBX-6630] Host status is not actual on proxy side because of configuration syncer process Created: 2013 May 27 Updated: 2022 Oct 08 Resolved: 2013 Jul 05 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Proxy (P) |
Affects Version/s: | 2.0.7rc1, 2.1.0 |
Fix Version/s: | 2.0.7rc1, 2.1.0 |
Type: | Incident report | Priority: | Major |
Reporter: | Alexey Pustovalov | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 0 |
Labels: | performance, proxy | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
MySQL with locktimeout 50 seconds (default) |
Attachments: | ZBX-6630.log blocked connections.png proxy.c.gz | ||||
Issue Links: |
|
Description |
On a heavily loaded proxy (about 2000 nvps), the configuration syncer blocks the hosts table for a long time. During this time the unreachable poller and the pollers cannot update host status. So we should update only the changed configuration rows instead of all rows. |
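The idea of updating only changed rows can be sketched as follows. This is a minimal illustration, not the code from the actual fix: the function name `sync_table` is hypothetical, the `hosts` table is reduced to three columns (hostid, host, status), and SQLite is used only so the example runs on its own.

```python
import sqlite3

def sync_table(conn, incoming):
    """Apply incoming {hostid: (host, status)} to hosts, touching only rows that differ.

    Hypothetical helper with a simplified schema; returns (inserted, updated, deleted).
    """
    current = {r[0]: (r[1], r[2])
               for r in conn.execute("select hostid, host, status from hosts")}
    inserts = [(k, *v) for k, v in incoming.items() if k not in current]
    updates = [(*v, k) for k, v in incoming.items()
               if k in current and current[k] != v]
    deletes = [(k,) for k in current if k not in incoming]
    with conn:  # one transaction that only contains the rows that really changed
        conn.executemany("insert into hosts (hostid, host, status) values (?, ?, ?)", inserts)
        conn.executemany("update hosts set host = ?, status = ? where hostid = ?", updates)
        conn.executemany("delete from hosts where hostid = ?", deletes)
    return len(inserts), len(updates), len(deletes)
```

With unchanged rows left alone, the write transaction stays short, so other writers to the same table wait less. How much this helps on a real proxy depends on the database and the size of the change set.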
Comments |
Comment by Andris Mednis [ 2013 Jun 28 ] |
Available in the development branch svn://svn.zabbix.com/branches/dev/ZBX-6630. |
Comment by Alexey Pustovalov [ 2013 Jun 28 ] |
(1) proxy died because of null value in port column for Trapper item:
26463:20130628:201123.574 Number of cell 26 []
26463:20130628:201123.574 Number of cell 27 [(null)]
26463:20130628:201123.574 Got signal [signal:11(SIGSEGV),reason:1,refaddr:(nil)]. Crashing ...
26463:20130628:201123.574 ====== Fatal information: ======
26463:20130628:201123.574 Program counter: 0x7fb720446d5f
26463:20130628:201123.574 === Registers: ===
26463:20130628:201123.574 r8 = 0 = 0 = 0
26463:20130628:201123.574 r9 = 7fb7206a2ed0 = 140424499572432 = 140424499572432
26463:20130628:201123.574 r10 = 7fb7206a2ed0 = 140424499572432 = 140424499572432
26463:20130628:201123.574 r11 = 206 = 518 = 518
26463:20130628:201123.574 r12 = 7fff879eac88 = 140735468711048 = 140735468711048
26463:20130628:201123.574 r13 = 0 = 0 = 0
26463:20130628:201123.574 r14 = 7fff879eac88 = 140735468711048 = 140735468711048
26463:20130628:201123.574 r15 = 7fff879eac90 = 140735468711056 = 140735468711056
26463:20130628:201123.574 rdi = 0 = 0 = 0
26463:20130628:201123.574 rsi = 7fff879eac90 = 140735468711056 = 140735468711056
26463:20130628:201123.574 rbp = 7fff879eac90 = 140735468711056 = 140735468711056
26463:20130628:201123.574 rbx = 7fff879eacb8 = 140735468711096 = 140735468711096
26463:20130628:201123.574 rdx = 7fff879eac88 = 140735468711048 = 140735468711048
26463:20130628:201123.574 rax = 0 = 0 = 0
26463:20130628:201123.574 rcx = 0 = 0 = 0
sqlite> select itemid,type,snmp_community,snmp_oid,hostid,key_,delay,status,value_type,trapper_hosts,snmpv3_securityname,snmpv3_securitylevel,snmpv3_authpassphrase,snmpv3_privpassphrase,formula,logtimefmt,delay_flex,params,ipmi_sensor,data_type,authtype,username,password,publickey,privatekey,flags,filter,interfaceid,port from items where itemid in (7048852,6746840);
6746840|3|||45319|icmppingloss[{HOSTNAME},10,,32,600]|120|0|3|||0|||1|||||0|0|||||0||68134|
7048852|2|||45319|trap.maintenance.callerid|0|0|4|||0|||1|||||0|0|||||0|||
Proxy dies while processing 7048852 item.
[6746840,3,"","",45319,"icmppingloss[{HOSTNAME},10,,32,600]",120,0,3,"","",0,"","","1","","","","",0,0,"","","","",0,"",68134,""],
[7048852,2,"","",45319,"trap.maintenance.callerid",0,0,4,"","",0,"","","1","","","","",0,0,"","","","",0,"",null,""],
andris RESOLVED in r36669 |
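To see which cell of the crashing row carries the null, the JSON row quoted above can be lined up against the column list of the quoted select. This is only a diagnostic sketch: it assumes the JSON values arrive in the same order as that column list, and the helper names `null_columns` and `changed_columns` are hypothetical, not taken from proxy.c.

```python
import json

# Column order copied from the select statement quoted in the comment above.
FIELDS = (
    "itemid,type,snmp_community,snmp_oid,hostid,key_,delay,status,value_type,"
    "trapper_hosts,snmpv3_securityname,snmpv3_securitylevel,snmpv3_authpassphrase,"
    "snmpv3_privpassphrase,formula,logtimefmt,delay_flex,params,ipmi_sensor,"
    "data_type,authtype,username,password,publickey,privatekey,flags,filter,"
    "interfaceid,port"
).split(",")

def null_columns(row):
    """Names of the columns whose value is JSON null (None in Python)."""
    return [name for name, value in zip(FIELDS, row) if value is None]

def changed_columns(stored, incoming):
    """Names of the columns that differ; None only ever equals None."""
    return [name for name, old, new in zip(FIELDS, stored, incoming) if old != new]

# The row for item 7048852 exactly as quoted in the comment above.
row = json.loads(
    '[7048852,2,"","",45319,"trap.maintenance.callerid",0,0,4,"","",0,"","","1",'
    '"","","","",0,0,"","","","",0,"",null,""]'
)
print(null_columns(row))
```

Any cell-by-cell comparison between stored and received rows has to treat a null as an ordinary, distinct value rather than assume every cell is a string; `changed_columns` shows one way to do that in a language where the comparison is already null-safe.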
Comment by Alexey Pustovalov [ 2013 Jun 28 ] |
tests: new: old |
Comment by Andris Mednis [ 2013 Jun 28 ] |
Thanks for a good test, Alexey! I will prepare a "proxy.c" file with more time logging to find out which part of the fix is the slowest and needs improvement. |
Comment by Andris Mednis [ 2013 Jun 28 ] |
Attached is a modified "src/libs/zbxdbhigh/proxy.c" with more time logging added (it does not fix the crash). |
Comment by Alexey Pustovalov [ 2013 Jul 02 ] |
23614:20130702:195926.736 proxy #1 started [configuration syncer #1]
23614:20130702:195926.762 In process_configuration_sync()
23614:20130702:195940.736 Received configuration data from server. Datalen 66192063
23614:20130702:195943.627 slow query: 2.266737 sec, "select itemid,type,snmp_community,snmp_oid,hostid,key_,delay,status,value_type,trapper_hosts,snmpv3_securityname,snmpv3_securitylevel,snmpv3_authpassphrase,snmpv3_privpassphrase,formula,logtimefmt,delay_flex,params,ipmi_sensor,data_type,authtype,username,password,publickey,privatekey,flags,filter,interfaceid,port from items"
23614:20130702:195949.220 slow query: 2.828733 sec, "select i.itemid,i.hostid,h.proxy_hostid,i.type,i.data_type,i.value_type,i.key_,i.snmp_community,i.snmp_oid,i.port,i.snmpv3_securityname,i.snmpv3_securitylevel,i.snmpv3_authpassphrase,i.snmpv3_privpassphrase,i.ipmi_sensor,i.delay,i.delay_flex,i.trapper_hosts,i.logtimefmt,i.params,i.status,i.authtype,i.username,i.password,i.publickey,i.privatekey,i.flags,i.interfaceid,i.lastclock from items i,hosts h where i.hostid=h.hostid and h.status in (0) and i.status in (0,3)"
23614:20130702:195951.633 End of process_configuration_sync()
23614:20130702:200022.043 forced reloading of the configuration cache
23614:20130702:200022.043 In process_configuration_sync()
23614:20130702:200036.543 Received configuration data from server. Datalen 66192063
23614:20130702:200039.537 slow query: 2.314157 sec, "select itemid,type,snmp_community,snmp_oid,hostid,key_,delay,status,value_type,trapper_hosts,snmpv3_securityname,snmpv3_securitylevel,snmpv3_authpassphrase,snmpv3_privpassphrase,formula,logtimefmt,delay_flex,params,ipmi_sensor,data_type,authtype,username,password,publickey,privatekey,flags,filter,interfaceid,port from items"
23614:20130702:200045.189 slow query: 2.868602 sec, "select i.itemid,i.hostid,h.proxy_hostid,i.type,i.data_type,i.value_type,i.key_,i.snmp_community,i.snmp_oid,i.port,i.snmpv3_securityname,i.snmpv3_securitylevel,i.snmpv3_authpassphrase,i.snmpv3_privpassphrase,i.ipmi_sensor,i.delay,i.delay_flex,i.trapper_hosts,i.logtimefmt,i.params,i.status,i.authtype,i.username,i.password,i.publickey,i.privatekey,i.flags,i.interfaceid,i.lastclock from items i,hosts h where i.hostid=h.hostid and h.status in (0) and i.status in (0,3)"
23614:20130702:200047.578 End of process_configuration_sync() |
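The duration of each sync pass can be read off the "In" and "End" timestamps in the log above. The following is a small sketch for doing that mechanically; it assumes log lines shaped like `PID:YYYYMMDD:HHMMSS.mmm message`, as in the excerpt, and the function names are hypothetical.

```python
from datetime import datetime

def parse_ts(line):
    """Timestamp of a log line shaped like 'PID:YYYYMMDD:HHMMSS.mmm message'."""
    _pid, date, rest = line.split(":", 2)
    clock = rest.split(" ", 1)[0]
    return datetime.strptime(date + ":" + clock, "%Y%m%d:%H%M%S.%f")

def sync_durations(lines):
    """Seconds between each 'In' and matching 'End' of process_configuration_sync()."""
    durations, start = [], None
    for line in lines:
        if "In process_configuration_sync()" in line:
            start = parse_ts(line)
        elif "End of process_configuration_sync()" in line and start is not None:
            durations.append((parse_ts(line) - start).total_seconds())
            start = None
    return durations

# The four relevant lines, copied from the log excerpt above.
LOG = [
    "23614:20130702:195926.762 In process_configuration_sync()",
    "23614:20130702:195951.633 End of process_configuration_sync()",
    "23614:20130702:200022.043 In process_configuration_sync()",
    "23614:20130702:200047.578 End of process_configuration_sync()",
]
print(sync_durations(LOG))
```

For the excerpt above this gives roughly 24.9 and 25.5 seconds per pass, of which about 14 seconds in each pass elapse before the "Received configuration data from server" line.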
Comment by Andris Mednis [ 2013 Jul 03 ] |
The performance issue, crash, and NULL value handling are fixed in r36699. |
Comment by Alexander Vladishev [ 2013 Jul 04 ] |
Successfully tested! Please review my changes in r36729. |
Comment by richlv [ 2013 Jul 05 ] |
(2) added to whatsnew at https://www.zabbix.com/documentation/2.0/manual/introduction/whatsnew207#improved_proxy_performance , please review
andris Reviewed. Minor changes proposed.
<richlv> CLOSED |
Comment by Andris Mednis [ 2013 Jul 05 ] |
Fixed in versions pre-2.0.7 rev. 36754 and pre-2.1.0 rev. 36769. |