-
Problem report
-
Resolution: Fixed
-
Major
-
5.0.12rc1
-
Sprint 77 (Jun 2021), Sprint 78 (Jul 2021), Sprint 81 (Oct 2021)
-
1
I have debug (only) level log of proxy. Each attempt to start the proxy finishes after a minute. Here are some details:
Both failed processes exited when tried to connect to IPC socket:
# egrep " 20107" zabbix-proxy-8.log 20107:20210517:141038.612 proxy #220 started [poller #203] 20107:20210517:141049.946 zbx_setproctitle() title:'poller #203 [connecting to the database]' 20107:20210517:141049.946 In DBconnect() flag:0 20107:20210517:141049.984 End of DBconnect():0 20107:20210517:141049.984 zbx_setproctitle() title:'poller #203 [got 0 values in 0.000000 sec, getting values]' 20107:20210517:141049.984 In get_values() 20107:20210517:141049.984 In DCconfig_get_poller_items() poller_type:0 20107:20210517:141049.984 End of DCconfig_get_poller_items():1 20107:20210517:141049.985 In substitute_key_macros_impl() data:'net.tcp.service[http]' 20107:20210517:141049.985 End of substitute_key_macros_impl():SUCCEED data:'net.tcp.service[http]' 20107:20210517:141049.985 In substitute_simple_macros_impl() data:EMPTY 20107:20210517:141049.985 In substitute_simple_macros_impl() data:EMPTY 20107:20210517:141049.985 In get_value() key:'net.tcp.service[http]' 20107:20210517:141049.985 In get_value_simple() key_orig:'net.tcp.service[http]' addr:'10.50.87.144' 20107:20210517:141049.988 End of get_value_simple():SUCCEED 20107:20210517:141049.988 End of get_value():SUCCEED 20107:20210517:141049.988 In zbx_activate_item_host() hostid:14807 itemid:3945157 type:3 20107:20210517:141049.988 End of zbx_activate_item_host() 20107:20210517:141049.988 In zbx_preprocess_item_value() 20107:20210517:141049.988 End of zbx_preprocess_item_value() 20107:20210517:141049.989 In zbx_ipc_socket_open() 20107:20210517:141150.047 End of zbx_ipc_socket_open():FAIL 20107:20210517:141150.047 cannot connect to preprocessing service: Cannot connect to service "preprocessing": [2] No such file or directory.
root@it0:/tmp# egrep " 20033" zabbix-proxy-8.log 20033:20210517:141037.692 proxy #148 started [poller #131] 20033:20210517:141049.288 zbx_setproctitle() title:'poller #131 [connecting to the database]' 20033:20210517:141049.305 In DBconnect() flag:0 20033:20210517:141049.318 End of DBconnect():0 20033:20210517:141049.321 zbx_setproctitle() title:'poller #131 [got 0 values in 0.000000 sec, getting values]' 20033:20210517:141049.326 In get_values() 20033:20210517:141049.326 In DCconfig_get_poller_items() poller_type:0 20033:20210517:141049.327 End of DCconfig_get_poller_items():1 20033:20210517:141049.328 In substitute_key_macros_impl() data:'net.if.out[ens3]' 20033:20210517:141049.329 End of substitute_key_macros_impl():SUCCEED data:'net.if.out[ens3]' 20033:20210517:141049.329 In substitute_simple_macros_impl() data:'10050' 20033:20210517:141049.331 In get_value() key:'net.if.out[ens3]' 20033:20210517:141049.331 In get_value_agent() host:'zabbix-proxy-8' addr:'192.168.101.27' key:'net.if.out[ens3]' conn:'unencrypted' 20033:20210517:141049.335 Sending [net.if.out[ens3]] 20033:20210517:141049.336 get value from agent result: '10335816499' 20033:20210517:141049.337 End of get_value_agent():SUCCEED 20033:20210517:141049.337 End of get_value():SUCCEED 20033:20210517:141049.337 In zbx_activate_item_host() hostid:13851 itemid:2577877 type:0 20033:20210517:141049.338 End of zbx_activate_item_host() 20033:20210517:141049.338 In zbx_preprocess_item_value() 20033:20210517:141049.338 End of zbx_preprocess_item_value() 20033:20210517:141049.338 In zbx_ipc_socket_open() 20033:20210517:141150.041 End of zbx_ipc_socket_open():FAIL 20033:20210517:141150.041 cannot connect to preprocessing service: Cannot connect to service "preprocessing": [2] No such file or directory.
child processes still were starting (1240 already started ! ) and preprocessing manager did not start yet.
1000 pollers started and 221 unreachable pollers:
# egrep " started" zabbix-proxy-8.log | tail -n3
22029:20210517:141149.919 proxy #1236 started [unreachable poller #219]
22031:20210517:141149.944 proxy #1237 started [unreachable poller #220]
22034:20210517:141150.180 proxy #1238 started [unreachable poller #221]
# egrep " started" zabbix-proxy-8.log
19877:20210517:141035.430 proxy #0 started [main process]
19884:20210517:141035.431 proxy #1 started [configuration syncer #1]
19887:20210517:141037.034 proxy #2 started [trapper #1]
19888:20210517:141037.034 proxy #3 started [heartbeat sender #1]
19889:20210517:141037.035 proxy #4 started [data sender #1]
19890:20210517:141037.036 proxy #5 started [housekeeper #1]
19891:20210517:141037.038 proxy #6 started [discoverer #1]
19892:20210517:141037.039 proxy #7 started [discoverer #2]
19893:20210517:141037.039 proxy #8 started [discoverer #3]
19894:20210517:141037.040 proxy #9 started [discoverer #4]
19895:20210517:141037.041 proxy #10 started [discoverer #5]
19896:20210517:141037.042 proxy #11 started [history syncer #1]
19897:20210517:141037.042 proxy #12 started [history syncer #2]
19898:20210517:141037.045 proxy #13 started [history syncer #3]
19899:20210517:141037.046 proxy #14 started [history syncer #4]
19900:20210517:141037.047 proxy #15 started [snmp trapper #1]
19901:20210517:141037.048 proxy #16 started [self-monitoring #1]
19902:20210517:141037.049 proxy #17 started [task manager #1]
19903:20210517:141037.050 proxy #18 started [poller #1]
...
21519:20210517:141130.529 proxy #1014 started [poller #997]
21523:20210517:141130.732 proxy #1017 started [poller #1000]
21529:20210517:141130.744 proxy #1018 started [unreachable poller #1]
21531:20210517:141130.864 proxy #1019 started [unreachable poller #2]
...
... 1+ minute passed.
22029:20210517:141149.919 proxy #1236 started [unreachable poller #219]
22031:20210517:141149.944 proxy #1237 started [unreachable poller #220]
22034:20210517:141150.180 proxy #1238 started [unreachable poller #221]
Later number of pollers was decreased and the proxy was able to operate successfully without exiting.
I think that preprocessing processes should start before pollers, to avid such issue.