-
Incident report
-
Resolution: Duplicate
-
Critical
-
None
-
2.2.5
-
Centos 6.5/Ubuntu 14.04
MySQL 5.6
Zabbix poller processes somehow get "stuck", which results in failed polls and "first network error, wait for 15 seconds" messages in the log.
Most of my items are SNMPv2 and, as I already commented in ZBX-7426, the percentage of busy pollers on the proxies keeps rising without any change to the configuration, as you can see in the attached graphs. The big drops in busy pollers are proxy restarts. The third graph shows how the min/max/avg busy pollers change after a restart.
I have noticed that, as time passes, more and more pollers report:
zabbix_proxy: poller #123 [got x values in 30.034998 sec, idle 1 sec]
In my configuration the timeout is set to 30 sec, so polls that take longer than that should end in an error. It is unclear why they report that they actually got some values. In any case, the number of such processes builds up, and they are not the same processes each time but different ones:
admin@zbx-prxy1 ~ >ps aux|grep poll |grep "in 30"
zabbix 31378 0.1 0.6 599500 52160 ? S 13:57 0:02 zabbix_proxy: poller #2 [got 1 values in 30.042152 sec, getting values]
zabbix 31394 0.0 0.5 597364 47632 ? S 13:57 0:01 zabbix_proxy: poller #18 [got 3 values in 30.057145 sec, getting values]
zabbix 31407 0.2 0.6 599500 53756 ? S 13:57 0:03 zabbix_proxy: poller #28 [got 3 values in 30.004999 sec, getting values]
zabbix 31410 0.1 0.6 599988 52160 ? S 13:57 0:02 zabbix_proxy: poller #31 [got 2 values in 30.052261 sec, getting values]
zabbix 31430 0.2 0.6 599972 54824 ? S 13:57 0:04 zabbix_proxy: poller #51 [got 2 values in 30.052217 sec, getting values]
zabbix 31432 0.1 0.6 599168 51448 ? S 13:57 0:02 zabbix_proxy: poller #53 [got 3 values in 30.056933 sec, getting values]
zabbix 31447 0.1 0.6 597312 48920 ? S 13:57 0:02 zabbix_proxy: poller #63 [got 8 values in 30.354670 sec, getting values]
zabbix 31449 0.1 0.6 597312 49200 ? S 13:57 0:02 zabbix_proxy: poller #65 [got 1 values in 30.041501 sec, getting values]
zabbix 31497 0.0 0.6 599428 49556 ? S 13:57 0:01 zabbix_proxy: poller #108 [got 1 values in 30.039372 sec, getting values]
zabbix 31511 0.1 0.6 599840 52844 ? R 13:57 0:01 zabbix_proxy: poller #122 [got 1 values in 30.039643 sec, getting values]
zabbix 31515 0.1 0.6 599300 52316 ? S 13:57 0:02 zabbix_proxy: poller #126 [got 2 values in 30.050534 sec, getting values]
admin@zbx-prxy1 ~ >ps aux|grep poll |grep "in 30"
zabbix 31407 0.2 0.6 599500 53756 ? S 13:57 0:03 zabbix_proxy: poller #28 [got 3 values in 30.004999 sec, getting values]
zabbix 31410 0.1 0.6 599988 52160 ? S 13:57 0:02 zabbix_proxy: poller #31 [got 2 values in 30.052261 sec, getting values]
zabbix 31497 0.0 0.6 599428 49556 ? S 13:57 0:01 zabbix_proxy: poller #108 [got 1 values in 30.039372 sec, getting values]
admin@zbx-prxy1 ~ >ps aux|grep poll |grep "in 30"
zabbix 31407 0.2 0.6 599500 53756 ? S 13:57 0:03 zabbix_proxy: poller #28 [got 3 values in 30.004999 sec, getting values]
zabbix 31410 0.1 0.6 599988 52160 ? S 13:57 0:02 zabbix_proxy: poller #31 [got 2 values in 30.052261 sec, getting values]
zabbix 31497 0.0 0.6 599428 49556 ? S 13:57 0:01 zabbix_proxy: poller #108 [got 1 values in 30.039372 sec, getting values]
admin@zbx-prxy1 ~ >ps aux|grep poll |grep "in 30"
zabbix 31410 0.1 0.6 599988 52160 ? S 13:57 0:02 zabbix_proxy: poller #31 [got 2 values in 30.052261 sec, getting values]
zabbix 31497 0.0 0.6 599428 49556 ? S 13:57 0:01 zabbix_proxy: poller #108 [got 1 values in 30.039372 sec, getting values]
admin@zbx-prxy1 ~ >ps aux|grep poll |grep "in 30"
zabbix 31497 0.0 0.6 599428 49556 ? S 13:57 0:01 zabbix_proxy: poller #108 [got 1 values in 30.039372 sec, getting values]
zabbix 31536 0.0 0.1 596804 10964 ? S 13:57 0:00 zabbix_proxy: unreachable poller #15 [got 1 values in 30.033028 sec, idle 4 sec]
admin@zbx-prxy1 ~ >ps aux|grep poll |grep "in 30"
zabbix 31497 0.0 0.6 599428 49556 ? S 13:57 0:01 zabbix_proxy: poller #108 [got 1 values in 30.039372 sec, getting values]
admin@zbx-prxy1 ~ >ps aux|grep poll |grep "in 30"
zabbix 31497 0.0 0.6 599428 49556 ? S 13:57 0:01 zabbix_proxy: poller #108 [got 1 values in 30.039372 sec, getting values]
admin@zbx-prxy1 ~ >ps aux|grep poll |grep "in 30"
admin@zbx-prxy1 ~ >ps aux|grep poll |grep "in 30"
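The repeated manual checks above could be scripted so the build-up is easier to track over time. The following is only a minimal sketch: it assumes the process titles have exactly the format shown in the ps output above, and the names `TITLE_RE`, `slow_pollers` and the `timeout` parameter are hypothetical helpers, not part of Zabbix.

```python
import re

# Assumed title format, copied from the ps output in this report, e.g.:
#   zabbix_proxy: poller #28 [got 3 values in 30.004999 sec, getting values]
TITLE_RE = re.compile(
    r"zabbix_proxy: (?P<kind>(?:unreachable )?poller) #(?P<num>\d+) "
    r"\[got (?P<values>\d+) values in (?P<secs>\d+\.\d+) sec, (?P<state>[^\]]+)\]"
)


def slow_pollers(ps_lines, timeout=30.0):
    """Return (pid, kind, number, seconds, state) for every poller whose
    last reported poll cycle took at least `timeout` seconds."""
    found = []
    for line in ps_lines:
        m = TITLE_RE.search(line)
        if not m:
            continue  # not a poller line, or a title format we do not know
        secs = float(m.group("secs"))
        if secs >= timeout:
            pid = int(line.split()[1])  # second column of `ps aux` output
            found.append(
                (pid, m.group("kind"), int(m.group("num")), secs, m.group("state"))
            )
    return found


# Two lines taken from the output above, plus one made-up fast poller
# (hypothetical line, added only to show that it is filtered out).
sample = [
    "zabbix 31497 0.0 0.6 599428 49556 ? S 13:57 0:01 zabbix_proxy: poller #108 [got 1 values in 30.039372 sec, getting values]",
    "zabbix 31536 0.0 0.1 596804 10964 ? S 13:57 0:00 zabbix_proxy: unreachable poller #15 [got 1 values in 30.033028 sec, idle 4 sec]",
    "zabbix 31600 0.0 0.1 596804 10964 ? S 13:57 0:00 zabbix_proxy: poller #5 [got 4 values in 0.012345 sec, idle 1 sec]",
]

for row in slow_pollers(sample):
    print(row)
```

Feeding it the lines of `ps aux` at regular intervals and logging the number of returned rows would give a rough count of pollers that hit the 30 sec timeout, without having to compare the listings by eye.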
- duplicates
-
ZBX-8528 random lost UDP packets lead to not bulk snmp requests and as results increased CPU usage etc
- Closed