Incident report
Resolution: Unresolved
Trivial
None
4.0.0
None
Debian 9 - MySQL Percona 5.7.23-23-57 - Elasticsearch 6.4.1
This problem was not present in the alpha versions of Zabbix 4.x.
Steps to reproduce:
- Start server
- Wait a few days
Result:
- Every trigger starts to fail (notifications are sent)
- No new data is sent to the history storage backend (ES)
- The web UI shows all triggers in error and reports that the Zabbix server is not running
- The server is still writing to its log and running housekeeping
Active checks don't work either, and the agent shows these errors:
96109:20181014:202413.324 active check data upload to [xxxxxxxxxxx:10051] started to fail ([connect] cannot connect to [[xxxxxxxxxxx]:10051]: [4] Interrupted system call)
96109:20181014:202433.360 active check data upload to [xxxxxxxxxxx:10051] is working again
96109:20181014:202443.400 active check data upload to [xxxxxxxxxxx:10051] started to fail ([connect] cannot connect to [[xxxxxxxxxxx]:10051]: [4] Interrupted system call)
96109:20181014:202503.434 active check data upload to [xxxxxxxxxxx:10051] is working again
96109:20181014:202513.469 active check data upload to [xxxxxxxxxxx:10051] started to fail ([connect] cannot connect to [[xxxxxxxxxxx]:10051]: [4] Interrupted system call)
96109:20181014:202603.547 active check data upload to [xxxxxxxxxxx:10051] is working again
96109:20181014:202613.586 active check data upload to [xxxxxxxxxxx:10051] started to fail ([connect] cannot connect to [[xxxxxxxxxxx]:10051]: [4] Interrupted system call)
96109:20181014:202633.633 active check data upload to [xxxxxxxxxxx:10051] is working again
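The alternating fail/recover pattern in the agent log can be quantified. The following is a minimal Python sketch, not part of the original report, that pairs each "started to fail" line with the next "is working again" line and computes how long each outage lasted. The function name `parse_outages` and the timestamp regex are assumptions derived only from the log lines shown above.

```python
import re
from datetime import datetime

# Assumed layout, taken from the agent log lines above:
#   PID:YYYYMMDD:HHMMSS.mmm message
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")


def parse_outages(lines):
    """Return a list of (start, end, seconds) for each fail -> recover pair."""
    outages = []
    started = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not an agent log line in the assumed layout
        ts = datetime.strptime(m.group(2) + m.group(3), "%Y%m%d%H%M%S.%f")
        msg = m.group(4)
        if "started to fail" in msg:
            if started is None:
                started = ts  # remember only the first failure of a run
        elif "is working again" in msg and started is not None:
            outages.append((started, ts, (ts - started).total_seconds()))
            started = None
    return outages
```

Fed the eight agent log lines above, this yields four outages of about 20.0, 20.0, 50.1 and 20.0 seconds, each followed by roughly 10 seconds of working uploads.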
Server process list while in this frozen state:
CGroup: /system.slice/zabbix-server.service
├─51020 /usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
├─51023 /usr/sbin/zabbix_server: configuration syncer [synced configuration in 3.684796 sec, idle 60 sec]
├─51024 /usr/sbin/zabbix_server: ipmi manager #1 [scheduled 0, polled 0 values, idle 5.005401 sec during 5.005402 sec]
├─51025 /usr/sbin/zabbix_server: alerter #1 [sent 0, failed 0 alerts, idle 6.063715 sec during 6.576066 sec]
├─51026 /usr/sbin/zabbix_server: alerter #2 [sent 0, failed 0 alerts, idle 17.063970 sec during 17.579276 sec]
├─51027 /usr/sbin/zabbix_server: alerter #3 [sent 0, failed 0 alerts, idle 6.057038 sec during 6.571383 sec]
├─51028 /usr/sbin/zabbix_server: housekeeper [deleted 3453 hist/trends, 0 items/triggers, 33352 events, 0 sessions, 0 alarms, 0 audit items in 18.509272 sec, idle for 1 hour(s)]
├─51029 /usr/sbin/zabbix_server: timer #1 [updated 0 hosts, suppressed 0 events in 0.000517 sec, idle 59 sec]
├─51030 /usr/sbin/zabbix_server: timer #2 [suppressed 4 events in 0.009227 sec, idle 59 sec]
├─51031 /usr/sbin/zabbix_server: timer #3 [suppressed 4 events in 0.012005 sec, idle 59 sec]
├─51032 /usr/sbin/zabbix_server: http poller #1 [got 0 values in 0.000828 sec, getting values]
├─51033 /usr/sbin/zabbix_server: http poller #2 [got 1 values in 0.004189 sec, getting values]
├─51034 /usr/sbin/zabbix_server: http poller #3 [got 1 values in 0.031016 sec, getting values]
├─51035 /usr/sbin/zabbix_server: http poller #4 [got 1 values in 0.064232 sec, getting values]
├─51036 /usr/sbin/zabbix_server: http poller #5 [got 0 values in 0.000755 sec, getting values]
├─51037 /usr/sbin/zabbix_server: discoverer #1 [processed 0 rules in 0.000603 sec, idle 60 sec]
├─51038 /usr/sbin/zabbix_server: discoverer #2 [processed 0 rules in 0.000613 sec, idle 60 sec]
├─51039 /usr/sbin/zabbix_server: discoverer #3 [processed 0 rules in 0.000616 sec, idle 60 sec]
├─51040 /usr/sbin/zabbix_server: discoverer #4 [processed 0 rules in 0.000411 sec, idle 60 sec]
├─51041 /usr/sbin/zabbix_server: discoverer #5 [processed 0 rules in 0.000623 sec, idle 60 sec]
├─51042 /usr/sbin/zabbix_server: discoverer #6 [processed 0 rules in 0.000389 sec, idle 60 sec]
├─51043 /usr/sbin/zabbix_server: discoverer #7 [processed 0 rules in 0.000000 sec, performing discovery]
├─51044 /usr/sbin/zabbix_server: discoverer #8 [processed 0 rules in 0.000000 sec, performing discovery]
├─51045 /usr/sbin/zabbix_server: history syncer #1 [processed 0 values, 0 triggers in 0.000932 sec, idle 1 sec]
├─51046 /usr/sbin/zabbix_server: history syncer #2 [processed 0 values, 0 triggers in 0.000887 sec, idle 1 sec]
├─51047 /usr/sbin/zabbix_server: history syncer #3 [processed 2052 values, 15 triggers in 0.148149 sec, syncing history]
├─51049 /usr/sbin/zabbix_server: history syncer #4 [processed 0 values, 0 triggers in 0.000847 sec, idle 1 sec]
├─51051 /usr/sbin/zabbix_server: history syncer #5 [processed 0 values, 0 triggers in 0.000810 sec, idle 1 sec]
├─51053 /usr/sbin/zabbix_server: history syncer #6 [processed 0 values, 0 triggers in 0.000832 sec, idle 1 sec]
├─51055 /usr/sbin/zabbix_server: history syncer #7 [processed 0 values, 0 triggers in 0.000855 sec, idle 1 sec]
├─51057 /usr/sbin/zabbix_server: history syncer #8 [processed 0 values, 0 triggers in 0.000008 sec, idle 1 sec]
├─51059 /usr/sbin/zabbix_server: history syncer #9 [processed 0 values, 0 triggers in 0.000938 sec, idle 1 sec]
├─51061 /usr/sbin/zabbix_server: history syncer #10 [processed 0 values, 0 triggers in 0.000698 sec, idle 1 sec]
├─51063 /usr/sbin/zabbix_server: history syncer #11 [processed 0 values, 0 triggers in 0.000839 sec, idle 1 sec]
├─51065 /usr/sbin/zabbix_server: history syncer #12 [processed 0 values, 14 triggers in 0.001297 sec, idle 1 sec]
├─51067 /usr/sbin/zabbix_server: escalator #1 [processed 0 escalations in 0.001201 sec, idle 3 sec]
├─51068 /usr/sbin/zabbix_server: ipmi poller #1 [polled 0 values, idle 3599.986798 sec during 3599.986965 sec]
├─51069 /usr/sbin/zabbix_server: ipmi poller #2 [polled 0 values, idle 3599.985592 sec during 3599.985746 sec]
├─51071 /usr/sbin/zabbix_server: ipmi poller #3 [polled 0 values, idle 3599.985576 sec during 3599.985903 sec]
├─51072 /usr/sbin/zabbix_server: ipmi poller #4 [polled 0 values, idle 3599.986043 sec during 3599.986735 sec]
├─51073 /usr/sbin/zabbix_server: ipmi poller #5 [polled 0 values, idle 3599.986137 sec during 3599.986399 sec]
├─51075 /usr/sbin/zabbix_server: java poller #1 [got 0 values in 0.000002 sec, getting values]
├─51076 /usr/sbin/zabbix_server: java poller #2 [got 60 values in 0.048707 sec, getting values]
├─51077 /usr/sbin/zabbix_server: java poller #3 [got 0 values in 0.000002 sec, getting values]
├─51078 /usr/sbin/zabbix_server: java poller #4 [got 60 values in 0.052165 sec, getting values]
├─51079 /usr/sbin/zabbix_server: java poller #5 [got 87 values in 0.065649 sec, getting values]
├─51080 /usr/sbin/zabbix_server: snmp trapper [processed data in 0.000011 sec, idle 1 sec]
├─51082 /usr/sbin/zabbix_server: proxy poller #1 [exchanged data with 0 proxies in 0.000002 sec, idle 5 sec]
├─51084 /usr/sbin/zabbix_server: self-monitoring [processed data in 0.000013 sec, idle 1 sec]
├─51085 /usr/sbin/zabbix_server: task manager [processed 0 task(s) in 0.000318 sec, idle 5 sec]
├─51089 /usr/sbin/zabbix_server: poller #1 [got 0 values in 0.000004 sec, getting values]
├─51090 /usr/sbin/zabbix_server: poller #2 [got 0 values in 0.000004 sec, getting values]
├─51091 /usr/sbin/zabbix_server: poller #3 [got 0 values in 0.000003 sec, getting values]
├─51093 /usr/sbin/zabbix_server: poller #4 [got 0 values in 0.000003 sec, getting values]
├─51095 /usr/sbin/zabbix_server: poller #5 [got 97 values in 2.306670 sec, getting values]
├─51096 /usr/sbin/zabbix_server: poller #6 [got 11 values in 0.032682 sec, getting values]
├─51098 /usr/sbin/zabbix_server: poller #7 [got 599 values in 3.736298 sec, getting values]
├─51101 /usr/sbin/zabbix_server: poller #8 [got 69 values in 0.171742 sec, getting values]
├─51102 /usr/sbin/zabbix_server: poller #9 [got 50 values in 3.188347 sec, getting values]
├─51103 /usr/sbin/zabbix_server: poller #10 [got 0 values in 0.000004 sec, getting values]
├─51104 /usr/sbin/zabbix_server: poller #11 [got 0 values in 0.000003 sec, getting values]
├─51105 /usr/sbin/zabbix_server: poller #12 [got 221 values in 0.759326 sec, getting values]
├─51106 /usr/sbin/zabbix_server: poller #13 [got 19 values in 2.120538 sec, getting values]
├─51107 /usr/sbin/zabbix_server: poller #14 [got 59 values in 1.135586 sec, getting values]
├─51108 /usr/sbin/zabbix_server: poller #15 [got 465 values in 5.965436 sec, getting values]
├─51109 /usr/sbin/zabbix_server: poller #16 [got 0 values in 0.000004 sec, getting values]
├─51110 /usr/sbin/zabbix_server: poller #17 [got 1 values in 0.073697 sec, getting values]
├─51111 /usr/sbin/zabbix_server: poller #18 [got 11 values in 2.041732 sec, getting values]
├─51112 /usr/sbin/zabbix_server: poller #19 [got 59 values in 0.516575 sec, getting values]
├─51113 /usr/sbin/zabbix_server: poller #20 [got 0 values in 0.000003 sec, getting values]
├─51114 /usr/sbin/zabbix_server: poller #21 [got 75 values in 0.594648 sec, getting values]
├─51115 /usr/sbin/zabbix_server: poller #22 [got 36 values in 3.871845 sec, getting values]
├─51116 /usr/sbin/zabbix_server: poller #23 [got 0 values in 0.000003 sec, getting values]
├─51117 /usr/sbin/zabbix_server: poller #24 [got 15 values in 3.698477 sec, getting values]
├─51118 /usr/sbin/zabbix_server: poller #25 [got 280 values in 2.844432 sec, getting values]
├─51119 /usr/sbin/zabbix_server: poller #26 [got 80 values in 0.597410 sec, getting values]
├─51120 /usr/sbin/zabbix_server: poller #27 [got 58 values in 0.077234 sec, getting values]
├─51121 /usr/sbin/zabbix_server: poller #28 [got 88 values in 0.338467 sec, getting values]
├─51122 /usr/sbin/zabbix_server: poller #29 [got 92 values in 1.559905 sec, getting values]
├─51123 /usr/sbin/zabbix_server: poller #30 [got 0 values in 0.000003 sec, getting values]
├─51124 /usr/sbin/zabbix_server: poller #31 [got 0 values in 0.000003 sec, getting values]
├─51125 /usr/sbin/zabbix_server: poller #32 [got 143 values in 0.733590 sec, getting values]
├─51126 /usr/sbin/zabbix_server: poller #33 [got 0 values in 0.000004 sec, getting values]
├─51127 /usr/sbin/zabbix_server: poller #34 [got 29 values in 0.100466 sec, getting values]
├─51128 /usr/sbin/zabbix_server: poller #35 [got 0 values in 0.000003 sec, getting values]
├─51129 /usr/sbin/zabbix_server: poller #36 [got 0 values in 0.000003 sec, getting values]
├─51130 /usr/sbin/zabbix_server: poller #37 [got 0 values in 0.000003 sec, getting values]
├─51131 /usr/sbin/zabbix_server: poller #38 [got 52 values in 3.235480 sec, getting values]
├─51132 /usr/sbin/zabbix_server: poller #39 [got 78 values in 0.105895 sec, getting values]
├─51133 /usr/sbin/zabbix_server: poller #40 [got 18 values in 3.617870 sec, getting values]
├─51134 /usr/sbin/zabbix_server: unreachable poller #1 [got 0 values in 0.000003 sec, idle 3 sec]
├─51135 /usr/sbin/zabbix_server: unreachable poller #2 [got 0 values in 0.000016 sec, getting values]
├─51136 /usr/sbin/zabbix_server: unreachable poller #3 [got 0 values in 0.000002 sec, idle 3 sec]
├─51137 /usr/sbin/zabbix_server: unreachable poller #4 [got 0 values in 0.000004 sec, idle 3 sec]
├─51138 /usr/sbin/zabbix_server: unreachable poller #5 [got 0 values in 0.000004 sec, idle 3 sec]
├─51139 /usr/sbin/zabbix_server: unreachable poller #6 [got 0 values in 0.000003 sec, idle 3 sec]
├─51140 /usr/sbin/zabbix_server: unreachable poller #7 [got 0 values in 0.000040 sec, getting values]
├─51141 /usr/sbin/zabbix_server: unreachable poller #8 [got 0 values in 0.000098 sec, getting values]
├─51142 /usr/sbin/zabbix_server: unreachable poller #9 [got 0 values in 0.000002 sec, idle 3 sec]
├─51143 /usr/sbin/zabbix_server: unreachable poller #10 [got 1 values in 0.002377 sec, idle 3 sec]
├─51144 /usr/sbin/zabbix_server: trapper #1 [processing data]
├─51145 /usr/sbin/zabbix_server: trapper #2 [processing data]
├─51146 /usr/sbin/zabbix_server: trapper #3 [processing data]
├─51147 /usr/sbin/zabbix_server: trapper #4 [processing data]
├─51148 /usr/sbin/zabbix_server: trapper #5 [processing data]
├─51149 /usr/sbin/zabbix_server: icmp pinger #1 [pinging hosts]
├─51150 /usr/sbin/zabbix_server: icmp pinger #2 [pinging hosts]
├─51152 /usr/sbin/zabbix_server: icmp pinger #3 [pinging hosts]
├─51153 /usr/sbin/zabbix_server: icmp pinger #4 [pinging hosts]
├─51155 /usr/sbin/zabbix_server: icmp pinger #5 [pinging hosts]
├─51156 /usr/sbin/zabbix_server: icmp pinger #6 [pinging hosts]
├─51157 /usr/sbin/zabbix_server: icmp pinger #7 [pinging hosts]
├─51159 /usr/sbin/zabbix_server: icmp pinger #8 [pinging hosts]
├─51161 /usr/sbin/zabbix_server: icmp pinger #9 [pinging hosts]
├─51163 /usr/sbin/zabbix_server: icmp pinger #10 [pinging hosts]
├─51165 /usr/sbin/zabbix_server: alert manager #1 [sent 0, failed 0 alerts, idle 5.008287 sec during 5.008289 sec]
├─51166 /usr/sbin/zabbix_server: preprocessing manager #1 [queued 0, processed 4676 values, idle 4.930970 sec during 5.002322 sec]
├─51167 /usr/sbin/zabbix_server: preprocessing worker #1 started
├─51169 /usr/sbin/zabbix_server: preprocessing worker #2 started
├─51170 /usr/sbin/zabbix_server: preprocessing worker #3 started
├─51171 /usr/sbin/zabbix_server: preprocessing worker #4 started
├─51172 /usr/sbin/zabbix_server: preprocessing worker #5 started
├─51173 /usr/sbin/zabbix_server: preprocessing worker #6 started
├─51174 /usr/sbin/zabbix_server: preprocessing worker #7 started
├─51175 /usr/sbin/zabbix_server: preprocessing worker #8 started
├─51176 /usr/sbin/zabbix_server: preprocessing worker #9 started
└─51177 /usr/sbin/zabbix_server: preprocessing worker #10 started
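A listing this long is easier to read once grouped by process type. Below is a rough Python sketch, not part of the original report, that counts how many processes of each type report an idle state. Treating any status containing the word "idle" as idle and everything else in brackets as busy is my own heuristic, and the names `PROC_RE` and `summarize` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Matches process titles such as
#   zabbix_server: poller #1 [got 0 values in 0.000004 sec, getting values]
#   zabbix_server: preprocessing worker #1 started
PROC_RE = re.compile(
    r"zabbix_server: (?P<name>[a-z -]+?)(?: #\d+)? "
    r"(?:\[(?P<status>.*)\]|(?P<plain>\w+))\s*$"
)


def summarize(lines):
    """Group process titles by type and count idle/busy/unknown states."""
    summary = defaultdict(Counter)
    for line in lines:
        m = PROC_RE.search(line)
        if not m:
            continue  # e.g. the parent process line, which has no title
        status = m.group("status")
        if status is None:
            state = "unknown"  # titles without a bracketed status
        elif "idle" in status:
            state = "idle"     # heuristic: the status mentions an idle period
        else:
            state = "busy"
        summary[m.group("name")][state] += 1
    return summary
```

Applied to the listing above, this reports all 40 pollers, all 5 trappers and all 10 icmp pingers as busy, with none idle. That alone does not prove they are stuck, since a title such as "getting values" is also shown during normal work, but it points to where a stack trace or strace would be most useful.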
A reboot of the Zabbix server resolves the issue.
This looks like a network error to me: the Zabbix server is no longer able to receive data, hence triggers fire and no history data is stored.
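One way to test the network hypothesis while the server is in the frozen state is to time a plain TCP connect to the trapper port from the agent host. The Python sketch below is only an illustration under that assumption: the function name `probe` is hypothetical, port 10051 is taken from the agent log above, and the host must be filled in by whoever runs it.

```python
import socket
import time


def probe(host, port=10051, timeout=3.0):
    """Try one TCP connect; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)
```

If `probe` connects quickly while the agent still reports failures, the listening socket is accepting connections and the trappers themselves are the more likely bottleneck. If it times out or is refused, the problem is closer to the network or the listen queue.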