Type: Incident report
Resolution: Cannot Reproduce
Priority: Blocker
Affects Version/s: 1.8.9
Component/s: Server (S)
Environment: Debian Linux 6.0.3
Zabbix 1.8.9
MySQL: mysql Ver 14.14 Distrib 5.1.49, for debian-linux-gnu (x86_64) using readline 6.1
Architecture: Linux Zabbix-brx 2.6.32-5-amd64 #1 SMP Fri Sep 9 20:23:16 UTC 2011 x86_64 GNU/Linux
I'm running a multi-node Zabbix distributed monitoring (DM) setup.
At random, approximately once a week, one of the Zabbix server (slave) nodes dies, and the log shows only this very unclear error:
>
1653:20111130:160008.341 NODE 3: Sending history_uint_sync of node 3 to node 1 datalen 3548
1653:20111130:160008.511 NODE 3: Sending history_text of node 3 to node 1 datalen 1294
1653:20111130:160016.581 NODE 3: Sending history_sync of node 3 to node 1 datalen 6240
1653:20111130:160017.221 NODE 3: Sending history_uint_sync of node 3 to node 1 datalen 2376
1653:20111130:160017.415 NODE 3: Sending history_str_sync of node 3 to node 1 datalen 64
1653:20111130:160017.439 NODE 3: Sending history_text of node 3 to node 1 datalen 1196
1617:20111130:160108.180 One child process died (PID:1619,exitcode/signal:9). Exiting ...
1617:20111130:160115.829 syncing history data...
1617:20111130:160117.607 syncing history data done
1617:20111130:160117.607 syncing trends data...
1617:20111130:160123.669 syncing trends data done
1617:20111130:160123.675 Zabbix Server stopped. Zabbix 1.8.9 (revision 23398).
>
I now have a script that jolts Zabbix back to life after such a crash, but would like to see this fixed...
DebugLevel is at 3. That already logs a huge amount of syncing information, and I'm afraid that raising it to 4 would kill the performance of the node.
Since I can't predict when this issue is going to happen, I would have to leave it at 4 permanently.
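Since raising the debug level is not practical here, the existing level-3 log can still be mined for the "One child process died" line shown above. The following is only a minimal sketch, not part of Zabbix: the function name `find_child_deaths` and the small `LINUX_SIGNALS` table are made up for illustration, and the regular expression assumes the exact line format seen in this log. The log field is labelled `exitcode/signal`, so the value may be either an exit code or a signal number; the sketch only reports which Linux signal the number would correspond to if it is a signal.

```python
import re

# Matches the line format seen in the log excerpt of this report, e.g.
# "1617:20111130:160108.180 One child process died (PID:1619,exitcode/signal:9). Exiting ..."
CHILD_DIED = re.compile(
    r"^(?P<logger_pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3}) "
    r"One child process died \(PID:(?P<child_pid>\d+),exitcode/signal:(?P<code>\d+)\)"
)

# A few common Linux (x86_64) signal numbers; deliberately not exhaustive.
LINUX_SIGNALS = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def find_child_deaths(lines):
    """Return one dict per 'One child process died' line found in `lines`."""
    events = []
    for line in lines:
        match = CHILD_DIED.match(line.strip())
        if match is None:
            continue
        code = int(match.group("code"))
        events.append({
            "date": match.group("date"),
            "time": match.group("time"),
            "child_pid": int(match.group("child_pid")),
            "code": code,
            # Only meaningful if the logged value is a signal, not an exit code.
            "possible_signal": LINUX_SIGNALS.get(code),
        })
    return events


if __name__ == "__main__":
    sample = [
        "1653:20111130:160017.439 NODE 3: Sending history_text of node 3 to node 1 datalen 1196",
        "1617:20111130:160108.180 One child process died (PID:1619,exitcode/signal:9). Exiting ...",
    ]
    for event in find_child_deaths(sample):
        print(event)
```

If the value 9 in this report is a signal number, it would be SIGKILL, which a process cannot catch or log itself. In that case the sender would have to be something outside Zabbix (the kernel's out-of-memory killer is one possibility), so the kernel log around the same timestamp may be worth checking.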