[ZBX-2293] Zabbix crash with unknown PID Created: 2010 Apr 09  Updated: 2017 May 30  Resolved: 2010 Nov 22

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.8.4
Fix Version/s: 1.8.4

Type: Incident report Priority: Major
Reporter: Alexey Pustovalov Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

trunk zabbix_server, Gentoo 2.26.31-r10, Oracle 10g2



 Description   
      *** ~ # cat /var/log/zabbix/zabbix_server.log | grep died
        30729:20100408:120833.772 One child process died (PID:28019). Exiting ...
        29564:20100409:063037.892 One child process died (PID:23171). Exiting ...
        28587:20100409:072805.831 One child process died (PID:11018). Exiting ...
        15224:20100409:080239.676 One child process died (PID:23555). Exiting ...
        27604:20100409:082749.289 One child process died (PID:31517). Exiting ...
      *** ~ # cat /var/log/zabbix/zabbix_server.log | grep 31517:
      *** ~ # cat /var/log/zabbix/zabbix_server.log | grep 23555:
      *** ~ # cat /var/log/zabbix/zabbix_server.log | grep 11018:
      *** ~ # cat /var/log/zabbix/zabbix_server.log | grep 23171:
      *** ~ #
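
The grep session above can be automated. The following is a minimal sketch, assuming the log line layout shown in this ticket (logger PID, date, time, message); the names parse_died and unlogged_children are hypothetical helpers and are not part of Zabbix. It extracts each "One child process died" event and lists those whose PID never wrote a log line of its own, which is roughly what the empty grep results show.

```python
import re
from datetime import datetime
from typing import Iterable, List, NamedTuple, Optional

# Matches both variants seen in this ticket:
#   30729:20100408:120833.772 One child process died (PID:28019). Exiting ...
#   10017:20100428:031714.431 One child process died (PID:12123,exitcode/signal:0). Exiting ...
DIED_RE = re.compile(
    r"^\s*(?P<logger>\d+):(?P<date>\d{8}):(?P<time>\d{6})\.(?P<ms>\d{3})\s+"
    r"One child process died \(PID:(?P<child>\d+)"
    r"(?:,exitcode/signal:(?P<status>\d+))?\)"
)

# Any log line starts with "<pid>:<yyyymmdd>:<hhmmss>.<ms> ".
PID_PREFIX_RE = re.compile(r"^\s*(\d+):\d{8}:\d{6}\.\d{3}\s")


class DiedEvent(NamedTuple):
    logger_pid: int          # process that wrote the log line
    when: datetime           # timestamp as logged (local time, no zone info)
    child_pid: int           # PID reported as died
    status: Optional[int]    # exitcode/signal field, absent in older logs


def parse_died(line: str) -> Optional[DiedEvent]:
    """Parse one log line; return None if it is not a 'died' message."""
    m = DIED_RE.match(line)
    if m is None:
        return None
    when = datetime.strptime(
        f"{m['date']} {m['time']}.{m['ms']}", "%Y%m%d %H%M%S.%f"
    )
    status = int(m["status"]) if m["status"] is not None else None
    return DiedEvent(int(m["logger"]), when, int(m["child"]), status)


def unlogged_children(lines: Iterable[str]) -> List[DiedEvent]:
    """Return 'died' events whose child PID never wrote a log line itself."""
    lines = list(lines)
    writers = {int(m.group(1)) for m in map(PID_PREFIX_RE.match, lines) if m}
    events = [e for e in map(parse_died, lines) if e is not None]
    return [e for e in events if e.child_pid not in writers]
```

Feeding the whole of zabbix_server.log to unlogged_children would list the PIDs that died without ever logging, i.e. candidates for processes that were not regular server children.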


 Comments   
Comment by Alexey Pustovalov [ 2010 Apr 09 ]

New crash:
24994:20100409:120806.874 Before executing [/usr/local/sbin/SendSUCTSMS]
2296:20100409:120847.884 One child process died (PID:24994). Exiting ..

This is from the debug log.

Comment by Alexey Pustovalov [ 2010 Apr 29 ]

The server crashes because of the script the watchdog runs to send the alert; the PID reported as died belongs to that script:

      *** ~ # cat /var/log/zabbix/zabbix_server.log | grep died
        10017:20100428:031714.431 One child process died (PID:12123,exitcode/signal:0). Exiting ...
        19399:20100428:043151.731 One child process died (PID:11781,exitcode/signal:0). Exiting ...
        2647:20100428:062449.067 One child process died (PID:8466,exitcode/signal:0). Exiting ...
        27043:20100428:131847.483 One child process died (PID:18083,exitcode/signal:0). Exiting ...
        21029:20100428:133648.866 One child process died (PID:24980,exitcode/signal:0). Exiting ...
        27133:20100429:101428.057 One child process died (PID:26186,exitcode/signal:0). Exiting ...
      *** ~ # cat /var/log/zabbix/zabbix_server.log | grep 26186
        27133:20100429:101428.057 One child process died (PID:26186,exitcode/signal:0). Exiting ...
      *** ~ # cd psax/
      *** psax # grep 26186 *
        1272514408:26186 ? SN 0:00 /usr/bin/perl /usr/local/sbin/SendSUCTSMS *** Zabbix database is down. Zabbix database is down.
        1272514410:26186 ? SN 0:00 /usr/bin/perl /usr/local/sbin/SendSUCTSMS *** Zabbix database is down. Zabbix database is down.
        1272514413:26186 ? SN 0:00 /usr/bin/perl /usr/local/sbin/SendSUCTSMS *** Zabbix database is down. Zabbix database is down.
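
To line the process snapshots up against the server log, the snapshot names need to be turned into timestamps. The sketch below assumes the file names in psax/ (the part before the first ':' in the grep output) are Unix epoch seconds; the ticket does not say so explicitly, and snapshot_time_utc is a hypothetical helper.

```python
from datetime import datetime, timezone


def snapshot_time_utc(grep_line: str) -> datetime:
    """Interpret the 'FILE' part of a 'FILE:match' grep line as epoch seconds.

    Assumption: the snapshot files are named after the Unix time at which
    the process list was captured.
    """
    epoch = int(grep_line.split(":", 1)[0])
    return datetime.fromtimestamp(epoch, tz=timezone.utc)
```

Under that assumption the first snapshot above, 1272514408, corresponds to 2010-04-29 04:13:28 UTC. The server log records local time without a zone, so the offset to the 101428 log entry depends on the server's timezone, which is not given here.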
Comment by Alexey Pustovalov [ 2010 Apr 29 ]

The script exits with code 0.

Comment by richlv [ 2010 May 27 ]

If this problem is still reproducible, could you get the log with the svn head of Zabbix server?
It might print out more information.

Comment by Alexey Pustovalov [ 2010 Jun 14 ]

This problem still occurs. zabbix_server crashes when it tries to send the "Database was down" message: the PID reported as died is the PID of the script that sends the message.

        *** ~ # cat /var/log/zabbix/zabbix_server.log | grep died
          26321:20100530:005258.544 One child process died (PID:28616,exitcode/signal:255). Exiting ...
          27151:20100603:220251.085 One child process died (PID:28614,exitcode/signal:255). Exiting ...
          29910:20100603:223307.029 One child process died (PID:3167,exitcode/signal:0). Exiting ...
          7840:20100604:101618.006 One child process died (PID:2517,exitcode/signal:0). Exiting ...
          6672:20100604:111638.873 One child process died (PID:30955,exitcode/signal:0). Exiting ...
          3542:20100604:115031.781 One child process died (PID:12553,exitcode/signal:0). Exiting ...
          12879:20100604:120841.017 One child process died (PID:22890,exitcode/signal:0). Exiting ...
          23157:20100604:122046.110 One child process died (PID:29634,exitcode/signal:0). Exiting ...
          13952:20100605:174627.645 One child process died (PID:30210,exitcode/signal:0). Exiting ...
          945:20100609:073816.401 One child process died (PID:9217,exitcode/signal:0). Exiting ...
          19342:20100609:082547.624 One child process died (PID:23005,exitcode/signal:0). Exiting ...
          24074:20100609:083833.128 One child process died (PID:29592,exitcode/signal:0). Exiting ...
          4969:20100609:181413.097 One child process died (PID:759,exitcode/signal:0). Exiting ...
          4932:20100609:213921.575 One child process died (PID:18497,exitcode/signal:0). Exiting ...
          23784:20100610:142503.828 One child process died (PID:7370,exitcode/signal:0). Exiting ...
          12587:20100610:150153.569 One child process died (PID:23221,exitcode/signal:0). Exiting ...
          24943:20100611:160656.857 One child process died (PID:29729,exitcode/signal:0). Exiting ...
          23294:20100611:172835.893 One child process died (PID:23652,exitcode/signal:255). Exiting ...
          24164:20100611:174418.246 One child process died (PID:2217,exitcode/signal:0). Exiting ...
          3292:20100611:175603.684 One child process died (PID:4806,exitcode/signal:255). Exiting ...
          8440:20100611:184402.803 One child process died (PID:30770,exitcode/signal:0). Exiting ...
          9714:20100611:193700.705 One child process died (PID:14360,exitcode/signal:0). Exiting ...
          14898:20100611:194932.557 One child process died (PID:21034,exitcode/signal:0). Exiting ...
          32640:20100612:095908.056 One child process died (PID:14660,exitcode/signal:0). Exiting ...
          19761:20100612:174417.333 One child process died (PID:12140,exitcode/signal:0). Exiting ...
          18130:20100613:012727.676 One child process died (PID:14741,exitcode/signal:0). Exiting ...
          19355:20100613:022104.260 One child process died (PID:2901,exitcode/signal:0). Exiting ...
          16243:20100613:034558.712 One child process died (PID:31636,exitcode/signal:0). Exiting ...
          2723:20100614:150549.144 One child process died (PID:22270,exitcode/signal:0). Exiting ...
      *** ~ # cat /var/log/zabbix/zabbix_server.log | grep "31636:"
      *** ~ #
      *** ~ # cat /var/log/zabbix/zabbix_server.log | grep "2723:"
        2723:20100613:035926.310 Starting Zabbix Server. Zabbix 1.9 (revision {ZABBIX_REVISION}).
        2723:20100613:035926.311 **** Enabled features ****
        2723:20100613:035926.311 SNMP monitoring: YES
        2723:20100613:035926.311 IPMI monitoring: NO
        2723:20100613:035926.311 WEB monitoring: YES
        2723:20100613:035926.311 Jabber notifications: YES
        2723:20100613:035926.311 ODBC: YES
        2723:20100613:035926.311 SSH2 support: NO
        2723:20100613:035926.311 IPv6 support: NO
        2723:20100613:035926.311 **************************
        2723:20100613:040003.916 server #0 started [Watchdog]
        2723:20100614:150507.869 [Z3001] Connection to database '(null)' failed: [-1] ORA-03135: connection lost contact
        2723:20100614:150507.875 Watchdog: Database is down
        2723:20100614:150549.144 One child process died (PID:22270,exitcode/signal:0). Exiting ...
        2723:20100614:150552.474 Syncing history data...
        2723:20100614:150647.745 Slow query: 51.549911 sec, "begin
        2723:20100614:150659.793 Syncing history data... 4.064875%
        2723:20100614:150711.830 Syncing history data... 8.129751%
        2723:20100614:150726.383 Syncing history data... 16.259502%
        2723:20100614:150745.940 Syncing history data... 20.324377%
        2723:20100614:150757.178 Syncing history data... 28.454128%
        2723:20100614:150809.251 Syncing history data... 32.519003%
        2723:20100614:150821.051 Evaluation failed for function: delta
        2723:20100614:150821.051 Expression [{341624}>0] for item [585978][***:lmemBufferFail] cannot be evaluated: Evaluation failed for function: delta
        2723:20100614:150821.120 Evaluation failed for function: delta
        2723:20100614:150821.121 Expression [{341621}>75] for item [585977][***:lcpuPercentBusy.5sec] cannot be evaluated: Evaluation failed for function: delta
        2723:20100614:150821.255 Evaluation failed for function: delta
        2723:20100614:150821.255 Expression [{341629}>10240] for item [585980][***:lmemFreeMem] cannot be evaluated: Evaluation failed for function: delta
        2723:20100614:150822.009 Syncing history data... 44.713630%
        2723:20100614:150828.247 Evaluation failed for function: delta
        2723:20100614:150828.247 Expression [{341619}>0] for item [585979][***:lmemBufferNoMem] cannot be evaluated: Evaluation failed for function: delta
        2723:20100614:150836.238 Syncing history data... 52.843380%
        2723:20100614:150847.812 Syncing history data... 65.038007%
        2723:20100614:150847.828 Parameter [542099][***:agent.emule.users] became supported
        2723:20100614:150857.696 Syncing history data... 73.167757%
        2723:20100614:150908.286 Syncing history data... 89.427259%
        2723:20100614:150915.780 Syncing history data... done.
        2723:20100614:150915.780 Syncing trends data...
        2723:20100614:151028.879 Slow query: 73.072147 sec, "begin
        2723:20100614:151043.360 Syncing trends data... done.
        2723:20100614:151043.361 Zabbix Server stopped. Zabbix 1.9 (revision {ZABBIX_REVISION}).
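
The behaviour described in this comment is consistent with a supervisor that waits for any child and treats every exit as the death of a worker. The ticket does not say what the actual fix was, so the following is only an illustration of that suspected mechanism, not Zabbix code; fork_child and first_exit are hypothetical names. A parent starts two long-lived "workers" and one short-lived "helper" (standing in for the alert script), calls wait(), and checks whether the PID it got back is one of its workers.

```python
import os
import signal
import time
from typing import Tuple


def fork_child(lifetime: float) -> int:
    """Fork a child that sleeps for `lifetime` seconds, then exits with 0."""
    pid = os.fork()
    if pid == 0:
        time.sleep(lifetime)
        os._exit(0)  # leave the child without running parent cleanup code
    return pid


def first_exit() -> Tuple[bool, bool, int]:
    """Return (was_helper, was_worker, exit_code) for the first child reaped."""
    workers = {fork_child(60.0), fork_child(60.0)}  # stand-ins for server children
    helper = fork_child(0.0)                        # stand-in for the alert script

    # wait() returns as soon as ANY child exits, including the helper.
    pid, status = os.wait()
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

    # A supervisor that tracks its worker PIDs can tell the two cases apart.
    was_worker = pid in workers

    # Clean up the stand-in workers so the demo leaves nothing behind.
    for w in workers:
        os.kill(w, signal.SIGTERM)
        os.waitpid(w, 0)
    return pid == helper, was_worker, exit_code
```

In this sketch the first reaped child is the helper, with exit code 0, which matches the "exitcode/signal:0" entries above for a PID that never appears as a log writer. Checking the reaped PID against the set of known workers is one way a supervisor could avoid shutting down in that case.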

Comment by Aleksandrs Saveljevs [ 2010 Nov 22 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-2293.

Comment by Aleksandrs Saveljevs [ 2010 Nov 22 ]

Merged into pre-1.8.4 in r15606.
