[ZBX-3902] Child processes sometimes exit into a zombie state Created: 2011 Jun 21  Updated: 2017 May 30  Resolved: 2011 Jul 21

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G), Proxy (P), Server (S)
Affects Version/s: None
Fix Version/s: 1.8.6, 1.9.5 (alpha)

Type: Incident report Priority: Major
Reporter: Rudolfs Kreicbergs Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

When a child process exits, the exit is not always detected. This seems to happen at random: in one example case, when the conf syncer exits because of a bug, roughly 1 out of 3 times the server does not detect it and the other processes continue to function normally.

Also, when the collector process of a UNIX agent exits, the main process ignores it and the collector always ends up in a zombie state.



 Comments   
Comment by Rudolfs Kreicbergs [ 2011 Jun 30 ]

Fixed in dev branch: svn://svn.zabbix.com/branches/dev/ZBX-3902

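The problem was caused by a race condition: the SIGCHLD handler was registered only after all child processes had been created, so a child that exited before that point could go unnoticed. Now the SIGCHLD handler is registered when starting the daemon, and SIGCHLD is set to SIG_IGN for each child in zbx_fork().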

Comment by Rudolfs Kreicbergs [ 2011 Jul 20 ]

Fixed/available in pre-1.8.6 r20693 and in pre-1.9.5 r20694.

Comment by Alexander Vladishev [ 2011 Jul 21 ]

After merging these changes into the 1.8 branch, external checks no longer work:

zabbix_server.log:
30827:20110721:114711.599 failed to kill [/home/sasha/zabbix-bin/scripts/random.sh 192.168.3.4 56]: [3] No such process
30827:20110721:114711.599 zbx_waitpid() failed: [10] No child processes

<rudolfs> RESOLVED in 20708. The problem was that SIGCHLD was set to SIG_IGN. Although the signal is ignored by default, explicitly setting it to SIG_IGN tells the system to reap the children itself. In this case, the scripts exited before zbx_waitpid() was called, and the system had already taken care of them.

<sasha> CLOSED

Comment by richlv [ 2011 Jul 28 ]

Crashes server & proxy upon config reload. See ZBX-3990.

Generated at Wed Apr 24 01:57:06 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.