ZBX-10215

Host availability not updated for connection errors on items that time out

      This seems to have been introduced in ZBX-4284.

      Suppose we have a host with a "sleeping" item (one that simply sleeps when queried, thus causing a timeout). If we query that item on a running agent, then the server will mark the agent as reachable, but the item as unreachable.

      If we now stop the agent, then the corresponding Zabbix host will not be marked as unavailable. The reason seems to be the last lines of the following code:

      static void	deactivate_host(DC_ITEM *item, zbx_timespec_t *ts, int *available, const char *error)
      {
      	const char		*__function_name = "deactivate_host";
      
      	zbx_host_availability_t	in, out;
      
      	zabbix_log(LOG_LEVEL_DEBUG, "In %s() hostid:" ZBX_FS_UI64 " itemid:" ZBX_FS_UI64 " type:%d",
      			__function_name, item->host.hostid, item->itemid, (int)item->type);
      
      	if (FAIL == host_get_availability(&item->host, item->type, &in))
      		goto out;
      
      	/* if the item is still flagged as unreachable while the host is reachable, */
      	/* it means that this is item rather than network failure                   */
      	if (0 == in.errors_from && 0 != item->unreachable)
      		goto out;
      	
      	...
      

      In other words, once an item is flagged as unreachable while the host is still considered reachable, its failures no longer influence the host's availability, even when the error is NETWORK_ERROR rather than TIMEOUT_ERROR.

      Another consequence of the bug is that, in this case, deactivate_host() does not set "last_available" to HOST_AVAILABLE_FALSE, so the item becomes not supported instead of the host becoming unavailable (see the get_values() function in poller.c).
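To make the effect of the guard concrete, here is a minimal C sketch of the condition in isolation. It is not the Zabbix source: the `sketch_*` types, the `SKETCH_*` constants and both function names are hypothetical stand-ins that model only the two fields the guard reads. The second function shows one possible way to take the error type into account; it is an assumption for illustration, not necessarily how the issue was or should be fixed.

```c
/* Hypothetical stand-ins for the real error codes (illustration only). */
enum { SKETCH_NETWORK_ERROR, SKETCH_TIMEOUT_ERROR };

/* Models only the field of the host availability data that the guard reads. */
typedef struct
{
	int	errors_from;	/* 0 while the host is still considered reachable */
}
sketch_availability_t;

/* Models only the field of the item that the guard reads. */
typedef struct
{
	int	unreachable;	/* non-zero once the item itself is flagged unreachable */
}
sketch_item_t;

/* Same condition as the guard quoted from deactivate_host(): */
/* returns 1 when deactivation is skipped. The error type is  */
/* never consulted.                                           */
static int	guard_skips_deactivation(const sketch_availability_t *in, const sketch_item_t *item)
{
	return 0 == in->errors_from && 0 != item->unreachable;
}

/* Assumed variant: skip deactivation only when the failure   */
/* was a timeout, so that a network error still counts        */
/* against the host.                                          */
static int	guard_skips_only_timeouts(const sketch_availability_t *in, const sketch_item_t *item,
		int errcode)
{
	return SKETCH_TIMEOUT_ERROR == errcode && guard_skips_deactivation(in, item);
}
```

With a reachable host (`errors_from` of 0) and a sleeping item already flagged unreachable, the first function skips deactivation no matter which error occurred, which matches the behaviour described above once the agent is stopped. The assumed variant skips it only for SKETCH_TIMEOUT_ERROR.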

            asaveljevs Aleksandrs Saveljevs