Type: Problem report
Resolution: Fixed
Priority: Critical
Affects versions: 4.0.6, 4.2.0
Environment: OpenIPMI >= 2.0.26
Sprints: Sprint 51 (Apr 2019), Sprint 52 (May 2019)
Story points: 0.5
Steps to reproduce:
- Have IPMI checks that enter zbx_perform_all_openipmi_ops() and whose perform_one_op() calls return before the timeout expires. perform_one_op() updates the remaining timeout, which gets driven down to 0.0, at which point the loop just sits and spins.
Result:
100% CPU usage on IPMI thread
Expected:
Normal CPU usage; the loop should exit once the timeout expires.
Further discussion:
Once perform_one_op() returns before the timeout, we never break out of the loop: start_time is reset on every cycle, but the elapsed duration keeps being compared against the original timeout rather than the remaining timeout that perform_one_op() has already updated.
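For illustration, here is a rough sketch of how the pre-fix loop reads per the description above (reconstructed, not taken verbatim from checks_ipmi.c; the start_time/duration bookkeeping and the zbx_time() helper are assumptions made for clarity):

/* sketch of the problematic pattern described above (reconstruction, not the actual source) */
void	zbx_perform_all_openipmi_ops(int timeout)
{
	struct timeval	tv = {timeout, 0};
	double		start_time, duration;

	do
	{
		int	res;

		start_time = zbx_time();	/* reset on every cycle, so elapsed time never accumulates */

		/* once perform_one_op() has driven tv down to {0, 0}, it returns immediately */
		if (0 != (res = os_hnd->perform_one_op(os_hnd, &tv)))
		{
			zabbix_log(LOG_LEVEL_DEBUG, "IPMI error: %s", zbx_strerror(res));
			break;
		}

		duration = zbx_time() - start_time;	/* near zero once tv is {0, 0} */
	}
	while (duration < timeout);	/* compared against the original timeout, so the loop never exits */
}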
Since perform_one_op() updates the remaining timeout internally (and returns a timeout of {0,0} if it did time out), drop the start_time tracking entirely: loop while (tv.tv_sec + tv.tv_usec > 0) and reset tv to the timeout at the start of each iteration:
void	zbx_perform_all_openipmi_ops(int timeout)
{
	struct timeval	tv = {1, 0};

	while (tv.tv_sec + tv.tv_usec > 0)
	{
		int	res;

		tv.tv_sec = timeout;
		tv.tv_usec = 0;

		/* perform_one_op() returns 0 on success, errno on failure (timeout means success) */
		res = os_hnd->perform_one_op(os_hnd, &tv);

		if (0 != res)
		{
			zabbix_log(LOG_LEVEL_DEBUG, "IPMI error: %s", zbx_strerror(res));
			break;
		}
	}
}
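A note on this pattern (my reading of the proposed code, not stated in the report itself): because tv is reset to the full timeout at the top of every iteration, the function no longer tracks total elapsed time at all; the loop ends either when a single perform_one_op() call actually times out and hands back {0, 0}, or when it reports an error, so there is nothing left to spin on.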
Caused by:
- ZBX-15578 IPMI times out and fails to read values when polls aren't frequent enough (Closed)