-
Patch request
-
Resolution: Unresolved
-
Trivial
-
None
-
6.0.42, 7.4.3
-
None
-
Linux RHEL 8.6
Steps to reproduce:
Running multiple IBM x3650 M4 devices with the IMM2 IPMI implementation.
IPMI discovery finds around 190 items for each device.
Once the items are enabled, the poller floods the IMM2 with data requests,
which overwhelms the IMM2 until it starts locking up requests. The poller then
flags these items as invalid, as it does not trap the errors correctly.
When I query the items directly with ipmitool, it returns the state
"BMC busy". Over time the BMC seems to free up and allow queries again;
however, the poller continues to request data, which results in a shifting
list of items in a "Not Supported" state. The behavior is the same on all
the IBM devices, though worse on some than on others. Eventually, on one
device, the BMC never recovers until an IMM2 reboot is initiated.
After some experimenting, I placed a "usleep(1000);" call (a 1 ms pause) at
the beginning of the get_value_ipmi() function in checks_ipmi.c.
This definitely helped solve the issue! It simply slows down the rate of
calls made to the BMC. Everything is fully functional now.
This, however, is just a hack. We need a proper fix. My suggestion is a
setting such as "IPMIPollerDelay=xxx", in milliseconds, so we can tune this
behavior in the zabbix_server.conf file.
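To make the suggestion concrete, here is a minimal sketch of how such a delay could be parsed and applied. It is not taken from the Zabbix source: the helper names ipmi_delay_parse() and ipmi_delay_wait(), the 60000 ms upper bound, and the idea of calling the wait at the top of get_value_ipmi() are all assumptions for illustration, and "IPMIPollerDelay" is only the parameter name proposed above, not an existing option.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/* Assumed upper bound for the proposed setting (hypothetical). */
#define IPMI_DELAY_MAX_MS 60000L

/*
 * Parse a delay given in milliseconds, e.g. the value of a proposed
 * "IPMIPollerDelay=250" line. Returns 0 and stores the value on success,
 * -1 if the text is empty, not a plain number, or out of range.
 */
int ipmi_delay_parse(const char *text, long *out_ms)
{
    char *end = NULL;
    long value;

    if (text == NULL || *text == '\0')
        return -1;

    errno = 0;
    value = strtol(text, &end, 10);

    if (errno != 0 || *end != '\0' || value < 0 || value > IPMI_DELAY_MAX_MS)
        return -1;

    *out_ms = value;
    return 0;
}

/*
 * Pause for delay_ms milliseconds before the next request to the BMC.
 * A delay of 0 disables throttling. If a signal interrupts the sleep,
 * the remaining time is slept so the full delay is still honored.
 */
void ipmi_delay_wait(long delay_ms)
{
    struct timespec req, rem;

    assert(delay_ms <= IPMI_DELAY_MAX_MS);

    if (delay_ms <= 0)
        return;

    req.tv_sec = delay_ms / 1000;
    req.tv_nsec = (delay_ms % 1000) * 1000000L;

    while (nanosleep(&req, &rem) == -1 && errno == EINTR)
        req = rem;
}
```

With a delay of 1, ipmi_delay_wait() behaves like the "usleep(1000);" hack, while a default of 0 would leave current behavior unchanged for devices that do not need throttling.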
Happy to assist with debugging and testing this item.