[ZBX-7847] zabbix server continues polling disabled ipmi hosts
Created: 2014 Feb 20  Updated: 2017 May 30  Resolved: 2015 Sep 08

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 2.2.2
Fix Version/s: 3.0.0alpha2

Type: Incident report
Priority: Blocker
Reporter: richlv
Assignee: Unassigned
Resolution: Fixed
Votes: 0
Labels: ipmi
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

openipmi 2.0.16 and 2.0.21


Issue Links:
Duplicate

 Description   

test scenario :
have two hosts, each with one ipmi item. verify that both work. disable item on one of those hosts.
observe how server (openipmi lib, probably) still connects to this host.

probably same happens if host is disabled, is in nodata maintenance and maybe some other scenario (like is deleted ?).

it does not seem to be reproducible with one ipmi host only - there must be another, even though they are not related.



 Comments   
Comment by Aleksandrs Saveljevs [ 2014 May 08 ]

ZBX-8188 might be related.

Comment by Igors Homjakovs (Inactive) [ 2015 Jun 30 ]

Fixed in svn://svn.zabbix.com/branches/dev/ZBX-7847

Comment by dimir [ 2015 Jul 29 ]

Yes, zabbix still connects to the ipmi host even after the host is deleted. As a result, if you delete an ipmi host and then add it again, you end up with 2 entities, each connecting to the ipmi host. And if you do that again you get 3, and so on.

Comment by dimir [ 2015 Jul 29 ]

(1) Question: should we delete the ipmi connection when the host is deleted? With the fix it will be deleted after one day (we check for inactive ipmi connections every hour and delete any that have been inactive for one day).

dimir After discussion we decided to keep it that way but this has to be documented (see sub-issue 3 below).

CLOSED
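
The cleanup described in (1) can be sketched as a periodic sweep over cached connections. This is only an illustration in Python, not the actual server code; `IpmiConnection`, `last_access` and `prune_inactive` are hypothetical names, and the one-day threshold is taken from the comment above. The caller is assumed to run the sweep once an hour.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600
INACTIVE_TIMEOUT = 24 * SECONDS_PER_HOUR  # one day, as stated in the comment


@dataclass
class IpmiConnection:
    """Hypothetical stand-in for a cached IPMI connection."""
    host: str
    last_access: int  # Unix time of the last IPMI check that used the connection


def prune_inactive(connections, now, timeout=INACTIVE_TIMEOUT):
    """Keep only connections used less than `timeout` seconds before `now`."""
    return [c for c in connections if now - c.last_access < timeout]


# Usage: host-a is still polled, host-b had its only ipmi item disabled 25 hours ago.
now = 1_000_000
cache = [
    IpmiConnection("host-a", last_access=now - 30),
    IpmiConnection("host-b", last_access=now - 25 * SECONDS_PER_HOUR),
]
cache = prune_inactive(cache, now)
remaining_hosts = [c.host for c in cache]
```

After the sweep only the connection for host-a is left; a deleted host is handled the same way as a disabled one, because nothing refreshes its `last_access` any more.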

Comment by dimir [ 2015 Jul 29 ]

(2) In the current implementation we use an integer to store the auto-incremented ID of an ipmi host. Should we worry about overflow if we add lots of ipmi hosts and the server runs constantly without a restart?

dimir After discussion we decided that 4 billion IDs should be enough for all ipmi connections created during the uptime of Zabbix server or proxy.

CLOSED
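
A rough back-of-the-envelope check of the "4 billion" argument in (2), assuming the ID is an unsigned 32-bit counter (the thread only says "integer") and that IDs are handed out at a steady rate. The function name and the rates are made up for illustration.

```python
ID_SPACE = 2 ** 32  # assumption: unsigned 32-bit counter, about 4.29 billion values
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_until_wrap(new_connections_per_second):
    """Years of uninterrupted uptime before the ID counter wraps around."""
    return ID_SPACE / (new_connections_per_second * SECONDS_PER_YEAR)


years_at_1_per_sec = years_until_wrap(1)      # roughly 136 years
years_at_100_per_sec = years_until_wrap(100)  # roughly 1.4 years
```

Even at an unrealistic one new ipmi connection per second, the counter would need more than a century of uptime to overflow.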

Comment by dimir [ 2015 Jul 30 ]

(3) [D] If ipmi checks are not performed (for any reason: all host ipmi items disabled/notsupported, host disabled/deleted, host in maintenance etc.) the ipmi connection will be terminated from Zabbix server or proxy after one day (24 hours + 0..60 minutes). This needs to be documented.

igorsh Documented in https://www.zabbix.com/documentation/3.0/manual/config/items/itemtypes/ipmi

RESOLVED.

<richlv> 0..60 notation is not used often. also, where does the 60 second interval come from ? if that's config cache update, that could be changed, and we should actually say so here.

what about older versions, do they keep on making connections for deleted/disabled hosts, too ? if so, we should document that.

<dimir> The 60 minutes comes from the fact that we check for inactive hosts once an hour. So 0..60 depends on the time when Zabbix server was started. Documentation updated.

RESOLVED
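
To illustrate where the "+ 0..60 minutes" comes from: the connection is only removed at the first hourly check that finds it past the timeout, so the extra delay depends on where the last ipmi check falls relative to the check schedule. A small sketch with hypothetical names, assuming checks run at fixed one-hour steps counted from server start:

```python
HOUR = 3600


def removal_delay(last_use, first_check, timeout=24 * HOUR, check_interval=HOUR):
    """Seconds from the last IPMI check until the sweep that drops the connection.

    Sweeps are assumed to run at first_check + k * check_interval (k = 0, 1, ...).
    A sweep drops the connection once it has been unused for at least `timeout`.
    """
    expires = last_use + timeout
    # Number of sweeps after the first one, rounded up to reach the expiry time.
    sweeps = max(0, -(-(expires - first_check) // check_interval))
    return first_check + sweeps * check_interval - last_use


best_case = removal_delay(last_use=0, first_check=0)   # expiry lands exactly on a sweep
worst_case = removal_delay(last_use=1, first_check=0)  # expiry lands just after a sweep
```

`best_case` is exactly 24 hours and `worst_case` is one second short of 25 hours; passing a different `timeout` shifts both ends of the window by the same amount.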

<richlv> oh, i was sure it's in seconds
how about simplifying it to "in 3 to 4 hours, depending on the time when Zabbix server was started" ?
also, let's add the information to all versions instead - users of 2.2 are not that likely to check for this behaviour in 3.0 docs

<dimir>

https://www.zabbix.com/documentation/2.2/manual/config/items/itemtypes/ipmi
https://www.zabbix.com/documentation/2.4/manual/config/items/itemtypes/ipmi
https://www.zabbix.com/documentation/3.0/manual/config/items/itemtypes/ipmi

RESOLVED

<richlv> yay, that should help users a lot - thank you
CLOSED

Comment by dimir [ 2015 Jul 30 ]

Successfully tested. Please review my changes in r54613 and r54620.

Comment by richlv [ 2015 Aug 03 ]

(4) after some discussion on irc, i'd like to raise the fact that one day seems like a very long period. on one hand, we could imagine a user who sets up some initial ipmi monitoring that's too aggressive, disables or deletes that host when it starts killing the endpoint... but zabbix would keep on hammering that interface for 23 more hours.

another case could be a support call where zabbix is making ipmi connections, somebody could look at items - and no ipmi items would exist.

value cache was brought up as an example as that is checked for old items every 24 hours, but that's conceptually different as it is an internal thing and would not potentially impact other systems.

what are the potential drawbacks of lowering this time period ? what about ipmi items with interval longer than one day, how would they be impacted ?

<dimir> I personally agree, but we already discussed it with sasha and he decided to keep it that way. I'm not sure it's worth making this "24 hours" value configurable, I'd think just lowering it to 3 hours would be much better. 3 as in the often-used 3 attempts before deciding something is not responding.

sasha I agree to lowering this period to 3 hours

<dimir> RESOLVED in r55456

sasha Looks good! CLOSED

Comment by dimir [ 2015 Sep 08 ]

Fixed in pre-3.0.0alpha2 (r55473).
