[ZBX-8188] Zabbix server continues to retrieve values from IPMI agent even if the machine is in shutdown state Created: 2014 May 08  Updated: 2017 May 30  Resolved: 2014 May 12

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 2.0.11, 2.2.3
Fix Version/s: 2.0.13rc1, 2.2.4, 2.3.2

Type: Incident report Priority: Minor
Reporter: Kodai Terashima Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: ipmi
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate

 Description   

Zabbix server and proxy continue to retrieve values from IPMI sensors even when the monitored machine is shut down. ipmitool returns "na" in this case.

The IPMI device returns values together with a sensor status, but Zabbix does not check that status.
As a result, Zabbix retrieves values from IPMI sensors that are not working (e.g. a CPU temperature of 50 degrees from a stopped machine).

According to the IPMI 2.0 specification, an IPMI client should check the sensor status before reading the value. In that case the IPMI item should become not supported.



 Comments   
Comment by Aleksandrs Saveljevs [ 2014 May 08 ]

ZBX-7847 might be related.

kodai I guess these are different issues. This one is a problem when the monitored machine is stopped (shut down); ZBX-7847 is about a host disabled in the Zabbix frontend.

Comment by Nikolajs Agafonovs (Inactive) [ 2014 May 12 ]

Fix available in svn://svn.zabbix.com/branches/dev/ZBX-8188

Comment by Aleksandrs Saveljevs [ 2014 May 16 ]

Kodai, could you please provide a reference to where the IPMI 2.0 specification talks about checking the sensor status?

kodai I checked the ipmitool source (ipmitool returns n/a for a stopped device) but haven't checked the IPMI 2.0 specification. I will try to find it in the specification.

Comment by Aleksandrs Saveljevs [ 2014 May 19 ]

An alternative solution is available in the same development branch: svn://svn.zabbix.com/branches/dev/ZBX-8188 .

It uses functions ipmi_is_sensor_scanning_enabled() and ipmi_is_initial_update_in_progress() as previously suggested by Kodai.

Comment by Kodai Terashima [ 2014 May 27 ]

According to page 464 of http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/second-gen-interface-spec-v2.pdf, Table 35-15 (Get Sensor Reading Command), the 3rd bit of the response data is:

reading/state unavailable (formerly “initial update in progress”). This bit is set to indicate that a ‘re-arm’ or ‘Set Event Receiver’ command has been used to request an update of the sensor status, and that update has not occurred yet. Software should use this bit to avoid getting an incorrect status while the first sensor update is in progress. This bit is only required if it is possible for the controller to receive and process a ‘Get Sensor Reading’ or ‘Get Sensor Event Status’ command for the sensor before the update has completed. This is most likely to be the case for sensors, such as fan RPM sensors, that may require seconds to accumulate the first reading after a re-arm. The bit is also used to indicate when a reading/state is unavailable because the management controller cannot obtain a valid reading or state for the monitored entity, typically because the entity is not present. See Section 16.4, Event Status, Event Conditions, and Present State and Section 16.6, Re-arming for more information.
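
To make the quoted bit description concrete, here is a minimal sketch of how a client might decode the status byte of a Get Sensor Reading response before trusting the reading. The bit positions are taken from one reading of Table 35-15 and should be verified against the specification; the macro names and the helper reading_is_usable() are hypothetical and are not part of Zabbix or OpenIPMI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks for the status byte of the Get Sensor Reading response.
 * Assumed layout (check Table 35-15 before relying on it). */
#define READING_EVENT_MSGS_ENABLED 0x80 /* bit 7: 0 means all event messages are disabled */
#define READING_SCANNING_ENABLED   0x40 /* bit 6: 0 means sensor scanning is disabled */
#define READING_UNAVAILABLE        0x20 /* bit 5: 1 means reading/state unavailable */

/* Hypothetical helper: returns true only when the reading byte can be trusted. */
static bool reading_is_usable(uint8_t status)
{
	if (0 == (status & READING_SCANNING_ENABLED))
		return false;	/* sensor is not being scanned */

	if (0 != (status & READING_UNAVAILABLE))
		return false;	/* update pending or monitored entity not present */

	return true;
}
```

With this layout, a status byte of 0xC0 would be treated as usable, while 0x80 (scanning disabled) and 0xE0 (reading unavailable) would not.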

Comment by dimir [ 2014 Jun 02 ]

Another one from the same doc regarding "sensor scanning enabled":

An Entity is present if there is at least one active sensor for the Entity (and there is no explicit sensor saying
the Entity is 'absent'). A sensor is 'active' if scanning is enabled. For each of these sensors, check to see that
at least one of the sensors is scanning by checking the "sensor scanning disabled" bit via the Get Sensor
Reading command. Per section 11.5, software should ignore this bit if its set to 'disabled'. If there are no
active sensors for the entity, then it should be assumed that the Entity is absent.
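
The presence rule quoted above could be sketched roughly as follows. This is only an illustration of the rule as worded in the excerpt: the type entity_sensor_t, its fields, and the function entity_is_present() are hypothetical, and the sketch does not model the section 11.5 caveat.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-sensor summary for the sensors attached to one entity. */
typedef struct
{
	bool	scanning_enabled;	/* the sensor is 'active' */
	bool	reports_absent;		/* an explicit sensor says the entity is absent */
}
entity_sensor_t;

/* Returns true if at least one sensor is active and none reports absence. */
static bool entity_is_present(const entity_sensor_t *sensors, size_t count)
{
	bool	any_active = false;

	for (size_t i = 0; i < count; i++)
	{
		if (sensors[i].reports_absent)
			return false;

		if (sensors[i].scanning_enabled)
			any_active = true;
	}

	/* no active sensors: assume the entity is absent */
	return any_active;
}
```

Under this sketch, an entity whose sensors all have scanning disabled is treated as absent.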

Comment by dimir [ 2014 Jun 02 ]

(1) [S] As I understand from the document quoted above, "initial update in progress" and "sensor scanning disabled" both mean, in general, that the sensor data is not available (either not ready yet or the entity does not exist). So I suggest simplifying the error message for the user like this:

        if (0 == ipmi_is_sensor_scanning_enabled(states))
        {
-               h->err = zbx_strdup(h->err, "sensor scanning is not enabled");
+               h->err = zbx_strdup(h->err, "sensor data is not available");
@@ -376,7 +376,7 @@
 
        if (0 != ipmi_is_initial_update_in_progress(states))
        {
-               h->err = zbx_strdup(h->err, "initial update is in progress");
+               h->err = zbx_strdup(h->err, "sensor data is not available");

Please see my suggestion in r46099.

asaveljevs Looks good, CLOSED.
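
As a standalone illustration of the suggestion in (1), the sketch below maps both conditions to the same user-facing error. It is not the Zabbix code: sketch_host_t, set_error() and read_sensor() are hypothetical stand-ins, the two flags are passed as plain integers instead of being queried from OpenIPMI, and set_error() only approximates the free-and-duplicate behaviour expected from zbx_strdup().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the host structure; only the error field is modelled. */
typedef struct
{
	char	*err;
}
sketch_host_t;

/* Replace the stored error with a copy of msg (keeps the old error if allocation fails). */
static void set_error(sketch_host_t *h, const char *msg)
{
	char	*copy = malloc(strlen(msg) + 1);

	if (NULL == copy)
		return;

	strcpy(copy, msg);
	free(h->err);
	h->err = copy;
}

/* Returns 0 and stores the value on success, -1 with h->err set otherwise. */
static int read_sensor(sketch_host_t *h, int scanning_enabled, int initial_update_in_progress,
		double raw, double *value)
{
	/* both conditions mean the same thing to the user */
	if (0 == scanning_enabled || 0 != initial_update_in_progress)
	{
		set_error(h, "sensor data is not available");
		return -1;
	}

	*value = raw;
	return 0;
}
```

A caller would then mark the item as not supported whenever read_sensor() returns -1, using h->err as the reason.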

Comment by dimir [ 2014 Jun 02 ]

Other than that, the issue is tested.

Comment by Aleksandrs Saveljevs [ 2014 Jun 03 ]

(2) There was a conflict merging from 2.0 into 2.2 because the newer version has discrete sensor support.

Resolved conflicts are available in development branch svn://svn.zabbix.com/branches/dev/ZBX-8188-2.2 . Please take a look.

dimir Looks good. Please see my suggestions to simplify the code a bit in r46134. RESOLVED

asaveljevs Great! CLOSED.

Comment by Aleksandrs Saveljevs [ 2014 Jun 03 ]

Fixed in pre-2.0.13 r46114, pre-2.2.4 r46164, pre-2.3.2 (trunk) r46165.
