ZABBIX BUGS AND ISSUES
ZBX-8188

Zabbix server continues to retrieve values from IPMI agent even if the machine is shut down

    Details

      Description

      Zabbix server and proxy continue to retrieve values from IPMI sensors even if the machine is shut down. ipmitool returns "na" in such a case.

      An IPMI device returns values together with a sensor status, but Zabbix doesn't check the status.
      So Zabbix retrieves values from IPMI sensors that are not working (e.g. a CPU temperature of 50 degrees reported for a stopped machine).

      According to the IPMI 2.0 specification, an IPMI client should check the sensor status before getting the value. The IPMI item should become not supported in that case.

        Activity

        Aleksandrs Saveljevs added a comment (edited)

        ZBX-7847 might be related.

        Kodai Terashima: I guess these are different issues. This one occurs when the monitored machine is stopped (shut down); ZBX-7847 is about a host disabled in the Zabbix frontend.

        Nikolajs Agafonovs added a comment (edited)

        Fix available in svn://svn.zabbix.com/branches/dev/ZBX-8188

        Aleksandrs Saveljevs added a comment (edited)

        Kodai, could you please provide a reference to the place in the IPMI 2.0 specification that talks about checking the sensor status?

        Kodai Terashima: I checked the ipmitool source (ipmitool returns n/a for a stopped device) but haven't checked the IPMI 2.0 specification. I will try to find it in the specification.

        Aleksandrs Saveljevs added a comment (edited)

        An alternative solution is available in the same development branch: svn://svn.zabbix.com/branches/dev/ZBX-8188 .

        It uses the functions ipmi_is_sensor_scanning_enabled() and ipmi_is_initial_update_in_progress(), as previously suggested by Kodai.

        Kodai Terashima added a comment (edited)

        According to page 464 of http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/second-gen-interface-spec-v2.pdf, Table 35-15, Get Sensor Reading Command, 3rd bit of response data:

        reading/state unavailable (formerly “initial update in progress”). This bit is set to indicate that a ‘re-arm’ or ‘Set Event Receiver’ command has been used to request an update of the sensor status, and that update has not occurred yet. Software should use this bit to avoid getting an incorrect status while the first sensor update is in progress. This bit is only required if it is possible for the controller to receive and process a ‘Get Sensor Reading’ or ‘Get Sensor Event Status’ command for the sensor before the update has completed. This is most likely to be the case for sensors, such as fan RPM sensors, that may require seconds to accumulate the first reading after a re-arm. The bit is also used to indicate when a reading/state is unavailable because the management controller cannot obtain a valid reading or state for the monitored entity, typically because the entity is not present. See Section 16.4, Event Status, Event Conditions, and Present State and Section 16.6, Re-arming for more information.
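
        The check described above can be sketched in a few lines of C. This is only an illustration, not the code from the development branch: it assumes the status byte of the Get Sensor Reading response uses bit 6 for "sensor scanning" (0 meaning disabled) and bit 5 for "reading/state unavailable", and the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the status byte in a Get Sensor Reading response:
 * bit 7 = event messages enabled, bit 6 = sensor scanning enabled
 * (0 means disabled), bit 5 = reading/state unavailable. */
#define SENSOR_EVENT_MSGS_ENABLED	0x80u
#define SENSOR_SCANNING_ENABLED		0x40u
#define SENSOR_READING_UNAVAILABLE	0x20u

/* Hypothetical helper: returns true only when the reading may be trusted. */
static bool	sensor_reading_is_valid(uint8_t status)
{
	/* scanning disabled: the controller is not updating this sensor */
	if (0 == (status & SENSOR_SCANNING_ENABLED))
		return false;

	/* reading/state unavailable: update pending or entity not present */
	if (0 != (status & SENSOR_READING_UNAVAILABLE))
		return false;

	return true;
}
```

        A caller would pass the raw status byte and skip the value (marking the item as not supported) whenever the helper returns false.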

        dimir added a comment (edited)

        Another one from the same doc regarding "sensor scanning enabled":

        An Entity is present if there is at least one active sensor for the Entity (and there is no explicit sensor saying
        the Entity is 'absent'). A sensor is 'active' if scanning is enabled. For each of these sensors, check to see that
        at least one of the sensors is scanning by checking the "sensor scanning disabled" bit via the Get Sensor
        Reading command. Per section 11.5, software should ignore this bit if its set to 'disabled'. If there are no
        active sensors for the entity, then it should be assumed that the Entity is absent.
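
        The presence rule quoted above could look roughly like this in C. It is a simplified sketch under stated assumptions: the status bytes of all sensors belonging to one entity have already been collected into an array, bit 6 of each byte is taken to mean "scanning enabled", and the case of an explicit sensor reporting the entity as absent is not handled. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed: bit 6 of the status byte is set when sensor scanning is enabled. */
#define SENSOR_SCANNING_ENABLED	0x40u

/* Hypothetical helper: an entity is treated as present when at least one
 * of its sensors is active, i.e. has scanning enabled. */
static bool	entity_is_present(const uint8_t *status_bytes, size_t count)
{
	for (size_t i = 0; i < count; i++)
	{
		if (0 != (status_bytes[i] & SENSOR_SCANNING_ENABLED))
			return true;
	}

	/* no active sensors: assume the entity is absent */
	return false;
}
```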

        dimir added a comment (edited)

        (1) [S] As I understand from the above-mentioned document, "initial update in progress" and "sensor scanning disabled" both mean, in general, that the sensor data is not available (either not yet ready or the entity does not exist). So I suggest simplifying the error message for the user like this:

                if (0 == ipmi_is_sensor_scanning_enabled(states))
                {
        -               h->err = zbx_strdup(h->err, "sensor scanning is not enabled");
        +               h->err = zbx_strdup(h->err, "sensor data is not available");
        @@ -376,7 +376,7 @@
         
                if (0 != ipmi_is_initial_update_in_progress(states))
                {
        -               h->err = zbx_strdup(h->err, "initial update is in progress");
        +               h->err = zbx_strdup(h->err, "sensor data is not available");
        

        Please see my suggestion in r46099.

        Aleksandrs Saveljevs Looks good, CLOSED.
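
        The effect of the suggestion in (1), where both conditions lead to the same user-facing message, can be illustrated with a small standalone sketch. The struct and function below are hypothetical stand-ins, not the Zabbix or OpenIPMI code; only the message text is taken from the diff above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the sensor state flags reported by the IPMI library. */
struct sensor_flags
{
	int	scanning_enabled;
	int	initial_update_in_progress;
};

/* Returns NULL when the sensor value can be used, otherwise the single
 * simplified message that covers both failure conditions. */
static const char	*sensor_data_error(const struct sensor_flags *flags)
{
	if (0 == flags->scanning_enabled || 0 != flags->initial_update_in_progress)
		return "sensor data is not available";

	return NULL;
}
```

        Collapsing the two cases keeps the message meaningful to a user who does not know the IPMI terminology, at the cost of hiding which of the two bits caused the failure.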

        dimir added a comment

        Other than that, the issue is tested.

        Aleksandrs Saveljevs added a comment (edited)

        (2) There was a conflict when merging from 2.0 into 2.2 because the newer version has discrete sensor support.

        The resolved conflicts are available in the development branch svn://svn.zabbix.com/branches/dev/ZBX-8188-2.2 . Please take a look.

        dimir Looks good. Please see my suggestions to simplify the code a bit in r46134. RESOLVED

        Aleksandrs Saveljevs Great! CLOSED.

        Aleksandrs Saveljevs added a comment

        Fixed in pre-2.0.13 r46114, pre-2.2.4 r46164, pre-2.3.2 (trunk) r46165.


          People

          • Assignee: Unassigned
          • Reporter: Kodai Terashima