  1. ZABBIX BUGS AND ISSUES
  2. ZBX-16061

Zabbix reporting incorrect IPMI sensor values since upgrading from v3.2.7


    • Type: Problem report
    • Resolution: Cannot Reproduce
    • Priority: Trivial
    • None
    • 4.2.1
    • Server (S)
    • None
    • Zabbix 4.2.1 installed on fully patched CentOS 7

      We have had an issue since upgrading from Zabbix 3.2.7 to 4.2.1.
      For IBM x3650 (M4) servers, Zabbix seems to be returning wrong/strange values for discrete DIMM sensors (and some discrete CPU sensors).
      The value Zabbix returns does not match the sensor state shown in the IBM IMM console, or the state reported by ipmitool.
      As a result, all of my existing triggers for DIMMs (and some CPU sensors) that use "band" (and were working fine before the upgrade) are reporting a Problem when there is no problem.

      Here is an example of a DIMM sensor.
      ipmitool shows a sensor status of "Presence Detected":

      [root@DWSMON1 ~]# ipmitool -H X.X.XX.XX -U XXXX -P XXXX sensor get "DIMM 1"
      Locating sensor record...
      Sensor ID              : DIMM 1 (0xb0)
       Entity ID             : 32.1
       Sensor Type (Discrete): Memory
       States Asserted       : Memory
                               [Presence Detected]

      This is Sensor Type 0Ch and Offset 06h.
      When I look at Latest data in Zabbix, I can see that the value being returned is 100. What is 100??
      The value I would expect it to return is 64. It was returning 64 prior to the upgrade.
      With 100 being returned, the wrong sensor state is triggered: currently Offset 05h ("Correctable ECC").
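
To make the numbers above easier to read, here is a minimal Python sketch that lists which state offsets are set in a reported value. It assumes, as the expected value of 64 for Offset 06h implies, that the item value is a bitmask in which bit N corresponds to offset N. The helper name `asserted_offsets` is made up for illustration and is not part of Zabbix or ipmitool.

```python
def asserted_offsets(value: int) -> list[int]:
    """Return the bit positions set in a discrete sensor value.

    Assumption: bit N of the value corresponds to IPMI state offset N.
    """
    return [bit for bit in range(value.bit_length()) if value & (1 << bit)]


print(asserted_offsets(64))   # [6]        -> only Offset 06h
print(asserted_offsets(100))  # [2, 5, 6]  -> Offsets 02h, 05h and 06h
```

Under that assumption, 100 still contains Offset 06h (the expected "Presence Detected") but also Offsets 02h and 05h, which is consistent with the Offset 05h trigger firing.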

      Here's an example of a CPU sensor.
      ipmitool shows a sensor status of "Presence detected":

      [root@DWSMON1 ~]# ipmitool -H X.X.XX.XX -U XXXX -P XXXX sensor get "CPU 1"
      Locating sensor record...
      Sensor ID              : CPU 1 (0x90)
       Entity ID             : 3.1
       Sensor Type (Discrete): Processor
       States Asserted       : Processor
                               [Presence detected]

       

      This is Sensor Type 07h and Offset 07h.
      When I look at Latest data in Zabbix, I can see that the value being returned is 296. What is 296??
      The value I would expect it to return is 128. It was returning 128 prior to the upgrade.
      Strangely, "CPU 2" is returning 128, but "CPU 1" is returning 296.
      With 296 being returned, the wrong sensor states are triggered. Currently triggering:
      1. Offset 05h ("Configuration Error")
      2. Offset 08h ("CPU Disabled")
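
The CPU values can be checked the same way. The Python sketch below mimics the kind of bitwise-AND test a "band" trigger performs, again assuming that bit N of the item value corresponds to offset N. `is_asserted` is a hypothetical helper, not a Zabbix function. The final line records an arithmetic observation only, not a confirmed cause: 100 and 296 are exactly what the digit strings "64" and "128" evaluate to when read as hexadecimal.

```python
def is_asserted(value: int, offset: int) -> bool:
    """Bitwise-AND test for a single state offset (assumes bit N == offset N)."""
    mask = 1 << offset
    return (value & mask) == mask


# Compare the expected value (128) with the value reported for "CPU 1" (296).
for offset in (5, 7, 8):
    print(f"Offset {offset:02X}h: 128 -> {is_asserted(128, offset)}, "
          f"296 -> {is_asserted(296, offset)}")

# Observation only: the decimal digits of the expected values, parsed as hex.
print(int("64", 16), int("128", 16))  # 100 296
```

With 296, Offsets 05h and 08h test as asserted while Offset 07h does not, which matches the two triggers listed above.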

       

      I have attached:

      1. Screenshot - Latest Data results
      2. Screenshot - Graph showing the history. You can see where the value changed from always being 64, to 100
      3. Zabbix_server log (DebugLevel=4)

       
      Let me know if you need anything else.

        1. DIMM 1 Example.JPG (49 kB)
        2. DIMM1 Latest Data.JPG (40 kB)
        3. zabbix_server.zip (2.65 MB)

            kpavars Kristians Pavars
            SimG SimG