[ZBX-16061] Zabbix reporting incorrect IPMI sensor values since upgrading from v3.2.7 Created: 2019 Apr 30  Updated: 2020 Jul 03  Resolved: 2020 Jun 16

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 4.2.1
Fix Version/s: None

Type: Problem report Priority: Trivial
Reporter: SimG Assignee: Kristians Pavars
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Zabbix 4.2.1 installed on fully patched CentOS 7


Attachments: JPEG File DIMM 1 Example.JPG     JPEG File DIMM1 Latest Data.JPG     Zip Archive zabbix_server.zip    

 Description   

We have an issue since upgrading from Zabbix 3.2.7 to 4.2.1.
For IBM x3650 (M4) servers, Zabbix seems to be returning wrong/strange values for discrete DIMM sensors (and some discrete CPU sensors).
The value Zabbix returns does not match the sensor state shown in the IBM IMM console or reported by ipmitool.
As a result, all of my existing triggers for DIMMs (and some CPU sensors) that use "band", which were working fine before the upgrade, now report a Problem when there is no problem.

Here is an example of a DIMM sensor.
ipmitool shows a sensor status of "Presence Detected":

[root@DWSMON1 ~]# ipmitool -H X.X.XX.XX -U XXXX -P XXXX sensor get "DIMM 1"
Locating sensor record...
Sensor ID              : DIMM 1 (0xb0)
 Entity ID             : 32.1
 Sensor Type (Discrete): Memory
 States Asserted       : Memory
                         [Presence Detected]

This is Sensor Type 0Ch and Offset 06h.
When I look at Latest data in Zabbix, I can see that the value being returned is 100. What is 100??
The value I would expect it to be returning is 64.  It was returning 64 prior to us upgrading.
Because 100 is being returned, the trigger for the wrong sensor state fires: currently Offset 05h ("Correctable ECC").
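
To make the arithmetic explicit: the expected value 64 is simply bit 6 set (Offset 06h). The sketch below decodes a discrete sensor value into the bit positions it contains, assuming each set bit corresponds to one asserted state offset. The helper name `asserted_offsets` is hypothetical and is not part of Zabbix or ipmitool.

```python
def asserted_offsets(value):
    """Return the bit positions set in a discrete sensor value.

    Assumption: each set bit N corresponds to IPMI state offset N.
    """
    return [bit for bit in range(value.bit_length()) if value & (1 << bit)]


# Value returned before the upgrade: only Offset 06h is set.
print(asserted_offsets(64))   # [6]

# Value returned after the upgrade: 100 = 64 + 32 + 4.
print(asserted_offsets(100))  # [2, 5, 6]
```

So 100 still contains bit 6, but it also contains bit 5, which is why a "band" trigger on Offset 05h fires.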

Here's an example of a CPU sensor.
ipmitool shows a sensor status of "Presence detected":

[root@DWSMON1 ~]# ipmitool -H X.X.XX.XX -U XXXX -P XXXX sensor get "CPU 1"
Locating sensor record...
Sensor ID              : CPU 1 (0x90)
 Entity ID             : 3.1
 Sensor Type (Discrete): Processor
 States Asserted       : Processor
                         [Presence detected]

 

This is Sensor Type 07h and Offset 07h.
When I look at Latest data in Zabbix, I can see that the value being returned is 296. What is 296??
The value I would expect it to be returning is 128. It was returning 128 prior to upgrading.
Strangely, "CPU 2" is returning 128, but "CPU 1" is returning 296.
Because 296 is being returned, the triggers for the wrong sensor states fire. Currently triggering:
1. Offset 05h ("Configuration Error")
2. Offset 08h ("CPU Disabled")
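
For reference, the check my triggers perform can be sketched as a bitwise AND against a per-offset mask. This is only an illustration of the logic, not the Zabbix implementation. The function names are hypothetical, and the state labels are the ones used in this report.

```python
# State labels as used in this report, keyed by offset (illustrative only).
CPU_STATES = {
    0x05: "Configuration Error",
    0x07: "Presence detected",
    0x08: "CPU Disabled",
}


def band_matches(value, mask):
    """Mimic a 'band'-style trigger check: every bit of mask is set in value."""
    return (value & mask) == mask


def fired_states(value):
    """List the labelled states whose single-bit mask matches the value."""
    return [name for offset, name in CPU_STATES.items()
            if band_matches(value, 1 << offset)]


print(fired_states(128))  # ['Presence detected']
print(fired_states(296))  # ['Configuration Error', 'CPU Disabled']
```

Note that 296 = 256 + 32 + 8, so bit 7 ("Presence detected") is not set at all in the value returned for "CPU 1", while bit 3 is set in addition to bits 5 and 8.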

 

I have attached:

  1. Screenshot - Latest Data results
  2. Screenshot - Graph showing the history. You can see where the value changed from always being 64, to 100
  3. Zabbix_server log (DebugLevel=4)

 
Let me know if you need anything else.



 Comments   
Comment by SimG [ 2019 Sep 23 ]

Hi

Any further news or information on this yet?

 

Thanks

Comment by Kristians Pavars [ 2020 Jun 15 ]

Hi SimG,

 

Sorry for the long delay on this issue. Can you please attach your item configuration with any preprocessing steps?

 

Thanks,
Kristiāns Pavārs

Comment by Kristians Pavars [ 2020 Jun 16 ]

Hi SimG,

 

Please consider upgrading to the latest 4.4.x or 5.x version, as 4.2 is a non-LTS version and is therefore no longer supported. I will be closing this issue.

 

Regards,
Kristiāns
