[ZBX-3186] Patch to add IPMI connection hack support Created: 2010 Nov 05  Updated: 2019 Aug 28  Resolved: 2019 Aug 28

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.8.3
Fix Version/s: None

Type: Incident report
Priority: Major
Reporter: Christoph Berg
Assignee: Unassigned
Resolution: Won't fix
Votes: 2
Labels: ipmi, macosx, patch
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian Linux 5.0 running a self-compiled Zabbix server 1.4.3 linked against a self-compiled libopenipmi 2.0.16 (Debian disabled OpenSSL support for reasons unknown)


Attachments: zabbix-1.8.3-support-for-ipmi-hacks-dbschema-1.patch, zabbix-1.8.3-support-for-ipmi-hacks-frontend-php.patch, zabbix-1.8.3-support-for-ipmi-hacks-server-1.patch
Issue Links:
Duplicate

 Description   

We use Zabbix to monitor Mac OS X Server hardware with an integrated IPMI solution provided by Intel. Because Intel IPMI controllers have a bug when RMCP+ is used for authentication, and Mac OS X does not seem to support an alternative algorithm, the Zabbix IPMI agent in its current form cannot read values from the chip.

Provided is a patch that allows activating all connection hacks supported by OpenIPMI. The only problem I could not solve is that mass update does not work in the PHP frontend because of the multiple checkboxes used to activate the hacks. Perhaps you have an idea how to solve this.
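
One possible way around the mass-update limitation is to store the selected hacks as a single bitmask and to let a mass update carry separate "set" and "clear" masks, so that checkboxes left untouched do not overwrite existing values. The sketch below is only an illustration in Python: the flag names and values are hypothetical and are taken neither from OpenIPMI nor from the attached patches.

```python
from enum import IntFlag


class ConnHack(IntFlag):
    """Hypothetical connection-hack flags (illustrative names and values)."""
    HACK_A = 1
    HACK_B = 2
    HACK_C = 4


def mass_update(current: int, set_mask: int, clear_mask: int) -> int:
    """Set the bits in set_mask, clear the bits in clear_mask, keep the rest."""
    return (int(current) | int(set_mask)) & ~int(clear_mask)


# A host with HACK_A and HACK_C enabled; the mass update enables HACK_B,
# disables HACK_A and does not touch HACK_C.
before = ConnHack.HACK_A | ConnHack.HACK_C
after = ConnHack(mass_update(before, ConnHack.HACK_B, ConnHack.HACK_A))
```

With this layout, a checkbox that is not part of the mass update appears in neither mask, so its stored bit is left as it was.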

One issue I encountered with the patch: if I change the refresh rate of the IPMI sensors from the default 30 seconds to 60 seconds, I get these messages:

31437:20101105:103732.519 Item [dn21-mac-web5:sensor[temp2]] error: Error 0x10000c3 while reading threshold sensor
31437:20101105:103732.525 IPMI Host [dn21-mac-web5]: first network error, wait for 15 seconds
31437:20101105:103826.528 Item [dn21-mac-web5:sensor[temp2]] error: Error 0x10000ff while reading threshold sensor
31437:20101105:103826.537 Disabling IPMI host [dn21-mac-web5]
31437:20101105:104029.840 Enabling IPMI host [dn21-mac-web5]
31437:20101105:104232.891 Item [dn21-mac-web5:sensor[temp2]] error: Error 0x10000c3 while reading threshold sensor
31437:20101105:104232.896 IPMI Host [dn21-mac-web5]: first network error, wait for 15 seconds
31437:20101105:104326.900 Item [dn21-mac-web5:sensor[temp2]] error: Error 0x10000ff while reading threshold sensor
31437:20101105:104326.909 Disabling IPMI host [dn21-mac-web5]
31437:20101105:104530.332 Enabling IPMI host [dn21-mac-web5]
31437:20101105:104732.379 Item [dn21-mac-web5:sensor[temp2]] error: Error 0x10000c3 while reading threshold sensor
31437:20101105:104732.384 IPMI Host [dn21-mac-web5]: first network error, wait for 15 seconds
31437:20101105:104826.388 Item [dn21-mac-web5:sensor[temp2]] error: Error 0x10000ff while reading threshold sensor
31437:20101105:104826.397 Disabling IPMI host [dn21-mac-web5]

Changing it back to the default fixes this issue. I don't know whether this is due to the patch or whether all IPMI sensors have this problem.

I split the patches into three parts, so they are, hopefully, easier to review.



 Comments   
Comment by richlv [ 2010 Nov 05 ]

hmm, very interesting. the biggest problems i see are that this is something very specific, and that it involves database schema changes.

maybe it's worth starting a thread on the forum (if there's not one already) to find out how many other people are seeing these issues

Comment by Christoph Berg [ 2010 Nov 08 ]

I just noticed that activating the hack is not necessary on two of the three machines we want to monitor. The series with this bug is the Apple Xserve 1.1 series, but I'm in no position to replace this hardware, so we have to live with its "interpretation" of the spec.

As suggested, I will start a forum thread, but looking through the posts dealing with IPMI, it sounds like nobody has had this problem before.

Comment by Christian Nancy [ 2010 Nov 09 ]

Slightly different although somewhat related:

I just upgraded from 1.8.2 to 1.8.3. Monitoring a Dell PE 2950 via IPMI. Before the upgrade my log was showing:

25622:20101108:214502.685 IPMI Host [Dell PowerEdge 2950]: another network error, wait for 15 seconds
25638:20101108:214616.733 IPMI Host [Dell PowerEdge 2950]: first network error, wait for 15 seconds
25638:20101108:214922.853 IPMI Host [Dell PowerEdge 2950]: first network error, wait for 15 seconds

But the sensor would be read and the data would show in Zabbix.

After the upgrade, the log shows:

27596:20101108:222031.838 Enabling IPMI host [Dell PowerEdge 2950]
27595:20101108:222422.152 Item [Dell PowerEdge 2950:Ambient_Temp] error: Error 0x10000c3 while reading threshold sensor
27595:20101108:222422.166 IPMI Host [Dell PowerEdge 2950]: first network error, wait for 15 seconds
27595:20101108:222516.173 Item [Dell PowerEdge 2950:Ambient_Temp] error: Error 0x10000ff while reading threshold sensor
27595:20101108:222516.178 Disabling IPMI host [Dell PowerEdge 2950]

This repeats again and again, and there is no readout in Zabbix.

Nothing else was changed on the Zabbix server, and ipmitool on the command line returns correct readings. I did not apply the patch discussed in this bug report as it doesn't pertain to my situation.

So something in the way Zabbix uses ipmitool has changed from 1.8.2 to 1.8.3, and readings no longer work (unless of course this is an operator error, which is entirely possible).

Please let me know if this is off-topic and where I should post if that's the case.

Thanks,
Christian

Comment by Christoph Berg [ 2010 Nov 09 ]

As I said above, you can try reducing the time between two sensor checks. In my case switching back from 60 to 30 seconds worked.

Comment by Christoph Berg [ 2010 Nov 09 ]

Added a forum post for discussion: http://www.zabbix.com/forum/showthread.php?p=75170

Comment by Christian Nancy [ 2010 Nov 10 ]

It appears that in my case the error only happens every few minutes (varies) and most other times, it polls the data correctly. It has happened 427 times since I upgraded and restarted at 11/08 22:09. I've changed the interval to 30 seconds as suggested but that didn't have any impact.

I think I'll leave it at that for now. I changed the interval to 5 minutes to minimize the chatty logs.

Thanks,
Christian
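
A failure count like the one mentioned in the comment above can be taken from the server log. The Python sketch below assumes the line layout seen in this ticket (PID:YYYYMMDD:HHMMSS.mmm followed by the message); the helper names are hypothetical and the sample lines are copied from the log excerpt posted earlier in this thread.

```python
import re
from collections import Counter
from datetime import datetime

# Log line layout as seen in this ticket: PID:YYYYMMDD:HHMMSS.mmm message
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6})\.(\d{3})\s+(.*)$")
ERROR_RE = re.compile(r"Error (0x[0-9a-fA-F]+) while reading threshold sensor")


def parse_line(line):
    """Return (pid, timestamp, message), or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    pid, date, time, millis, message = match.groups()
    stamp = datetime.strptime(f"{date} {time}.{millis}", "%Y%m%d %H%M%S.%f")
    return int(pid), stamp, message


def count_sensor_errors(lines):
    """Count threshold-sensor read errors per error code."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip lines that are not in the expected layout
        found = ERROR_RE.search(parsed[2])
        if found:
            counts[found.group(1).lower()] += 1
    return counts


sample = [
    "27596:20101108:222031.838 Enabling IPMI host [Dell PowerEdge 2950]",
    "27595:20101108:222422.152 Item [Dell PowerEdge 2950:Ambient_Temp] error: Error 0x10000c3 while reading threshold sensor",
    "27595:20101108:222422.166 IPMI Host [Dell PowerEdge 2950]: first network error, wait for 15 seconds",
    "27595:20101108:222516.173 Item [Dell PowerEdge 2950:Ambient_Temp] error: Error 0x10000ff while reading threshold sensor",
    "27595:20101108:222516.178 Disabling IPMI host [Dell PowerEdge 2950]",
]
print(count_sensor_errors(sample))
```

Because each parsed line also carries a timestamp, the same helper could be used to see how the failures are spread over time when comparing different polling intervals.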

Comment by Rick [ 2010 Nov 13 ]

I've encountered the same issue trying to use OpenIPMI's ipmilan tool to create a UDP server for Zabbix to query. If you run ipmilan with the -n and -d options, you can see that Zabbix leaves the connection open but the ipmilan server "times the connection out". After that, successive queries fail with an invalid SID message.

I see a patch in 1.8.4 that seems to refer to dropping and re-establishing connections, which I'm going to investigate this week; I will post my observations.

Comment by Sergey Syreskin [ 2010 Nov 29 ]

It seems that I have a similar problem: https://support.zabbix.com/browse/ZBX-3243?focusedCommentId=35764

Comment by Blakkheim.GW [ 2011 Mar 07 ]

I have exactly the same issue as the one described by Christian Nancy. Since upgrading from 1.8.2 to 1.8.3, I have these errors in the Zabbix logs and data is not polled correctly; only one check is done from time to time. Reducing the interval has no effect.

Error 0x10000c3 while reading threshold sensor
and then:
Error 0x10000ff while reading threshold sensor

I've just upgraded to 1.8.4 today, but the issue is still here.

Comment by Aleksandrs Saveljevs [ 2016 Jan 26 ]

Does anybody still experience this issue with the latest versions of Zabbix?

Comment by Tyler French [ 2017 Oct 02 ]

@Aleksandrs Saveljevs Late and random, but I just happened to notice that this issue occurs and then "fixes" itself on some SuperMicro X10DRW-E devices that we are monitoring via IPMI. The error reported was:

error 0x10000ff while reading threshold sensor

Comment by Glebs Ivanovskis (Inactive) [ 2017 Nov 29 ]

Dear frenchtoasters, I would suggest opening a new ticket. This one seems to be long dead.

Comment by Vladislavs Boborikins (Inactive) [ 2019 Aug 28 ]

Hello,

Since this version of Zabbix is no longer supported, we've decided not to prioritize this bug for the near future and close the issue with "Won't fix" resolution.

Please let us know if this decision should be reconsidered.

Regards
Vladislavs
