[ZBX-12360] IPMI Agent get wrong value Created: 2017 Jul 07 Updated: 2026 Mar 24 |
|
| Status: | Reopened |
| Project: | ZABBIX BUGS AND ISSUES |
| Component/s: | Server (S) |
| Affects Version/s: | 3.2.6 |
| Fix Version/s: | None |
| Type: | Problem report | Priority: | Trivial |
| Reporter: | BM | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | ipmi | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Sprint: | Sprint 12, Sprint 13, Sprint 14, Sprint 15, Sprint 16, Sprint 17, Sprint 18 | ||||||||
| Story Points: | 4 | ||||||||
| Description |
|
Hello, I have an issue where the Zabbix IPMI agent gets wrong values. With OpenIPMI commands I get the right values:
sofdevzabbix01:/home/bob# ipmitool -I lanplus -H 172.18.131.80 -U -P sdr
Only the temperature sensors get correct values. |
| Comments |
| Comment by Andrea Biscuola (Inactive) [ 2017 Jul 17 ] |
|
Can't reproduce this. On our devices that have IPMI, the fan speed is reported in RPM. BM, can you remove the unit from the fan item configuration and check what the raw values are without any potential conversion? |
| Comment by BM [ 2017 Jul 17 ] |
|
The unit has been removed from the Fan items in the template. The template was unlinked with "Unlink and clear", then linked to the host again. The raw values are the same: I get "1" for fan speed. I should point out that the values come from an iLO on a DL380p Gen8. |
| Comment by Andrea Biscuola (Inactive) [ 2017 Jul 17 ] |
|
BM, can you also post the output of the "sensor" command instead of "sdr"? |
| Comment by BM [ 2017 Jul 17 ] |
|
Yes, here it is:
bob@sofdevzabbix01:~$ ipmitool -I lanplus -H 172.18.131.80 -U -P usensor |
| Comment by Andrea Biscuola (Inactive) [ 2017 Jul 17 ] |
|
BM, I have a theory about how the bug originates, but I need you to run the server with an increased logging level (change the DebugLevel parameter to 4 in the zabbix_server.conf file). Thanks |
| Comment by BM [ 2017 Jul 18 ] |
|
The log at debug level 4 is available at this link: https://drive.google.com/open?id=0BxYYuq2hl7h9TWRzdVdiejY4R2c |
| Comment by Andrea Biscuola (Inactive) [ 2017 Jul 20 ] |
|
So, after a lot of thinking, I believe I understand why BM receives wrong values. It took me a while and a lot of reading of the IPMI version 2.0 specification. The Power Supply and Fan sensors in BM's hardware are 'discrete' (see section 42.1 of the IPMI 2.0 specification).

	switch (s->reading_type)
	{
		case IPMI_EVENT_READING_TYPE_THRESHOLD:
			if (0 != (ret = ipmi_sensor_get_reading(s->sensor, zbx_got_thresh_reading_cb, h)))
			{
				/* do not use pointer to sensor here - the sensor may have disappeared during */
				/* ipmi_sensor_get_reading(), as domain might be closed due to communication failure */
				h->err = zbx_dsprintf(h->err, "Cannot read sensor \"%s\"."
						" ipmi_sensor_get_reading() return error: 0x%x", id_str, ret);
				h->ret = NOTSUPPORTED;
				goto out;
			}
			break;
		case IPMI_EVENT_READING_TYPE_DISCRETE_USAGE:
		case IPMI_EVENT_READING_TYPE_DISCRETE_STATE:
		case IPMI_EVENT_READING_TYPE_DISCRETE_PREDICTIVE_FAILURE:
		case IPMI_EVENT_READING_TYPE_DISCRETE_LIMIT_EXCEEDED:
		case IPMI_EVENT_READING_TYPE_DISCRETE_PERFORMANCE_MET:
		case IPMI_EVENT_READING_TYPE_DISCRETE_SEVERITY:
		case IPMI_EVENT_READING_TYPE_DISCRETE_DEVICE_PRESENCE:
		case IPMI_EVENT_READING_TYPE_DISCRETE_DEVICE_ENABLE:
		case IPMI_EVENT_READING_TYPE_DISCRETE_AVAILABILITY:
		case IPMI_EVENT_READING_TYPE_DISCRETE_REDUNDANCY:
		case IPMI_EVENT_READING_TYPE_DISCRETE_ACPI_POWER:
		case IPMI_EVENT_READING_TYPE_SENSOR_SPECIFIC:
		case 0x70:	/* reading types 70h-7Fh are for OEM discrete sensors */
		case 0x71:
		case 0x72:
		case 0x73:
		case 0x74:
		case 0x75:
		case 0x76:
		case 0x77:
		case 0x78:
		case 0x79:
		case 0x7a:
		case 0x7b:
		case 0x7c:
		case 0x7d:
		case 0x7e:
		case 0x7f:
			if (0 != (ret = ipmi_sensor_get_states(s->sensor, zbx_got_discrete_states_cb, h)))
			{
				/* do not use pointer to sensor here - the sensor may have disappeared during */
				/* ipmi_sensor_get_states(), as domain might be closed due to communication failure */
				h->err = zbx_dsprintf(h->err, "Cannot read sensor \"%s\"."
						" ipmi_sensor_get_states() return error: 0x%x", id_str, ret);
				h->ret = NOTSUPPORTED;
				goto out;
			}
			break;
		default:
			s_reading_type_string = ipmi_sensor_get_event_reading_type_string(s->sensor);

			h->err = zbx_dsprintf(h->err, "Cannot read sensor \"%s\"."
					" IPMI reading type \"%s\" is not supported", id_str, s_reading_type_string);
			h->ret = NOTSUPPORTED;
			goto out;
	}

This switch statement processes the various types of 'readings' that a sensor exposes. In BM's case, we can see from the provided log that the temperature sensors (all the ones with correct values) have a reading type of 0x1, so they are purely threshold-based, while the others are 'generic discrete' (for example, the Fan sensors have type 0xa, which falls in the 'Generic discrete' category).
I hope it's clear where the problem lies |
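
For readers following the analysis above, here is a minimal sketch of how the event/reading type codes group into the categories handled by the switch statement. The helper name reading_type_class() is hypothetical and is not part of Zabbix or OpenIPMI. The numeric ranges are taken from the cases listed in the switch (0x01 for threshold, 0x70-0x7f for OEM discrete) and, as an assumption based on the IPMI 2.0 event/reading type table, 0x02-0x0c for generic discrete and 0x6f for sensor-specific.

```c
/* Hypothetical helper for illustration only; not Zabbix or OpenIPMI code. */
/* Maps an IPMI event/reading type code to the category the switch above  */
/* would route it to.                                                      */
static const char	*reading_type_class(unsigned int reading_type)
{
	/* 0x01: threshold-based sensors, read with ipmi_sensor_get_reading() */
	if (0x01 == reading_type)
		return "threshold";

	/* 0x02-0x0c: generic discrete sensors (assumed range, IPMI 2.0 table) */
	if (0x02 <= reading_type && reading_type <= 0x0c)
		return "generic discrete";

	/* 0x6f: sensor-specific discrete sensors (assumed value, IPMI 2.0 table) */
	if (0x6f == reading_type)
		return "sensor-specific discrete";

	/* 0x70-0x7f: OEM discrete sensors, as noted in the comment in the switch */
	if (0x70 <= reading_type && reading_type <= 0x7f)
		return "OEM discrete";

	/* everything else ends up in the 'default' branch (NOTSUPPORTED) */
	return "unsupported";
}
```

With this sketch, the temperature sensors from the log (type 0x1) map to "threshold", while the Fan sensors (type 0xa) map to "generic discrete", so they are read through ipmi_sensor_get_states() rather than ipmi_sensor_get_reading().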