[ZBX-282] Sensors data is incorrectly reading from procfs instead of sysfs on newer Linux kernels. Created: 2008 Jan 21  Updated: 2017 May 30  Resolved: 2013 Oct 15

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 1.4, 1.4.1, 1.4.2, 1.4.3
Fix Version/s: 2.1.2

Type: Incident report Priority: Minor
Reporter: Richard Hurt Assignee: Unassigned
Resolution: Fixed Votes: 7
Labels: items, linux
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Linux x86

Attachments: File linux-2.6.x-sensors.patch     File linux-2.6.x-sensors.patch     File linux-newer-kernel-sensors.patch     File linux-newer-kernel-sensors.patch     File zabbix-newer-linux-kernel-sensors.patch     File zabbix-newer-linux-kernel-sensors.patch     File zabbix-newer-linux-kernel-sensors.patch     File zabbix-newer-linux-kernel-sensors.patch     File zabbix_1.6.2_sensor.patch     File zabbix_1.6.2_sensor.patch    

 Description   

After seeing this post in the forums (http://www.zabbix.com/forum/showthread.php?t=6279), I decided to look into it further. It appears that sensors.c still uses the old procfs interface to get readings such as CPU and motherboard temperature. Linux has been moving away from procfs (at least for this data) for quite a while and now exposes it through sysfs. Sysfs provides a much cleaner, better-defined interface and allows for one "file" per device.

There is a workaround for this problem, as highlighted in this forum post (http://www.zabbix.com/forum/showthread.php?t=2508), but it's not clean and forces users to go searching for a solution to a problem that really shouldn't exist.



 Comments   
Comment by Richard Hurt [ 2008 Jan 21 ]

FYI: I have confirmed that this problem exists on the current development branch (1.5) of the Zabbix source.

zabbix-1.5/src/libs/zbxsysinfo/linux/sensors.c

Comment by Bart Verwilst [ 2008 Sep 14 ]

Still present at today's SVN checkout. Seems like a /ignore to me

Comment by Alexandru Ticlea [ 2009 Mar 16 ]

I solved this problem on Zabbix 1.6.2 with the attached patch. I use kernel 2.6.18-6-686 from Debian.

Comment by Trever L. Adams [ 2011 Jun 15 ]

This patch fixes the sensors for 2.6.x kernels. The Linux template needs to be modified to use sensor[hwmon0,temp1], sensor[hwmon1,temp1], etc.

This is because some boards will have the proper sensors on hwmon1 and others on hwmon0.

Additionally:

There are two types of devices in 2.6.x. One has the files directly in the hwmonX directory; the other has them in the device subdirectory. I don't know the reason for the difference. Examples:

/sys/class/hwmon/hwmon1
device fan2_max in0_label in2_input in3_min temp1_max
fan1_input fan2_min in0_max in2_label name temp2_crit
fan1_label fan3_input in0_min in2_max power temp2_input
fan1_max fan3_label in1_input in2_min subsystem temp2_label
fan1_min fan3_max in1_label in3_input temp1_crit temp2_max
fan2_input fan3_min in1_max in3_label temp1_input uevent
fan2_label in0_input in1_min in3_max temp1_label

/sys/class/hwmon/hwmon0/device
broken_parity_status hwmon power temp1_crit_hyst
class irq remove temp1_input
config local_cpulist rescan temp1_max
consistent_dma_mask_bits local_cpus resource uevent
device modalias subsystem vendor
dma_mask_bits msi_bus subsystem_device
driver name subsystem_vendor
enable numa_node temp1_crit

inX_max is the maximum voltage that should be reported
inX_min is the minimum voltage that should be reported
inX_label is the label (name) of the input
fanX_min, fanX_max and fanX_label are similar

*X_input is the actual value and should be divided by 1000 (except for fans)

tempX_whatever is the same as inX and fanX
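
A minimal sketch of the scaling rule above, assuming the sysfs hwmon layout shown in the listings (voltages in millivolts, temperatures in millidegrees Celsius, fan speeds in plain RPM). The helper name read_sensor and the MILLI_TYPES set are hypothetical and are not taken from the attached patches.

```python
import os
import re

# Attribute types whose raw sysfs value is in thousandths of the unit
# (millivolts, millidegrees Celsius). Fan speeds are plain RPM.
# Hypothetical names, used for illustration only.
MILLI_TYPES = {"in", "temp"}


def read_sensor(hwmon_dir, sensor, item="input"):
    """Return the scaled value of e.g. temp1_input found under hwmon_dir."""
    match = re.fullmatch(r"([a-z]+)(\d+)", sensor)
    if match is None:
        raise ValueError("unexpected sensor name: %r" % sensor)
    path = os.path.join(hwmon_dir, "%s_%s" % (sensor, item))
    with open(path) as handle:
        raw = int(handle.read().strip())
    # Divide by 1000 for voltages and temperatures, leave fans untouched.
    return raw / 1000.0 if match.group(1) in MILLI_TYPES else float(raw)
```

For example, read_sensor("/sys/class/hwmon/hwmon1", "temp1") would read temp1_input, and passing item="max" would read the threshold file instead.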

I would be willing to work on a set of changes for 2.6.x kernels that would search the *X_label files to find what people are looking for, since which of temp1 and temp2 is the CPU and which is the MB is not stable from board to board. This does add the overhead of opening an extra file, etc.

The same could be done for hwmonX, as the name file has the name of the device such as atk0110 (ACPI device on one of my boards) and k10temp which is the CPU's own sensors.
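
The label search proposed above could look roughly like this. It is only a sketch under the assumption that labels live in <type><number>_label files next to the readings; find_by_label is a hypothetical name, and sensors without a _label file are simply not found.

```python
import os


def find_by_label(hwmon_dir, wanted):
    """Return the sensor prefix (e.g. 'temp2') whose *_label file holds wanted."""
    suffix = "_label"
    for entry in sorted(os.listdir(hwmon_dir)):
        if not entry.endswith(suffix):
            continue
        with open(os.path.join(hwmon_dir, entry)) as handle:
            if handle.read().strip() == wanted:
                # 'temp2_label' -> 'temp2'
                return entry[: -len(suffix)]
    return None  # no label file matched
```

Every lookup walks all label files in the directory, which is the extra overhead mentioned above.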

If someone can tell me how to expose the max/min threshold values without interfering with the avg, min and max modes, I will be glad to implement it so that people can use them. These should also be usable in triggers: for example, sensor[hwmon0,temp1] > sensor[hwmon0,temp1_max] (however the latter ends up being accessed) should fire if the temperature is too high.

Comment by Trever L. Adams [ 2011 Jun 15 ]

As per my other patch in another bug, it would be great if this was applied in a 1.8.x tree so Fedora will pick it up for the current release.

Comment by Trever L. Adams [ 2011 Jun 15 ]

Sorry, I uploaded the wrong patch. It can be erased. This is the correct patch.

Comment by Trever L. Adams [ 2011 Jun 16 ]

This version of the patch brings functionality on par with 2.4.x kernels. min,max,avg work. This properly returns values for in (voltages), fan (RPM), temp (degrees Celsius).

This still has the same bug as this code did for 2.4.x in that it assumes temp1,2,3 is the same on each board (1 is MB, 2 CPU1, 3 CPU2, etc.).

To fix this bug is fairly easy given that sensor[device,CPU Temperature] will work. (I have not tried spaces as arguments here.)

It will also be somewhat slow as all of the typeX_label files would have to be walked and not all sensors have _label. This could be sped up with a lookup table, except (both are my ignorance potentially):

1) I would need to know how to do a setup/teardown function for the sensors so that a small table can be malloced/freed, etc. and any setup for #2

2) I would need to know zabbix's locking model to work with said table safely.

Comment by Trever L. Adams [ 2012 Jan 09 ]

This patch is now updated to work with 3.1 kernels (3.0 not handled, although it would be easy to do as it is just autoconf junk).

This also now works whether the new kernel has the item in $device/device or $device. It does this by checking where the "name" file is.

This patch would be better if someone more familiar with zabbix would help me know the best way (zabbix coding standards) to have a lookup table such that it has:

device name (hwmon0, hwmon1, etc.)
alias (the contents of the "name" file)
path (/sys/class/hwmon/hwmonX and if it has everything under device, /device on the end)

This would allow this code to function much faster and for people to use the alias (as this will be common to many boards within a company, while X in hwmonX will not necessarily be so).
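
A rough sketch of such a lookup table, assuming the three fields listed above and the "name" file rule from this comment. The function names build_hwmon_table and resolve are hypothetical; how Zabbix itself would store or lock such a table is not covered here.

```python
import os


def build_hwmon_table(root="/sys/class/hwmon"):
    """Map each hwmonX entry to its alias (chip name) and attribute path."""
    table = []
    for device in sorted(os.listdir(root)):
        base = os.path.join(root, device)
        # The attributes live wherever the "name" file is found:
        # either in hwmonX itself or in hwmonX/device.
        for path in (base, os.path.join(base, "device")):
            name_file = os.path.join(path, "name")
            if os.path.isfile(name_file):
                with open(name_file) as handle:
                    alias = handle.read().strip()
                table.append({"device": device, "alias": alias, "path": path})
                break
    return table


def resolve(table, key):
    """Accept either a device name (hwmon1) or an alias (k10temp)."""
    for row in table:
        if key in (row["device"], row["alias"]):
            return row["path"]
    return None
```

Building the table once and reusing it avoids reopening every name file on each request; if two chips share an alias, this sketch returns the first match.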

I would greatly appreciate this patch being reviewed and applied if possible. This is against 1.8.10.

Comment by Trever L. Adams [ 2012 Jan 10 ]

A slight clean up, making the code easier to understand and removing a string copy.

Comment by Trever L. Adams [ 2012 Jan 24 ]

This does not include 3.0 support because I am lazy and I don't know of anyone using it. It does support 2.6.x, 3.1.x and 3.2.x.

Comment by Trever L. Adams [ 2012 Apr 01 ]

This is an updated patch. It should work for all the 3.x series of kernels unless the interface changes.

Comment by Trever L. Adams [ 2012 Sep 07 ]

Alexei would you mind saying what RTF is?

Comment by Victor [ 2012 Oct 03 ]

Confirmed on a 3.3.8 kernel (Gentoo). A trace shows no attempts to read data from procfs at all.
Maybe something is wrong in the makefile?

Comment by Trever L. Adams [ 2012 Oct 03 ]

The patch needs updating. It may be a few days before I get to it. The makefile needs to change the 3.x.x checks to match any 3.x.x kernel. The define it creates should probably be renamed to reflect that, e.g. KERNEL_3_X, and all the ifdefs that currently test for specific 3.x.x versions need to be adjusted to use it.
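
The matching logic described here can be illustrated as follows. This only sketches the version test; the real check lives in the autoconf/makefile machinery, which is not shown. KERNEL_2_4 and KERNEL_3_X are names that appear in this thread, while KERNEL_2_6 and the function kernel_define are hypothetical.

```python
import re


def kernel_define(release):
    """Map a `uname -r` string to a build-time macro name (illustrative)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    if (major, minor) == (2, 4):
        return "KERNEL_2_4"
    if (major, minor) == (2, 6):
        return "KERNEL_2_6"  # hypothetical macro name
    if major == 3:
        # Any 3.x.x release, without listing 3.1, 3.2, ... individually.
        return "KERNEL_3_X"
    return None
```

Matching on the major number alone is what keeps the check from needing an update for every new 3.x release.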

Comment by Victor [ 2012 Oct 18 ]

Good day. Trever, do you need additional info?

Comment by Trever L. Adams [ 2012 Oct 18 ]

No, the problem is only the autoconf/makefile/define issue mentioned above. I just haven't had a moment to fix the patch yet. I will try to fix it in the next few minutes.

Comment by Trever L. Adams [ 2012 Oct 18 ]

I haven't tested this with 2.4.x kernels because I have none. However, it is largely unchanged for that.

Assuming the interface won't change in future 3.x or 4.x kernels, this will work for kernels >=2.4.x and <=5.x.

If 5.x doesn't change the interface, only configure.in will need to be changed. This is working for 3.6.x here.

Comment by Victor [ 2012 Oct 19 ]

Incredible.

# zabbix_agentd -p | grep sensor
sensor[w83781d-i2c-0-2d,temp1] [m|ZBX_NOTSUPPORTED]

# sensors
via_cputemp-isa-0000
Adapter: ISA adapter
Core 0: +35.0 C
cpu0_vid: +0.684 V

vt1211-isa-6000
Adapter: ISA adapter
in0: +2.10 V (min = +0.00 V, max = +2.63 V)
in1: +2.09 V (min = +0.00 V, max = +2.63 V)
in2: +1.99 V (min = +0.00 V, max = +2.63 V)
in3: +0.68 V (min = +0.00 V, max = +2.63 V)
+3.3V: +3.37 V (min = +0.00 V, max = +4.18 V)
fan1: 0 RPM (min = 0 RPM, div = 2)
fan2: 0 RPM (min = 0 RPM, div = 2)
temp1: +106.0 C (high = +255.0 C, hyst = +0.0 C)
SIO Temp: +59.0 C (high = +204.0 C, hyst = +0.0 C)
temp7: +0.7 C (high = -0.0 C, hyst = +2.6 C)
cpu0_vid: +1.708 V

After applying the patch:

# zabbix_get -s <ip> -k 'sensor[vt1211-isa-6000,temp1]'
ZBX_NOTSUPPORTED

I can't understand: if sensor support is enabled, why ZBX_NOTSUPPORTED?
If it is disabled, why does zabbix_agentd -p print sensor[...?

Where was w83781d-i2c-0-2d found? I don't seem to have such a sensor.

Comment by Trever L. Adams [ 2012 Oct 19 ]

The problem is that "vt1211-isa-6000" isn't the name of the /sys/class/hwmon entry. It will be something like "hwmon0" or "hwmon1" (which is what the existing 2.4.x interface requires). If you look in the /sys/class/hwmon/ directory, you will find entries that have a "file" called name, either directly or as device/name. That name will hold "w83781d" or, in your case, "vt1211".

I would like to have it do the lookup for you, but I do not know enough about zabbix internals and coding standards to know how best to store a directory/device name lookup table (it would be slow to always look in the name file... so it should be remembered).

I hope this solves your problem.

Comment by Victor [ 2012 Oct 19 ]

# cat /sys/class/hwmon/hwmon0/device/name
via_cputemp
# cat /sys/class/hwmon/hwmon1/device/name
vt1211

~ # zabbix_get -s <IP> -k 'sensor[vt1211,temp1]'
ZBX_NOTSUPPORTED

I think that if agentd gets an incorrect sensor name, the error should be "not found" or "incorrect" or something similar.
NOTSUPPORTED reads as "no sensor support", doesn't it?

I tried strace, but did not see any access to /sys/class/hwmon/* at all.

OK, so how does zabbix_agentd know that sensors are supported?

Comment by Victor [ 2012 Oct 19 ]

It seems to be a Gentoo ebuild problem; I think this is the part that got compiled:

#else

/* fallback stub: compiled when neither kernel macro below is defined */
int GET_SENSOR(const char *cmd, const char *param, unsigned flags, AGENT_RESULT *result)
{
	return SYSINFO_RET_FAIL;
}

#endif /* KERNEL_2_4 || KERNEL_2_6_Xplus */

Comment by Trever L. Adams [ 2012 Oct 19 ]

I am afraid you misunderstood me.

"# cat /sys/class/hwmon/hwmon1/device/name
vt1211"

So, you should be doing ~ # zabbix_get -s <IP> -k 'sensor[hwmon1,temp1]'

Comment by Victor [ 2012 Oct 21 ]

Hmm, it works! Thank you! But I think "ZBX_NOTSUPPORTED" is a bad idea here; something like "incorrect dev name" would be better.

Comment by Victor [ 2013 Feb 28 ]

Hi again. 3.7.9 kernel, 1.8.16 agent, config still the same.

# zabbix_get -s <IP> -k 'sensor[hwmon1,temp1]'
ZBX_NOTSUPPORTED

Comment by Victor [ 2013 Mar 20 ]

Hi Trever, same in version 2.0.5. It seems all the code in sensors.c is only for the 2.4 kernel.

Comment by Trever L. Adams [ 2013 May 03 ]

This is a cleaned up patch for 2.0.x of Zabbix. The autoconf is cleaned up and works better. It shouldn't require updating unless/until the kernel interface changes again.

Zabbix developers, you still have the sensors.c code, but it hasn't been useful on current kernels in many years. Please consider this patch.

Comment by Trever L. Adams [ 2013 May 03 ]

Victor, it appears you didn't apply the patch, which needed updating. My Zabbix install may be going away if this and other bugs are not fixed. So, please, bug the Zabbix developers about accepting this patch.

Comment by Igors Homjakovs (Inactive) [ 2013 Jun 26 ]

Thank you for active involvement. We are currently working on this issue.

Comment by dimir [ 2013 Jun 27 ]

Thanks a lot for the effort you guys have put into it. We decided it's time to spend some time on this.

First of all, we noticed there is a huge number of Linux kernel hwmon modules that provide an interface for monitoring sensors. The problem is that many of them differ in the way they provide access to the data (e.g. the coretemp module provides support for Intel Core chips):

https://www.kernel.org/doc/Documentation/hwmon/

Some GNU/Linux distros have a command called "sensors-detect" that auto-detects the sensors available on the system, and a program called "sensors" that lists them. The available sensors are separated by type of hardware: RAM sensors, CPU core sensors, etc. We would like to integrate something like that into Zabbix, so that the user will not have to specify an hwmon channel (hwmon0, hwmon1, hwmonX...) and item (temp1, temp2, tempX...) but a type of hardware and an ID, e.g. sensor[cpu0, temp] or sensor[ram0, temp]. So currently we are studying how the sensors auto-detection program determines the type of hardware that sensors belong to.

Comment by Igors Homjakovs (Inactive) [ 2013 Jun 27 ]

All sensor chips are located in /sys/class/hwmon/hwmon*. If there is no /sys/class/hwmon/hwmon*/device directory, then the device is treated as virtual. In this case, the files are located inside the hwmon* directory, otherwise, inside hwmon*/device directory.

The common scheme for naming the files that contain sensor readings inside any of the directories mentioned above is: <type><number>_<item>

where:
types for sensor chips are "in" (voltage), "temp" (temperature), "fan" (fan), etc.;
items are "input" (measured value), "max" (high threshold), "min" (low threshold), etc.;
number is always used for elements that can be present more than once (it usually starts from 1, except for voltages, which start from 0). If files do not refer to a specific element, they have a simple name with no number. We also noticed that on some devices numbering doesn't start from 0 or 1, which means that the numbering can be device-specific.

A file, called name, located inside hwmon* or hwmon*/device directories contains the name of the chip. According to the specification this is the mandatory attribute.
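
The <type><number>_<item> scheme can be sketched as a small parser. This is an illustration under the naming rules stated above, not code from the Zabbix agent; ATTR_RE and parse_attr are hypothetical names.

```python
import re

# <type><number>_<item>, e.g. temp1_input or in0_max.
ATTR_RE = re.compile(r"^([a-z]+)(\d+)_(\w+)$")


def parse_attr(filename):
    """Split a sysfs attribute file name into (type, number, item)."""
    match = ATTR_RE.match(filename)
    if match is None:
        return None  # unnumbered files such as "name" or "uevent"
    kind, number, item = match.groups()
    return kind, int(number), item
```

Because the number is parsed rather than assumed, the sketch does not care whether a device starts counting at 0, 1 or some other value.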

The sensors-detect program (lm-sensors package) helps to determine which modules are necessary for the available sensors. When the modules are loaded, the program called sensors can be used to show the readings of all sensor chips. The labeling of sensor readings used by this program can differ from the common naming scheme (<type><number>_<item>):

  • if there is a file called <type><number>_label, then the label inside this file will be used instead of the <type><number>_<item> name;
  • if there is no <type><number>_label file, then the program searches /etc/sensors.conf (could also be /etc/sensors3.conf, or different) for a name substitution.

This labeling allows user to determine what kind of hardware is used. If there is neither <type><number>_label file nor label inside the configuration file the type of hardware can be determined by the name attribute (hwmon*/device/name).
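
The fallback order described above can be sketched as follows. The function display_label and the conf_labels dictionary are hypothetical; the dictionary stands in for label substitutions already read from the configuration file, whose parsing is not shown. The sketch mirrors the order given in this comment, and the sensors program itself may apply a different precedence.

```python
import os


def display_label(hwmon_dir, sensor, conf_labels=None):
    """Pick the display name for a sensor such as 'in2'."""
    # 1. A <type><number>_label file, if the driver provides one.
    label_file = os.path.join(hwmon_dir, sensor + "_label")
    if os.path.isfile(label_file):
        with open(label_file) as handle:
            return handle.read().strip()
    # 2. A substitution from the configuration file (assumed pre-parsed).
    if conf_labels and sensor in conf_labels:
        return conf_labels[sensor]
    # 3. Fall back to the raw <type><number> name.
    return sensor
```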

In the sensors program, the available sensors are separated by bus type (ISA adapter, PCI adapter, SPI adapter, Virtual device, ACPI interface, HID adapter).

Comment by Igors Homjakovs (Inactive) [ 2013 Jul 09 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-282

Comment by Andris Zeila [ 2013 Jul 12 ]

(1) proposed formatting and code style fixes in r36936. Also please check the TODO: comment

igorsh I accept your changes. RESOLVED

wiper CLOSED

Comment by Andris Zeila [ 2013 Jul 16 ]

(2) documentation update:

<richlv> this was also documented in https://www.zabbix.com/documentation/2.2/manual/appendix/items/sensor ; i added the part about obtaining sensor names

igorsh Updated https://www.zabbix.com/documentation/2.2/manual/config/items/itemtypes/zabbix_agent
and https://www.zabbix.com/documentation/2.2/manual/appendix/items/sensor

RESOLVED

wiper CLOSED

Comment by Andris Zeila [ 2013 Jul 17 ]

Successfully tested

Comment by Igors Homjakovs (Inactive) [ 2013 Jul 19 ]

Fixed in version 2.1.2 (trunk) r37160.

Comment by richlv [ 2013 Sep 04 ]

subissues (1) and (2) not closed

Comment by Andris Mednis [ 2013 Oct 04 ]

(3) Agent does not compile if Linux kernel version is 2.4:
sensors.c: In function `sysfs_read_attr':
sensors.c:98: `ATTR_MAX' undeclared (first use in this function)

andris CLOSED

(4) src/libs/zbxsysinfo/linux/sensors.c contains "#define EXTRA "device"". It looks like EXTRA is not used.

igorsh Both issues were RESOLVED in r39208

andris CLOSED

Comment by Trever L. Adams [ 2013 Oct 10 ]

While the patch supposedly fixes things and adds additional fixes I didn't do, your code is a regression versus mine, as mine handled voltage levels, etc. I monitor these because they can help show when a power supply or motherboard (voltage regulators) is starting to go. They can also help determine, to a lesser extent, whether power in certain areas is too dirty. Please put the voltage handling back in.

Comment by Igors Homjakovs (Inactive) [ 2013 Oct 11 ]

Dear Trever,

Thank you for your message and the patch previously submitted. Could you elaborate more on the problem? You cannot monitor voltages on the motherboard now, right?
I had tested the voltage monitoring on other hosts before and the agent returned correct readings.

Could you please post the output of sensors -u ?

Thanks.

Comment by Trever L. Adams [ 2013 Oct 11 ]

According to https://www.zabbix.com/documentation/2.2/manual/appendix/items/sensor you cannot look at +12V, +3.3V, etc.

My patch, I believe, allowed me to do this as I could view them. At the moment, I haven't updated to the versions of Zabbix that have this patch, so I cannot test and must trust the documentation. (I use Fedora and prefer to stay within their releases.)

atk0110-acpi-0
Adapter: ACPI interface
Vcore Voltage: +1.34 V (min = +0.85 V, max = +1.70 V)
+3.3 Voltage: +3.44 V (min = +2.97 V, max = +3.63 V)
+5 Voltage: +5.11 V (min = +4.50 V, max = +5.50 V)
+12 Voltage: +12.44 V (min = +10.20 V, max = +13.80 V)
CPU FAN Speed: 1991 RPM (min = 600 RPM, max = 7200 RPM)
CHASSIS FAN Speed: 1493 RPM (min = 600 RPM, max = 7200 RPM)
CHASSIS FAN 2 Speed: 1146 RPM (min = 600 RPM, max = 7200 RPM)
CPU Temperature: +102.2°F (high = +140.0°F, crit = +203.0°F)
MB Temperature: +107.6°F (high = +113.0°F, crit = +167.0°F)

k10temp-pci-00c3
Adapter: PCI adapter
temp1: +96.1°F (high = +158.0°F)
(crit = +194.0°F, hyst = +190.4°F)

Thank you,
Trever

Comment by Trever L. Adams [ 2013 Oct 11 ]

Oh, and I forgot. You are welcome for the patch.

Comment by Igors Homjakovs (Inactive) [ 2013 Oct 11 ]

Trever,

Thank you for the prompt response.
I think I know what the problem is: the documentation is confusing and has to be improved.

Let me try to clarify something in order to be sure.

As I can see from your previous post, you ran the "sensors" command, but not "sensors -u", am I correct?

If so, the agent will not be able to return the sensor readings if you specify +3.3 or +5 as sensor names, e.g.
zabbix_get -s 127.0.0.1 -k sensor[atk0110-acpi-0,+3.3]
ZBX_NOTSUPPORTED

However, if you run sensors -u then you would get something like this:

lm85-i2c-0-2e
Adapter: SMBus I801 adapter at 3000
+3.3V:
  in2_input: 3.30
  in2_min: 2.97
  in2_max: 3.63
  in2_alarm: 0.00
+5V:
  in3_input: 1.51
  in3_min: 4.50
  in3_max: 5.50
  in3_alarm: 1.00

Then the following will work just fine:
zabbix_get -s 127.0.0.1 -k sensor[lm85-i2c-0-2e,in2]
3.300000
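
As a rough aid, the mapping from "sensors -u" output to item keys can be sketched like this. It assumes output shaped like the sample above (chip name first, blank lines between chips, raw attribute lines of the form name: value); keys_from_sensors_u is a hypothetical helper and is not part of Zabbix or lm-sensors.

```python
def keys_from_sensors_u(text):
    """List sensor[chip,name] keys found in `sensors -u` style output."""
    keys = []
    chip = None
    expect_chip = True
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            expect_chip = True  # a blank line separates chips
            continue
        if expect_chip:
            chip = stripped  # first line of a block is the chip name
            expect_chip = False
            continue
        name, _, value = stripped.partition(":")
        if stripped.startswith("Adapter:") or not value.strip():
            continue  # adapter line, or a label line such as "+3.3V:"
        if name.endswith("_input"):
            # 'in2_input' -> sensor[<chip>,in2]
            keys.append("sensor[%s,%s]" % (chip, name[: -len("_input")]))
    return keys
```

Only the raw names (in2, in3) end up in the keys; the human-readable labels (+3.3V, +5V) are skipped, which matches the behaviour described above.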

I would appreciate your comment on that.

Comment by Trever L. Adams [ 2013 Oct 11 ]

Thank you, yes, the documentation is the problem. I am sorry that I didn't understand that right off. My day was just beginning.

Just to make sure that I do not misunderstand something else. All of the sensors in the documentation are temperature or voltage. Does the new code support fan speed? That is also very telling in my setup.

Comment by Igors Homjakovs (Inactive) [ 2013 Oct 14 ]

Trever,

Thank you for your feedback. It is very valuable for us. I will try to make the documentation more straightforward.

Yes, the new code supports voltage, current, temperature and fan speed readings. Actually, other sensor readings (e.g. pwm, humidity, energy, etc.) are also supported, since they are acquired in the same way as temperature, voltage, current and fan speed readings. However, we were unable to test them thoroughly because we had no sensors that provide these types of readings.

Please let me know if you have any other questions or comments.

Comment by Igors Homjakovs (Inactive) [ 2013 Oct 16 ]

Fixed in version 2.1.8 (r39352).

Comment by Victor [ 2014 Feb 14 ]

I see the issue in 2.2.*

 
#  zabbix_agentd -p | grep sensor
sensor[w83781d-i2c-0-2d,temp1]                [m|ZBX_NOTSUPPORTED]

#  zabbix_agentd -V              
Zabbix Agent (daemon) v2.2.2rc2 (revision 42257) (04 February 2014)

# uname -r    
3.10.17-gentoo

# sensors
via_cputemp-isa-0000
Adapter: ISA adapter
Core 0:       +49.0 C  
cpu0_vid:    +0.684 V

vt1211-isa-6000
Adapter: ISA adapter
in0:          +2.10 V  (min =  +0.00 V, max =  +2.63 V)
in1:          +2.09 V  (min =  +0.00 V, max =  +2.63 V)
in2:          +1.99 V  (min =  +0.00 V, max =  +2.63 V)
in3:          +0.68 V  (min =  +0.00 V, max =  +2.63 V)
+3.3V:        +3.37 V  (min =  +0.00 V, max =  +4.18 V)
fan1:           0 RPM  (min =    0 RPM, div = 2)
fan2:           0 RPM  (min =    0 RPM, div = 2)
temp1:       +119.0 C  (high = +255.0 C, hyst =  +0.0 C)
SIO Temp:     +69.0 C  (high = +204.0 C, hyst =  +0.0 C)
temp7:         +0.6 C  (high =  -0.0 C, hyst =  +2.6 C)
cpu0_vid:    +1.708 V

Comment by richlv [ 2014 Feb 14 ]

your sensors are apparently not named "w83781d-i2c-0-2d" which is the default in the testing mode - please see the manual for details

Comment by Victor [ 2014 Feb 14 ]

Mmmm... yes, my mistake. I thought zabbix_agentd -p printed my current sensor.

The problem seems to be in the names (or it's my misunderstanding):

# zabbix_get -s HOST -k 'sensor[vt1211-isa-6000,temp1]'
119.000000

but

# zabbix_get -s HOST -k 'sensor[hwmon1,temp1]'
ZBX_NOTSUPPORTED

Does the hwmon1 alias not work anymore?

Comment by richlv [ 2014 Feb 14 ]

you should use sensor name, i don't think zabbix has ever supported hwmonX

Comment by Victor [ 2014 Feb 14 ]

The 2.0.X versions work with hwmonX perfectly. But OK, I already use the device name.
Mmmm... but if someone uses hwmonX, they will get broken items. I think that's not good.
