We have an ARM board running Linaro Linux. The system brings CPUs online and takes them offline depending on load. However, native tools and Zabbix disagree on which CPUs are online and which are offline.
Let us consider three scenarios: low load, medium load, and high load.
Low load
Suppose we have a low load and, therefore, only 1 CPU is online:
# lscpu
Architecture:          armv7l
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0
Off-line CPU(s) list:  1-3
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1

# ./zabbix_get -s 127.0.0.1 -k system.cpu.num[max]
4
# ./zabbix_get -s 127.0.0.1 -k system.cpu.num[online]
1
# ./zabbix_get -s 127.0.0.1 -k system.cpu.discovery
{
    "data": [
        { "{#CPU.NUMBER}": 0, "{#CPU.STATUS}": "online" },
        { "{#CPU.NUMBER}": 1, "{#CPU.STATUS}": "online" },
        { "{#CPU.NUMBER}": 2, "{#CPU.STATUS}": "online" },
        { "{#CPU.NUMBER}": 3, "{#CPU.STATUS}": "online" }
    ]
}
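Here, system.cpu.num[max] correctly reports the total number of CPUs (4), and system.cpu.num[online] correctly reports the number of online CPUs (1). However, system.cpu.discovery lists all four CPUs as online, although lscpu shows CPUs 1-3 as offline.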
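To make the expectation concrete, here is a minimal sketch of the discovery payload that would agree with lscpu in the low-load case. The helper name build_discovery is hypothetical, and the "offline" status string is an assumption about what the item would return for an offline CPU; this is not Zabbix code.

```python
import json


def build_discovery(max_cpus, online_cpus):
    """Build a system.cpu.discovery-style payload (illustrative only).

    max_cpus    -- total number of CPUs, as reported by "CPU(s)" in lscpu
    online_cpus -- iterable of CPU numbers that lscpu lists as on-line
    """
    online = set(online_cpus)
    data = [
        {
            "{#CPU.NUMBER}": n,
            # "offline" is an assumed status value for CPUs not in the online set
            "{#CPU.STATUS}": "online" if n in online else "offline",
        }
        for n in range(max_cpus)
    ]
    return {"data": data}


if __name__ == "__main__":
    # Low-load case from this report: 4 CPUs, only CPU 0 online
    print(json.dumps(build_discovery(4, {0}), indent=4))
```

For the medium-load case, the same helper would be called as build_discovery(4, {0, 3}).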
Medium load
Now, let us create enough load to make the system activate two CPUs:
# lscpu
Architecture:          armv7l
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0,3
Off-line CPU(s) list:  1,2
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1

# ./zabbix_get -s 127.0.0.1 -k system.cpu.num[max]
4
# ./zabbix_get -s 127.0.0.1 -k system.cpu.num[online]
4
# ./zabbix_get -s 127.0.0.1 -k system.cpu.discovery
{
    "data": [
        { "{#CPU.NUMBER}": 0, "{#CPU.STATUS}": "online" },
        { "{#CPU.NUMBER}": 1, "{#CPU.STATUS}": "online" },
        { "{#CPU.NUMBER}": 2, "{#CPU.STATUS}": "online" },
        { "{#CPU.NUMBER}": 3, "{#CPU.STATUS}": "online" }
    ]
}
Here, system.cpu.num[max] is still reported correctly. However, system.cpu.num[online] reports 4, while lscpu shows only 2 CPUs online (0 and 3). The system.cpu.discovery item still reports all CPUs as online.
High load
Finally, let us increase the load to make the system activate all CPUs:
# lscpu
Architecture:          armv7l
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1

# ./zabbix_get -s 127.0.0.1 -k system.cpu.num[max]
4
# ./zabbix_get -s 127.0.0.1 -k system.cpu.num[online]
4
# ./zabbix_get -s 127.0.0.1 -k system.cpu.discovery
{
    "data": [
        { "{#CPU.NUMBER}": 0, "{#CPU.STATUS}": "online" },
        { "{#CPU.NUMBER}": 1, "{#CPU.STATUS}": "online" },
        { "{#CPU.NUMBER}": 2, "{#CPU.STATUS}": "online" },
        { "{#CPU.NUMBER}": 3, "{#CPU.STATUS}": "online" }
    ]
}
In this case, everything appears to be correct. In the first two cases, however, the Zabbix output is expected to match the output of lscpu. It may be a good idea to investigate how lscpu obtains its data.
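As a starting point for that investigation, the sketch below reads the kernel's CPU list from sysfs and prints it next to the counts returned by sysconf(). It assumes that /sys/devices/system/cpu/online exists on this board and uses the same list notation that lscpu prints ("0", "0,3", "0-3"). The function names are hypothetical, and whether Zabbix or lscpu use exactly these sources on this platform is an assumption to be verified, not a statement about their implementation.

```python
import os

# Assumed sysfs location of the online CPU list on Linux
SYSFS_ONLINE = "/sys/devices/system/cpu/online"


def parse_cpu_list(text):
    """Parse a CPU list such as "0", "0,3" or "0-3" into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or stray comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_online_cpus(path=SYSFS_ONLINE):
    """Return the online CPU numbers from sysfs, or None if the file is unreadable."""
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    print("sysfs online list:        ", read_online_cpus())
    # sysconf() counts, printed only for comparison with the sysfs list
    print("sysconf online count:     ", os.sysconf("SC_NPROCESSORS_ONLN"))
    print("sysconf configured count: ", os.sysconf("SC_NPROCESSORS_CONF"))
```

Running this under each of the three load levels, alongside lscpu and zabbix_get, should show which source follows the hotplug state and which one does not.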