ZABBIX BUGS AND ISSUES: ZBX-11857

discrepancy in online and offline status of CPUs


    • Type: Incident report
    • Resolution: Unresolved
    • Priority: Trivial
    • None
    • 3.2.4
    • Agent (G)

      We have an ARM board running Linaro Linux. The system brings CPUs online and takes them offline depending on load. However, native tools and Zabbix disagree on which CPUs are online and which are offline.

      Let us consider three scenarios: low load, medium load, and high load.

      Low load

      Suppose we have a low load and, therefore, only 1 CPU is online:

      # lscpu
      Architecture:          armv7l
      Byte Order:            Little Endian
      CPU(s):                4
      On-line CPU(s) list:   0
      Off-line CPU(s) list:  1-3
      Thread(s) per core:    1
      Core(s) per socket:    1
      Socket(s):             1
      
      # ./zabbix_get -s 127.0.0.1 -k system.cpu.num[max]
      4
      
      # ./zabbix_get -s 127.0.0.1 -k system.cpu.num[online]
      1
      
      # ./zabbix_get -s 127.0.0.1 -k system.cpu.discovery
      {
          "data": [
              {
                  "{#CPU.NUMBER}": 0,
                  "{#CPU.STATUS}": "online"
              },
              {
                  "{#CPU.NUMBER}": 1,
                  "{#CPU.STATUS}": "online"
              },
              {
                  "{#CPU.NUMBER}": 2,
                  "{#CPU.STATUS}": "online"
              },
              {
                  "{#CPU.NUMBER}": 3,
                  "{#CPU.STATUS}": "online"
              }
          ]
      }
      

      Here, system.cpu.num[max] correctly reports the total number of CPUs (4), and system.cpu.num[online] correctly reports the number of online CPUs (1). However, system.cpu.discovery lists all four CPUs as online, whereas lscpu shows CPUs 1-3 as offline.
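
      The inconsistency between the two Zabbix items can be checked mechanically by tallying the statuses in the discovery output and comparing the result with system.cpu.num[online]. The following is only a minimal sketch in Python: the helper name count_by_status is hypothetical, and the sample payload is rebuilt from the low-load output shown above.

```python
import json
from collections import Counter


def count_by_status(discovery_json):
    """Tally the {#CPU.STATUS} values in a system.cpu.discovery payload."""
    data = json.loads(discovery_json)["data"]
    return Counter(entry["{#CPU.STATUS}"] for entry in data)


# Low-load sample: the same four entries that system.cpu.discovery
# returned in this report, rebuilt programmatically.
LOW_LOAD_DISCOVERY = json.dumps({
    "data": [{"{#CPU.NUMBER}": n, "{#CPU.STATUS}": "online"} for n in range(4)]
})

if __name__ == "__main__":
    counts = count_by_status(LOW_LOAD_DISCOVERY)
    num_online = 1  # value of system.cpu.num[online] under low load
    print("discovery says online:", counts["online"])
    print("system.cpu.num[online] says:", num_online)
```

      With the low-load data, the tally and system.cpu.num[online] disagree, even though both values come from the same agent.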

      Medium load

      Now, let us create enough load to make the system activate two CPUs:

      # lscpu
      Architecture:          armv7l
      Byte Order:            Little Endian
      CPU(s):                4
      On-line CPU(s) list:   0,3
      Off-line CPU(s) list:  1,2
      Thread(s) per core:    1
      Core(s) per socket:    2
      Socket(s):             1
      
      # ./zabbix_get -s 127.0.0.1 -k system.cpu.num[max]
      4
      
      # ./zabbix_get -s 127.0.0.1 -k system.cpu.num[online]
      4
      
      # ./zabbix_get -s 127.0.0.1 -k system.cpu.discovery
      {
          "data": [
              {
                  "{#CPU.NUMBER}": 0,
                  "{#CPU.STATUS}": "online"
              },
              {
                  "{#CPU.NUMBER}": 1,
                  "{#CPU.STATUS}": "online"
              },
              {
                  "{#CPU.NUMBER}": 2,
                  "{#CPU.STATUS}": "online"
              },
              {
                  "{#CPU.NUMBER}": 3,
                  "{#CPU.STATUS}": "online"
              }
          ]
      }
      

      Above, system.cpu.num[max] is reported correctly. However, system.cpu.num[online] reports 4, while lscpu reports 2. The system.cpu.discovery item still reports all CPUs as online.
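
      To see exactly which CPUs are misreported, the "On-line CPU(s) list" value from lscpu can be expanded and compared entry by entry with the discovery result. This is a rough sketch, not part of Zabbix: parse_cpu_list and find_mismatches are hypothetical helper names, and the list syntax handled here is limited to the comma and range forms that appear in this report.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as '0,3' or '1-3' into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def find_mismatches(online_list, discovered):
    """Return the CPU numbers whose discovered status disagrees with lscpu.

    online_list: lscpu's 'On-line CPU(s) list' value, e.g. '0,3'
    discovered:  mapping of CPU number -> status from system.cpu.discovery
    """
    online = parse_cpu_list(online_list)
    return sorted(
        cpu for cpu, status in discovered.items()
        if (status == "online") != (cpu in online)
    )


if __name__ == "__main__":
    # Medium-load values taken from this report.
    discovered = {0: "online", 1: "online", 2: "online", 3: "online"}
    print(find_mismatches("0,3", discovered))
```

      For the medium-load data, the mismatching entries are the CPUs that lscpu lists as offline.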

      High load

      Finally, let us increase the load to make the system activate all CPUs:

      # lscpu
      Architecture:          armv7l
      Byte Order:            Little Endian
      CPU(s):                4
      On-line CPU(s) list:   0-3
      Thread(s) per core:    1
      Core(s) per socket:    4
      Socket(s):             1
      
      # ./zabbix_get -s 127.0.0.1 -k system.cpu.num[max]
      4
      
      # ./zabbix_get -s 127.0.0.1 -k system.cpu.num[online]
      4
      
      # ./zabbix_get -s 127.0.0.1 -k system.cpu.discovery
      {
          "data": [
              {
                  "{#CPU.NUMBER}": 0,
                  "{#CPU.STATUS}": "online"
              },
              {
                  "{#CPU.NUMBER}": 1,
                  "{#CPU.STATUS}": "online"
              },
              {
                  "{#CPU.NUMBER}": 2,
                  "{#CPU.STATUS}": "online"
              },
              {
                  "{#CPU.NUMBER}": 3,
                  "{#CPU.STATUS}": "online"
              }
          ]
      }
      

      In this case, everything appears to be correct. In the first two cases, however, the Zabbix output is expected to match the output of lscpu. It may be a good idea to investigate how lscpu obtains its data.
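
      One candidate data source is the Linux CPU hotplug interface in sysfs, where each hot-pluggable CPU has a file /sys/devices/system/cpu/cpuN/online containing 0 or 1. Whether lscpu uses exactly these files has not been verified here, so the sketch below only illustrates the idea. The function names are hypothetical, and treating a CPU without an "online" file as online is an assumption.

```python
import os
import re

SYSFS_CPU_DIR = "/sys/devices/system/cpu"  # standard Linux sysfs location


def list_cpus(base=SYSFS_CPU_DIR):
    """Return the CPU numbers that have a cpuN directory under base."""
    numbers = []
    for name in os.listdir(base):
        match = re.fullmatch(r"cpu(\d+)", name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def cpu_status(cpu, base=SYSFS_CPU_DIR):
    """Return 'online' or 'offline' for one CPU from its sysfs 'online' file.

    Assumption: a CPU without an 'online' file cannot be taken offline
    and is therefore reported as online.
    """
    path = os.path.join(base, "cpu%d" % cpu, "online")
    try:
        with open(path) as f:
            return "online" if f.read().strip() == "1" else "offline"
    except FileNotFoundError:
        return "online"


def discovery_rows(base=SYSFS_CPU_DIR):
    """Build rows shaped like the entries of system.cpu.discovery."""
    return [
        {"{#CPU.NUMBER}": n, "{#CPU.STATUS}": cpu_status(n, base)}
        for n in list_cpus(base)
    ]


if __name__ == "__main__":
    if os.path.isdir(SYSFS_CPU_DIR):
        for row in discovery_rows():
            print(row)
```

      Running this on the affected board under low and medium load and comparing the rows with lscpu would show whether sysfs is a suitable source for {#CPU.STATUS}.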

            Assignee: Unassigned
            Reporter: Aleksandrs Saveljevs (asaveljevs)