  1. ZABBIX BUGS AND ISSUES
  2. ZBX-19517

CPU usage items seem to return invalid results on AIX "Dedicated CPU" servers


Details

    • Team C
    • Sprint 77 (Jun 2021), Sprint 78 (Jul 2021), Sprint 79 (Aug 2021), Sprint 80 (Sep 2021), Sprint 81 (Oct 2021)
    • 2

    Description

      We have an item in a template for AIX servers that uses the key "system.stat[cpu,pc]". It appears to work correctly on AIX "Shared CPU" servers, but returns incorrect values on AIX "Dedicated CPU" servers.

      First, a Shared CPU server - here is some CPU usage output from vmstat:

      $ vmstat 10 12
      System configuration: lcpu=24 mem=58880MB ent=3.00
      kthr    memory              page              faults              cpu          
      ----- ----------- ------------------------ ------------ -----------------------
       r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa    pc    ec
       1  0 2894246 6479413   0   0   0   0    0   0  59 49866 748  6  1 93  0  0.68  22.8
       1  0 2894266 6479393   0   0   0   0    0   0  67 52235 827  6  2 92  0  0.71  23.5
      ...
      

      In this case, the "pc" column is very close to the data that the Zabbix agent gathers, as would be expected.
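To compare vmstat samples with the agent's values, the columns can be read by header name instead of by position, since the Shared CPU output has "pc" and "ec" columns while the Dedicated CPU output does not. This is a minimal sketch only; the function name `parse_vmstat` is hypothetical and the parsing assumes the layout shown in this ticket.

```python
def parse_vmstat(text):
    """Return one dict per vmstat sample row, keyed by column name.

    Note: vmstat prints "sy" twice (faults and cpu); in the resulting
    dict the later (cpu) "sy" value overwrites the earlier one.
    """
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column header row is the one starting with "r b avm".
        if fields[:3] == ["r", "b", "avm"]:
            header = fields
            continue
        # Sample rows are all-numeric and as wide as the header.
        if header and len(fields) == len(header):
            try:
                values = [float(f) for f in fields]
            except ValueError:
                continue
            rows.append(dict(zip(header, values)))
    return rows


SHARED = """\
System configuration: lcpu=24 mem=58880MB ent=3.00
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa    pc    ec
 1  0 2894246 6479413   0   0   0   0    0   0  59 49866 748  6  1 93  0  0.68  22.8
 1  0 2894266 6479393   0   0   0   0    0   0  67 52235 827  6  2 92  0  0.71  23.5
"""

DEDICATED = """\
System configuration: lcpu=16 mem=29696MB
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  0 1481398 3566625   0   0   0   0    0   0  27 6066 444  1  0 99  0
"""

shared_rows = parse_vmstat(SHARED)
dedicated_rows = parse_vmstat(DEDICATED)
# "pc" is only present when vmstat reports it (Shared CPU LPARs here).
print([row.get("pc") for row in shared_rows])
print([row.get("pc") for row in dedicated_rows])
```

Looking columns up by name means a missing "pc" column comes back as `None` rather than silently picking up a neighbouring field.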

      Here is some example output from a Dedicated CPU server:

      $ vmstat 10 30
      System configuration: lcpu=16 mem=29696MB
      kthr    memory              page              faults        cpu    
      ----- ----------- ------------------------ ------------ -----------
       r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
       1  0 1481398 3566625   0   0   0   0    0   0  27 6066 444  1  0 99  0
       1  0 1482258 3565764   0   0   0   0    0   0  40 6447 456  1  1 98  0
      ...
      

      In this case, there is no "pc" column, but the "id" column indicates that the CPU is almost 100% idle. However, the Zabbix agent item returns values that are mostly around 4.0, although some are as high as 10. There are 4 physical CPUs dedicated to this server, so it almost seems as if the item is capturing the CPU idle value, although that wouldn't explain the values as high as 10.
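The arithmetic behind that guess can be written out. The sketch below is not the agent's implementation; it only converts the vmstat "id" percentage into a number of physical CPUs, assuming idle time is spread evenly over the 4 dedicated CPUs. Both function names are hypothetical.

```python
def busy_physical_cpus(idle_pct, physical_cpus):
    """Physical CPUs in use, derived from the vmstat idle percentage."""
    return physical_cpus * (100.0 - idle_pct) / 100.0


def idle_physical_cpus(idle_pct, physical_cpus):
    """Physical CPUs sitting idle, derived from the same percentage."""
    return physical_cpus * idle_pct / 100.0


# Values from the Dedicated CPU sample above: id=99 on 4 physical CPUs.
busy = busy_physical_cpus(99, 4)
idle = idle_physical_cpus(99, 4)
print(f"busy={busy:.2f} idle={idle:.2f}")
```

Under these assumptions the expected consumption is about 0.04 physical CPUs, while the idle share is about 3.96, which is close to the 4.0 the item mostly returns. Neither figure accounts for the readings near 10.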

      I'm attaching two files to this ticket, one from a Dedicated CPU server (aixXXXXX) and one from a Shared CPU server (urmXXXXX).
      Each file contains output from the "lparstat -i" command, which gives resource allocation information. Each file also contains CPU usage output from the "vmstat" command.
      aixXXXXX.txt urmXXXXX.txt

      At the end, I've placed some data that the Zabbix agent is capturing during similar CPU usage, to compare to the output from vmstat.

      The Zabbix server is v5.0.7 and the AIX agent is v5.0.8

      Later we found that there is another key that supports additional options for AIX, so we changed the key to

      system.cpu.util[all,system,avg1,physical]
      


      Even if the “system” parameter does not include everything (user, iowait, idle), the examples show values that are higher than the number of CPUs allocated to the LPAR, which doesn’t seem correct.
      Below is an example of an LPAR with a much higher CPU allocation and utilization. In this case, the values don’t go higher than the number of allocated virtual CPUs (80), but they don’t match the other monitoring software either (although this could be a difference between “system” usage and “full” CPU usage).

      The second monitoring tool reports overall CPU usage as the number of physical CPUs consumed.
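The plausibility check described above, that a reading expressed in physical CPUs should never exceed the LPAR's allocation, can be sketched as follows. The function name and the sample list are illustrative only; the numbers echo the "around 4.0" and "as high as 10" readings mentioned earlier and are not taken from the attached files.

```python
def samples_over_allocation(samples, allocated_cpus):
    """Return the readings that exceed the number of allocated CPUs."""
    return [value for value in samples if value > allocated_cpus]


# Illustrative readings for an LPAR with 4 dedicated physical CPUs.
readings = [4.0, 3.9, 10.0, 4.0]
suspicious = samples_over_allocation(readings, 4)
print(suspicious)
```

Any value this returns cannot be a count of consumed physical CPUs on that LPAR, whichever CPU states the item includes.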

      Attachments

        1. aixXXXXX.txt
          5 kB
        2. erc-host-another-mon-system.png
          24 kB
        3. erc-host-zabbix.png
          33 kB
        4. urmXXXXX.txt
          4 kB

        Activity

          People

            andris Andris Mednis
            zalex_ua Oleksii Zagorskyi