  1. ZABBIX BUGS AND ISSUES
  2. ZBX-21703

Zabbix Agent2 is no longer retrieving Windows perfmon counters after a period of time


    • Type: Problem report
    • Resolution: Unresolved
    • Priority: Minor
    • 6.2.3
    • Agent (G)
    • None
    • Windows Server 2019 Standard with Version 10.0.17663.
    • Sprint 103 (Aug 2023), Sprint 104 (Sep 2023), Sprint 105 (Oct 2023), Sprint 106 (Nov 2023), Sprint 107 (Dec 2023), S2401, S24-W6/7, S24-W8/9, S24-W10/11
    • 1

      Description

      On a significant number of Windows servers, Zabbix agent 2 stops retrieving performance counters after a period of time. We see this behaviour with several Zabbix agent 2 versions (v6.0.4, v6.2.1 and v6.2.3). In the Zabbix UI, items that use performance counters become “not supported”. With agent v6.0.4 and v6.2.1 the agent crashes when this happens; with agent v6.2.3 the agent keeps running, but those items stay in the unsupported state and never recover.

      On all machines that have this issue, we have found so far that the OS edition is exactly “Windows Server 2019 Standard” with version “10.0.17663”. No other versions are in our problem scope at this point. If this changes, we will update this ticket accordingly.

       

      On the impacted assets, the UI shows this error message: “The system cannot find message text for message number 0x%1 in the message file for %2.”

      On the impacted assets we can see this in the event log:


      When we restart the Zabbix agent the issue is resolved, but it recurs after a period of time. As we have over 1000 assets, we need a permanent fix for this behavior.
      Zabbix agent2 availability for one of the impacted assets:

      The issue also results in data gaps for items that do not use perf_counter:

      Steps to reproduce:
      We can't reproduce the issue. Troubleshooting so far:

      Via Zabbix proxy towards an impacted asset – Not working:

      root@proxy004:~# zabbix_get -s 10.10.10.10 -p 20050 -k 'perf_counter_en["\PhysicalDisk(0 C:)\Avg. Disk Write Queue Length",60]' --tls-connect psk --tls-psk-file /tmp/server-test01.psk --tls-psk-identity server-test01-agent
      ZBX_NOTSUPPORTED: The system cannot find message text for message number 0x%1 in the message file for %2.

       Via Zabbix proxy towards an impacted asset, using system.run[] to fetch a performance counter through PowerShell instead of the Zabbix built-in function – This works:

      root@proxy004:~# zabbix_get -s 10.10.10.10 -p 20050 -k 'system.run[powershell.exe "Get-Counter -Counter \"\PhysicalDisk(0 C:)\Avg. Disk Read Queue Length\""]' --tls-connect psk --tls-psk-file /tmp/server-test01.psk --tls-psk-identity server-test01-agent
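The two proxy-side checks above could be scripted so they can be repeated across many assets. The following is a minimal sketch, not part of the original report: the helper names `build_command`, `classify` and `probe` are hypothetical, the `zabbix_get` options are the ones used in the commands above, and it assumes the reply (including `ZBX_NOTSUPPORTED`) is printed on standard output.

```python
import subprocess

NOT_SUPPORTED_PREFIX = "ZBX_NOTSUPPORTED"


def build_command(host, port, key, psk_file, psk_identity):
    """Build a zabbix_get argument list with the same options as in the ticket."""
    return [
        "zabbix_get", "-s", host, "-p", str(port), "-k", key,
        "--tls-connect", "psk",
        "--tls-psk-file", psk_file,
        "--tls-psk-identity", psk_identity,
    ]


def classify(output):
    """Return ("unsupported", reason) or ("ok", value) for one zabbix_get reply."""
    text = output.strip()
    if text.startswith(NOT_SUPPORTED_PREFIX):
        # Drop the prefix and the ": " separator, keep the agent's reason text.
        reason = text[len(NOT_SUPPORTED_PREFIX):].lstrip(": ").strip()
        return ("unsupported", reason)
    return ("ok", text)


def probe(host, port, key, psk_file, psk_identity, timeout=30):
    """Run zabbix_get once and classify its reply (assumes the reply is on stdout)."""
    cmd = build_command(host, port, key, psk_file, psk_identity)
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return classify(result.stdout)


if __name__ == "__main__":
    # Values copied from the commands in this ticket.
    key = r'perf_counter_en["\PhysicalDisk(0 C:)\Avg. Disk Write Queue Length",60]'
    print(probe("10.10.10.10", 20050, key,
                "/tmp/server-test01.psk", "server-test01-agent"))
```

Passing the arguments as a list avoids the shell quoting needed on the command line for keys that contain quotes and backslashes.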
      

      Via PowerShell, fetching counters locally on an impacted asset – this works:

      PS C:\Program Files\Zabbix Agent 2> Get-Counter -Counter "\PhysicalDisk(0 C:)\Avg. Disk Read Queue Length"
       
      Timestamp                 CounterSamples
      ---------                 --------------
      28/09/2022 11:23:06       \\server-test01\physicaldisk(0 c:)\avg. disk read queue length :
                                0
       
      

       

      Via Windows cmd fetching counters locally on impacted asset – this works:

       

      C:\Program Files\Zabbix Agent 2>typeperf "\PhysicalDisk(0 C:)\Avg. Disk Read Queue Length"
       
      "(PDH-CSV 4.0)","\\SERVER-TEST01\PhysicalDisk(0 C:)\Avg. Disk Read Queue Length"
      "09/28/2022 11:33:18.382","0.000000"
      "09/28/2022 11:33:19.385","0.000000"
      "09/28/2022 11:33:20.386","0.000000"
      "09/28/2022 11:33:21.392","0.000000"
      "09/28/2022 11:33:22.395","0.000000"
      "09/28/2022 11:33:23.398","0.000000"
       
      The command completed successfully.
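To compare local typeperf results with what the agent returns, the PDH-CSV output above can be turned into numeric samples. This is a rough sketch under stated assumptions: the function name `parse_typeperf` is hypothetical, and the timestamp format (`MM/DD/YYYY hh:mm:ss.mmm`) is inferred only from the sample shown above and may differ with other regional settings.

```python
import csv
from datetime import datetime


def parse_typeperf(text):
    """Parse typeperf PDH-CSV text into (counter_paths, samples).

    samples is a list of (datetime, [float, ...]) tuples, one per data row.
    """
    # Strip indentation so quoted fields start at the first character of a line.
    lines = (line.strip() for line in text.splitlines())
    rows = [row for row in csv.reader(lines) if len(row) >= 2]
    header = next((row for row in rows if row[0].startswith("(PDH-CSV")), None)
    if header is None:
        raise ValueError("no PDH-CSV header found")
    counters = header[1:]
    samples = []
    for row in rows:
        if row is header:
            continue
        try:
            # Assumed timestamp layout, taken from the sample in this ticket.
            stamp = datetime.strptime(row[0], "%m/%d/%Y %H:%M:%S.%f")
            values = [float(value) for value in row[1:]]
        except ValueError:
            continue  # skip anything that is not a data row
        samples.append((stamp, values))
    return counters, samples
```

Status lines such as "The command completed successfully." contain a single field and are ignored by the parser.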
      

       

       

      This issue is seen on all perf_counter and perf_counter_en items. We only use the out-of-the-box performance counters from the Zabbix templates:

      • perf_counter_en["\PhysicalDisk(0 C:)\Avg. Disk Write Queue Length",60]
      • perf_counter_en["\PhysicalDisk(0 C:)\Current Disk Queue Length",60]
      • perf_counter_en["\PhysicalDisk(0 C:)\Disk Reads/sec",60]
      • perf_counter_en["\PhysicalDisk(0 C:)\Avg. Disk sec/Read",60]
      • perf_counter_en["\PhysicalDisk(0 C:)\% Disk Time",60]
      • perf_counter_en["\PhysicalDisk(0 C:)\Disk Writes/sec",60]
      • perf_counter_en["\PhysicalDisk(0 C:)\Avg. Disk sec/Write",60]
      • perf_counter_en["\PhysicalDisk(1 D:)\Avg. Disk Read Queue Length",60]
      • perf_counter_en["\PhysicalDisk(1 D:)\Avg. Disk Write Queue Length",60]
      • perf_counter_en["\PhysicalDisk(1 D:)\Current Disk Queue Length",60]
      • perf_counter_en["\PhysicalDisk(1 D:)\Disk Reads/sec",60]
      • perf_counter_en["\PhysicalDisk(1 D:)\Avg. Disk sec/Read",60]
      • perf_counter_en["\PhysicalDisk(1 D:)\% Disk Time",60]
      • perf_counter_en["\PhysicalDisk(1 D:)\Disk Writes/sec",60]
      • perf_counter_en["\PhysicalDisk(1 D:)\Avg. Disk sec/Write",60]
      • perf_counter_en["\Memory\Cache Bytes"]
      • perf_counter_en["\System\Context Switches/sec"]
      • perf_counter_en["\Processor Information(_total)\% DPC Time"]
      • perf_counter_en["\Processor Information(_total)\% Interrupt Time"]
      • perf_counter_en["\Processor Information(_total)\% Privileged Time"]
      • perf_counter_en["\System\Processor Queue Length"]
      • perf_counter_en["\Processor Information(_total)\% User Time"]
      • perf_counter_en["\Memory\Free System Page Table Entries"]
      • perf_counter_en["\Memory\Page Faults/sec"]
      • perf_counter_en["\Memory\Pages/sec"]
      • perf_counter_en["\Memory\Pool Nonpaged Bytes"]
      • perf_counter_en["\System\Threads"]
      • perf_counter_en["\Paging file(_Total)\% Usage"]

      All performance counter queries are failing.
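When checking which counter objects are affected, it can help to split each item key from the list above into its parts. The sketch below is illustrative only: `parse_perf_key` and `group_by_object` are hypothetical helpers, and the regular expressions cover just the key shapes that appear in this ticket (one quoted counter path and an optional numeric interval).

```python
import re
from collections import defaultdict

# Matches keys such as perf_counter_en["\Object(instance)\Counter",60]
KEY_PATTERN = re.compile(
    r'^(?P<name>perf_counter(?:_en)?)\["(?P<path>[^"]+)"(?:,(?P<interval>\d+))?\]$'
)
# Splits "Object(instance)" into object and optional instance.
OBJECT_PATTERN = re.compile(r'^(?P<object>[^(]+)(?:\((?P<instance>.*)\))?$')


def parse_perf_key(key):
    """Split one perf_counter item key into key name, object, instance, counter, interval."""
    match = KEY_PATTERN.match(key.strip())
    if match is None:
        raise ValueError("unrecognised item key: " + key)
    # The path looks like \Object(instance)\Counter; split on the first inner backslash.
    object_part, counter = match.group("path").lstrip("\\").split("\\", 1)
    obj = OBJECT_PATTERN.match(object_part)
    interval = match.group("interval")
    return {
        "key": match.group("name"),
        "object": obj.group("object"),
        "instance": obj.group("instance"),  # None when the path has no instance
        "counter": counter,
        "interval": int(interval) if interval is not None else None,
    }


def group_by_object(keys):
    """Map each counter object name to the list of counters requested from it."""
    groups = defaultdict(list)
    for key in keys:
        parsed = parse_perf_key(key)
        groups[parsed["object"]].append(parsed["counter"])
    return dict(groups)
```

Grouping the template keys this way shows that the failing items span several counter objects (PhysicalDisk, Memory, System, Processor Information and Paging file) rather than a single one.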

      In attachment:

      • log with debug level 5 of one of the impacted assets

        1. 20231002_Screenshot-ZBX21703.png
          11 kB
          Stijn De Doncker
        2. agent2-generate-WindowsPerfMon-stats.xml
          5 kB
          Aigars Kadikis
        3. image-2022-09-28-14-54-32-500.png
          138 kB
          Stijn De Doncker
        4. image-2022-09-28-14-55-09-386.png
          33 kB
          Stijn De Doncker
        5. image-2022-09-28-14-55-26-079.png
          36 kB
          Stijn De Doncker
        6. image-2022-09-28-14-55-35-156.png
          36 kB
          Stijn De Doncker
        7. image-2022-09-28-14-55-48-108.png
          60 kB
          Stijn De Doncker
        8. image-2022-09-28-14-56-28-376.png
          60 kB
          Stijn De Doncker
        9. image-2024-11-18-09-37-15-542.png
          10 kB
          Cesar Inacio Martins
        10. image-2024-11-18-09-38-18-224.png
          4 kB
          Cesar Inacio Martins
        11. image-2024-11-18-09-38-49-685.png
          4 kB
          Cesar Inacio Martins
        12. image-2024-11-21-10-32-58-165.png
          15 kB
          Cesar Inacio Martins
        13. mta069-zabbix_agent2.log
          10.77 MB
          Jeffrey Descan
        14. new.log
          11 kB
          Nechaev Aleksey
        15. screenshot-1.png
          73 kB
          Tomass Janis Bross
        16. Screenshot 2023-09-13 at 16.04.47.png
          77 kB
          Jeffrey Descan
        17. web012-ctr.zabbix_agent2.log
          10.21 MB
          Jeffrey Descan
        18. zabbix_agent2.log
          8.49 MB
          Stijn De Doncker
        19. zabbix_agent2-x64-v64-dbg1-reopen-query.7z
          11.55 MB
          Michael Veksler
        20. zabbix_agent2-x64-v64-dbg2-reopen-query_timeout-impr.7z
          11.56 MB
          Michael Veksler
        21. zabbix_agent2-x64-v64-dbg3-mutex-split.7z
          11.62 MB
          Michael Veksler
        22. zabbix_agent2-x64-v64-dbg4-global_mutex_remove.7z
          11.96 MB
          Michael Veksler
        23. zabbix_agent2-x64-v64-dbg5-Errorlogs.7z
          11.95 MB
          Michael Veksler
        24. zabbix_agent2-x64-v64-dbg6-removePdhPath.7z
          11.97 MB
          Michael Veksler
        25. zabbix_server.log.5apr2023.gz
          10.00 MB
          Bartosz Mickiewicz
        26. zabbix-agent2.log
          51 kB
          Nechaev Aleksey
        27. zbx21703_all_hosts_problems.png
          180 kB
          Jeffrey Descan
        28. ZBX-21703_app140_agent2_status_detail.png
          279 kB
          Jeffrey Descan
        29. ZBX-21703_app140_agent2_status_global.png
          184 kB
          Jeffrey Descan
        30. ZBX-21703_app140_problems.png
          39 kB
          Jeffrey Descan
        31. ZBX-21703_app140_unsupported.png
          56 kB
          Jeffrey Descan
        32. zbx21703_db021_perflib_eventviewer.png
          7 kB
          Jeffrey Descan
        33. zbx21703_db021_problems.png
          39 kB
          Jeffrey Descan
        34. zbx21703_db021_troubleshooting_items.png
          19 kB
          Jeffrey Descan
        35. zbx21703_db021_unsupported_items.png
          42 kB
          Jeffrey Descan
        36. zbx21703_hosts_error_sept.png
          89 kB
          Jeffrey Descan
        37. zbx21703_mta069_unsupported.png
          42 kB
          Jeffrey Descan
        38. ZBX-21703_web131_agent2_status_detail.png
          211 kB
          Jeffrey Descan
        39. ZBX-21703_web131_agent2_status_globa.png
          207 kB
          Jeffrey Descan
        40. ZBX-21703_web131_problems.png
          41 kB
          Jeffrey Descan
        41. ZBX-21703_web367_agent2_status_detail.png
          43 kB
          Jeffrey Descan
        42. ZBX-21703_web367_agent2_status_global.png
          235 kB
          Jeffrey Descan
        43. ZBX-21703_web367_problems.png
          38 kB
          Jeffrey Descan
        44. ZBX-21703_web367_unsupported.png
          43 kB
          Jeffrey Descan

            MVekslers Michael Veksler
            stijndd Stijn De Doncker
            Team C
            Votes: 48
            Watchers: 64