system.run on Agent2 on Windows does not terminate commands correctly -> CPU utilization

    • Sprint candidates
    • 3

      A regression has appeared, apparently after ZBX-26344.

      Long-running commands executed via system.run now hold on to some resources and are not terminated properly.

      It was fine in 6.0.40: the agent-side Timeout (3s by default) terminates the command and the agent replies promptly:

      $ zabbix_get -s 10.33.0.37 -k 'agent.version'
      6.0.40
      
      $ time zabbix_get -s 10.33.0.37 -k 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      ZBX_NOTSUPPORTED: Execute failed: command execution failed: context deadline exceeded.
      
      real    0m3,195s
      

      6.0.41 and later have the issue:

      $ zabbix_get -s 10.33.0.37 -k 'agent.version'
      6.0.41
      
      $ time zabbix_get -s 10.33.0.37 -k 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      ZBX_NOTSUPPORTED: Execute failed: command execution failed: context deadline exceeded.
      
      real    0m10,193s
      
      $ zabbix_get -s 10.33.0.37 -k 'agent.version'
      6.0.43
      
      $ time zabbix_get -s 10.33.0.37 -k 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      ZBX_NOTSUPPORTED: Execute failed: command execution failed: context deadline exceeded.
      
      real    0m10,209s
      

      Using Process Explorer (sysinternals.com), I saw that the forked "cmd -> conhost, powershell" processes are killed correctly after 3 seconds, but the TCP connection from zabbix_get stays open for the full 10 seconds, that is, for as long as PowerShell was supposed to run.
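
      One mechanism that can delay a reply past the timeout even though the direct child was killed is sketched below. It is an assumption, not a confirmed diagnosis of the agent: in Go, `exec.CommandContext` kills only the process it started, and `Wait` then blocks until every inherited copy of the output pipe is closed, unless `Cmd.WaitDelay` (Go 1.20 or later) bounds that wait. The sketch uses `sh` and `sleep` on a Unix-like host as stand-ins for the cmd/PowerShell chain; `runWithTimeout` and `compareWaits` are hypothetical names, not agent code.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// runWithTimeout runs a command whose context expires after timeout and
// reports how long the call took to return.
// waitDelay == 0 leaves Wait unbounded: it returns only once all holders of
// the output pipes have closed them. A positive waitDelay forces the pipes
// closed that long after the context expires.
func runWithTimeout(timeout, waitDelay time.Duration, name string, args ...string) (time.Duration, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, name, args...)
	cmd.WaitDelay = waitDelay

	start := time.Now()
	_, err := cmd.Output() // captures stdout through a pipe
	return time.Since(start), err
}

// compareWaits runs the same script twice with a 300ms timeout.
// The shell is killed at the timeout, but the sleep it spawned inherits
// the output pipe and keeps it open until it exits on its own.
func compareWaits() (blocked, bounded time.Duration) {
	script := "sleep 2; echo 77" // stand-in for: powershell ... "sleep 10;echo 77"
	blocked, _ = runWithTimeout(300*time.Millisecond, 0, "sh", "-c", script)
	bounded, _ = runWithTimeout(300*time.Millisecond, 200*time.Millisecond, "sh", "-c", script)
	return blocked, bounded
}

func main() {
	blocked, bounded := compareWaits()
	fmt.Printf("without WaitDelay: returned after %v\n", blocked.Round(100*time.Millisecond))
	fmt.Printf("with WaitDelay:    returned after %v\n", bounded.Round(100*time.Millisecond))
}
```

      In this sketch the first call returns only when the orphaned `sleep` exits, while the second returns shortly after the timeout. Whether the agent's delay comes from this effect or from something else would need to be checked against the agent's Windows-specific execution code.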

      On real systems, where heavier PowerShell commands may take much longer, this can result in a lot of wasted "ghost" CPU utilization. See picture.

      Here are debug logs from both versions, taken for the 2nd zabbix_get request (it logs 1 line fewer than the 1st request):
      6.0.40:

      2026/01/23 10:19:40.239780 received passive check request: 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]' from '10.33.0.4'
      2026/01/23 10:19:40.239780 [1] processing update request (1 requests)
      2026/01/23 10:19:40.239780 [1] adding new request for key: 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      2026/01/23 10:19:40.239780 [1] created direct exporter task for plugin 'SystemRun' itemid:0 key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      2026/01/23 10:19:40.239780 executing direct exporter task for key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      2026/01/23 10:19:40.239780 [SystemRun] Executing command:'powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"'
      2026/01/23 10:19:43.250886 failed to execute direct exporter task for key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]' error: 'Timeout while executing a shell script.'
      2026/01/23 10:19:43.250886 sending passive check response: ZBX_NOTSUPPORTED: 'Timeout while executing a shell script.' to '10.33.0.4'
      

      6.0.41

      2026/01/23 10:25:59.415862 received passive check request: 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]' from '10.33.0.4'
      2026/01/23 10:25:59.417077 [1] processing update request (1 requests)
      2026/01/23 10:25:59.417077 [1] adding new request for key: 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      2026/01/23 10:25:59.417077 [1] created direct exporter task for plugin 'SystemRun' itemid:0 key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      2026/01/23 10:25:59.417077 executing direct exporter task for key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      2026/01/23 10:25:59.417077 [SystemRun] Executing command:'powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"'
      2026/01/23 10:26:09.615274 failed to execute direct exporter task for key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]' error: 'Execute failed: command execution failed: context deadline exceeded.'
      2026/01/23 10:26:09.615274 sending passive check response: ZBX_NOTSUPPORTED: 'Execute failed: command execution failed: context deadline exceeded.' to '10.33.0.4'
      

      If I make the command sleep longer than 30 seconds, the zabbix_get tool itself times out with its own error. With 40 seconds:

      $ time zabbix_get -s 10.33.0.37 -k 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 40;echo 77"]'
      zabbix_get [9300]: Timeout while executing operation
      
      real    0m30,002s
      

      The agent logged that it was "sending" the reply only after the full 40 seconds:

      2026/01/23 10:34:49.040027 received passive check request: 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 40;echo 77"]' from '10.33.0.4'
      2026/01/23 10:34:49.040027 [1] processing update request (1 requests)
      2026/01/23 10:34:49.040027 [1] adding new request for key: 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 40;echo 77"]'
      2026/01/23 10:34:49.040027 [1] created direct exporter task for plugin 'SystemRun' itemid:0 key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 40;echo 77"]'
      2026/01/23 10:34:49.040027 executing direct exporter task for key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 40;echo 77"]'
      2026/01/23 10:34:49.040027 [SystemRun] Executing command:'powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 40;echo 77"'
      2026/01/23 10:35:29.306525 failed to execute direct exporter task for key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 40;echo 77"]' error: 'Execute failed: command execution failed: context deadline exceeded.'
      2026/01/23 10:35:29.306525 sending passive check response: ZBX_NOTSUPPORTED: 'Execute failed: command execution failed: context deadline exceeded.' to '10.33.0.4'
      

      I have now tried the recent 7.0.22 too. It has the same issue, but with slightly different behavior: the forked chain of processes "cmd -> conhost, powershell" keeps running for the whole sleep period, so it is not killed after the agent's timeout at all.
      Debug log of 7.0.22, 2nd zabbix_get request:

      2026/01/23 10:55:04.159888 received passive check request from "system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command \"sleep 10;echo 77\"]": "10.33.0.4"
      2026/01/23 10:55:04.159888 [1] processing update request (1 requests)
      2026/01/23 10:55:04.159888 [1] adding new request for key: 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      2026/01/23 10:55:04.159888 [1] created direct exporter task for plugin 'SystemRun' itemid:0 key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      2026/01/23 10:55:04.159888 executing direct exporter task for key 'system.run[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]'
      2026/01/23 10:55:04.159888 [SystemRun] Executing command:'powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"'
      2026/01/23 10:55:14.601804 [SystemRun] command:'powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"' length:2 output:'77'
      2026/01/23 10:55:14.602619 executed direct exporter task for key 'system.run[[powershell.exe -NoProfile -ExecutionPolicy Bypass -Command "sleep 10;echo 77"]]'
      

      Assignee: Eriks Sneiders
      Reporter: Oleksii Zagorskyi
      Team INT
      Votes: 0
      Watchers: 3
      Original Estimate: Not Specified
      Remaining Estimate: Not Specified
      Time Spent: 1h