  1. ZABBIX BUGS AND ISSUES
  2. ZBX-26650

Zabbix Agent 2 Corrupts RPM Database via system.sw.packages.get


    • Type: Problem report
    • Resolution: Unresolved
    • Priority: Trivial
    • None
    • 7.0.15
    • Agent2 (G)
    • None
    • Support backlog

      We are facing a critical issue on multiple RHEL 8 systems: the RPM database becomes corrupted after the zabbix_agent2 process executes the system.sw.packages.get item.

      After investigating the root cause, we followed this Red Hat knowledge base article:
      🔗 https://access.redhat.com/solutions/3330211

      The kernel audit logs confirmed that zabbix_agent2 (running as root) sends SIGKILL signals to the rpm processes executing rpm -qa --queryformat ..., which leads to corruption of the RPM database (/var/lib/rpm).
      Here’s a sample from the trace:

      {{sys.kill: zabbix_agent2(pid:1070317) called kill(2710675, SIGKILL)
      sig.send: SIGKILL was sent to rpm (pid:2710675) by uid:0
      kprocess.exit: rpm(pid:2710675) - Code 9 - "rpm -qa --queryformat ..."}}
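When several hosts are affected, it can help to pull the sender, target PID and signal out of such trace lines automatically. The following is a minimal sketch that assumes the line layout shown in the sample above; the type `KillEvent` and the function `parseKill` are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// KillEvent describes one "sys.kill" record from the trace (hypothetical type).
type KillEvent struct {
	Sender    string
	SenderPID int
	TargetPID int
	Signal    string
}

// killRe matches lines such as:
//   sys.kill: zabbix_agent2(pid:1070317) called kill(2710675, SIGKILL)
var killRe = regexp.MustCompile(`sys\.kill: (\S+)\(pid:(\d+)\) called kill\((\d+), (\w+)\)`)

// parseKill extracts the sender name, both PIDs and the signal name.
// It returns false when the line is not a sys.kill record.
func parseKill(line string) (KillEvent, bool) {
	m := killRe.FindStringSubmatch(line)
	if m == nil {
		return KillEvent{}, false
	}
	senderPID, _ := strconv.Atoi(m[2]) // the regexp guarantees digits only
	targetPID, _ := strconv.Atoi(m[3])
	return KillEvent{Sender: m[1], SenderPID: senderPID, TargetPID: targetPID, Signal: m[4]}, true
}

func main() {
	line := "sys.kill: zabbix_agent2(pid:1070317) called kill(2710675, SIGKILL)"
	if ev, ok := parseKill(line); ok {
		fmt.Printf("%s (pid %d) sent %s to pid %d\n", ev.Sender, ev.SenderPID, ev.Signal, ev.TargetPID)
	}
}
```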

      Once the agent terminated the rpm processes, client observed immediate database corruption:

      {{error: db5 error(30973) from dbenv>failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
      error: cannot open Packages database in /var/lib/rpm}}
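A simple way to flag affected hosts is to scan rpm's error output for the two messages quoted above. This is only a sketch: the marker strings come from this report, and `looksCorrupted` is a hypothetical helper, not part of rpm or Zabbix.

```go
package main

import (
	"fmt"
	"strings"
)

// corruptionMarkers are substrings taken from the rpm errors quoted in this report.
var corruptionMarkers = []string{
	"DB_RUNRECOVERY",
	"cannot open Packages database",
}

// looksCorrupted reports whether rpm's error output contains one of the markers.
func looksCorrupted(stderr string) bool {
	for _, marker := range corruptionMarkers {
		if strings.Contains(stderr, marker) {
			return true
		}
	}
	return false
}

func main() {
	out := "error: db5 error(30973) from dbenv>failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery"
	fmt.Println("corrupted:", looksCorrupted(out))
}
```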

      Simultaneously, Zabbix Agent 2 logs showed:

      {{check 'system.sw.packages.get' is not supported: Timeout occurred while gathering data.
      [Sw] Failed to execute command 'rpm -qa', err: Command execution failed: context deadline exceeded.}}

      Important context: Client has made the decision to run zabbix_agent2 as root in their environment to ensure access to all critical server components, including hardened or restricted filesystems. This was necessary for complete visibility. Agent command line:

      root 1070317 0.5 0.1 2229476 47392 ? Ssl Jun23 77:40 /usr/sbin/zabbix_agent2 -c /etc/zabbix/zabbix_agent2.conf

      Summary:

      • Agent version: Zabbix Agent 2 7.0.14
      • OS: RHEL 8
      • Affected item: system.sw.packages.get
      • Impact: RPM DB corruption due to SIGKILLs issued by Zabbix Agent
      • Reproducible: Yes, happens on multiple hosts
      • Workaround: We have temporarily disabled system.sw.packages.get

      Other possibly helpful info:
       

      [root@sldjde1221 zabbix]# zabbix_agent2 -V
      zabbix_agent2 (Zabbix) 7.0.14
      Revision ae76e5efee9 18 June 2025, compilation time: Jun 18 2025 12:12:38, built with: go1.24.1
      Plugin communication protocol version is 6.4.0
      Copyright (C) 2025 Zabbix SIA
      License AGPLv3: GNU Affero General Public License version 3 <https://www.gnu.org/licenses/.>
      This is free software: you are free to change and redistribute it according to
      the license. There is NO WARRANTY, to the extent permitted by law.
      This product includes software developed by the OpenSSL Project
      for use in the OpenSSL Toolkit (http://www.openssl.org/).
      Compiled with OpenSSL 1.1.1k FIPS 25 Mar 2021
      Running with OpenSSL 1.1.1k FIPS 25 Mar 2021
      

       

      We use the library Eclipse Paho (eclipse/paho.mqtt.golang), which is
      distributed under the terms of the Eclipse Distribution License 1.0 (The 3-Clause BSD License)
      available at https://www.eclipse.org/org/documents/edl-v10.php

      We use the library go-modbus (goburrow/modbus), which is
      distributed under the terms of the 3-Clause BSD License
      available at https://github.com/goburrow/modbus/blob/master/LICENSE

       

      [root@sldjde1221 zabbix]# time rpm -qa --queryformat '%{NAME},%{VERSION}-%{RELEASE},%{ARCH},%{SIZE},%{BUILDTIME},%{INSTALLTIME}\n' > /dev/null
      real 0m1.479s
      user 0m1.369s
      sys 0m0.105s
      

       

      # Ansible managed: agent.conf.j2 modified on 2025-04-30 12:04:13 by root on automation-job-146954-ggqbb
      #
      # This is a configuration file for Zabbix Agent 2
      # To get more information about Zabbix, visit http://www.zabbix.com
      
      # This configuration file is "minimalized", which means all the original comments
      # are removed. The full documentation for your Zabbix Agent 2 can be found here:
      # https://www.zabbix.com/documentation/7.0/en/manual/appendix/config/zabbix_agent2
      
      # Alias=
      # AllowKey=
      BufferSend=5
      BufferSize=100
      ControlSocket=/tmp/agent.sock
      DebugLevel=3
      # DenyKey=
      EnablePersistentBuffer=0
      ForceActiveChecksOnStart=0
      HeartbeatFrequency=60
      # HostInterface=
      # HostInterfaceItem=
      # HostMetadata=
      # HostMetadataItem=
      Hostname=sldjde1221
      # HostnameItem=
      Include=/etc/zabbix/zabbix_agent2.d/*.conf
      Include=/etc/zabbix/zabbix_agent2.d/plugins.d/*.conf
      ListenIP=172.21.14.57
      ListenPort=10050
      LogFile=/var/log/zabbix/zabbix_agent2.log
      LogFileSize=100
      LogType=file
      # PersistentBufferFile=
      PersistentBufferPeriod=1h
      PidFile=/var/run/zabbix/zabbix_agent2.pid
      PluginSocket=/tmp/agent.plugin.sock
      PluginTimeout=3
      RefreshActiveChecks=120
      Server=172.16.238.0/24,172.16.19.0/24
      ServerActive=zabbix-np.REDACTED,slqzbx0366.REDACTED,slqzbx0367.REDACTED,slqzbx0368.REDACTED
      # SourceIP=
      # StatusPort=
      Timeout=3
      TLSAccept=psk
      # TLSCAFile=
      # TLSCertFile=
      TLSConnect=psk
      # TLSCRLFile=
      # TLSKeyFile=
      TLSPSKFile=/etc/zabbix/zabbix_agent.psk
      TLSPSKIdentity=saqzabbixagent
      # TLSServerCertIssuer=
      # TLSServerCertSubject=
      UnsafeUserParameters=0
      # UserParameter=
      # UserParameterDir=0
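The two timeout settings in this file, Timeout=3 and PluginTimeout=3, are the values a reader will most likely want to compare against the measured rpm runtime. A minimal sketch for extracting the active settings from such a file follows; `parseConfig` is a hypothetical helper, and it deliberately keeps only the last value of repeated keys such as Include.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseConfig returns the active key=value settings of an agent-style
// configuration text, skipping blank lines and "#" comments.
// Repeated keys (for example Include) keep only their last value.
func parseConfig(text string) map[string]string {
	settings := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue // not a key=value line
		}
		settings[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return settings
}

func main() {
	conf := "# DenyKey=\nPluginTimeout=3\nTimeout=3\n# UserParameter=\n"
	s := parseConfig(conf)
	fmt.Println("Timeout:", s["Timeout"], "PluginTimeout:", s["PluginTimeout"])
}
```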

      The zabbix-agent2 was installed using Ansible automation and the official Zabbix repository via dnf.
      The package in use is:

      zabbix-agent2-7.0.14-release1.el8.x86_64

      The host is running:

      Red Hat Enterprise Linux release 8.10 (Ootpa)
      Kernel: 4.18.0-553.56.1.el8_10.x86_64

      Client is not exactly sure when the issue began, as the agent was operating normally for some time. However, we recently started rolling out OS updates across several systems and noticed that the RPM database was corrupted on multiple hosts.

      After further investigation, including analysis with Red Hat Support, client identified that the root cause was linked to zabbix-agent2 executing the system.sw.packages.get item. The agent times out (context deadline exceeded), sends SIGKILL to the rpm process, and this results in RPM database corruption (DB_RUNRECOVERY).

            zabbix.support Zabbix Support Team
            jprusinowski Jan Prusinowski