  1. ZABBIX BUGS AND ISSUES
  2. ZBX-16486

Housekeeper locking database on events cleanup


    • Type: Problem report
    • Resolution: Workaround proposed
    • Priority: Trivial
    • None
    • 4.2.5
    • Server (S)
    • Zabbix Server 4.2.5
      MySQL 8.0.17
      CentOS 7

      Notes on setup:

      Due to an issue in a trigger expression (percentile, "not enough data"), the events table has grown over time and now holds a large number of internal events. Since the housekeeper retention period started cleaning up these events, approx. 385000 records per hour (= per run in this setup) have to be deleted.

      Steps to reproduce:

      The housekeeper starts automatically. It can also be started using the runtime command. All data is collected through proxies; there is no direct data collection at all (except for the Zabbix server itself).

      Result:

      Zabbix is no longer able to insert data into the database because the housekeeper is locking tables.

      Some runs are fast and do not cause any issues, but some cleanups take very long. During those runs Zabbix stops inserting data and, as a result, sends alerts on missing values (= nodata triggers firing). From a user's perspective the whole system stalls and generates a huge number of alerts (which are resolved again once the housekeeper completes).

      Here's an example from the log:

       

      70425:20190807:184948.012 housekeeper [deleted 0 hist/trends, 43 items/triggers, 385281 events, 371351 problems, 0 sessions, 0 alarms, 44 audit, 0 records in 276.169684 sec, idle for 1 hour(s)]
      70425:20190807:195114.227 housekeeper [deleted 0 hist/trends, 0 items/triggers, 384215 events, 55320 problems, 0 sessions, 0 alarms, 0 audit, 0 records in 85.693726 sec, idle for 1 hour(s)]
      70425:20190807:205246.543 housekeeper [deleted 0 hist/trends, 0 items/triggers, 384691 events, 61691 problems, 0 sessions, 0 alarms, 0 audit, 0 records in 91.810570 sec, idle for 1 hour(s)]
      70425:20190807:215417.068 housekeeper [deleted 0 hist/trends, 0 items/triggers, 385197 events, 61755 problems, 0 sessions, 0 alarms, 0 audit, 0 records in 90.007166 sec, idle for 1 hour(s)]
      70425:20190808:035033.912 housekeeper [deleted 0 hist/trends, 0 items/triggers, 384004 events, 60042 problems, 0 sessions, 0 alarms, 0 audit, 0 records in 17776.345255 sec, idle for 1 hour(s)]
      70425:20190808:045743.645 housekeeper [deleted 0 hist/trends, 0 items/triggers, 384523 events, 308902 problems, 0 sessions, 0 alarms, 0 audit, 0 records in 428.900984 sec, idle for 1 hour(s)]
      70425:20190808:055915.332 housekeeper [deleted 0 hist/trends, 0 items/triggers, 383333 events, 56282 problems, 0 sessions, 0 alarms, 0 audit, 0 records in 91.171846 sec, idle for 1 hour(s)]
      70425:20190808:070045.155 housekeeper [deleted 0 hist/trends, 0 items/triggers, 384757 events, 50566 problems, 0 sessions, 0 alarms, 0 audit, 0 records in 89.316387 sec, idle for 1 hour(s)]
      70425:20190808:080226.627 housekeeper [deleted 0 hist/trends, 0 items/triggers, 386847 events, 50104 problems, 0 sessions, 0 alarms, 0 audit, 0 records in 100.970520 sec, idle for 1 hour(s)]
      70425:20190808:082323.115 housekeeper [deleted 0 hist/trends, 0 items/triggers, 386676 events, 19420 problems, 0 sessions, 0 alarms, 0 audit, 0 records in 81.158017 sec, idle for 1 hour(s)]
      

      The run logged at 2019/08/08 03:50:33 took 17776 seconds (almost 5 hours), which is far longer than any of the others. While this issue occurs there are no other tasks (like backups) running on the system or environment; it is dedicated to the Zabbix server.
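To spot such outliers without reading the log by eye, the housekeeper summary lines can be parsed and compared against the median run time. The following is only a sketch: the regular expression is written against the line format shown in the excerpt above, and the function names and the "10x the median" threshold are assumptions made for illustration, not anything provided by Zabbix.

```python
import re
import statistics

# Pattern for the housekeeper summary lines as they appear in the excerpt above.
LINE_RE = re.compile(
    r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6})\.\d+ housekeeper \[deleted .*?"
    r"(?P<events>\d+) events, (?P<problems>\d+) problems, .*? in (?P<secs>\d+\.\d+) sec"
)


def parse_housekeeper_runs(lines):
    """Extract timestamp, deleted events/problems and duration from each summary line."""
    runs = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a housekeeper summary line
        runs.append({
            "logged_at": m.group("date") + ":" + m.group("time"),
            "events": int(m.group("events")),
            "problems": int(m.group("problems")),
            "seconds": float(m.group("secs")),
        })
    return runs


def slow_runs(runs, factor=10.0):
    """Return runs that took more than `factor` times the median duration (hypothetical threshold)."""
    if not runs:
        return []
    median = statistics.median(r["seconds"] for r in runs)
    return [r for r in runs if r["seconds"] > factor * median]
```

Fed with the ten lines above, `slow_runs(parse_housekeeper_runs(lines))` should single out the 17776-second run, even though the number of deleted events is about the same as in the fast runs.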

      Checking the MySQL server, I was able to extract one of the queries that block the server while this issue occurs.

       

      Expected:

      The housekeeper should be less intrusive.

      Splitting the events query into smaller chunks may help to prevent running a single delete operation that locks the database for too long.
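A minimal sketch of what such chunking could look like is shown below. It is not the Zabbix housekeeper implementation: it runs against an in-memory SQLite table reduced to two columns (`eventid`, `clock`), and the function name `delete_in_chunks` and the chunk size are assumptions chosen for illustration. The point is only that each chunk is committed separately, so locks are held for the duration of one small delete rather than the whole cleanup.

```python
import sqlite3


def delete_in_chunks(conn, cutoff_clock, chunk_size=1000):
    """Delete events older than cutoff_clock in separate small transactions.

    Returns the total number of deleted rows.
    """
    total = 0
    while True:
        # SQLite form; on MySQL a single-table
        # "DELETE ... WHERE clock < ? ORDER BY eventid LIMIT ?" would play the same role.
        cur = conn.execute(
            "DELETE FROM events WHERE eventid IN ("
            " SELECT eventid FROM events WHERE clock < ?"
            " ORDER BY eventid LIMIT ?)",
            (cutoff_clock, chunk_size),
        )
        conn.commit()  # end the transaction so locks are released between chunks
        if cur.rowcount == 0:
            break  # nothing left that is older than the cutoff
        total += cur.rowcount
    return total


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the events table (the real schema has more columns).
    conn.execute("CREATE TABLE events (eventid INTEGER PRIMARY KEY, clock INTEGER)")
    conn.executemany(
        "INSERT INTO events (eventid, clock) VALUES (?, ?)",
        [(i, 100 if i <= 2500 else 900) for i in range(1, 2511)],
    )
    conn.commit()
    print(delete_in_chunks(conn, cutoff_clock=500, chunk_size=1000))
```

With roughly 385000 events per run, a chunk size in the low thousands would turn one long delete into a few hundred short ones. Whether that actually avoids the stalls described above would need to be verified on the affected MySQL server.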

            rvaliahmetovs Renats Valiahmetovs (Inactive)
            dn@nuvotex.de Daniel