Improved Zabbix housekeeper query (Postgres)

XMLWordPrintable

    • Type: Change Request
    • Resolution: Unresolved
    • Priority: Major
    • None
    • Affects Version/s: None
    • Component/s: Server (S)
    • None
    • Environment:
      Zabbix DB:
      CentOS 6.3
      32 GB memory
      PostgreSQL 9.2

      My zabbix DB (Postgres 9.2) had grown huge (270 GB) and after some investigation I found the history and history_uint tables had grown to 60 GB each (not including indexes). I cleaned it out manually but I think the problem may be that the housekeeper wasn't able to clean up such massive tables. The housekeeper deletion queries were extremely slow - from pgbadger (log parser for Postgres logs):

      24.169s | DELETE FROM history_uint WHERE itemid = 33909 AND clock < 1373867589;
      15.436s | DELETE FROM history_uint WHERE itemid = 33911 AND clock < 1373867546;
      14.093s | DELETE FROM history_uint WHERE itemid = 31239 AND clock < 1373896389;

      I ended up writing this query that does the job pretty quickly. Rather than running an individual query for each item, this cleans the entire table in a single query (though it has a limit to prevent running out of memory, so it needs to be run multiple times):

      zabbix=# delete from history where ctid = any ( array ( select h.ctid from history h join items i on h.itemid=i.itemid where h.clock < extract ( epoch from now() - i.history * interval '1 day')::int + 7200 limit 20000000));
      DELETE 20000000
      Time: 101450.095 ms

      This was an actual query I just ran on my zabbix DB which deleted 20,000,000 old rows from the history table in 101 seconds. This is several orders of magnitude faster than the current housekeeper.

            Assignee:
            Unassigned
            Reporter:
            Evan Hoffman
            Votes:
            5 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated: