Uploaded image for project: 'ZABBIX BUGS AND ISSUES'
  1. ZABBIX BUGS AND ISSUES
  2. ZBX-26885

Problems page / widget load very slow if severity filters used (even for superadmin).

XMLWordPrintable

    • Icon: Problem report Problem report
    • Resolution: Unresolved
    • Icon: Trivial Trivial
    • None
    • 6.0.40
    • Frontend (F)
    • None

      Steps to reproduce:
      After upgrade from 6.0.37 -> 6.0.40 we see very slow loading of Problems page if filter by several severities.

      • If no filter or only one severity selected - page works fast.
      • If 2 or more severities - query to DB can take 30–70 sec.

      Usually this happen for regular users with big permission sets. But now same problem also for superadmin, first time I see this behavior.
      For examle in mysql_slow.log
      Result:

      # Query_time: 49.048568  Lock_time: 0.000002  Rows_sent: 100  Rows_examined: 13979684  Rows_affected: 0  Bytes_sent: 1973
      SET timestamp=1755509396;
      SELECT COUNT(*) AS rowscount,a.eventid FROM alerts a WHERE EXISTS (SELECT NULL FROM actions aa WHERE a.actionid=aa.actionid AND aa.eventsource='0') AND a.eventid IN (816157075,816878766,816897200,818982561,821510431,821511008,822452883,822453140,823695312,823791562,823910460,831214976,831215083,831215386,832494121,832494756,832918602,833156936,833156939,833291474,833296305,834664261,836248526,839594245,840028183,840201570,840214074,840218783,840218786,840277589,840435136,840489623,841071662,841111704,841329026,841330738,841331034,841331455,841331461,841331583,841331697,841331847,841332355,841332983,841333134,841333256,841333547,841333877,841333949,841334239,841334346,841334656,841386072,841467171,841472083,841474159,841474167,841474170,841474175,841474250,841576556,841578869,841579195,841580162,841580316,841652367,841684736,841684955,841685333,841685678,841686200,841694601,841697275,841698296,841700379,841702281,841724463,841733067,841802147,841892123,841898561,841929824,841932690,841932991,841933500,841933642,841934551,841935403,841935408,841935409,841935901,841948774,841948775,841948777,841948779,841948780,842057823,842057824,842057826,842059489) GROUP BY a.eventid;
      

      equivalent as

      SELECT COUNT(*) AS rowscount,a.eventid
      FROM alerts a
      WHERE EXISTS (SELECT NULL FROM actions aa
                    WHERE a.actionid=aa.actionid AND aa.eventsource='0')
        AND a.eventid IN (.... long list of ids ....)
      GROUP BY a.eventid;
      

      This query scan ~14M rows in alerts.
      Even with filter “Age less than 14 days” -< still slow, because optimizer scan all alerts by actionid first and only then apply eventid filter.

      -> Table scan on <temporary>  (actual time=73594..73594 rows=100 loops=1)
          -> Aggregate using temporary table  (actual time=73594..73594 rows=100 loops=1)
              -> Nested loop inner join  (cost=40770 rows=1560) (actual time=8.85..73551 rows=139149 loops=1)
                  -> Covering index lookup on aa using actions_1 (eventsource=0)  (cost=1.56 rows=13) (actual time=0.0219..0.0323 rows=13 loops=1)
                  -> Filter: (a.eventid in (...long list...))  
                     (cost=2241 rows=120) (actual time=41.7..5657 rows=10704 loops=13)
                      -> Index lookup on a using alerts_1 (actionid=aa.actionid)  
                         (cost=2241 rows=8960) (actual time=0.265..5597 rows=1.07e+6 loops=13)
      

      Optimizer starts from actions table, then scan ~1.07M rows per actionid (13 actionids * 1.07M = ~14M rows). Very slow.

      workaround on our side:
      We created composite index and used ProxySQL to rewrite query with index hint.

      CREATE INDEX alerts_eventid_actionid_idx ON alerts (eventid, actionid);
      

      If we force to use this index - it works really fast.
      old:

      mysql> SELECT COUNT(*) AS rowscount,a.eventid FROM alerts a WHERE EXISTS (SELECT NULL FROM actions aa WHERE a.actionid=aa.actionid AND aa.eventsource='0') AND a.eventid IN (a lot of ids) GROUP BY a.eventid;
      ...
      100 rows in set (36.64 sec)
      
      | -> Table scan on <temporary>  (actual time=38547..38547 rows=100 loops=1)
          -> Aggregate using temporary table  (actual time=38547..38547 rows=100 loops=1)
              -> Nested loop inner join  (cost=29240 rows=957) (actual time=8.92..38513 rows=128321 loops=1)
                  -> Covering index lookup on aa using actions_1 (eventsource=0)  (cost=1.56 rows=13) (actual time=0.032..0.046 rows=13 loops=1)
                  -> Filter: (a.eventid in (a lot of IDs))  (cost=1607 rows=73.6) (actual time=56.1..2962 rows=9871 loops=13)
                      -> Index lookup on a using alerts_1 (actionid=aa.actionid)  (cost=1607 rows=6426) (actual time=0.292..2904 rows=1.08e+6 loops=13)
       1 row in set, 1 warning (38.55 sec)
      

      new:

      mysql> SELECT COUNT(*) AS rowscount,a.eventid FROM alerts a USE INDEX(alerts_eventid_actionid_idx) WHERE EXISTS (SELECT NULL FROM actions aa WHERE a.actionid=aa.actionid AND aa.eventsource='0') AND a.eventid IN (a lot of IDs) GROUP BY a.eventid;
      100 rows in set (0.06 sec)
      
      
      mysql> EXPLAIN ANALYZE SELECT COUNT(*) AS rowscount,a.eventid FROM alerts a USE INDEX(alerts_eventid_actionid_idx) WHERE EXISTS (SELECT NULL FROM actions aa WHERE a.actionid=aa.actionid AND aa.eventsource='0') AND a.eventid IN ( a lot of IDs) GROUP BY a.eventid;
      
      | -> Group aggregate: count(0)  (cost=81379 rows=104261) (actual time=0.356..102 rows=100 loops=1)
          -> Nested loop inner join  (cost=70953 rows=104261) (actual time=0.0452..96.6 rows=128321 loops=1)
              -> Filter: (a.eventid in (a lot of IDs))  (cost=26040 rows=128321) (actual time=0.0337..51.8 rows=128321 loops=1)
                  -> Covering index range scan on a using alerts_eventid_actionid_idx over (eventid = 676209621) OR (eventid = 692834226) OR (98 more)  (cost=26040 rows=128321) (actual time=0.0321..42 rows=128321 loops=1)
              -> Filter: (aa.eventsource = 0)  (cost=0.25 rows=0.812) (actual time=176e-6..237e-6 rows=1 loops=128321)
                  -> Single-row index lookup on aa using PRIMARY (actionid=a.actionid)  (cost=0.25 rows=1) (actual time=80e-6..99.4e-6 rows=1 loops=128321)
      
      1 row in set, 1 warning (0.11 sec)
      

      SHOW INDEX FROM alerts (after index creation):

      +--------+------------+-----------------------------+--------------+---------------+-----------+-------------+
      | Table  | Non_unique | Key_name                    | Seq_in_index | Column_name   | Collation | Cardinality |
      +--------+------------+-----------------------------+--------------+---------------+-----------+-------------+
      | alerts |          0 | PRIMARY                     |            1 | alertid       | A         |    10295047 |
      | alerts |          1 | alerts_1                    |            1 | actionid      | A         |        1159 |
      | alerts |          1 | alerts_2                    |            1 | clock         | A         |     4412831 |
      | alerts |          1 | alerts_3                    |            1 | eventid       | A         |     1151872 |
      | alerts |          1 | alerts_5                    |            1 | mediatypeid   | A         |        1106 |
      | alerts |          1 | alerts_6                    |            1 | userid        | A         |        1762 |
      | alerts |          1 | alerts_7                    |            1 | p_eventid     | A         |      222858 |
      | alerts |          1 | alerts_4                    |            1 | status        | A         |         630 |
      | alerts |          1 | c_alerts_6                  |            1 | acknowledgeid | A         |       18334 |
      | alerts |          1 | alerts_8                    |            1 | acknowledgeid | A         |       20950 |
      | alerts |          1 | alerts_eventid_actionid_idx |            1 | eventid       | A         |      654645 |
      | alerts |          1 | alerts_eventid_actionid_idx |            2 | actionid      | A         |      358999 |
      +--------+------------+-----------------------------+--------------+---------------+-----------+-------------+
      

      Also we have proxysql in front of mysql and we rewrote it for fixing our problem for now:

      SELECT COUNT(*) AS rowscount,a.eventid 
      FROM alerts a USE INDEX(alerts_eventid_actionid_idx)
      WHERE EXISTS (SELECT NULL
                    FROM events e,functions f,items i,hosts_groups hgg
                    JOIN rights r ON r.id=hgg.groupid AND r.groupid IN (?,?,?)
                    WHERE a.eventid=e.eventid AND e.objectid=f.triggerid
                      AND f.itemid=i.itemid AND i.hostid=hgg.hostid
                    GROUP BY f.triggerid
                    HAVING MIN(r.permission)>? AND MAX(r.permission)>=?)
        AND EXISTS (SELECT NULL FROM actions aa
                    WHERE a.actionid=aa.actionid AND aa.eventsource=?)
        AND (a.userid IS NULL OR EXISTS (
              SELECT NULL FROM users_groups ug
              WHERE ug.userid=a.userid AND ug.usrgrpid IN (?,?,?)))
        AND a.eventid=?
        AND a.status IN (?,?)
      GROUP BY a.eventid;
      

      Ideas:

      • Change frontend SQL to prefer (eventid, actionid) index, not (actionid) only.
      • Or add optimizer hint / force better index usage.
      • Or restructure SQL so filtering by eventid happens first, not after full actionid scan.

      Environment:

      • Zabbix 6.0.40 (was OK on 6.0.37) On Oracle Linux 9.5
      • Percona 8 8.0.39-30 with InnoDB, ~13M+ rows in alerts
      • ProxySQL in front (used for query rewrite workaround)
      • User role: superadmin (issue visible here, not only regular users)

      Probably related tickets:
      ZBX-19769, ZBX-23407, ZBX-26419

            pwegrzyn Piotr Wegrzyn
            anton.samets Anton Samets
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: