[ZBX-12758] Postgresql problem table missing index on r_eventid while MySQL InnoDB automatically adds it Created: 2017 Sep 21  Updated: 2024 Apr 10  Resolved: 2017 Dec 28

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 3.2.8rc1, 3.4.2rc1
Fix Version/s: 3.4.6rc1, 4.0.0alpha2, 4.0 (plan)

Type: Problem report Priority: Critical
Reporter: JB Assignee: Vladislavs Sokurenko
Resolution: Fixed Votes: 0
Labels: events, housekeeper, index, oracle, performance, postgresql, problem
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

CentOS Linux release 7.3.1611 (Core)
postgresql-9.2.18-1.el7.x86_64


Issue Links:
Sub-task
part of ZBX-11426 Events removed by housekeeper can cau... Closed
Team: Team A
Sprint: Sprint 18, Sprint 19, Sprint 21, Sprint 22, Sprint 23, Sprint 24
Story Points: 2

 Description   

Housekeeper hangs when trying to delete old events. Manually trying to delete events take long time.

Trying to delete one minute of events:

zabbix=# explain analyze delete FROM events where age(to_timestamp(events.clock)) > interval '7 days 18:59:00';
                                                                    QUERY PLAN                                              
                      
----------------------------------------------------------------------------------------------------------------------------
----------------------
 Delete on events  (cost=0.00..212828.92 rows=1978096 width=6) (actual time=12120.870..12120.870 rows=0 loops=1)
   ->  Seq Scan on events  (cost=0.00..212828.92 rows=1978096 width=6) (actual time=5015.858..12119.974 rows=282 loops=1)
         Filter: (age((('now'::cstring)::date)::timestamp with time zone, to_timestamp((clock)::double precision)) > '7 days
 18:59:00'::interval)
         Rows Removed by Filter: 5919455
 Trigger for constraint c_acknowledges_2: time=3.971 calls=282
 Trigger for constraint c_alerts_2: time=3.933 calls=282
 Trigger for constraint c_event_tag_1: time=3.530 calls=282
 Trigger for constraint c_problem_1: time=4.266 calls=282
 Trigger for constraint c_event_recovery_1: time=5.458 calls=282
 Trigger for constraint c_event_recovery_2: time=4.280 calls=282
 Trigger for constraint c_problem_2: time=17571.926 calls=282
 Trigger for constraint c_event_recovery_3: time=12.291 calls=282
 Trigger for constraint c_alerts_5: time=4.937 calls=282
 Total runtime: 29736.092 ms

Added a new index, and was able to delete 1 hour of data faster than 1 minute without the index:

zabbix=# create index problem_3 on problem(r_eventid);
zabbix=# explain analyze delete FROM events where age(to_timestamp(events.clock)) > interval '7 days 18:00:00';
                                                                    QUERY PLAN                                              
                      
----------------------------------------------------------------------------------------------------------------------------
----------------------
 Delete on events  (cost=0.00..214479.72 rows=1993439 width=6) (actual time=12327.789..12327.789 rows=0 loops=1)
   ->  Seq Scan on events  (cost=0.00..214479.72 rows=1993439 width=6) (actual time=5173.513..12297.065 rows=26258 loops=1)
         Filter: (age((('now'::cstring)::date)::timestamp with time zone, to_timestamp((clock)::double precision)) > '7 days
 18:00:00'::interval)
         Rows Removed by Filter: 5930573
 Trigger for constraint c_acknowledges_2: time=166.392 calls=26258
 Trigger for constraint c_alerts_2: time=183.011 calls=26258
 Trigger for constraint c_event_tag_1: time=163.226 calls=26258
 Trigger for constraint c_problem_1: time=184.327 calls=26258
 Trigger for constraint c_event_recovery_1: time=210.511 calls=26258
 Trigger for constraint c_event_recovery_2: time=200.982 calls=26258
 Trigger for constraint c_problem_2: time=188.991 calls=26258
 Trigger for constraint c_event_recovery_3: time=183.742 calls=26258
 Trigger for constraint c_alerts_5: time=185.132 calls=26258
 Total runtime: 14012.007 ms
(14 rows)


 Comments   
Comment by JB [ 2017 Sep 21 ]

After adding the index the housekeeper finish in no time!

Comment by Vladislavs Sokurenko [ 2017 Sep 21 ]

could you please be so kind and do
show create table problem;

Comment by JB [ 2017 Sep 21 ]
zabbix=# \d+ problem
                                   Table "public.problem"
    Column     |  Type   |         Modifiers          | Storage | Stats target | Description 
---------------+---------+----------------------------+---------+--------------+-------------
 eventid       | bigint  | not null                   | plain   |              | 
 source        | integer | not null default 0         | plain   |              | 
 object        | integer | not null default 0         | plain   |              | 
 objectid      | bigint  | not null default 0::bigint | plain   |              | 
 clock         | integer | not null default 0         | plain   |              | 
 ns            | integer | not null default 0         | plain   |              | 
 r_eventid     | bigint  |                            | plain   |              | 
 r_clock       | integer | not null default 0         | plain   |              | 
 r_ns          | integer | not null default 0         | plain   |              | 
 correlationid | bigint  |                            | plain   |              | 
 userid        | bigint  |                            | plain   |              | 
Indexes:
    "problem_pkey" PRIMARY KEY, btree (eventid)
    "problem_1" btree (source, object, objectid)
    "problem_2" btree (r_clock)
    "problem_3" btree (r_eventid)
Foreign-key constraints:
    "c_problem_1" FOREIGN KEY (eventid) REFERENCES events(eventid) ON DELETE CASCADE
    "c_problem_2" FOREIGN KEY (r_eventid) REFERENCES events(eventid) ON DELETE CASCADE
Referenced by:
    TABLE "problem_tag" CONSTRAINT "c_problem_tag_1" FOREIGN KEY (eventid) REFERENCES problem(eventid) ON DELETE CASCADE
Has OIDs: no
Comment by Vladislavs Sokurenko [ 2017 Sep 21 ]

Thank you very much for your report, please note that issue does not occur in InnoDB because it will create indexes automatically, that's why it was missed!

Comment by JB [ 2017 Sep 21 ]

Thank you for quick response!

Comment by Valdis Kauķis (Inactive) [ 2017 Dec 05 ]

Successfully tested, including fixed conflicts in r75416, ZBX-12758-4.0 branch.

Comment by Vladislavs Sokurenko [ 2017 Dec 28 ]

Fixed in:

  • pre-3.4.6rc1 r76394
  • pre-4.0.0alpha2 (trunk) r76395
Comment by Brian Beaulieu [ 2017 Dec 28 ]

Had this index missing in 3.4.5 on MySQL.. installing/updating from RPM.
Added it manually.. is there a post-update SQL update that was needed to be run manually?

Comment by Vladislavs Sokurenko [ 2017 Dec 29 ]

Could you please be so kind and provide output of:

show create table problem;

No, there are no post-update SQL, on MySQL index is created automatically.
are you using partitionning ?

Comment by Brian Beaulieu [ 2017 Dec 29 ]

Hello,

Index didn't help
27780 seconds..

831 zabbix localhost zabbix Query 27780 Sending data select min(clock) from events where events.source=3 and events.object=4 and not exists (select null

No partitioning. Haven't had any issues with housekeeping until 3.4.5

CREATE TABLE `problem` (
  `eventid` bigint(20) unsigned NOT NULL,
  `source` int(11) NOT NULL DEFAULT '0',
  `object` int(11) NOT NULL DEFAULT '0',
  `objectid` bigint(20) unsigned NOT NULL DEFAULT '0',
  `clock` int(11) NOT NULL DEFAULT '0',
  `ns` int(11) NOT NULL DEFAULT '0',
  `r_eventid` bigint(20) unsigned DEFAULT NULL,
  `r_clock` int(11) NOT NULL DEFAULT '0',
  `r_ns` int(11) NOT NULL DEFAULT '0',
  `correlationid` bigint(20) unsigned DEFAULT NULL,
  `userid` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`eventid`),
  KEY `problem_1` (`source`,`object`,`objectid`),
  KEY `problem_2` (`r_clock`),
  KEY `problem_3` (`r_eventid`),
  CONSTRAINT `c_problem_1` FOREIGN KEY (`eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE,
  CONSTRAINT `c_problem_2` FOREIGN KEY (`r_eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 
Comment by Brian Beaulieu [ 2017 Dec 29 ]
mysql> select count(*) from events;
+----------+
| count(*) |
+----------+
| 14245852 |
+----------+
1 row in set (16.20 sec)

mysql> show create table events;
CREATE TABLE `events` (
  `eventid` bigint(20) unsigned NOT NULL,
  `source` int(11) NOT NULL DEFAULT '0',
  `object` int(11) NOT NULL DEFAULT '0',
  `objectid` bigint(20) unsigned NOT NULL DEFAULT '0',
  `clock` int(11) NOT NULL DEFAULT '0',
  `value` int(11) NOT NULL DEFAULT '0',
  `acknowledged` int(11) NOT NULL DEFAULT '0',
  `ns` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`eventid`),
  KEY `events_1` (`source`,`object`,`objectid`,`clock`),
  KEY `events_2` (`source`,`object`,`clock`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
Comment by Vladislavs Sokurenko [ 2017 Dec 29 ]

It would be great to see whole query that is slow if possible please, you could also do explain on the query, but it looks like you simply have lots of events, how many problems do you have ? Please also provide MySQL version

Comment by Ronald Schaten [ 2017 Dec 29 ]

Same problem here, after update from 3.2.6 to 3.4.5. My tables look like Brian described, the full query is:

select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid or events.eventid=problem.r_eventid)

The events table has 132463031 rows, problem table has 766423 rows. MySQL is "Ver 14.14 Distrib 5.7.20, for Linux (x86_64) using EditLine wrapper", running on Ubuntu 16.04.

mysql> explain select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid or events.eventid=problem.r_eventid)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: events
   partitions: NULL
         type: ref
possible_keys: events_1,events_2
          key: events_2
      key_len: 8
          ref: const,const
         rows: 15964577
     filtered: 100.00
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: problem
   partitions: NULL
         type: ALL
possible_keys: PRIMARY,c_problem_2
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 708846
     filtered: 19.00
        Extra: Range checked for each record (index map: 0x9)
2 rows in set, 3 warnings (0,00 sec)

The database is partitioned, but events and problem tables are not.

Upgrade notes for 3.4.0 recommend to decrease storage period for some events from 365d to 1d. Coming from Zabbix 3.2, I did that on my system. But only after doing the update, so of course the tables still contain quite many rows.

Comment by Vladislavs Sokurenko [ 2017 Dec 29 ]

Could you please also attach show create table problem; ?

Comment by Ronald Schaten [ 2017 Dec 29 ]

As mentioned, to me it looks like the one Brian attached this morning:

CREATE TABLE `problem` (
  `eventid` bigint(20) unsigned NOT NULL,
  `source` int(11) NOT NULL DEFAULT '0',
  `object` int(11) NOT NULL DEFAULT '0',
  `objectid` bigint(20) unsigned NOT NULL DEFAULT '0',
  `clock` int(11) NOT NULL DEFAULT '0',
  `ns` int(11) NOT NULL DEFAULT '0',
  `r_eventid` bigint(20) unsigned DEFAULT NULL,
  `r_clock` int(11) NOT NULL DEFAULT '0',
  `r_ns` int(11) NOT NULL DEFAULT '0',
  `correlationid` bigint(20) unsigned DEFAULT NULL,
  `userid` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`eventid`),
  KEY `problem_1` (`source`,`object`,`objectid`),
  KEY `problem_2` (`r_clock`),
  KEY `c_problem_2` (`r_eventid`),
  CONSTRAINT `c_problem_1` FOREIGN KEY (`eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE,
  CONSTRAINT `c_problem_2` FOREIGN KEY (`r_eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
Comment by Vladislavs Sokurenko [ 2017 Dec 29 ]

Can you try this please

explain select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid) and not exists (select null from problem where events.eventid=problem.r_eventid)\G
Comment by Ronald Schaten [ 2017 Dec 29 ]
mysql> explain select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid) and not exists (select null from problem where events.eventid=problem.r_eventid)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: events
   partitions: NULL
         type: ref
possible_keys: events_1,events_2
          key: events_2
      key_len: 8
          ref: const,const
         rows: 15973658
     filtered: 100.00
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 3
  select_type: DEPENDENT SUBQUERY
        table: problem
   partitions: NULL
         type: ref
possible_keys: c_problem_2
          key: c_problem_2
      key_len: 9
          ref: zabbix.events.eventid
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: problem
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: zabbix.events.eventid
         rows: 1
     filtered: 100.00
        Extra: Using index
3 rows in set, 3 warnings (0,00 sec)
Comment by Vladislavs Sokurenko [ 2017 Dec 29 ]

Thanks,

Can you confirm that second query is faster ?

select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid or events.eventid=problem.r_eventid);
select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid) and not exists (select null from problem where events.eventid=problem.r_eventid);
Comment by Ronald Schaten [ 2017 Dec 29 ]

Yes. The second query took less than 18 minutes, the original one ran more than eight hours before I terminated it.

Comment by Vladislavs Sokurenko [ 2017 Dec 29 ]

Thanks allot created ZBX-13275

Comment by Oleksii Zagorskyi [ 2018 Feb 19 ]

I'm on mysql and started my zabbix from 3.4.
Upgraded to 3.4.6 from 3.4.5.
Is that ok that I have now 2 identical indexes for "problem" table?

  KEY `c_problem_2` (`r_eventid`),
  KEY `problem_3` (`r_eventid`),

Before upgrade I had only `c_problem_2` index.

Maybe the procedure should be more smart to not create duplicated indexes for mysql?

vso this procedure does not create duplicate indexes, you can run patch multiple times and it will no longer create index with the name 'problem_3'. Could you please attach outout of show create table problem; ?

zalex_ua those 2 lines are from the such output. I did already "drop index c_problem_2 on problem;" as it's a production database, which I had to dump and restore to maintain it, there was a reason.
You can see that the table had an index with name "c_problem_2", but zabbix code checks for index name

static int      DBpatch_3040006(void)
{
        if (FAIL == DBindex_exists("problem", "problem_3"))
                return DBcreate_index("problem", "problem_3", "r_eventid", 0);

Ahh, I had db backup (because I had to manage it), here is schema before I ran 3.4.6 after 3.4.5:

mysql> show create table problem \G
*************************** 1. row ***************************
       Table: problem
Create Table: CREATE TABLE `problem` (
  `eventid` bigint(20) unsigned NOT NULL,
  `source` int(11) NOT NULL DEFAULT '0',
  `object` int(11) NOT NULL DEFAULT '0',
  `objectid` bigint(20) unsigned NOT NULL DEFAULT '0',
  `clock` int(11) NOT NULL DEFAULT '0',
  `ns` int(11) NOT NULL DEFAULT '0',
  `r_eventid` bigint(20) unsigned DEFAULT NULL,
  `r_clock` int(11) NOT NULL DEFAULT '0',
  `r_ns` int(11) NOT NULL DEFAULT '0',
  `correlationid` bigint(20) unsigned DEFAULT NULL,
  `userid` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`eventid`),
  KEY `problem_1` (`source`,`object`,`objectid`),
  KEY `problem_2` (`r_clock`),
  KEY `c_problem_2` (`r_eventid`),
  CONSTRAINT `c_problem_1` FOREIGN KEY (`eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE,
  CONSTRAINT `c_problem_2` FOREIGN KEY (`r_eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
1 row in set (0.00 sec)

Strange, I performed a clean test: created DB schema+images+data from 3.4.5 sources, upgrade it by 3.4.7 binary, what I had and received `c_problem_2` index magically replaced by `problem_3` index.
Hhh, then I recreated such DB again, then dumped it, then restored and then upgraded and viola - I got duplicated indexes:

mysql> show create table problem \G
*************************** 1. row ***************************
       Table: problem
Create Table: CREATE TABLE `problem` (
  `eventid` bigint(20) unsigned NOT NULL,
  `source` int(11) NOT NULL DEFAULT '0',
  `object` int(11) NOT NULL DEFAULT '0',
  `objectid` bigint(20) unsigned NOT NULL DEFAULT '0',
  `clock` int(11) NOT NULL DEFAULT '0',
  `ns` int(11) NOT NULL DEFAULT '0',
  `r_eventid` bigint(20) unsigned DEFAULT NULL,
  `r_clock` int(11) NOT NULL DEFAULT '0',
  `r_ns` int(11) NOT NULL DEFAULT '0',
  `correlationid` bigint(20) unsigned DEFAULT NULL,
  `userid` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`eventid`),
  KEY `problem_1` (`source`,`object`,`objectid`),
  KEY `problem_2` (`r_clock`),
  KEY `c_problem_2` (`r_eventid`),
  KEY `problem_3` (`r_eventid`),
  CONSTRAINT `c_problem_1` FOREIGN KEY (`eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE,
  CONSTRAINT `c_problem_2` FOREIGN KEY (`r_eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin
1 row in set (0.00 sec)

For both DBs (just created from schema+images+data SQLs and recovered from dump), upgrade debug log is identical:

z345:
 20173:20180219:125955.039 starting automatic database upgrade
 20173:20180219:125955.039 query [txnlev:1] [begin;]
 20173:20180219:125955.039 query [txnlev:1] [show index from problem where key_name='problem_3']
 20173:20180219:125955.040 query [txnlev:1] [create index problem_3 on problem (r_eventid)]
 20173:20180219:125955.058 query [txnlev:1] [update dbversion set optional=3040006]
 20173:20180219:125955.059 query [txnlev:1] [commit;]
 20173:20180219:125955.059 completed 100% of database upgrade
 20173:20180219:125955.059 database upgrade fully completed

z345recovered:
 20351:20180219:130053.868 starting automatic database upgrade
 20351:20180219:130053.868 query [txnlev:1] [begin;]
 20351:20180219:130053.868 query [txnlev:1] [show index from problem where key_name='problem_3']
 20351:20180219:130053.869 query [txnlev:1] [create index problem_3 on problem (r_eventid)]
 20351:20180219:130053.886 query [txnlev:1] [update dbversion set optional=3040006]
 20351:20180219:130053.887 query [txnlev:1] [commit;]
 20351:20180219:130053.887 completed 100% of database upgrade
 20351:20180219:130053.887 database upgrade fully completed

Of course mysqldump, taken from created database, contains a line

  KEY `c_problem_2` (`r_eventid`),

while schema.sql does not.

Fuhh, probably it should not be ignored, as it really happened in production

vso this is very unfortunate that MySQL acts differently when you restore backup, we could add another condition for 'c_problem_2' and would have to rename this index if it exists.

zalex_ua maybe. Please take care on it further. And probably 3.4.6 schema should be checked the same way, because you have added the index creation explicitly to schema.sql.

vso It's best if you create a separate bug report.

zalex_ua Reported as ZBX-13498
CLOSED

Generated at Fri Apr 26 10:45:07 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.