[ZBX-12758] Postgresql problem table missing index on r_eventid while MySQL InnoDB automatically adds it Created: 2017 Sep 21 Updated: 2024 Apr 10 Resolved: 2017 Dec 28 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 3.2.8rc1, 3.4.2rc1 |
Fix Version/s: | 3.4.6rc1, 4.0.0alpha2, 4.0 (plan) |
Type: | Problem report | Priority: | Critical |
Reporter: | JB | Assignee: | Vladislavs Sokurenko |
Resolution: | Fixed | Votes: | 0 |
Labels: | events, housekeeper, index, oracle, performance, postgresql, problem | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
CentOS Linux release 7.3.1611 (Core) |
Issue Links: |
|
||||||||
Team: | Team A | ||||||||
Sprint: | Sprint 18, Sprint 19, Sprint 21, Sprint 22, Sprint 23, Sprint 24 | ||||||||
Story Points: | 2 |
Description |
Housekeeper hangs when trying to delete old events. Manually trying to delete events take long time. Trying to delete one minute of events: zabbix=# explain analyze delete FROM events where age(to_timestamp(events.clock)) > interval '7 days 18:59:00'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- ---------------------- Delete on events (cost=0.00..212828.92 rows=1978096 width=6) (actual time=12120.870..12120.870 rows=0 loops=1) -> Seq Scan on events (cost=0.00..212828.92 rows=1978096 width=6) (actual time=5015.858..12119.974 rows=282 loops=1) Filter: (age((('now'::cstring)::date)::timestamp with time zone, to_timestamp((clock)::double precision)) > '7 days 18:59:00'::interval) Rows Removed by Filter: 5919455 Trigger for constraint c_acknowledges_2: time=3.971 calls=282 Trigger for constraint c_alerts_2: time=3.933 calls=282 Trigger for constraint c_event_tag_1: time=3.530 calls=282 Trigger for constraint c_problem_1: time=4.266 calls=282 Trigger for constraint c_event_recovery_1: time=5.458 calls=282 Trigger for constraint c_event_recovery_2: time=4.280 calls=282 Trigger for constraint c_problem_2: time=17571.926 calls=282 Trigger for constraint c_event_recovery_3: time=12.291 calls=282 Trigger for constraint c_alerts_5: time=4.937 calls=282 Total runtime: 29736.092 ms Added a new index, and was able to delete 1 hour of data faster than 1 minute without the index: zabbix=# create index problem_3 on problem(r_eventid); zabbix=# explain analyze delete FROM events where age(to_timestamp(events.clock)) > interval '7 days 18:00:00'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- ---------------------- Delete on events (cost=0.00..214479.72 rows=1993439 width=6) (actual time=12327.789..12327.789 rows=0 loops=1) -> Seq Scan on events (cost=0.00..214479.72 rows=1993439 width=6) (actual time=5173.513..12297.065 rows=26258 loops=1) Filter: (age((('now'::cstring)::date)::timestamp with time zone, to_timestamp((clock)::double precision)) > '7 days 18:00:00'::interval) Rows Removed by Filter: 5930573 Trigger for constraint c_acknowledges_2: time=166.392 calls=26258 Trigger for constraint c_alerts_2: time=183.011 calls=26258 Trigger for constraint c_event_tag_1: time=163.226 calls=26258 Trigger for constraint c_problem_1: time=184.327 calls=26258 Trigger for constraint c_event_recovery_1: time=210.511 calls=26258 Trigger for constraint c_event_recovery_2: time=200.982 calls=26258 Trigger for constraint c_problem_2: time=188.991 calls=26258 Trigger for constraint c_event_recovery_3: time=183.742 calls=26258 Trigger for constraint c_alerts_5: time=185.132 calls=26258 Total runtime: 14012.007 ms (14 rows) |
Comments |
Comment by JB [ 2017 Sep 21 ] | ||||||||
After adding the index the housekeeper finish in no time! | ||||||||
Comment by Vladislavs Sokurenko [ 2017 Sep 21 ] | ||||||||
could you please be so kind and do | ||||||||
Comment by JB [ 2017 Sep 21 ] | ||||||||
zabbix=# \d+ problem Table "public.problem" Column | Type | Modifiers | Storage | Stats target | Description ---------------+---------+----------------------------+---------+--------------+------------- eventid | bigint | not null | plain | | source | integer | not null default 0 | plain | | object | integer | not null default 0 | plain | | objectid | bigint | not null default 0::bigint | plain | | clock | integer | not null default 0 | plain | | ns | integer | not null default 0 | plain | | r_eventid | bigint | | plain | | r_clock | integer | not null default 0 | plain | | r_ns | integer | not null default 0 | plain | | correlationid | bigint | | plain | | userid | bigint | | plain | | Indexes: "problem_pkey" PRIMARY KEY, btree (eventid) "problem_1" btree (source, object, objectid) "problem_2" btree (r_clock) "problem_3" btree (r_eventid) Foreign-key constraints: "c_problem_1" FOREIGN KEY (eventid) REFERENCES events(eventid) ON DELETE CASCADE "c_problem_2" FOREIGN KEY (r_eventid) REFERENCES events(eventid) ON DELETE CASCADE Referenced by: TABLE "problem_tag" CONSTRAINT "c_problem_tag_1" FOREIGN KEY (eventid) REFERENCES problem(eventid) ON DELETE CASCADE Has OIDs: no | ||||||||
Comment by Vladislavs Sokurenko [ 2017 Sep 21 ] | ||||||||
Thank you very much for your report, please note that issue does not occur in InnoDB because it will create indexes automatically, that's why it was missed! | ||||||||
Comment by JB [ 2017 Sep 21 ] | ||||||||
Thank you for quick response! | ||||||||
Comment by Valdis Kauķis (Inactive) [ 2017 Dec 05 ] | ||||||||
Successfully tested, including fixed conflicts in r75416, | ||||||||
Comment by Vladislavs Sokurenko [ 2017 Dec 28 ] | ||||||||
Fixed in:
| ||||||||
Comment by Brian Beaulieu [ 2017 Dec 28 ] | ||||||||
Had this index missing in 3.4.5 on MySQL.. installing/updating from RPM. | ||||||||
Comment by Vladislavs Sokurenko [ 2017 Dec 29 ] | ||||||||
Could you please be so kind and provide output of: show create table problem; No, there are no post-update SQL, on MySQL index is created automatically. | ||||||||
Comment by Brian Beaulieu [ 2017 Dec 29 ] | ||||||||
Hello, Index didn't help
No partitioning. Haven't had any issues with housekeeping until 3.4.5 CREATE TABLE `problem` ( `eventid` bigint(20) unsigned NOT NULL, `source` int(11) NOT NULL DEFAULT '0', `object` int(11) NOT NULL DEFAULT '0', `objectid` bigint(20) unsigned NOT NULL DEFAULT '0', `clock` int(11) NOT NULL DEFAULT '0', `ns` int(11) NOT NULL DEFAULT '0', `r_eventid` bigint(20) unsigned DEFAULT NULL, `r_clock` int(11) NOT NULL DEFAULT '0', `r_ns` int(11) NOT NULL DEFAULT '0', `correlationid` bigint(20) unsigned DEFAULT NULL, `userid` bigint(20) unsigned DEFAULT NULL, PRIMARY KEY (`eventid`), KEY `problem_1` (`source`,`object`,`objectid`), KEY `problem_2` (`r_clock`), KEY `problem_3` (`r_eventid`), CONSTRAINT `c_problem_1` FOREIGN KEY (`eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE, CONSTRAINT `c_problem_2` FOREIGN KEY (`r_eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | ||||||||
Comment by Brian Beaulieu [ 2017 Dec 29 ] | ||||||||
mysql> select count(*) from events; +----------+ | count(*) | +----------+ | 14245852 | +----------+ 1 row in set (16.20 sec) mysql> show create table events; CREATE TABLE `events` ( `eventid` bigint(20) unsigned NOT NULL, `source` int(11) NOT NULL DEFAULT '0', `object` int(11) NOT NULL DEFAULT '0', `objectid` bigint(20) unsigned NOT NULL DEFAULT '0', `clock` int(11) NOT NULL DEFAULT '0', `value` int(11) NOT NULL DEFAULT '0', `acknowledged` int(11) NOT NULL DEFAULT '0', `ns` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`eventid`), KEY `events_1` (`source`,`object`,`objectid`,`clock`), KEY `events_2` (`source`,`object`,`clock`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | | ||||||||
Comment by Vladislavs Sokurenko [ 2017 Dec 29 ] | ||||||||
It would be great to see whole query that is slow if possible please, you could also do explain on the query, but it looks like you simply have lots of events, how many problems do you have ? Please also provide MySQL version | ||||||||
Comment by Ronald Schaten [ 2017 Dec 29 ] | ||||||||
Same problem here, after update from 3.2.6 to 3.4.5. My tables look like Brian described, the full query is: select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid or events.eventid=problem.r_eventid) The events table has 132463031 rows, problem table has 766423 rows. MySQL is "Ver 14.14 Distrib 5.7.20, for Linux (x86_64) using EditLine wrapper", running on Ubuntu 16.04. mysql> explain select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid or events.eventid=problem.r_eventid)\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: events partitions: NULL type: ref possible_keys: events_1,events_2 key: events_2 key_len: 8 ref: const,const rows: 15964577 filtered: 100.00 Extra: Using where; Using index *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: problem partitions: NULL type: ALL possible_keys: PRIMARY,c_problem_2 key: NULL key_len: NULL ref: NULL rows: 708846 filtered: 19.00 Extra: Range checked for each record (index map: 0x9) 2 rows in set, 3 warnings (0,00 sec) The database is partitioned, but events and problem tables are not. Upgrade notes for 3.4.0 recommend to decrease storage period for some events from 365d to 1d. Coming from Zabbix 3.2, I did that on my system. But only after doing the update, so of course the tables still contain quite many rows. | ||||||||
Comment by Vladislavs Sokurenko [ 2017 Dec 29 ] | ||||||||
Could you please also attach show create table problem; ? | ||||||||
Comment by Ronald Schaten [ 2017 Dec 29 ] | ||||||||
As mentioned, to me it looks like the one Brian attached this morning: CREATE TABLE `problem` ( `eventid` bigint(20) unsigned NOT NULL, `source` int(11) NOT NULL DEFAULT '0', `object` int(11) NOT NULL DEFAULT '0', `objectid` bigint(20) unsigned NOT NULL DEFAULT '0', `clock` int(11) NOT NULL DEFAULT '0', `ns` int(11) NOT NULL DEFAULT '0', `r_eventid` bigint(20) unsigned DEFAULT NULL, `r_clock` int(11) NOT NULL DEFAULT '0', `r_ns` int(11) NOT NULL DEFAULT '0', `correlationid` bigint(20) unsigned DEFAULT NULL, `userid` bigint(20) unsigned DEFAULT NULL, PRIMARY KEY (`eventid`), KEY `problem_1` (`source`,`object`,`objectid`), KEY `problem_2` (`r_clock`), KEY `c_problem_2` (`r_eventid`), CONSTRAINT `c_problem_1` FOREIGN KEY (`eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE, CONSTRAINT `c_problem_2` FOREIGN KEY (`r_eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin | ||||||||
Comment by Vladislavs Sokurenko [ 2017 Dec 29 ] | ||||||||
Can you try this please explain select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid) and not exists (select null from problem where events.eventid=problem.r_eventid)\G | ||||||||
Comment by Ronald Schaten [ 2017 Dec 29 ] | ||||||||
mysql> explain select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid) and not exists (select null from problem where events.eventid=problem.r_eventid)\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: events partitions: NULL type: ref possible_keys: events_1,events_2 key: events_2 key_len: 8 ref: const,const rows: 15973658 filtered: 100.00 Extra: Using where; Using index *************************** 2. row *************************** id: 3 select_type: DEPENDENT SUBQUERY table: problem partitions: NULL type: ref possible_keys: c_problem_2 key: c_problem_2 key_len: 9 ref: zabbix.events.eventid rows: 1 filtered: 100.00 Extra: Using index *************************** 3. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: problem partitions: NULL type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 8 ref: zabbix.events.eventid rows: 1 filtered: 100.00 Extra: Using index 3 rows in set, 3 warnings (0,00 sec) | ||||||||
Comment by Vladislavs Sokurenko [ 2017 Dec 29 ] | ||||||||
Thanks, Can you confirm that second query is faster ? select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid or events.eventid=problem.r_eventid); select min(clock) from events where events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid) and not exists (select null from problem where events.eventid=problem.r_eventid); | ||||||||
Comment by Ronald Schaten [ 2017 Dec 29 ] | ||||||||
Yes. The second query took less than 18 minutes, the original one ran more than eight hours before I terminated it. | ||||||||
Comment by Vladislavs Sokurenko [ 2017 Dec 29 ] | ||||||||
Thanks allot created | ||||||||
Comment by Oleksii Zagorskyi [ 2018 Feb 19 ] | ||||||||
I'm on mysql and started my zabbix from 3.4. KEY `c_problem_2` (`r_eventid`), KEY `problem_3` (`r_eventid`), Before upgrade I had only `c_problem_2` index. Maybe the procedure should be more smart to not create duplicated indexes for mysql? vso this procedure does not create duplicate indexes, you can run patch multiple times and it will no longer create index with the name 'problem_3'. Could you please attach outout of show create table problem; ? zalex_ua those 2 lines are from the such output. I did already "drop index c_problem_2 on problem;" as it's a production database, which I had to dump and restore to maintain it, there was a reason. static int DBpatch_3040006(void) { if (FAIL == DBindex_exists("problem", "problem_3")) return DBcreate_index("problem", "problem_3", "r_eventid", 0); Ahh, I had db backup (because I had to manage it), here is schema before I ran 3.4.6 after 3.4.5: mysql> show create table problem \G *************************** 1. row *************************** Table: problem Create Table: CREATE TABLE `problem` ( `eventid` bigint(20) unsigned NOT NULL, `source` int(11) NOT NULL DEFAULT '0', `object` int(11) NOT NULL DEFAULT '0', `objectid` bigint(20) unsigned NOT NULL DEFAULT '0', `clock` int(11) NOT NULL DEFAULT '0', `ns` int(11) NOT NULL DEFAULT '0', `r_eventid` bigint(20) unsigned DEFAULT NULL, `r_clock` int(11) NOT NULL DEFAULT '0', `r_ns` int(11) NOT NULL DEFAULT '0', `correlationid` bigint(20) unsigned DEFAULT NULL, `userid` bigint(20) unsigned DEFAULT NULL, PRIMARY KEY (`eventid`), KEY `problem_1` (`source`,`object`,`objectid`), KEY `problem_2` (`r_clock`), KEY `c_problem_2` (`r_eventid`), CONSTRAINT `c_problem_1` FOREIGN KEY (`eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE, CONSTRAINT `c_problem_2` FOREIGN KEY (`r_eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin 1 row in set (0.00 sec) Strange, I performed a clean test: created DB schema+images+data from 3.4.5 sources, upgrade it by 3.4.7 binary, what I had and received `c_problem_2` index magically replaced by `problem_3` index. mysql> show create table problem \G *************************** 1. row *************************** Table: problem Create Table: CREATE TABLE `problem` ( `eventid` bigint(20) unsigned NOT NULL, `source` int(11) NOT NULL DEFAULT '0', `object` int(11) NOT NULL DEFAULT '0', `objectid` bigint(20) unsigned NOT NULL DEFAULT '0', `clock` int(11) NOT NULL DEFAULT '0', `ns` int(11) NOT NULL DEFAULT '0', `r_eventid` bigint(20) unsigned DEFAULT NULL, `r_clock` int(11) NOT NULL DEFAULT '0', `r_ns` int(11) NOT NULL DEFAULT '0', `correlationid` bigint(20) unsigned DEFAULT NULL, `userid` bigint(20) unsigned DEFAULT NULL, PRIMARY KEY (`eventid`), KEY `problem_1` (`source`,`object`,`objectid`), KEY `problem_2` (`r_clock`), KEY `c_problem_2` (`r_eventid`), KEY `problem_3` (`r_eventid`), CONSTRAINT `c_problem_1` FOREIGN KEY (`eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE, CONSTRAINT `c_problem_2` FOREIGN KEY (`r_eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin 1 row in set (0.00 sec) For both DBs (just created from schema+images+data SQLs and recovered from dump), upgrade debug log is identical: z345: 20173:20180219:125955.039 starting automatic database upgrade 20173:20180219:125955.039 query [txnlev:1] [begin;] 20173:20180219:125955.039 query [txnlev:1] [show index from problem where key_name='problem_3'] 20173:20180219:125955.040 query [txnlev:1] [create index problem_3 on problem (r_eventid)] 20173:20180219:125955.058 query [txnlev:1] [update dbversion set optional=3040006] 20173:20180219:125955.059 query [txnlev:1] [commit;] 20173:20180219:125955.059 completed 100% of database upgrade 20173:20180219:125955.059 database upgrade fully completed z345recovered: 20351:20180219:130053.868 starting automatic database upgrade 20351:20180219:130053.868 query [txnlev:1] [begin;] 20351:20180219:130053.868 query [txnlev:1] [show index from problem where key_name='problem_3'] 20351:20180219:130053.869 query [txnlev:1] [create index problem_3 on problem (r_eventid)] 20351:20180219:130053.886 query [txnlev:1] [update dbversion set optional=3040006] 20351:20180219:130053.887 query [txnlev:1] [commit;] 20351:20180219:130053.887 completed 100% of database upgrade 20351:20180219:130053.887 database upgrade fully completed Of course mysqldump, taken from created database, contains a line KEY `c_problem_2` (`r_eventid`), while schema.sql does not. Fuhh, probably it should not be ignored, as it really happened in production vso this is very unfortunate that MySQL acts differently when you restore backup, we could add another condition for 'c_problem_2' and would have to rename this index if it exists. zalex_ua maybe. Please take care on it further. And probably 3.4.6 schema should be checked the same way, because you have added the index creation explicitly to schema.sql. vso It's best if you create a separate bug report. |