-
Incident report
-
Resolution: Fixed
-
Critical
-
2.0.7rc1, 2.1.1
I noticed this issue on PG, but it might exists with other DB backends as well: Loading a graph can take mighty long, like 30 seconds, unless the data is in the page cache. So, if you visit a graph you've never been to or haven't been to for a long time, you're likely going to experience this. Needless to say, it greatly hinders the use of screens too. This is mostly due to ...
SELECT MIN(ht.clock) AS min_clock FROM trends_uint ht WHERE ht.itemid IN ('54956','54959','55026','55035','60728')
... particularly for graphs with short periods selected. It gets worse the more itemids are inside IN().
The resulting query plan on PG is giving quite poor results:
zabbix20=# SELECT count(*) from trends; count ---------- 29349921 explain analyze SELECT MIN(ht.clock) AS min_clock FROM trends_uint ht WHERE ht.itemid IN ('54957','54960','55027','55036','60728'); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=136689.71..136689.72 rows=1 width=4) (actual time=28713.804..28713.804 rows=1 loops=1) -> Bitmap Heap Scan on trends_uint ht (cost=973.83..136587.39 rows=40930 width=4) (actual time=14.356..28701.949 rows=12976 loops=1) Recheck Cond: (itemid = ANY ('{54957,54960,55027,55036,60728}'::bigint[])) -> Bitmap Index Scan on trends_uint_pkey (cost=0.00..963.60 rows=40930 width=0) (actual time=4.013..4.013 rows=12976 loops=1) Index Cond: (itemid = ANY ('{54957,54960,55027,55036,60728}'::bigint[])) Total runtime: 28713.849 ms (6 rows)
A different query yields results vastly quicker:
QUERY PLAN ------------------------------------------------------------------------------------------------------------ ------------------------------------------------- Aggregate (cost=33.23..33.33 rows=1 width=4) (actual time=0.140..0.140 rows=1 loops=1) -> Append (cost=3.95..33.21 rows=8 width=4) (actual time=0.030..0.137 rows=8 loops=1) -> Result (cost=3.95..4.05 rows=1 width=0) (actual time=0.029..0.029 rows=1 loops=1) InitPlan 17 (returns $16) -> Limit (cost=0.00..3.95 rows=1 width=4) (actual time=0.026..0.026 rows=1 loops=1) -> Index Only Scan using trends_pkey on trends ht (cost=0.00..24305.42 rows=6152 wi dth=4) (actual time=0.026..0.026 rows=1 loops=1) Index Cond: ((itemid = 34291) AND (clock IS NOT NULL)) Heap Fetches: 1 -> Result (cost=3.95..4.05 rows=1 width=0) (actual time=0.028..0.028 rows=1 l oops=1) InitPlan 18 (returns $17) -> Limit (cost=0.00..3.95 rows=1 width=4) (actual time=0.027..0.027 rows=1 loops=1) -> Index Only Scan using trends_pkey on trends ht (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.026..0.026 rows=1 loops=1) Index Cond: ((itemid = 34295) AND (clock IS NOT NULL)) Heap Fetches: 1 -> Result (cost=3.95..4.05 rows=1 width=0) (actual time=0.016..0.017 rows=1 loops=1) InitPlan 19 (returns $18) -> Limit (cost=0.00..3.95 rows=1 width=4) (actual time=0.015..0.015 rows=1 loops=1) -> Index Only Scan using trends_pkey on trends ht (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.014..0.014 rows=1 loops=1) Index Cond: ((itemid = 34296) AND (clock IS NOT NULL)) Heap Fetches: 1 -> Result (cost=3.95..4.05 rows=1 width=0) (actual time=0.007..0.007 rows=1 loops=1) InitPlan 20 (returns $19) -> Limit (cost=0.00..3.95 rows=1 width=4) (actual time=0.006..0.006 rows=0 loops=1) -> Index Only Scan using trends_pkey on trends ht (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.006..0.006 rows=0 loops=1) Index Cond: ((itemid = 34297) AND (clock IS NOT NULL)) Heap Fetches: 0 -> Result (cost=3.95..4.05 rows=1 width=0) (actual time=0.010..0.010 rows=1 loops=1) InitPlan 21 (returns $20) -> Limit (cost=0.00..3.95 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1) -> Index Only Scan using trends_pkey on trends ht (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.008..0.008 rows=1 loops=1) Index Cond: ((itemid = 34299) AND (clock IS NOT NULL)) Heap Fetches: 1 -> Result (cost=3.95..4.05 rows=1 width=0) (actual time=0.012..0.013 rows=1 loops=1) InitPlan 22 (returns $21) -> Limit (cost=0.00..3.95 rows=1 width=4) (actual time=0.011..0.011 rows=1 loops=1) -> Index Only Scan using trends_pkey on trends ht (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.011..0.011 rows=1 loops=1) Index Cond: ((itemid = 34300) AND (clock IS NOT NULL)) Heap Fetches: 1 -> Result (cost=3.95..4.05 rows=1 width=0) (actual time=0.016..0.016 rows=1 loops=1) InitPlan 23 (returns $22) -> Limit (cost=0.00..3.95 rows=1 width=4) (actual time=0.015..0.015 rows=1 loops=1) -> Index Only Scan using trends_pkey on trends ht (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.015..0.015 rows=1 loops=1) Index Cond: ((itemid = 34303) AND (clock IS NOT NULL)) Heap Fetches: 1 -> Result (cost=3.95..4.05 rows=1 width=0) (actual time=0.013..0.013 rows=1 loops=1) InitPlan 24 (returns $23) -> Limit (cost=0.00..3.95 rows=1 width=4) (actual time=0.013..0.013 rows=1 loops=1) -> Index Only Scan using trends_pkey on trends ht (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.013..0.013 rows=1 loops=1) Index Cond: ((itemid = 34304) AND (clock IS NOT NULL)) Heap Fetches: 1 Total runtime: 0.392 ms (51 rows)
To be fair, those were in the page cache already, but it hardly matters. The magnitude of improvement is still on a scale of more than factor 1000 for any given itemids.
The attached patch is a proof of concept. There are probably a lot more places that could profit from this kind of change.