[ZBX-6804] [Patch] Initial loading of graphs is slow -- at least with PostgreSQL and huge tables Created: 2013 Jul 18  Updated: 2017 May 30  Resolved: 2013 Oct 09

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Frontend (F)
Affects Version/s: 2.0.7rc1, 2.1.1
Fix Version/s: 2.0.9rc1, 2.1.5

Type: Incident report Priority: Critical
Reporter: Volker Fröhlich Assignee: Unassigned
Resolution: Fixed Votes: 3
Labels: performance, trivial, usability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: JPEG File sadpuppy.jpg     File zabbix-2.0.6-speed_up_graphs.patch    

 Description   

I noticed this issue on PG, but it might exists with other DB backends as well: Loading a graph can take mighty long, like 30 seconds, unless the data is in the page cache. So, if you visit a graph you've never been to or haven't been to for a long time, you're likely going to experience this. Needless to say, it greatly hinders the use of screens too. This is mostly due to ...

include/graphs.inc.php:get_min_itemclock_by_itemid()
SELECT MIN(ht.clock) AS min_clock FROM trends_uint ht WHERE ht.itemid IN ('54956','54959','55026','55035','60728')

... particularly for graphs with short periods selected. It gets worse the more itemids are inside IN().

The resulting query plan on PG is giving quite poor results:

Query plan
zabbix20=# SELECT count(*) from trends;
  count  
----------
 29349921
 
 
explain analyze SELECT MIN(ht.clock) AS min_clock FROM trends_uint ht WHERE ht.itemid IN ('54957','54960','55027','55036','60728');
 
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=136689.71..136689.72 rows=1 width=4) (actual time=28713.804..28713.804 rows=1 loops=1)
   ->  Bitmap Heap Scan on trends_uint ht  (cost=973.83..136587.39 rows=40930 width=4) (actual time=14.356..28701.949 rows=12976 loops=1)
         Recheck Cond: (itemid = ANY ('{54957,54960,55027,55036,60728}'::bigint[]))
         ->  Bitmap Index Scan on trends_uint_pkey  (cost=0.00..963.60 rows=40930 width=0) (actual time=4.013..4.013 rows=12976 loops=1)
               Index Cond: (itemid = ANY ('{54957,54960,55027,55036,60728}'::bigint[]))
 Total runtime: 28713.849 ms
(6 rows)

A different query yields results vastly quicker:

Query plan for modified query
                                                                         QUERY PLAN                         
                                                 
------------------------------------------------------------------------------------------------------------
-------------------------------------------------
 Aggregate  (cost=33.23..33.33 rows=1 width=4) (actual time=0.140..0.140 rows=1 loops=1)
   ->  Append  (cost=3.95..33.21 rows=8 width=4) (actual time=0.030..0.137 rows=8 loops=1)
         ->  Result  (cost=3.95..4.05 rows=1 width=0) (actual time=0.029..0.029 rows=1 loops=1)
               InitPlan 17 (returns $16)
                 ->  Limit  (cost=0.00..3.95 rows=1 width=4) (actual time=0.026..0.026 rows=1 loops=1)
                       ->  Index Only Scan using trends_pkey on trends ht  (cost=0.00..24305.42 rows=6152 wi
dth=4) (actual time=0.026..0.026 rows=1 loops=1)
                             Index Cond: ((itemid = 34291) AND (clock IS NOT NULL))
                             Heap Fetches: 1
         ->  Result  (cost=3.95..4.05 rows=1 width=0) (actual time=0.028..0.028 rows=1 l
oops=1)
               InitPlan 18 (returns $17)
                 ->  Limit  (cost=0.00..3.95 rows=1 width=4) (actual time=0.027..0.027 rows=1 loops=1)
                       ->  Index Only Scan using trends_pkey on trends ht  (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.026..0.026 rows=1 loops=1)
                             Index Cond: ((itemid = 34295) AND (clock IS NOT NULL))
                             Heap Fetches: 1
         ->  Result  (cost=3.95..4.05 rows=1 width=0) (actual time=0.016..0.017 rows=1 loops=1)
               InitPlan 19 (returns $18)
                 ->  Limit  (cost=0.00..3.95 rows=1 width=4) (actual time=0.015..0.015 rows=1 loops=1)
                       ->  Index Only Scan using trends_pkey on trends ht  (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.014..0.014 rows=1 loops=1)
                             Index Cond: ((itemid = 34296) AND (clock IS NOT NULL))
                             Heap Fetches: 1
         ->  Result  (cost=3.95..4.05 rows=1 width=0) (actual time=0.007..0.007 rows=1 loops=1)
               InitPlan 20 (returns $19)
                 ->  Limit  (cost=0.00..3.95 rows=1 width=4) (actual time=0.006..0.006 rows=0 loops=1)
                       ->  Index Only Scan using trends_pkey on trends ht  (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.006..0.006 rows=0 loops=1)
                             Index Cond: ((itemid = 34297) AND (clock IS NOT NULL))
                             Heap Fetches: 0
         ->  Result  (cost=3.95..4.05 rows=1 width=0) (actual time=0.010..0.010 rows=1 loops=1)
               InitPlan 21 (returns $20)
                 ->  Limit  (cost=0.00..3.95 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1)
                       ->  Index Only Scan using trends_pkey on trends ht  (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.008..0.008 rows=1 loops=1)
                             Index Cond: ((itemid = 34299) AND (clock IS NOT NULL))
                             Heap Fetches: 1
         ->  Result  (cost=3.95..4.05 rows=1 width=0) (actual time=0.012..0.013 rows=1 loops=1)
               InitPlan 22 (returns $21)
                 ->  Limit  (cost=0.00..3.95 rows=1 width=4) (actual time=0.011..0.011 rows=1 loops=1)
                       ->  Index Only Scan using trends_pkey on trends ht  (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.011..0.011 rows=1 loops=1)
                             Index Cond: ((itemid = 34300) AND (clock IS NOT NULL))
                             Heap Fetches: 1
         ->  Result  (cost=3.95..4.05 rows=1 width=0) (actual time=0.016..0.016 rows=1 loops=1)
               InitPlan 23 (returns $22)
                 ->  Limit  (cost=0.00..3.95 rows=1 width=4) (actual time=0.015..0.015 rows=1 loops=1)
                       ->  Index Only Scan using trends_pkey on trends ht  (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.015..0.015 rows=1 loops=1)
                             Index Cond: ((itemid = 34303) AND (clock IS NOT NULL))
                             Heap Fetches: 1
         ->  Result  (cost=3.95..4.05 rows=1 width=0) (actual time=0.013..0.013 rows=1 loops=1)
               InitPlan 24 (returns $23)
                 ->  Limit  (cost=0.00..3.95 rows=1 width=4) (actual time=0.013..0.013 rows=1 loops=1)
                       ->  Index Only Scan using trends_pkey on trends ht  (cost=0.00..24305.42 rows=6152 width=4) (actual time=0.013..0.013 rows=1 loops=1)
                             Index Cond: ((itemid = 34304) AND (clock IS NOT NULL))
                             Heap Fetches: 1
 Total runtime: 0.392 ms
(51 rows)

To be fair, those were in the page cache already, but it hardly matters. The magnitude of improvement is still on a scale of more than factor 1000 for any given itemids.

The attached patch is a proof of concept. There are probably a lot more places that could profit from this kind of change.



 Comments   
Comment by Marc [ 2013 Jul 18 ]

Confirm improvement from 15 sec to <1 sec

Comment by Florian Koch [ 2013 Aug 05 ]

+1 confirm 5-10x improvment

Comment by Alexei Vladishev [ 2013 Aug 20 ]

We need to fix it for both 2.0x and 2.2.x.

Comment by Alexei Vladishev [ 2013 Aug 20 ]

According to PostgreSQL developers (#postgresql freenode IRC), the inefficiency was fixed in 9.2.4 however I failed to find any evidence in PostgreSQL release notes.

Comment by Florian Koch [ 2013 Aug 24 ]

i have updated to postgresql 9.2.4 and zabbix 2.0.8 today, without the patch, i can't see a speedup.

Comment by Igor Shipenkov [ 2013 Aug 28 ]

We use zabbix 1.8.16 with postgresql 9.2.4 and have this issue. You can extend list of affected versions.

I apply patch and it seems that all works fine now.

Comment by Eduards Samersovs (Inactive) [ 2013 Sep 11 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-6804

Comment by Volker Fröhlich [ 2013 Sep 11 ]

You're doing something different there. I found the order by/limit approach not to bring any benefit.

Comment by Alexander Vladishev [ 2013 Sep 13 ]

Hi Volker,

I tested SQL requests. Here results of testing.

Configuration:

  • Number of items: 2000
  • Update interval 120s
  • Keep trends: 2y
  • Keep history: 60d

Number of records

  • trends: 35040000
  • history: 86400000

Tests

  • Number of requests: 1990 (all unique)
  • Number of items in request: 11

Test results (average values in seconds):

Current (before changes) SQL statement:
select min(ht.clock) as min_clock from history ht where ht.itemid between ? and ?;
Table name MySQL 5.5.32 PostgreSQL 8.4.17 PostgreSQL 9.1.9
history 0.31 1.69 1.83
trends 0.084 0.471 0,497
Our SQL statement (r38414):
select h.clock as min_clock from history h where h.itemid between ? and ? order by h.clock limit 1 offset 0;
Table name MySQL 5.5.32 PostgreSQL 8.4.17 PostgreSQL 9.1.9
history 0.377 1.761 1.877
trends 0.127 0.495 0,515
Your SQL statement:
select min(ht.item_min_clock) as min_clock from (
 select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
 union all select min(ht.clock) as item_min_clock from history ht where ht.itemid = ?
) as ht;
Table name MySQL 5.5.32 PostgreSQL 8.4.17 PostgreSQL 9.1.9
history 0.0013 0.0027 0.0049
trends 0.0012 0.0025 0.0051

Summary

I confirm improvement with Volker's patch. It will be integrated into Zabbix.

Many thanks!

Comment by Alexander Vladishev [ 2013 Sep 13 ]

(1) Please apply the Volker's patch. It works quicker!

Eduards RESOLVED r.38465

sasha CLOSED

Comment by Alexander Vladishev [ 2013 Sep 13 ]

(2) The patch doesn't work on Oracle. It shall be a little changed:

SELECT min (...) AS min_clock FROM (...) AS ht;   =>   SELECT min (...) AS min_clock FROM (...) ht;

It works on MySQL, PostgreSQL and IBM DB2 too.

Eduards RESOLVED r.38472,38473

sasha CLOSED

Comment by Alexander Vladishev [ 2013 Sep 13 ]

Successfully tested!

Comment by Eduards Samersovs (Inactive) [ 2013 Sep 13 ]

Fixed in versions pre-2.1.5 (trunk) r.38477, 2.0.9rc1 r.38478

Comment by Volker Fröhlich [ 2013 Sep 25 ]

Will be in EPEL 2.0.8-3 packages.

Comment by richlv [ 2013 Oct 01 ]

(3) this issue was closed without adding the improvement to whatsnew. it makes puppies sad.

martins-v For review:

https://www.zabbix.com/documentation/2.0/manual/introduction/whatsnew209#faster_loading_graphs_with_postgresql
https://www.zabbix.com/documentation/2.2/manual/introduction/whatsnew220#faster_loading_graphs_with_postgresql

sasha It's not PostgreSQL only. It will be faster on all databases!

martins-v New version:

https://www.zabbix.com/documentation/2.0/manual/introduction/whatsnew209#faster_loading_graphs
https://www.zabbix.com/documentation/2.2/manual/introduction/whatsnew220#faster_loading_graphs

sasha Great! CLOSED

The puppy should be happy!

Generated at Tue Apr 23 10:27:00 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.