[ZBX-10652] Zabbix server generates slow "select clock,ns,value from history_uint..." queries in case of missing data for the items Created: 2016 Apr 12 Updated: 2024 Jun 03 Resolved: 2016 Jul 18 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 2.4.7 |
Fix Version/s: | None |
Type: | Incident report | Priority: | Blocker |
Reporter: | Oleg Ivanivskyi | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 5 |
Labels: | partitioning, performance, valuecache | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Zabbix 2.4 |
Issue Links: |
|
Description |
It takes a lot of time (hours!) to get values with select clock,ns,value from history_uint where itemid=<itemid> and clock<=<timestamp> and clock><timestamp> order by clock desc limit 1 query from a big partitioned DB (e.g. ~40-50GB per each partition). These queries lead to 100% disk IO on the DB server. A possible solution is to replace the slow SQL query: select clock,ns,value from history_uint where itemid=123 and clock<=1460364250 and clock>1459759450 order by clock desc limit 1 with these two: select max(clock) from history_uint where itemid=123 and clock<=1460364250 and clock>1459759450; select clock,ns,value from history_uint where itemid=123 and clock=1460364197 order by ns desc limit 1; |
Comments |
Comment by Aleksandrs Saveljevs [ 2016 Apr 12 ] |
How about we close this as a duplicate of oleg.ivanivskyi asaveljevs, it is up to you. |
Comment by Sandis Neilands (Inactive) [ 2016 Apr 13 ] |
The proposed solution relies on the optimizer:
|
Comment by Kelly Shutt [ 2016 Apr 22 ] |
We are having similar issues on 2.4.7 after implementing partitioning a few weeks ago. I just had to remove the partitions as a temporary stopgap. After removing the partitions, everything returned to normal. I did also test those two queries as a proposed fix, and they worked perfect on MySql command line test. Initial query would take several seconds to return. However, when I split the same query into the suggested two queries, data was returned instantly. I also noticed that this is happens a LOT when adding new hosts for monitoring. Basically it's trying to cache values for the new host, even though there's obviously not any values collected yet. This causes it to go through a full scan of the history for every new item. Maybe a second part to this fix needs to be a stop measure that will just not look for existing values on "new" hosts? |
Comment by Sandis Neilands (Inactive) [ 2016 Apr 22 ] |
Thanks for your input on this issue! For completeness could you provide:
show create table history_uint\G; show table status from <your_zabbix_db> where name like 'history_uint'\G; |
Comment by Kelly Shutt [ 2016 Apr 25 ] |
As I said in my previous comment, I actually removed the partitioning last week because our server was not catching up. I added some new hosts, through a proxy instance, and it started taking forever to run these queries; which in turn caused the whole server to get behind. After an hour of waiting it was still not catching up; and was spamming notifications because everything was behind. I can tell you that I followed the procedures outlined here: https://www.zabbix.org/wiki/Docs/howto/mysql_partition show create table zabbix.history_uint\G; Table: history_uint Create Table: CREATE TABLE `history_uint` ( `itemid` bigint(20) unsigned NOT NULL, `clock` int(11) NOT NULL DEFAULT '0', `value` bigint(20) unsigned NOT NULL DEFAULT '0', `ns` int(11) NOT NULL DEFAULT '0', KEY `history_uint_1` (`itemid`,`clock`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec) show table status from zabbix where name like 'history_uint'\G; Name: history_uint Engine: InnoDB Version: 10 Row_format: Dynamic Rows: 203926818 Avg_row_length: 51 Data_length: 10518265856 Max_data_length: 0 Index_length: 5671731200 Data_free: 4194304 Auto_increment: NULL Create_time: 2016-04-22 01:49:15 Update_time: 2016-04-25 08:42:18 Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: 1 row in set (0.00 sec) |
Comment by Aleksandrs Saveljevs [ 2016 May 16 ] |
|
Comment by Kononenko Sergey [ 2016 Jun 23 ] |
Hi. mysql> show create table history_uint\G *************************** 1. row *************************** Table: history_uint Create Table: CREATE TABLE `history_uint` ( `itemid` bigint(20) unsigned NOT NULL, `clock` int(11) NOT NULL DEFAULT '0', `value` bigint(20) unsigned NOT NULL DEFAULT '0', `ns` int(11) NOT NULL DEFAULT '0', KEY `history_uint_1` (`itemid`,`clock`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin /*!50100 PARTITION BY RANGE ( clock) (PARTITION p2016_06_22 VALUES LESS THAN (1466629200) ENGINE = InnoDB, PARTITION p2016_06_23 VALUES LESS THAN (1466715600) ENGINE = InnoDB, PARTITION p2016_06_24 VALUES LESS THAN (1466802000) ENGINE = InnoDB) */ 1 row in set (0.00 sec) Very slow queries in Latest data, after add new host mysql> SELECT * FROM history_uint h WHERE h.itemid='185615' AND h.clock>1466607574 ORDER BY h.clock DESC LIMIT 2 OFFSET 0; +--------+------------+-------+----------+ | itemid | clock | value | ns | +--------+------------+-------+----------+ | 185615 | 1466697810 | 1 | 57512250 | | 185615 | 1466697510 | 1 | 26020090 | +--------+------------+-------+----------+ 2 rows in set (16.47 sec) UPD |
Comment by Alexey Pustovalov [ 2016 Jun 23 ] |
Kononenko Sergey, Try to disable "pushed_index_condition" optimizer without any custom patches. |
Comment by Kononenko Sergey [ 2016 Jun 24 ] |
Alexey Pustovalov, mysql> SET optimizer_switch='index_condition_pushdown=off'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT @@optimizer_switch\G *************************** 1. row *************************** @@optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=off,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on 1 row in set (0.00 sec) mysql> SELECT * FROM history_uint h WHERE h.itemid='185615' AND h.clock>1466607574 ORDER BY h.clock DESC LIMIT 2 OFFSET 0; +--------+------------+-------+-----------+ | itemid | clock | value | ns | +--------+------------+-------+-----------+ | 185615 | 1466749110 | 1 | 36607436 | | 185615 | 1466748809 | 1 | 929473480 | +--------+------------+-------+-----------+ 2 rows in set (0.01 sec) |
Comment by Alexey Pustovalov [ 2016 Jun 24 ] |
oxygen121, It is known bug of MySQL 5.7. So I think now you do not need to use custom patches |
Comment by Kelly Shutt [ 2016 Jun 24 ] |
This appears to have helped in our case also. I just made the same change and I'm not seeing the queries get stuck anymore. I even grabbed one that I know would return no results, and queried across about 20 partitions, and it finished pretty quickly. We are still running 2.4.x release at this point; so it looks like the same fix works for both 2.4.x and 3.x releases. This is good news because we plan on upgrading soon. Thanks for this insight, I tried looking for possible solutions on the MySQL side but was having a hard time coming up with anything. |
Comment by Kelly Shutt [ 2016 Jun 24 ] |
And, there's an added side benefit. After applying this MySQL change; our steady state IOPS are down 50% from what they were previously. This will make our SAN happy. |
Comment by Kenneth Palmertree [ 2016 Jul 07 ] |
I had similar issues like above but resolved them with the following patch -> https://support.zabbix.com/browse/ZBX-10229 |
Comment by Alexey Pustovalov [ 2016 Jul 07 ] |
palmertree It is not quite good solution. Try to disable index_condition_pushdown. |
Comment by Sandis Neilands (Inactive) [ 2016 Jul 08 ] |
The issue and the workaround will be documented in known issues. Nothing to do in Zabbix since this is MySQL issue. martins-v Added to known issues for 2.4. Please review before copying to 3.0. sandis.neilands Thanks! CLOSED. |
Comment by Dmitry Verkhoturov [ 2017 Mar 21 ] |
If you see huge performance drop after upgrading to mysql 5.7.x, see this: http://stackoverflow.com/questions/39023651/mysql-5-7-strange-perfomance-reduction-with-order-by-asc-desc-on-partitioned-tab If that's your case, adding following line to mysqld section of my.cnf should help: optimizer_switch = 'index_condition_pushdown=off'
From my experience, from-the-box installations of Zabbix are not affected, only ones with partitioning enabled. |