[ZBX-4190] Need to break apart the compounding of deletes followed by a logical or up in housekeeper.c Created: 2011 Sep 30  Updated: 2017 May 30  Resolved: 2013 Dec 31

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.8.8
Fix Version/s: None

Type: Incident report Priority: Minor
Reporter: Kam Lane Assignee: Unassigned
Resolution: Duplicate Votes: 1
Labels: housekeeper, mysql
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

MySQL Community Server 5.5.15-log
Scientific Linux release 6.1 (Carbon)


Issue Links:
Duplicate
duplicates ZBX-5862 DELETE query from housekeeper table i... Closed

 Description   

I'm not sure if this is a bug or an improvement, but I'm seeing a significant amount of this in my zabbix_server.log. The most interesting part is that my mysql server is on the same box with my data dir sitting on a SAN LUN and zabbix is connecting via a socket connection. The database isn't restarting; I have a feeling the code is losing it's connection due to the query below in addition to a timeout.

Is there any way to break this query up into multiple statements that operate on a smaller number of rows...like everytime an "OR" condition is thrown in the example below? That logical OR is actually a pain point in this query due to the number of items it has to lookup in the index.

26063:20110930:130423.330 [Z3005] query failed: [2006] MySQL server has gone away [delete from housekeeper where (housekeeperid in (100100000120079,100100000120080,100100000120081,100100000120082,100100000120083,100100000120084,100100000120085,100100000120086,100100000120087,100100000120088,100100000120089,100100000120090,100100000120091,100100000120092,100100000120093,100100000120094,100100000120095,100100000120096,100100000120097,100100000120098,100100000120099,100100000120100,100100000120101,100100000120102,100100000120103,100100000120104,100100000120105,100100000120106,100100000120107,100100000120108,100100000120109,100100000120110,100100000120111,100100000120112,100100000120113,100100000120114,100100000120115,100100000120116,100100000120117,100100000120118,100100000120119,100100000120120,100100000120121,100100000120122,100100000120123,100100000120124,100100000120125,100100000120126,100100000120127,100100000120128,100100000120129,100100000120130,100100000120131,100100000120132,100100000120133,100100000120134,100100000120135,100100000120136,100100000120137,100100000120138,100100000120139,100100000120140,100100000120141,100100000120142,100100000120143,100100000120144,100100000120145,100100000120146,100100000120147,100100000120148,100100000120149,100100000120150,100100000120151,100100000120152,100100000120153,100100000120154,100100000120155,100100000120156,100100000120157,100100000120158,100100000120159,100100000120160,100100000120161,100100000120162,100100000120163,100100000120164,100100000120165,100100000120166,100100000120167,100100000120168,100100000120169,100100000120170,100100000120171,100100000120172,100100000120173, ...
0000121932,100100000121933,100100000121934,100100000121935,100100000121936,100100000121937,100100000121938,100100000121939,100100000121940,100100000121941,100100000121942,100100000121943,100100000121944,100100000121945,100100000121946,100100000121947,100100000121948,100100000121949,100100000121950,100100000121951,100100000121952,100100000121953,100100000121954,100100000121955,100100000121956,100100000121957,100100000121958,100100000121959,100100000121960,100100000121961,100100000121962,100100000121963,100100000121964,100100000121965,100100000121966,100100000121967,100100000121968,100100000121969,100100000121970,100100000121971,100100000121972,100100000121973,100100000121974,100100000121975,100100000121976,100100000121977,100100000121978) or housekeeperid in (100100000121979,100100000121980,100100000121981,100100000121982,100100000121983,100100000121984,100100000121985,100100000121986,100100000121987,100100000121988,100100000121989,100100000121990,100100000121991,100100000121992,100100000121993,100100000121994,100100000121995,100100000121996,100100000121997,100100000121998,100100000121999,100100000122000,100100000122001,100100000122002,100100000122003,100100000122004,100100000122005,100100000122006,100100000122007,100100000122008,100100000122009,100100000122010,100100000122011,100100000122012,100100000122013,100100000122014,100100000122015,100100000122016,100100000122017,100100000122018,100100000122019,100100000122020,10010000
...
0000123844,100100000123845,100100000123846,100100000123847,100100000123848,100100000123849,100100000123850,100100000123851,100100000123852,100100000123853,100100000123854,100100000123855,100100000123856,100100000123857,100100000123858,100100000123859,100100000123860,100100000123861,100100000123862,100100000123863,100100000123864,100100000123865,100100000123866,100100000123867,100100000123868,100100000123869,100100000123870,100100000123871,100100000123872,100100000123873,100100000123874,100100000123875,100100000123876,100100000123877,100100000123878) or housekeeperid in (100100000123879,100100000123880,100100000123881,100100000123882,100100000123883,100100000123884,100100000123885,100100000123886,100100000123887,100100000123888,100100000123889,100100000123890,100100000123891,100100000123892,100100000123893,100100000123894,100100000123895,100100000123896,100100000123897,100100000123898,100100000123899,100100000123900,100100000123901,100100000123902,100100000123903,100100000123904,100100000123905,100100000123906,100100000123907,100100000123908,100100000123909,100100000123910,100100000123911,100100000123912,100100000123913,100100000123914,100100000123915,100100000123916,100100000123917,100100000123918,100100000123919,100100000

zabbix[;]> show table status where Name = 'housekeeper';
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Name Engine Version Row_format Rows Avg_row_length Data_length Max_data_length Index_length Data_free Auto_increment Create_time Update_time Check_time Collation Checksum Create_options Comment

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

housekeeper InnoDB 10 Compact 220902 59 13123584 0 0 27262976 NULL 2011-09-23 21:10:55 NULL NULL utf8_general_ci NULL    

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

zabbix[;]> show index from housekeeper;
----------------------------------------------------------------------------------------------------------------------------

Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment

----------------------------------------------------------------------------------------------------------------------------

housekeeper 0 PRIMARY 1 housekeeperid A 222050 NULL NULL   BTREE    

----------------------------------------------------------------------------------------------------------------------------
1 row in set (0.00 sec)

zabbix[;]> select count from housekeeper;
----------

count

----------

221172

----------
1 row in set (0.00 sec)



 Comments   
Comment by Kam Lane [ 2011 Oct 10 ]

Further analysis executing the same queries by hand that end up in the log [as seen above], led me to discover that even a manual run of these queries [or subsets of these queries splitting at the 'OR'] were to big in terms of character/string size and the mysql server would kill my connection. I was able to cut down on this occurance significantly, but not completely, by modifying the mysql my.cnf file and adding:

max_allowed_packet = 16M

I tried many other values for packet size, but 16M was really the only value that I could change "max_allowed_packet" to in MySQL 5.5.15 and actually have that value hold through the server restart and be visible when calling the variable [show variables like 'max_allowed_%';]. The MySQL developer site talks about the max_allowed_packet size [the maximum size of one packet or any generated/intermediate string] by default being 1MB and obviously these queries can exceed that. I've also been wondering if zabbix is holding on to open database connections to long and round robin-ing them, but that's for another investigation.

To work around this, I propose that the code that generates the housekeeper delete queries be refactored to keep all queries under 1MB, as that's the default for the MySQL server. I also recommend that the documentation be updated in the interim stating that the 'max_allowed_packet' be increased from the default of 1MB.

The big issue here is if the code isn't re-factored breaking apart the compounding queries, it only takes a few failed queries before the housekeeper table starts queuing up and the process won't ever be able to cleanup as it's exceeded the max_allowed_packet size. The max max_allowed_packet size is 1GB on both the client and server. If it were to ever queue up to this, manual intervention would be required to cleanup the table.

Comment by Alexander Vladishev [ 2013 Dec 31 ]

Closed as duplicate.

Generated at Fri Apr 26 19:40:00 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.