[ZBX-4881] Slow select queries on large databases with partitions (at least on postgresql) Created: 2012 Apr 16  Updated: 2017 May 30  Resolved: 2012 Apr 19

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.8.11
Fix Version/s: 1.8.13rc1, 2.0.0rc4

Type: Incident report Priority: Major
Reporter: Sébastien Riccio Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: performance, sql
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux 2.6.32-5-amd64 #1 SMP Mon Oct 3 03:59:20 UTC 2011 x86_64 GNU/Linux
Postgresql 9.0 - Raid 10 HDDs - 8GB ram


Attachments: Text File ZBX-4881.patch    
Issue Links:
Duplicate

 Description   

Hi,

I dunno where to write about this so I'll try here.

I noticed that often our zabbix server loads being stuck on queries like this one:

select value from history where itemid=23719 and clock<=1334581521 order by itemid,clock desc limit 1

It seems, if i'm not wrong, that this query is used by the item.last trigger (to get the last value of an item) Could also be used by item.delta (with limit 2 instead of 1), etc...

The thing is that our history table is partitionned with constraints so we can disable housekeeper which is a resource eater on large dbs and like this we can keep a nice amount of history without too much perfs drop.

But sometimes such queries takes more than 5 minutes to complete, because clock<=xxxxxxxxx makes postgresql browse all the table partitions and render the partitionning quite useless.

if you replace that kind of query with (specifying a time range)
select value from history where itemid=23719 and clock<=1334581521 and clock >=1334580521 order by itemid,clock desc limit 1

it will be lightning fast at it will only read the patition table concerned by this time range.

Wouldn't be an idea to modify those queries so they use a time range to drastically improve the perfs for the people using partitons or
maybe not use less or equal but greater or equal for the clock ?? I must admit I don't get it why the query is built like this.
If you're getting the last value usually it should younger than "current time minus the configured item check interval"...

Help please

Here you'll find EXPLAINs for both queries :

select value from history where itemid=23719 and clock<=1334581521 and clock >=1334580521 (takes ages to complete)

QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Limit (cost=735231.66..735231.66 rows=1 width=25)
-> Sort (cost=735231.66..735931.35 rows=279878 width=25)
Sort Key: public.history.clock
-> Result (cost=0.00..733832.27 rows=279878 width=25)
-> Append (cost=0.00..733832.27 rows=279878 width=25)
-> Seq Scan on history (cost=0.00..0.00 rows=1 width=28)
Filter: ((clock <= 1334581521) AND (itemid = 23719))
-> Bitmap Heap Scan on history_20120321 history (cost=152.26..11222.49 rows=3759 width=20)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120321_1 (cost=0.00..151.32 rows=3759 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120320 history (cost=132.48..10173.12 rows=3407 width=20)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120320_1 (cost=0.00..131.63 rows=3407 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120301 history (cost=351.31..21969.12 rows=9125 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120301_1 (cost=0.00..349.03 rows=9125 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120302 history (cost=351.33..21974.20 rows=9127 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120302_1 (cost=0.00..349.05 rows=9127 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120303 history (cost=351.26..21958.50 rows=9120 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120303_1 (cost=0.00..348.98 rows=9120 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120304 history (cost=351.35..21977.17 rows=9128 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120304_1 (cost=0.00..349.06 rows=9128 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120305 history (cost=351.26..21958.50 rows=9120 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120305_1 (cost=0.00..348.98 rows=9120 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120306 history (cost=350.73..21836.59 rows=9069 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120306_1 (cost=0.00..348.46 rows=9069 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120307 history (cost=351.51..22014.24 rows=9144 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120307_1 (cost=0.00..349.23 rows=9144 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120308 history (cost=355.88..22103.22 rows=9179 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120308_1 (cost=0.00..353.58 rows=9179 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120309 history (cost=351.14..21929.37 rows=9108 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120309_1 (cost=0.00..348.86 rows=9108 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120310 history (cost=344.96..21428.49 rows=8899 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120310_1 (cost=0.00..342.74 rows=8899 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120311 history (cost=340.30..21273.45 rows=8836 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120311_1 (cost=0.00..338.09 rows=8836 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120312 history (cost=350.57..21801.49 rows=9054 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120312_1 (cost=0.00..348.31 rows=9054 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120313 history (cost=350.03..21676.76 rows=9002 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120313_1 (cost=0.00..347.78 rows=9002 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120314 history (cost=346.00..21665.42 rows=8999 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120314_1 (cost=0.00..343.75 rows=8999 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120315 history (cost=350.66..21820.31 rows=9062 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120315_1 (cost=0.00..348.39 rows=9062 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120316 history (cost=355.78..22079.92 rows=9170 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120316_1 (cost=0.00..353.49 rows=9170 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120317 history (cost=351.47..22006.01 rows=9140 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120317_1 (cost=0.00..349.19 rows=9140 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120318 history (cost=339.93..21188.30 rows=8800 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120318_1 (cost=0.00..337.73 rows=8800 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120319 history (cost=339.80..21159.03 rows=8788 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120319_1 (cost=0.00..337.61 rows=8788 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120322 history (cost=153.02..11443.90 rows=3830 width=20)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120322_1 (cost=0.00..152.06 rows=3830 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120323 history (cost=152.92..11431.14 rows=3819 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120323_1 (cost=0.00..151.96 rows=3819 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120324 history (cost=153.04..11460.62 rows=3831 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120324_1 (cost=0.00..152.08 rows=3831 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120325 history (cost=147.08..10926.57 rows=3646 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120325_1 (cost=0.00..146.17 rows=3646 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120326 history (cost=153.01..11453.16 rows=3828 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120326_1 (cost=0.00..152.05 rows=3828 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120327 history (cost=153.40..11547.90 rows=3866 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120327_1 (cost=0.00..152.44 rows=3866 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120328 history (cost=152.99..11444.28 rows=3826 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120328_1 (cost=0.00..152.03 rows=3826 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120329 history (cost=153.37..11532.73 rows=3863 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120329_1 (cost=0.00..152.40 rows=3863 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120330 history (cost=152.87..11420.21 rows=3814 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120330_1 (cost=0.00..151.92 rows=3814 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120331 history (cost=152.38..11295.92 rows=3767 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120331_1 (cost=0.00..151.44 rows=3767 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120401 history (cost=152.77..11401.47 rows=3804 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120401_1 (cost=0.00..151.82 rows=3804 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120402 history (cost=152.57..11350.62 rows=3785 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120402_1 (cost=0.00..151.63 rows=3785 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120403 history (cost=152.87..11409.86 rows=3815 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120403_1 (cost=0.00..151.92 rows=3815 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120404 history (cost=152.68..11377.87 rows=3796 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120404_1 (cost=0.00..151.74 rows=3796 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120405 history (cost=152.85..11418.48 rows=3812 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120405_1 (cost=0.00..151.90 rows=3812 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120406 history (cost=148.17..11255.88 rows=3746 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120406_1 (cost=0.00..147.24 rows=3746 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120407 history (cost=152.40..11313.51 rows=3768 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120407_1 (cost=0.00..151.46 rows=3768 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120408 history (cost=152.36..11303.00 rows=3764 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120408_1 (cost=0.00..151.42 rows=3764 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120409 history (cost=152.87..11420.30 rows=3814 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120409_1 (cost=0.00..151.92 rows=3814 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120410 history (cost=152.32..11292.28 rows=3760 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120410_1 (cost=0.00..151.38 rows=3760 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120411 history (cost=152.27..11282.03 rows=3755 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120411_1 (cost=0.00..151.33 rows=3755 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120412 history (cost=152.94..11441.59 rows=3821 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120412_1 (cost=0.00..151.99 rows=3821 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120413 history (cost=152.33..11295.74 rows=3761 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120413_1 (cost=0.00..151.39 rows=3761 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120414 history (cost=148.17..11254.87 rows=3746 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120414_1 (cost=0.00..147.24 rows=3746 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120415 history (cost=152.29..11284.67 rows=3757 width=19)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120415_1 (cost=0.00..151.35 rows=3757 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Heap Scan on history_20120416 history (cost=231.24..14557.94 rows=6047 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521))
-> Bitmap Index Scan on history_20120416_1 (cost=0.00..229.73 rows=6047 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521))
(195 rows)

select value from history where itemid=23719 and clock<=1334581521 and clock >=1334580521 order by itemid,clock desc limit 1 (ultra fast)

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Limit (cost=356.30..356.30 rows=1 width=28)
-> Sort (cost=356.30..356.53 rows=92 width=28)
Sort Key: public.history.clock
-> Result (cost=0.00..355.84 rows=92 width=28)
-> Append (cost=0.00..355.84 rows=92 width=28)
-> Seq Scan on history (cost=0.00..0.00 rows=1 width=28)
Filter: ((clock <= 1334581521) AND (clock >= 1334580521) AND (itemid = 23719))
-> Bitmap Heap Scan on history_20120416 history (cost=6.42..355.84 rows=91 width=28)
Recheck Cond: ((itemid = 23719) AND (clock <= 1334581521) AND (clock >= 1334580521))
-> Bitmap Index Scan on history_20120416_1 (cost=0.00..6.39 rows=91 width=0)
Index Cond: ((itemid = 23719) AND (clock <= 1334581521) AND (clock >= 1334580521))
(11 rows)

Kind regards,
Sébastien



 Comments   
Comment by Sébastien Riccio [ 2012 Apr 17 ]

Again if I'm not wrong the function that issue that query is:
**DBget_history(zbx_uint64_t itemid, unsigned char value_type, int function, int clock_from, int clock_to, const char *field_name, int last_n);

I've looked a bit into the source code to try to understand when such a query (with only clock_from arg passed, and no clock_to)

root@eye:~/install/zabbix-1.8.11/src/libs/zbxserver# grep DBget_history -R *
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, 0, "logeventid", 1);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, 0, "source", 1);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, 0, "severity", 1);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_COUNT,
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE,
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE,
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_SUM, now - arg1, now, NULL, 0);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, now, NULL, arg1);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_AVG, now - arg1, now, NULL, 0);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, now, NULL, arg1);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, now, NULL, arg1);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_MIN, now - arg1, now, NULL, 0);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, now, NULL, arg1);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_MAX, now - arg1, now, NULL, 0);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, now, NULL, arg1);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_DELTA, now - arg1, now, NULL, 0);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, now, NULL, arg1);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, now, NULL, 2);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, 0, now, NULL, arg2);
evalfunc.c: h_value = DBget_history(item->itemid, item->value_type, ZBX_DB_GET_HIST_VALUE, now - arg2, now, NULL, 0);

Here there is nothing "suspicious", all are using a clock_to parameter.

But this one:

zabbix_server/poller/checks_aggregate.c: h_value = DBget_history(itemid, value_type, item_func, clock_from, 0, NULL, 0);

Could be the one generating this query as it seems to pass clock_from but 0 for clock_to.

What is this function used for ?

Thanks

Comment by Sébastien Riccio [ 2012 Apr 18 ]

Sorry,

Looking at it again it seems I've inverted clock_from and clock_to.

So that means a lot of DBget_history calls can be the cause.

Still trying to find why some are so slow and other aren't.

Comment by Alexander Vladishev [ 2012 Apr 18 ]

Hi,

I confirm a given problem. It will be fixed before release 2.0.0.

Thank you!

Comment by Sébastien Riccio [ 2012 Apr 18 ]

Hi Alexander,

Thanks a lot for the patch, I've applied it to the 1.8.11 source and recompiled.
I'll let you know how things goes

Thanks again,
Sébastien

Comment by Alexander Vladishev [ 2012 Apr 19 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-4881

Comment by Sébastien Riccio [ 2012 Apr 19 ]

Hi Alexander,

Your patch is working like a charm, the average load of the box is now very low and we can't see any more queries slowing down all the processing.

But, it seems seems it brought a bug with the notifications.

The values for the item in the mail notifications are now UNKNOWN. This started to happen after I applied your patch.

Here is an example:

Alert:
Monitoring has detected the following problem:

Host: celsius-B1
What: Sensor 1 - Temperature too high: PROBLEM
When: 2012.04.19 at 09:17:46 (0m ago)
Severity: High

More informations:

Sensor 1 is now UNKNOWN

Recovery:
Monitoring reports that the following problem is now gone:

Host: celsius-B1
What: Sensor 1 - Temperature too high: OK
When: 2012.04.19 at 09:19:46 (0m ago)
Severity: High

More informations:

Sensor 1 is now UNKNOWN

(edit: here is the macros I'm using in the mail body)

SwissCenter Monitoring has detected the following problem:

Host:

{HOSTNAME}

What:

{TRIGGER.NAME}

:

{STATUS}

When:

{EVENT.DATE}

at

{EVENT.TIME}

(

{EVENT.AGE}

ago)
Severity:

{TRIGGER.SEVERITY}

More informations:

{ITEM.NAME1}

is now

{ITEM.VALUE1}

Maybe a side effect of changing the queries that are used to get this value ?

Thanks a lot for your work,
Sébastien

Comment by Sébastien Riccio [ 2012 Apr 19 ]

I changed my actions messages and replaced ITEM.VALUE1 with ITEM.LASTVALUE1 and it fixes the problem.

But still it was working with ITEM.VALUE1 before applying the patch. Could it be a side effect of changes made to :

in file /src/libs/zbxserver/expression.c

the function

int substitute_simple_macros(DB_EVENT *event, DB_ITEM *item, DC_HOST *dc_host,
DC_ITEM *dc_item, DB_ESCALATION *escalation,
char **data, int macro_type, char *error, int maxerrlen)

{ [...] else if (0 == strcmp(m, MVAR_ITEM_VALUE)) ret = DBget_item_value(&event->trigger, &replace_to, N_functionid, event->clock); [...] }

make uses of

static int DBget_item_value(DB_TRIGGER *trigger, char **value, int N_functionid, int clock)

{ [...] h_value = DBget_history(itemid, value_type, ZBX_DB_GET_HIST_VALUE, 0, clock, delay, NULL, 1); [...] }

It seems to return no value (maybe clock range too short?)

Thanks,
Sébastien

Comment by Alexey Pustovalov [ 2012 Apr 21 ]

Hi Sébastien,

Please try development branch svn://svn.zabbix.com/branches/dev/ZBX-4881
It should resolve the problem.

Regards, Alexey.

Comment by Oleksii Zagorskyi [ 2012 Apr 21 ]

Forum threads where this fix can be published as a message:
http://www.zabbix.com/forum/showthread.php?p=87758
http://www.zabbix.com/forum/showthread.php?t=14208

Comment by Sébastien Riccio [ 2012 Apr 21 ]

Hi Alexey,

I tried to build the branch but seems there is something wrong, or maybe I am doing something wrong:

  1. svn co svn://svn.zabbix.com/branches/dev/ZBX-4881
  2. cd ZBX-4881
  3. ./bootstrap.sh
  4. ./configure --enable-agent --enable-server --with-pgsql --with-net-snmp --with-jabber --with-libcurl --with-ssh2 --with-libcurl --with-ssh2
  5. make -j5
    [...output suppressed...]
    make[3]: Entering directory `/root/install/ZBX-4881/src/libs/zbxdbhigh'
    make[3]: *** No rule to make target `dbschema.c', needed by `libzbxdbhigh_a-dbschema.o'. Stop.
    make[3]: Leaving directory `/root/install/ZBX-4881/src/libs/zbxdbhigh'
    make[2]: *** [all-recursive] Error 1
    make[2]: Leaving directory `/root/install/ZBX-4881/src/libs'
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/root/install/ZBX-4881/src'
    make: *** [all-recursive] Error 1

Cheers,
Sébastien

Comment by richlv [ 2012 Apr 21 ]

see https://zabbix.org/wiki/Compilation_instructions - you must run "make dbschema" before make

Comment by Sébastien Riccio [ 2012 Apr 21 ]

Awww sorry I skipped this step. Thanks a lot richlv!

Comment by Sébastien Riccio [ 2012 Apr 21 ]

Compiled and installed, I can confirm it now indeed works with ITEM.VALUE1 macro in actions.

Cheers,
Sébastien

Comment by dimir [ 2012 Apr 27 ]

Successfully tested.

Comment by Alexander Vladishev [ 2012 Apr 28 ]

Available in versions pre-1.8.13rc1 r27169 and pre-2.0.0rc4 r27178.

Generated at Wed Apr 24 13:30:39 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.