[ZBX-12251] New values aren't counted towards trigger functions evaluation Created: 2017 Jun 01  Updated: 2024 Apr 10  Resolved: 2017 Dec 26

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 3.2.7rc1
Fix Version/s: 4.0.0alpha1, 4.0 (plan)

Type: Problem report Priority: Blocker
Reporter: Dmitry Verkhoturov Assignee: Vladislavs Sokurenko
Resolution: Fixed Votes: 8
Labels: backup, deadlock
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Zabbix 3.2.7rc1 (revision 68572)


Attachments: PNG File 2017-06-01_13-32-16.png     PNG File 2017-06-01_14-50-31.png     PNG File 2017-06-01_14-52-11.png     PNG File 2017-06-01_14-52-53.png     PNG File 2017-06-01_14-54-09.png     PNG File 2017-06-01_14-58-03.png     PNG File 2017-06-06_09-29-07.png     PNG File 2017-06-06_09-31-13.png     PNG File 2017-06-06_09-38-24.png     File ZBX-12251-3.2.6.diff     Text File ZBX-12251_3.2.5_r67445_history_syncer_debug_info.txt     Text File ZBX-12251_3.2.6_r67849_history_syncer_debug_info.txt     File query_fail_debug.patch     File reproduce.patch    
Issue Links:
Duplicate
is duplicated by ZBX-10462 Trigger status stuck Closed
is duplicated by ZBX-11768 "cannot find open problem events for ... Closed
is duplicated by ZBX-13776 Zabbix 3.4.4 doesn't resolve triggers Closed
is duplicated by ZBX-12859 False "More than 100 items having mis... Closed
Sub-task
depends on ZBXNEXT-3006 A cache to provide multiple metrics o... Closed
part of ZBX-12179 Database errors can go unnoticed with... Confirmed
Team: Team A
Sprint: Sprint 9, Sprint 10, Sprint 11, Sprint 12, Sprint 13, Sprint 14, Sprint 15, Sprint 16, Sprint 17, Sprint 18, Sprint 19, Sprint 20, Sprint 21, Sprint 23
Story Points: 5

 Description   

Hello, team.
I am observing very strange Zabbix behavior: for a random subset of my 480000 items and 430000 triggers, new values do not trigger PROBLEM/OK trigger state changes. For example, if a trigger went into PROBLEM some time ago, a new incoming value X = 0 for a condition like min(X)<1 will not switch the trigger back to the OK state.
The situation can be fixed by opening the item on the host and clearing its history/trends.
One example can be seen in the attached screenshots. I'll be happy to provide any kind of diagnostic information. I'll also check stock 3.2.6 to see if the bug reproduces there.



 Comments   
Comment by Dmitry Verkhoturov [ 2017 Jun 01 ]

Downgrading to Zabbix 3.2.6 (revision 67849), with a server restart to update the binaries, helped: the trigger I used as an example jumped to the OK state. It seems the restart affected the problem, not the downgrade.
UPD: Zabbix 3.2.6 (revision 67849) has the same problem for me right now; I'm downgrading to Zabbix 3.2.5 (revision 67445) with the ZBX-12121 patch.
My DB is percona-server-common-5.7 version 5.7.17-13-1.trusty without replication:
$ mysql --version
mysql Ver 14.14 Distrib 5.7.17-13, for debian-linux-gnu (x86_64) using readline 6.3

Comment by patrik uytterhoeven [ 2017 Jun 05 ]

I had a similar problem a while ago, but in chat no one could confirm similar behaviour.

We had a network crash, and in Zabbix many of the items in problem state would not revert back to OK even though Latest data showed new values coming in. I tried restarting the server and doing a cache reload; nothing helped.
The only solution was to unlink and clear, or probably just clear trends...

Don't know if it helps, but we are currently running Ubuntu 16.04 LTS with Zabbix 3.2.1 and PostgreSQL 9.6.0.

Comment by Dmitry Verkhoturov [ 2017 Jun 06 ]

I found exactly the same behavior in 3.2.5.
UPD: but after 6 hours of a false-positive PROBLEM state, the alert went to OK after a single "1" value and then more "0" values came in, see 2017-06-06_09-38-24.png.

Comment by Kevin Daudt [ 2017 Jun 06 ]

Noticing something similar in 3.2.3. Manually closing the problem fixes it. Looking at this issue, I do notice that some SNMP items look switched (values for stable OIDs suddenly changing). Not sure if it's due to the monitored host or Zabbix, but the triggers that got into a problem state are no longer resolving even if the data says they should.

Comment by Glebs Ivanovskis (Inactive) [ 2017 Jun 06 ]

There was an issue with triggers stuck in PROBLEM in the very infancy of 3.2: ZBX-11439. It is supposed to be fixed by ZBX-11454.

Comment by Kevin Daudt [ 2017 Jun 06 ]

The issue is not a lack of events (I see the events, and I can manually close them once manual closing is enabled for this trigger).

Comment by Dmitry Verkhoturov [ 2017 Jun 07 ]

I've uploaded a txt file with debug info from a history syncer handling a new value that came in for a problem item. I hope it sheds some light on the problem.
I can't reproduce it, but I see tens of affected triggers in my system.

Comment by Glebs Ivanovskis (Inactive) [ 2017 Jun 08 ]

A few configuration tips, if I may:

  • you can use 2h instead of 7200 in the trigger expression;
  • you can use the positional macro $1 instead of 2h in the trigger name to pull this constant out of the expression.

This way you'll save some time needed to keep everything in sync (see the illustrative example below).
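For illustration only (hypothetical host and item key), a trigger following both tips could look like this:

Name:       No new values during the last $1 on {HOST.NAME}
Expression: {example host:example.key.min(2h)}<1

Per the tip above, $1 in the name would be substituted with the 2h constant from the expression, so the period only has to be edited in one place.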

Now back to serious business. Paskal, which timezone is your Zabbix server in? Is it UTC? If it is, then the trigger lastchange from this line

 82112:20170605:172013.809 cannot find open problem events for triggerid:3414987, lastchange:1496683209

is just 4 seconds before the log message was printed:

$ TZ='' date -d @1496683209
Mon Jun  5 17:20:09 UTC 2017

In my opinion, this makes it highly unlikely that housekeeper is involved in this story.

This is an active check. Is a proxy involved in collecting this item, or is this host monitored directly by the server?

But then this line...

 28333:20170607:055648.877 In zbx_process_trigger() triggerid:3414987 value:0(0) new_value:3

value:0(0) means TRIGGER_VALUE_OK and TRIGGER_STATE_NORMAL, new_value:3 means TRIGGER_VALUE_NONE.
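For reference, a sketch of these constants in C form, with values reconstructed from the numbers quoted in this thread rather than copied from the source tree:

/* trigger value/state constants as they appear in the quoted log lines */
#define TRIGGER_VALUE_OK      0    /* the "value:0" part */
#define TRIGGER_VALUE_PROBLEM 1    /* the value stored for the trigger in the DB */
#define TRIGGER_VALUE_NONE    3    /* the "new_value:3" part */
#define TRIGGER_STATE_NORMAL  0    /* the "(0)" part of "value:0(0)" */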

Looks like you are suffering from ZBX-12121 again... Which version was the log from?

Comment by Dmitry Verkhoturov [ 2017 Jun 08 ]

Zabbix 3.2.5 (revision 67445) with the ZBX-12121 patch. I'll upgrade back to Zabbix 3.2.6 (revision 67849), because the version change doesn't make a difference for this particular problem.

Re "Which timezone is your Zabbix server situated in?" - it is UTC.
Re "In my opinion, this makes it highly unlikely that housekeeper is involved in this story." - housekeeper disabled for history* tables, we have MySQL partitioning set up.
Re "Is proxy involved in collecting this item" - no, it is collected directly.
I'll get a debug log of the problem from 3.2.6 to make things clear.

Comment by Glebs Ivanovskis (Inactive) [ 2017 Jun 08 ]

Thank you for the answers! Just in case: only the Zabbix server version matters, no need to experiment with frontend/agents/proxies.

What is very suspicious is that the server logs the current trigger value as TRIGGER_VALUE_OK; that's why, in the server's mind, this trigger does not need resolving. But, as you say, the frontend shows this trigger in PROBLEM, and the value of the trigger in the DB is 1, which is TRIGGER_VALUE_PROBLEM.

Comment by Dmitry Verkhoturov [ 2017 Jun 10 ]

New debug info from 3.2.6 attached, old one renamed.

Comment by Glebs Ivanovskis (Inactive) [ 2017 Jun 11 ]

Do you still have those logs (unfiltered)?

Comment by Vladislavs Sokurenko [ 2017 Jun 12 ]

Could you please be so kind as to check whether you had a similar configuration? I couldn't reproduce the issue right away; it might need more investigation.

What I have tried is a Zabbix agent (active) item with the key

vfs.file.time["{$FILE}"]

Refresh rate of 5 seconds

And the following trigger expression, while everything else in the trigger is left at the defaults.

{test server:vfs.file.time["{$FILE}"].now(0)} - {test server:vfs.file.time["{$FILE}"].last(0)} > 30

The problem occurs when the file is not changed for 30 seconds and resolves when I change the file.

Comment by Dmitry Verkhoturov [ 2017 Jun 12 ]

vso, I'm terribly sorry - please ignore the 3.2.5 debug log. As glebs.ivanovskis wrote, it's likely affected by ZBX-12121, while 3.2.6 is not. Please see ZBX-12251_3.2.6_r67849_history_syncer_debug_info.txt
The problem with 3.2.6 is not reproducible for a newly created item and trigger on my side; I can poke an existing bugged trigger, but I can't tell you how to create a new one under the same conditions.
vso, about what you asked me to check: it seems I can reproduce that, the alert didn't come before "touch filename".
UPD
glebs.ivanovskis, that makes great sense to me. Please be aware that I can't turn on debug log level for a long time: I can debug anything you want once the problem has already happened, but I can't run with debug enabled for an unknown time waiting for it to show up. This new logging either has to be at a high log level or has to show the things you want to see every time they are queried by a history syncer.

Comment by Glebs Ivanovskis (Inactive) [ 2017 Jun 12 ]

Now I don't think that ZBX-12121 has any effect here. After all, it was a debug log of a history syncer, and ZBX-12121 is a timer bug. It could be that all 3.2.x versions are affected. I also suspect that this bug can hardly be reproduced on demand, and I have the impression that the mistake occurs long before it can be detected. As I said above, the problem is that the in-memory trigger value differs from the in-database trigger value before recalculation and update.

I think our best shot is to provide a patch with additional sanity checks and very specific logging and roll it out on a live installation where the issue already takes place.

Comment by Dmitry Verkhoturov [ 2017 Jun 14 ]

Around the time the problem first happened, I enabled an SQL backup on the Zabbix MySQL master every 3h, the heaviest part of which looked like this:

TABLES_TO_IMPORT=($(mysql -e 'show tables from zabbix where Tables_in_zabbix not like "history%" and Tables_in_zabbix not like "trends%" and Tables_in_zabbix != "sessions";' | tail -n +2))

mysqldump --skip-add-drop-table --no-create-info --no-autocommit --skip-disable-keys --skip-add-locks --lock-all-tables --databases zabbix --tables "${TABLES_TO_IMPORT[@]}"

Effectively it meant the following options for mysqldump:

--opt
--lock-all-tables
--no-autocommit
--no-create-info
--skip-add-drop-table
--skip-add-locks
--skip-disable-keys

Because of the lock-all-tables option, whenever a long query was in progress the database stopped all writes to the affected tables and Zabbix writes stopped entirely; we restarted MySQL a couple of times after that before finding the problem.

Now I have changed it to run only once a day and without a lock, by removing lock-all-tables and adding single-transaction (see the sketch below). If that was the real root cause, then after my changes and a zabbix-server restart the problem should not appear at all.
Regardless of the root cause, the zabbix-server behavior afterwards is inadequate, and in my opinion the problem should be found and fixed.

UPD: I haven't got any reports of the problem after fixing the backup, so maybe that really was the problem. The previous update seems to have been a false positive; the trigger fixed itself after a short period.
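For reference, a sketch of the revised backup command described above (same table list; --lock-all-tables removed and --single-transaction added so InnoDB tables are dumped from a consistent snapshot without blocking writers; the once-a-day scheduling is left to cron):

TABLES_TO_IMPORT=($(mysql -e 'show tables from zabbix where Tables_in_zabbix not like "history%" and Tables_in_zabbix not like "trends%" and Tables_in_zabbix != "sessions";' | tail -n +2))

mysqldump --single-transaction --skip-add-drop-table --no-create-info --no-autocommit --skip-disable-keys --skip-add-locks --databases zabbix --tables "${TABLES_TO_IMPORT[@]}"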

Comment by richlv [ 2017 Jun 22 ]

This might be unrelated (and 2.4 is not supported), but I might be seeing the same symptoms with 2.4.6: data is collected and stored in the database, but triggers on a host are stuck in a problem state.

Comment by richlv [ 2017 Jun 22 ]

Could this also be the same as ZBX-10462?

Comment by Dmitry Verkhoturov [ 2017 Jun 23 ]

I'm not using a proxy, but it seems similar judging by the small amount of data ZBX-10462 provides.
As far as I can tell, this could really be triggered by the MySQL table locks; after I changed my backup, the problem didn't repeat, or happened less often so that I couldn't notice it.

Comment by Vladislavs Sokurenko [ 2017 Jun 27 ]

If you are able to reproduce the issue, then please try increasing the log level for the history syncer processes only:

zabbix_server -R log_level_increase="history syncer"
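The matching runtime command lowers the level back afterwards, which addresses the concern above about not keeping debug logging enabled for long:

zabbix_server -R log_level_decrease="history syncer"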

Or, if you did not delete the log, please search for the function zbx_process_trigger(); unfortunately, I don't see its end result in the attached log.

Could you please also check if you had messages like this

 20762:20170627:185502.383 ignoring query [txnlev:1] [insert into events (eventid,source,object,objectid,clock,ns,value) values (67,0,0,13572,1498578886,458560744,0);
] within failed transaction
Comment by Vladislavs Sokurenko [ 2017 Jun 27 ]

Reproduced on trunk; a database failure during the saving of events is not handled correctly.

Steps:
1. Apply patch attached
2. Create a trapper item with name and key trap1, and create a trigger on it:

 {test server:trap1.last()}=204 

3. Make the trigger go into problem state:

 ./bin/zabbix_sender -z 127.0.0.1 -p 10051 -s "test server" -k trap1 -o 204 

4. Try to resolve it by executing the command below, but when the log says to stop the database, restart the database (for example, sudo systemctl restart mariadb):

 ./bin/zabbix_sender -z 127.0.0.1 -p 10051 -s "test server" -k trap1 -o 0 

5. Observe that the trigger stays in problem state and the message shown is:

 20762:20170627:185502.383 ignoring query [txnlev:1] [insert into events (eventid,source,object,objectid,clock,ns,value) values (67,0,0,13572,1498578886,458560744,0);
] within failed transaction

6. Try to resolve it again

 ./bin/zabbix_sender -z 127.0.0.1 -p 10051 -s "test server" -k trap1 -o 0 

Expected:
It should be possible to recover from the problem state even if there was a database failure during the saving of events, or at least when new values come in.

Actual:
It's not possible to recover until a restart, even when new values come in.

Comment by Vladislavs Sokurenko [ 2017 Jun 27 ]

For anyone experiencing issues similar to this: please apply the attached patch query_fail_debug.patch for 3.2 to see if any queries fail and are ignored. It only increases logging when there are database errors, so it should not slow down the server.

Comment by Dmitry Verkhoturov [ 2017 Jun 27 ]

Regarding

ignoring query

- I didn't find any such entries in my log, sorry. The problem happened long before I found it and checked the debug log.
Regarding raw logs - I still have them, and I shared them with glebs.ivanovskis, you can ask him.
Regarding

query_fail_debug.patch

- I will apply it and put the DB back into the problem state if you really need me to check it; if you are sure the root cause has been found, I'd rather not break my production monitoring once more.

vso I see that there was a deadlock in the same place as the one I can reproduce, followed by "ignoring query" and a failed transaction; this also confirms your words that the issue goes away once lock-all-tables is removed.

 28285:20170607:060059.964 [Z3005] query failed: [1213] Deadlock found when trying to get lock; try restarting transaction

Paskal I'm very glad you reproduced the problem; please tell me if anything is needed from me for creating a fix for it.

vso yes please do:
To see the last deadlock in an InnoDB user transaction, use the SHOW ENGINE INNODB STATUS command. If frequent deadlocks highlight a problem with transaction structure or application error handling, run with the innodb_print_all_deadlocks setting enabled to print information about all deadlocks to the mysqld error log.
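For reference, the corresponding client commands (standard MySQL syntax; the setting can also be placed in my.cnf):

mysql -e 'SHOW ENGINE INNODB STATUS\G'                   # prints the LATEST DETECTED DEADLOCK section
mysql -e 'SET GLOBAL innodb_print_all_deadlocks = ON;'   # log every deadlock to the mysqld error log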

Paskal

------------------------
LATEST DETECTED DEADLOCK
------------------------
2017-06-28 15:18:55 0x7f8c33030700
*** (1) TRANSACTION:
TRANSACTION 98361978762, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 8 lock struct(s), heap size 1136, 5 row lock(s), undo log entries 2
MySQL thread id 94485929, OS thread handle 140238525007616, query id 7749298672 mon-aws-fr-1.iponweb.iponweb.net 10.32.1.158 zabbix updating
UPDATE profiles SET value_str='screens.php', type=3 WHERE userid=248 AND idx='web.paging.lastpage'
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1113 page no 244 n bits 256 index PRIMARY of table `zabbix`.`profiles` trx id 98361978762 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 98361978729, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
561 lock struct(s), heap size 73936, 87323 row lock(s), undo log entries 6
MySQL thread id 94486037, OS thread handle 140240127985408, query id 7749298324 hostname 10.32.4.4 zabbix updating
DELETE FROM profiles WHERE idx='web.favorite.graphids' AND source='itemid' AND value_id='6923163'
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1113 page no 244 n bits 256 index PRIMARY of table `zabbix`.`profiles` trx id 98361978729 lock_mode X
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1113 page no 1294 n bits 168 index PRIMARY of table `zabbix`.`profiles` trx id 98361978729 lock_mode X waiting
*** WE ROLL BACK TRANSACTION (1)

vso thanks; unfortunately this is some other deadlock, related to the frontend, not the one that could have caused the problems with triggers.

Paskal They haven't happened to me for a long time; I'll return the database to the bad state after building the server with your patch and enabling deadlock logging for MySQL.

Paskal To make sure we're on the same wavelength: I'm patching my zabbix-server, enabling innodb_print_all_deadlocks, and waiting for trouble to come. When it comes, I give you the data on the problem trigger from Zabbix and the MySQL log. Is that correct?

vso Do you experience the issue after changing your mysqldump query? If yes, then please provide it so we can find the reason for the deadlock; but it's already clear that Zabbix does not handle deadlocks correctly, and that should be fixed.

Paskal I am not experiencing the deadlocks that trigger this problem after changing the mysqldump query, and I'd rather leave my system that way. If you know the problem and know how to fix it, I'm happy and don't need anything else.
I thought you wanted me to return the system to the problem state to discover something more about the root cause.

Paskal Vladislav, do you need any additional information from me? Please state explicitly whether I need to try to reproduce the bug and give you logs, or whether that is not necessary and the ball is on your side.

vso Thank you, no additional information is needed. Changing mysqldump not to lock tables should have resolved the deadlocks for you. We will see what can be done to handle deadlocks in the Zabbix server.

Comment by Kevin Daudt [ 2017 Jun 28 ]

We are facing similar issues (seeing "query failed" messages in our zabbix_server.log during backups), but we do not have a solution yet (we are using xtrabackup, not mysqldump).

vso Do you see a deadlock message?

kdaudt Yes:

2106:20170625:230340.113 [Z3005] query failed: [1213] Deadlock found when trying to get lock; try restarting transaction

Please run the SHOW ENGINE INNODB STATUS command next time so the exact cause of the deadlock can be known.

kdaudt

This is the output:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2017-06-13 01:00:02 7fbf570c3700
*** (1) TRANSACTION:
TRANSACTION 66811503829, ACTIVE 0 sec starting index read
mysql tables in use 362, locked 362
LOCK WAIT 780 lock struct(s), heap size 79400, 418 row lock(s)
MySQL thread id 100730068, OS thread handle 0x7fc043e38700, query id 21017990923 nlmgmt016 10.60.17.233 zabbix updating
delete from trends_uint where itemid=200200000570970 and clock<1497314251
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6951 page no 8 n bits 328 index `PRIMARY` of table `zabbix`.`trends_uint` /* Partition `p201706070000` */ trx id 66811503829 lock_mode X waiting
Record lock, heap no 246 PHYSICAL RECORD: n_fields 8; compact format; info bits 0
 0: len 8; hex 0000b614b1eb7b50; asc       {P;;
 1: len 4; hex d93f2b00; asc  ?+ ;;
 2: len 6; hex 000f8e46a3f3; asc    F  ;;
 3: len 7; hex a9000040233c87; asc    @#< ;;
 4: len 4; hex 8000000c; asc     ;;
 5: len 8; hex 0000000000000000; asc         ;;
 6: len 8; hex 0000000000000000; asc         ;;
 7: len 8; hex 0000000000000001; asc         ;;

Note that the date says 2017-06-13, while the latest deadlock entry in the server log is from 2017-06-28.

Comment by Rostislav Palivoda [ 2017 Jul 24 ]

Discussed with vso - due to the high risk of the change, it will be merged to trunk in release 4.0. Fix version changed to 4.0 (plan).

Comment by Vladislavs Sokurenko [ 2017 Aug 09 ]

Fixed in development branch:
svn://svn.zabbix.com/branches/dev/ZBX-12251

Comment by Andris Zeila [ 2017 Aug 30 ]

Successfully tested.

Please review & test r71871 - history table updates were moved to a separate transaction because of ZBXNEXT-4002. Also refactored the DCmass_update_history() (DCmass_prepare_history_items) function.

vso reviewed/tested.

Comment by Elvar [ 2017 Oct 23 ]

I am not certain but I think I may be experiencing this same issue in 3.4.1, 3.4.2, and 3.4.3. I am randomly seeing situations with items not recovering despite their recovery conditions being met. Can anyone tell me how I can verify whether or not it is this bug? Or maybe it is related to https://support.zabbix.com/browse/ZBX-12923 which I just opened today. Suggestions?

vso An out-of-bounds error makes the whole transaction fail, so the trigger is updated in the cache only, not in the database; fixing ZBX-12923 should be enough in your case, though.

wiper I was thinking about it - we probably should not update the cache if the transaction fails with an unrecoverable error.

Comment by Andris Zeila [ 2017 Nov 23 ]

Successfully tested

Comment by Vladislavs Sokurenko [ 2017 Nov 23 ]

Fixed in

  • pre-4.0.0alpha1 (trunk) r74914
Comment by Andrey Melnikov [ 2017 Dec 15 ]

r74914 has an off-by-one error which allows the server to store values for items with history=0 in the database.

Index: src/libs/zbxdbcache/dbcache.c
===================================================================
--- src/libs/zbxdbcache/dbcache.c	(revision 75695)
+++ src/libs/zbxdbcache/dbcache.c	(working copy)
@@ -2007,7 +2007,7 @@
 			continue;
 		}
 
-		if (0 == items->history)
+		if (0 == item->history)
 			h->flags |= ZBX_DC_FLAG_NOHISTORY;
 
 		if ((ITEM_VALUE_TYPE_FLOAT != item->value_type && ITEM_VALUE_TYPE_UINT64 != item->value_type) ||

vso Yes, thanks a lot for reporting this.

Comment by Naruto [ 2017 Dec 16 ]

Any plans for a 3.4 fix? It's affected too, I experienced this behavior after a brief database unavailability.

vso I am sorry but it's not planned for 3.4

Comment by Vladislavs Sokurenko [ 2017 Dec 18 ]

Fixed in:

  • pre-4.0.0alpha1 (trunk) r76065

Fixed the possibility of a trigger status getting stuck in the PROBLEM or OK state due to a transaction error during history synchronization; added deadlock to the list of recoverable errors.
Fixed the possibility that new history values are lost and not processed by Zabbix server or Zabbix proxy due to a recoverable transaction error.
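As an illustration only, with hypothetical helper names rather than the actual Zabbix code, treating a deadlock as a recoverable error essentially means rolling the transaction back and retrying the same batch, instead of ignoring the remaining queries and leaving the in-memory cache ahead of the database:

#include <stdio.h>

/* Sketch of the retry-on-recoverable-error idea; db_begin/db_commit/db_rollback and
 * write_history_and_update_triggers are stand-ins, not real Zabbix functions. */
#define DB_OK          0
#define DB_RECOVERABLE 1	/* e.g. a deadlock, MySQL error 1213 */

static int	attempts = 0;

static void	db_begin(void) { printf("begin transaction\n"); }
static void	db_rollback(void) { printf("rollback\n"); }
static void	write_history_and_update_triggers(void) { printf("write values, recalculate triggers\n"); }

static int	db_commit(void)
{
	/* simulate one deadlock followed by a successful commit */
	return (0 == attempts++) ? DB_RECOVERABLE : DB_OK;
}

int	main(void)
{
	int	ret;

	do
	{
		db_begin();
		write_history_and_update_triggers();

		if (DB_RECOVERABLE == (ret = db_commit()))
			db_rollback();	/* discard the failed attempt, retry the same batch */
	}
	while (DB_RECOVERABLE == ret);

	return DB_OK == ret ? 0 : 1;
}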

Comment by Vladislavs Sokurenko [ 2017 Dec 19 ]

Attached a patch for 3.2.6 which is a subset of the fixes for 4.0:
fixed the possibility that a trigger gets stuck in the PROBLEM or OK state due to transaction failures during history synchronization.

Comment by Dmitry Verkhoturov [ 2017 Dec 19 ]

Paskal Will the fix be included in the 3.2 branch? Otherwise, I could publish it in https://github.com/zabbix/zabbix-patches for broader visibility and easier application.

vso I am sorry but 3.2 is no longer supported, feel free to publish.

Paskal Thank you very much! The patch has been included and updated for the latest available version (3.2.10).

Paskal Vladislavs, is there any chance you could give us the same patch for v3.4?

vso Unfortunately I don't have the patch; I would have to make one if needed, but 4.0.0alpha1 is out, so perhaps you can upgrade to it.

Comment by Leandro [ 2018 Aug 21 ]

Hi everyone,

I'm facing a similar problem like the one reported here.

My installation is:
zabbix_server (Zabbix) 3.4.5
Revision 76340 27 December 2017, compilation time: Dec 27 2017 14:09:14
OS: CentOS 7, mysql 15.1 Distrib 5.5.56-MariaDB
 
Just looked for deadlocks in INNODB STATUS output. Attached is what I found:

------------------------
LATEST DETECTED DEADLOCK
------------------------
180802 18:18:10
*** (1) TRANSACTION:
TRANSACTION 1074DEB8, ACTIVE 31 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 2
MySQL thread id 14, OS thread handle 0x7fa52814e700, query id 131804 localhost zabbix update
insert into problem (eventid,source,object,objectid,clock,ns) values (17819200,0,0,13473,1533251858,881253939)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 407298 n bits 1096 index `c_problem_2` of table `zabbix`.`problem` trx id 1074DEB8 lock_mode X insert intention waiting
*** (2) TRANSACTION:
TRANSACTION 1074D1CF, ACTIVE 276 sec fetching rows
mysql tables in use 2, locked 2
775 lock struct(s), heap size 113080, 210408 row lock(s), undo log entries 24189
MySQL thread id 838, OS thread handle 0x7fa407110700, query id 124045 localhost zabbix Sending data
delete from events where clock<1533137125 and events.source=3 and events.object=0 and not exists (select null from problem where events.eventid=problem.eventid or events.eventid=problem.r_eventid)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 407298 n bits 1096 index `c_problem_2` of table `zabbix`.`problem` trx id 1074D1CF lock mode S
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 223560 n bits 312 index `PRIMARY` of table `zabbix`.`events` trx id 1074D1CF lock_mode X waiting
*** WE ROLL BACK TRANSACTION (1)

 

Could this deadlock be a cause of the alarm condition getting stuck in the trigger status, along with the errors I have in the Zabbix log?

 

  1688:20180819:053122.134 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [insert into problem (eventid,source,object,objectid,clock,ns) values (19331940,3,0,16474,1534674631,176287537),(19331941,3,0,16475,1534674631,176287537),(19331942,3,0,16476,1534674631,176287537),(19331943,3,0,16477,1534674631,176287537),(19331944,3,0,16478,1534674631,176287537),(19331945,3,0,16479,1534674631,176287537),(19331946,3,0,16480,1534674631,176287537),(19331947,3,0,16481,1534674631,176287537),(19331948,3,0,16482,1534674631,176287537),(19331949,3,0,16483,1534674631,176287537),(19331950,3,0,16484,1534674631,176287537),(19331951,3,0,16485,1534674631,176287537),(19331952,3,0,16486,1534674631,176287537),(19331953,3,0,16487,1534674631,176287537),(19331954,3,0,16488,1534674631,176287537);
]
  1688:20180819:053122.134 slow query: 50.756534 sec, "insert into problem (eventid,source,object,objectid,clock,ns) values (19331940,3,0,16474,1534674631,176287537),(19331941,3,0,16475,1534674631,176287537),(19331942,3,0,16476,1534674631,176287537),(19331943,3,0,16477,1534674631,176287537),(19331944,3,0,16478,1534674631,176287537),(19331945,3,0,16479,1534674631,176287537),(19331946,3,0,16480,1534674631,176287537),(19331947,3,0,16481,1534674631,176287537),(19331948,3,0,16482,1534674631,176287537),(19331949,3,0,16483,1534674631,176287537),(19331950,3,0,16484,1534674631,176287537),(19331951,3,0,16485,1534674631,176287537),(19331952,3,0,16486,1534674631,176287537),(19331953,3,0,16487,1534674631,176287537),(19331954,3,0,16488,1534674631,176287537);
"

 

Is there currently another way to prevent this issue from appearing, without doing a full reinstallation with pre-4.0.0alpha1?

I have configured many of the MySQL settings suggested to avoid slow queries. The slow queries are still there, though.

Thanks for your support,

L

Comment by Vladislavs Sokurenko [ 2018 Aug 21 ]

Hello, it should be solved by ZBX-14312; please upgrade and let me know the result.

Comment by Leandro [ 2018 Aug 21 ]

OK,

Is the patch applied to pre-3.4.10rc1 r80725 also ported forward to 3.4.12? Will it be in the latest stable 3.4 release?

Thank you

Comment by Vladislavs Sokurenko [ 2018 Aug 21 ]

Yes, it is also available in the later versions.
