[#ZBXNEXT-3193] Linkage between problem and ok events

[ZBXNEXT-3193] Linkage between problem and ok events Created: 2016 Mar 16 Updated: 2018 Aug 13 Resolved: 2016 Dec 15
Status:	Closed
Project:	ZABBIX FEATURE REQUESTS
Component/s:	API (A), Frontend (F), Installation (I), Server (S)
Affects Version/s:	None
Fix Version/s:	3.2.0alpha1

Type:	New Feature Request
Priority:	Major
Reporter:	Alexander Vladishev
Assignee:	Unassigned
Resolution:	Fixed
Votes:	2
Labels:	events, multiple, triggers
Remaining Estimate:	Not Specified
Time Spent:	Not Specified
Original Estimate:	Not Specified

Issue Links:

Causes
causes	~~ZBX-14720~~	Missing index causing full table scan...	Closed

Description

This is needed to display events faster and to find the correlation between PROBLEM and OK events.

Comments

Comment by richlv [ 2016 Mar 16 ]

Any more detail on what the feature request is about?

Comment by Andris Zeila [ 2016 May 12 ]

(1) [S] Database patch ready for testing in development branch svn://svn.zabbix.com/branches/dev/ZBXNEXT-3193
CLOSED

Comment by Andris Zeila [ 2016 May 12 ]

Server side ready for testing.

Comment by Ivo Kurzemnieks [ 2016 May 12 ]

(2) No translation string changes.

sasha CLOSED

Comment by Ivo Kurzemnieks [ 2016 May 12 ]

Apparently there is nothing to do for event API, since there is only event.get and the field is already read-only.

Comment by Andris Zeila [ 2016 May 20 ]

Updated the server to conform to the specification changes.

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

(3) [D] Tables 'problem' and 'event_recovery' have 'eventid' from 'events' as a foreign key. However, the 'events' table can be partitioned (https://www.zabbix.org/wiki/Docs/howto/mysql_partitioning). At least MySQL does not support foreign keys on partitioned tables (http://dev.mysql.com/doc/refman/5.7/en/partitioning-limitations.html).

This is a showstopper for upgrading to 3.2 for those that have the 'events' table partitioned.

wiper 'events' table partitioning is not recommended anymore and we will stop supporting it. It should be clearly stated in upgrade notes and we should also update relevant documentation.

sandis.neilands Should we also provide a guide on how to unify partitioned 'events' table?

gunarspujats Discussed with sasha, guide won't be provided. CLOSED

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

(4) design Why 10000 events per batch during DB upgrade? How was this number determined? Considering that we'll hold the IDs of all events in memory during the upgrade - what is the maximum number of events that we have seen?

wiper Just an arbitrary number, not too small and not too large.
The following data is stored in memory during patch processing:

all event sources (triggers/items) - this is less than what the configuration cache has to keep in shared memory
active (not recovered) event identifiers - in theory this can be large if there are no OK events for triggers with multiple problem generation

wiper One solution would be to use a temporary database table to store open events. However, that would slow things down. For now we will just log a warning when an event source reaches 10m open events.
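The batching and in-memory bookkeeping described above can be sketched roughly as follows. This is not the actual upgrade patch code: `fetch_after` is a hypothetical database accessor, the `(eventid, source, value)` row layout and the PROBLEM/OK constants are assumptions made for illustration, and only the batch size and the 10m warning threshold come from the discussion.

```python
import logging

BATCH_SIZE = 10000                # batch size quoted in the discussion
OPEN_EVENTS_WARNING = 10_000_000  # "10m open events" threshold quoted in the discussion
PROBLEM, OK = 1, 0                # assumed event values for this sketch


def iter_batches(fetch_after, batch_size=BATCH_SIZE):
    """Yield batches of rows ordered by eventid.

    fetch_after(last_id, limit) is a hypothetical accessor that returns up to
    `limit` rows with eventid > last_id, ordered by eventid.
    """
    last_id = 0
    while True:
        rows = fetch_after(last_id, batch_size)
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


def collect_open_events(fetch_after, batch_size=BATCH_SIZE,
                        warn_at=OPEN_EVENTS_WARNING):
    """Scan all events and return {source: [open PROBLEM eventids]}."""
    open_events = {}
    for rows in iter_batches(fetch_after, batch_size):
        for eventid, source, value in rows:
            if value == PROBLEM:
                ids = open_events.setdefault(source, [])
                ids.append(eventid)
                if len(ids) == warn_at:
                    logging.warning("source %s reached %d open events",
                                    source, warn_at)
            else:
                # An OK event recovers every open problem of this source.
                open_events.pop(source, None)
    return open_events


def make_list_fetcher(events):
    """Build a fetch_after() over an in-memory list, for demonstration only."""
    def fetch_after(last_id, limit):
        return [e for e in events if e[0] > last_id][:limit]
    return fetch_after
```

Only the open (not yet recovered) identifiers stay in memory between batches, which is why a source that never receives an OK event is the worst case for memory use.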

sandis.neilands Another possible solution - add a log message stating the percentage of events processed after each 5%. This would make it possible to forecast how much time is still needed for the upgrade.

wiper Would be nice, but there are upgrade steps for which we can't show progress - for example changes to the history table structure. So the user will not be able to forecast the total time needed for the upgrade.
RESOLVED in r60339.
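As an illustration of the 5% progress logging suggested above, a minimal sketch could look like this. The function name, its parameters and the message text are hypothetical and are not taken from the Zabbix sources.

```python
def report_progress(processed, total, last_reported, step=5):
    """Print a message when another `step` percent has been completed.

    Returns the last reported percentage so the caller can pass it back in
    after the next batch.
    """
    percent = processed * 100 // total if total else 100
    reached = percent - percent % step  # round down to a multiple of step
    if reached > last_reported:
        print(f"completed {reached}% of event processing")
        return reached
    return last_reported
```

A caller would invoke it once per batch, for example `last = report_progress(done, total, last)`, so that at most one message is printed per batch.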

sandis.neilands CLOSED with a comment. After the upgrade, upon receiving an OK event, the server will be stuck while it processes all those PROBLEM events.

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

(5) design API: why not expose problems via API? The workflow of automating common tasks fails if we do not have 1-to-1 correspondence between API and UI.

wiper This will be done later in ~~ZBXNEXT-3201~~
CLOSED

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

(6) [S] trigger_events_hash_func() - hash overwritten, hash of source computed twice.

wiper The hash is used as the seed for the next hash step calculation. Fixed the duplicate addition of source to the hash.
RESOLVED in r60337

sandis.neilands CLOSED.
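To illustrate why overwriting the hash variable is not a bug when each result seeds the next step, here is a minimal sketch of seeded, chained hashing. It uses CRC-32 from Python's zlib as a stand-in for whatever hash the server uses, and the three fields (source, object, objectid) are an assumption about what gets hashed.

```python
import zlib


def chained_hash(source, obj, objectid):
    """Hash three integer fields, passing each result in as the next seed."""
    h = zlib.crc32(source.to_bytes(8, "little"))
    h = zlib.crc32(obj.to_bytes(8, "little"), h)       # seeded with previous hash
    h = zlib.crc32(objectid.to_bytes(8, "little"), h)  # seeded again
    return h
```

Each assignment replaces `h`, but the previous value still contributes through the seed, so every field affects the final result. Adding the same field twice, as in the duplicate that was fixed, would only cost an extra step.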

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

(7) [S] assign_recovery_events() or update_event_recovery() (comment vs. reality)?

wiper RESOLVED in r60338.

sandis.neilands CLOSED.

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

(8) design If an item and/or trigger is removed while a problem is still unresolved, the problem is never removed from the 'problem' table. Is this intended?

wiper It will be removed together with the events by the housekeeper.
CLOSED

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

(9) design Suppose a trigger is changed to generate a single problem event, a PROBLEM event arrives, and then an OK event arrives. All the open PROBLEM events for the trigger are closed. Is this intended?

wiper Yes, currently an OK event closes all open PROBLEM events.
CLOSED
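The behaviour confirmed here, one OK event closing every open PROBLEM event of the same trigger, can be sketched as below. The tuple layout, the constants and the function name are assumptions for illustration, not the server's data structures.

```python
PROBLEM, OK = 1, 0  # assumed event values for this sketch


def link_recoveries(events):
    """Pair each PROBLEM event with the OK event that recovers it.

    `events` is an iterable of (eventid, triggerid, value) ordered by eventid.
    Returns a list of (problem_eventid, ok_eventid) pairs.
    """
    open_problems = {}  # triggerid -> list of open PROBLEM eventids
    links = []
    for eventid, triggerid, value in events:
        if value == PROBLEM:
            open_problems.setdefault(triggerid, []).append(eventid)
        elif value == OK:
            # A single OK event closes all open problems of the trigger.
            for problem_id in open_problems.pop(triggerid, []):
                links.append((problem_id, eventid))
    return links
```

For example, two PROBLEM events (ids 1 and 2) followed by an OK event (id 4) on the same trigger yield the pairs (1, 4) and (2, 4), regardless of the trigger's current problem generation mode.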

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

(10) design When is 'event_recovery' table supposed to be cleaned? How will it be used (which ZBXNEXT)?

wiper 'event_recovery' table records will be automatically removed together with events (e.g. by the housekeeper).
CLOSED
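One way dependent records can disappear together with events, as described for both 'problem' and 'event_recovery', is a cascading foreign key. The sketch below demonstrates that with SQLite; the simplified schema, the `r_eventid` column and the use of ON DELETE CASCADE are assumptions for illustration, not the actual Zabbix schema definition.

```python
import sqlite3


def cascade_demo():
    """Delete events and return the remaining (problem, event_recovery) row counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE events (eventid INTEGER PRIMARY KEY);
        CREATE TABLE problem (
            eventid INTEGER PRIMARY KEY
                REFERENCES events(eventid) ON DELETE CASCADE);
        CREATE TABLE event_recovery (
            eventid INTEGER PRIMARY KEY
                REFERENCES events(eventid) ON DELETE CASCADE,
            r_eventid INTEGER NOT NULL
                REFERENCES events(eventid) ON DELETE CASCADE);
        INSERT INTO events VALUES (1), (2), (3);
        INSERT INTO problem VALUES (3);            -- event 3 is still open
        INSERT INTO event_recovery VALUES (1, 2);  -- event 1 recovered by event 2
    """)
    # Simulate the housekeeper removing old events.
    conn.execute("DELETE FROM events WHERE eventid IN (1, 3)")
    problems = conn.execute("SELECT COUNT(*) FROM problem").fetchone()[0]
    recoveries = conn.execute("SELECT COUNT(*) FROM event_recovery").fetchone()[0]
    conn.close()
    return problems, recoveries
```

With this kind of constraint no separate cleanup pass is needed for the dependent tables, which is also why the foreign keys conflict with a partitioned 'events' table on MySQL, as raised in (3).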

Comment by Marc [ 2016 May 26 ]

richlv,
I asked myself the same. Judging by the comment on (1), I assume it's about mapping things on database level to allow more efficient database queries... just a guess though.
A brief look at the specification would be helpful. But that's obviously supposed to be a surprising gift to us - we can get ready to be excited!

Comment by Alexei Vladishev [ 2016 May 26 ]

I assume it's about mapping things on database level to allow more efficient database queries...

Your assumption is absolutely correct. It's quite a technical feature request, which does not bring any end-user benefits if implemented alone.

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

(11) [S] Memory leak during upgrade in update_event_recovery().

==27269== 53,252,160 bytes in 1,633 blocks are possibly lost in loss record 1,065 of 1,065
==27269==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27269==    by 0x4E898B3: my_malloc (in /usr/lib/x86_64-linux-gnu/libmysqlclient.so.18.0.0)
==27269==    by 0x4E853CD: alloc_root (in /usr/lib/x86_64-linux-gnu/libmysqlclient.so.18.0.0)
==27269==    by 0x4E6A33A: cli_read_rows (in /usr/lib/x86_64-linux-gnu/libmysqlclient.so.18.0.0)
==27269==    by 0x4E6CB97: mysql_store_result (in /usr/lib/x86_64-linux-gnu/libmysqlclient.so.18.0.0)
==27269==    by 0x53F2E7: zbx_db_vselect (db.c:1392)
==27269==    by 0x53F5D4: __zbx_zbx_db_select (db.c:241)
==27269==    by 0x53F477: zbx_db_select_n (db.c:1624)
==27269==    by 0x5095FA: DBselectN (db.c:405)
==27269==    by 0x4FA4D8: update_event_recovery (dbupgrade_3010.c:276)
==27269==    by 0x4FA133: DBpatch_3010021 (dbupgrade_3010.c:350)
==27269==    by 0x4F1ECB: DBcheck_version (dbupgrade.c:784)
==27269== 
==27269== LEAK SUMMARY:
==27269==    definitely lost: 680 bytes in 85 blocks
==27269==    indirectly lost: 6,568,800 bytes in 426 blocks
==27269==      possibly lost: 76,010,400 bytes in 2,258 blocks
==27269==    still reachable: 814,257 bytes in 13,152 blocks
==27269==         suppressed: 0 bytes in 0 blocks
==27269== Reachable blocks (those to which a pointer was found) are not shown.
==27269== To see them, rerun with: --leak-check=full --show-leak-kinds=all

wiper RESOLVED in r60351.

sandis.neilands CLOSED.

Comment by Sandis Neilands (Inactive) [ 2016 May 26 ]

Successfully tested the functionality.

We should check the performance once those features that depend on this feature are implemented.

Comment by Andris Zeila [ 2016 May 27 ]

Released in:

pre-3.1.0 r60365
