[#ZBXNEXT-4271] Delay escalator by a huge escalations table with Recovery operations

[ZBXNEXT-4271] Delay escalator by a huge escalations table with Recovery operations Created: 2017 Dec 11 Updated: 2024 Apr 10 Resolved: 2018 Oct 10
Status:	Closed
Project:	ZABBIX FEATURE REQUESTS
Component/s:	Server (S)
Affects Version/s:	3.2.10, 3.4.4, 4.0 (plan)
Fix Version/s:	4.0.1rc1, 4.2.0alpha1, 4.2 (plan)

Type:	Change Request
Priority:	Trivial
Reporter:	Kim Jongkwon
Assignee:	Vladislavs Sokurenko
Resolution:	Fixed
Votes:	
Labels:	None
Remaining Estimate:	Not Specified
Time Spent:	Not Specified
Original Estimate:	Not Specified
Attachments:	4.0.1rc1_test_result.png, million escalations new.png, million escalations old.png
Issue Links:	Causes, Duplicate
Team:	Team A
Sprint:	Sprint 42, Sprint 43, Sprint 44
Story Points:	

Description

from ~~ZBX-13137~~

In some cases, escalations stay stored in the escalations table because no OK event is ever generated for them.

Examples:

zabbix=> select count(*) from escalations where status=2;
  count  
---------
 1180124

The real problem in this situation is that the escalator is 100% busy.
The escalator's busy rate keeps increasing, and the issue is hard to notice until problems actually occur.

Solution 1 (no need for development)
Deleting the actions that have a "recovery operation", or manually closing all events, solves this problem. (The escalations table data is deleted as well.)

I also think it is better to remove such data from escalations when, according to the trigger settings, "RECOVERY" never needs to be handled. The "escalator busy" problem needs improvement, so the additional improvements that require development are listed below.

Solution 2
Data with "OK event generation : None" and "Don't allow manual close" -> should be removed.

Solution 3
Data with "OK event generation : None" and "Allow manual close" -> if possible, we need a solution that does not increase the escalator busy rate. (The best solution is not to store this data in escalations at all.)

Solution 4 (New Features)
An additional feature that closes events automatically (removing the old escalation data).

Comments

Comment by Vladislavs Sokurenko [ 2018 Oct 03 ]

Fixed in :

pre-4.0.1rc1 r85403
pre-4.2.0alpha1 (trunk) r85404

Comment by Kim Jongkwon [ 2018 Oct 24 ]

Just FYI, to clarify:
The "Solutions" written above are ideas for removing the escalation data, but the actual fix, for now, improves escalator performance.

improved escalator performance by using nextcheck index instead of reading whole table (vso)
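
As a rough illustration of the idea in the changelog line above (not the actual Zabbix server code), the sketch below uses Python's sqlite3 with a toy table. The column subset, the index name `escalations_nextcheck`, the status values, and the assumption that escalations which are not yet due carry a future `nextcheck` are all made up for this example. It only shows that a range query on an indexed `nextcheck` column lets the database skip rows that are not due, whereas selecting every row scans the whole table.

```python
import sqlite3

# Hypothetical query: fetch only escalations whose nextcheck has passed.
DUE_SQL = "SELECT escalationid FROM escalations WHERE nextcheck <= ? ORDER BY nextcheck"


def build_demo_db(sleeping, due, now=1_000):
    """Create an in-memory table loosely modelled on `escalations`.

    Only three columns are kept; names and values are assumptions for the sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE escalations ("
        " escalationid INTEGER PRIMARY KEY,"
        " nextcheck INTEGER NOT NULL,"
        " status INTEGER NOT NULL)"
    )
    # The index name is invented for this example.
    conn.execute("CREATE INDEX escalations_nextcheck ON escalations (nextcheck)")
    rows = [(now + 3600 + i, 2) for i in range(sleeping)]  # not due yet (placeholder status)
    rows += [(now - i, 0) for i in range(due)]             # already due
    conn.executemany("INSERT INTO escalations (nextcheck, status) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def due_escalations(conn, now):
    """Return the ids of escalations that are due at time `now`."""
    return [row[0] for row in conn.execute(DUE_SQL, (now,))]


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan text (last column of EXPLAIN QUERY PLAN)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " | ".join(str(row[-1]) for row in rows)


if __name__ == "__main__":
    conn = build_demo_db(sleeping=100_000, due=10)
    print(len(due_escalations(conn, 1_000)))
    print(query_plan(conn, DUE_SQL, (1_000,)))             # index search on nextcheck
    print(query_plan(conn, "SELECT * FROM escalations"))   # full table scan
```

The exact plan wording differs between SQLite versions, and other database engines may choose differently, so this is best treated as a toy model of the approach rather than a benchmark.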

I've checked this fix with Zabbix 4.0.1rc1 (tested with 360,000 escalation records).

I think the results are pretty good. Thanks for the fix.

vso: Thanks for the feedback. Was it test data or a real Zabbix server? Under which conditions would you like escalation data to be removed?

JKKim: That was test data. Please double check: I noticed that the behavior differs from 3.2. In 4.0.1rc1, escalation data does not disappear even when an option is changed (for example Solution 1, deleting the actions that have a "recovery operation").
If this is true, there is no simple way to erase the huge amount of stored data.

Going back to ~~ZBX-13137~~: users will assume that this type of data is not created with the trigger options below.

OK event generation : None
Allow manual close : No (unchecked)

In my opinion, when these options are selected, that is the best condition under which to remove the data.

vso: Does decreasing the delay help?

JKKim: Performance improved so much in 4.0.1 that the delay is resolved by this performance fix. I understand that this is the best approach for now, so we can CLOSE.

With the current design, if the escalation data were removed:

It would be impossible to recover events if the manual close setting is changed later.
If the escalator process itself removes the data, delays may increase because of the removal or the extra checks (maybe).

Unfortunately, "unused data" still remains in the escalations table. I think the conditions and the processing under which that data can be removed need to be discussed. (It could be another ZBXNEXT in the future.)

Generated at Thu Apr 10 20:11:12 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.
