[ZBXNEXT-4900] Improve preprocessing performance Created: 2018 Dec 04  Updated: 2020 Nov 09

Status: Open
Project: ZABBIX FEATURE REQUESTS
Component/s: Server (S)
Affects Version/s: 4.2.0alpha1
Fix Version/s: None

Type: Change Request Priority: Major
Reporter: Andris Zeila Assignee: Zabbix Development Team
Resolution: Unresolved Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by ZBXNEXT-4914 Multiple preprocessing managers Closed

 Description   

With new features (LLD preprocessing, custom scripts) preprocessing might become a bottleneck. Currently it can push ~160k small values/sec on an i7-7700 CPU. As preprocessed values become larger (especially in the LLD case) and the steps more complex (scripts), the performance drops and might not be enough.

There are two options. One is to rework preprocessing to use worker threads instead of processes; the data exchange load would be significantly reduced, especially with large data. The other is a brute force approach: use multiple preprocessing managers, each with its own set of worker processes, and split items between managers by itemid.
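
For illustration, a minimal sketch of the brute force split (the manager count constant and the function name are hypothetical, not existing Zabbix code); a modulo split keeps all values of one item on the same manager, so per-item ordering is preserved:

#include <stdint.h>

/* hypothetical manager count; in practice this would come from configuration */
#define PREPROC_MANAGER_COUNT	4

/* pick a preprocessing manager for an item; all values of one item always
 * land on the same manager, preserving per-item ordering */
static int	preproc_manager_for_item(uint64_t itemid)
{
	return (int)(itemid % PREPROC_MANAGER_COUNT);
}
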
Some test results from converting processes to threads (a 'trim' preprocessing step was applied to all values):

Value size (bytes)   Values/sec (trunk)   Values/sec (threaded workers using sockets)   Values/sec (threaded workers using queues)
4                    167k                 173k                                          590k
128                  158k                 170k                                          530k
1024                 136k                 148k                                          362k
2048                 124k                 141k                                          268k
4096                 84k                  127k                                          183k
8192                 68k                  99k                                           115k

Threaded workers using sockets

The worker processes were replaced with threads. The old communication protocol (sockets) was kept, but instead of sending the data itself, only references to the data objects were sent. It could be optimized further, but it still gives a rough estimate.
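
A minimal sketch of the idea, assuming a Linux socketpair between manager and worker threads in one process (this is not the actual Zabbix IPC code): only the pointer crosses the socket, the value itself stays in shared process memory.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int	sv[2];	/* sv[0] - manager end, sv[1] - worker end */

static void	*worker_thread(void *arg)
{
	char	*value;

	/* receive only the pointer-sized reference, not the serialized value */
	if (sizeof(value) == (size_t)read(sv[1], &value, sizeof(value)))
		printf("worker preprocessing value: %s\n", value);

	return NULL;
}

int	main(void)
{
	pthread_t	worker;
	char		*value = strdup("{\"data\":[{\"{#IF}\":\"eth0\"}]}");	/* shared heap object */

	socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
	pthread_create(&worker, NULL, worker_thread, NULL);

	write(sv[0], &value, sizeof(value));	/* manager sends the reference only */

	pthread_join(worker, NULL);
	free(value);
	close(sv[0]);
	close(sv[1]);

	return 0;
}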

Threaded workers using queues

In this test the manager-worker communication was changed to simple mutex-protected queues.
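
A minimal sketch of such a queue, assuming plain pthreads (the struct and function names are illustrative, not the code used in the test); the mutex and condition variable are expected to be initialized with pthread_mutex_init()/pthread_cond_init() before use:

#include <pthread.h>
#include <stdlib.h>

typedef struct queue_item
{
	struct queue_item	*next;
	void			*data;	/* reference to the value to preprocess */
}
queue_item_t;

typedef struct
{
	pthread_mutex_t	lock;
	pthread_cond_t	event;
	queue_item_t	*head;
	queue_item_t	*tail;
}
preproc_queue_t;

/* manager side: append a task and wake one waiting worker */
static void	queue_push(preproc_queue_t *queue, void *data)
{
	queue_item_t	*item = malloc(sizeof(queue_item_t));

	item->data = data;
	item->next = NULL;

	pthread_mutex_lock(&queue->lock);
	if (NULL == queue->tail)
		queue->head = item;
	else
		queue->tail->next = item;
	queue->tail = item;
	pthread_cond_signal(&queue->event);
	pthread_mutex_unlock(&queue->lock);
}

/* worker side: block until a task is available and take it */
static void	*queue_pop(preproc_queue_t *queue)
{
	queue_item_t	*item;
	void		*data;

	pthread_mutex_lock(&queue->lock);
	while (NULL == queue->head)
		pthread_cond_wait(&queue->event, &queue->lock);

	item = queue->head;
	if (NULL == (queue->head = item->next))
		queue->tail = NULL;
	pthread_mutex_unlock(&queue->lock);

	data = item->data;
	free(item);

	return data;
}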

 

Or, in the worst case, we can combine both options.

 Comments   
Comment by Glebs Ivanovskis [ 2018 Dec 06 ]

Network traffic compression was advertised to provide a five-fold improvement in required bandwidth with "no impact on CPU or memory usage". Why can't it be used for sending LLD data to preprocessing workers?

Complexity of preprocessing steps shouldn't be the issue since it only affects preprocessing workers and you can have as many of them as you like.

wiper: I would say there is no noticeable impact on CPU usage. But that's with large data. With smaller data packets it might have a negative effect. Currently we are limited by preprocessing manager performance, so we are looking at how to either reduce or share that load.

I did test pure preprocessing performance, without pushing data into the history cache. With a contested history cache the results would be even worse.

On the other hand, if the compression is done in the data gathering processes (is that what you were thinking?), then the preprocessing manager only has to cache/forward the compressed data, so that should help with performance when large data (LLD) is processed.
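
For illustration only, a sketch of what compressing in a gathering process could look like, assuming zlib (this is not the Zabbix protocol code); the manager would then just forward the returned buffer to a worker:

#include <stdlib.h>
#include <string.h>
#include <zlib.h>

/* returns a malloc'd buffer with the compressed value (size in *out_len),
 * or NULL on failure; the caller forwards the buffer as-is */
static unsigned char	*compress_value(const char *value, size_t *out_len)
{
	uLong		src_len = (uLong)strlen(value);
	uLongf		dest_len = compressBound(src_len);
	unsigned char	*dest = malloc(dest_len);

	if (NULL == dest || Z_OK != compress2(dest, &dest_len, (const Bytef *)value, src_len, Z_BEST_SPEED))
	{
		free(dest);
		return NULL;
	}

	*out_len = dest_len;

	return dest;
}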

cyclone: Honestly speaking, I wasn't thinking that far. But it sounds like a sensible approach to offload the preprocessing manager. And of course you are not obliged to compress everything; small data can go uncompressed. It is a matter of protocol design to allow both types of messages. Since the protocol isn't for public use, you are free to design it the way you like.

As far as I recall, the one and only preprocessing manager was needed to keep the guarantee that the data which came first will be processed by triggers and actions first, regardless of the preprocessing steps it has to go through.

wiper: yes, but that guarantee is lost in the history cache.

cyclone: Well, I meant for items in one group of interdependent triggers... Isn't this guarantee still true if we neglect the fact that internal item processing is prioritized?

Comment by Evren Yurtesen [ 2020 Nov 09 ]

Is there a reason why preprocessing can't be moved to the zabbix-agent side? That way the load can be distributed. Also, with options like threshold/discard, the final transmitted data amount can be considerably smaller.
