Type: Problem report
Resolution: Duplicate
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None

Environment:

Zabbix Server 5.0.10, CentOS Linux, 2 cores, 16 GB RAM

Dedicated DB: PostgreSQL 12, TimescaleDB 2, 1-month shards as recommended by Zabbix (see ~~ZBX-16347~~), 8 cores, 125 GB RAM. The database is approximately 1.7 TB on disk.

Dedicated frontend server: Zabbix 5.0.10, nginx.

Server monitors minimal items (primarily itself and the database), all primary monitoring performed by a set of 9 proxies.

Steps to reproduce:

Deploy Zabbix to a relatively large environment. Our Zabbix implementation has 1126208 items, with an NVPS (new values per second) of just over 2600. We see approximately 23142 "trends" data elements per hour and approximately 750000 "trends_uint" data elements per hour (with an overall average of approximately 45 history data points per item).
Wait for trends to be flushed (half past the hour is a good time).
Shut down Zabbix. This will be slow because it writes out the current-hour trends.
Verify that you have trend data associated with the partial hour:
    select 'trends', count(itemid), sum(num), TO_TIMESTAMP(clock), clock
    from trends
    where clock >= extract(epoch from now() - INTERVAL '4 HOUR')::INTEGER
    group by clock
    UNION
    select 'trends_uint', count(itemid), sum(num), TO_TIMESTAMP(clock), clock
    from trends_uint
    where clock >= extract(epoch from now() - INTERVAL '4 HOUR')::INTEGER
    group by clock
    order by 1, 4 desc;
Start Zabbix.
Wait for the top of the next hour.
Watch your Zabbix implementation stop inserting history data.
Re-run the query above and note how the sum value slowly increases as trends are updated.
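To make the last step less manual, the query output can be saved twice, a few minutes apart, and the two results compared. The sketch below is not part of Zabbix: `growing_buckets` and the snapshot format (a mapping from `(table, clock)` to `sum(num)`) are hypothetical names chosen for illustration, and the sample numbers are made up.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot format: (table name, clock) -> sum(num) from the query above.
Snapshot = Dict[Tuple[str, int], int]


def growing_buckets(before: Snapshot, after: Snapshot) -> List[Tuple[str, int, int]]:
    """Return (table, clock, delta) for hour buckets whose sum(num) increased."""
    changed = []
    for key, new_sum in sorted(after.items()):
        old_sum = before.get(key, 0)  # a bucket absent from the first run counts as 0
        if new_sum > old_sum:
            changed.append((key[0], key[1], new_sum - old_sum))
    return changed


if __name__ == "__main__":
    # Made-up numbers, purely to show the shape of the output.
    first = {("trends", 1617382800): 1000, ("trends_uint", 1617382800): 40000}
    second = {("trends", 1617382800): 1000, ("trends_uint", 1617382800): 41500}
    for table, clock, delta in growing_buckets(first, second):
        print(f"{table} clock={clock} grew by {delta}")
```

A bucket that keeps growing between runs is one whose trend rows are still being merged after the restart.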

Result:

Zabbix will use all history syncers to store trend data until the trend data is flushed.

For particularly large databases using the recommended TimescaleDB setting of one shard per month (see ~~ZBX-16347~~), the select() query that determines whether an item is already in the database takes 6 seconds to return on each call. The individual update queries can also be slow (exceeding 60 seconds). The end result is that the sync can take far longer than a few minutes: last night, I gave up at 35 minutes and forcibly killed the Zabbix server, judging that it is better to lose trends than more history. At that point the 6 PM trend write had not yet completed, and my estimate was that it was only one-third complete.
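To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes the sync progresses linearly, rounds the NVPS down to 2600, and treats the one-third figure as exact even though it is an estimate. The function names are illustrative only.

```python
# Figures taken from the report; NVPS is rounded down from "just over 2600".
NVPS = 2600
MINUTES_WAITED = 35
DONE_PARTS, TOTAL_PARTS = 1, 3  # "one-third complete" is an estimate, not a measurement


def projected_total_minutes(minutes_elapsed: float, done_parts: int, total_parts: int) -> float:
    """Linear extrapolation of the total sync time from partial progress."""
    return minutes_elapsed * total_parts / done_parts


def values_delayed(nvps: int, minutes: float) -> int:
    """History values that arrive while the syncers are busy with trends."""
    return round(nvps * minutes * 60)


if __name__ == "__main__":
    total = projected_total_minutes(MINUTES_WAITED, DONE_PARTS, TOTAL_PARTS)
    print(f"projected trend sync time: {total:.0f} minutes")
    print(f"history values delayed after {MINUTES_WAITED} min: {values_delayed(NVPS, MINUTES_WAITED)}")
    print(f"history values delayed after {total:.0f} min: {values_delayed(NVPS, total)}")
```

Under these assumptions the full trend write would have taken roughly 105 minutes, with about 5.46 million history values already waiting at the 35-minute mark.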

Expected:
Trend writing does not impact history writing.

Possible solutions (there are likely others, but these are possibilities):

Use a dedicated trend-writing daemon rather than the history syncers, or limit trend writes to a subset of the available syncer processes.
Use prepared statements (prepare/exec) so the SQL server only needs to prepare the query once. (Note: this would require you to stop sending multiple statements in a single query.)
Use an UPSERT to eliminate the select() query that checks for existing trend data.
Make trend inserts smaller and use a queue model, so that history data is not blocked by large select/insert operations during the trend transaction.
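The UPSERT idea can be sketched as follows. This is an illustration, not Zabbix code. It runs on SQLite (3.24 or newer, via Python's `sqlite3` module) rather than PostgreSQL so that it is self-contained, and the merge rule for `value_avg` (a weighted average by `num`) is an assumption about how a partial-hour row should be combined with the rest of the hour. The table layout follows the `trends` columns used in the query above; `flush_trends` and `demo` are hypothetical names.

```python
import sqlite3

# Column layout mirrors the Zabbix trends table; SQLite stands in for PostgreSQL here.
SCHEMA = """
CREATE TABLE trends (
    itemid    INTEGER NOT NULL,
    clock     INTEGER NOT NULL,
    num       INTEGER NOT NULL,
    value_min REAL    NOT NULL,
    value_avg REAL    NOT NULL,
    value_max REAL    NOT NULL,
    PRIMARY KEY (itemid, clock)
)
"""

# One statement either inserts a new hour bucket or merges into the existing one,
# so no preliminary SELECT is needed. Unqualified columns on the right-hand side
# refer to the row already stored; "excluded" is the row being inserted.
UPSERT = """
INSERT INTO trends (itemid, clock, num, value_min, value_avg, value_max)
VALUES (?, ?, ?, ?, ?, ?)
ON CONFLICT (itemid, clock) DO UPDATE SET
    value_min = min(value_min, excluded.value_min),
    value_avg = (value_avg * num + excluded.value_avg * excluded.num)
                / (num + excluded.num),
    value_max = max(value_max, excluded.value_max),
    num       = num + excluded.num
"""


def flush_trends(conn, rows):
    """Write a batch of (itemid, clock, num, min, avg, max) tuples in one transaction."""
    with conn:
        conn.executemany(UPSERT, rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    flush_trends(conn, [(1, 3600, 2, 1.0, 2.0, 3.0)])  # partial hour written at shutdown
    flush_trends(conn, [(1, 3600, 2, 0.0, 4.0, 8.0)])  # rest of the hour after restart
    return conn.execute(
        "SELECT num, value_min, value_avg, value_max FROM trends"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

On PostgreSQL the same statement shape should apply, with `LEAST`/`GREATEST` in place of the two-argument `min`/`max` and the stored row referenced through the table name. Keeping the batches passed to `flush_trends` small would also keep each transaction short, in the spirit of the last suggestion above.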

duplicates

ZBX-23064 Problem with restart server - syncing trend data

Closed

part of

ZBX-25651 Add an opportunity to disable trend synchronization when Zabbix Server is shutdown

Closed

Assignee: Vladislavs Sokurenko

Reporter: Aaron Whiteman

Votes: 20

Watchers: 25

Created: 2021 Apr 02 18:44

Updated: 2025 Mar 04 09:35

Resolved: 2024 Feb 01 14:06
