- Problem report
- Resolution: Duplicate
- Major
Environment:
Zabbix Server 5.0.10, CentOS Linux, 2 cores, 16 GB RAM
Dedicated DB: PostgreSQL 12, TimescaleDB 2, 1-month shards as per Zabbix's recommendation (see ZBX-16347), 8 cores, 125 GB RAM. The database is approximately 1.7 TB on disk.
Dedicated Front End server, Zabbix 5.0.10, nginx.
Server monitors minimal items (primarily itself and the database), all primary monitoring performed by a set of 9 proxies.
Steps to reproduce:
- Deploy Zabbix to a relatively large environment. Our Zabbix implementation has 1126208 items, with an NVPS (new values per second) of just over 2600. We see approximately 23142 "trends" data elements per hour and approximately 750000 "trends_uint" data elements per hour (with an overall average of approximately 45 history data points per item).
- Wait for trends to be flushed (half-past the hour is a good time)
- Shut down Zabbix. This will be slow because it will write out the current-hour trends.
- Verify that you have trend data associated with the partial hour
- select 'trends',count(itemid),sum(num),TO_TIMESTAMP(clock),clock
from trends
where clock >= extract(epoch from now() - INTERVAL '4 HOUR')::INTEGER
group by clock
UNION
select 'trends_uint',count(itemid),sum(num),TO_TIMESTAMP(clock),clock
from trends_uint
where clock >= extract(epoch from now() - INTERVAL '4 HOUR')::INTEGER
group by clock
order by 1,4 desc;
- Start Zabbix.
- Wait for the top of the next hour
- Watch your Zabbix implementation stop inserting history data.
- Re-run the query above and note how the sum value slowly increases as trends are updated.
Result:
Zabbix will use all history syncers to store trend data until the trend data is flushed.
For particularly large databases using the recommended TimescaleDB setting of one shard per month (see ZBX-16347), the select() query that determines whether an item is already in the database takes 6 seconds to return on each call. The individual update queries can also be slow (exceeding 60 seconds). The end result is that the sync can take many minutes (last night, I gave up at 35 minutes and forcibly killed the Zabbix server, judging that it is better to lose trends than more history). At that point the 6 PM trend write had not yet completed, and my estimate was that it was only one-third complete.
Expected:
Trend writing does not impact history writing.
Possible solutions (there are likely others, but these are possibilities):
- Use a dedicated trend writing daemon, rather than the history syncer, or limit trend writes to a subset of the syncer processes available.
- Use prepare/execute semantics so the SQL server only needs to prepare the query once (note: this would require you to stop sending multiple statements in a single query). Use UPSERT semantics to eliminate the select() query that checks for existing trend data.
- Make trend inserts smaller and use a queue model so that history data is not blocked by large select/insert pain during the trend transaction
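The UPSERT and smaller-batch ideas above can be sketched roughly as follows. This is only an illustration, not how Zabbix is implemented: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (both accept `INSERT ... ON CONFLICT ... DO UPDATE` with the `excluded` pseudo-table), the table mirrors the `trends` columns, and `flush_trends` and `batch_size` are hypothetical names introduced for the example.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE trends (
        itemid    INTEGER NOT NULL,
        clock     INTEGER NOT NULL,
        num       INTEGER NOT NULL,
        value_min REAL    NOT NULL,
        value_avg REAL    NOT NULL,
        value_max REAL    NOT NULL,
        PRIMARY KEY (itemid, clock)
    )
    """
)

# One statement per row: insert, or merge into the existing hourly row.
# No prior SELECT is needed to find out whether the row already exists.
UPSERT = """
INSERT INTO trends (itemid, clock, num, value_min, value_avg, value_max)
VALUES (?, ?, ?, ?, ?, ?)
ON CONFLICT (itemid, clock) DO UPDATE SET
    value_min = min(trends.value_min, excluded.value_min),
    value_max = max(trends.value_max, excluded.value_max),
    value_avg = (trends.value_avg * trends.num
                 + excluded.value_avg * excluded.num)
                / (trends.num + excluded.num),
    num       = trends.num + excluded.num
"""


def flush_trends(conn, rows, batch_size=1000):
    """Hypothetical helper: write trend rows in small transactions."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        with conn:  # commits after each batch, so locks are held briefly
            conn.executemany(UPSERT, batch)


# Partial-hour trends written at shutdown...
flush_trends(conn, [(1, 3600, 10, 1.0, 2.0, 3.0)])
# ...then the rest of the same hour flushed after restart.
flush_trends(conn, [(1, 3600, 30, 0.5, 4.0, 9.0)])

merged = conn.execute(
    "SELECT num, value_min, value_avg, value_max FROM trends "
    "WHERE itemid = 1 AND clock = 3600"
).fetchone()
print(merged)
```

The second flush merges into the partial-hour row instead of requiring a lookup followed by a separate update, and committing per batch keeps each transaction short. Whether this would actually relieve the history syncers on a 1.7 TB TimescaleDB installation is untested here.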
- duplicates ZBX-23064 Problem with restart server - syncing trend data (Closed)