I currently have around 171k triggers in my Zabbix environment. Whenever I restart the Zabbix server, I see many queries being run to pull data out of the history tables. Queries like this one:
select clock,ns,value from history_uint where itemid=45409 and clock<=1394207920 and clock>1394204320 order by clock desc,ns desc
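As a sanity check on the time range in that query, here is a minimal Python sketch that decodes the two clock bounds. The helper names (window_seconds, to_utc) are made up for illustration and are not part of Zabbix; the only real inputs are the two timestamps copied from the query.

```python
from datetime import datetime, timezone

# Clock bounds copied from the sample history_uint query.
CLOCK_UPPER = 1394207920  # clock <= this value
CLOCK_LOWER = 1394204320  # clock >  this value


def window_seconds(lower: int, upper: int) -> int:
    """Length in seconds of the half-open range (lower, upper]."""
    return upper - lower


def to_utc(ts: int) -> str:
    """Render a Unix timestamp as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    span = window_seconds(CLOCK_LOWER, CLOCK_UPPER)
    print(f"window: {span} s ({span / 3600:g} h)")
    print(f"from {to_utc(CLOCK_LOWER)} to {to_utc(CLOCK_UPPER)}")
    # For comparison, the length in seconds of a 3-day range:
    print(f"3 days: {3 * 24 * 3600} s")
```

The range works out to 3600 seconds, so this particular query presumably serves a trigger that looks at one hour of data. A 3-day range is 259200 seconds, 72 times as long.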
These appear to be the queries that populate the value cache. Since I have so many triggers (some of which look at 3 days' worth of data), reading this data from the database can take several minutes. Ordinarily that wouldn't be much of a problem, but while these queries run, no history data appears to be inserted into the Zabbix database. I have 24 DB syncers. Right after the server starts, I can see that all 24 are being used to fetch data for the value cache. After about 2-4 minutes, only 6 of these queries are running at any given time, yet none of the other 18 history syncers insert data during this time. Once the value cache is populated (usually after about 6-7 minutes), history is inserted into the DB as expected.
Because of this, I have a data "blackout" of about 6-7 minutes whenever I restart the Zabbix server process. This will only get worse as more triggers are added, so the problem is limiting our ability to scale Zabbix. Is there perhaps some internal cache locking going on that prevents the history syncers from inserting data into the DB? If I had to guess, I would assume it has something to do with evaluating triggers against the new data.