-
Problem report
-
Resolution: Unresolved
-
Trivial
-
None
-
7.0.10
-
None
-
OS: Redhat 8 (vcpu 6 / mem 25gb)
DB: Mariadb 6.10.21 (OS: el8 / vcpu 8 / mem 96gb / ssd 8T)
NVPS: ~9k
Proxies Count: 80
Number of hosts (enabled/disabled): 10864 (10337 / 527)
Number of templates: 1352
Number of items (enabled/disabled/not supported): 2715946 (1943367 / 601862 / 170717)
Number of triggers (enabled/disabled [problem/ok]): 854963 (791361 / 63602 [6645 / 784716])
Number of users (online): 104 (5)
Required server performance, new values per second: 9026.65
High availability cluster: Enabled (fail-over delay: 1 minute)
Steps to reproduce:
- Update from 7.0.9 to 7.0.10
Result:
After the version update, our server (around 9k NVPS) took far too long to return to normal when synchronizing data with the proxies and writing it to the database. The problem persisted even after 48 hours. The history syncer processes were barely working: it was very hard to observe them actually sending data to the database.
In addition, History Cache usage never dropped below 80% throughout this entire period. These issues were not present on version 7.0.9, before the update.
No changes were made apart from the minor version update to 7.0.10 on 03/12/2025.
The proxy queue, which normally stays between 10k and 15k, also spiked to over 1.5 million. After rolling back to version 7.0.9, everything returned to normal in less than an hour.
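To make the symptoms above easier to compare between 7.0.9 and 7.0.10, here is a minimal Python sketch that checks a series of samples against the thresholds mentioned in this report (history cache at or above 80%, proxy queue above its usual 15k ceiling). The `Sample` class and `regression_symptoms` function are hypothetical helpers, not part of Zabbix. How the samples are collected is left open; the cache percentage could, for instance, come from the internal item `zabbix[wcache,history,pused]`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One observation of the two metrics discussed in this report (hypothetical structure)."""
    history_cache_used_pct: float  # percentage of history cache in use
    proxy_queue: int               # proxy queue size at the same moment


def regression_symptoms(
    samples: List[Sample],
    cache_floor_pct: float = 80.0,     # from the report: usage never dropped below 80%
    queue_baseline_max: int = 15_000,  # from the report: queue normally 10k to 15k
) -> List[str]:
    """Return a description of each symptom that the samples exhibit."""
    if not samples:
        return []

    symptoms = []

    # Symptom 1: the cache never drained below the floor in any sample.
    lowest_cache = min(s.history_cache_used_pct for s in samples)
    if lowest_cache >= cache_floor_pct:
        symptoms.append(
            f"history cache never dropped below {cache_floor_pct:.0f}% "
            f"(lowest observed: {lowest_cache:.1f}%)"
        )

    # Symptom 2: the proxy queue exceeded its usual ceiling at least once.
    peak_queue = max(s.proxy_queue for s in samples)
    if peak_queue > queue_baseline_max:
        symptoms.append(
            f"proxy queue peaked at {peak_queue}, above the usual "
            f"maximum of {queue_baseline_max}"
        )

    return symptoms


if __name__ == "__main__":
    # Illustrative values only, chosen to resemble the behaviour described above.
    observed = [Sample(85.0, 1_500_000), Sample(92.5, 900_000)]
    for line in regression_symptoms(observed):
        print(line)
```

Running the same check on samples taken after the rollback to 7.0.9 should return an empty list if the system has indeed recovered.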
Expected:
Data processing should return to normal within 3 hours of the update at most, with the proxy queue and history cache returning to their usual levels.