-
Problem report
-
Resolution: Cannot Reproduce
-
Trivial
-
None
-
6.4.12
-
None
-
DB: (PostgreSQL) 14.8 + timescaledb 2.7.1
40 CPU, 160 GB RAM
Server: HA Zabbix two nodes cluster - 6.4.8 1 CPU, 94 GB RAM
frontend: 6.4.8
proxy: 14x 6.4.4 with postgresql 14.5, 4vcpu 16 GB RAM, mode: active
Number of hosts (enabled) - 32676
Number of items (enabled/disabled/not supported) - 5665333 / 98704 / 177889
Number of triggers (enabled/disabled) - 2782781 / 49553
Required server performance, new values per second - 8032.17
90% of items have logrt.count key
We have 14 proxies in the environment. Under "Administration" -> "Proxies", the per-proxy item count is incorrect. For readability we focus on a single proxy here, but the count is wrong for all proxies.
In the proxy tab we have:
Name | Mode | Encryption | Version | Last seen (age) | Host count | Item count | Required vps | Hosts |
---|---|---|---|---|---|---|---|---|
ProxyXXX | Active | None | 6.4.4 | 2s | 2004 | 1898 | 539.26 | xxx,xxx,xxx,xxx...... |
The host count is correct, but the item count (1898) is far too low. "Required vps" appears to be correct.
We started debugging by capturing the "GUI -> server" request with tcpdump, since proxy status data is taken directly from the server's trapper process. The response below is shortened to the most relevant information:
(...)
"item stats": [
  (...)
  { "attributes": { "proxyid": 30913, "status": 1 }, "count": 17 },
  { "attributes": { "proxyid": 30913, "status": 0, "state": 1 }, "count": 57 },
  { "attributes": { "proxyid": 30913, "status": 0, "state": 0 }, "count": 1841 },
  (...)
]
(...)
Of course, 1841 + 57 = 1898, which agrees with the above number of items (from proxies tab). So we have ruled out a problem on the GUI side.
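The same check can be scripted against a captured response. The following is a minimal sketch, assuming the "item stats" array has already been extracted from the tcpdump capture and parsed into Python dictionaries; the helper name `items_per_proxy` is hypothetical, and the interpretation of `status` 1 as disabled and `state` 1 as not supported is our reading of the fragment above.

```python
def items_per_proxy(item_stats, proxyid):
    """Sum the "count" fields of one proxy, grouped by item status/state."""
    totals = {"normal": 0, "not_supported": 0, "disabled": 0}
    for entry in item_stats:
        attrs = entry["attributes"]
        if attrs.get("proxyid") != proxyid:
            continue  # entry belongs to another proxy (or to no proxy)
        if attrs["status"] == 1:
            # Assumption: status 1 means the item is disabled.
            totals["disabled"] += entry["count"]
        elif attrs.get("state") == 1:
            # Assumption: state 1 means enabled but not supported.
            totals["not_supported"] += entry["count"]
        else:
            totals["normal"] += entry["count"]
    return totals


# The three entries shown in the shortened response above.
item_stats = [
    {"attributes": {"proxyid": 30913, "status": 1}, "count": 17},
    {"attributes": {"proxyid": 30913, "status": 0, "state": 1}, "count": 57},
    {"attributes": {"proxyid": 30913, "status": 0, "state": 0}, "count": 1841},
]

totals = items_per_proxy(item_stats, 30913)
enabled = totals["normal"] + totals["not_supported"]
print(totals, enabled)
```

With the values from the capture, `enabled` is 1841 + 57 = 1898, the number displayed in the Proxies tab.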
We then checked the configuration on the proxy side. In short, after increasing the debug level we saw the correct value in the logs:
(...)
411987:20240314:111800.365 DCsync_configuration() items : 396984 (1760203 slots)
411987:20240314:111800.365 DCsync_configuration() items_hk : 396984 (1760203 slots)
411987:20240314:111800.365 DCsync_configuration() numitems : 338924 (1760203 slots)
(...)
We therefore assume that "server <-> proxy" communication is correct, and we have ruled out a problem on the proxy side.
Returning to the "GUI -> server" communication: the response contains wrong values, so we suspect an error on the zabbix-server side, in particular in its communication with the cache.
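To compare the proxy-side numbers with the GUI value without reading the log by hand, the DCsync_configuration() lines can be parsed. This is a rough sketch, assuming the log lines have exactly the layout shown above; the regular expression and the helper name `parse_sync_counts` are our own and not part of Zabbix.

```python
import re

# Log excerpt copied from the proxy log above.
LOG = """\
411987:20240314:111800.365 DCsync_configuration() items : 396984 (1760203 slots)
411987:20240314:111800.365 DCsync_configuration() items_hk : 396984 (1760203 slots)
411987:20240314:111800.365 DCsync_configuration() numitems : 338924 (1760203 slots)
"""

# Captures the cache table name and its entry count; the slot count is matched but ignored.
PATTERN = re.compile(r"DCsync_configuration\(\)\s+(\w+)\s*:\s*(\d+)\s+\(\d+ slots\)")


def parse_sync_counts(text):
    """Return {table_name: entry_count} for every matching log line."""
    return {m.group(1): int(m.group(2)) for m in PATTERN.finditer(text)}


counts = parse_sync_counts(LOG)
gui_item_count = 1898  # value shown in the Proxies tab for this proxy

print(counts)
print("proxy cache items vs GUI item count:", counts["items"], "vs", gui_item_count)
```

The proxy cache reports 396984 items, which is orders of magnitude above the 1898 shown in the frontend.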
After a quick review of the code, we suspect the dc_status_update() function in dbconfig.c, where, for example, the per-proxy item counters are incremented:
(...)
if (NULL != dc_proxy_host)
	dc_proxy_host->items_active_normal++;
(...)
if (NULL != dc_proxy_host)
	dc_proxy_host->items_active_notsupported++;
(...)
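To make the suspicion concrete, here is a simplified model of that counting pattern. It is not the Zabbix implementation: the `Item` type, the `count_per_proxy` helper and the constant values are assumptions for illustration, and the real dc_status_update() may apply further conditions that are not visible in the excerpt. The point it shows is that an item only reaches a per-proxy counter if the proxy lookup for it succeeds.

```python
from collections import Counter
from typing import NamedTuple, Optional

# Assumed values, matching the status/state values seen in the response.
ITEM_STATUS_ACTIVE = 0
ITEM_STATE_NORMAL = 0
ITEM_STATE_NOTSUPPORTED = 1


class Item(NamedTuple):
    proxyid: Optional[int]  # None models a failed or absent proxy lookup
    status: int
    state: int


def count_per_proxy(items):
    """Count enabled items per proxy, skipping items without a resolved proxy."""
    counters = {}
    for item in items:
        if item.status != ITEM_STATUS_ACTIVE:
            continue
        if item.proxyid is None:
            # Stands in for the `NULL != dc_proxy_host` guard: no proxy, no increment.
            continue
        per_proxy = counters.setdefault(item.proxyid, Counter())
        if item.state == ITEM_STATE_NORMAL:
            per_proxy["items_active_normal"] += 1
        elif item.state == ITEM_STATE_NOTSUPPORTED:
            per_proxy["items_active_notsupported"] += 1
    return counters


sample = [
    Item(30913, 0, 0),
    Item(30913, 0, 0),
    Item(30913, 0, 1),
    Item(30913, 1, 0),  # disabled: not counted as active
    Item(None, 0, 0),   # proxy not resolved: silently missing from the per-proxy count
]
result = count_per_proxy(sample)
print(result)
```

In this model, any item whose proxy is not resolved at counting time drops out of the per-proxy total without an error, which would produce an undercount of the kind we observe.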
Unfortunately, these are just suspicions and we are not sure where the error is.