[ZBX-22813] Zabbix server |LLD | loss of data because of OOM (?) Created: 2023 May 17  Updated: 2023 Nov 23  Resolved: 2023 Nov 23

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Problem report Priority: Trivial
Reporter: db100 Assignee: Tomass Janis Bross
Resolution: Incomplete Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

This is a bug report. I will be providing logs and other pieces of information next.

I am running Zabbix 6.4.0 on Kubernetes and I have imposed a limit on its memory.

The server handles about 2000 hosts and more than 20000 items, all discovered by means of LLD. All LLD discovery rules are set to retain undiscovered items for 30 days.

For some reason, at some point the server simply deletes all discovered hosts and items in a housekeeping job that, according to the log, takes about 400 seconds to run (regular housekeeping usually takes just a few seconds).

I am still investigating this issue, but I fear it might be linked to the CPU or RAM limits imposed on the Pod. Hence the title of this post.
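
One way to test the OOM suspicion from inside the container is to read the kernel's OOM counter for the Pod's cgroup. The following is a minimal sketch, assuming a cgroup v2 node where the container sees its own `memory.events` file at `/sys/fs/cgroup/memory.events`. The function names are illustrative and not part of Zabbix or Kubernetes. The counter should only cover kills since the current container started, so it is most useful when a child process was killed while the container itself kept running.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_memory_events(text: str) -> Dict[str, int]:
    """Parse cgroup v2 memory.events content ("key value" per line)."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def oom_kill_count(path: str = "/sys/fs/cgroup/memory.events") -> Optional[int]:
    """Return the oom_kill counter, or None if the file cannot be read.

    The default path is an assumption (cgroup v2, read inside the container).
    """
    try:
        text = Path(path).read_text()
    except OSError:
        return None
    return parse_memory_events(text).get("oom_kill", 0)


if __name__ == "__main__":
    count = oom_kill_count()
    if count is None:
        print("memory.events not readable (cgroup v1 or different mount?)")
    else:
        print(f"processes OOM-killed in this cgroup: {count}")
```

A non-zero value would support the OOM theory; zero does not rule it out if the container has been restarted since.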

Has anyone had similar symptoms? This is quite a bad bug, I must say.

UPDATE

Here is the log I see in the server container: "invalid discovery rule ID [51085]" (this appears for most of the rules)

and before that:

```
244:20230517:065155.162 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: deadlock detected
DETAIL: Process 1680 waits for ShareLock on transaction 2741830; blocked by process 1683.
Process 1683 waits for ShareLock on transaction 2741132; blocked by process 1680.
HINT: See server log for query details.
CONTEXT: while deleting tuple (36,3) in relation "item_rtdata"
SQL statement "DELETE FROM ONLY "public"."item_rtdata" WHERE $1 OPERATOR(pg_catalog.=) "itemid""
[delete from functions where (itemid in
```
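
The DETAIL lines above describe which backend waits on which. As a minimal sketch (the regular expression and helper names are illustrative, written only against the message wording shown in this log), the wait edges can be extracted and checked for a cycle like this:

```python
import re

# The two DETAIL lines copied from the server log above.
SAMPLE = """DETAIL: Process 1680 waits for ShareLock on transaction 2741830; blocked by process 1683.
Process 1683 waits for ShareLock on transaction 2741132; blocked by process 1680.
"""

WAIT_RE = re.compile(
    r"Process (\d+) waits for (\w+) on transaction (\d+); blocked by process (\d+)\."
)


def parse_deadlock_waits(log_text):
    """Return (waiter_pid, lock_mode, transaction_id, blocker_pid) tuples."""
    return [
        (int(waiter), mode, int(xid), int(blocker))
        for waiter, mode, xid, blocker in WAIT_RE.findall(log_text)
    ]


def is_cycle(waits):
    """True if following 'blocked by' links from some waiter leads back to it."""
    blocked_by = {waiter: blocker for waiter, _, _, blocker in waits}
    for start in blocked_by:
        seen = set()
        pid = start
        while pid in blocked_by and pid not in seen:
            seen.add(pid)
            pid = blocked_by[pid]
            if pid == start:
                return True
    return False


if __name__ == "__main__":
    waits = parse_deadlock_waits(SAMPLE)
    for waiter, mode, xid, blocker in waits:
        print(f"pid {waiter} waits for {mode} on xid {xid}, held by pid {blocker}")
    print("cycle:", is_cycle(waits))
```

Here processes 1680 and 1683 block each other, which suggests two concurrent delete paths touching the same item rows; the PostgreSQL server log mentioned in the HINT would be needed to see both statements.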

If I try to create more LLD rules, it seems that they are not executed. I see no useful logs in the server process, only these two things:

  • postgres DB log: `LOG: could not receive data from client: Connection reset by peer`
  • zabbix server:

 
```
...
271:20230517:073822.065 server #35 started
7:20230517:073822.068 "zabbix-server-..." node started in "active" mode
272:20230517:073822.068 server #36 started
273:20230517:073822.072 server #37 started
Bad operator (INTEGER): At line 73 in /var/lib/mibs/ietf/SNMPv2-PDU
243:20230517:073823.368 thread started
243:20230517:073823.368 thread started
243:20230517:073823.368 thread started
```
This is the only log I see ...



 Comments   
Comment by db100 [ 2023 May 17 ]

OK, I have tried to create a new instance of Zabbix 6.4.2 and apply all the hosts and templates needed for the LLD to run, and it does NOT create any hosts. I am not sure whether the host prototype specification has changed; maybe there was a breaking change with the upgrade? I see no error in the console log, but from the message posted above it seems that something no longer works in the host generation process of the LLD (which uses a JavaScript preprocessor, so maybe something was broken there by the update?).

Comment by db100 [ 2023 May 17 ]

 
OK, I have figured out what is wrong now:

"Zabbix does not support nested host prototypes, i.e. host prototypes are not supported on hosts that are discovered by low-level discovery rule."

https://www.zabbix.com/documentation/current/en/manual/discovery/low_level_discovery/host_prototypes

I have no idea how I managed to create all the hosts using these nested host prototype rules, but it seems to have worked?

But now all the hosts are gone, which is strange, because I was expecting them to simply become "disabled" or unsupported.
 

Comment by db100 [ 2023 May 19 ]

Please note that the scope of this issue has changed: the real problem here is that some LLD rules suddenly became invalid, which led to data loss:

> Here is the log I see in the server container: "invalid discovery rule ID [51085]"

So the point to look into would be how to prevent this from happening in the first place.
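
To see how many rules are affected, the server log can be scanned for that message. A minimal sketch, assuming only the message text quoted above; the sample lines and the second rule ID are made up for illustration:

```python
import re
from collections import Counter

# Matches the message quoted in this issue and captures the rule ID.
RULE_RE = re.compile(r"invalid discovery rule ID \[(\d+)\]")


def count_invalid_rules(lines):
    """Count how often each discovery rule ID is reported as invalid."""
    counts = Counter()
    for line in lines:
        for rule_id in RULE_RE.findall(line):
            counts[int(rule_id)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample: only ID 51085 comes from the report; 51086 is invented.
    sample = [
        "invalid discovery rule ID [51085]",
        "server #35 started",
        "invalid discovery rule ID [51086]",
        "invalid discovery rule ID [51085]",
    ]
    for rule_id, hits in sorted(count_invalid_rules(sample).items()):
        print(f"rule {rule_id}: {hits} occurrence(s)")
```

In practice the lines would come from the server log file, e.g. by passing an open file object to `count_invalid_rules`.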

Comment by Tomass Janis Bross [ 2023 May 19 ]

Hello db!

1. In your first comment you mention that you tried to create a new instance. Did you create a whole new instance or simply upgrade the existing one?
2. On the new/upgraded instance, did you apply discovery rules with host prototypes just as you had before, or did you make any changes?

Also, and this is very important, please describe in detail how you managed to create nested host prototypes. Our team tested this in 6.0 and it was impossible: Zabbix simply would not allow assigning a template with host discovery to a host that was created by host discovery.
Please describe what the hosts were and what technologies were used (Docker or LXC containers, VMware machines, etc.), and please provide links to the templates you used, or export and attach them if you created them yourself.

Cheers,
Tom

Comment by Tomass Janis Bross [ 2023 Nov 23 ]

No additional information provided. Feel free to re-open if there is more information.
