Type: Problem report
Resolution: Cannot Reproduce
Priority: Trivial
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- api
- items
- lld
- triggers
Environment:
CentOS 7.7, Zabbix 4.4.3, MySQL 5.7

Steps to reproduce:

I have a fairly large deployment of 3500 hosts, plus about 250 pseudo hosts that we create dynamically to hold aggregate items for each host group (named "group_<hostgroup>"). We wrote custom code that parses our software's metrics and sends them to trapper items. The same code creates all of the templates, applications, items, triggers, LLD rules, hosts, prototypes, etc.

The custom code is written in Python and makes heavy use of multiprocessing to split up metric parsing, metric construction and formatting, and sending of the LLD data and metrics to the trappers. One of the processes reads our "template" configuration file and, if it detects a change, applies that change. In this specific case, I added a new triggerprototype. I believe the prototype depended on about 6000 to 8000 items in our database, so the JSON-RPC API POST that my code sent to create it took about 1 minute.

At the same time, my code's log file showed that new metrics were arriving, which triggered the creation of new aggregate items and of triggers for those aggregate items. From that point on, any new triggers, hosts, etc. that I tried to create failed with error code 32500 (Application Error); the cause given was a duplicate entry for the primary key, followed by the ID number of a functionid. The function and the trigger it referenced belonged to one of the triggers that my custom code had dynamically created for the aggregate items.

Note that the aggregate items only compute the max and min values of certain metrics within our host groups. I don't think the aggregate items are the cause.
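The failure pattern described above (several processes issuing configuration-changing API calls at once) could be worked around on the client side by serializing those calls and retrying on the application error. The following is only a minimal sketch under assumptions: `send` is a hypothetical callable that performs the JSON-RPC POST and returns the decoded response, `call_serialized` and `ApiError` are illustrative names, and the check on the absolute value of the code simply mirrors the 32500 reported above. It has not been verified against the deployment in this report.

```python
import threading
import time

# Hypothetical lock shared by every worker that changes Zabbix configuration.
# With multiprocessing, a multiprocessing.Lock() handed to each worker
# would play the same role.
_config_lock = threading.Lock()


class ApiError(Exception):
    """Raised when the (assumed) JSON-RPC response carries an error object."""

    def __init__(self, code, data):
        super().__init__(f"{code}: {data}")
        self.code = code
        self.data = data


def call_serialized(send, method, params, retries=3, delay=0.0, lock=_config_lock):
    """Run send(method, params) under a lock, retrying on application errors.

    `send` is an assumed callable returning a dict with either a "result"
    key or an "error" key holding "code" and "data".
    """
    last_error = None
    for attempt in range(retries):
        # Only one configuration change is in flight at a time.
        with lock:
            response = send(method, params)
        error = response.get("error")
        if error is None:
            return response["result"]
        last_error = ApiError(error.get("code"), error.get("data"))
        # Retry only the application error seen in this report.
        if abs(error.get("code") or 0) != 32500:
            break
        time.sleep(delay * (attempt + 1))
    raise last_error
```

Serializing the calls trades throughput for safety: a one-minute triggerprototype request would hold up every other configuration change until it finishes, so this is a diagnostic aid rather than a fix.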

There isn't much to go on in zabbix_server.log. I'm curious whether the cause is creating a triggerprototype that references a large number of items (6000 to 8000 in my case) while simultaneously creating a batch of items and triggers for those items. I occasionally see in the log that some incrementer increments the ID numbers for triggers. Is it possible that concurrent requests to the database to create new triggers and triggerprototypes produce duplicate primary keys (IDs)? Also of note: the LLD workers became 100% utilized, and several deadlocks appeared in my MySQL log when I performed the actions above. Perhaps the deadlock in the database is the culprit?
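The question about duplicate IDs can be illustrated with a toy model of a shared "next ID" counter. This is not Zabbix's actual ID allocation code; the `table` dict and both function names are hypothetical, and the sketch only shows how two writers that read the counter before either stores its update would hand out overlapping IDs.

```python
def allocate_unsafe(table, count_a, count_b):
    """Two writers both read nextid before either stores its update."""
    seen_a = table["nextid"]
    seen_b = table["nextid"]  # stale read: writer A has not written yet
    table["nextid"] = seen_a + count_a
    table["nextid"] = seen_b + count_b
    ids_a = set(range(seen_a + 1, seen_a + count_a + 1))
    ids_b = set(range(seen_b + 1, seen_b + count_b + 1))
    return ids_a, ids_b


def allocate_serialized(table, count_a, count_b):
    """Writer B reads nextid only after writer A has stored its update."""
    seen_a = table["nextid"]
    table["nextid"] = seen_a + count_a
    seen_b = table["nextid"]
    table["nextid"] = seen_b + count_b
    ids_a = set(range(seen_a + 1, seen_a + count_a + 1))
    ids_b = set(range(seen_b + 1, seen_b + count_b + 1))
    return ids_a, ids_b


unsafe_a, unsafe_b = allocate_unsafe({"nextid": 100}, 3, 2)
safe_a, safe_b = allocate_serialized({"nextid": 100}, 3, 2)
print("unsafe overlap:", sorted(unsafe_a & unsafe_b))
print("serialized overlap:", sorted(safe_a & safe_b))
```

In the unsafe interleaving both writers claim IDs starting from the same value, which is the kind of collision a "duplicate entry for primary key" error would report. Whether the server can actually reach such an interleaving is exactly what this report asks.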

Result:

Expected:

Assignee: Aigars Kadikis
Reporter: Ryan Eberly
Votes: 0
Watchers: 4

Created: 2020 Feb 21 23:57
Updated: 2020 Mar 25 17:39
Resolved: 2020 Mar 25 17:39
