[ZBXNEXT-2519] Move API to Zabbix Server Created: 2014 Oct 16  Updated: 2021 Sep 06

Status: Reopened
Project: ZABBIX FEATURE REQUESTS
Component/s: API (A), Server (S)
Affects Version/s: None
Fix Version/s: None

Type: New Feature Request Priority: Major
Reporter: Alexei Vladishev Assignee: Unassigned
Resolution: Unresolved Votes: 34
Labels: performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by ZBX-10119 Method to go over all triggers given ... Closed

 Description   

There are a number of drawbacks to the current implementation of the Zabbix API:

  • code duplication (C, PHP)
  • poor performance, especially for template-related operations
  • lack of bulk operations and in-memory cache
  • API code is too tied to front-end code (no clear separation)
  • API is not available without Zabbix front-end

It's proposed to move the API to the Zabbix Server side.



 Comments   
Comment by Marc [ 2014 Nov 15 ]

There should be a reliable mechanism (timeout?) to cancel or prevent expensive API requests.
Currently one can just restart the HTTP server. Restarting the Zabbix server is certainly no option.

Otherwise issues like ZBX-6763 might become a serious problem.
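
As a rough illustration of the timeout idea above, here is a minimal Python sketch of a cooperative deadline: the handler checks a time budget between units of work and aborts once it is exceeded. The names `Deadline`, `expensive_request` and `handle` are hypothetical and not part of Zabbix; a real server-side implementation would presumably be written in C and might cancel work differently.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when a request runs past its time budget."""


class Deadline:
    """Tracks a time budget; the clock is injectable so it can be tested."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + timeout_s

    def check(self):
        # Called by the handler between units of work.
        if self._clock() >= self._expires_at:
            raise DeadlineExceeded("request exceeded its time budget")


def expensive_request(rows, deadline):
    """Hypothetical handler: processes rows, checking the deadline each time."""
    processed = 0
    for _ in rows:
        deadline.check()
        processed += 1
    return processed


def handle(rows, timeout_s, clock=time.monotonic):
    """Runs the handler and turns a blown deadline into an error response."""
    deadline = Deadline(timeout_s, clock)
    try:
        return {"result": expensive_request(rows, deadline)}
    except DeadlineExceeded as exc:
        return {"error": str(exc)}
```

The request is abandoned without restarting any process, which is the property asked for here. The trade-off of a cooperative check is that a handler which never calls `check()` cannot be interrupted.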

Comment by Marc [ 2014 Nov 15 ]

Btw, how about implementing this as a separate process accessing the shared memory of the server?

Comment by Andris Zeila [ 2014 Nov 21 ]

Yes, the current plan is to implement it as separate process(es). This will allow us to partly re-use existing functionality for some requests (for example, template linking) and will also give access to the configuration/value caches.

Comment by Oleksii Zagorskyi [ 2015 Aug 28 ]

The performance issue for template-related operations is discussed in ZBX-6118.

Comment by OFN Team [ 2015 Nov 10 ]

We've had a number of comments from our developers, who are creating microservices (which we'll run in application containers) and are looking for a RESTful endpoint to send metrics to. Will this be available on the proxy as well, and will the API be expanded to allow updating item values?

Comment by Ross Peoples [ 2016 Feb 01 ]

Can we make this a separate component from the Zabbix server entirely so that it can be installed on a separate machine from the server? The reason I ask is this: the server already has a lot to do, ingesting data from the proxies, writing it to the database, monitoring anything not monitored by the proxies, calculating triggers and items, and running actions based on those triggers in addition to the other various tasks it has to do.

In a large enterprise environment, being able to horizontally scale is a requirement and thus far, we've been able to do that by adding proxies and additional web servers, but there can only be one Zabbix server. We are planning to start using the Zabbix API much more heavily in the next year, so being able to set up multiple API servers behind a load balancer would be extremely beneficial and take some load off the Zabbix server.

Comment by Ryan Armstrong [ 2016 Apr 03 ]

I agree with Ross. We definitely need to start separating the server into discrete, horizontally scalable micro-services. May need to look at memcached/redis for distributing in-memory caches between components.

I did consider prototyping a RESTful API written in C, but it led to another million new cool ideas and I couldn't decide where to start. For example, it really should sit in front of a shared Data Access Layer (or two: one for config, one for metrics) which could also be a discrete component with a RESTful/other API and then be used by the front-end, server, public API, etc.

So in summary, please don't move the API to the server binary; let's make it discrete and scalable.

Comment by Alexei Vladishev [ 2016 Apr 04 ]

Ryan & Ross, I totally agree with you. The API component must be as independent as possible and ideally shouldn't affect runtime processing of Zabbix Server. We may also give users a choice: run it as part of Zabbix Server or as a standalone process for those who prefer scalability. In both cases the API component shouldn't be tightly coupled with Zabbix Server; there will be no shared resources at all.

On the other hand, having the API on the Zabbix Server side would allow interesting things like building a distributed or HA Zabbix Server on top of the API, or basing all communications (server-agent, proxy-server) entirely on API calls. Just imagine Zabbix Agent doing history.put or Proxy doing config.get. I like it, do you?
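
To make the idea concrete, here is a small Python sketch of what such calls could look like as JSON-RPC 2.0 requests, the protocol the Zabbix API uses. The method names `history.put` and `config.get` come from the comment above and are not confirmed API methods; the parameter shapes (`host`, `key`, `value`, `clock`, `proxy`) and the `build_request` helper are assumptions made purely for illustration.

```python
import json


def build_request(method, params, request_id=1):
    """Serialises a JSON-RPC 2.0 request body for the given method."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "method": method,
            "params": params,
            "id": request_id,
        }
    )


# Hypothetical agent call pushing one collected value (assumed parameters).
agent_call = build_request(
    "history.put",
    {"host": "web01", "key": "system.cpu.load", "value": "0.42", "clock": 1459728000},
)

# Hypothetical proxy call fetching its configuration (assumed parameters).
proxy_call = build_request("config.get", {"proxy": "proxy01"}, request_id=2)
```

Both bodies would be sent to whatever endpoint the API component exposes; transport and authentication are left out because the thread does not specify them.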

Comment by richlv [ 2016 Apr 04 ]

when decoupling the api, what about the server caches (value, configuration) ?
those would be very valuable to reuse for the api.
separate caching engines like redis could be considered, but the complexity of a zabbix deployment would increase massively

Comment by Ryan Armstrong [ 2016 Apr 05 ]

Yeah I think the caches are valuable to the API but do agree that redis/memcached/other would add install complexity.
Alexei, I definitely like it.

I understand this idea is turning into a complete redesign (Zabbix v4?), but if you will indulge me, I'd like to regurgitate the "million new cool ideas" that I feel may actually have value to you. Take it or tweak it or leave it I guess, but I feel some isolation and abstraction would make Zabbix stand out from the crowd.

In brief:

  • All shared memory caches are reimplemented as discrete services with private APIs (i.e. Zabbix implementation of redis/memcached/etcd)
  • Each worker process (currently discrete PIDs via fork()) migrated to a discrete service with private API (e.g. trapper, pinger, poller, trigger eval, action queue, etc.)
  • Two new Data-Access Layer (DAL) private APIs to abstract away the database (one for time series, one for config/other) to enable multiple backends (OpenTSDB, MongoDB, etc.) without affecting other components. The config service could apply templates at runtime rather than storing duplicate data to disk (which is then expensive to update)
  • Use a message queue to improve API interconnects (pub/sub), atomicity of changes, producer/consumer load balancing, agent comms, delivery and queuing of metrics, actions, etc.
  • a watchdog service to make sure each component is started/listening
  • Public API is then a discrete service which leverages private APIs and DAL
  • Web front-end should exclusively use the public API ("eat your own dog food")
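
The pub/sub interconnect from the list above can be sketched roughly as follows. This is an in-process Python toy, not a real message queue: the `Broker` class, the `config.changed` topic and the message fields are all hypothetical, and a real deployment would use an actual broker with delivery guarantees.

```python
from collections import defaultdict


class Broker:
    """Minimal in-process publish/subscribe hub keyed by topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Registers a callback to be invoked for every message on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Delivers a message to all subscribers; returns how many received it."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


# Two hypothetical services that react to configuration changes.
poller_inbox = []
trigger_inbox = []

broker = Broker()
broker.subscribe("config.changed", poller_inbox.append)
broker.subscribe("config.changed", trigger_inbox.append)

# A hypothetical config service announces a template update once.
delivered = broker.publish("config.changed", {"template": "Linux", "version": 2})
```

The publisher does not need to know which services consume the change, which is what would let components be added or scaled independently.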

Advantages:

  • Improved (limitless?) scalability of each component (deploy as monolithic server or distributed micro-services (containerised?))
  • HA per component
  • Significantly improved extensibility (new DAL backends or event brokers would be so much easier!)
  • Improved topologies across network boundaries
  • Eliminate proxies (replace with ZabbixMQ or agent proxies like SCCM)
  • faster changes to templates via config API with updates published on the MQ (no bulk updates or syncers to DB)
  • Leverage service discovery protocols (e.g. consul)

Should I put together some sort of proposal on the wiki?

Generated at Fri Mar 29 07:53:19 EET 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.