[ZBXNEXT-413] Maintenance Period on Trigger/service level Created: 2010 Jun 17  Updated: 2024 Apr 10  Resolved: 2018 Sep 03

Status: Closed
Project: ZABBIX FEATURE REQUESTS
Component/s: API (A), Frontend (F), Server (S)
Affects Version/s: 1.8.2
Fix Version/s: 4.0.0beta1, 4.0 (plan)

Type: New Feature Request Priority: Major
Reporter: Florian Koch Assignee: Andris Zeila
Resolution: Fixed Votes: 140
Labels: flexibility, maintenance
Σ Remaining Estimate: Not Specified Remaining Estimate: Not Specified
Σ Time Spent: Not Specified Time Spent: Not Specified
Σ Original Estimate: Not Specified Original Estimate: Not Specified

Issue Links:
Causes
causes ZBX-14891 Undefined index: show_timeline [jsrpc... Closed
causes ZBX-14954 Invalid psql query crashes Zabbix SQL... Closed
causes ZBX-15124 Trigger overview behavior when using ... Closed
Duplicate
is duplicated by ZBXNEXT-1203 Per trigger maintenance Closed
is duplicated by ZBXNEXT-3603 More nuanced maintenance windows Closed
is duplicated by ZBXNEXT-3304 Maintenance mode and IT Services / SL... Closed
Sub-task
depends on ZBXNEXT-4684 Use shared table locks when processin... Closed
Sub-Tasks:
Key | Summary | Type | Status | Assignee
ZBXNEXT-4572 | Frontend: Maintenance Period on Trigg... | Specification change (Sub-task) | Closed | Andrejs Griščenko
Team: Team A
Sprint: Sprint 34, Sprint 35, Sprint 36, Sprint 37, Sprint 38, Sprint 39, Sprint 40, Sprint 41
Story Points: 7

 Description   

Is it possible to add Maintenance Periods for triggers only?

We are monitoring hosts with multiple services on each, and if I want to shut down only one service, I need to put the complete host into maintenance. This is bad, because if another service fails, Zabbix won't report it.

So I would like to be able to set Maintenance Periods for hosts and/or triggers.

rgds flo



 Comments   
Comment by Benjamin Coles [ 2010 Nov 19 ]

I'd second this. We have some servers that run into constant memory triggers due to the JVM doing a garbage dump. It'd be nice to set up a maintenance period until the issue gets resolved.

Cheers,
Benjamin Coles
Senior Systems Administrator
Information Resources & Technology
Stanford School of Medicine

Comment by Florian Koch [ 2011 Feb 25 ]

Any news on this?

Comment by Jessie Bryan [ 2011 May 24 ]

We are a service provider with a large ratio of edge routers to interfaces (virtual ATM interfaces for instance)

So for one data center we may have 2 routers that host 20,000 interfaces.
We use nagios presently to schedule downtime on individual circuits (tied to the router host).
When a customer circuit goes down, it may be them doing maintenance after hours, or perhaps a power outage. An ACK doesn't do us any good since we cannot wrap a time frame around it, and it must first page us before we can silence notifications.

Another example is to schedule downtime on CPU triggers.

Comment by matthias zeilinger [ 2011 Jun 09 ]

It's also a good idea to add a maintenance option for complete templates too (in addition to a single trigger).

This is because we are monitoring many JBoss applications on one host. For each application we have a template, and if we maintain one application we want to put the complete template on this host into maintenance.

Comment by Paul [ 2011 Jun 14 ]

This would be awesome. We have network switches with 400+ ports; creating a window for the whole switch doesn't work for us (obviously).

Comment by Gavin Balaam [ 2011 Aug 10 ]

I think it best to resolve the maintenance issue in general first, but I have a system whereby I must stop monitoring on some components, but need to retain other monitors.

I'd therefore say that maintenance should be granular to an item, template, application, or server.

It would be good to have exclusions so a server can be in maintenance, but with specific items excluded. (But that is very much a nice-to-have.)

Any chance of someone supplying an ETA, or maybe a separate build that includes this?

Comment by David [ 2011 Oct 03 ]

I'd also really like the maintenance to be granular to an item, template, application, trigger or server.

A workaround I found was to clone the host and split the checks between the hosts (ones in maintenance on one and those that won't be in maintenance on the other).

Comment by Chandra [ 2012 Mar 26 ]

We are eagerly waiting for the ability to set maintenance for a trigger/item instead of a whole host or group.

Please make sure this issue is resolved in a future release.

Thanks

Comment by Volker Fröhlich [ 2012 Dec 03 ]

Maybe this approach can help some in the meantime:

http://blog.zabbix.com/a-workaround-for-trigger-based-maintenance/1527/

Comment by ca.unix [ 2012 Dec 19 ]

Hi,

While waiting for the feature, I found a way to disable trigger mails during a specific period:

=> exclude the trigger from the problems report and recreate an action specific to the trigger, specifying the period during which we want mail to be sent.

In case it is useful to someone...

Comment by quentel [ 2013 May 21 ]

I agree that this would be a great feature that actually corresponds to a need when monitoring applications.

Let's say your application running on several servers is under maintenance. What you expect to achieve with the "maintenance tool" in Zabbix is that the triggers corresponding to the application are not evaluated for a certain period of time. However, you should not turn off all triggers for the host.

An example is a database server on which one database is under maintenance. You would need to deactivate the trigger corresponding to that DB while the ones for other databases are still running.

To sum up, what would be great is to be able to select whether you want to apply maintenance globally for a host or just trigger by trigger!

Comment by Little Martian [ 2014 May 30 ]

If I may formulate a more generic approach: make a maintenance period linkable to an arbitrary list of objects. Currently you can only do this for objects of type host or host group, but it would be nice to link a maintenance to a list of items, so that triggers based on those items would no longer fire while the item is in maintenance, or to link a maintenance to a list of applications that may be spread over several hosts, etc.

Comment by Matthias Baur [ 2014 Sep 15 ]

This would be great! Please implement a more granular option for maintenance periods. It would also be great to set this on application basis.

Comment by Ionut Cadariu [ 2015 May 02 ]

Any news regarding Maintenance Period on Trigger?

Comment by Michelle Taggart [ 2015 Jun 05 ]

Voted on this and would love to see it in 3.0 if possible.

Comment by Gustavo Moura de Sousa [ 2015 Aug 25 ]

This feature request is important to put a web scenario in maintenance.

Comment by Ben Schuler [ 2016 Apr 13 ]

Hi all.

We are evaluating this tool to replace Nagios at a large shop and while Zabbix has some great features that we want to use, the lack of a per-service maintenance period is a deal-breaker.

We have a number of deployment automations in place that quiet a set of services for the duration of the maintenance. I would add a request to attach a timer to the maintenance period so that it expires after a pre-set duration.

Comment by Darcy Ning [ 2016 Jun 14 ]

For Zabbix version 1.8, please see: https://www.zabbix.com/forum/showthread.php?t=17341

Comment by Darcy Ning [ 2016 Jun 14 ]

If we add condition option "not" for "Type of calculation" this problem can also be solved.

Comment by Darcy Ning [ 2016 Jun 15 ]

Dear all, can anyone help with this case?

Comment by itcbsops [ 2016 Sep 13 ]

I see that this feature isn't present even in the upcoming 3.2.
Any news, please?
We need this feature because we have multiple services on single AIX machines. We don't want to disable all monitoring on the host but only some of the services.

Comment by richlv [ 2016 Sep 13 ]

as it is not on any roadmap, you might want to contact zabbix regarding financed development of this feature

Comment by Strahinja Kustudic [ 2016 Sep 13 ]

I was hoping that it would be possible to do something with Tags to fine grain maintenance in 3.2, but for now it doesn't look like that.

Comment by Frank [ 2016 Dec 13 ]

Would also be nice if maintenance could also be limited to specific "Applications".

Comment by itcbsops [ 2016 Dec 13 ]

I agree with @frank

Comment by Max DiOrio [ 2017 Feb 24 ]

I'll throw my hat into this. YES! This is crucial. We have load balancers, a single device with dozens of individual services that have monitors and triggers.

Just now, we have a single server causing us a headache, I put that server into maintenance, but I can't put the load balancer into maintenance since it would prevent us from receiving alerts on ALL our Load Balanced sites. NOT GOOD!

It doesn't make sense for us to make multiple devices for each site, that's a lot of overhead to maintain, especially since services are brought into Zabbix via LLD.

Comment by Pascal Uhlmann [ 2017 Apr 21 ]

In addition to maintenance on trigger level, it would be helpful if maintenance could be specified on item level, so that any trigger using this item would implicitly be put into maintenance.

Comment by Florent [ 2017 Jul 31 ]

Hello,
I add a comment to say that this feature would be great if implemented...
I would say that it would be even better if we could put a service/host in maintenance mode from the "Monitoring --> Problems" page as we do to acknowledge a triggered trigger.

I think more actions can be done within this page, like Centreon does.

Thanks

Comment by Glebs Ivanovskis (Inactive) [ 2017 Jul 31 ]

With ZBXNEXT-1675 it is possible in 3.4 to add a sort of item based daily or weekly maintenance without data collection. You should add another flexible interval with 0 as interval (no data collection) and {$MAINTENANCE} as period. Then you define {$MAINTENANCE} to be something like "2-4,5:00-5:30" (from Tuesday to Thursday from 5 to 5:30 a.m.) And you can use same macro in multiple item configurations, macro contexts ({$MAINTENANCE:service1}, {$MAINTENANCE:service2}, etc.) are supported as well. It is still not a complete solution for per item/trigger maintenance (no single time maintenances, no windows), but macros make things a lot easier to manage.
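
As a rough illustration of how a period such as "2-4,5:00-5:30" is read, here is a minimal Python sketch. It assumes day numbering from 1 (Monday) to 7 (Sunday), treats the end time as exclusive, and handles only a single period. `in_period` is a hypothetical helper for illustration, not part of Zabbix.

```python
from datetime import datetime


def in_period(period: str, when: datetime) -> bool:
    """Return True if `when` falls inside a period such as "2-4,5:00-5:30".

    Assumption: days are numbered 1 (Monday) to 7 (Sunday) and the end
    time is exclusive. Only a single "days,hh:mm-hh:mm" period is handled.
    """
    days, hours = period.split(",")
    first_day, _, last_day = days.partition("-")
    last_day = last_day or first_day  # a single day such as "3"
    start, end = hours.split("-")

    def minutes(hhmm: str) -> int:
        # Convert "h:mm" or "hh:mm" to minutes since midnight.
        hour, minute = hhmm.split(":")
        return int(hour) * 60 + int(minute)

    now = when.hour * 60 + when.minute
    day_ok = int(first_day) <= when.isoweekday() <= int(last_day)
    return day_ok and minutes(start) <= now < minutes(end)


# 2018-07-03 was a Tuesday, 2018-07-06 a Friday.
print(in_period("2-4,5:00-5:30", datetime(2018, 7, 3, 5, 15)))
print(in_period("2-4,5:00-5:30", datetime(2018, 7, 6, 5, 15)))
```

With the value stored in a user macro such as {$MAINTENANCE}, the same period string can be shared by many items, which is the manageability benefit described above.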

Comment by Braden Borg [ 2017 Oct 11 ]

This would be an awesome feature to have. We have many instances where we don't want all the items on a host to go into maintenance mode. Currently we would have to disable everything because of this lack of functionality.

Comment by Marcos Buzo [ 2017 Oct 19 ]

Totally agree this would be an awesome feature, especially when you are monitoring network devices with multiple connections. For example, if you have 2 carrier links connected to a single network device and carrier A announces a maintenance, ideally you would put only the trigger for carrier A in maintenance mode, not the entire host + all the links.

Comment by Max DiOrio [ 2017 Nov 20 ]

How has this been around for 7 years and still not even assigned to someone? This is such an important feature. We recently have implemented data builds in production and during the build CPU spikes over threshold. We're forced to either put the entire host in maintenance mode, which sucks since disk and memory are critical, or just ignore the alerts, which also sucks.

How do we get traction on such a simple thing?

Comment by Marc [ 2017 Nov 20 ]

mdiorio,
the fast lane is by (co-)sponsoring its development. Who knows, possibly there's not much left to co-sponsor in order to get it implemented.

Comment by Szymon Baranek [ 2018 Mar 12 ]

alexei - could you please let us know why this topic is not important for you (I mean the Zabbix team), or what difficulties stand in the way of processing this feature request successfully? I've reviewed the activity log and found no significant activity from your side. It seems that nothing has really happened for 8 years, and there is no explanation for that. I'd love to find out your approach to this matter.

This is a very important topic in many use cases.

Comment by dimir [ 2018 Mar 15 ]

You are right, I'll try to get some attention to this.

Comment by dimir [ 2018 Mar 22 ]

I couldn't get more information other than this task seems to be pretty complicated and huge and thus requires sponsorship. As far as I know currently there are no requests about becoming a sponsor to our sales team.

Comment by Dior Gardner [ 2018 Mar 22 ]

Thanks for checking Dimir. Did they mention the cost associated with the sponsorship?

Comment by Marc [ 2018 Mar 22 ]

dior.gardner,

sales at zabbix.com may tell you, if the estimate was done already.

Comment by Jessie Bryan [ 2018 Mar 22 ]

Back in June 2011 I received an estimate from Sergey Sorokin (Zabbix Director of Business Dev), and the estimate was not an unrealistic number. For us, we ended up using a different solution/NMS that we already had in service.

Comment by John Tammaro [ 2018 May 15 ]

I have been forwarded to this link from a request for enhancement I logged for the exact same requirement.

Just adding my thoughts that this would be very useful to have for repeat known triggers that alert.

For example, our backup solution runs on Saturdays and Sundays for all customer devices. We expect the CPU to run high, so we don't want the trigger alert. I do still want uptime, disks, interfaces etc., so a full maintenance window is not appropriate for weekend coverage.

I have also been advised of the dayofweek() function. This could work on a small scale, but on a large scale it's a bit messy.

Thanks All

 

JT

Comment by Andris Zeila [ 2018 Jun 29 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBXNEXT-413

Comment by Vladislavs Sokurenko [ 2018 Aug 03 ]

(36) [SD] StartTimers documentation must be updated to indicate that it does not process time-based trigger functions and that not only first timer process handles maintenances.

### Option: StartTimers
#	Number of pre-forked instances of timers.
#	Timers process time-based trigger functions and maintenance periods.
#	Only the first timer process handles the maintenance periods.

wiper: RESOLVED in r84057

vso CLOSED

Comment by Vladislavs Sokurenko [ 2018 Aug 03 ]

(39) [D] Even if the server has no maintenances created and no events suppressed, it will still read the whole problem table, functions and triggers on each startup.

vso this should be documented or possible

This should be changed

				/* force maintenance updates at server startup */
				if (0 == maintenance_time)
					update = SUCCEED;

To this:
Check whether the event_suppression count is 0 and the number of running maintenances is 0; if so, don't force the update (there is nothing to remove for removed maintenances, and nothing new can be added because there are no maintenances).

wiper: That would not be enough. We still need to check and update hosts.

vso yes, host updates can stay. Let's document for now so it is at least clear what is happening if someone has that problem.

martins-v Added to the upgrade notes. RESOLVED

wiper: CLOSED
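
The check proposed in (39) can be sketched roughly as follows. This is only an illustration of the idea in Python, not the server's C code: the function and parameter names are hypothetical, and, as wiper points out above, host maintenance status would still need to be checked and updated separately.

```python
def should_force_problem_update(maintenance_time: int,
                                suppressed_events: int,
                                running_maintenances: int) -> bool:
    """Decide whether the first pass after startup must rescan problems.

    Hypothetical model of the proposal: `maintenance_time` is 0 only on
    the first pass after server startup; the two counters stand for the
    event_suppression row count and the number of running maintenances.
    """
    if maintenance_time != 0:
        # Not the first pass after startup; the regular update
        # conditions are outside the scope of this sketch.
        return False
    # If nothing is suppressed and no maintenance is running, there is
    # nothing to remove and nothing to add, so the rescan can be skipped.
    return suppressed_events > 0 or running_maintenances > 0


print(should_force_problem_update(0, 0, 0))
print(should_force_problem_update(0, 5, 0))
```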

Comment by Andris Zeila [ 2018 Aug 24 ]

Released in:

  • pre-4.0.0beta1 r84067

Comment by Martins Valkovskis [ 2018 Aug 29 ]

(48) [D] Updated documentation:

RESOLVED

vmurzins CLOSED

Comment by Valdis Murzins [ 2018 Aug 29 ]

(49) [D] My comments on documentation update in (48):

Errors:
https://www.zabbix.com/documentation/4.0/manual/web_interface/frontend_sections/monitoring/dashboard/widgets

  • Trigger overview => missing new checkbox

https://www.zabbix.com/documentation/4.0/manual/introduction/whatsnew400#host_maintenance_on_trigger_level

  • Configuration option in dashboard widgets: => Should also include newly added option for Trigger overview.

To consider:
https://www.zabbix.com/documentation/4.0/manual/maintenance:

  • The Hosts and groups tab allows you to select the host groups, hosts and host trigger tags for maintenance. => [..] hosts and problem tags for maintenance.
  • If tags are specified, maintenance for the selected hosts will be limited to the triggers with the corresponding tags. => [..] to the problems with the corresponding tags.
  • Display => Also add a screenshot for the "eye" icon for suppressed problems.

https://www.zabbix.com/documentation/4.0/manual/web_interface/frontend_sections/monitoring/overview

  • Overview of triggers => screenshot is updated, but not explained, what the "Show suppressed problems" checkbox does.

https://www.zabbix.com/documentation/4.0/manual/config/visualisation/maps/map#creating_a_map

  • Shouldn't we have screenshots for the different highlighting on map elements? Currently I see it only in text for the "Icon highlighting" and "Mark elements on trigger status change" parameters. Also, it is not mentioned that the maintenance background has higher priority than the problem severity background.

martins-v These screenshots are in the map viewing section, where I've added the detail about maintenance background having higher priority. I added a link to it from the map configuration section.

https://www.zabbix.com/documentation/4.0/manual/web_interface/user_profile/global_notifications#configuration

  • Global messages displayed => Don't we want to have a screenshot of a notification with a suppressed problem?

vmurzins This icon was not implemented, so no changes here.

martins-v Thanks for the comments and corrections, I've tried to fix the issues. RESOLVED

agriscenko These pages should also be updated:

  • https://www.zabbix.com/documentation/4.0/manual/config/visualisation/maps/links
    • If multiple triggers go into a problem state, the one with the highest severity will determine the link style and color. If multiple triggers with the same severity are assigned to the same map link, the one with the lowest ID takes precedence.

      This is not true anymore. Link style and color are determined by trigger having problem with the highest severity, "Show suppressed problems" and "Minimum severity" settings in map configuration.
      Please also consider that there could be triggers with multiple problems (multiple problem generation). Each problem may have severity that differs from trigger severity (changed manually), may have different tags (due to macros) and may be suppressed.

  • https://www.zabbix.com/documentation/4.0/manual/xml_export_import/maps
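
The rule agriscenko describes for the maps/links page (highest-severity problem wins, subject to the "Show suppressed problems" and "Minimum severity" map settings) might be sketched like this. The data layout and the `pick_link_problem` name are assumptions for illustration only, and the source does not say how ties between equal severities are broken.

```python
from typing import Optional


def pick_link_problem(problems: list, show_suppressed: bool,
                      min_severity: int) -> Optional[dict]:
    """Return the problem that would determine the link style, or None.

    Each problem is a dict with hypothetical keys "trigger_id",
    "severity" (int, higher is worse) and "suppressed" (bool).
    """
    visible = [
        p for p in problems
        if p["severity"] >= min_severity
        and (show_suppressed or not p["suppressed"])
    ]
    # The highest remaining severity wins; tie-breaking is not specified
    # in the source, so max() simply keeps the first such problem.
    return max(visible, key=lambda p: p["severity"], default=None)


problems = [
    {"trigger_id": 1, "severity": 5, "suppressed": True},
    {"trigger_id": 2, "severity": 3, "suppressed": False},
    {"trigger_id": 3, "severity": 1, "suppressed": False},
]
print(pick_link_problem(problems, show_suppressed=False, min_severity=2))
```

Because one trigger can have several problems with different severities, tags and suppression states, the sketch works on problems rather than on triggers.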

martins-v Thanks for the detailed information, RESOLVED

vmurzins Great. Thank you. CLOSED

Comment by Andris Zeila [ 2018 Aug 29 ]

(50) [D] Maintenance tags description:

Like - similar string match
Equal - exact case-sensitive string match

I'm not sure how similar features are documented, but more precise would be to say that Like is case-sensitive substring match. As it stands now there is question if abcde should match abcdf - they are similar enough. And explicitly mentioning case sensitivity for 'Equal' might imply that 'Like' performs case insensitive match.

However if the like operator in other places is defined in the same way, then users are already familiar with it and I have no objections.

martins-v I agree that precision is better. I've updated the descriptions. RESOLVED

vmurzins

wiper: CLOSED
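
To make the distinction in (50) concrete, here is a small Python sketch of the two operators as Andris describes them: "Like" as a case-sensitive substring match and "Equal" as an exact case-sensitive match. The function name and the operator strings are illustrative assumptions, not Zabbix identifiers.

```python
def tag_value_matches(operator: str, pattern: str, value: str) -> bool:
    """Match a tag value against a maintenance tag pattern.

    "Equal" - exact case-sensitive string match.
    "Like"  - case-sensitive substring match.
    """
    if operator == "Equal":
        return value == pattern
    if operator == "Like":
        return pattern in value
    raise ValueError(f"unknown operator: {operator}")


# "abcde" and "abcdf" look similar, but neither contains the other.
print(tag_value_matches("Like", "abcde", "abcdf"))
# A substring matches, but only with the same letter case.
print(tag_value_matches("Like", "bcd", "abcde"))
print(tag_value_matches("Like", "BCD", "abcde"))
```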

Comment by Garry Shtern [ 2018 Sep 03 ]

This is not quite the same thing as what is being asked.  Granted, if one wants to suspend a specific trigger, he can reference a tag associated with that trigger.  However, this would require assigning specific tags to differentiate those, and would require multiple steps just to suspend notifications.  That is, assume I have a switch that has 60 interfaces, and I need to suspend just one of those. I would have to assign a tag to that particular one, and add it to maintenance schedule.  Then, once the maintenance is over, I would need to remove that tag or come up with some other way of dealing with them.

How difficult would it be to actually implement selection of triggers in the maintenance criteria, in addition to tags? Perhaps, as another option under host/host-group section?

Comment by Vitaly Zhuravlev [ 2018 Sep 04 ]

[email protected], you may add tags to all interfaces in advance in the template for network interfaces, for example using the tag IFNAME={#IFNAME} or IFALIAS={#IFALIAS}, macros that will identify a particular network interface. Then you may reference a specific interface by that tag in the maintenance rule.

Tags were chosen instead of referencing a specific trigger because each monitored entity (not trigger) you are about to maintain (network interface, database, disk array...) may well have more than one trigger defined, and you want to suppress all of them, not just one. We also wanted to avoid the need to synchronize maintenance rules each time triggers related to the interface, database or disk are added or deleted. We believe this is a more flexible approach.
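
For readers who script their maintenances, a tag-limited maintenance along these lines could be requested through the API roughly as sketched below. This is an assumption-laden example: the field names (`tags`, `tags_evaltype`, `operator` 0 for an exact match, `timeperiod_type` 0 for a one-time period) follow my reading of the Zabbix 4.0 `maintenance.create` documentation and should be checked against the API reference for your version. The host ID, interface name, timestamps and token are placeholders, and the request is only built and printed, not sent.

```python
import json


def build_maintenance_request(host_id: str, ifname: str, start: int,
                              duration: int, auth_token: str) -> dict:
    """Build a JSON-RPC body for a one-time, tag-limited maintenance.

    Assumed field names; verify against the maintenance.create reference.
    """
    return {
        "jsonrpc": "2.0",
        "method": "maintenance.create",
        "params": {
            "name": f"Interface {ifname} maintenance",
            "active_since": start,
            "active_till": start + duration,
            "hostids": [host_id],
            "timeperiods": [
                # One-time period covering the whole active window.
                {"timeperiod_type": 0, "start_date": start, "period": duration}
            ],
            "tags_evaltype": 0,
            # Only problems tagged IFNAME=<ifname> are suppressed.
            "tags": [{"tag": "IFNAME", "operator": 0, "value": ifname}],
        },
        "auth": auth_token,
        "id": 1,
    }


request = build_maintenance_request("10084", "Gi0/1", 1536044400, 3600, "<token>")
print(json.dumps(request, indent=2))
```

Since the tag value comes from the {#IFNAME} LLD macro in the template, nothing has to be added to or removed from the trigger before and after the maintenance.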

Comment by Nicolas Bataille [ 2019 Jul 12 ]

Sorry [Vitaly|https://support.zabbix.com/secure/ViewProfile.jspa?name=vzhuravlev], but I agree with [Garry|https://support.zabbix.com/secure/ViewProfile.jspa?name=garry.shtern%40xrtrading.com].

I totally understand the need for the tag, BUT the original question (and my need) is to flag maintenance on a trigger for a host.

Generated at Fri Apr 19 09:19:38 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.