[ZBXNEXT-1368] Create Windows service discovery Created: 2012 Aug 20 Updated: 2016 Mar 02 Resolved: 2015 Aug 28 |
|
Status: | Closed |
Project: | ZABBIX FEATURE REQUESTS |
Component/s: | Agent (G), Frontend (F), Server (S) |
Affects Version/s: | None |
Fix Version/s: | 2.5.0, 3.0.0alpha2 |
Type: | New Feature Request | Priority: | Trivial |
Reporter: | Raymond Kuiper | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 48 |
Labels: | lld, patch, services, windows | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Windows hosts |
Attachments: | ZBXNEXT-1368.patch | ||||
Issue Links: |
|
Description |
It would be handy if the LLD functionality could be extended to perform a discovery of all the services that are known on a Windows host to allow for automatic monitoring of these services. It would be even better if we could choose to enable only monitoring of services that are started at boot time. Specification: https://www.zabbix.org/wiki/Docs/specs/ZBXNEXT-1368 |
Comments |
Comment by Grzegorz Grabowski [ 2012 Aug 22 ] |
Win-agent "services[<type>,<state>,<exclude>]" not enough? Examples: http://www.zabbix.com/documentation/2.0/manual/config/items/itemtypes/zabbix_agent/win_keys Bests, |
Comment by richlv [ 2012 Aug 23 ] |
those can't be directly fed into lld rule |
Comment by Raymond Kuiper [ 2012 Aug 23 ] |
The problem with the solution that Grzegorz mentioned is that you can only compare a list of "running services" with a list of "automatically started services" and trigger on a difference between the two lists. There is no way to name the exact service that is not running properly (please correct me if I'm wrong). What I would like LLD to do is get the list of (automatically started) services and use prototypes to define items and triggers for each service's state, so we can have trigger descriptions like "Service XYZ is not running on {HOSTNAME}". |
Comment by Alessandro De Maria [ 2012 Sep 06 ] |
+1 from me. Has anybody come up with a clever Windows script for this in the interim? |
Comment by Raymond Kuiper [ 2012 Dec 14 ] |
I have now https://github.com/q1x/zabbix-templates/tree/master/service-discovery @Devs, please have a look here for the way it should function IMHO. |
Comment by Raymond Kuiper [ 2012 Dec 17 ] |
@Alessandro: I notice that you have not voted on the issue yet. |
Comment by Raymond Kuiper [ 2013 Jan 02 ] |
See this Zabbix blogpost for more info on my solution: |
Comment by Elvar [ 2013 Jan 03 ] |
|
Comment by adriano [ 2013 Jun 22 ] |
I'm in, service discovery would be really good stuff. Having to log in to a server to find out which services it runs takes a lot of time when you have customers with about 50 servers. Monitoring servers needs planning, so we get a checklist of the main services, but we do not always have the exact service names needed for monitoring, and finding them always takes a lot of time. |
Comment by Marco Janse [ 2013 Jul 23 ] |
I'm in too. Just voted for this feature request. In the meantime, I will give Raymond's script a try. Thanks for posting! |
Comment by Keith Braithwaite [ 2013 Oct 02 ] |
Another use for this feature is a case where I want to monitor all Exchange Server services on a given server. Different Exchange servers will run different services based on the implementation of that server. With this type of discovery rule I could simply set up a template to find all services that start with "MSExchange" and monitor them, rather than setting up all the services in the template and "disabling" the ones I don't need on a particular server. |
Comment by Andreas Jud [ 2013 Oct 31 ] |
+1 We are running a database server (MS SQL) with multiple instances. Each instance has its own service, so we must create an item and a trigger for each service. Not very sexy.. Andy |
Comment by Ryan Armstrong [ 2015 Feb 16 ] |
Here is a patch for native service discovery in Windows on agent v2.2.8: It's been working in production for us for a few months on several hundred servers. In hindsight, this key shouldn't be used too loosely. We found a lot of Windows services were configured to start 'Automatically' but routinely stop running without issue. I think this would be more useful for application-specific needs such as SQL or Exchange monitoring (using regex discovery filters). |
Comment by Marc [ 2015 Feb 16 ] |
Just attached ryan.armstrong's patch to open the way for assigning the 'patch' label |
Comment by Ryan Armstrong [ 2015 Feb 16 ] |
Thanks Marc |
Comment by Raymond Kuiper [ 2015 May 18 ] |
I've also noticed Windows has auto start services that exit after a while. Using the regex filters one could filter out those services. The multiple LLD filter feature as developed in Currently I'm not using Windows in the field anymore, but perhaps someone else can attach a list of default Windows services that exit automatically? |
Comment by Raymond Kuiper [ 2015 May 18 ] |
Not sure if the patch works around this, but I've received a bug report on my Powershell script that it fails on the following "Path to execuable": "C:/ProgramFiles/OmniBack/bin/rds.exe" ob2_40 Apparently, some options can be supplied after the executable name... |
Comment by xanadu dm [ 2015 May 19 ] |
A list of services that exit automatically: Service Portable Device Enumerator Service Thanks Raymond for your efforts! |
Comment by xanadu dm [ 2015 May 21 ] |
I modified the script from Raymond to also include the services which are listed as "Manual". This is sufficient for me to use. |
Comment by Igors Homjakovs (Inactive) [ 2015 May 22 ] |
Available in svn://svn.zabbix.com/branches/dev/ZBXNEXT-1368 |
Comment by Aleksandrs Saveljevs [ 2015 May 28 ] |
(1) Item "service.info[...,startup]" is currently made to return a number, which is one of the "dwStartType" constants mentioned at https://msdn.microsoft.com/en-us/library/windows/desktop/ms684950(v=vs.85).aspx . I wonder whether this is good, because "service.discovery" transforms these numbers into human-readable strings like "manual", "disabled", and "automatic" for "{#SERVICE.STARTUP}" macro. asaveljevs Similarly, in "service.info[...,state]" we map "dwCurrentState" to our own constants (0, ..., 7, 255), but in "service.discovery" we just return "dwCurrentState" as is in "{#SERVICE.STATE}". igorsh RESOLVED in r53859:r53906 asaveljevs CLOSED |
Comment by Aleksandrs Saveljevs [ 2015 May 28 ] |
(2) In SERVICE_DISCOVERY() function, we call EnumServicesStatusEx() with SERVICE_WIN32. According to https://msdn.microsoft.com/en-us/library/windows/desktop/ms682640%28v=vs.85%29.aspx , this means that services of type SERVICE_FILE_SYSTEM_DRIVER and SERVICE_KERNEL_DRIVER will never be returned. Therefore, "{#SERVICE.TYPE}" macro does not make a lot of sense currently: it will always be "service". asaveljevs Similarly, according to https://msdn.microsoft.com/en-us/library/windows/desktop/ms684950(v=vs.85).aspx , services of types SERVICE_WIN32 will never have "dwStartType" of SERVICE_BOOT_START and SERVICE_SYSTEM_START, because these constants are only valid for driver services. igorsh RESOLVED in r53859:r53906 asaveljevs I have just discussed this with a seasoned Windows admin (zalex_ua) and he expressed a doubt that driver services are worth including. Here is an example: { "{#SERVICE.DISPLAYNAME}": "adpu160m", "{#SERVICE.NAME}": "adpu160m", "{#SERVICE.PATH}": "\\SystemRoot\\system32\\drivers\\adpu160m.sys", "{#SERVICE.STARTUPNAME}": "disabled", "{#SERVICE.STARTUP}": 5, "{#SERVICE.STATENAME}": "stopped", "{#SERVICE.STATE}": 6, "{#SERVICE.TYPENAME}": "kernel driver", "{#SERVICE.TYPE}": 0, "{#SERVICE.USER}": "" }, asaveljevs Note that "services[]" does not return driver services. Let's discuss it a bit more. REOPENED. igorsh After discussion it was decided not to include drivers. Only very few users might be interested in that functionality. Additionally, service type field will be also taken out. Information about what type of service- own, shared or interactive is also hardly useful. RESOLVED in r53959.
asaveljevs We might also wish to change "auto" to "automatic", because "automatic" seems to be what people usually see in MMC. REOPENED igorsh RESOLVED in r53993. asaveljevs CLOSED |
Comment by Aleksandrs Saveljevs [ 2015 May 28 ] |
(3) In SERVICE_DISCOVERY() function, if either of the calls to QueryServiceConfig() fails, we will return an incomplete discovery record. igorsh RESOLVED in r53859:r53906 asaveljevs CLOSED asaveljevs After adding QueryServiceConfig2() call, the same problem reappeared. REOPENED. igorsh RESOLVED in r53994. asaveljevs CLOSED |
Comment by Aleksandrs Saveljevs [ 2015 May 28 ] |
(4) Please review changes in r53821 through r53838. They fix a couple of bugs like unnecessary calls to zbx_free(). igorsh Thank you. CLOSED. |
Comment by Aleksandrs Saveljevs [ 2015 Jun 04 ] |
(5) Here is the documentation for QUERY_SERVICE_CONFIG structure: https://msdn.microsoft.com/en-us/library/windows/desktop/ms684950(v=vs.85).aspx . It says that "dwServiceType" can be one of the following values: SERVICE_FILE_SYSTEM_DRIVER (0x00000002), SERVICE_KERNEL_DRIVER (0x00000001), SERVICE_WIN32_OWN_PROCESS (0x00000010), SERVICE_WIN32_SHARE_PROCESS (0x00000020). Additionally, if it is SERVICE_WIN32_OWN_PROCESS or SERVICE_WIN32_SHARE_PROCESS, SERVICE_INTERACTIVE_PROCESS (0x00000100) can be specified, too. Based on the above, my conjecture is that SERVICE_INTERACTIVE_PROCESS is an additional bit in the bitmask, rather than an independent value. Therefore, the current implementation of get_type_string() is wrong: static const char *get_type_string(DWORD type) { switch (type) { case SERVICE_KERNEL_DRIVER: return "kernel driver"; case SERVICE_FILE_SYSTEM_DRIVER: return "file system driver"; case SERVICE_WIN32_SHARE_PROCESS: return "win32 share process"; case SERVICE_WIN32_OWN_PROCESS: return "win32 own process"; case SERVICE_INTERACTIVE_PROCESS: return "interactive process"; default: return "unknown"; } } It might also be nice to arrange the constants in some order (currently, they are arranged neither alphabetically nor numerically). Perhaps, numerical order would be more consistent with the rest. asaveljevs Note that both "service.discovery" and "service.info[]" are affected by SERVICE_INTERACTIVE_PROCESS being an additional bit in a bitmask. asaveljevs Added value mapping is affected, too. igorsh RESOLVED in r53959. asaveljevs The resolution seems to be that we no longer distinguish between service types - everything is just a service. Do we still need value mappings then? REOPENED. igorsh RESOLVED in r53995 and r53996. asaveljevs CLOSED |
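The conjecture in (5) can be illustrated with a minimal sketch. It assumes, as the comment does, that SERVICE_INTERACTIVE_PROCESS is a flag bit combined with the base type. The constants are local copies of the values quoted above so that the fragment builds without windows.h; the `SKETCH_` names and `sketch_type_string()` are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the dwServiceType values quoted in (5); */
/* real agent code would take them from <windows.h>.       */
#define SKETCH_KERNEL_DRIVER        0x00000001u
#define SKETCH_FILE_SYSTEM_DRIVER   0x00000002u
#define SKETCH_WIN32_OWN_PROCESS    0x00000010u
#define SKETCH_WIN32_SHARE_PROCESS  0x00000020u
#define SKETCH_INTERACTIVE_PROCESS  0x00000100u

/* Hypothetical variant of get_type_string(): the interactive bit is */
/* reported separately and stripped before the base type is matched. */
static const char *sketch_type_string(uint32_t type, int *interactive)
{
    assert(NULL != interactive);

    *interactive = (0 != (type & SKETCH_INTERACTIVE_PROCESS));

    /* cases are listed in numerical order, as suggested in (5) */
    switch (type & ~SKETCH_INTERACTIVE_PROCESS)
    {
        case SKETCH_KERNEL_DRIVER:
            return "kernel driver";
        case SKETCH_FILE_SYSTEM_DRIVER:
            return "file system driver";
        case SKETCH_WIN32_OWN_PROCESS:
            return "win32 own process";
        case SKETCH_WIN32_SHARE_PROCESS:
            return "win32 share process";
        default:
            return "unknown";
    }
}
```

With this shape, a value such as 0x110 resolves to "win32 own process" with the interactive flag set, where a plain switch on the raw value would fall through to "unknown". The final resolution in this ticket was to drop the service type altogether, so this only illustrates the bug that was reported.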
Comment by Aleksandrs Saveljevs [ 2015 Jun 04 ] |
(6) QueryServiceConfig2() supports querying service description using SERVICE_CONFIG_DESCRIPTION (see https://msdn.microsoft.com/en-us/library/windows/desktop/ms684935(v=vs.85).aspx). Do we wish to include it, too? igorsh RESOLVED in r53962 and r53978. asaveljevs In SERVICE_INFO(), if we fail to get service description, we return string "empty". If we fail to get some other parameter (e.g., display name), we make the item not supported. Why do we make this distinction? If we fail to get service description, the item should be made not supported, too. asaveljevs Also, note that according to https://msdn.microsoft.com/en-us/library/windows/desktop/ms685156(v=vs.85).aspx , the "lpDescription" field of NULL indicates no description. We should probably return an empty string then. asaveljevs Also, note that SET_TEXT_RESULT() should be used for multiline values. asaveljevs Finally, note that SET_STR_RESULT(result, "empty") will crash at runtime, because the result string has to be dynamically allocated. REOPENED. igorsh RESOLVED in r53999. asaveljevs CLOSED |
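The last two remarks in (6) can be sketched in portable C. The sketch assumes that the result string is later released with free(), which is why a string literal such as "empty" must not be stored directly. `sketch_result_t`, `sketch_set_description()` and `sketch_free_result()` are hypothetical stand-ins, not the agent's real result type or macros.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the agent result; the owner frees 'str' later. */
typedef struct
{
    char *str;
}
sketch_result_t;

/* Store a heap copy of 'description'. A NULL description (the documented */
/* "no description" case for lpDescription) becomes an empty string.      */
static int sketch_set_description(sketch_result_t *result, const char *description)
{
    const char *src = (NULL != description ? description : "");
    size_t     len = strlen(src) + 1;

    if (NULL == (result->str = (char *)malloc(len)))
        return -1;

    memcpy(result->str, src, len);

    return 0;
}

/* Release the stored string; this is only valid for malloc()ed memory, */
/* which is why assigning a literal to 'str' would be an error.         */
static void sketch_free_result(sketch_result_t *result)
{
    free(result->str);
    result->str = NULL;
}
```

Copying unconditionally keeps ownership simple: every value stored in the result can be freed the same way, whether it came from the service configuration or from the empty-string fallback.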
Comment by Aleksandrs Saveljevs [ 2015 Jun 04 ] |
(7) Flag SERVICE_QUERY_STATUS was removed from the call to OpenService(). Now item "services[]" does not return anything, because SERVICE_QUERY_STATUS permission is required for QueryServiceStatus() function. asaveljevs Alternatively, it seems we can keep the reduced permissions, but use "ssp[i].ServiceStatusProcess.dwCurrentState", as returned by EnumServicesStatusEx(), same as in SERVICE_DISCOVERY(). igorsh RESOLVED in r53960. asaveljevs The fix was to bring SERVICE_QUERY_STATUS permission back. CLOSED. |
Comment by Aleksandrs Saveljevs [ 2015 Jun 04 ] |
(8) In loops that map between system values and Zabbix values, consider using ARRSIZE() macro for confidence: - for (k = 0; k < 5 && service_type != service_types[k]; k++) + for (k = 0; k < ARRSIZE(service_types) && service_type != service_types[k]; k++) asaveljevs It might also be useful to consider reducing indentation in SERVICE_DISCOVERY(). igorsh RESOLVED in r53961. asaveljevs CLOSED |
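The loop form proposed in (8) can be shown on the state mapping discussed in this ticket. This is a sketch under assumptions: `ARRSIZE()` is defined locally in the usual sizeof form, the table holds local copies of the documented dwCurrentState numbers so that no Windows header is needed, and `sketch_map_state()` is a hypothetical name.

```c
#include <stddef.h>

/* Assumed definition; the agent has its own ARRSIZE() macro. */
#define ARRSIZE(a)  (sizeof(a) / sizeof(*(a)))

/* Index = Zabbix state value, element = Windows dwCurrentState value. */
static const unsigned long sketch_states[] =
{
    4,  /* 0 - running          (SERVICE_RUNNING) */
    7,  /* 1 - paused           (SERVICE_PAUSED) */
    2,  /* 2 - start pending    (SERVICE_START_PENDING) */
    6,  /* 3 - pause pending    (SERVICE_PAUSE_PENDING) */
    5,  /* 4 - continue pending (SERVICE_CONTINUE_PENDING) */
    3,  /* 5 - stop pending     (SERVICE_STOP_PENDING) */
    1   /* 6 - stopped          (SERVICE_STOPPED) */
};

/* Returns the Zabbix state value; 7 (unknown) if nothing matched. */
static int sketch_map_state(unsigned long state)
{
    size_t k;

    /* the bound follows the table size, so adding an entry cannot */
    /* leave a stale literal such as "k < 5" behind                */
    for (k = 0; k < ARRSIZE(sketch_states) && state != sketch_states[k]; k++)
        ;

    return (int)k;
}
```

Because the "unknown" value equals the table length, the loop's exit index doubles as the fallback and no separate check is needed.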
Comment by Aleksandrs Saveljevs [ 2015 Jun 04 ] |
(9) In SERVICE_DISCOVERY(), consider handling QueryServiceConfig() errors in the same way as in SERVICE_INFO() by setting an error message and making the item not supported. igorsh RESOLVED in r53978. asaveljevs The resolution is that the error message is logged at DebugLevel=4, but discovery continues with other services. CLOSED. |
Comment by Aleksandrs Saveljevs [ 2015 Jun 04 ] |
(10) Please review r53941. Changes are mostly stylistic. igorsh Thank you. Looks good. asaveljevs Please review r54032. Mostly style as well. RESOLVED. igorsh CLOSED. |
Comment by Igors Homjakovs (Inactive) [ 2015 Jun 09 ] |
(11) Item service.info[] should be added to item key selection window in the frontend. igorsh RESOLVED in r54092. asaveljevs Three issues:
asaveljevs First two issues RESOLVED in r54094. Third issue added as (15) below. igorsh Thank you. Looks good. CLOSED. |
Comment by Aleksandrs Saveljevs [ 2015 Jun 11 ] |
(12) Currently, the following pages state that an EXE file can be specified as the first argument to service_state[]:
Testing shows that this statement is dubious - items just become not supported. Further, documentation for OpenService() function states that forward and backward slashes are invalid characters in the service name. Hence, a path cannot be specified. It is therefore proposed to remove the EXE part from service_state[] documentation. igorsh RESOLVED. asaveljevs The note about service_state[] being deprecated since 3.0 should be removed from pre-3.0 documentation. (Already done by martins-v.) asaveljevs "Windows Services Control Manager" should probably be renamed to something like "MMC Services snap-in". asaveljevs There are also two problems specific to the 3.0 page:
asaveljevs REOPENED. igorsh RESOLVED. asaveljevs It still does not describe the return values for service.info[...,state] and service.info[...,startup]. REOPENED. igorsh RESOLVED. asaveljevs CLOSED |
Comment by Igors Homjakovs (Inactive) [ 2015 Jun 12 ] |
(13) Updated documentation:
asaveljevs It might be useful to move https://www.zabbix.com/documentation/3.0/manual/introduction/whatsnew300#windows_service_discovery to "5.9 Item changes/improvements" section, similar to db.odbc.discovery. asaveljevs That page also does not mention anything about the new service.info[] item. asaveljevs https://www.zabbix.com/documentation/3.0/manual/discovery/low_level_discovery still counts the number of discoveries available out of the box as 5. asaveljevs https://www.zabbix.com/documentation/3.0/manual/discovery/low_level_discovery#discovery_of_windows_services mentions macro names without { and }. asaveljevs I am not sure that https://www.zabbix.com/documentation/3.0/manual/appendix/macros/supported_by_location#macros_used_in_low-level_discovery benefits from a list of Windows discovery macros. That section is probably meant to specify where LLD macros can be used, rather than be a definitive list of LLD macros (for instance, it does not have CPU or ODBC discovery macros). REOPENED. igorsh RESOLVED asaveljevs The following suggestions were ignored:
asaveljevs Meanwhile, "What's new in Zabbix 3.0.0" page does not seem to be the proper place to document macros returned by Windows service LLD. I have linked to the LLD page instead, similar to ODBC LLD. Please review. asaveljevs REOPENED igorsh RESOLVED asaveljevs CLOSED richlv looks like we have completely removed service_state description - that's no good as users might see some older templates and think that item won't work. we should return the description + mention that it's deprecated and suggest the new solution. then we can remove the deprecation note from the new item. sasha the description of the deprecated items we can see in older versions of the documentation. CLOSED |
Comment by Aleksandrs Saveljevs [ 2015 Jun 17 ] |
(14) There is an inconsistency between the old "service_state[]" item and "service.info[...,state]". If the specified service does not exist, the old item returns 255. The new item, however, becomes not supported. It should be discussed whether this is a problem. asaveljevs After discussion, 255 should either be removed from the service state value mapping or not. asaveljevs Discussion showed that people use 255 with "service_state[]" to check that the service does not exist. Therefore, "service.info[...,state]" should return 255 for non-existing services, too, as an exception. For other values of the second parameter it should still become not supported. igorsh RESOLVED in r54104. asaveljevs CLOSED |
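The exception agreed in (14) could look roughly like the sketch below. It only covers the "service does not exist" branch; `sketch_missing_service()` and the `SKETCH_` return codes are hypothetical names, and the parameter is compared as a plain string for illustration.

```c
#include <string.h>

#define SKETCH_OK            0
#define SKETCH_NOTSUPPORTED  1

/* Decide what to return when the requested service does not exist. */
/* Only "state" yields a value (255, as the old service_state[] item */
/* did); every other parameter makes the item not supported.         */
static int sketch_missing_service(const char *param, int *value)
{
    if (0 == strcmp(param, "state"))
    {
        *value = 255;   /* 255 - no such service */
        return SKETCH_OK;
    }

    return SKETCH_NOTSUPPORTED;
}
```

Keeping 255 for "state" preserves existing triggers that test for a missing service, while the other parameters keep the stricter not-supported behaviour.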
Comment by Aleksandrs Saveljevs [ 2015 Jun 17 ] |
(15) The following translation string was removed: State of a service. Returns 0 - running; 1 - paused; 2 - start pending; 3 - pause pending; 4 - continue pending; 5 - stop pending; 6 - stopped; 7 - unknown; 255 - no such service The following translation string was added: Information about a service. Returns integer with param as state, startup; string with param as displayname, path, user; text with param as description igorsh CLOSED. |
Comment by Igors Homjakovs (Inactive) [ 2015 Jun 18 ] |
Available in 2.5.0 (trunk) r54111. |
Comment by richlv [ 2015 Aug 27 ] |
(16) item helper text does not include whole "return value" column string from the manual; asaveljevs Fixed in pre-2.5.1 (trunk) directly in r55197. Translation string removed: Information about a service. Returns integer with param as state, startup; string with param as displayname, path, user; text with param as description Translation string added: Information about a service. Returns integer with param as state, startup; string - with param as displayname, path, user; text - with param as description; Specifically for state: 0 - running, 1 - paused, 2 - start pending, 3 - pause pending, 4 - continue pending, 5 - stop pending, 6 - stopped, 7 - unknown, 255 - no such service; Specifically for startup: 0 - automatic, 1 - automatic delayed, 2 - manual, 3 - disabled, 4 - unknown RESOLVED. <richlv> looks just perfect to me, thank you |
Comment by Oleksii Zagorskyi [ 2016 Mar 01 ] |
|
Comment by Ryan Armstrong [ 2016 Mar 01 ] |
Did my patch get used for this? Using Github and pull requests would be a great way for people to contribute and get recognition. |
Comment by Aleksandrs Saveljevs [ 2016 Mar 01 ] |
Yes, there is the following ChangeLog entry: ..FGI..... [ZBXNEXT-1368] added Windows service discovery and service.info[] item; thanks to Ryan Armstrong for patch |
Comment by Ryan Armstrong [ 2016 Mar 02 ] |
Oh cool! Thanks very much. |