[#ZBXNEXT-101] Debugging failed web monitoring steps

[ZBXNEXT-101] Debugging failed web monitoring steps Created: 2009 Oct 09 Updated: 2016 May 29 Resolved: 2014 Aug 26
Status:	Closed
Project:	ZABBIX FEATURE REQUESTS
Component/s:	Frontend (F), Proxy (P), Server (S)
Affects Version/s:	None
Fix Version/s:	2.3.4

Type:

Change Request

Priority:

Major

Reporter:

Christoph Haas

Assignee:

Unassigned

Resolution:

Fixed

Votes:

Labels:

debugging, webmonitoring

Remaining Estimate:

Not Specified

Time Spent:

Not Specified

Original Estimate:

Not Specified

Environment:

Debian. Zabbix 1.6.6

Attachments:

web_scenario_log_headers_request.patch

web_scenario_log_page.patch

Issue Links:

Duplicate
is duplicated by	~~ZBXNEXT-777~~	Web monitoring should have some debug...	Closed

Description

Currently when a Web Monitoring step fails it doesn't tell why. It would be great if there were some way of showing what happened instead. Like showing the first kilobyte of the HTTP body.

I increased the debug level on the server but even then had a hard time finding out why a certain step failed.

Comments

Comment by Hannu Ylitalo [ 2010 Jan 21 ]

Also would be nice to have item for each step and have zabbix go through all steps even if one step fails and have some macro for actions/triggers that tells which step failed.

Comment by greg long [ 2010 Jul 01 ]

I have created a patch that dumps web output to the zabbix log on a failed web step. Here is the code, feel free to integrate it. It works with 1.4 thru 1.8

modify /src/zabbix_server/httppoller/httptest.c

around line 460

replace

zabbix_log(LOG_LEVEL_DEBUG, "Page didn't match [%s]", httpstep.required);
with

zabbix_log(LOG_LEVEL_WARNING, "For URL [%s], page didn't match [%s]. Output follows:", httpstep.url, httpstep.required);
zabbix_log(LOG_LEVEL_WARNING, "%s", page.data);

Comment by Arli [ 2012 May 03 ]

And in more recent versions - in the same file around line 407 modify block to look like this:

if ('\0' != *httpstep.required &&
NULL == zbx_regexp_match(page.data, httpstep.required, NULL))

{ zabbix_log(LOG_LEVEL_DEBUG, "%s() required pattern \"%s\" not found on %s. Output follows:", __function_name, httpstep.required, httpstep.url); zabbix_log(LOG_LEVEL_DEBUG, "%s", page.data); err_str = zbx_strdup(err_str, "Required pattern not found"); }

This workaround helped me a lot while debugging web monitoring scenarios protected by different SSO technologies.

Comment by Bart Verwilst [ 2012 Oct 25 ]

Had a step fail for a few minutes, then it recovered. I have no idea why..

Comment by richlv [ 2013 Oct 07 ]

two debugging patches against trunk r38935, thanks to wiper.

web_scenario_log_page.patch - logs received page data at debuglevel 4
web_scenario_log_headers_request.patch - additionally logs received headers and requests (made by server or proxy) at debuglevel 4

Comment by Igors Homjakovs (Inactive) [ 2014 May 20 ]

Fixed in svn://svn.zabbix.com/branches/dev/ZBXNEXT-101

Comment by Bart Verwilst [ 2014 May 20 ]

Could you enlighten us about how this was fix and what kind of goodness we can expect in the next release?

Thanks!

Comment by Bart Verwilst [ 2014 May 20 ]

If the patches above were used, then never mind..

Comment by Igors Homjakovs (Inactive) [ 2014 May 21 ]

Bart,

This fix is still under testing and other modification may still be added. You will be enlightened in a short while

Comment by Alexander Vladishev [ 2014 May 21 ]

Related issues: ~~ZBXNEXT-136~~, ~~ZBXNEXT-416~~, ~~ZBXNEXT-2280~~

Comment by richlv [ 2014 May 22 ]

hmm, based on the attached patches and earlier discussions, once debug is enabled, sent and received data should be logged no matter whether it failed or not... maybe it's worth discussing and describing implementation details first

Comment by Igors Homjakovs (Inactive) [ 2014 Jul 02 ]

Additional modification in r46970 and r46998.

Up to now the following things have been implemented:

Ability to change log level at runtime using runtime control options (-Rlog_level_increase and -Rlog_level_decrease)

Now it is possible to change the log level for all processes, one process(e.g. trapper #1), all processes of the same kind (e.g. all trapper processes) or a process with a certain PID.

Additional log level (5) has been added for extended debugging of failed web monitoring steps

Due to the fact that now it is possible to send signals to any process additional care should be taken with signal handling. The main problem is that in some cases undefined behaviour is possible after signal arrival. One way to fix it is to set a flag when signal arrives and then check this flag in the beginning of each process loop. Another option is to block a signal while process is in the middle of the loop and then unblock and execute signal handler at the end of the loop. Second important thing is that logging has to be avoided (or improved) in the signal handler because non-signal-safe functions are currently used there.

Comment by richlv [ 2014 Jul 03 ]

thanks for the update. detailed spec will still be needed
for now, just some questions/remarks.

"Ability to change log level at runtime using runtime control options (-Rlog_level_increase and -Rlog_level_decrease)" - that is bloody cool
"...log level for all processes, one process(e.g. trapper #1), all processes of the same kind (e.g. all trapper processes) or a process with a certain PID" - that, too
a way to set specific loglevel (like 3) would be needed, too
a way to print current loglevel for all processes would be great
what are the messages we log when loglevel is changed ?
what do we do if incorrect loglevel is supplied ?
if trapper 1 is at loglevel 5, trapper 2 at 4 and we send log_level_increase for all trappers - what happens ?
greg and others noted a need to only dump contents upon step failure - do we plan to implement that ?

overall i like this development a lot

Comment by Igors Homjakovs (Inactive) [ 2014 Jul 04 ]

Some comments regarding the above questions:

a way to set specific loglevel (like 3) would be needed, too
a way to print current loglevel for all processes would be great
At the moment we don’t plan to implement this.
<richlv> for the record, this might be slightly problematic when doing a longer debugging session and switching debuglevels up/down - being sure that it's all set to a specific level would be nearly impossible
wiper there are limitations of the data we can easily send. We already have packed a lot of data in single integer (4 bytes). Adding there exact log level would be problematic.

what are the messages we log when loglevel is changed ?
The short message will say which log level is set for a process. If loglevel changes for multiple processes then the message will be shown for each process.
<richlv> exact messages must be listed in the spec
wiper when changing log level of process type the main process writes the following debug messages depending if the log level was changed for process type or for a specific pid:
```
the signal was redirected to "listener" process pid:11042
the signal was redirected to process pid:11042
```
The receiving process logs the changes made to log level, for example:
```
log level has been increased to "5 - trace"
cannot increase log level: maximum level has been already set
log level has been decreased to "1 - critical"
cannot decrease log level: minimum level has been already set
```

what do we do if incorrect loglevel is supplied ?
Incorrect loglevel cannot be supplied because we only can increase or decrease the loglevel. The message will be printed if we exceed the max/min loglevel, and in this case the loglevel will not be changed.
<richlv> exact messages must be listed in the spec

if trapper 1 is at loglevel 5, trapper 2 at 4 and we send log_level_increase for all trappers - what happens ?
The loglevel will be increased by 1 for trapper 2 and 4. The trapper 1 will say with loglevel 5 and the message that the max loglevel has been previously set will be outputted to the log file.
<richlv> i guess trapper #2 will get loglevel increased but we'll get a message about trapper #1

greg and others noted a need to only dump contents upon step failure - do we plan to implement that ?
Yes, this has been already implemented and waits for the testing.
<richlv> this is one of those strange cases where development happens without communicating intent... i even mentioned earlier that multiple patches print out received data always.
i'm still not 100% sure what has been implemented. i'd suggest to print out contents after failed steps only at dl4 and after all steps at dl5

Comment by Andris Zeila [ 2014 Jul 14 ]

(1) In current design adding new process type will change the protocol - the ZBX_PROCESS_TYPE_ALL and ZBX_PROCESS_TYPE_PID values will be different. While old server should not send the signals to new server (and vice versa), it still would be nice to keep the same protocol for the next Zabbix versions too.

Also it would be better to separate signal contents from internal constants used for command line parsing (ZBX_TASK_* defines).

See the specification draft at https://www.zabbix.org/wiki/Docs/specs/ZBXNEXT-101

igorsh RESOLVED in r47333.

wiper CLOSED

Comment by Andris Zeila [ 2014 Jul 14 ]

(2) The ZBX_TASK_START_HTTP_DEBUG, ZBX_TASK_STOP_HTTP_DEBUG defines are not used anywhere.

igorsh RESOLVED in r47295.

wiper CLOSED

Comment by Andris Zeila [ 2014 Jul 14 ]

(3) The set_log_level_task() function directly uses the zbx_optarg external variable. It would be better to pass the
reference to the optional arguments ('=') in the zbx_optarg - the argument parsing would be less complex. Also instead of using sscanf() functions use is_ushort() defines to extract pids and process numbers from strings.

igorsh RESOLVED in r47333.

wiper CLOSED. Please check slight changes in log level control option parsing error messages in r48017

Comment by Andris Zeila [ 2014 Jul 14 ]

(4) Instead of having the process type->name mapping defined in two places (compare_process_type_string() and get_process_type_string() functions) the compare_process_type_string() function should iterate through process types and use get_process_type_string() to test if the specified string matches process type name. Also it should be renamed to get_process_type_by_name() or similar.

igorsh RESOLVED in r47333.

wiper CLOSED

Comment by Andris Zeila [ 2014 Jul 14 ]

(5) Use zbx_error() instead of printf() to print command argument parsing error messages.

igorsh RESOLVED in r47334.

wiper CLOSED

Comment by Andris Zeila [ 2014 Jul 14 ]

(6) The inline help states that the log level options can have optional arguments '=process, N'. It should be '=process,N' without spaces (however process name itself can contain spaces).

igorsh RESOLVED in r47333.

wiper
Currently the help (-h) message is as follows:

Runtime control options:
  config_cache_reload             Reload configuration cache
  log_level_increase <option>     Increase log level
  log_level_decrease <option>     Decrease log level

Log level control options:
  No option                       All processes
  =pid                            Pid number of a process
  =process                        All specified processes (e.g. all poller processes)
  =process,N                      Process name and number (e.g. poller #1, http poller #3)

It could be improved, for example to something like this:

Runtime control options:
  config_cache_reload             Reload configuration cache
  log_level_increase[=<target>]   Increase log level, affect all processes if target is not specified
  log_level_decrease[=<target>]   Decrease log level, affect all processes if target is not specified

Log level control targets:
  <pid>                           Process identifier
  <process type>                  All processes of specified type (e.g. poller)
  <process type>,N                Process type and number (e.g. poller,3)

Still need to reach consensus about the format.

REOPENED

asaveljevs andris changed --help format recently under ~~ZBX-8391~~ and you might wish to do similarly.

wiper yes, we are waiting on his input

andris ~~ZBX-8391~~ is dealing a lot with man-pages and help-messages, but currently is waiting for review. After it is reviewed and merged, you can start describing the new additional runtime control options in man-pages and help messages. Log level control targets proposed by wiper look better to me. Valid process names to choose from should be documented, examples should be added in man-pages, rendering of updated man-pages should be tested on AIX, HP-UX, GNU/Linux, OpenBSD, Solaris etc., finally the updated man-pages should be copied into online documentation.

igorsh RESOLVED in r48351 and r48354.

asaveljevs Help for zabbix_agentd currently says:

Log level control targets:
  <pid>                           Process identifier
  <process type>                  All processes of specified type (e.g. poller)
  <process type,N>                Process type and number (e.g. poller,3)

Zabbix agent does not have a poller process. In the example, it might be better to include a process type that really exists on the agent.

Also, according to http://en.wiktionary.org/wiki/e.g.#Usage_notes , in American English "e.g." should be followed by a comma. However, in that case, there may be too many commas.

REOPENED.

igorsh RESOLVED in r48475.

asaveljevs CLOSED

Comment by Andris Zeila [ 2014 Jul 14 ]

(7) There is no need for separate set_log_level_task() implementations for proxy and server. The both do the same thing and can be moved somewhere.

igorsh RESOLVED in r47333 and r47336.

wiper CLOSED, as agent,server and proxy shares the same get_process_type_by_name() function I removed the callback parameter from set_log_level_task() function. Please review r48021

igorsh Thank you. CLOSED

Comment by Andris Zeila [ 2014 Jul 14 ]

(8) Use THIS_SHOULD_NEVER_HAPPEN instead of assert(0)

igorsh RESOLVED in r47487.

wiper CLOSED

Comment by Andris Zeila [ 2014 Jul 14 ]

(9) Might be better to merge get_process_type_by_server_num() and get_process_num_by_server_num() functions into one function get_process_info_by_thread(int *process_type, int *process_num).

igorsh RESOLVED in r47362.

wiper CLOSED

Comment by Andris Zeila [ 2014 Jul 14 ]

(10) Log level changes should work for agent (nix only) too.

igorsh RESOLVED in r47443 and r47474.

wiper CLOSED

Comment by Igors Homjakovs (Inactive) [ 2014 Jul 21 ]

(11) Signal blocking has been added for server and proxy processes in order to prevent undefined behaviour on user signal(SIGUSR1) arrival. Signal blocking has not been added for dbconfig and proxyconfig processes in order to preserve instant configuration reloading.

igorsh RESOLVED in r47486.

wiper the signal block was removed, so this is not relevant anymore. CLOSED

Comment by Andris Zeila [ 2014 Aug 14 ]

(12) Continuing from the discussion above we should always log the received contents in web monitoring with trace log level, even if the web monitoring step succeeds.

igorsh RESOLVED in r48206.

wiper CLOSED

Comment by Andris Zeila [ 2014 Aug 14 ]

(13) Vmware collector also should log the received data when log level set to trace.

igorsh RESOLVED in r48217.

wiper CLOSED. Please review r48382 (slightly changed log messages)

igorsh Thank you. Looks good. Please review my modifications in r48428.

wiper CLOSED

Comment by Andris Zeila [ 2014 Aug 14 ]

(14) It would be better to use bit masks/shifts to assemble the signal data. It would make the data independent of system endianess and the code would be easier to read.

And instead of reusing ZBX_TASK_* defines for message code, create separate messages, with config cache reload message having the same value 9, but the log level messages having values 1,2 (see spec).

igorsh RESOLVED in r48327.

wiper CLOSED

Comment by Andris Zeila [ 2014 Aug 14 ]

(15) The config cache reload message should be assembled with the same tools used to create log level messages to have the same byte order (as defined in spec).

Currently config cache reload message is sent to the first process, which should be configuration syncer. But now we can find the right process so we should use this technique also when redirecting config cache reload message.

igorsh RESOLVED in r48327.

wiper CLOSED, please review changes in r48401

igorsh Thank you. CLOSED.

Comment by Alexander Vladishev [ 2014 Aug 22 ]

(16) man pages should be updated

igorsh RESOLVED in r48415.

asaveljevs Review will be done later, when we resolve conflicts with ~~ZBX-5166~~. POSTPONED.

sasha Resolved with ~~ZBX-5166~~. CLOSED

Comment by Andris Zeila [ 2014 Aug 26 ]

Successfully tested

Comment by Alexander Vladishev [ 2014 Aug 26 ]

(17) Documentation:

asaveljevs When documenting, mention that sleeping processes will be woken up and might unexpectedly start executing. For instance, if we send "log_level_increase" command to a sleeping housekeeper, it will do housekeeping. Similarly, a listening trapper will print that it has processed data in 0 seconds.

<richlv> did we just implement ~~ZBXNEXT-207~~ ?

asaveljevs My comment above superseded by ~~ZBX-8699~~.

igorsh I have updated the documentation, except the manpages. Please have a look.

<dimir> agent man page should mention support for -R

https://www.zabbix.com/documentation/2.4/manpages/zabbix_agentd

igorsh RESOLVED.

andris It should be documented that changing log level for a single Zabbix process by specifing process PID is supported only if PID <= 65535. On systems with large PIDs <process type,N> can be used to change log level for a single process.

igorsh RESOLVED in

https://www.zabbix.com/documentation/2.4/manual/concepts/server
https://www.zabbix.com/documentation/2.4/manual/concepts/proxy
https://www.zabbix.com/documentation/2.4/manual/concepts/agent

andris CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 27 ]

(19) [minor] In log.h, we have the following set of functions:

void		zabbix_set_log_level(int level);
int		set_debug_level_up();
int		set_debug_level_down();
const char	*get_debug_level_string();
int		zabbix_check_log_level(int level);

In some of them, we say "log level", while in the others we say "debug level". I would suggest to (a) have them all refer to "log level", (b) prefix them with "zabbix_" to avoid issues like ~~ZBX-8629~~ in the future, (c) name 2nd and 3rd functions "zabbix_increase_log_level" and "zabbix_decrease_log_level", for consistency with runtime controls.

igorsh RESOLVED in r48489.

asaveljevs Functions that do not take any parameters should have "void" in their parameter list. Otherwise, it is possible to call these functions with any number of arguments. REOPENED.

igorsh RESOLVED in r48524.

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 27 ]

(20) This looks a bit wrong, because if proxy and agent are built together without server, then "zbxself" will be processed two times:

Index: src/libs/Makefile.am
===================================================================
--- src/libs/Makefile.am	(revision 48471)
+++ src/libs/Makefile.am	(working copy)
@@ -50,7 +50,11 @@
 	zbxicmpping \
 	zbxself
 endif
+if AGENT
+AGENT_SUBDIRS = \
+	zbxself
 endif
+endif

This is exactly what we wanted to avoid in ~~ZBX-6235~~.

igorsh RESOLVED in r48492.

asaveljevs Good, except formatting got broken. REOPENED.

igorsh RESOLVED in r48527.

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 27 ]

(21) zabbix_set_log_level() was such a useful debugging function! Very sad to see it being removed, I'd rather keep it. If you would still like to remove it, however, please remove the prototype from include/log.h.

igorsh RESOLVED in r48494.

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 27 ]

(22) As noted by sasha, the following logging code (for instance, in src/zabbix_proxy/proxyconfig/proxyconfig.c) is incorrect:

zabbix_log(LOG_LEVEL_INFORMATION, "server #%d started [%s #%d]",
		server_num, get_process_type_string(process_type), process_num);

It says hardcoded "server #%d started", but it should be "server" for zabbix_server and "proxy" for zabbix_proxy.

asaveljevs Similarly, in src/zabbix_agent/stats.c, there is:

zabbix_log(LOG_LEVEL_INFORMATION, "agent #%d started [collector]", server_num);

The "collector" part should come from get_process_type_string() function, instead of being hardcoded.

asaveljevs Similarly, too, the following code distinguished between server and proxy before:

#ifdef HAVE_SNMP
	init_snmp("zabbix_server");
#endif

wiper Working on it, solving libsysperf dependency on collector does not look simple
wiper So we decided to leave the collector message hardcoded for now, and fix the other listed issues. RESOLVED in r48526

asaveljevs Looks good, but please see r48534. It changes get_daemon_type_string() function to accept unsigned char, same as get_process_type_string(). RESOLVED.

wiper Thanks, CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 27 ]

(23) We now have the following structure:

typedef struct
{
	int		server_num;
	int		process_num;
	unsigned char	process_type;
	void		*args;
}
zbx_thread_args_t;

This has "server", "process" and "thread", words that almost contradict each other. There is probably little we can do regarding "process" and "thread" - those are already rooted in our code.

However, the "server_num" field is new and makes no sense on the agent. Perhaps, "global_num" would be better?

Also, perhaps we could replace the following repetitive code with a single call to, say, zbx_thread_unpack_args(args, &global_num, &process_num, &process_type)?

process_type = ((zbx_thread_args_t *)args)->process_type;
server_num = ((zbx_thread_args_t *)args)->server_num;
process_num = ((zbx_thread_args_t *)args)->process_num;

wiper I was thinking about creating a single entry function for all process (worker) threads, which would handle args, log the '<binary> started> message and then call the specific entry function (also passed with args). I even started to implement it, but then ran into dependency problems and decided to back off.

wiper It was decided to leave zbx_thread_agrs_t possible refactoring in ZBXNEXT-2427
CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(24) In function get_log_level_message(), variable "proc_name" is not always freed.

igorsh RESOLVED in r48532.

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(25) Not related to this development, but compiling zabbix_sender.dll gives a warning:

src\zabbix_sender\win32\zabbix_sender.c(66) : warning C4002: too many actual parameters for macro 'zbx_tcp_recv'

wiper Yes, the windows sender dll was not updated when the socket API changed slightly
RESOLVED in r48528

asaveljevs Variable "answer" became unused. Removed it in r48536. RESOLVED.

wiper Right, CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(26) The following was added for use with siginfo_t.si_uid field:

#define SIG_CHECKED_FIELD_UINT(siginfo, field)		(NULL == siginfo ? 0 : siginfo->field)

Unfortunately, the value of 0 does not allow to distinguish between root user and NULL siginfo. Perhaps we should live with just SIG_CHECKED_FIELD, as before, despite si_uid being unsigned - it might not be common to have user IDs over 31 bits.

wiper RESOLVED in ~~r48529~~ r48538

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(27) A notion of "Zabbix worker process" was introduced in daemon.c: in common_worker_sigusr_handler() function and message "process pid:%d is not a Zabbix worker process". We might wish to avoid introducing a new term and just stick with "Zabbix process".

wiper RESOLVED in r48531

asaveljevs Looks better, but not quite:

zabbix_log(LOG_LEVEL_ERR, "failed to redirect signal: process pid:%d is not a Zabbix"
		" process", ZBX_RTC_GET_DATA(flags));

If we try to send SIGUSR1 to the main process, it will say that "pid:12345 is not a Zabbix process", which is not entirely correct. Perhaps, "Zabbix child process" would be better. REOPENED.

wiper That's why I wanted to find some definition for Zabbix 'worker' processes. For example if the server invokes a script - technically it's also a Zabbix child process.
RESOLVED in r48540

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(28) In zabbix_agentd.c, the following change was made:

 static char	shortopts[] =
-	"c:hVpt:"
+	"c:hVR:pt:"
 #ifdef _WINDOWS
 	"idsxm"
 #endif

However, "R" should only be present on Unix, in the same way as "usage_message" above contains "R" only for Windows.

wiper I was thinking it would be better to give explicit error message about runtime control not being supported on Windows, rather than user thinking that he made an error and trying to find how to change the log level on Windows

wiper RESOLVED in r48533 (removed runtime option control from Windows agent options)

asaveljevs Please see r48548. Among other things, it removes double break that r48533 introduced. RESOLVED.
wiper thanks, CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(29) Daemons do not compile on platforms without sigqueue() function (e.g., FreeBSD 4.2):

../../src/libs/zbxnix/libzbxnix.a(daemon.o): In function `common_worker_sigusr_handler':
src/libs/zbxnix/daemon.c:78: undefined reference to `zbx_sigusr_handler'

igorsh RESOLVED in r48541.

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(30) Previously, main process used to say that it started:

 14488:20140826:152618.787 server #0 started [main process]

Unfortunately, it no longer does.

igorsh RESOLVED in r48539.

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(31) When we receive signal, we write:

  4544:20140828:145453.425 Got signal [signal:10(SIGUSR1),sender_pid:4543,sender_uid:1000,value_int:33025].

Since we are using packed integers for signal messages, it might be better to dump value_int as hexadecimal.

wiper RESOLVED in r48529

asaveljevs Changed the hexadecimal dump to be 8 digits long in r48546. RESOLVED.
wiper CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(32) When we change log level, we write:

  4544:20140828:145449.938 log level has been increased to "4 - debug"

As discussed, let's change that to the following instead:

  4544:20140828:145449.938 log level has been increased to 4 (debug)

igorsh RESOLVED in r48543.

asaveljevs There should not quotes around the printed value. REOPENED.
wiper RESOLVED in r48554

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(33) Old bug, but anyway: if a PID file is empty, we quit without an error message:

$ touch /tmp/zabbix_server.pid
$ sbin/zabbix_server -R log_level_increase
zabbix_server [4907]:

This is because in function read_pid_file() we do not set an error message if fscanf() fails.

igorsh RESOLVED in r48545 and r48547.

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(34) In zabbix_agentd.c, we currently have:

case ZBX_TASK_RUNTIME_CONTROL:
	zbx_load_config(ZBX_CFG_FILE_OPTIONAL);
	exit(SUCCEED == zbx_sigusr_send(t.flags) ? EXIT_SUCCESS : EXIT_FAILURE);
	break;

Config file loading should not be optional, because if we fail at loading the config, whatever PID file we are thinking of as the default is probably wrong.

wiper RESOLVED in r48555

asaveljevs CLOSED

Comment by Aleksandrs Saveljevs [ 2014 Aug 28 ]

(35) Currently, we allow the following, which sends signal to all our processes. We should not.

$ sbin/zabbix_server -R log_level_increase=0

wiper RESOLVED in ~~r48557~~ ~~r48563~~ ~~r48566~~ r48567

asaveljevs CLOSED

Comment by Igors Homjakovs (Inactive) [ 2014 Aug 29 ]

Available in pre-2.3.4 (trunk) r48604.

Comment by richlv [ 2014 Aug 30 ]

some review asked in subissues 7 & 15
subissues still open : 16, 17

Comment by richlv [ 2014 Sep 07 ]

ZBXNEXT-2443 asks for a way to query current loglevel
ZBXNEXT-2444 asks for a way to set a specific loglevel
~~ZBX-8725~~ - apparently, although it's not mentioned anywhere, loglevel5 can not be set in the config file

Comment by Andris Mednis [ 2014 Sep 08 ]

(36) get_process_info_by_thread() has no prototype:

zabbix_agentd.c:215:5: warning: no previous prototype for 'get_process_info_by_thread' [-Wmissing-prototypes]
int get_process_info_by_thread(int server_num, unsigned char *process_type, int *process_num)

igorsh RESOLVED in r48897.
andris CLOSED.

Comment by Andris Mednis [ 2014 Sep 08 ]

(37) Confusing messages from validation of runtime control option:

$ sbin/zabbix_proxy -c ~/ZBX-8391/zabbix_proxy.conf -R log_level_increaseAAA	
zabbix_proxy [21649]: unknown log level control option: AAA

$ sbin/zabbix_proxy -c ~/ZBX-8391/zabbix_proxy.conf -R log_level_increase=poller,
zabbix_proxy [21808]: invalid log level control option: process number must be unsigned short non-zero value

igorsh RESOLVED ir r48933.

andris Please review proposed changes in r49134.

igorsh Additional changes in r49137.

andris Please review proposed change in r49140.

igorsh Thank you. Moved to ~~ZBX-8794~~. CLOSED.

igorsh Agent crashes if only process name is specified (e.g -Rlog_level_increase=poller)
due to uninitialized proc_num in src/libs/zbxnix/control.c. REOPENED

andris RESOLVED in r49317.

igorsh Thank you. CLOSED

Comment by Andris Mednis [ 2014 Sep 08 ]

(38) get_process_info_by_thread() arguments have the same names as global variables in src/zabbix_agent/zabbix_agentd.c:

zabbix_agentd.c: In function 'get_process_info_by_thread':
zabbix_agentd.c:215:36: warning: declaration of 'server_num' shadows a global declaration [-Wshadow]
 int get_process_info_by_thread(int server_num, unsigned char *process_type, int *process_num)
                                    ^
zabbix_agentd.c:180:22: warning: shadowed declaration is here [-Wshadow]
 ZBX_THREAD_LOCAL int server_num   = 0;
                      ^
zabbix_agentd.c:215:63: warning: declaration of 'process_type' shadows a global declaration [-Wshadow]
 int get_process_info_by_thread(int server_num, unsigned char *process_type, int *process_num)
                                                               ^
zabbix_agentd.c:178:32: warning: shadowed declaration is here [-Wshadow]
 ZBX_THREAD_LOCAL unsigned char process_type = 255; /* ZBX_PROCESS_TYPE_UNKNOWN */
                                ^
zabbix_agentd.c:215:82: warning: declaration of 'process_num' shadows a global declaration [-Wshadow]
 int get_process_info_by_thread(int server_num, unsigned char *process_type, int *process_num)
                                                                                  ^
zabbix_agentd.c:179:22: warning: shadowed declaration is here [-Wshadow]
 ZBX_THREAD_LOCAL int process_num;

igorsh RESOLVED in r48897 and moved to ~~ZBX-8796~~.
andris CLOSED.

Comment by Aleksandrs Saveljevs [ 2014 Sep 10 ]

This caused IPMI pollers not working on the server: ~~ZBX-8734~~.

Comment by Andris Zeila [ 2014 Sep 11 ]

(39) There is a potential lockup if signal is received during message logging - as it will try to lock the mutex already locked by log function. Possible solution would be to block signals during log writes, but need to check performance impact and how it would work from signal handler itself.

igorsh Moved to ~~ZBX-8761~~ and resolved in r49011, r49013, r49043. CLOSED.

Comment by Igors Homjakovs (Inactive) [ 2014 Sep 15 ]

(40) Debug level 5 messages are not printed if syslog is used.

igorsh Moved to ~~ZBX-8763~~ and RESOLVED in r49006. CLOSED.

Comment by Oleksii Zagorskyi [ 2016 May 19 ]

Take headers into consideration (including for logging) has been requested in ~~ZBXNEXT-2315~~

Generated at Wed Jul 16 09:30:25 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.

[ZBXNEXT-101] Debugging failed web monitoring steps Created: 2009 Oct 09 Updated: 2016 May 29 Resolved: 2014 Aug 26