[ZBX-13351] Strange log messages when OpenSSL connection times out Created: 2018 Jan 17  Updated: 2024 Apr 10  Resolved: 2018 Mar 05

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G), Proxy (P), Server (S)
Affects Version/s: 3.0.14, 3.4.6
Fix Version/s: 3.0.16rc1, 3.4.8rc1, 4.0.0alpha5, 4.0 (plan)

Type: Patch request Priority: Trivial
Reporter: Glebs Ivanovskis (Inactive) Assignee: Viktors Tjarve
Resolution: Fixed Votes: 0
Labels: encryption, logging
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

OpenSSL


Team: Team A
Sprint: Sprint 27, Sprint 28, Sprint 29
Story Points: 0.5

 Description   

Seeing surprising messages in the log file when Zabbix server cannot communicate with passive proxy using TLS (PSK) due to timeouts:

 10012:20180117:055518.812 cannot send data to proxy "<proxy name>": SSL_write() timed out
 10012:20180117:055518.812 SSL_shutdown() with <proxy IP> returned error code 5: 

According to OpenSSL documentation SSL_shutdown() can only return 1, 0 or negative values. Looking into this piece of code

if (0 > (res = SSL_shutdown(s->tls_ctx->ctx)))
{
	int	error_code;
	char	*error = NULL;
	size_t	error_alloc = 0, error_offset = 0;

	error_code = SSL_get_error(s->tls_ctx->ctx, res);
	zbx_tls_error_msg(&error, &error_alloc, &error_offset);
	zabbix_log(LOG_LEVEL_WARNING, "SSL_shutdown() with %s returned error code %d: %s",
			s->peer, error_code, info_buf);
	zbx_free(error);
}

revealed that "5" is not coming from SSL_shutdown() at all. Moreover, for some reason error is composed just to be thrown away.



 Comments   
Comment by Viktors Tjarve [ 2018 Feb 20 ]

SSL_get_error() returns a result code for various preceding calls to SSL_*() functions. It is also recommended to be used with SSL_shutdown() if it has returned -1.

Comment by Viktors Tjarve [ 2018 Feb 20 ]

Additional error information prepared in zbx_tls_error_msg() now added to the SSL_shutdown() error log entry.

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-13351

Comment by Andris Mednis [ 2018 Feb 23 ]

(1) While looking how info_buf[] is cleared in other functions, it was discovered that only zbx_tls_connect() and zbx_tls_accept() do it correctly - before calling SSL_connect() and SSL_accept(), respectively.
In zbx_tls_write() and zbx_tls_read() the info_buf[] is cleared after ZBX_TLS_WRITE() and ZBX_TLS_READ(), which may cause losing of some error details.
In zbx_tls_close() the info_buf[] is not cleared at all before calling SSL_shutdown() which may cause reporting misleading error details.

andris RESOLVED in r77986 and r77991. Also see minor change in r77988.

<viktors.tjarve> CLOSED

Comment by Andris Mednis [ 2018 Feb 23 ]

For testing changes described in (1) it took me some time to find out, how to cause a TLS alert during SSL_write() and SSL_read() after a successful TLS handshake. One way is to insert SSL_clear() for testing:

Index: src/libs/zbxcrypto/tls.c
===================================================================
--- src/libs/zbxcrypto/tls.c	(revision 77991)
+++ src/libs/zbxcrypto/tls.c	(working copy)
@@ -5079,6 +5079,10 @@
 #if defined(HAVE_OPENSSL)
 	info_buf[0] = '\0';	/* empty buffer for zbx_openssl_info_cb() messages */
 #endif
+{
+if (0 != (program_type & ZBX_PROGRAM_TYPE_AGENTD))
+	SSL_clear(s->tls_ctx->ctx);
+}
 	do
 	{
 		res = ZBX_TLS_WRITE(s->tls_ctx->ctx, buf, len);
@@ -5162,6 +5166,10 @@
 #if defined(HAVE_OPENSSL)
 	info_buf[0] = '\0';	/* empty buffer for zbx_openssl_info_cb() messages */
 #endif
+{
+if (0 != (program_type & ZBX_PROGRAM_TYPE_AGENTD))
+	SSL_clear(s->tls_ctx->ctx);
+}
 	do
 	{
 		res = ZBX_TLS_READ(s->tls_ctx->ctx, buf, len);
Comment by Andris Mednis [ 2018 Feb 23 ]

(2) Concerning
SSL_shutdown() can only return 1, 0 or negative values......revealed that "5" is not coming from SSL_shutdown() at all.
glebs.ivanovskis, would you prefer it like

Index: src/libs/zbxcrypto/tls.c
===================================================================
--- src/libs/zbxcrypto/tls.c	(revision 77991)
+++ src/libs/zbxcrypto/tls.c	(working copy)
@@ -5321,8 +5321,8 @@
 
 			error_code = SSL_get_error(s->tls_ctx->ctx, res);
 			zbx_tls_error_msg(&error, &error_alloc, &error_offset);
-			zabbix_log(LOG_LEVEL_WARNING, "SSL_shutdown() with %s returned error code %d:%s: %s",
-					s->peer, error_code, ZBX_NULL2EMPTY_STR(error), info_buf);
+			zabbix_log(LOG_LEVEL_WARNING, "SSL_shutdown() with %s returned %d (error code %d):%s: %s",
+					s->peer, res, error_code, ZBX_NULL2EMPTY_STR(error), info_buf);
 			zbx_free(error);
 		}

to have

 13236:20180223:195057.161 SSL_shutdown() with 127.0.0.1 returned -1 (error code 1): file ../ssl/ssl_lib.c line 1735: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init: 

? I thought internal 'error_code' would be more informative than plain 'res'.

andris Decided to modify error messages to look like:

17366:20180226:121249.426 SSL_shutdown() with 127.0.0.1 set result code to 1: ....

RESOLVED in r78011.

<viktors.tjarve> I found a case when neither zbx_tls_error_msg() has returned any error information nor there is any info in info_buf. In such case log entry looks like this:

8863:20180302:150016.273 SSL_shutdown() with 127.0.0.1 set result code to 5::

RESOLVED in r78236.

andris Reviewed, accepted.

Comment by Andris Mednis [ 2018 Mar 05 ]

Successfully tested.

Comment by Viktors Tjarve [ 2018 Mar 05 ]

Released in:

  • 3.0.16rc1 r78301
  • 3.4.8rc1 r78304
  • 4.0.0alpha5 r78308
Comment by philthynz [ 2018 Apr 05 ]

It seems this issue is not fixed

Running agent version 3.4.8

zabbix-agent:

Installed: 1:3.4.8-1+xenial
Candidate: 1:3.4.8-1+xenial
Version table:
*** 1:3.4.8-1+xenial 500
500 http://repo.zabbix.com/zabbix/3.4/ubuntu xenial/main amd64 Packages
100 /var/lib/dpkg/status
1:2.4.7+dfsg-2ubuntu2.1 500
500 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages
1:2.4.7+dfsg-2ubuntu2 500
500 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages
root@zabbix-proxy:~# tail -10 /var/log/zabbix/zabbix_agentd.log
14625:20180405:142502.364 using configuration file: /etc/zabbix/zabbix_agentd.conf
14625:20180405:142502.364 agent #0 started [main process]
14629:20180405:142502.365 agent #4 started[listener #3]
14628:20180405:142502.366 agent #3 started[listener #2]
14627:20180405:142502.368 agent #2 started[listener #1]
14626:20180405:142502.369 agent #1 started [collector]
14629:20180405:144853.149 SSL_shutdown() with 127.0.0.1 set result code to 5:
14628:20180405:144909.727 SSL_shutdown() with 127.0.0.1 set result code to 5:
14627:20180405:144910.032 SSL_shutdown() with 127.0.0.1 set result code to 5:
14629:20180405:144911.996 SSL_shutdown() with 127.0.0.1 set result code to 5:

The lxc container has Zabbix Proxy and Zabbix Agent on the same host. The agent is configured with Server=127.0.0.1 because the proxy is on the same host. Encryption is configured with psk on the server, proxy and agent. The data reports fine to the Zabbix Server, so i presume the connection is fine. There are just some ssl errors. I also restarted the server, agent and proxy on the needed hosts.

Comment by Andris Mednis [ 2018 Apr 05 ]

Hmmm.. But what is wrong with error messages here ? They look as expected.
Another thing is why SSL_shutdown() ends with error.

Comment by philthynz [ 2018 Apr 05 ]

I am not sure if there is anything wrong with these errors, zabbix seems to be functioning as normal. There are just these errors in the agent log:

 
root@zabbix-proxy:~# tail -f /var/log/zabbix/zabbix_agentd.log
 14625:20180405:142502.364 using configuration file: /etc/zabbix/zabbix_agentd.conf
 14625:20180405:142502.364 agent #0 started [main process]
 14629:20180405:142502.365 agent #4 started [listener #3]
 14628:20180405:142502.366 agent #3 started [listener #2]
 14627:20180405:142502.368 agent #2 started [listener #1]
 14626:20180405:142502.369 agent #1 started [collector]
 14629:20180405:144853.149 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14628:20180405:144909.727 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14627:20180405:144910.032 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14629:20180405:144911.996 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14628:20180405:145856.092 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14629:20180405:145857.295 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14627:20180405:145857.686 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14629:20180405:151903.095 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14627:20180405:151903.160 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14629:20180405:153846.073 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14628:20180405:153846.454 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14629:20180405:154846.114 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14627:20180405:154853.965 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14629:20180405:155845.766 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14628:20180405:155848.753 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14629:20180405:160848.734 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14627:20180405:160910.378 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14628:20180405:160920.923 SSL_shutdown() with 127.0.0.1 set result code to 5:
 14629:20180405:160922.783 SSL_shutdown() with 127.0.0.1 set result code to 5:

Seems like an error or just information?

Comment by Andris Mednis [ 2018 Apr 05 ]

Probably tcpdump or Wireshark is necessary ti find out why these errors happen.

Generated at Wed Apr 24 05:28:11 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.