[ZBX-13351] Strange log messages when OpenSSL connection times out Created: 2018 Jan 17 Updated: 2024 Apr 10 Resolved: 2018 Mar 05 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G), Proxy (P), Server (S) |
Affects Version/s: | 3.0.14, 3.4.6 |
Fix Version/s: | 3.0.16rc1, 3.4.8rc1, 4.0.0alpha5, 4.0 (plan) |
Type: | Patch request | Priority: | Trivial |
Reporter: | Glebs Ivanovskis (Inactive) | Assignee: | Viktors Tjarve |
Resolution: | Fixed | Votes: | 0 |
Labels: | encryption, logging | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
OpenSSL |
Team: | Team A |
Sprint: | Sprint 27, Sprint 28, Sprint 29 |
Story Points: | 0.5 |
Description |
Seeing surprising messages in the log file when Zabbix server cannot communicate with passive proxy using TLS (PSK) due to timeouts: 10012:20180117:055518.812 cannot send data to proxy "<proxy name>": SSL_write() timed out 10012:20180117:055518.812 SSL_shutdown() with <proxy IP> returned error code 5: According to OpenSSL documentation SSL_shutdown() can only return 1, 0 or negative values. Looking into this piece of code if (0 > (res = SSL_shutdown(s->tls_ctx->ctx))) { int error_code; char *error = NULL; size_t error_alloc = 0, error_offset = 0; error_code = SSL_get_error(s->tls_ctx->ctx, res); zbx_tls_error_msg(&error, &error_alloc, &error_offset); zabbix_log(LOG_LEVEL_WARNING, "SSL_shutdown() with %s returned error code %d: %s", s->peer, error_code, info_buf); zbx_free(error); } revealed that "5" is not coming from SSL_shutdown() at all. Moreover, for some reason error is composed just to be thrown away. |
Comments |
Comment by Viktors Tjarve [ 2018 Feb 20 ] |
SSL_get_error() returns a result code for various preceding calls to SSL_*() functions. It is also recommended to be used with SSL_shutdown() if it has returned -1. |
Comment by Viktors Tjarve [ 2018 Feb 20 ] |
Additional error information prepared in zbx_tls_error_msg() now added to the SSL_shutdown() error log entry. Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-13351 |
Comment by Andris Mednis [ 2018 Feb 23 ] |
(1) While looking how info_buf[] is cleared in other functions, it was discovered that only zbx_tls_connect() and zbx_tls_accept() do it correctly - before calling SSL_connect() and SSL_accept(), respectively. andris RESOLVED in r77986 and r77991. Also see minor change in r77988. <viktors.tjarve> CLOSED |
Comment by Andris Mednis [ 2018 Feb 23 ] |
For testing changes described in (1) it took me some time to find out, how to cause a TLS alert during SSL_write() and SSL_read() after a successful TLS handshake. One way is to insert SSL_clear() for testing: Index: src/libs/zbxcrypto/tls.c =================================================================== --- src/libs/zbxcrypto/tls.c (revision 77991) +++ src/libs/zbxcrypto/tls.c (working copy) @@ -5079,6 +5079,10 @@ #if defined(HAVE_OPENSSL) info_buf[0] = '\0'; /* empty buffer for zbx_openssl_info_cb() messages */ #endif +{ +if (0 != (program_type & ZBX_PROGRAM_TYPE_AGENTD)) + SSL_clear(s->tls_ctx->ctx); +} do { res = ZBX_TLS_WRITE(s->tls_ctx->ctx, buf, len); @@ -5162,6 +5166,10 @@ #if defined(HAVE_OPENSSL) info_buf[0] = '\0'; /* empty buffer for zbx_openssl_info_cb() messages */ #endif +{ +if (0 != (program_type & ZBX_PROGRAM_TYPE_AGENTD)) + SSL_clear(s->tls_ctx->ctx); +} do { res = ZBX_TLS_READ(s->tls_ctx->ctx, buf, len); |
Comment by Andris Mednis [ 2018 Feb 23 ] |
(2) Concerning Index: src/libs/zbxcrypto/tls.c =================================================================== --- src/libs/zbxcrypto/tls.c (revision 77991) +++ src/libs/zbxcrypto/tls.c (working copy) @@ -5321,8 +5321,8 @@ error_code = SSL_get_error(s->tls_ctx->ctx, res); zbx_tls_error_msg(&error, &error_alloc, &error_offset); - zabbix_log(LOG_LEVEL_WARNING, "SSL_shutdown() with %s returned error code %d:%s: %s", - s->peer, error_code, ZBX_NULL2EMPTY_STR(error), info_buf); + zabbix_log(LOG_LEVEL_WARNING, "SSL_shutdown() with %s returned %d (error code %d):%s: %s", + s->peer, res, error_code, ZBX_NULL2EMPTY_STR(error), info_buf); zbx_free(error); } to have 13236:20180223:195057.161 SSL_shutdown() with 127.0.0.1 returned -1 (error code 1): file ../ssl/ssl_lib.c line 1735: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init: ? I thought internal 'error_code' would be more informative than plain 'res'. andris Decided to modify error messages to look like: 17366:20180226:121249.426 SSL_shutdown() with 127.0.0.1 set result code to 1: .... RESOLVED in r78011. <viktors.tjarve> I found a case when neither zbx_tls_error_msg() has returned any error information nor there is any info in info_buf. In such case log entry looks like this: 8863:20180302:150016.273 SSL_shutdown() with 127.0.0.1 set result code to 5:: RESOLVED in r78236. andris Reviewed, accepted. |
Comment by Andris Mednis [ 2018 Mar 05 ] |
Successfully tested. |
Comment by Viktors Tjarve [ 2018 Mar 05 ] |
Released in:
|
Comment by philthynz [ 2018 Apr 05 ] |
It seems this issue is not fixed Running agent version 3.4.8 zabbix-agent: Installed: 1:3.4.8-1+xenial Candidate: 1:3.4.8-1+xenial Version table: *** 1:3.4.8-1+xenial 500 500 http://repo.zabbix.com/zabbix/3.4/ubuntu xenial/main amd64 Packages 100 /var/lib/dpkg/status 1:2.4.7+dfsg-2ubuntu2.1 500 500 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages 1:2.4.7+dfsg-2ubuntu2 500 500 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages root@zabbix-proxy:~# tail -10 /var/log/zabbix/zabbix_agentd.log 14625:20180405:142502.364 using configuration file: /etc/zabbix/zabbix_agentd.conf 14625:20180405:142502.364 agent #0 started [main process] 14629:20180405:142502.365 agent #4 started[listener #3] 14628:20180405:142502.366 agent #3 started[listener #2] 14627:20180405:142502.368 agent #2 started[listener #1] 14626:20180405:142502.369 agent #1 started [collector] 14629:20180405:144853.149 SSL_shutdown() with 127.0.0.1 set result code to 5: 14628:20180405:144909.727 SSL_shutdown() with 127.0.0.1 set result code to 5: 14627:20180405:144910.032 SSL_shutdown() with 127.0.0.1 set result code to 5: 14629:20180405:144911.996 SSL_shutdown() with 127.0.0.1 set result code to 5: The lxc container has Zabbix Proxy and Zabbix Agent on the same host. The agent is configured with Server=127.0.0.1 because the proxy is on the same host. Encryption is configured with psk on the server, proxy and agent. The data reports fine to the Zabbix Server, so i presume the connection is fine. There are just some ssl errors. I also restarted the server, agent and proxy on the needed hosts. |
Comment by Andris Mednis [ 2018 Apr 05 ] |
Hmmm.. But what is wrong with error messages here ? They look as expected. |
Comment by philthynz [ 2018 Apr 05 ] |
I am not sure if there is anything wrong with these errors, zabbix seems to be functioning as normal. There are just these errors in the agent log: root@zabbix-proxy:~# tail -f /var/log/zabbix/zabbix_agentd.log 14625:20180405:142502.364 using configuration file: /etc/zabbix/zabbix_agentd.conf 14625:20180405:142502.364 agent #0 started [main process] 14629:20180405:142502.365 agent #4 started [listener #3] 14628:20180405:142502.366 agent #3 started [listener #2] 14627:20180405:142502.368 agent #2 started [listener #1] 14626:20180405:142502.369 agent #1 started [collector] 14629:20180405:144853.149 SSL_shutdown() with 127.0.0.1 set result code to 5: 14628:20180405:144909.727 SSL_shutdown() with 127.0.0.1 set result code to 5: 14627:20180405:144910.032 SSL_shutdown() with 127.0.0.1 set result code to 5: 14629:20180405:144911.996 SSL_shutdown() with 127.0.0.1 set result code to 5: 14628:20180405:145856.092 SSL_shutdown() with 127.0.0.1 set result code to 5: 14629:20180405:145857.295 SSL_shutdown() with 127.0.0.1 set result code to 5: 14627:20180405:145857.686 SSL_shutdown() with 127.0.0.1 set result code to 5: 14629:20180405:151903.095 SSL_shutdown() with 127.0.0.1 set result code to 5: 14627:20180405:151903.160 SSL_shutdown() with 127.0.0.1 set result code to 5: 14629:20180405:153846.073 SSL_shutdown() with 127.0.0.1 set result code to 5: 14628:20180405:153846.454 SSL_shutdown() with 127.0.0.1 set result code to 5: 14629:20180405:154846.114 SSL_shutdown() with 127.0.0.1 set result code to 5: 14627:20180405:154853.965 SSL_shutdown() with 127.0.0.1 set result code to 5: 14629:20180405:155845.766 SSL_shutdown() with 127.0.0.1 set result code to 5: 14628:20180405:155848.753 SSL_shutdown() with 127.0.0.1 set result code to 5: 14629:20180405:160848.734 SSL_shutdown() with 127.0.0.1 set result code to 5: 14627:20180405:160910.378 SSL_shutdown() with 127.0.0.1 set result code to 5: 14628:20180405:160920.923 SSL_shutdown() with 127.0.0.1 set result code to 5: 14629:20180405:160922.783 SSL_shutdown() with 127.0.0.1 set result code to 5: Seems like an error or just information? |
Comment by Andris Mednis [ 2018 Apr 05 ] |
Probably tcpdump or Wireshark is necessary ti find out why these errors happen. |