[ZBX-14856] Can't connect to agents after upgrading OpenSSL to 1.1.1 Created: 2018 Sep 16 Updated: 2024 Apr 10 Resolved: 2018 Oct 21 |
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 3.0.22, 3.4.13, 4.0.0beta2 |
Fix Version/s: | 3.0.23rc1, 3.4.15rc1, 4.0.1rc1, 4.2.0alpha1, 4.2 (plan) |
Type: | Incident report | Priority: | Critical |
Reporter: | Vladislav | Assignee: | Andris Mednis |
Resolution: | Fixed | Votes: | 0 |
Labels: | encryption, ssl, zabbix_get |
Remaining Estimate: | Not Specified |
Time Spent: | Not Specified |
Original Estimate: | Not Specified |
Environment: | Archlinux, zabbix server 3.4.13 |
Attachments: | patch_set_max_version_tls12_for_30.patch |
Issue Links: |
Team: | Team A |
Sprint: | Sprint 43, Sprint 44, Sprint 45 |
Story Points: | 11 |
Description |
Steps to reproduce:
Result:
    # zabbix_get -s agent.example.com --tls-connect psk --tls-psk-file "/etc/zabbix/agent.psk" --tls-psk-identity "PSK Agent" -k agent.ping
    zabbix_get [2014]: Get value error: TCP successful, cannot establish TLS to [[agent.example.com]:10050]: SSL_connect() set result code to SSL_ERROR_SSL: file ssl/statem/extensions_clnt.c line 801: error:14212044:SSL routines:tls_construct_ctos_early_data:internal error: TLS write fatal alert "internal error"
Comments |
Comment by Andris Mednis [ 2018 Sep 17 ]
Another problem with OpenSSL 1.1.1 concerns certificates, right from the start:
    20630:20180917:180612.103 In zbx_tls_init_child()
    20630:20180917:180612.109 OpenSSL library (version OpenSSL 1.1.1 11 Sep 2018) initialized
    20630:20180917:180612.110 zbx_tls_init_child() loaded CA certificate(s) from file "/home/zabbix30/zabbix_ca_file"
    20630:20180917:180612.111 cannot load certificate(s) from file "/home/zabbix30/zabbix_agentd.crt": file ../ssl/ssl_rsa.c line 310: error:140AB18E:SSL routines:SSL_CTX_use_certificate:ca md too weak
So, a certificate also stopped working.
Comment by Andris Mednis [ 2018 Sep 18 ]
PSK also does not work, failing with a strange error about certificates:
    $ bin/zabbix_get -s 127.0.0.1 --tls-connect psk --tls-psk-file "/home/zabbix30/zabbix_agentd.psk" --tls-psk-identity "PSK Zabbix server" -k agent.ping
    zabbix_get [10956]: Get value error: connection closed during read
    zabbix_get [10956]: Check access restrictions in Zabbix agent configuration
In agent log:
    10705:20180918:111052.851 In zbx_tls_accept()
    10705:20180918:111052.852 zbx_psk_server_cb() requested PSK identity "PSK Zabbix server"
    10705:20180918:111052.852 zbx_tls_accept() cannot obtain peer certificate
    10705:20180918:111052.852 End of zbx_tls_accept():FAIL error:'unspecified certificate verification error'
    10705:20180918:111052.852 failed to accept an incoming connection: from 127.0.0.1: unspecified certificate verification error
So, yes, it does not work with OpenSSL 1.1.1.
Comment by Florian Pritz [ 2018 Sep 18 ]
OpenSSL suggested that this could be worked around by limiting the TLS protocol version to a maximum of 1.2. I don't see an option for this, but maybe adding one would be a good idea for future bugs like this?
Comment by Andris Mednis [ 2018 Sep 18 ]
Added note in
Comment by Andris Mednis [ 2018 Sep 18 ]
Thanks, Florian, for the hint!
Comment by Andris Mednis [ 2018 Sep 18 ]
Link to OpenSSL bugtracker, opened by Florian: 1.1.1 breaks TLS-PSK in zabbix #7241.
Comment by Andris Mednis [ 2018 Sep 18 ]
Let's try to fix it in a way that lets users enjoy the benefits of TLS 1.3 (faster connection setup and lower latency, to name a few). We start with Zabbix 3.0, as the changes in src/libs/zbxcrypto/tls.c between 3.0 and 4.0 are small. OpenSSL 1.1.1 is the new LTS release with 5 years of support. It was released on 11 Sept (a week ago) and implements the TLS 1.3 standard, RFC 8446, which was published last month (August 2018).
Comment by Andris Mednis [ 2018 Sep 18 ]
The attached patch patch_set_max_version_tls12_for_30.patch sets the maximum protocol version to TLS 1.2, and PSK works again (but the certificate still does not load, as described in a comment above).
Comment by Andris Mednis [ 2018 Sep 18 ]
Thanks, cyclone, for helping with Jira
Comment by Andris Mednis [ 2018 Sep 18 ]
https://wiki.openssl.org/index.php/TLS1.3 describes TLS 1.3 specific details for OpenSSL 1.1.1. PSK handling has changed significantly under TLS 1.3.
Comment by Florian Pritz [ 2018 Sep 19 ]
Thanks for the patch!
Comment by Andris Mednis [ 2018 Sep 20 ]
Tested Zabbix 3.0.23rc1 with OpenSSL 1.1.1 on both endpoints. It works without modification with certificates and uses TLS 1.3 out of the box:
    5546:20180920:163536.228 In zbx_tls_accept()
    5546:20180920:163536.236 zbx_tls_accept() peer certificate issuer:"CN=ZBX-14856 Signing CA,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com" subject:"CN=Zabbix proxy,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com"
    5546:20180920:163536.236 End of zbx_tls_accept():SUCCEED (established TLSv1.3 TLS_AES_256_GCM_SHA384)
The old CA certificate used "Signature Algorithm: sha1WithRSAEncryption" and was not accepted (error message "ca md too weak"). I regenerated the CA and all test certificates with "Signature Algorithm: sha256WithRSAEncryption" and it works.
So, only PSK needs to be fixed.
Comment by Andris Mednis [ 2018 Sep 20 ]
As suggested in https://github.com/openssl/openssl/issues/7241#issuecomment-422483571, the PSK issue is in zbx_tls_accept() in src/libs/zbxcrypto/tls.c:
    #elif defined(HAVE_OPENSSL)
    int zbx_tls_accept(zbx_socket_t *s, unsigned int tls_accept, char **error)
    {
    ...
        /* Is this TLS connection using certificate or PSK? */

        cipher_name = SSL_get_cipher(s->tls_ctx->ctx);

    #if OPENSSL_VERSION_NUMBER >= 0x1010000fL /* OpenSSL 1.1.0 or newer */
        if (0 == strncmp("ECDHE-PSK-", cipher_name, ZBX_CONST_STRLEN("ECDHE-PSK-")) ||
                0 == strncmp("PSK-", cipher_name, ZBX_CONST_STRLEN("PSK-")))
    #else
        if (0 == strncmp("PSK-", cipher_name, ZBX_CONST_STRLEN("PSK-")))
    #endif
        {
            s->connection_type = ZBX_TCP_SEC_TLS_PSK;
        }
        else if (0 != strncmp("(NONE)", cipher_name, ZBX_CONST_STRLEN("(NONE)")))
        {
            s->connection_type = ZBX_TCP_SEC_TLS_CERT;
    ...
Ciphersuite names changed significantly in OpenSSL 1.1.1 with TLS 1.3. The old method of determining whether an incoming connection in a trapper (or listener) process is certificate-based or PSK-based fails: it always reports a certificate-based connection. If the agent is modified to always assume PSK (just for a test), it works and uses TLS 1.3:
    17048:20180920:182712.322 In zbx_tls_accept()
    17048:20180920:182712.322 zbx_psk_server_cb() requested PSK identity "PSK Zabbix server"
    17048:20180920:182712.324 End of zbx_tls_accept():SUCCEED (established TLSv1.3 TLS_CHACHA20_POLY1305_SHA256)
    17048:20180920:182712.324 __zbx_zbx_setproctitle() title:'listener #4 [processing request]'
    17048:20180920:182712.324 Requested [proc.num[]]
    17048:20180920:182712.330 Sending back [362]
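To make the failure mode concrete, here is a minimal standalone sketch that mirrors the prefix checks from zbx_tls_accept() quoted above, without depending on OpenSSL or Zabbix headers. The enum `conn_type`, the helper `has_prefix` and the function `classify_by_cipher_name` are hypothetical names introduced for illustration only. The `CONN_NONE` outcome is also an assumption, since the quoted code is elided after the "(NONE)" check.

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-ins for the Zabbix connection-type constants (assumed, not the real values). */
enum conn_type { CONN_NONE, CONN_CERT, CONN_PSK };

/* Hypothetical helper: returns non-zero if 's' starts with 'prefix'. */
static int has_prefix(const char *s, const char *prefix)
{
	return 0 == strncmp(prefix, s, strlen(prefix));
}

/* Mirrors the name-based check for OpenSSL 1.1.0 or newer: PSK is inferred */
/* purely from the ciphersuite name; anything else except "(NONE)" is       */
/* treated as a certificate-based connection.                               */
enum conn_type classify_by_cipher_name(const char *cipher_name)
{
	assert(NULL != cipher_name);

	if (has_prefix(cipher_name, "ECDHE-PSK-") || has_prefix(cipher_name, "PSK-"))
		return CONN_PSK;

	if (!has_prefix(cipher_name, "(NONE)"))
		return CONN_CERT;

	return CONN_NONE;	/* assumption: the original code is elided here */
}
```

With these checks, a TLS 1.2 name such as PSK-AES128-CBC-SHA is classified as PSK, while the TLS 1.3 name TLS_CHACHA20_POLY1305_SHA256 from the agent log is classified as certificate-based even though a PSK was negotiated. TLS 1.3 ciphersuite names only identify the AEAD cipher and hash, so the key exchange and authentication method can no longer be read from the name.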
Comment by Andris Mednis [ 2018 Sep 21 ]
It is not as simple as finding out which of the two (certificate or PSK) was used for an incoming connection. Server and proxy trapper processes typically accept both certificates and PSKs and check later whether the connection meets the security restrictions. If the maximum protocol version is not restricted to TLS 1.2, then SSL_VERIFY_FAIL_IF_NO_PEER_CERT in
    if (NULL != ctx_all)
        SSL_CTX_set_verify(ctx_all, SSL_VERIFY_PEER | SSL_VERIFY_FAIL_IF_NO_PEER_CERT, NULL);
results in a certificate always being required, even for PSK connections (if TLS 1.3 is used), and SSL_accept() fails.
Comment by Andris Mednis [ 2018 Sep 27 ]
Inspired by asaveljevs' comment https://support.zabbix.com/browse/ZBXNEXT-1263?focusedCommentId=157843&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-157843, I tried to do a similar test for GnuTLS and OpenSSL 1.1.1.
Performance and ciphersuites of zabbix_get and zabbix_agentd (r85206)
No encryption:
Using certificates:
Using PSK:
Comment by Andris Mednis [ 2018 Oct 02 ]
Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-14856-30.
Comment by Andris Mednis [ 2018 Oct 16 ]
Fixed in versions:
Comment by Andris Mednis [ 2018 Oct 16 ]
Documented in
martins-v Thanks, slightly reworded the what's new entries. CLOSED.