Details

      Description

      daemon communication encryption: ssl

      encryption should be supported by all components :

      server (also node-node);
      proxy
      agent
      zabbix_get
      zabbix_sender
      java proxy (out of scope in the initial implementation)

      ZBXNEXT-17 - psk
      ZBXNEXT-1264 - kerberos

      specification : https://www.zabbix.org/wiki/Docs/specs/ZBXNEXT-1263

      1. subissue-67-r56413.patch
        5 kB
        Aleksandrs Saveljevs
      2. subissue-86-r56746.patch
        5 kB
        Aleksandrs Saveljevs
      1. agent-cpu-usage.png
        159 kB
      2. agent-cpu-usage-psk.png
        146 kB
      3. cannot-update.png
        22 kB
      4. encryption_column.png
        39 kB
      5. gnutls-to-openssl-leak.png
        19 kB
      6. proxy-config.png
        9 kB
      7. script-from-frontend.png
        14 kB
      8. server-cpu-usage.png
        158 kB
      9. server-cpu-usage-psk.png
        153 kB
      10. undefined-index.png
        26 kB

        Issue Links

          Activity

          Hide
          Sébastien added a comment -

          I aggree !

          It would allow monitoring of standalone vm for example without the hassle to setup vpn or ssh tunneling.

          Keep up the good work.

          Regards

          Show
          Sébastien added a comment - I aggree ! It would allow monitoring of standalone vm for example without the hassle to setup vpn or ssh tunneling. Keep up the good work. Regards
          Hide
          Raymond Kuiper added a comment -

          Please have a look at ZBXNEXT-2308. Implementing MQTT as a transport protocol will solve this problem and bring some other interesting functionality to Zabbix as well.

          Show
          Raymond Kuiper added a comment - Please have a look at ZBXNEXT-2308 . Implementing MQTT as a transport protocol will solve this problem and bring some other interesting functionality to Zabbix as well.
          Hide
          Andris Mednis added a comment -

          Raymond - you mean - rearchitect Zabbix for using message-queues in server/proxy/agent communications and find an MQTT library with built-in TLS support ? Well, it could be a way to achieve higher efficiency and throughput for Zabbix in future, but it seems to me a quite large change, larger than adding TLS support to existing architecture and protocol.

          Show
          Andris Mednis added a comment - Raymond - you mean - rearchitect Zabbix for using message-queues in server/proxy/agent communications and find an MQTT library with built-in TLS support ? Well, it could be a way to achieve higher efficiency and throughput for Zabbix in future, but it seems to me a quite large change, larger than adding TLS support to existing architecture and protocol.
          Hide
          Andris Mednis added a comment -

          Started to write specifications in https://www.zabbix.org/wiki/Docs/specs/ZBXNEXT-1263 .

          Show
          Andris Mednis added a comment - Started to write specifications in https://www.zabbix.org/wiki/Docs/specs/ZBXNEXT-1263 .
          Hide
          Marc added a comment -

          How about supporting LibreSSL's libtls API as well?
          Haven't worked with it yet, so can't really judge it atm. I just want to raise attention for it in case it wasn't considered yet but could be of interest.

          Show
          Marc added a comment - How about supporting LibreSSL's libtls API as well? Haven't worked with it yet, so can't really judge it atm. I just want to raise attention for it in case it wasn't considered yet but could be of interest.
          Hide
          Andris Mednis added a comment -

          Maybe with time but not now. Seems like LibreSSL is not yet packaged even for Debian, let alone other distributions. We have to take into account what is available in Red Hat Enterprise Linux when choosing libraries for Zabbix.

          Show
          Andris Mednis added a comment - Maybe with time but not now. Seems like LibreSSL is not yet packaged even for Debian, let alone other distributions. We have to take into account what is available in Red Hat Enterprise Linux when choosing libraries for Zabbix.
          Hide
          Andris Mednis added a comment -

          ZBXNEXT-2497 is also being solved as part of ZBXNEXT-1263.

          Show
          Andris Mednis added a comment - ZBXNEXT-2497 is also being solved as part of ZBXNEXT-1263 .
          Hide
          Andris Mednis added a comment - - edited

          (1) [F] In PHP frontend Configuration->Hosts-><some host>->Encryption the input field PSK allows to enter up to 255 characters. The limit should be increased to 512 characters and a validation rule should be added to allow only hex-digits 0-9A-Fa-f.
          The same change should be implemented for Administration->Proxies-><some proxy>->Encryption input field PSK.
          (Development branch is svn://svn.zabbix.com/branches/dev/ZBXNEXT-1263-1)

          Oleg Egorov RESOLVED IN r54045

          Ivo Kurzemnieks CLOSED

          Show
          Andris Mednis added a comment - - edited (1) [F] In PHP frontend Configuration->Hosts-><some host>->Encryption the input field PSK allows to enter up to 255 characters. The limit should be increased to 512 characters and a validation rule should be added to allow only hex-digits 0-9A-Fa-f. The same change should be implemented for Administration->Proxies-><some proxy>->Encryption input field PSK. (Development branch is svn://svn.zabbix.com/branches/dev/ZBXNEXT-1263-1) Oleg Egorov RESOLVED IN r54045 Ivo Kurzemnieks CLOSED
          Hide
          Andris Mednis added a comment - - edited

          (2) [F] It would be a good idea to add "Encryption" column in PHP frontend Configuration->Hosts to display configuration overview of outgoing ("To") and incoming ("From") connections for each host:

          Name   	            Interface	           Status    Availability       Encryption
          -------------  ...  ----------------  ...  -------   -----------------  ----------
          Zabbix server       127.0.0.1: 10070       Enabled   zbx snmp jmx ipmi  To  From
          

          "To" could be small icons displaying 6 states: "unencrypted", "PSK", "certificate", "certificate with issuer specified", "certificate with subject specified", "certificate with issuer and subject specified".
          "From" could be a combination of states: "unencrypted", "PSK", "certificate", "certificate with issuer specified", "certificate with subject specified".

          Oleg Egorov FIXED IN r54069

          Ivo Kurzemnieks For column details see (53).
          CLOSED.

          Show
          Andris Mednis added a comment - - edited (2) [F] It would be a good idea to add "Encryption" column in PHP frontend Configuration->Hosts to display configuration overview of outgoing ("To") and incoming ("From") connections for each host: Name Interface Status Availability Encryption ------------- ... ---------------- ... ------- ----------------- ---------- Zabbix server 127.0.0.1: 10070 Enabled zbx snmp jmx ipmi To From "To" could be small icons displaying 6 states: "unencrypted", "PSK", "certificate", "certificate with issuer specified", "certificate with subject specified", "certificate with issuer and subject specified". "From" could be a combination of states: "unencrypted", "PSK", "certificate", "certificate with issuer specified", "certificate with subject specified". Oleg Egorov FIXED IN r54069 Ivo Kurzemnieks For column details see (53) . CLOSED.
          Hide
          Andris Mednis added a comment - - edited

          (3) In PHP frontend when a new host is created and nothing is specified about encryption, a new record in "hosts" table is created with tls_connect=1 and tls_accept=1. It is correct. But when a new proxy is created without specifying encryption details, a new record in "hosts" table is created with tls_connect=1 and tls_accept=0 (it should be 1).

          Oleg Egorov FIXED IN r54081

          Ivo Kurzemnieks CLOSED.

          Show
          Andris Mednis added a comment - - edited (3) In PHP frontend when a new host is created and nothing is specified about encryption, a new record in "hosts" table is created with tls_connect=1 and tls_accept=1. It is correct. But when a new proxy is created without specifying encryption details, a new record in "hosts" table is created with tls_connect=1 and tls_accept=0 (it should be 1). Oleg Egorov FIXED IN r54081 Ivo Kurzemnieks CLOSED.
          Show
          Andris Mednis added a comment - - edited (3.1) Documented at: https://www.zabbix.com/documentation/3.0/manual/config/hosts/host?&#configuration https://www.zabbix.com/documentation/3.0/manual/distributed_monitoring/proxies?&#adding_proxies https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_server https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_proxy https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_agentd https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_agentd_win https://www.zabbix.com/documentation/3.0/manual/installation/install?&#configure_the_sources https://www.zabbix.com/documentation/3.0/manual/introduction/whatsnew300#performance_improvements1 https://www.zabbix.com/documentation/3.0/manual/encryption https://www.zabbix.com/documentation/3.0/manpages/zabbix_get https://www.zabbix.com/documentation/3.0/manpages/zabbix_sender "zabbix_get" exit code improvement documented in https://www.zabbix.com/documentation/3.0/manual/introduction/whatsnew300?&#commandline_utilities_improvements Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (4) Trying to compile the branch without any encryption libraries give the following errors:

          $ make
          active.c: In function ‘refresh_active_checks’:
          active.c:571:14: error: ‘CONFIG_TLS_SERVER_CERT_ISSUER’ undeclared (first use in this function)
             tls_arg1 = CONFIG_TLS_SERVER_CERT_ISSUER;
                        ^
          active.c:571:14: note: each undeclared identifier is reported only once for each function it appears in
          active.c:572:14: error: ‘CONFIG_TLS_SERVER_CERT_SUBJECT’ undeclared (first use in this function)
             tls_arg2 = CONFIG_TLS_SERVER_CERT_SUBJECT;
                        ^
          active.c:576:14: error: ‘CONFIG_TLS_PSK_IDENTITY’ undeclared (first use in this function)
             tls_arg1 = CONFIG_TLS_PSK_IDENTITY;
                        ^
          active.c: In function ‘send_buffer’:
          active.c:751:14: error: ‘CONFIG_TLS_SERVER_CERT_ISSUER’ undeclared (first use in this function)
             tls_arg1 = CONFIG_TLS_SERVER_CERT_ISSUER;
                        ^
          active.c:752:14: error: ‘CONFIG_TLS_SERVER_CERT_SUBJECT’ undeclared (first use in this function)
             tls_arg2 = CONFIG_TLS_SERVER_CERT_SUBJECT;
                        ^
          active.c:756:14: error: ‘CONFIG_TLS_PSK_IDENTITY’ undeclared (first use in this function)
             tls_arg1 = CONFIG_TLS_PSK_IDENTITY;
                        ^
          

          Andris Mednis RESOLVED in r54041.

          Aleksandrs Saveljevs It now compiles, but issues the following warnings:

          dbconfig.c:3813:19: warning: ‘__config_psk_hash’ defined but not used [-Wunused-function]
           static zbx_hash_t __config_psk_hash(const void *data)
                             ^
          dbconfig.c:3821:12: warning: ‘__config_psk_compare’ defined but not used [-Wunused-function]
           static int __config_psk_compare(const void *d1, const void *d2)
                      ^
          proxy.c: In function ‘get_active_proxy_id’:
          proxy.c:114:23: warning: unused variable ‘tls_accept’ [-Wunused-variable]
            unsigned int  status, tls_accept;
                                 ^
          proxy.c:114:15: warning: unused variable ‘status’ [-Wunused-variable]
            unsigned int  status, tls_accept;
                         ^
          proxy.c: In function ‘process_mass_data’:
          proxy.c:2207:22: warning: unused variable ‘attr’ [-Wunused-variable]
            zbx_tls_conn_attr_t attr;
                                ^
          active.c: In function ‘get_hostid_by_host’:
          active.c:94:24: warning: unused variable ‘attr’ [-Wunused-variable]
              zbx_tls_conn_attr_t attr;
                                  ^
          

          Aleksandrs Saveljevs REOPENED

          Andris Mednis RESOLVED in r54056.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (4) Trying to compile the branch without any encryption libraries give the following errors: $ make active.c: In function ‘refresh_active_checks’: active.c:571:14: error: ‘CONFIG_TLS_SERVER_CERT_ISSUER’ undeclared (first use in this function) tls_arg1 = CONFIG_TLS_SERVER_CERT_ISSUER; ^ active.c:571:14: note: each undeclared identifier is reported only once for each function it appears in active.c:572:14: error: ‘CONFIG_TLS_SERVER_CERT_SUBJECT’ undeclared (first use in this function) tls_arg2 = CONFIG_TLS_SERVER_CERT_SUBJECT; ^ active.c:576:14: error: ‘CONFIG_TLS_PSK_IDENTITY’ undeclared (first use in this function) tls_arg1 = CONFIG_TLS_PSK_IDENTITY; ^ active.c: In function ‘send_buffer’: active.c:751:14: error: ‘CONFIG_TLS_SERVER_CERT_ISSUER’ undeclared (first use in this function) tls_arg1 = CONFIG_TLS_SERVER_CERT_ISSUER; ^ active.c:752:14: error: ‘CONFIG_TLS_SERVER_CERT_SUBJECT’ undeclared (first use in this function) tls_arg2 = CONFIG_TLS_SERVER_CERT_SUBJECT; ^ active.c:756:14: error: ‘CONFIG_TLS_PSK_IDENTITY’ undeclared (first use in this function) tls_arg1 = CONFIG_TLS_PSK_IDENTITY; ^ Andris Mednis RESOLVED in r54041. Aleksandrs Saveljevs It now compiles, but issues the following warnings: dbconfig.c:3813:19: warning: ‘__config_psk_hash’ defined but not used [-Wunused-function] static zbx_hash_t __config_psk_hash(const void *data) ^ dbconfig.c:3821:12: warning: ‘__config_psk_compare’ defined but not used [-Wunused-function] static int __config_psk_compare(const void *d1, const void *d2) ^ proxy.c: In function ‘get_active_proxy_id’: proxy.c:114:23: warning: unused variable ‘tls_accept’ [-Wunused-variable] unsigned int status, tls_accept; ^ proxy.c:114:15: warning: unused variable ‘status’ [-Wunused-variable] unsigned int status, tls_accept; ^ proxy.c: In function ‘process_mass_data’: proxy.c:2207:22: warning: unused variable ‘attr’ [-Wunused-variable] zbx_tls_conn_attr_t attr; ^ active.c: In function ‘get_hostid_by_host’: active.c:94:24: warning: unused variable ‘attr’ [-Wunused-variable] zbx_tls_conn_attr_t attr; ^ Aleksandrs Saveljevs REOPENED Andris Mednis RESOLVED in r54056. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (5) The following diff is proposed to fix a typo in r54041:

          Index: src/zabbix_server/poller/checks_agent.c
          ===================================================================
          --- src/zabbix_server/poller/checks_agent.c     (revision 54052)
          +++ src/zabbix_server/poller/checks_agent.c     (working copy)
          @@ -76,8 +76,8 @@
                                  tls_arg2 = item->host.tls_psk;
                          }
           #else
          -               zbx_snprintf(buffer, sizeof(buffer), "A TLS connection is configured to be used with agent but support "
          -                               "for TLS was not not compiled into server or proxy.");
          +               zbx_snprintf(buffer, sizeof(buffer), "A TLS connection is configured to be used with agent but support"
          +                               " for TLS was not compiled into server or proxy.");
                          SET_MSG_RESULT(result, strdup(buffer));
                          ret = NETWORK_ERROR;
                          goto out;
          

          Note the space convention (it should be in the beginning of the second line) and the double "not".

          Aleksandrs Saveljevs While at it, it might be nice to create a function that returns which type of binary we are and use it here, similar to how it is done in dbupgrade.c:

          zabbix_log(LOG_LEVEL_CRIT, "The %s does not match Zabbix database."
          		" Current database version (mandatory/optional): UNKNOWN."
          		" Required mandatory version: %08d.",
          		ZBX_PROGRAM_TYPE_SERVER == program_type ? "server" : "proxy", required);
          

          Andris Mednis Thanks! RESOLVED in r54063.

          Aleksandrs Saveljevs Commit above ignored the function suggestion, but it turns out we already have such a function: get_program_type_string(). Made use of it in 54143. RESOLVED.

          Aleksandrs Saveljevs Please also see r54176, which adds missing program types to get_program_type_string() function. Still RESOLVED.

          Aleksandrs Saveljevs Compilation warnings and errors fixed in r54178, r54181, and r54185. Still RESOLVED.

          Andris Mednis Thanks! CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (5) The following diff is proposed to fix a typo in r54041: Index: src/zabbix_server/poller/checks_agent.c =================================================================== --- src/zabbix_server/poller/checks_agent.c (revision 54052) +++ src/zabbix_server/poller/checks_agent.c (working copy) @@ -76,8 +76,8 @@ tls_arg2 = item->host.tls_psk; } #else - zbx_snprintf(buffer, sizeof(buffer), "A TLS connection is configured to be used with agent but support " - "for TLS was not not compiled into server or proxy."); + zbx_snprintf(buffer, sizeof(buffer), "A TLS connection is configured to be used with agent but support" + " for TLS was not compiled into server or proxy."); SET_MSG_RESULT(result, strdup(buffer)); ret = NETWORK_ERROR; goto out; Note the space convention (it should be in the beginning of the second line) and the double "not". Aleksandrs Saveljevs While at it, it might be nice to create a function that returns which type of binary we are and use it here, similar to how it is done in dbupgrade.c: zabbix_log(LOG_LEVEL_CRIT, "The %s does not match Zabbix database." " Current database version (mandatory/optional): UNKNOWN." " Required mandatory version: %08d.", ZBX_PROGRAM_TYPE_SERVER == program_type ? "server" : "proxy", required); Andris Mednis Thanks! RESOLVED in r54063. Aleksandrs Saveljevs Commit above ignored the function suggestion, but it turns out we already have such a function: get_program_type_string(). Made use of it in 54143. RESOLVED. Aleksandrs Saveljevs Please also see r54176, which adds missing program types to get_program_type_string() function. Still RESOLVED. Aleksandrs Saveljevs Compilation warnings and errors fixed in r54178, r54181, and r54185. Still RESOLVED. Andris Mednis Thanks! CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment -

          (6) We currently use HAVE_POLARSSL in the source code. However, PolarSSL was recently rebranded as mbed TLS, so in the future we might wish to use HAVE_MBEDTLS instead. The recent blog post on mbed TLS 2.0 release at https://tls.mbed.org/tech-updates/blog/preparing-2.0-upgrade announces that all symbols will be renamed.

          This comment is mostly to document that this has been thought of, but we will probably not take any renaming action at this point, considering that all repositories currently have PolarSSL and that ./configure option (the only publicly visible thing) already refers to mbed TLS.

          Aleksandrs Saveljevs WON'T FIX for now.

          Show
          Aleksandrs Saveljevs added a comment - (6) We currently use HAVE_POLARSSL in the source code. However, PolarSSL was recently rebranded as mbed TLS, so in the future we might wish to use HAVE_MBEDTLS instead. The recent blog post on mbed TLS 2.0 release at https://tls.mbed.org/tech-updates/blog/preparing-2.0-upgrade announces that all symbols will be renamed. This comment is mostly to document that this has been thought of, but we will probably not take any renaming action at this point, considering that all repositories currently have PolarSSL and that ./configure option (the only publicly visible thing) already refers to mbed TLS. Aleksandrs Saveljevs WON'T FIX for now.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          Useful reading:

          Show
          Aleksandrs Saveljevs added a comment - - edited Useful reading: Workings of TLS http://www.moserware.com/2009/06/first-few-milliseconds-of-https.html (The First Few Milliseconds of an HTTPS Connection) https://technet.microsoft.com/en-us/library/cc781476(v=ws.10).aspx (Overview of SSL/TLS Encryption, and further pages) https://blog.cloudflare.com/keyless-ssl-the-nitty-gritty-technical-details/ (Keyless SSL: The Nitty Gritty Technical Details, and the rest of the blog) http://security.stackexchange.com/questions/20803/how-does-ssl-tls-work http://chimera.labs.oreilly.com/books/1230000000545/ch04.html (Chapter 4. Transport Layer Security (TLS)) Standards https://tools.ietf.org/html/rfc5246 (The Transport Layer Security (TLS) Protocol Version 1.2) https://tools.ietf.org/html/rfc4279 (Pre-Shared Key Ciphersuites for Transport Layer Security (TLS)) http://tools.ietf.org/rfc/rfc6460.txt (Suite B Profile for Transport Layer Security (TLS). January 2012) https://tools.ietf.org/html/rfc7525 (Recommendations for Secure Use of Transport Layer Security (TLS)) Recommendations, best practice http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-90Ar1.pdf (NIST Special Publication 800-90A Revision 1. Recommendation for Random Number Generation Using Deterministic Random Bit Generators) - for developers, how to make personalization string. https://www.feistyduck.com/library/openssl-cookbook/ (Ivan Ristić. OpenSSL Cookbook) https://www.ssllabs.com/downloads/SSL_TLS_Deployment_Best_Practices.pdf (Ivan Ristić. SSL/TLS Deployment Best Practices) Security method comparison https://community.qualys.com/blogs/securitylabs/2013/06/25/ssl-labs-deploying-forward-secrecy (a bit on forward secrecy and performance of DHE and ECDHE) http://vincent.bernat.im/en/blog/2011-ssl-perfect-forward-secrecy.html (on forward secrecy and Diffie-Hellman key exchange, with and without elliptic curves) Programming with OpenSSL https://mail-archive.com/openssl-dev@openssl.org/msg26634.html (man page for X509_load_crl_file() and few other functions) https://www.mail-archive.com/openssl-dev@openssl.org/msg18431.html (short example of calling X509_load_crl_file()) http://openssl.6102.n7.nabble.com/missing-steps-with-CRLs-td18932.html (example of loading and checking CRL) http://fixunix.com/openssl/153719-re-loading-crl-client.html (example of loading CRL)
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (7) There seem to be many places where we zbx_tcp_connect() without encryption (e.g., web.page.get[] or Telnet checks). Maybe we should introduce zbx_tcp_connect_ext() for cases where encryption is needed?

          Andris Mednis Currently zbx_tcp_connect() is called in 17 places, approximately half of them (9) are without encryption and it is clearly visible in the code.

          Aleksandrs Saveljevs It was decided that there is currently no need for zbx_tcp_connect_ext(), because zbx_tcp_connect() is only used in just 17 places and passing explicit ZBX_TCP_SEC_UNENCRYPTED is a nice sign of warning. WON'T FIX.

          Show
          Aleksandrs Saveljevs added a comment - - edited (7) There seem to be many places where we zbx_tcp_connect() without encryption (e.g., web.page.get[] or Telnet checks). Maybe we should introduce zbx_tcp_connect_ext() for cases where encryption is needed? Andris Mednis Currently zbx_tcp_connect() is called in 17 places, approximately half of them (9) are without encryption and it is clearly visible in the code. Aleksandrs Saveljevs It was decided that there is currently no need for zbx_tcp_connect_ext(), because zbx_tcp_connect() is only used in just 17 places and passing explicit ZBX_TCP_SEC_UNENCRYPTED is a nice sign of warning. WON'T FIX.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (8) This subissue is to document which library versions we support and why, and will be gradually updated. The results should then be copied to the official documentation.

          mbed TLS (version 1.3.9 and above)

          Specification says the following:

          PolarSSL (recently renamed to mbed TLS)(from versions 1.3.x (earlier versions do not support pre-shared keys))

          However, trying to build with PolarSSL 1.2.14 (the last 1.2.x) fails with the following error:

          tls.c:34:27: fatal error: polarssl/oid.h: No such file or directory
           # include <polarssl/oid.h>
                                     ^
          compilation terminated.
          

          Therefore, it is not just about pre-shared keys.

          Minimum supported version is yet to be found.

          Andris Mednis Zabbix does not compile with PolarSSL 1.3.7 and earlier. Compilation with 1.3.8 succeeds but some built-in tests fail. Let's set minimum version to PolarSSL 1.3.9 (two built-in tests segfault, but Zabbix works and this version is available on Debian stable).

          Aleksandrs Saveljevs mbed TLS CLOSED.

          GnuTLS (version 3.1.18 and above)

          Aleksandrs Saveljevs Zabbix does not compile with GnuTLS 3.0.32:

          tls.c: In function ‘zbx_tls_connect’:
          tls.c:3359:52: error: ‘GNUTLS_NO_EXTENSIONS’ undeclared (first use in this function)
            if (GNUTLS_E_SUCCESS != (res = gnutls_init(&s->tls_ctx, GNUTLS_CLIENT | GNUTLS_NO_EXTENSIONS)))
          

          Aleksandrs Saveljevs Same error with GnuTLS 3.1.2, but compiling with GnuTLS 3.1.3 fails with the following:

          ../../src/libs/zbxcrypto/libzbxcrypto.a(tls.o): In function `zbx_verify_peer_cert':
          /home/zabbix/ZBXNEXT-1263-1/src/libs/zbxcrypto/tls.c:1979: undefined reference to `gnutls_certificate_verification_status_print'
          collect2: error: ld returned 1 exit status
          

          Aleksandrs Saveljevs GnuTLS 3.1.4 seems to be the first version that current development branch (r54190) compiles with.

          Andris Mednis RHEL 7 uses v.3.1.18, Debian stable uses 3.3.8. Documented in specifications that minimum GnuTLS version is 3.1.18.

          Aleksandrs Saveljevs Version 3.1.18 seems to be the lowest 3.x version used in mainstream Linux distributions. FreeBSD 10.1 uses 3.2.16 (see http://svnweb.freebsd.org/ports/tags/RELEASE_10_1_0/security/gnutls/distinfo?view=markup), FreeBSD 8 and FreeBSD 9 use 2.x.

          Aleksandrs Saveljevs GnuTLS CLOSED.

          OpenSSL (version 1.0.1 and above)

          Aleksandrs Saveljevs Apart from the problems discussed in (12), we compile successfully with OpenSSL 1.0.1, but fail with OpenSSL 1.0.0r:

          ../../src/libs/zbxcrypto/libzbxcrypto.a(tls.o): In function `zbx_tls_init_child':
          /home/zabbix/ZBXNEXT-1263-1/src/libs/zbxcrypto/tls.c:2731: undefined reference to `TLSv1_2_client_method'
          /home/zabbixZBXNEXT-1263-1/src/libs/zbxcrypto/tls.c:2733: undefined reference to `TLSv1_2_method'
          

          Apparently, OpenSSL 1.0.0r does not support TLS 1.2. So OpenSSL 1.0.1 is the likely candidate for the minimum version.

          Andris Mednis Documented in specifications that minimum OpenSSL version is 1.0.1 (available on RHEL 7, Centos 6,7, Ubuntu LTS 12.04).

          Aleksandrs Saveljevs OpenSSL CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (8) This subissue is to document which library versions we support and why, and will be gradually updated. The results should then be copied to the official documentation. mbed TLS (version 1.3.9 and above) Specification says the following: PolarSSL (recently renamed to mbed TLS)(from versions 1.3.x (earlier versions do not support pre-shared keys)) However, trying to build with PolarSSL 1.2.14 (the last 1.2.x) fails with the following error: tls.c:34:27: fatal error: polarssl/oid.h: No such file or directory # include <polarssl/oid.h> ^ compilation terminated. Therefore, it is not just about pre-shared keys. Minimum supported version is yet to be found. Andris Mednis Zabbix does not compile with PolarSSL 1.3.7 and earlier. Compilation with 1.3.8 succeeds but some built-in tests fail. Let's set minimum version to PolarSSL 1.3.9 (two built-in tests segfault, but Zabbix works and this version is available on Debian stable). Aleksandrs Saveljevs mbed TLS CLOSED. GnuTLS (version 3.1.18 and above) Aleksandrs Saveljevs Zabbix does not compile with GnuTLS 3.0.32: tls.c: In function ‘zbx_tls_connect’: tls.c:3359:52: error: ‘GNUTLS_NO_EXTENSIONS’ undeclared (first use in this function) if (GNUTLS_E_SUCCESS != (res = gnutls_init(&s->tls_ctx, GNUTLS_CLIENT | GNUTLS_NO_EXTENSIONS))) Aleksandrs Saveljevs Same error with GnuTLS 3.1.2, but compiling with GnuTLS 3.1.3 fails with the following: ../../src/libs/zbxcrypto/libzbxcrypto.a(tls.o): In function `zbx_verify_peer_cert': /home/zabbix/ZBXNEXT-1263-1/src/libs/zbxcrypto/tls.c:1979: undefined reference to `gnutls_certificate_verification_status_print' collect2: error: ld returned 1 exit status Aleksandrs Saveljevs GnuTLS 3.1.4 seems to be the first version that current development branch (r54190) compiles with. Andris Mednis RHEL 7 uses v.3.1.18, Debian stable uses 3.3.8. Documented in specifications that minimum GnuTLS version is 3.1.18. Aleksandrs Saveljevs Version 3.1.18 seems to be the lowest 3.x version used in mainstream Linux distributions. FreeBSD 10.1 uses 3.2.16 (see http://svnweb.freebsd.org/ports/tags/RELEASE_10_1_0/security/gnutls/distinfo?view=markup ), FreeBSD 8 and FreeBSD 9 use 2.x. Aleksandrs Saveljevs GnuTLS CLOSED. OpenSSL (version 1.0.1 and above) Aleksandrs Saveljevs Apart from the problems discussed in (12), we compile successfully with OpenSSL 1.0.1, but fail with OpenSSL 1.0.0r: ../../src/libs/zbxcrypto/libzbxcrypto.a(tls.o): In function `zbx_tls_init_child': /home/zabbix/ZBXNEXT-1263-1/src/libs/zbxcrypto/tls.c:2731: undefined reference to `TLSv1_2_client_method' /home/zabbixZBXNEXT-1263-1/src/libs/zbxcrypto/tls.c:2733: undefined reference to `TLSv1_2_method' Apparently, OpenSSL 1.0.0r does not support TLS 1.2. So OpenSSL 1.0.1 is the likely candidate for the minimum version. Andris Mednis Documented in specifications that minimum OpenSSL version is 1.0.1 (available on RHEL 7, Centos 6,7, Ubuntu LTS 12.04). Aleksandrs Saveljevs OpenSSL CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (9) It should be considered whether we wish to check for PolarSSL version during ./configure (like we do for libssh, for instance) or just fail during compilation. If we decide on the former, then include/polarssl/version.h is available at least since PolarSSL 1.0.0, but it is not present in PolarSSL 0.10.0.

          Aleksandrs Saveljevs Same question for GnuTLS and OpenSSL.

          Andris Mednis Let's check for mbed TLS (PolarSSL) min. version 1.3.9, GnuTLS min. version 3.1.18, OpenSSL min. version 1.0.1.

          Andris Mednis Min.version check for mbed TLS RESOLVED in r54685. The latest mbed TLS is 2.0.0 but it won't be initially supported in Zabbix and detected by configuration scripts.

          Andris Mednis Min.version check for GnuTLS RESOLVED in r54692.

          Andris Mednis Min.version check for OpenSSL RESOLVED in r54736.

          Aleksandrs Saveljevs Looks good, but scripts seem to use double parentheses around variables:

          if test $((found_gnutls_version_major)) -gt $((minimal_gnutls_version_major)); then ...
          

          Could you please describe the reason for that? Does it work in all shells (e.g., csh)?

          Andris Mednis Double parentheses are used to convert version numbers to integers used in arithmetic evaluation - to compare version numbers as numbers, not as strings. It should work in any POSIX-compatible shell.

          Aleksandrs Saveljevs It does not work in the real /bin/sh (in Debian, /bin/sh is a symlink to /bin/dash, but a real /bin/sh is available, for instance, on our Solaris 10).

          Also, what is the use of converting a variable to a number before passing it to "test"? Suppose that $found_gnutls_version_major could not be obtained properly and contains a value "abc". The result of $((found_gnutls_version_major)) is then 0. If, however, we give $found_gnutls_version_major to "test", then "test" will complain, because -gt operator expects integers:

          $ found_gnutls_version_major='abc'
          $ test $((found_gnutls_version_major)) -eq 0 && echo YES
          YES
          $ test $found_gnutls_version_major -eq 0 && echo YES                            
          bash: test: abc: integer expression expected
          

          REOPENED

          Aleksandrs Saveljevs To answer my question above: $((variable)) is meant to convert a hexadecimal number to decimal.

          Trying it out on our Solaris 10, it seems to magically work, but it does not work on our NetBSD 5.0, even though bash seems to be used:

          $ ./configure --enable-agent --with-openssl=...
          ...
          checking for OpenSSL support... ./configure[10667]: 0x1000204: bad number `0x1000204'
          ...
          $ echo $SHELL
          /usr/pkg/bin/bash
          

          In m4/libopenssl.m4 script, we rely on awk. Maybe we could use awk to convert a hexadecimal number to decimal, like so:

          $ echo 0x1000204 | awk '{ printf "%d\n", $1 }'                                              
          16777732
          

          Aleksandrs Saveljevs The above suggestion works on NetBSD 5.0 with the following awk version:

          $ awk --version
          awk version 20070501
          

          However, it does not work on Linux with GNU awk:

          $ echo 0x1000204 | awk '{ printf "%d\n", $1 }'  
          0
          $ awk --version
          GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU MP 6.0.0)
          

          This is because $1 is a variable that contains a string that contains a hexadecimal number, not a number literal. So we can do it differently:

          $ version=0x1000204
          $ awk "BEGIN { printf \"%d\n\", $version }"
          16777732
          

          Aleksandrs Saveljevs Unfortunately, the suggestion above does not work on NetBSD - exactly opposite of the previous suggestion. The previous suggestion is made workable on Linux by using --non-decimal-data, but this is not portable either.

          Page https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Shell-Substitutions.html describes $((...)) as not portable.

          Aleksandrs Saveljevs It was decided to rewrite the check using good old "expr" in a way that is also used elsewhere. RESOLVED in r55503 and r55511.

          Andris Mednis Thanks for solution! CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (9) It should be considered whether we wish to check for PolarSSL version during ./configure (like we do for libssh, for instance) or just fail during compilation. If we decide on the former, then include/polarssl/version.h is available at least since PolarSSL 1.0.0, but it is not present in PolarSSL 0.10.0. Aleksandrs Saveljevs Same question for GnuTLS and OpenSSL. Andris Mednis Let's check for mbed TLS (PolarSSL) min. version 1.3.9, GnuTLS min. version 3.1.18, OpenSSL min. version 1.0.1. Andris Mednis Min.version check for mbed TLS RESOLVED in r54685. The latest mbed TLS is 2.0.0 but it won't be initially supported in Zabbix and detected by configuration scripts. Andris Mednis Min.version check for GnuTLS RESOLVED in r54692. Andris Mednis Min.version check for OpenSSL RESOLVED in r54736. Aleksandrs Saveljevs Looks good, but scripts seem to use double parentheses around variables: if test $((found_gnutls_version_major)) -gt $((minimal_gnutls_version_major)); then ... Could you please describe the reason for that? Does it work in all shells (e.g., csh)? Andris Mednis Double parentheses are used to convert version numbers to integers used in arithmetic evaluation - to compare version numbers as numbers, not as strings. It should work in any POSIX-compatible shell. Aleksandrs Saveljevs It does not work in the real /bin/sh (in Debian, /bin/sh is a symlink to /bin/dash, but a real /bin/sh is available, for instance, on our Solaris 10). Also, what is the use of converting a variable to a number before passing it to "test"? Suppose that $found_gnutls_version_major could not be obtained properly and contains a value "abc". The result of $((found_gnutls_version_major)) is then 0. If, however, we give $found_gnutls_version_major to "test", then "test" will complain, because -gt operator expects integers: $ found_gnutls_version_major='abc' $ test $((found_gnutls_version_major)) -eq 0 && echo YES YES $ test $found_gnutls_version_major -eq 0 && echo YES bash: test: abc: integer expression expected REOPENED Aleksandrs Saveljevs To answer my question above: $((variable)) is meant to convert a hexadecimal number to decimal. Trying it out on our Solaris 10, it seems to magically work, but it does not work on our NetBSD 5.0, even though bash seems to be used: $ ./configure --enable-agent --with-openssl=... ... checking for OpenSSL support... ./configure[10667]: 0x1000204: bad number `0x1000204' ... $ echo $SHELL /usr/pkg/bin/bash In m4/libopenssl.m4 script, we rely on awk. Maybe we could use awk to convert a hexadecimal number to decimal, like so: $ echo 0x1000204 | awk '{ printf "%d\n", $1 }' 16777732 Aleksandrs Saveljevs The above suggestion works on NetBSD 5.0 with the following awk version: $ awk --version awk version 20070501 However, it does not work on Linux with GNU awk: $ echo 0x1000204 | awk '{ printf "%d\n", $1 }' 0 $ awk --version GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU MP 6.0.0) This is because $1 is a variable that contains a string that contains a hexadecimal number, not a number literal. So we can do it differently: $ version=0x1000204 $ awk "BEGIN { printf \"%d\n\", $version }" 16777732 Aleksandrs Saveljevs Unfortunately, the suggestion above does not work on NetBSD - exactly opposite of the previous suggestion. The previous suggestion is made workable on Linux by using --non-decimal-data, but this is not portable either. Page https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Shell-Substitutions.html describes $((...)) as not portable. Aleksandrs Saveljevs It was decided to rewrite the check using good old "expr" in a way that is also used elsewhere. RESOLVED in r55503 and r55511. Andris Mednis Thanks for solution! CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (10) Compiling with PolarSSL 1.3.0 (the minimum supported version as per current specification) fails with the following:

          tls.c: In function ‘zbx_tls_cert_error_msg’:
          tls.c:322:5: error: ‘BADCERT_FUTURE’ undeclared (first use in this function)
               BADCERT_FUTURE, BADCRL_FUTURE, 0 };
               ^
          tls.c:322:5: note: each undeclared identifier is reported only once for each function it appears in
          tls.c:322:21: error: ‘BADCRL_FUTURE’ undeclared (first use in this function)
               BADCERT_FUTURE, BADCRL_FUTURE, 0 };
                               ^
          tls.c: In function ‘zbx_is_ciphersuite_cert’:
          tls.c:764:4: error: ‘POLARSSL_KEY_EXCHANGE_ECDHE_PSK’ undeclared (first use in this function)
              POLARSSL_KEY_EXCHANGE_ECDHE_PSK != info->key_exchange &&
              ^
          tls.c:764:36: warning: comparison between pointer and integer
              POLARSSL_KEY_EXCHANGE_ECDHE_PSK != info->key_exchange &&
                                              ^
          tls.c: In function ‘zbx_is_ciphersuite_psk’:
          tls.c:798:4: error: ‘POLARSSL_KEY_EXCHANGE_ECDHE_PSK’ undeclared (first use in this function)
              POLARSSL_KEY_EXCHANGE_ECDHE_PSK == info->key_exchange ||
              ^
          tls.c:798:36: warning: comparison between pointer and integer
              POLARSSL_KEY_EXCHANGE_ECDHE_PSK == info->key_exchange ||
                                              ^
          tls.c: In function ‘zbx_tls_accept’:
          tls.c:3971:4: error: ‘POLARSSL_KEY_EXCHANGE_ECDHE_PSK’ undeclared (first use in this function)
              POLARSSL_KEY_EXCHANGE_ECDHE_PSK == info->key_exchange ||
              ^
          tls.c:3971:36: warning: comparison between pointer and integer
              POLARSSL_KEY_EXCHANGE_ECDHE_PSK == info->key_exchange ||
                                              ^
          make[3]: *** [tls.o] Error 1
          make[2]: *** [all-recursive] Error 1
          make[1]: *** [all-recursive] Error 1
          make: *** [all-recursive] Error 1
          

          Andris Mednis As proposed in (8) let's set minimum version to PolarSSL 1.3.9.

          Aleksandrs Saveljevs WON'T FIX

          Show
          Aleksandrs Saveljevs added a comment - - edited (10) Compiling with PolarSSL 1.3.0 (the minimum supported version as per current specification) fails with the following: tls.c: In function ‘zbx_tls_cert_error_msg’: tls.c:322:5: error: ‘BADCERT_FUTURE’ undeclared (first use in this function) BADCERT_FUTURE, BADCRL_FUTURE, 0 }; ^ tls.c:322:5: note: each undeclared identifier is reported only once for each function it appears in tls.c:322:21: error: ‘BADCRL_FUTURE’ undeclared (first use in this function) BADCERT_FUTURE, BADCRL_FUTURE, 0 }; ^ tls.c: In function ‘zbx_is_ciphersuite_cert’: tls.c:764:4: error: ‘POLARSSL_KEY_EXCHANGE_ECDHE_PSK’ undeclared (first use in this function) POLARSSL_KEY_EXCHANGE_ECDHE_PSK != info->key_exchange && ^ tls.c:764:36: warning: comparison between pointer and integer POLARSSL_KEY_EXCHANGE_ECDHE_PSK != info->key_exchange && ^ tls.c: In function ‘zbx_is_ciphersuite_psk’: tls.c:798:4: error: ‘POLARSSL_KEY_EXCHANGE_ECDHE_PSK’ undeclared (first use in this function) POLARSSL_KEY_EXCHANGE_ECDHE_PSK == info->key_exchange || ^ tls.c:798:36: warning: comparison between pointer and integer POLARSSL_KEY_EXCHANGE_ECDHE_PSK == info->key_exchange || ^ tls.c: In function ‘zbx_tls_accept’: tls.c:3971:4: error: ‘POLARSSL_KEY_EXCHANGE_ECDHE_PSK’ undeclared (first use in this function) POLARSSL_KEY_EXCHANGE_ECDHE_PSK == info->key_exchange || ^ tls.c:3971:36: warning: comparison between pointer and integer POLARSSL_KEY_EXCHANGE_ECDHE_PSK == info->key_exchange || ^ make[3]: *** [tls.o] Error 1 make[2]: *** [all-recursive] Error 1 make[1]: *** [all-recursive] Error 1 make: *** [all-recursive] Error 1 Andris Mednis As proposed in (8) let's set minimum version to PolarSSL 1.3.9. Aleksandrs Saveljevs WON'T FIX
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (11) Server did not configure with GnuTLS, because -I directive had "gnutls" at the end:

          checking for GnuTLS support... no
          configure: error: GnuTLS library libgnutls not found
          ...
          configure:10395: checking for GnuTLS support
          configure:10453: gcc -o conftest -g -O2  -I/usr/include/mysql -DBIG_JOINS=1  -fno-strict-aliasing    -g -DNDEBUG          -I/home/zabbix/software/gnutls-3.3.15-install/include/gnutls  -rdynamic  -L/home/zabbix/software/gnutls-3.3.15-install/lib conftest.c -lm -ldl  -lresolv -lgnutls >&5
          conftest.c:121:27: fatal error: gnutls/gnutls.h: No such file or directory
           #include <gnutls/gnutls.h>
                                     ^
          compilation terminated.
          

          Aleksandrs Saveljevs Fixed it in r54190. RESOLVED.

          Aleksandrs Saveljevs A similar problem was with OpenSSL configuration. Fixed in r54195. RESOLVED.

          Aleksandrs Saveljevs While at it, fixed formatting in ./configure --help in r54197 and renamed libpolarssl.m4 to libmbedtls.m4 in r54198. Still RESOLVED.

          Aleksandrs Saveljevs Renamed PolarSSL to mbed TLS in Windows Makefiles, too, in r54205 and r54211. Still RESOLVED.

          Andris Mednis Thanks! Reviewed and accepted r54190, r54195, r54197, r54198, r54205, r54211.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (11) Server did not configure with GnuTLS, because -I directive had "gnutls" at the end: checking for GnuTLS support... no configure: error: GnuTLS library libgnutls not found ... configure:10395: checking for GnuTLS support configure:10453: gcc -o conftest -g -O2 -I/usr/include/mysql -DBIG_JOINS=1 -fno-strict-aliasing -g -DNDEBUG -I/home/zabbix/software/gnutls-3.3.15-install/include/gnutls -rdynamic -L/home/zabbix/software/gnutls-3.3.15-install/lib conftest.c -lm -ldl -lresolv -lgnutls >&5 conftest.c:121:27: fatal error: gnutls/gnutls.h: No such file or directory #include <gnutls/gnutls.h> ^ compilation terminated. Aleksandrs Saveljevs Fixed it in r54190. RESOLVED. Aleksandrs Saveljevs A similar problem was with OpenSSL configuration. Fixed in r54195. RESOLVED. Aleksandrs Saveljevs While at it, fixed formatting in ./configure --help in r54197 and renamed libpolarssl.m4 to libmbedtls.m4 in r54198. Still RESOLVED. Aleksandrs Saveljevs Renamed PolarSSL to mbed TLS in Windows Makefiles, too, in r54205 and r54211. Still RESOLVED. Andris Mednis Thanks! Reviewed and accepted r54190, r54195, r54197, r54198, r54205, r54211. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (12) The check for libcrypto in m4/libopenssl.m4 checks for function CRYPTO_memcmp(). However, according to http://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=9c00a950604aca819cee977f1dcb4b45f2af3aa6 , this function only appeared in 2013, just 2 years ago. Maybe we should use an older function instead, considering that we do not use CRYPTO_memcmp() ourselves?

          This makes it work with OpenSSL 1.0.1e (released on February 11, 2013), but fail with OpenSSL 1.0.1c (released on May 10, 2012). Apart from that, we compile successfully with version 1.0.1c. Considering the time between these releases (see https://www.openssl.org/source/old/1.0.1/), version 1.0.1c was stable.

          According to http://pkgs.org/download/openssl version 1.0.1c is only present in Slackware 14.0, but Ubuntu 12.04 LTS has version 1.0.1.

          In any case, there is a bug. In libopenssl.m4 we do LIBOPENSSL_TRY_LINK() and LIBCRYPTO_TRY_LINK() one after another. If at least one succeeds, we consider OpenSSL to be found. In case of OpenSSL 1.0.1c, the first succeeds, but the second fails. In total, we consider it a success, which is wrong.

          Andris Mednis RESOLVED in r54770.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (12) The check for libcrypto in m4/libopenssl.m4 checks for function CRYPTO_memcmp(). However, according to http://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=9c00a950604aca819cee977f1dcb4b45f2af3aa6 , this function only appeared in 2013, just 2 years ago. Maybe we should use an older function instead, considering that we do not use CRYPTO_memcmp() ourselves? This makes it work with OpenSSL 1.0.1e (released on February 11, 2013), but fail with OpenSSL 1.0.1c (released on May 10, 2012). Apart from that, we compile successfully with version 1.0.1c. Considering the time between these releases (see https://www.openssl.org/source/old/1.0.1/ ), version 1.0.1c was stable. According to http://pkgs.org/download/openssl version 1.0.1c is only present in Slackware 14.0, but Ubuntu 12.04 LTS has version 1.0.1. In any case, there is a bug. In libopenssl.m4 we do LIBOPENSSL_TRY_LINK() and LIBCRYPTO_TRY_LINK() one after another. If at least one succeeds, we consider OpenSSL to be found. In case of OpenSSL 1.0.1c, the first succeeds, but the second fails. In total, we consider it a success, which is wrong. Andris Mednis RESOLVED in r54770. Aleksandrs Saveljevs CLOSED
          Hide
          Oleg Egorov added a comment -

          FINISHED frontend side in
          svn://svn.zabbix.com/branches/dev/ZBXNEXT-1263-1

          Show
          Oleg Egorov added a comment - FINISHED frontend side in svn://svn.zabbix.com/branches/dev/ZBXNEXT-1263-1
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (13) Formatting for default values in configuration files was a bit wrong:

          -# Default: unencrypted
          -# TLSConnect=
          +# Default:
          +# TLSConnect=unencrypted
          

          Aleksandrs Saveljevs Fixed that in r54200. Please take a look. RESOLVED.

          Andris Mednis Thanks! Reviewed and accepted.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (13) Formatting for default values in configuration files was a bit wrong: -# Default: unencrypted -# TLSConnect= +# Default: +# TLSConnect=unencrypted Aleksandrs Saveljevs Fixed that in r54200. Please take a look. RESOLVED. Andris Mednis Thanks! Reviewed and accepted. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (14) Regarding the Windows Makefiles, running "nmake /f Makefile_sender_dll" used to copy three files: *.dll, *h, *.lib. Now it only copies *.dll and *.h files.

          Aleksandrs Saveljevs Similarly, changes to src\zabbix_sender\win32\zabbix_sender.c are no longer detected - zabbix_sender.dll and zabbix_sender.lib are no longer rebuilt if that file changes.

          Aleksandrs Saveljevs Running that same command always performs "del ..\..\..\bin\win32\dev\zabbix_sender.exp 2>nul". It might be that previously the behavior was the same, but at least it was hidden.

          Aleksandrs Saveljevs Also, it was noticed that the following options are now gone when building the agent:

          /DEBUG
          /DELAYLOAD:wevtapi.dll
          /OPT:REF
          

          Aleksandrs Saveljevs If it was intentional, could you please clarify the reason?

          dimir Nope, wasn't planning anything like that, it must be an error when copying/merging changes. Feel free to revert those.

          Andris Zeila Also minor detail that could be fixed with encryption changes - we are defining WIN32 define for agent (win32, win64), get (win32) and sender (win32). This define is used to assemble application title. But there is already built-in _WIN32 define that could be used instead.

          Andris Zeila RESOLVED in r55395, r55398
          Inspired by what dimir had done with makefiles I took a step further and removed makefiles for 64 bit targets. Now the makefiles will detect the build environment (32bit ir 64bit) and use appropriate settings.

          Aleksandrs Saveljevs Looks better, thank you. However, we now have the following in Makefile_targets.inc:

          clean: $(POSTCLEAN)
          	...
          	echo VERSION $(_NMAKE_VER)
          	...
          

          Is that echo necessary? It looks like this in the output:

                  ...
                  echo VERSION 9.00.30729.01
          VERSION 9.00.30729.01�
                  ...
          

          Aleksandrs Saveljevs Also, after we build the agent, there is build\win32\project\zabbix_agentd.pdb and bin\win32\zabbix_agentd.pdb? Maybe we should have only one of them or add to svn:ignore?

          "nmake get_clean" tries to delete zabbix_get.pdb in the same two locations, but the *.pdb files are not generated for zabbix_get.

          Aleksandrs Saveljevs When we detect architecture at the beginning of each Makefile (can we have it one place only?), we define either ARCH=32 or ARCH=64. What are they used for?

          Aleksandrs Saveljevs Please also verify my whitespace and typo fixes in r55587. REOPENED.

          Andris Zeila The version echo was left from debugging. The ARCH variables are not used anymore. So it all can be safely removed.

          The pdb file is automatically generated during compilation process in work directory and then the final version is created during link process in the target directory. Maybe the best option to ignore the one created during compilation process.

          RESOLVED in r55703, r55706, r55707

          Aleksandrs Saveljevs Makefiles for zabbix_sender.exe and zabbix_sender.dll seem to use the same /Fd parameter, which is not perfectly correct. REOPENED.

          Andris Zeila Added extension to intermediate idb/pdb files so we have separate files for sender and sender_dll. RESOLVED r55947

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (14) Regarding the Windows Makefiles, running "nmake /f Makefile_sender_dll" used to copy three files: *.dll, *h, *.lib. Now it only copies *.dll and *.h files. Aleksandrs Saveljevs Similarly, changes to src\zabbix_sender\win32\zabbix_sender.c are no longer detected - zabbix_sender.dll and zabbix_sender.lib are no longer rebuilt if that file changes. Aleksandrs Saveljevs Running that same command always performs "del ..\..\..\bin\win32\dev\zabbix_sender.exp 2>nul". It might be that previously the behavior was the same, but at least it was hidden. Aleksandrs Saveljevs Also, it was noticed that the following options are now gone when building the agent: /DEBUG /DELAYLOAD:wevtapi.dll /OPT:REF Aleksandrs Saveljevs If it was intentional, could you please clarify the reason? dimir Nope, wasn't planning anything like that, it must be an error when copying/merging changes. Feel free to revert those. Andris Zeila Also minor detail that could be fixed with encryption changes - we are defining WIN32 define for agent (win32, win64), get (win32) and sender (win32). This define is used to assemble application title. But there is already built-in _WIN32 define that could be used instead. Andris Zeila RESOLVED in r55395, r55398 Inspired by what dimir had done with makefiles I took a step further and removed makefiles for 64 bit targets. Now the makefiles will detect the build environment (32bit ir 64bit) and use appropriate settings. Aleksandrs Saveljevs Looks better, thank you. However, we now have the following in Makefile_targets.inc: clean: $(POSTCLEAN) ... echo VERSION $(_NMAKE_VER) ... Is that echo necessary? It looks like this in the output: ... echo VERSION 9.00.30729.01 VERSION 9.00.30729.01� ... Aleksandrs Saveljevs Also, after we build the agent, there is build\win32\project\zabbix_agentd.pdb and bin\win32\zabbix_agentd.pdb? Maybe we should have only one of them or add to svn:ignore? "nmake get_clean" tries to delete zabbix_get.pdb in the same two locations, but the *.pdb files are not generated for zabbix_get. Aleksandrs Saveljevs When we detect architecture at the beginning of each Makefile (can we have it one place only?), we define either ARCH=32 or ARCH=64. What are they used for? Aleksandrs Saveljevs Please also verify my whitespace and typo fixes in r55587. REOPENED. Andris Zeila The version echo was left from debugging. The ARCH variables are not used anymore. So it all can be safely removed. The pdb file is automatically generated during compilation process in work directory and then the final version is created during link process in the target directory. Maybe the best option to ignore the one created during compilation process. RESOLVED in r55703, r55706, r55707 Aleksandrs Saveljevs Makefiles for zabbix_sender.exe and zabbix_sender.dll seem to use the same /Fd parameter, which is not perfectly correct. REOPENED. Andris Zeila Added extension to intermediate idb/pdb files so we have separate files for sender and sender_dll. RESOLVED r55947 Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (15) In function zbx_socket_create(), we do zbx_socket_connect() and later, if we are asked to make a secure connection in "tls_connect" variable without being compiled with an SSL library, we close the socket and fail. Why do we need to connect and then fail? Maybe we should fail immediately, upon entering the function?

          Andris Mednis RESOLVED in r54804.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (15) In function zbx_socket_create(), we do zbx_socket_connect() and later, if we are asked to make a secure connection in "tls_connect" variable without being compiled with an SSL library, we close the socket and fail. Why do we need to connect and then fail? Maybe we should fail immediately, upon entering the function? Andris Mednis RESOLVED in r54804. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (16) This subissue shall hold minor fixes that are made during code review.

          Aleksandrs Saveljevs Please take a look at r54224 (mostly comms.c).

          Andris Mednis Thanks! Reviewed and accepted.

          Aleksandrs Saveljevs Please also take a look at r54253 (mostly dbconfig.c).

          Andris Mednis Thanks for improvement! Reviewed and accepted.

          Aleksandrs Saveljevs r54307 (minor style)

          Andris Mednis Thanks! Reviewed and accepted.

          Aleksandrs Saveljevs Please check r54462, too (mostly tls.c).

          Andris Mednis Thanks for improvements and corrections! Reviewed and accepted.

          Andris Mednis Please check r54486 (__zbx_zabbix_log() optimization).

          Aleksandrs Saveljevs Looks good.

          Aleksandrs Saveljevs Please take a look at r55198 (formatting and wording).

          Andris Mednis Thanks! Reviewed and accepted.

          Aleksandrs Saveljevs Please see r55441 (formatting in the latest trunk update).

          Andris Mednis Reviewed, accepted.

          Aleksandrs Saveljevs Please take a look at r56202 (a line in comms.c and recent unrelated development in cpustat.c).

          Andris Mednis Reviewed.

          Aleksandrs Saveljevs Andris Zeila, could you please review stylistic fixes to ZBX-1916's merge in r56674?

          Andris Zeila Thanks, reviewed.

          Andris Mednis CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (16) This subissue shall hold minor fixes that are made during code review. Aleksandrs Saveljevs Please take a look at r54224 (mostly comms.c). Andris Mednis Thanks! Reviewed and accepted. Aleksandrs Saveljevs Please also take a look at r54253 (mostly dbconfig.c). Andris Mednis Thanks for improvement! Reviewed and accepted. Aleksandrs Saveljevs r54307 (minor style) Andris Mednis Thanks! Reviewed and accepted. Aleksandrs Saveljevs Please check r54462, too (mostly tls.c). Andris Mednis Thanks for improvements and corrections! Reviewed and accepted. Andris Mednis Please check r54486 (__zbx_zabbix_log() optimization). Aleksandrs Saveljevs Looks good. Aleksandrs Saveljevs Please take a look at r55198 (formatting and wording). Andris Mednis Thanks! Reviewed and accepted. Aleksandrs Saveljevs Please see r55441 (formatting in the latest trunk update). Andris Mednis Reviewed, accepted. Aleksandrs Saveljevs Please take a look at r56202 (a line in comms.c and recent unrelated development in cpustat.c). Andris Mednis Reviewed. Aleksandrs Saveljevs Andris Zeila , could you please review stylistic fixes to ZBX-1916 's merge in r56674? Andris Zeila Thanks, reviewed. Andris Mednis CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (17) There is the following code in DCsync_hosts():

          if (NULL != (psk_i = zbx_hashset_search(&config->psks, &psk_i_local)) &&
          		0 == --(psk_i->refcount))
          {
          	zbx_strpool_release(psk_i->tls_psk_identity);
          	zbx_strpool_release(psk_i->tls_psk);
          	zbx_hashset_remove(&config->psks, &psk_i_local);
          }
          

          It does not look entirely correct to do zbx_hashset_remove() after zbx_strpool_release().

          The reason it works in cases like the following in DCsync_items(), is because we key the hashset by "itemid", which is a number and is always present in the structure:

          else if (NULL != (ipmiitem = zbx_hashset_search(&config->ipmiitems, &itemid)))
          {
          	/* remove IPMI parameters for non-IPMI item */
          	zbx_strpool_release(ipmiitem->ipmi_sensor);
          	zbx_hashset_remove(&config->ipmiitems, &itemid);
          }
          

          However, in DCsync_hosts() case above, zbx_strpool_release() releases strings pointed to by "psk_i->tls_psk_identity" and "psk_i->tls_psk", and the hashset is keyed by "psk_i->tls_psk_identity". Therefore, after zbx_strpool_release() is called, the location pointed to by "psk_i->tls_psk_identity" may contain unexpected data. Currently, when we zbx_strpool_release() in the configuration cache, nothing gets overwritten and it currently works, but this behavior should not be relied upon.

          There may be other cases like this, including the old code.

          Aleksandrs Saveljevs There were other cases like that indeed (e.g., config->hosts_h hashset), but since the bug does not currently have any practical impact, decided to fix it in trunk only under this issue, without creating a separate ZBX.

          Fixed in r55306 by introducing zbx_hashset_remove_direct() function, which allows to remove hashset entries using a pointer. This is also a bit more efficient than using zbx_hashset_remove() function, because a hash code does not need to be calculated. RESOLVED.

          Andris Zeila CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (17) There is the following code in DCsync_hosts(): if (NULL != (psk_i = zbx_hashset_search(&config->psks, &psk_i_local)) && 0 == --(psk_i->refcount)) { zbx_strpool_release(psk_i->tls_psk_identity); zbx_strpool_release(psk_i->tls_psk); zbx_hashset_remove(&config->psks, &psk_i_local); } It does not look entirely correct to do zbx_hashset_remove() after zbx_strpool_release(). The reason it works in cases like the following in DCsync_items(), is because we key the hashset by "itemid", which is a number and is always present in the structure: else if (NULL != (ipmiitem = zbx_hashset_search(&config->ipmiitems, &itemid))) { /* remove IPMI parameters for non-IPMI item */ zbx_strpool_release(ipmiitem->ipmi_sensor); zbx_hashset_remove(&config->ipmiitems, &itemid); } However, in DCsync_hosts() case above, zbx_strpool_release() releases strings pointed to by "psk_i->tls_psk_identity" and "psk_i->tls_psk", and the hashset is keyed by "psk_i->tls_psk_identity". Therefore, after zbx_strpool_release() is called, the location pointed to by "psk_i->tls_psk_identity" may contain unexpected data. Currently, when we zbx_strpool_release() in the configuration cache, nothing gets overwritten and it currently works, but this behavior should not be relied upon. There may be other cases like this, including the old code. Aleksandrs Saveljevs There were other cases like that indeed (e.g., config->hosts_h hashset), but since the bug does not currently have any practical impact, decided to fix it in trunk only under this issue, without creating a separate ZBX. Fixed in r55306 by introducing zbx_hashset_remove_direct() function, which allows to remove hashset entries using a pointer. This is also a bit more efficient than using zbx_hashset_remove() function, because a hash code does not need to be calculated. RESOLVED. Andris Zeila CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (18) The first "if" in DCsync_configuration() violates the convention about splitting "if"s at https://www.zabbix.org/wiki/C_coding_guidelines#Compiler-specific_issues :

          	if (NULL == (host_result =
          #if defined(HAVE_POLARSSL) || defined(HAVE_GNUTLS) || defined(HAVE_OPENSSL)
          			DBselect(
          			"select hostid,proxy_hostid,host,ipmi_authtype,ipmi_privilege,ipmi_username,"
          			...
          #else
          			DBselect(
          			"select hostid,proxy_hostid,host,ipmi_authtype,ipmi_privilege,ipmi_username,"
          			...
          #endif
          

          Aleksandrs Saveljevs The comment above it also says the following:

          /* DBselect is a macro and embedding a directive within macro arguments is not portable. Therefore a */
          /* repeated SQL query. */
          

          We actually have code that uses directives in DBselect(), see execute_commands() function in escalator.c. If what you say in the comment above is true (ideally, please give a reference), then that place should be fixed as well.

          Andris Mednis execute_commands() function in escalator.c compiles with warnings with GCC 5.1.1:

          escalator.c: In function ‘execute_commands’:
          escalator.c:469:1: warning: embedding a directive within macro arguments is not portable
           #ifdef HAVE_OPENIPMI
           ^
          escalator.c:471:1: warning: embedding a directive within macro arguments is not portable
           #endif
           ^
          escalator.c:481:1: warning: embedding a directive within macro arguments is not portable
           #ifdef HAVE_OPENIPMI
           ^
          escalator.c:483:1: warning: embedding a directive within macro arguments is not portable
           #endif
           ^
          escalator.c:492:1: warning: embedding a directive within macro arguments is not portable
           #ifdef HAVE_OPENIPMI
           ^
          escalator.c:494:1: warning: embedding a directive within macro arguments is not portable
           #endif
           ^
          

          Aleksandrs Saveljevs http://stackoverflow.com/questions/28343501/is-it-legal-to-use-the-line-directive-in-a-macro-argument with a reference to C99 standard seems to explain it:

          From the C standard, 6.10.3 Macro replacement:

          11 [...] If there are sequences of preprocessing tokens within the list of arguments that would otherwise act as preprocessing directives, the behavior is undefined.

          Aleksandrs Saveljevs Issue about escalator.c moved out to ZBX-9677 to fix in all versions. This issue is then only about fixing the split "if" in dbconfig.c.

          Andris Mednis RESOLVED in r54251.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (18) The first "if" in DCsync_configuration() violates the convention about splitting "if"s at https://www.zabbix.org/wiki/C_coding_guidelines#Compiler-specific_issues : if (NULL == (host_result = # if defined(HAVE_POLARSSL) || defined(HAVE_GNUTLS) || defined(HAVE_OPENSSL) DBselect( "select hostid,proxy_hostid,host,ipmi_authtype,ipmi_privilege,ipmi_username," ... # else DBselect( "select hostid,proxy_hostid,host,ipmi_authtype,ipmi_privilege,ipmi_username," ... #endif Aleksandrs Saveljevs The comment above it also says the following: /* DBselect is a macro and embedding a directive within macro arguments is not portable. Therefore a */ /* repeated SQL query. */ We actually have code that uses directives in DBselect(), see execute_commands() function in escalator.c. If what you say in the comment above is true (ideally, please give a reference), then that place should be fixed as well. Andris Mednis execute_commands() function in escalator.c compiles with warnings with GCC 5.1.1: escalator.c: In function ‘execute_commands’: escalator.c:469:1: warning: embedding a directive within macro arguments is not portable #ifdef HAVE_OPENIPMI ^ escalator.c:471:1: warning: embedding a directive within macro arguments is not portable #endif ^ escalator.c:481:1: warning: embedding a directive within macro arguments is not portable #ifdef HAVE_OPENIPMI ^ escalator.c:483:1: warning: embedding a directive within macro arguments is not portable #endif ^ escalator.c:492:1: warning: embedding a directive within macro arguments is not portable #ifdef HAVE_OPENIPMI ^ escalator.c:494:1: warning: embedding a directive within macro arguments is not portable #endif ^ Aleksandrs Saveljevs http://stackoverflow.com/questions/28343501/is-it-legal-to-use-the-line-directive-in-a-macro-argument with a reference to C99 standard seems to explain it: From the C standard, 6.10.3 Macro replacement: 11 [...] If there are sequences of preprocessing tokens within the list of arguments that would otherwise act as preprocessing directives, the behavior is undefined. Aleksandrs Saveljevs Issue about escalator.c moved out to ZBX-9677 to fix in all versions. This issue is then only about fixing the split "if" in dbconfig.c. Andris Mednis RESOLVED in r54251. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (19) In DCsync_hosts() function, there are two cases of THIS_SHOULD_NEVER_HAPPEN, accompanied by two log messages. However, practice shows that sometimes impossible things do happen and it is useful to have information in the log that helps to identify the problem. In this case, the warnings are as follows:

          zabbix_log(LOG_LEVEL_WARNING, "empty PSK value for identity \"%s\"", row[28]);
          zabbix_log(LOG_LEVEL_WARNING, "empty PSK identity with non-empty value");
          

          These warnings could probably benefit from some identifying information, like host name or host ID.

          Andris Mednis RESOLVED in r54424.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (19) In DCsync_hosts() function, there are two cases of THIS_SHOULD_NEVER_HAPPEN, accompanied by two log messages. However, practice shows that sometimes impossible things do happen and it is useful to have information in the log that helps to identify the problem. In this case, the warnings are as follows: zabbix_log(LOG_LEVEL_WARNING, "empty PSK value for identity \" %s\"", row[28]); zabbix_log(LOG_LEVEL_WARNING, "empty PSK identity with non-empty value" ); These warnings could probably benefit from some identifying information, like host name or host ID. Andris Mednis RESOLVED in r54424. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (20) The following code repeats in all binaries:

          struct rlimit	limit;
          
          /* disable core dumps */
          
          limit.rlim_cur = 0;
          limit.rlim_max = 0;
          
          if (0 != setrlimit(RLIMIT_CORE, &limit))
          {
          	zabbix_log(LOG_LEVEL_CRIT, "cannot set resource limits, exiting...");
          	exit(EXIT_FAILURE);
          }
          

          Can we put it inside a dedicated function?

          Andris Mednis RESOLVED in r54829.

          Aleksandrs Saveljevs The code now looks like this:

          if (SUCCEED != zbx_coredump_disable())
          {
          	zabbix_log(LOG_LEVEL_CRIT, "cannot set resource limits, exiting...");
          	exit(EXIT_FAILURE);
          }
          

          The error message refers to the internal details of zbx_coredump_disable(). Maybe a better error message would be "cannot disable core dump, exiting..." and, in case setrlimit() fails, we should log a debug or warning message in zbx_coredump_disable() itself, together with errno and strerror()? REOPENED.

          Andris Mednis Thanks for noticing! RESOLVED in r55019.

          Aleksandrs Saveljevs r55019 introduced unnecessary "res" variable to zbx_coredump_disable(). I shall fix it under a bigger commit for review. CLOSED.

          Andris Zeila Moved zbx_coredump_disable() function to zbxnix library.
          REOPENED and RESOLVED in r55362

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (20) The following code repeats in all binaries: struct rlimit limit; /* disable core dumps */ limit.rlim_cur = 0; limit.rlim_max = 0; if (0 != setrlimit(RLIMIT_CORE, &limit)) { zabbix_log(LOG_LEVEL_CRIT, "cannot set resource limits, exiting..." ); exit(EXIT_FAILURE); } Can we put it inside a dedicated function? Andris Mednis RESOLVED in r54829. Aleksandrs Saveljevs The code now looks like this: if (SUCCEED != zbx_coredump_disable()) { zabbix_log(LOG_LEVEL_CRIT, "cannot set resource limits, exiting..." ); exit(EXIT_FAILURE); } The error message refers to the internal details of zbx_coredump_disable(). Maybe a better error message would be "cannot disable core dump, exiting..." and, in case setrlimit() fails, we should log a debug or warning message in zbx_coredump_disable() itself, together with errno and strerror()? REOPENED. Andris Mednis Thanks for noticing! RESOLVED in r55019. Aleksandrs Saveljevs r55019 introduced unnecessary "res" variable to zbx_coredump_disable(). I shall fix it under a bigger commit for review. CLOSED. Andris Zeila Moved zbx_coredump_disable() function to zbxnix library. REOPENED and RESOLVED in r55362 Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (21) The old options in "zabbix_get" and "zabbix_sender" are validated not to repeat. However, the new TLS options may repeat and the last value will be taken. Should be consistent.

          Andris Mednis RESOLVED in 55589.

          Aleksandrs Saveljevs The less copy-paste, the better. Commit in r55708 tries to make it shorter. Still RESOLVED.

          Andris Zeila Changed opt_count array initializations to opt_count[256] = {0}; to be compatible with microsoft compiler. Now RESOLVED in r55841

          Also please review sender_dll compilation fix in r55840 .

          Andris Mednis Thanks for improvements r55708, r55841 and r55840, reviewed. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (21) The old options in "zabbix_get" and "zabbix_sender" are validated not to repeat. However, the new TLS options may repeat and the last value will be taken. Should be consistent. Andris Mednis RESOLVED in 55589. Aleksandrs Saveljevs The less copy-paste, the better. Commit in r55708 tries to make it shorter. Still RESOLVED. Andris Zeila Changed opt_count array initializations to opt_count[256] = {0}; to be compatible with microsoft compiler. Now RESOLVED in r55841 Also please review sender_dll compilation fix in r55840 . Andris Mednis Thanks for improvements r55708, r55841 and r55840, reviewed. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (22) In "zabbix_get" and "zabbix_sender", if we were compiled without TLS support, we still recognize --tls-connect and other TLS options, but quit with an error message that TLS options cannot be used. However, we do not seem to do like that elsewhere: see zabbix_agentd.c, for instance. There, if we do not have compile-time support for some options, then we just do not recognize them at run-time.

          Aleksandrs Saveljevs Also, if we were compiled without TLS support, we quit if there are TLS options specified on the command-line. Then there is the following non-TLS code below:

          if (NULL != CONFIG_TLS_CONNECT || NULL != CONFIG_TLS_CA_FILE || NULL != CONFIG_TLS_CRL_FILE ||
          		NULL != CONFIG_TLS_SERVER_CERT_ISSUER || NULL != CONFIG_TLS_SERVER_CERT_SUBJECT ||
          		NULL != CONFIG_TLS_CERT_FILE || NULL != CONFIG_TLS_KEY_FILE ||
          		NULL != CONFIG_TLS_PSK_IDENTITY || NULL != CONFIG_TLS_PSK_FILE)
          {
          	zbx_error("TLS parameters cannot be used: 'zabbix_get' was compiled without TLS support.");
          	ret = FAIL;
          	goto out;
          }
          

          Since the only way to set these parameters in "zabbix_get" is through the command-line, the fragment above does not seem to be necessary. In "zabbix_sender", it seems to be OK.

          Andris Mednis It was decided to always recognize parameters, even if their support was not compiled in (in this case a program stops with error message). RESOLVED in r55617 and r55644.

          Aleksandrs Saveljevs Similar to (21), the less copy-paste, the better. Please look at an attempt to shorten it in r55716. Function get_program_type_string() had to be moved from zbxself, because that module is not linked to zabbix_sender and zabbix_get. Alternative approach would be to put check_cfg_feature_str() and check_cfg_feature_int() functions somewhere, where only daemons have access, but get_program_type_string() returns values for ZBX_PROGRAM_TYPE_SENDER and ZBX_PROGRAM_TYPE_GET, so it should have been moved somewhere where it would work for zabbix_sender and zabbix_get, too. Still RESOLVED.

          Aleksandrs Saveljevs Actually, the solution does not work for most VMware parameters: their default values are non-zero:

          int	CONFIG_VMWARE_FREQUENCY		= 60;
          int	CONFIG_VMWARE_PERF_FREQUENCY	= 60;
          int	CONFIG_VMWARE_TIMEOUT		= 10;
          
          zbx_uint64_t	CONFIG_VMWARE_CACHE_SIZE	= 8 * ZBX_MEBIBYTE;
          

          This was probably the reason why they were not checked before. REOPENED.

          Andris Mednis RESOLVED in r55929.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (22) In "zabbix_get" and "zabbix_sender", if we were compiled without TLS support, we still recognize --tls-connect and other TLS options, but quit with an error message that TLS options cannot be used. However, we do not seem to do like that elsewhere: see zabbix_agentd.c, for instance. There, if we do not have compile-time support for some options, then we just do not recognize them at run-time. Aleksandrs Saveljevs Also, if we were compiled without TLS support, we quit if there are TLS options specified on the command-line. Then there is the following non-TLS code below: if (NULL != CONFIG_TLS_CONNECT || NULL != CONFIG_TLS_CA_FILE || NULL != CONFIG_TLS_CRL_FILE || NULL != CONFIG_TLS_SERVER_CERT_ISSUER || NULL != CONFIG_TLS_SERVER_CERT_SUBJECT || NULL != CONFIG_TLS_CERT_FILE || NULL != CONFIG_TLS_KEY_FILE || NULL != CONFIG_TLS_PSK_IDENTITY || NULL != CONFIG_TLS_PSK_FILE) { zbx_error( "TLS parameters cannot be used: 'zabbix_get' was compiled without TLS support." ); ret = FAIL; goto out; } Since the only way to set these parameters in "zabbix_get" is through the command-line, the fragment above does not seem to be necessary. In "zabbix_sender", it seems to be OK. Andris Mednis It was decided to always recognize parameters, even if their support was not compiled in (in this case a program stops with error message). RESOLVED in r55617 and r55644. Aleksandrs Saveljevs Similar to (21), the less copy-paste, the better. Please look at an attempt to shorten it in r55716. Function get_program_type_string() had to be moved from zbxself, because that module is not linked to zabbix_sender and zabbix_get. Alternative approach would be to put check_cfg_feature_str() and check_cfg_feature_int() functions somewhere, where only daemons have access, but get_program_type_string() returns values for ZBX_PROGRAM_TYPE_SENDER and ZBX_PROGRAM_TYPE_GET, so it should have been moved somewhere where it would work for zabbix_sender and zabbix_get, too. Still RESOLVED. Aleksandrs Saveljevs Actually, the solution does not work for most VMware parameters: their default values are non-zero: int CONFIG_VMWARE_FREQUENCY = 60; int CONFIG_VMWARE_PERF_FREQUENCY = 60; int CONFIG_VMWARE_TIMEOUT = 10; zbx_uint64_t CONFIG_VMWARE_CACHE_SIZE = 8 * ZBX_MEBIBYTE; This was probably the reason why they were not checked before. REOPENED. Andris Mednis RESOLVED in r55929. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (23) In some cases, we offer PSK option first and then certificate option. In other places, it is the other way round. For instance, "zabbix_get" and "zabbix_sender" show "cert" before "psk" in usage message, but "psk" before "cert" in --tls-connect values and examples. The order should be chosen and made consistent.

          Aleksandrs Saveljevs We seem to have fixed it in other issues. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (23) In some cases, we offer PSK option first and then certificate option. In other places, it is the other way round. For instance, "zabbix_get" and "zabbix_sender" show "cert" before "psk" in usage message, but "psk" before "cert" in --tls-connect values and examples. The order should be chosen and made consistent. Aleksandrs Saveljevs We seem to have fixed it in other issues. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (24) Not exactly related to this development, but there seems to be a typo in the variable name (the extra "t"):

          ZBX_THREAD_SENDVAL_ARGS	*sentdval_args;
          

          Aleksandrs Saveljevs RESOLVED in r55242.

          Andris Mednis Reviewed and accepted. CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (24) Not exactly related to this development, but there seems to be a typo in the variable name (the extra "t"): ZBX_THREAD_SENDVAL_ARGS *sentdval_args; Aleksandrs Saveljevs RESOLVED in r55242. Andris Mednis Reviewed and accepted. CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (25) Some lines in --help output are indented 2 spaces, some 4, some 7.

          Aleksandrs Saveljevs Made it consistent in r54311. RESOLVED.

          Andris Mednis Thanks! Reviewed and accepted.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (25) Some lines in --help output are indented 2 spaces, some 4, some 7. Aleksandrs Saveljevs Made it consistent in r54311. RESOLVED. Andris Mednis Thanks! Reviewed and accepted. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (26) In some places, TLSPSKIdentity comes before TLSPSKFile, in some places after.

          Aleksandrs Saveljevs Made it consistent in r54314 by making TLSPSKIdentity always come before TLSPSKFile. RESOLVED.

          Andris Mednis Thanks! Reviewed and accepted.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (26) In some places, TLSPSKIdentity comes before TLSPSKFile, in some places after. Aleksandrs Saveljevs Made it consistent in r54314 by making TLSPSKIdentity always come before TLSPSKFile. RESOLVED. Andris Mednis Thanks! Reviewed and accepted. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (27) File tls.c has conditionals like the following:

          #if GNUTLS_VERSION_NUMBER >= 0x030000
          

          If the lowest version of GnuTLS we decide to support is GnuTLS 3.1.18, these conditionals can be removed.

          Andris Mednis RESOLVED in r54831

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (27) File tls.c has conditionals like the following: # if GNUTLS_VERSION_NUMBER >= 0x030000 If the lowest version of GnuTLS we decide to support is GnuTLS 3.1.18, these conditionals can be removed. Andris Mednis RESOLVED in r54831 Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (28) Function polarssl_debug_cb() is set to be a debug function for PolarSSL if zabbix_check_log_level(LOG_LEVEL_TRACE) is SUCCEED, but the function itself uses LOG_LEVEL_DEBUG. Same for GnuTLS.

          Andris Mednis RESOLVED in r54342.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (28) Function polarssl_debug_cb() is set to be a debug function for PolarSSL if zabbix_check_log_level(LOG_LEVEL_TRACE) is SUCCEED, but the function itself uses LOG_LEVEL_DEBUG. Same for GnuTLS. Andris Mednis RESOLVED in r54342. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (29) mbed TLS 1.3.11 introduces the following defines in include/polarssl/x509.h:

          #define BADCERT_KEY_USAGE         0x0800  /**< Usage does not match the keyUsage extension. */
          #define BADCERT_EXT_KEY_USAGE     0x1000  /**< Usage does not match the extendedKeyUsage extension. */
          #define BADCERT_NS_CERT_TYPE      0x2000  /**< Usage does not match the nsCertType extension. */
          

          Do we wish to include them in zbx_tls_cert_error_msg()?

          Andris Mednis RESOLVED in r54874.

          Aleksandrs Saveljevs Looks good. However, it should be noted that mbed TLS 1.3.11 introduces not only these constants, but also x509_crt_verify_info() function, which turns these codes into strings. If we are compiled with mbed TLS 1.3.11 and above, maybe we could use this function, so that we do not have to update our code whenever another verify code is added? For PolarSSL 1.3.9 and mbed TLS 1.3.10, we could use strings from x509_crt_verify_strings array, instead of inventing our own.

          Aleksandrs Saveljevs If the above is not inspiring, this subissue can be closed.

          Andris Mednis https://tls.mbed.org/tech-updates/releases/mbedtls-2.0.0-released says that "The 1.3 branch now moves into Maintenance Mode and will become End of Life at the end of next year (December 31st 2016).". Seems it is not worth to split error logging in our code into "pre-1.3.11" and "post-1.3.11" branches. If mbed TLS will be popular choice for Zabbix then migration to mbed TLS 2.0 is more relevant.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (29) mbed TLS 1.3.11 introduces the following defines in include/polarssl/x509.h: #define BADCERT_KEY_USAGE 0x0800 /**< Usage does not match the keyUsage extension. */ #define BADCERT_EXT_KEY_USAGE 0x1000 /**< Usage does not match the extendedKeyUsage extension. */ #define BADCERT_NS_CERT_TYPE 0x2000 /**< Usage does not match the nsCertType extension. */ Do we wish to include them in zbx_tls_cert_error_msg()? Andris Mednis RESOLVED in r54874. Aleksandrs Saveljevs Looks good. However, it should be noted that mbed TLS 1.3.11 introduces not only these constants, but also x509_crt_verify_info() function, which turns these codes into strings. If we are compiled with mbed TLS 1.3.11 and above, maybe we could use this function, so that we do not have to update our code whenever another verify code is added? For PolarSSL 1.3.9 and mbed TLS 1.3.10, we could use strings from x509_crt_verify_strings array, instead of inventing our own. Aleksandrs Saveljevs If the above is not inspiring, this subissue can be closed. Andris Mednis https://tls.mbed.org/tech-updates/releases/mbedtls-2.0.0-released says that "The 1.3 branch now moves into Maintenance Mode and will become End of Life at the end of next year (December 31st 2016).". Seems it is not worth to split error logging in our code into "pre-1.3.11" and "post-1.3.11" branches. If mbed TLS will be popular choice for Zabbix then migration to mbed TLS 2.0 is more relevant. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (30) In zbx_tls_validate_config(), we have the following code:

          if (0 != (program_type & (ZBX_PROGRAM_TYPE_GET | ZBX_PROGRAM_TYPE_SENDER)))
          	zbx_parameter_not_empty(CONFIG_TLS_CONNECT, "--tls-connect");
          else
          	zbx_parameter_not_empty(CONFIG_TLS_CONNECT, "TLSConnect");
          

          The problem is that Zabbix sender can obtain configuration (TLSConnect in this case) from Zabbix agent configuration file. Therefore, it is wrong to complain about --tls-connect, where TLSConnect is the actual problem:

          $ bin/zabbix_sender -c etc/zabbix_agentd.conf -z 127.0.0.1 -s 'Zabbix server' -k trapper -o 1
          zabbix_sender [12563]: ERROR: configuration parameter "--tls-connect" is defined but empty
          

          Andris Mednis RESOLVED in r55691.

          Aleksandrs Saveljevs In a way similar to (21) and (22), r55756 attempts to make it shorter by introducing zbx_tls_parameter_name() function, which intelligently returns the required name of a configuration file or command line parameter. I have also renamed zbx_log_error() function, because the name was too generic and easily confused with zbx_error(), for example.

          Aleksandrs Saveljevs We may also wish to use zbx_tls_parameter_name() with check_cfg_feature_int() and check_cfg_feature_str() functions by making zbx_tls_parameter_name() always available, not only when we are compiled with TLS. RESOLVED.

          Andris Mednis Thanks for great improvement r55756 ! Reviewed. See a minor change in r56159.

          Aleksandrs Saveljevs Not sure about r56159: it changes "configuration parameter" to "option" for zabbix_get in zbx_tls_parameter_not_empty(), which is reasonable, but we still refer to "parameter" for zabbix_get in other places (e.g., zbx_tls_validation_error() function). REOPENED.

          Andris Mednis Reverted r56159.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (30) In zbx_tls_validate_config(), we have the following code: if (0 != (program_type & (ZBX_PROGRAM_TYPE_GET | ZBX_PROGRAM_TYPE_SENDER))) zbx_parameter_not_empty(CONFIG_TLS_CONNECT, "--tls-connect" ); else zbx_parameter_not_empty(CONFIG_TLS_CONNECT, "TLSConnect" ); The problem is that Zabbix sender can obtain configuration (TLSConnect in this case) from Zabbix agent configuration file. Therefore, it is wrong to complain about --tls-connect, where TLSConnect is the actual problem: $ bin/zabbix_sender -c etc/zabbix_agentd.conf -z 127.0.0.1 -s 'Zabbix server' -k trapper -o 1 zabbix_sender [12563]: ERROR: configuration parameter "--tls-connect" is defined but empty Andris Mednis RESOLVED in r55691. Aleksandrs Saveljevs In a way similar to (21) and (22), r55756 attempts to make it shorter by introducing zbx_tls_parameter_name() function, which intelligently returns the required name of a configuration file or command line parameter. I have also renamed zbx_log_error() function, because the name was too generic and easily confused with zbx_error(), for example. Aleksandrs Saveljevs We may also wish to use zbx_tls_parameter_name() with check_cfg_feature_int() and check_cfg_feature_str() functions by making zbx_tls_parameter_name() always available, not only when we are compiled with TLS. RESOLVED. Andris Mednis Thanks for great improvement r55756 ! Reviewed. See a minor change in r56159. Aleksandrs Saveljevs Not sure about r56159: it changes "configuration parameter" to "option" for zabbix_get in zbx_tls_parameter_not_empty(), which is reasonable, but we still refer to "parameter" for zabbix_get in other places (e.g., zbx_tls_validation_error() function). REOPENED. Andris Mednis Reverted r56159. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (31) In zbx_tls_validate_config(), when we parse CONFIG_TLS_ACCEPT parameter, we check for "p" being NULL, but it can never be NULL.

          Aleksandrs Saveljevs RESOLVED in r55241.

          Andris Mednis Thanks! Another simplification in r55342.

          Aleksandrs Saveljevs Wonderful! CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (31) In zbx_tls_validate_config(), when we parse CONFIG_TLS_ACCEPT parameter, we check for "p" being NULL, but it can never be NULL. Aleksandrs Saveljevs RESOLVED in r55241. Andris Mednis Thanks! Another simplification in r55342. Aleksandrs Saveljevs Wonderful! CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (32) Function zbx_tls_validate_config() is called in zbx_tls_init_child(), which is called after forking. This means that (a) children do double validation work and (b) the user does not see the error messages on the console, unlike zbx_validate_config(). Would it be possible to call zbx_tls_validate_config() before forking?

          Andris Mednis Good idea. RESOLVED in r55553.

          Aleksandrs Saveljevs Very good. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (32) Function zbx_tls_validate_config() is called in zbx_tls_init_child(), which is called after forking. This means that (a) children do double validation work and (b) the user does not see the error messages on the console, unlike zbx_validate_config(). Would it be possible to call zbx_tls_validate_config() before forking? Andris Mednis Good idea. RESOLVED in r55553. Aleksandrs Saveljevs Very good. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (33) At the bottom of zbx_tls_validate_config() we require that TLSConnect and TLSAccept be defined if a certificate or a PSK are configured. However, if TLSConnect or TLSAccept are "unencrypted", we are satisfied.

          Aleksandrs Saveljevs Should we require a certificate or a PSK if at least one of TLSConnect and TLSAccept needs it?

          Aleksandrs Saveljevs A related scenario: suppose we omit TLSConnect from agent configuration file, but specify TLSAccept and PSK parameters. The agent then complains:

           27506:20150810:164724.499 certificate or PSK is configured but parameter "TLSConnect" is not defined
          

          However, the configuration seems valid, especially if active checks are not used.

          Aleksandrs Saveljevs Related issue: (86).

          Aleksandrs Saveljevs Based on internal discussion decided to leave the current state as is. This subissue therefore becomes a documetation task: it should be documented that if a user has certificate or PSK parameters specified, but he wishes to use unencrypted connection, then he should specify "TLSConnect=unencrypted" explicitly, rather than rely on the default value. This is a new behavior in Zabbix.

          <richlv> the error message has changed a bit :

          ERROR: parameter "TLSCertFile" is defined, but "TLSConnect" is not defined
          

          the default for tlsconnect claims to be 'unencrypted'. the reasoning is that user might expect certificates to be used as soon as they are specified, but omitting tlsconnect would ignore them for active items. maybe worth adding something like this to the comment for TLSConnect : "required if one of ... is defined"

          Andris Mednis As proposed by richlv comment improved in configuration template files in r58020.
          Documented in
          https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_proxy
          https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_agentd
          https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_agentd_win

          Aleksandrs Saveljevs Looks good. Made "unencrypted" in the note italic for consistency, please review.

          Andris Mednis Thanks, reviewed, accepted. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (33) At the bottom of zbx_tls_validate_config() we require that TLSConnect and TLSAccept be defined if a certificate or a PSK are configured. However, if TLSConnect or TLSAccept are "unencrypted", we are satisfied. Aleksandrs Saveljevs Should we require a certificate or a PSK if at least one of TLSConnect and TLSAccept needs it? Aleksandrs Saveljevs A related scenario: suppose we omit TLSConnect from agent configuration file, but specify TLSAccept and PSK parameters. The agent then complains: 27506:20150810:164724.499 certificate or PSK is configured but parameter "TLSConnect" is not defined However, the configuration seems valid, especially if active checks are not used. Aleksandrs Saveljevs Related issue: (86). Aleksandrs Saveljevs Based on internal discussion decided to leave the current state as is. This subissue therefore becomes a documetation task: it should be documented that if a user has certificate or PSK parameters specified, but he wishes to use unencrypted connection, then he should specify "TLSConnect=unencrypted" explicitly, rather than rely on the default value. This is a new behavior in Zabbix. < richlv > the error message has changed a bit : ERROR: parameter "TLSCertFile" is defined, but "TLSConnect" is not defined the default for tlsconnect claims to be 'unencrypted'. the reasoning is that user might expect certificates to be used as soon as they are specified, but omitting tlsconnect would ignore them for active items. maybe worth adding something like this to the comment for TLSConnect : "required if one of ... is defined" Andris Mednis As proposed by richlv comment improved in configuration template files in r58020. Documented in https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_proxy https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_agentd https://www.zabbix.com/documentation/3.0/manual/appendix/config/zabbix_agentd_win Aleksandrs Saveljevs Looks good. Made "unencrypted" in the note italic for consistency, please review. Andris Mednis Thanks, reviewed, accepted. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (34) In function zbx_psk_cb(), we have the following code:

          /* special print: psk_identity is not '\0'-terminated */
          zabbix_log(LOG_LEVEL_DEBUG, "%s(): requested PSK-identity: \"%.*s\"", __function_name, psk_identity_len,
          		psk_identity);
          

          This does not seem to be good, because "psk_identity_len" is "size_t", but * seems to expect an "int" (see http://stackoverflow.com/questions/19145951/printf-variable-string-length-specifier).

          Aleksandrs Saveljevs This is confirmed by "man 3 printf":

          The precision
          An optional precision, in the form of a period ('.') followed by an optional decimal digit string. Instead of a decimal digit string one may write "*" or "*m$" (for some decimal integer m) to specify that the precision is given in the next argument, or in the m-th argument, respectively, which must be of type int.

          Aleksandrs Saveljevs RESOLVED in r55229.

          Andris Mednis Thanks! Reviewed, accepted. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (34) In function zbx_psk_cb(), we have the following code: /* special print: psk_identity is not '\0'-terminated */ zabbix_log(LOG_LEVEL_DEBUG, "%s(): requested PSK-identity: \" %.*s\"", __function_name, psk_identity_len, psk_identity); This does not seem to be good, because "psk_identity_len" is "size_t", but * seems to expect an "int" (see http://stackoverflow.com/questions/19145951/printf-variable-string-length-specifier ). Aleksandrs Saveljevs This is confirmed by "man 3 printf": The precision An optional precision, in the form of a period ('.') followed by an optional decimal digit string. Instead of a decimal digit string one may write "*" or "*m$" (for some decimal integer m) to specify that the precision is given in the next argument, or in the m-th argument, respectively, which must be of type int. Aleksandrs Saveljevs RESOLVED in r55229. Andris Mednis Thanks! Reviewed, accepted. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (35) Some log messages in tls.c use "PSK identity", some "PSK-identity". Let's make it consistent.

          Andris Mednis RESOLVED in r54405.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (35) Some log messages in tls.c use "PSK identity", some "PSK-identity". Let's make it consistent. Andris Mednis RESOLVED in r54405. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (36) There is the following comment in zbx_read_psk_file():

          /* fgets() produces a confusing error message "[2] No such file or directory" if reading from an empty file. */
          /* To avoid the confusing message get the file size and deal with this situation separately. */
          

          Could you please provide any evidence that this is the case? "man fgets" has no indication that fgets() sets "errno". Maybe "errno" of 2 remained from another, previous error?

          Aleksandrs Saveljevs You can use the following test program as a basis:

          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>
          #include <errno.h>
          
          int	main()
          {
          	FILE	*f;
          	char	buffer[256];
          
          	f = fopen("empty.txt", "r");
          
          	errno = 0;
          
          	if (NULL == fgets(buffer, sizeof(buffer), f))
          		printf("errno = %d\nstrerror() = %s\n", errno, strerror(errno));
          
          	fclose(f);
          
          	return 0;
          }
          

          Aleksandrs Saveljevs While at it, in that function the result of fgets() is checked for '\n'. It should also be checked for '\r'.

          Andris Mednis Thanks for finding - it is an error. RESOLVED in r54376.

          Aleksandrs Saveljevs The first part looks good, but it still checks for '\n' character only:

          if (NULL != (p = strchr(buf, '\n')))
          	*p = '\0';
          

          On Windows, the read buffer will have '\r' at the end. REOPENED.

          Andris Mednis RESOLVED in r54484.

          Aleksandrs Saveljevs Cute little solution. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (36) There is the following comment in zbx_read_psk_file(): /* fgets() produces a confusing error message "[2] No such file or directory" if reading from an empty file. */ /* To avoid the confusing message get the file size and deal with this situation separately. */ Could you please provide any evidence that this is the case? "man fgets" has no indication that fgets() sets "errno". Maybe "errno" of 2 remained from another, previous error? Aleksandrs Saveljevs You can use the following test program as a basis: #include <stdio.h> #include <stdlib.h> #include <string.h> #include <errno.h> int main() { FILE *f; char buffer[256]; f = fopen( "empty.txt" , "r" ); errno = 0; if (NULL == fgets(buffer, sizeof(buffer), f)) printf( "errno = %d\nstrerror() = %s\n" , errno, strerror(errno)); fclose(f); return 0; } Aleksandrs Saveljevs While at it, in that function the result of fgets() is checked for '\n'. It should also be checked for '\r'. Andris Mednis Thanks for finding - it is an error. RESOLVED in r54376. Aleksandrs Saveljevs The first part looks good, but it still checks for '\n' character only: if (NULL != (p = strchr(buf, '\n'))) *p = '\0'; On Windows, the read buffer will have '\r' at the end. REOPENED. Andris Mednis RESOLVED in r54484. Aleksandrs Saveljevs Cute little solution. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (37) There is the following code in zbx_print_rdn_value():

          /* According to RFC 4514:                                                   */
          /*    - escape characters '"' (U+0022), '+' U+002B, ',' U+002C, ';' U+003B, */
          /*      '<' U+003C, '>' U+003E, '\' U+005C  anywhere in string.             */
          /*    - escape characters space (' ' U+0020) or number sign ('#' U+0023) at */
          /*      the beginning of string.                                            */
          /*    - escape character space (' ' U+0020) at the end of string.           */
          /*    - escape null (U+0000) character anywhere. <--- we do not allow null. */
          
          if ((0x20 == (*p_in & 0x70) && ('"' == *p_in || '+' == *p_in || ',' == *p_in))
          		|| (0x30 == (*p_in & 0x70) && (';' == *p_in || '<' == *p_in ||
          		'>' == *p_in)) || '\\' == *p_in ||
          		(' ' == *p_in && (value == p_in || (value + len - 1) == p_in)))
          {
          	*p_out++ = '\\';
          }
          

          The comment refers to the "#" character, but it is not present in the conditional.

          Andris Mednis Thanks for noticing! RESOLVED in r54374.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (37) There is the following code in zbx_print_rdn_value(): /* According to RFC 4514: */ /* - escape characters '"' (U+0022), '+' U+002B, ',' U+002C, ';' U+003B, */ /* '<' U+003C, '>' U+003E, '\' U+005C anywhere in string. */ /* - escape characters space (' ' U+0020) or number sign ('#' U+0023) at */ /* the beginning of string. */ /* - escape character space (' ' U+0020) at the end of string. */ /* - escape null (U+0000) character anywhere. <--- we do not allow null. */ if ((0x20 == (*p_in & 0x70) && ('"' == *p_in || '+' == *p_in || ',' == *p_in)) || (0x30 == (*p_in & 0x70) && (';' == *p_in || '<' == *p_in || '>' == *p_in)) || '\\' == *p_in || (' ' == *p_in && (value == p_in || (value + len - 1) == p_in))) { *p_out++ = '\\'; } The comment refers to the "#" character, but it is not present in the conditional. Andris Mednis Thanks for noticing! RESOLVED in r54374. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (38) In zbx_x509_dn_gets() for PolarSSL, the following code can be simplified:

          -       p += zbx_strlcpy(p, "=", (size_t)(p_end - p));
          +       *p++ = '=';
          

          It seems to be safe, unless we wish to handle a case when size=0. Note that if size=0 we do not currently go to "small_buf".

          Andris Mednis zbx_strlcpy() takes care of terminating '\0'. With proposed change we would need to put '\0' "manually". Seems not worth changing.

          Aleksandrs Saveljevs Reasonable. WON'T FIX.

          Show
          Aleksandrs Saveljevs added a comment - - edited (38) In zbx_x509_dn_gets() for PolarSSL, the following code can be simplified: - p += zbx_strlcpy(p, "=", (size_t)(p_end - p)); + *p++ = '='; It seems to be safe, unless we wish to handle a case when size=0. Note that if size=0 we do not currently go to "small_buf". Andris Mednis zbx_strlcpy() takes care of terminating '\0'. With proposed change we would need to put '\0' "manually". Seems not worth changing. Aleksandrs Saveljevs Reasonable. WON'T FIX.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (39) The certificate returned by GnuTLS function zbx_get_peer_cert() should be freed with gnutls_x509_crt_deinit(). However, in functions zbx_tls_accept() and zbx_tls_get_attr() it does not seem to be freed in all cases.

          Andris Mednis RESOLVED in r54436.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (39) The certificate returned by GnuTLS function zbx_get_peer_cert() should be freed with gnutls_x509_crt_deinit(). However, in functions zbx_tls_accept() and zbx_tls_get_attr() it does not seem to be freed in all cases. Andris Mednis RESOLVED in r54436. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (40) Function zbx_tls_init_child() for PolarSSL uses the same logic for processing x509_crt_parse_file() and x509_crl_parse_file() return codes. However, it does not look like x509_crl_parse_file() returns the number of failed certificates like x509_crt_parse_file() does: it either returns 0 or a negative error code.

          Andris Mednis Thanks for finding! RESOLVED in r54437.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (40) Function zbx_tls_init_child() for PolarSSL uses the same logic for processing x509_crt_parse_file() and x509_crl_parse_file() return codes. However, it does not look like x509_crl_parse_file() returns the number of failed certificates like x509_crt_parse_file() does: it either returns 0 or a negative error code. Andris Mednis Thanks for finding! RESOLVED in r54437. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (41) For some configuration parameters, PolarSSL version of zbx_tls_init_child() writes from which file it read them, for others it does not. Should be consistent.

          Andris Mednis RESOLVED in r54405 (also for GnuTLS and OpenSSL).

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (41) For some configuration parameters, PolarSSL version of zbx_tls_init_child() writes from which file it read them, for others it does not. Should be consistent. Andris Mednis RESOLVED in r54405 (also for GnuTLS and OpenSSL). Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (42) Upon TLS initialization, it would be nice to log the list of all supported ciphersuites. Otherwise, it is quite difficult to find out which ciphersuites are supported by a TLS server: a TLS client sends a list of supported ciphersuites, but a server just chooses one of them. Therefore, they should be logged for simplified debugging.

          Andris Mednis RESOLVED in r55336.

          Aleksandrs Saveljevs Wonderful! CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (42) Upon TLS initialization, it would be nice to log the list of all supported ciphersuites. Otherwise, it is quite difficult to find out which ciphersuites are supported by a TLS server: a TLS client sends a list of supported ciphersuites, but a server just chooses one of them. Therefore, they should be logged for simplified debugging. Andris Mednis RESOLVED in r55336. Aleksandrs Saveljevs Wonderful! CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (43) In zbx_tls_init_child() for OpenSSL, the return value of X509_VERIFY_PARAM_new() is not checked. It can be NULL if malloc() fails.

          Aleksandrs Saveljevs Around the same place, when TLSKeyFile is loaded, the following line gets logged two times:

          zabbix_log(LOG_LEVEL_DEBUG, "%s(): loaded private key", __function_name);
          

          Andris Mednis "line gets logged two times" RESOLVED in r54405.

          Andris Mednis Checking of return value of X509_VERIFY_PARAM_new RESOLVED in r54439.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (43) In zbx_tls_init_child() for OpenSSL, the return value of X509_VERIFY_PARAM_new() is not checked. It can be NULL if malloc() fails. Aleksandrs Saveljevs Around the same place, when TLSKeyFile is loaded, the following line gets logged two times: zabbix_log(LOG_LEVEL_DEBUG, "%s(): loaded private key", __function_name); Andris Mednis "line gets logged two times" RESOLVED in r54405. Andris Mednis Checking of return value of X509_VERIFY_PARAM_new RESOLVED in r54439. Aleksandrs Saveljevs CLOSED
          Hide
          Andris Mednis added a comment - - edited

          (44) Debug log shows version for GnuTLS but not for mbed TLS and OpenSSL.

          Andris Mednis RESOLVED in r54389.

          Aleksandrs Saveljevs CLOSED

          Show
          Andris Mednis added a comment - - edited (44) Debug log shows version for GnuTLS but not for mbed TLS and OpenSSL. Andris Mednis RESOLVED in r54389. Aleksandrs Saveljevs CLOSED
          Hide
          Andris Mednis added a comment - - edited

          (45) If PSK file is empty, sometimes agentd crashes during exit (if compiled with OpenSSL).

          Andris Mednis RESOLVED in r54389.

          Aleksandrs Saveljevs CLOSED

          Show
          Andris Mednis added a comment - - edited (45) If PSK file is empty, sometimes agentd crashes during exit (if compiled with OpenSSL). Andris Mednis RESOLVED in r54389. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (46) The current "--help" output and man page implementation uses the following format for TLS options:

          --tls-crl-file=crl_file
          

          However, the currently used syntax for other options is this:

          --source-address IP-address
          --runtime-control runtime-option
          

          As discussed, it is thus proposed to change the parameter style to:

          --tls-crl-file CRL-file
          

          Andris Mednis In r54512 option syntax changed as proposed in source and man-pages, help messages changed to fit on 80-column terminal. In r54518 usage-messages changed to fit on 80-column terminal. RESOLVED.

          Aleksandrs Saveljevs Please see suggested improvements over r54512 in r54701 and improvements over r54518 in r54708. The latter also makes --version output fit into 80 columns. Still RESOLVED.

          Andris Mednis Thanks! r54708 reviewed and accepted. Please see proposed correction of r54701 in r54726.

          Aleksandrs Saveljevs Thanks for catching that word omission in r54701! CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (46) The current "--help" output and man page implementation uses the following format for TLS options: --tls-crl-file=crl_file However, the currently used syntax for other options is this: --source-address IP-address --runtime-control runtime-option As discussed, it is thus proposed to change the parameter style to: --tls-crl-file CRL-file Andris Mednis In r54512 option syntax changed as proposed in source and man-pages, help messages changed to fit on 80-column terminal. In r54518 usage-messages changed to fit on 80-column terminal. RESOLVED. Aleksandrs Saveljevs Please see suggested improvements over r54512 in r54701 and improvements over r54518 in r54708. The latter also makes --version output fit into 80 columns. Still RESOLVED. Andris Mednis Thanks! r54708 reviewed and accepted. Please see proposed correction of r54701 in r54726. Aleksandrs Saveljevs Thanks for catching that word omission in r54701! CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (47) Special notes regarding changes in r54389:

          • It is suggested to simply write "Unix" instead of "UNIX/GNU/Linux".
          • It is unclear why the code in zbx_tls_library_init() allows it to be run multiple times, whereas zbx_tls_library_deinit() runs only once. As far as I understood from the discussions, it may have to do with exiting and signal handlers, but it might be nice to document that in the code.
          • It looks like zbx_tls_library_deinit() will not be called in zabbix_get and zabbix_sender on Windows.

          Aleksandrs Saveljevs This might have appeared in a different revision, but anyway. There is now the following related warning if Zabbix is built without TLS:

          tls.c:2249:13: warning: ‘zbx_tls_library_init’ defined but not used [-Wunused-function]
           static void zbx_tls_library_init(void)
                       ^
          

          Andris Mednis "....simply write "Unix" instead of "UNIX/GNU/Linux"". RESOLVED in r54548.
          Andris Mednis "....It is unclear why the code in zbx_tls_library_init()..." RESOLVED in r54550.
          Andris Mednis "...warning if Zabbix is built without TLS..." RESOLVED in r54551.
          Andris Mednis "It looks like zbx_tls_library_deinit() will not be called....." for 'zabbix_get' RESOLVED in r54557, for 'zabbix_sender' RESOLVED in r54591.

          Aleksandrs Saveljevs After r54551, public functions like zbx_tls_library_deinit() are only defined if we are building with TLS. Maybe we should declare them in tls.h only if we are building with TLS, too? REOPENED.

          Andris Mednis RESOLVED in r54816.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (47) Special notes regarding changes in r54389: It is suggested to simply write "Unix" instead of "UNIX/GNU/Linux". It is unclear why the code in zbx_tls_library_init() allows it to be run multiple times, whereas zbx_tls_library_deinit() runs only once. As far as I understood from the discussions, it may have to do with exiting and signal handlers, but it might be nice to document that in the code. It looks like zbx_tls_library_deinit() will not be called in zabbix_get and zabbix_sender on Windows. Aleksandrs Saveljevs This might have appeared in a different revision, but anyway. There is now the following related warning if Zabbix is built without TLS: tls.c:2249:13: warning: ‘zbx_tls_library_init’ defined but not used [-Wunused-function] static void zbx_tls_library_init(void) ^ Andris Mednis "....simply write "Unix" instead of "UNIX/GNU/Linux"". RESOLVED in r54548. Andris Mednis "....It is unclear why the code in zbx_tls_library_init()..." RESOLVED in r54550. Andris Mednis "...warning if Zabbix is built without TLS..." RESOLVED in r54551. Andris Mednis "It looks like zbx_tls_library_deinit() will not be called....." for 'zabbix_get' RESOLVED in r54557, for 'zabbix_sender' RESOLVED in r54591. Aleksandrs Saveljevs After r54551, public functions like zbx_tls_library_deinit() are only defined if we are building with TLS. Maybe we should declare them in tls.h only if we are building with TLS, too? REOPENED. Andris Mednis RESOLVED in r54816. Aleksandrs Saveljevs CLOSED
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (48) [F] Specification does not mention "Translation strings" and "API changes", however there are quite a few. Please, correct specification.

          Andris Mednis Translation strings added to specification.

          Oleg Egorov CLOSED

          Show
          Ivo Kurzemnieks added a comment - - edited (48) [F] Specification does not mention "Translation strings" and "API changes", however there are quite a few. Please, correct specification. Andris Mednis Translation strings added to specification. Oleg Egorov CLOSED
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (49) [F] Translation strings?

          Oleg Egorov Added translation strings:

          • Certificate
          • Connections from host
          • Connections from proxy
          • Connections to host
          • Connections to proxy
          • Encryption
          • Incorrect value used for PSK field. It should consist of an even number of hexadecimal characters.
          • Issuer
          • No encryption
          • PSK
          • CERT
          • PSK cannot be empty.
          • PSK identity
          • PSK identity cannot be empty.
          • Incorrect value used for connections from host field.
          • Incorrect value used for connections to host field.
          • Incorrect value used for connections from proxy field.
          • Incorrect value used for connections to proxy field.
          • Agent encryption
          • Connections
          • Cannot update host encryption settings. Connection settings for both directions should be specified.

          Ivo Kurzemnieks The preceding strings are merged in pre-3.0.0alpha3 r56207

          Further string changes are in subissue (126).

          CLOSED

          <richlv>

          • note that "Connections to proxy" is listed here, but that's wrong - should be fixed in (120)
          • "Agent encryption." is listed here, but the added string seems to lack the trailing dot - REOPENED

          Ivo Kurzemnieks RESOLVED

          <richlv> thanks, CLOSED

          Show
          Ivo Kurzemnieks added a comment - - edited (49) [F] Translation strings? Oleg Egorov Added translation strings : Certificate Connections from host Connections from proxy Connections to host Connections to proxy Encryption Incorrect value used for PSK field. It should consist of an even number of hexadecimal characters. Issuer No encryption PSK CERT PSK cannot be empty. PSK identity PSK identity cannot be empty. Incorrect value used for connections from host field. Incorrect value used for connections to host field. Incorrect value used for connections from proxy field. Incorrect value used for connections to proxy field. Agent encryption Connections Cannot update host encryption settings. Connection settings for both directions should be specified. Ivo Kurzemnieks The preceding strings are merged in pre-3.0.0alpha3 r56207 Further string changes are in subissue (126). CLOSED < richlv > note that "Connections to proxy" is listed here, but that's wrong - should be fixed in (120) "Agent encryption." is listed here, but the added string seems to lack the trailing dot - REOPENED Ivo Kurzemnieks RESOLVED < richlv > thanks, CLOSED
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (50) [F] Field tls_accept can have values only 1,2,4.
          Field tls_connect can have values only 1,2,3,4,5,6,7.

          No zeroes or other values are allowed, so:

          • API validation should be updated accordingly.
          • Frontend field validation in hosts.php and proxy controllers should be changed accordingly.

          I suggest remove the decimal values and replace them with defines.
          The combination is for all checkboxes is, for example,

           HOST_ENCRYPTION_NONE,HOST_ENCRYPTION_PSK,HOST_ENCRYPTION_NONE|HOST_ENCRYPTION_PSK,HOST_ENCRYPTION_PSK,HOST_ENCRYPTION_NONE|HOST_ENCRYPTION_CERTIFICATE,HOST_ENCRYPTION_PSK|HOST_ENCRYPTION_CERTIFICATE,HOST_ENCRYPTION_NONE|HOST_ENCRYPTION_PSK|HOST_ENCRYPTION_CERTIFICATE
          

          Oleg Egorov RESOLVED IN r54597

          Ivo Kurzemnieks CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (50) [F] Field tls_accept can have values only 1,2,4 . Field tls_connect can have values only 1,2,3,4,5,6,7 . No zeroes or other values are allowed, so: API validation should be updated accordingly. Frontend field validation in hosts.php and proxy controllers should be changed accordingly. I suggest remove the decimal values and replace them with defines. The combination is for all checkboxes is, for example, HOST_ENCRYPTION_NONE,HOST_ENCRYPTION_PSK,HOST_ENCRYPTION_NONE|HOST_ENCRYPTION_PSK,HOST_ENCRYPTION_PSK,HOST_ENCRYPTION_NONE|HOST_ENCRYPTION_CERTIFICATE,HOST_ENCRYPTION_PSK|HOST_ENCRYPTION_CERTIFICATE,HOST_ENCRYPTION_NONE|HOST_ENCRYPTION_PSK|HOST_ENCRYPTION_CERTIFICATE Oleg Egorov RESOLVED IN r54597 Ivo Kurzemnieks CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (51) [F] Host prototype edit form must have encryption tab with disabled fields that have same data coming from the host. Creating a host prototype via API, must also consider this - fill the new fields from parent host.

          Oleg Egorov Frontend part resolved in r54575

          Ivo Kurzemnieks Considering API, I guess I was wrong. Server should handle creation of discovered hosts with same values from host. See (56)
          CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (51) [F] Host prototype edit form must have encryption tab with disabled fields that have same data coming from the host. Creating a host prototype via API, must also consider this - fill the new fields from parent host. Oleg Egorov Frontend part resolved in r54575 Ivo Kurzemnieks Considering API, I guess I was wrong. Server should handle creation of discovered hosts with same values from host. See (56) CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (52) [F] When updating a discovered host, I get success message, but nothing has changed, no values are saved.

          Oleg Egorov RESOLVED IN r54592

          Ivo Kurzemnieks Massupdating discovered host encryption fields does not work. Problem in frontend, API works fine.
          REOPENED.

          Oleg Egorov RESOLVED IN r54644

          Ivo Kurzemnieks CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (52) [F] When updating a discovered host, I get success message, but nothing has changed, no values are saved. Oleg Egorov RESOLVED IN r54592 Ivo Kurzemnieks Massupdating discovered host encryption fields does not work. Problem in frontend, API works fine. REOPENED. Oleg Egorov RESOLVED IN r54644 Ivo Kurzemnieks CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (53) [F] Hard to tell difference where ends "In" and where ends "Out". See attached image. Probably we can add two columns and use colspan on header and get rid of "IN / OUT" (or leave it?). Other suggestions were to make "... (To / From)" or "... (Connect / Accept)".

          Oleg Egorov RESOLVED IN r54588

          Ivo Kurzemnieks Spaces are a bit wide. I guess now it depends on css design. In the end we chose option #4
          CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (53) [F] Hard to tell difference where ends "In" and where ends "Out". See attached image . Probably we can add two columns and use colspan on header and get rid of "IN / OUT" (or leave it?). Other suggestions were to make "... (To / From)" or "... (Connect / Accept)". Oleg Egorov RESOLVED IN r54588 Ivo Kurzemnieks Spaces are a bit wide. I guess now it depends on css design. In the end we chose option #4 CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (54) [F] API host.massadd method does not save new fields. Suppose that is ok, because those fields are already added with host.create. But function should probably exist in checkInput() just like in proxy API. host.massupdate works fine for now, but functions should probably not be there either. Just like other fields, the host.massupdate will have to an option to check if there is something to be updated. For exampel host.update executes update for encryption fields, the massupdate skips updating them, however massupdate can update and validate fields if required.

          Oleg Egorov RESOLVED IN r54586

          Ivo Kurzemnieks massupdate still has some issues.
          In frontend checkboxes seem to have problems saving values. Edit form works.

          In API tls_connect is not validated properly. I can enter any value.

          REOPENED.

          Oleg Egorov RESOLVED IN r54630, r54631

          Ivo Kurzemnieks CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (54) [F] API host.massadd method does not save new fields. Suppose that is ok, because those fields are already added with host.create. But function should probably exist in checkInput() just like in proxy API. host.massupdate works fine for now, but functions should probably not be there either. Just like other fields, the host.massupdate will have to an option to check if there is something to be updated. For exampel host.update executes update for encryption fields, the massupdate skips updating them, however massupdate can update and validate fields if required. Oleg Egorov RESOLVED IN r54586 Ivo Kurzemnieks massupdate still has some issues. In frontend checkboxes seem to have problems saving values. Edit form works. In API tls_connect is not validated properly. I can enter any value. REOPENED. Oleg Egorov RESOLVED IN r54630, r54631 Ivo Kurzemnieks CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (55) [F] Please, see my changes in r54534

          Oleg Egorov Thank you!
          Reviewed and improved in r54566

          Ivo Kurzemnieks CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (55) [F] Please, see my changes in r54534 Oleg Egorov Thank you! Reviewed and improved in r54566 Ivo Kurzemnieks CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (56) [S] Server should handle creation of discovered hosts with same values from host.

          Andris Zeila RESOLVED in r54642

          Aleksandrs Saveljevs Looks good, but please see r54698. Most notably it fixes a SQL construction bug - "d = ",";" was missing for ZBX_FLAG_LLD_HOST_UPDATE_IPMI_PASS. Still RESOLVED.

          Andris Zeila Thanks, reviewed & CLOSED

          Show
          Ivo Kurzemnieks added a comment - - edited (56) [S] Server should handle creation of discovered hosts with same values from host. Andris Zeila RESOLVED in r54642 Aleksandrs Saveljevs Looks good, but please see r54698. Most notably it fixes a SQL construction bug - "d = ",";" was missing for ZBX_FLAG_LLD_HOST_UPDATE_IPMI_PASS. Still RESOLVED. Andris Zeila Thanks, reviewed & CLOSED
          Hide
          Ivo Kurzemnieks added a comment -

          Frontend code TESTED.

          Show
          Ivo Kurzemnieks added a comment - Frontend code TESTED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (57) The following suggestion comes from a research paper, "The Most Dangerous Code in the World: Validating SSL Certificates in Non-Browser Software", see http://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf .

          It says in section 4.1 that checking the return code of SSL_connect() is not enough - if SSL_connect() is successful, it should be followed by SSL_get_verify_result(), see https://www.openssl.org/docs/ssl/SSL_get_verify_result.html . (The paper also warns about gnutls_certificate_verify_peers2(), but we seem to be using that function correctly.)

          In section 7.6, it then says that calling SSL_get_verify_result() seems to only be necessary in case SSL_VERIFY_PEER is not set, which we actually do. It should be verified whether any changes are required to our code.

          Andris Mednis Thanks for inspiring paper! Calling SSL_get_verify_result() turned out to be a good idea even with SSL_VERIFY_PEER set and I added it to our code in both zbx_tls_connect() and zbx_tls_accept() functions - see issue (152).

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (57) The following suggestion comes from a research paper, "The Most Dangerous Code in the World: Validating SSL Certificates in Non-Browser Software", see http://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf . It says in section 4.1 that checking the return code of SSL_connect() is not enough - if SSL_connect() is successful, it should be followed by SSL_get_verify_result(), see https://www.openssl.org/docs/ssl/SSL_get_verify_result.html . (The paper also warns about gnutls_certificate_verify_peers2(), but we seem to be using that function correctly.) In section 7.6, it then says that calling SSL_get_verify_result() seems to only be necessary in case SSL_VERIFY_PEER is not set, which we actually do. It should be verified whether any changes are required to our code. Andris Mednis Thanks for inspiring paper! Calling SSL_get_verify_result() turned out to be a good idea even with SSL_VERIFY_PEER set and I added it to our code in both zbx_tls_connect() and zbx_tls_accept() functions - see issue (152) . Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (58) The following tries to document the current state of certificate validation.

          • If "Subject" field is left empty, then any certificate signed by a trusted CA will be accepted.
          • Therefore, using "Subject" field is a must - it should be explicitly stated in the documentation.
          • However, arriving at a proper value for "Subject" field is non-trivial - one should use a command-line tool for it. These command lines should be documented.
          • "Subject" field (which is a must) currently requires manual handling for every host. It would be nice to reduce the amount of effort in two ways:
            1. support partial matching, listing only required fields: "CN=agent.example.com" (then using a command-line tool will not be necessary)
            2. support macros: "CN={HOST.DNS}"
          • "Issuer" field mostly matters if the provided CA file contains multiple trusted CAs, so that the wanted CA can be chosen.
          • There is a trend to move away from "commonName" (see http://stackoverflow.com/questions/10175812/how-to-create-a-self-signed-certificate-with-openssl/27931596#27931596 and http://unmitigatedrisk.com/?p=381) in favor of "subjectAltName". The current implementation only checks "commonName" (in "Subject" only!) and ignores "subjectAltName". This means that if a Web server certificate omits "commonName" and uses "subjectAltName", this certificate cannot be reused for Zabbix if a user wants to verify the hostname.

          Whatever can be still redesigned should be redesigned. The rest should be documented.

          Aleksandrs Saveljevs Similarly, using --tls-agent-cert-subject for zabbix_get and --tls-server-cert-subject for zabbix_sender is a must, if proper certificate verification is desired.

          Andris Mednis

          Aleksandrs Saveljevs Created ZBXNEXT-3144 for partial matching and macros, and ZBXNEXT-3145 for "subjectAltName".

          Aleksandrs Saveljevs The page at https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#restricting_allowed_certificate_issuer_and_subject does not seem to stress that "Issuer" is optional and only really matters if there are mutiple CAs. It might also be useful to note that the considerations in "Restricting allowed certificate Issuer and Subject" also apply to "zabbix_get" and "zabbix_sender". REOPENED.

          Andris Mednis Fixed in https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#restricting_allowed_certificate_issuer_and_subject .

          Aleksandrs Saveljevs Thank you! CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (58) The following tries to document the current state of certificate validation. If "Subject" field is left empty, then any certificate signed by a trusted CA will be accepted. Therefore, using "Subject" field is a must - it should be explicitly stated in the documentation. However, arriving at a proper value for "Subject" field is non-trivial - one should use a command-line tool for it. These command lines should be documented. "Subject" field (which is a must) currently requires manual handling for every host. It would be nice to reduce the amount of effort in two ways: support partial matching, listing only required fields: "CN=agent.example.com" (then using a command-line tool will not be necessary) support macros: "CN={HOST.DNS}" "Issuer" field mostly matters if the provided CA file contains multiple trusted CAs, so that the wanted CA can be chosen. There is a trend to move away from "commonName" (see http://stackoverflow.com/questions/10175812/how-to-create-a-self-signed-certificate-with-openssl/27931596#27931596 and http://unmitigatedrisk.com/?p=381 ) in favor of "subjectAltName". The current implementation only checks "commonName" (in "Subject" only!) and ignores "subjectAltName". This means that if a Web server certificate omits "commonName" and uses "subjectAltName", this certificate cannot be reused for Zabbix if a user wants to verify the hostname. Whatever can be still redesigned should be redesigned. The rest should be documented. Aleksandrs Saveljevs Similarly, using --tls-agent-cert-subject for zabbix_get and --tls-server-cert-subject for zabbix_sender is a must, if proper certificate verification is desired. Andris Mednis Dealing with certificate Issuer/Subject fields is documented in https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#restricting_allowed_certificate_issuer_and_subject . Partial matching, macros for Subject field and checking of "subjectAltName" are not included into current design (can be put into separate feature requests). zabbix_get now supports --tls-agent-cert-issuer and --tls-agent-cert-subject options. zabbix_sender supports --tls-server-cert-issuer and --tls-server-cert-subject options. Aleksandrs Saveljevs Created ZBXNEXT-3144 for partial matching and macros, and ZBXNEXT-3145 for "subjectAltName". Aleksandrs Saveljevs The page at https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#restricting_allowed_certificate_issuer_and_subject does not seem to stress that "Issuer" is optional and only really matters if there are mutiple CAs. It might also be useful to note that the considerations in "Restricting allowed certificate Issuer and Subject" also apply to "zabbix_get" and "zabbix_sender". REOPENED. Andris Mednis Fixed in https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#restricting_allowed_certificate_issuer_and_subject . Aleksandrs Saveljevs Thank you! CLOSED.
          Hide
          Andris Mednis added a comment - - edited

          (59) If a TLS parameter has invalid value (e.g. corrupted certificate file or empty PSK file) then some server, proxy or agentd processes sometimes do no terminate (they have to be terminated with "kill -9").

          Andris Mednis RESOLVED in r54657.

          Aleksandrs Saveljevs Looks good. I only removed #ifndef _WINDOWS from Unix-only code in r54684. Please take a look. Still RESOLVED.

          Andris Mednis Thanks ! Reviewed and accepted.

          Aleksandrs Saveljevs CLOSED

          Show
          Andris Mednis added a comment - - edited (59) If a TLS parameter has invalid value (e.g. corrupted certificate file or empty PSK file) then some server, proxy or agentd processes sometimes do no terminate (they have to be terminated with "kill -9"). Andris Mednis RESOLVED in r54657. Aleksandrs Saveljevs Looks good. I only removed #ifndef _WINDOWS from Unix-only code in r54684. Please take a look. Still RESOLVED. Andris Mednis Thanks ! Reviewed and accepted. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (60) The following warnings are produced compiling Zabbix without TLS support:

          active.c:564:6: warning: variable 'tls_arg1' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED'
          #define ZBX_TCP_SEC_UNENCRYPTED         1               /* do not use encryption with this socket */
                                                  ^
          active.c:582:33: note: uninitialized use occurs here
                                  configured_tls_connect_mode, tls_arg1, tls_arg2)))
                                                               ^~~~~~~~
          active.c:564:2: note: remove the 'if' if its condition is always true
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          active.c:493:19: note: initialize the variable 'tls_arg1' to silence this warning
                  char                            *tls_arg1, *tls_arg2;
                                                           ^
                                                            = NULL
          active.c:564:6: warning: variable 'tls_arg2' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED'
          #define ZBX_TCP_SEC_UNENCRYPTED         1               /* do not use encryption with this socket */
                                                  ^
          active.c:582:43: note: uninitialized use occurs here
                                  configured_tls_connect_mode, tls_arg1, tls_arg2)))
                                                                         ^~~~~~~~
          active.c:564:2: note: remove the 'if' if its condition is always true
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          active.c:493:30: note: initialize the variable 'tls_arg2' to silence this warning
                  char                            *tls_arg1, *tls_arg2;
                                                                      ^
                                                                       = NULL
          active.c:745:6: warning: variable 'tls_arg1' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED'
          #define ZBX_TCP_SEC_UNENCRYPTED         1               /* do not use encryption with this socket */
                                                  ^
          active.c:763:33: note: uninitialized use occurs here
                                  configured_tls_connect_mode, tls_arg1, tls_arg2)))
                                                               ^~~~~~~~
          active.c:745:2: note: remove the 'if' if its condition is always true
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          active.c:685:19: note: initialize the variable 'tls_arg1' to silence this warning
                  char                            *tls_arg1, *tls_arg2;
                                                           ^
                                                            = NULL
          active.c:745:6: warning: variable 'tls_arg2' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED'
          #define ZBX_TCP_SEC_UNENCRYPTED         1               /* do not use encryption with this socket */
                                                  ^
          active.c:763:43: note: uninitialized use occurs here
                                  configured_tls_connect_mode, tls_arg1, tls_arg2)))
                                                                         ^~~~~~~~
          active.c:745:2: note: remove the 'if' if its condition is always true
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          active.c:685:30: note: initialize the variable 'tls_arg2' to silence this warning
                  char                            *tls_arg1, *tls_arg2;
                                                                      ^
                                                                       = NULL
          4 warnings generated.
          servercomms.c:46:6: warning: variable 'tls_arg1' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED'
          #define ZBX_TCP_SEC_UNENCRYPTED         1               /* do not use encryption with this socket */
                                                  ^
          servercomms.c:64:33: note: uninitialized use occurs here
                                  configured_tls_connect_mode, tls_arg1, tls_arg2)))
                                                               ^~~~~~~~
          servercomms.c:46:2: note: remove the 'if' if its condition is always true
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          servercomms.c:41:16: note: initialize the variable 'tls_arg1' to silence this warning
                  char    *tls_arg1, *tls_arg2;
                                   ^
                                    = NULL
          servercomms.c:46:6: warning: variable 'tls_arg2' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED'
          #define ZBX_TCP_SEC_UNENCRYPTED         1               /* do not use encryption with this socket */
                                                  ^
          servercomms.c:64:43: note: uninitialized use occurs here
                                  configured_tls_connect_mode, tls_arg1, tls_arg2)))
                                                                         ^~~~~~~~
          servercomms.c:46:2: note: remove the 'if' if its condition is always true
                  if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode)
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          servercomms.c:41:27: note: initialize the variable 'tls_arg2' to silence this warning
                  char    *tls_arg1, *tls_arg2;
                                              ^
                                               = NULL
          2 warnings generated.
          

          Andris Mednis RESOLVED in r54843.

          Aleksandrs Saveljevs Looks good. Another approach would be to write something like this, without initialization:

          #if defined(HAVE_POLARSSL) || defined(HAVE_GNUTLS) || defined(HAVE_OPENSSL)
          	if (ZBX_TCP_SEC_TLS_CERT == configured_tls_connect_mode)
          	{
          		tls_arg1 = CONFIG_TLS_SERVER_CERT_ISSUER;
          		tls_arg2 = CONFIG_TLS_SERVER_CERT_SUBJECT;
          	}
          	else if (ZBX_TCP_SEC_TLS_PSK == configured_tls_connect_mode)
          	{
          		tls_arg1 = CONFIG_TLS_PSK_IDENTITY;
          		tls_arg2 = NULL;			/* zbx_tls_connect() will find PSK */
          	}
          	else	/* ZBX_TCP_SEC_UNENCRYPTED */
          #endif
          	{
          		tls_arg1 = NULL;
          		tls_arg2 = NULL;
          	}
          

          If the above is not inspiring, this subissue can be closed.

          Aleksandrs Saveljevs As discussed, not inspiring. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (60) The following warnings are produced compiling Zabbix without TLS support: active.c:564:6: warning: variable 'tls_arg1' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED' #define ZBX_TCP_SEC_UNENCRYPTED 1 /* do not use encryption with this socket */ ^ active.c:582:33: note: uninitialized use occurs here configured_tls_connect_mode, tls_arg1, tls_arg2))) ^~~~~~~~ active.c:564:2: note: remove the 'if' if its condition is always true if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ active.c:493:19: note: initialize the variable 'tls_arg1' to silence this warning char *tls_arg1, *tls_arg2; ^ = NULL active.c:564:6: warning: variable 'tls_arg2' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED' #define ZBX_TCP_SEC_UNENCRYPTED 1 /* do not use encryption with this socket */ ^ active.c:582:43: note: uninitialized use occurs here configured_tls_connect_mode, tls_arg1, tls_arg2))) ^~~~~~~~ active.c:564:2: note: remove the 'if' if its condition is always true if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ active.c:493:30: note: initialize the variable 'tls_arg2' to silence this warning char *tls_arg1, *tls_arg2; ^ = NULL active.c:745:6: warning: variable 'tls_arg1' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED' #define ZBX_TCP_SEC_UNENCRYPTED 1 /* do not use encryption with this socket */ ^ active.c:763:33: note: uninitialized use occurs here configured_tls_connect_mode, tls_arg1, tls_arg2))) ^~~~~~~~ active.c:745:2: note: remove the 'if' if its condition is always true if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ active.c:685:19: note: initialize the variable 'tls_arg1' to silence this warning char *tls_arg1, *tls_arg2; ^ = NULL active.c:745:6: warning: variable 'tls_arg2' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED' #define ZBX_TCP_SEC_UNENCRYPTED 1 /* do not use encryption with this socket */ ^ active.c:763:43: note: uninitialized use occurs here configured_tls_connect_mode, tls_arg1, tls_arg2))) ^~~~~~~~ active.c:745:2: note: remove the 'if' if its condition is always true if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ active.c:685:30: note: initialize the variable 'tls_arg2' to silence this warning char *tls_arg1, *tls_arg2; ^ = NULL 4 warnings generated. servercomms.c:46:6: warning: variable 'tls_arg1' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED' #define ZBX_TCP_SEC_UNENCRYPTED 1 /* do not use encryption with this socket */ ^ servercomms.c:64:33: note: uninitialized use occurs here configured_tls_connect_mode, tls_arg1, tls_arg2))) ^~~~~~~~ servercomms.c:46:2: note: remove the 'if' if its condition is always true if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ servercomms.c:41:16: note: initialize the variable 'tls_arg1' to silence this warning char *tls_arg1, *tls_arg2; ^ = NULL servercomms.c:46:6: warning: variable 'tls_arg2' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../include/comms.h:116:34: note: expanded from macro 'ZBX_TCP_SEC_UNENCRYPTED' #define ZBX_TCP_SEC_UNENCRYPTED 1 /* do not use encryption with this socket */ ^ servercomms.c:64:43: note: uninitialized use occurs here configured_tls_connect_mode, tls_arg1, tls_arg2))) ^~~~~~~~ servercomms.c:46:2: note: remove the 'if' if its condition is always true if (ZBX_TCP_SEC_UNENCRYPTED == configured_tls_connect_mode) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ servercomms.c:41:27: note: initialize the variable 'tls_arg2' to silence this warning char *tls_arg1, *tls_arg2; ^ = NULL 2 warnings generated. Andris Mednis RESOLVED in r54843. Aleksandrs Saveljevs Looks good. Another approach would be to write something like this, without initialization: #if defined(HAVE_POLARSSL) || defined(HAVE_GNUTLS) || defined(HAVE_OPENSSL) if (ZBX_TCP_SEC_TLS_CERT == configured_tls_connect_mode) { tls_arg1 = CONFIG_TLS_SERVER_CERT_ISSUER; tls_arg2 = CONFIG_TLS_SERVER_CERT_SUBJECT; } else if (ZBX_TCP_SEC_TLS_PSK == configured_tls_connect_mode) { tls_arg1 = CONFIG_TLS_PSK_IDENTITY; tls_arg2 = NULL; /* zbx_tls_connect() will find PSK */ } else /* ZBX_TCP_SEC_UNENCRYPTED */ #endif { tls_arg1 = NULL; tls_arg2 = NULL; } If the above is not inspiring, this subissue can be closed. Aleksandrs Saveljevs As discussed, not inspiring. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (61) [F] In "Encryption" tab of host configuration, enter "2" in PSK. The frontend does not allow to save the value with the following error message:

          Incorrect value used for PSK field. Only hexadecimal characters are supported.
          

          "2" is a perfect hexadecimal character, so the error message should be improved.

          Oleg Egorov RESOLVED IN r54728

          Aleksandrs Saveljevs After entering an incorrect PSK value for hosts and pressing "Update", it stays on the same "Encryption" tab. However, for proxies it switches to "Proxy" tab instead. REOPENED.

          Oleg Egorov RESOLVED IN r56078

          Ivo Kurzemnieks CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (61) [F] In "Encryption" tab of host configuration, enter "2" in PSK. The frontend does not allow to save the value with the following error message: Incorrect value used for PSK field. Only hexadecimal characters are supported. "2" is a perfect hexadecimal character, so the error message should be improved. Oleg Egorov RESOLVED IN r54728 Aleksandrs Saveljevs After entering an incorrect PSK value for hosts and pressing "Update", it stays on the same "Encryption" tab. However, for proxies it switches to "Proxy" tab instead. REOPENED. Oleg Egorov RESOLVED IN r56078 Ivo Kurzemnieks CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (62) [F] In "Encryption" tab of host configuration, if we uncheck all "Connections from host" checkboxes, then it saves successfully, but after reopening the host's encryption tab there is a checkbox beside "No encryption". Silently changing "no checkboxes" to "No encryption" is not good, because if a user unchecks all checkboxes, his intention might be to disallow any kind of connections from that host. If so, the result will not be what he wanted. If we require at least one checkbox, the frontend should complain if there are none.

          Oleg Egorov RESOLVED IN r54727

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (62) [F] In "Encryption" tab of host configuration, if we uncheck all "Connections from host" checkboxes, then it saves successfully, but after reopening the host's encryption tab there is a checkbox beside "No encryption". Silently changing "no checkboxes" to "No encryption" is not good, because if a user unchecks all checkboxes, his intention might be to disallow any kind of connections from that host. If so, the result will not be what he wanted. If we require at least one checkbox, the frontend should complain if there are none. Oleg Egorov RESOLVED IN r54727 Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (63) [F] For discovered hosts, we allow to view them, but not change anything - e.g., IPMI settings cannot be changed and macros cannot be added. However, changing encryption settings is currently allowed, which does not make a lot of sense, because any changes will be overwritten the next time discovery runs.

          Oleg Egorov RESOLVED IN r54730

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (63) [F] For discovered hosts, we allow to view them, but not change anything - e.g., IPMI settings cannot be changed and macros cannot be added. However, changing encryption settings is currently allowed, which does not make a lot of sense, because any changes will be overwritten the next time discovery runs. Oleg Egorov RESOLVED IN r54730 Aleksandrs Saveljevs CLOSED
          Hide
          Oleg Egorov added a comment - - edited

          (64) As discussed, improved proxies edit page, encryption tab.
          Enabled/Disabled connection to and from host options based on proxy mode.

          RESOLVED IN r54737

          Aleksandrs Saveljevs There is something to be improved either on the frontend side or on the server side.

          Suppose we had an active proxy intially and we had the server configured to accept PSK connections from it. We then decided to configure the server to connect unencrypted to the proxy, but have not made any changes on the proxy side yet. This means that proxy will still attempt to connect to the server. Server then logs the following lines:

           17047:20150813:123846.908 cannot parse history data from active proxy at "127.0.0.1": proxy "proxy-openssl-1.0.1e" is configured in passive mode
          

          The fact that server could parse the proxy name from the received JSON means that it knows the PSK identity and PSK value. However, in frontend it currently looks like this:

          The PSK identity and PSK values are hidden, but they are actually taken into account by server. REOPENED.

          Oleg Egorov RESOLVED IN r55137

          Aleksandrs Saveljevs Not possible to save the following configuration:

          REOPENED

          Oleg Egorov RESOLVED IN r55300

          Aleksandrs Saveljevs It is now possible to save this configuration, but PSK values will be cleared!

          Aleksandrs Saveljevs While at it, please take a look at typo fixes in r55805. REOPENED.

          Oleg Egorov Thanks for improvements. And, yes, after "save" disabled values removed. Because HTML doesn't send disabled fields. Same logic exist in other places, for example in "Housekeeping"

          RESOLVED IN r56110

          Ivo Kurzemnieks CLOSED.

          Show
          Oleg Egorov added a comment - - edited (64) As discussed, improved proxies edit page, encryption tab. Enabled/Disabled connection to and from host options based on proxy mode. RESOLVED IN r54737 Aleksandrs Saveljevs There is something to be improved either on the frontend side or on the server side. Suppose we had an active proxy intially and we had the server configured to accept PSK connections from it. We then decided to configure the server to connect unencrypted to the proxy, but have not made any changes on the proxy side yet. This means that proxy will still attempt to connect to the server. Server then logs the following lines: 17047:20150813:123846.908 cannot parse history data from active proxy at "127.0.0.1": proxy "proxy-openssl-1.0.1e" is configured in passive mode The fact that server could parse the proxy name from the received JSON means that it knows the PSK identity and PSK value. However, in frontend it currently looks like this: The PSK identity and PSK values are hidden, but they are actually taken into account by server. REOPENED. Oleg Egorov RESOLVED IN r55137 Aleksandrs Saveljevs Not possible to save the following configuration: REOPENED Oleg Egorov RESOLVED IN r55300 Aleksandrs Saveljevs It is now possible to save this configuration, but PSK values will be cleared! Aleksandrs Saveljevs While at it, please take a look at typo fixes in r55805. REOPENED. Oleg Egorov Thanks for improvements. And, yes, after "save" disabled values removed. Because HTML doesn't send disabled fields. Same logic exist in other places, for example in "Housekeeping" RESOLVED IN r56110 Ivo Kurzemnieks CLOSED.
          Hide
          Andris Mednis added a comment - - edited

          (65) In libgnutls.m4 and libopenssl.m4 scripts the directory /usr/include is searched before /usr/local/include. http://www.network-theory.co.uk/docs/gccintro/gccintro_21.html says that in GCC "... a header file found in ‘/usr/local/include’ takes precedence over a file with the same name in ‘/usr/include’. Similarly, a library found in ‘/usr/local/lib’ takes precedence over a library with the same name in ‘/usr/lib’."

          Andris Mednis RESOLVED in r54745.

          Aleksandrs Saveljevs According to my notes, the following command outputs directories that GCC searches by default:

          $ gcc -v -xc -E - < /dev/null
          ...
           /usr/lib/gcc/x86_64-linux-gnu/4.9/include
           /usr/local/include
           /usr/lib/gcc/x86_64-linux-gnu/4.9/include-fixed
           /usr/include/x86_64-linux-gnu
           /usr/include
          ...
          

          Aleksandrs Saveljevs CLOSED

          Show
          Andris Mednis added a comment - - edited (65) In libgnutls.m4 and libopenssl.m4 scripts the directory /usr/include is searched before /usr/local/include. http://www.network-theory.co.uk/docs/gccintro/gccintro_21.html says that in GCC "... a header file found in ‘/usr/local/include’ takes precedence over a file with the same name in ‘/usr/include’. Similarly, a library found in ‘/usr/local/lib’ takes precedence over a library with the same name in ‘/usr/lib’." Andris Mednis RESOLVED in r54745. Aleksandrs Saveljevs According to my notes, the following command outputs directories that GCC searches by default: $ gcc -v -xc -E - < /dev/null ... /usr/lib/gcc/x86_64-linux-gnu/4.9/include /usr/local/include /usr/lib/gcc/x86_64-linux-gnu/4.9/include-fixed /usr/include/x86_64-linux-gnu /usr/include ... Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (66) [D] Similar to (58), which deals with certificates, this subissue attempts to document the current state of PSK.

          • The page at https://en.wikipedia.org/wiki/TLS-PSK says:

            There are several ciphersuites: The first set of ciphersuites uses only symmetric key operations for authentication. The second set uses a Diffie-Hellman key exchange authenticated with a pre-shared key. The third set combines public key authentication of the server with pre-shared key authentication of the client.

            From these three, the 2nd option is preferred and the 1st option is supported. The 3rd option is not supported and it should be noted in the documentation.

          • The limit on PSK length seems to be 32 bytes for mbed TLS and 256 bytes for GnuTLS and OpenSSL.
          • Any PSK length up to the limit will work (e.g., 1, 2, 3, ..., 111, etc. bytes).
          • Commands to generate a PSK should be documented.

          Aleksandrs Saveljevs For example, the following commands can be used to generate a PSK

          • with GnuTLS:
            $ psktool -u psk_identity -p database.psk -s 32
            Generating a random key for user 'psk_identity'
            Key stored to database.psk
            $ cat database.psk 
            psk_identity:9b8eafedfaae00cece62e85d5f4792c7d9c9bcc851b23216a1d300311cc4f7cb
            
          • with OpenSSL:
            $ openssl rand -hex 32
            af8ced32dfe8714e548694e2d29e1a14ba6fa13f216cb35c19d0feb1084b0429
            

          Aleksandrs Saveljevs It can be seen that "psktool" above generates a database file with a PSK identity and its associated PSK. It should be noted that Zabbix expects just a PSK in the PSK file, so if one uses "psktool", then identity string should be removed from the file.

          Aleksandrs Saveljevs It would be nice if macros would be supported in PSK identity field, e.g., {HOST.HOST}.

          Andris Mednis

          Aleksandrs Saveljevs Supporting macros in PSK fields reported in ZBXNEXT-3148. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (66) [D] Similar to (58), which deals with certificates, this subissue attempts to document the current state of PSK. The page at https://en.wikipedia.org/wiki/TLS-PSK says: There are several ciphersuites: The first set of ciphersuites uses only symmetric key operations for authentication. The second set uses a Diffie-Hellman key exchange authenticated with a pre-shared key. The third set combines public key authentication of the server with pre-shared key authentication of the client. From these three, the 2nd option is preferred and the 1st option is supported. The 3rd option is not supported and it should be noted in the documentation. The limit on PSK length seems to be 32 bytes for mbed TLS and 256 bytes for GnuTLS and OpenSSL. Any PSK length up to the limit will work (e.g., 1, 2, 3, ..., 111, etc. bytes). Commands to generate a PSK should be documented. Aleksandrs Saveljevs For example, the following commands can be used to generate a PSK with GnuTLS: $ psktool -u psk_identity -p database.psk -s 32 Generating a random key for user 'psk_identity' Key stored to database.psk $ cat database.psk psk_identity:9b8eafedfaae00cece62e85d5f4792c7d9c9bcc851b23216a1d300311cc4f7cb with OpenSSL: $ openssl rand -hex 32 af8ced32dfe8714e548694e2d29e1a14ba6fa13f216cb35c19d0feb1084b0429 Aleksandrs Saveljevs It can be seen that "psktool" above generates a database file with a PSK identity and its associated PSK. It should be noted that Zabbix expects just a PSK in the PSK file, so if one uses "psktool", then identity string should be removed from the file. Aleksandrs Saveljevs It would be nice if macros would be supported in PSK identity field, e.g., {HOST.HOST}. Andris Mednis Supported ciphersuites documented in https://www.zabbix.com/documentation/3.0/manual/encryption#ciphersuites . Limits on PSK length documented in https://www.zabbix.com/documentation/3.0/manual/encryption/using_pre_shared_keys#size_limits . It was decided to add a minimum PSK length limit - see issue (150) . Commands to generate a PSK documented in https://www.zabbix.com/documentation/3.0/manual/encryption/using_pre_shared_keys#generating_psk . Aleksandrs Saveljevs Supporting macros in PSK fields reported in ZBXNEXT-3148 . CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (67) It is required that every PSK identity has a unique value. In other words, if two hosts (or a host and a proxy) configure two different PSK values for one PSK identity, there will be problems, as mentioned at https://www.zabbix.com/documentation/3.0/manual/encryption?&#using_pre-shared_keys_psk .

          Unfortunately, the only diagnostic we provide is the following line in DCsync_hosts():

          zabbix_log(LOG_LEVEL_WARNING, "PSK value changed for identity \"%s\"", row[28]);
          

          From this line by itself, it is not clear whether we have two different PSK values configured for one PSK identity, or a user has just fixed a typo, and even if one could deduce that there are different PSK values, one would have to look directly into the database to find out offending hosts.

          It would be nice if in DCsync_hosts() we could print hosts that have conflicting PSK values configured. If frontend would help to prevent such configuration errors, too, that would also be wonderful.

          Glebs Ivanovskis Conflict detection and reporting implemented in r56440. Messages will look like:

           10883:20151029:180934.222 PSK value changed for identity "test"
           10883:20151029:180934.222 PSK value changed for identity "test 2"
           10883:20151029:180934.222 multiple PSK values configured for identity "test" on hosts: "Test 2", "Dummy", "Test 1" and proxies: "Test 1"
           10883:20151029:180934.222 multiple PSK values configured for identity "test 2" on proxies: "Test 4", "Test 3"
          

          There will also be no more than one message about PSK value change per PSK identity per config update.
          Moved conflict logging outside of config cache lock to improve performance in r56536.
          RESOLVED

          Aleksandrs Saveljevs The proposed solution produces beautifully formatted output, but it seems too heavyweight: it adds a new structure definition, a new static variable, increases ZBX_DC_PSK size, and adds two functions - all that just for the sake of dealing with the situation that is a misconfiguration error.

          Aleksandrs Saveljevs Attached subissue-67-r56413.patch (a patch against r56413) is an alternative proposal to achieve a similar goal with less code. Unlike the currently committed solution, it does not log "PSK value changed ..." lines, because PSK value changes are not necessarily a problem. Instead, it produces lines like the following:

           23454:20151112:151131.955 forced reloading of the configuration cache
           23454:20151112:151131.956 conflicting PSK values for PSK identity "ID1" on hosts "host 1" and "host 2" (and maybe others)
           23454:20151112:151131.956 conflicting PSK values for PSK identity "ID1" on hosts "host 1" and "host 3" (and maybe others)
           23454:20151112:151131.956 conflicting PSK values for PSK identity "ID1" on hosts "host 1" and "host 4" (and maybe others)
          

          Aleksandrs Saveljevs Let's ask Andris Mednis or Alexander Vladishev to take a look at both solutions and choose the preferred approach. Still RESOLVED.

          Aleksandrs Saveljevs As discussed on IRC, the solution in the patch seems to be preferred. So the old solution was reverted in r57470, and the patch applied in r57474 and r57479. Glebs Ivanovskis, please take a look.

          Glebs Ivanovskis PSK conflict logging in lightweight solution occurs inside configuration cache lock. It can impact performance and this was something Alexander Vladishev wasn't particularly fond of. But as far as simplicity is our top priority I am satisfied with this solution.
          CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (67) It is required that every PSK identity has a unique value. In other words, if two hosts (or a host and a proxy) configure two different PSK values for one PSK identity, there will be problems, as mentioned at https://www.zabbix.com/documentation/3.0/manual/encryption?&#using_pre-shared_keys_psk . Unfortunately, the only diagnostic we provide is the following line in DCsync_hosts(): zabbix_log(LOG_LEVEL_WARNING, "PSK value changed for identity \" %s\"", row[28]); From this line by itself, it is not clear whether we have two different PSK values configured for one PSK identity, or a user has just fixed a typo, and even if one could deduce that there are different PSK values, one would have to look directly into the database to find out offending hosts. It would be nice if in DCsync_hosts() we could print hosts that have conflicting PSK values configured. If frontend would help to prevent such configuration errors, too, that would also be wonderful. Glebs Ivanovskis Conflict detection and reporting implemented in r56440. Messages will look like: 10883:20151029:180934.222 PSK value changed for identity "test" 10883:20151029:180934.222 PSK value changed for identity "test 2" 10883:20151029:180934.222 multiple PSK values configured for identity "test" on hosts: "Test 2", "Dummy", "Test 1" and proxies: "Test 1" 10883:20151029:180934.222 multiple PSK values configured for identity "test 2" on proxies: "Test 4", "Test 3" There will also be no more than one message about PSK value change per PSK identity per config update. Moved conflict logging outside of config cache lock to improve performance in r56536. RESOLVED Aleksandrs Saveljevs The proposed solution produces beautifully formatted output, but it seems too heavyweight: it adds a new structure definition, a new static variable, increases ZBX_DC_PSK size, and adds two functions - all that just for the sake of dealing with the situation that is a misconfiguration error. Aleksandrs Saveljevs Attached subissue-67-r56413.patch (a patch against r56413) is an alternative proposal to achieve a similar goal with less code. Unlike the currently committed solution, it does not log "PSK value changed ..." lines, because PSK value changes are not necessarily a problem. Instead, it produces lines like the following: 23454:20151112:151131.955 forced reloading of the configuration cache 23454:20151112:151131.956 conflicting PSK values for PSK identity "ID1" on hosts "host 1" and "host 2" (and maybe others) 23454:20151112:151131.956 conflicting PSK values for PSK identity "ID1" on hosts "host 1" and "host 3" (and maybe others) 23454:20151112:151131.956 conflicting PSK values for PSK identity "ID1" on hosts "host 1" and "host 4" (and maybe others) Aleksandrs Saveljevs Let's ask Andris Mednis or Alexander Vladishev to take a look at both solutions and choose the preferred approach. Still RESOLVED. Aleksandrs Saveljevs As discussed on IRC, the solution in the patch seems to be preferred. So the old solution was reverted in r57470, and the patch applied in r57474 and r57479. Glebs Ivanovskis , please take a look. Glebs Ivanovskis PSK conflict logging in lightweight solution occurs inside configuration cache lock. It can impact performance and this was something Alexander Vladishev wasn't particularly fond of. But as far as simplicity is our top priority I am satisfied with this solution. CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (68) Suppose we have a server configured to connect with PSK to passive agents, but agents only accept unencrypted connections. Then we have the following error on the red "ZBX" in the frontend:

          Get value from agent failed: TLS handshake returned SSL_ERROR_SYSCALL:: 
          

          According to tls.c, there should be additional information after "SSL_ERROR_SYSCALL", so it looks like a bug. Server uses OpenSSL 1.0.2c.

          Andris Mednis Thanks for finding a bug! RESOLVED in 55094.

          Aleksandrs Saveljevs It now returns the following message in the described use case:

          Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30001]: Connection closed by peer. Check allowed connection types and access rights
          

          This is good. Unfortunately, if the agent is GnuTLS, it returns the same message if their PSK values do not match. The error message is then a bit misleading, because connection types and access rights are OK. See also (69) for example messages with mbed TLS and OpenSSL agents, which are much better. REOPENED.

          Andris Mednis RESOLVED in r55368.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (68) Suppose we have a server configured to connect with PSK to passive agents, but agents only accept unencrypted connections. Then we have the following error on the red "ZBX" in the frontend: Get value from agent failed: TLS handshake returned SSL_ERROR_SYSCALL:: According to tls.c, there should be additional information after "SSL_ERROR_SYSCALL", so it looks like a bug. Server uses OpenSSL 1.0.2c. Andris Mednis Thanks for finding a bug! RESOLVED in 55094. Aleksandrs Saveljevs It now returns the following message in the described use case: Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30001]: Connection closed by peer. Check allowed connection types and access rights This is good. Unfortunately, if the agent is GnuTLS, it returns the same message if their PSK values do not match. The error message is then a bit misleading, because connection types and access rights are OK. See also (69) for example messages with mbed TLS and OpenSSL agents, which are much better. REOPENED. Andris Mednis RESOLVED in r55368. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (69) [D] This subissue is meant to collect different error messages a user can see on the red "ZBX" in the frontend.

          PSK

          • In case server is configured to connect with PSK to passive agents, but agents only accept unencrypted connections. Then the error messages are as follows:
            • mbed TLS 1.3.11
              Get value from agent failed: ssl_handshake(): SSL - The connection indicated an EOF
              
            • GnuTLS 3.3.16
              Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated.
              
            • OpenSSL 1.0.2c
              Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30001]: Connection closed by peer. Check allowed connection types and access rights
              
          • In case server is configured to connect with PSK to passive agents and agent also accepts PSK connections, but their PSK values do not match. Then the error messages are as follows:
            • mbed TLS 1.3.11
              • if agent is GnuTLS 3.1.18:
                Get value from agent failed: ssl_handshake(): SSL - The connection indicated an EOF
                
              • if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1:
                Get value from agent failed: ssl_handshake(): SSL - A fatal alert message was received from our peer
                
            • GnuTLS 3.3.16
              • if agent is GnuTLS 3.1.18:
                Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated.
                
              • if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1:
                Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 20 Bad record MAC
                
            • OpenSSL 1.0.2c
              • if agent is GnuTLS 3.1.18 (FIXME after (68)):
                Get value from agent failed: TLS handshake returned SSL_ERROR_SYSCALL:: 
                
              • if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1:
                Get value from agent failed: TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1472: error:140943FC:SSL routines:ssl3_read_bytes:sslv3 alert bad record mac: SSL alert number 20: TLS read fatal alert "bad record mac"
                
          • In case server is configured to connect with PSK to passive agents and agent also accepts PSK connections, but their PSK identities do not match. Then the error messages are as follows:
            • mbed TLS 1.3.11
              • if agent is GnuTLS 3.1.18:
                Get value from agent failed: ssl_handshake(): SSL - The connection indicated an EOF
                
              • if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1:
                Get value from agent failed: ssl_handshake(): NET - Connection was reset by peer
                
            • GnuTLS 3.3.16
              • if agent is GnuTLS 3.1.18:
                Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated.
                
              • if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1:
                Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 115 The SRP/PSK username is missing or not known
                
            • OpenSSL 1.0.2c
              • if agent is GnuTLS 3.1.18 (FIXME after (68)):
                Get value from agent failed: TLS handshake returned SSL_ERROR_SYSCALL:: 
                
              • if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1:
                Get value from agent failed: TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1472: error:1409445B:SSL routines:ssl3_read_bytes:reason(1115): SSL alert number 115: TLS read fatal alert "unknown PSK identity"
                
          • In case server is configured to connect with PSK to passive agents, but the PSK is longer than the library can handle. Then the error messages are as follows:
            • mbed TLS 1.3.11
              Get value from agent failed: ssl_set_psk(): SSL - Bad input parameters to function
              
          • In case server is configured to accept PSK connections from active agents, but PSK values do not match. The following lines are written into the server log:
            • mbed TLS 1.3.11
               13855:20150812:154541.594 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): SSL - Verification of the message MAC failed
              
            • GnuTLS 3.3.16
               13743:20150812:154021.740 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -24 Decryption has failed.
              
            • OpenSSL 1.0.2c
               12930:20150812:151922.173 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 532: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac"
              
          • In case server is configured to accept PSK connections from active agents, but PSK identities do not match. The following lines are written into the server log:
            • mbed TLS 1.3.11
               13456:20150812:153410.263 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): SSL - Unknown identity received (eg, PSK identity)
              
            • GnuTLS 3.3.16
               13571:20150812:153653.750 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -31 Error in password file.
              
            • OpenSSL 1.0.2c
               12930:20150812:152427.477 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 2767: error:1408B0DF:SSL routines:ssl3_get_client_key_exchange:psk identity not found: TLS write fatal alert "unknown PSK identity"
              
          • In case a server sends a PSK identity longer than an agent can handle:
            • mbed TLS 1.3.11
              • if agent is PolarSSL 1.3.9:
                N/A
                
              • if agent is GnuTLS 3.1.18:
                Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30003]: ssl_handshake(): SSL - The connection indicated an EOF
                
              • if agent is OpenSSL 1.0.1:
                Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30004]: ssl_handshake(): NET - Connection was reset by peer
                
            • GnuTLS 3.3.16
              N/A
              
            • OpenSSL 1.0.2c
              N/A
              
          • In case an agent sends a PSK identity longer than a server can handle:
            • mbed TLS 1.3.11
              N/A
              
            • GnuTLS 3.3.16 (FIXME should we write the first line?)
               27277:20150828:093312.236 zbx_tls_accept(): gnutls_handshake() returned: -90 The SRP username supplied is illegal.
               27277:20150828:093312.236 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -90 The SRP username supplied is illegal.
              
            • OpenSSL 1.0.2c
              failed to accept an incoming connection: from 127.0.0.1: TLS connection from 127.0.0.1 has been closed during handshake: file s3_pkt.c line 1472: error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure: SSL alert number 40: TLS read fatal alert "handshake failure"
              

          Certificates

          • In case a server connects to an agent with a certificate, but agent certificate is not trusted:
            • mbed TLS 1.3.11
              Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30002]: invalid peer certificate: self-signed or not signed by trusted CA
              
            • GnuTLS 3.3.16
              • if agent is GnuTLS 3.1.18:
                Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30003]: invalid peer certificate: The certificate is NOT trusted. The certificate issuer is unknown. 
                
              • if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1:
                Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30006]: zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 46 Unknown certificate
                
            • OpenSSL 1.0.2c
              ...
              

          Andris Mednis Nice collection! Parts of it can be included in documentation upon need. CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (69) [D] This subissue is meant to collect different error messages a user can see on the red "ZBX" in the frontend. PSK In case server is configured to connect with PSK to passive agents, but agents only accept unencrypted connections. Then the error messages are as follows: mbed TLS 1.3.11 Get value from agent failed: ssl_handshake(): SSL - The connection indicated an EOF GnuTLS 3.3.16 Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated. OpenSSL 1.0.2c Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30001]: Connection closed by peer. Check allowed connection types and access rights In case server is configured to connect with PSK to passive agents and agent also accepts PSK connections, but their PSK values do not match. Then the error messages are as follows: mbed TLS 1.3.11 if agent is GnuTLS 3.1.18: Get value from agent failed: ssl_handshake(): SSL - The connection indicated an EOF if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1: Get value from agent failed: ssl_handshake(): SSL - A fatal alert message was received from our peer GnuTLS 3.3.16 if agent is GnuTLS 3.1.18: Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated. if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1: Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 20 Bad record MAC OpenSSL 1.0.2c if agent is GnuTLS 3.1.18 ( FIXME after (68)): Get value from agent failed: TLS handshake returned SSL_ERROR_SYSCALL:: if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1: Get value from agent failed: TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1472: error:140943FC:SSL routines:ssl3_read_bytes:sslv3 alert bad record mac: SSL alert number 20: TLS read fatal alert "bad record mac" In case server is configured to connect with PSK to passive agents and agent also accepts PSK connections, but their PSK identities do not match. Then the error messages are as follows: mbed TLS 1.3.11 if agent is GnuTLS 3.1.18: Get value from agent failed: ssl_handshake(): SSL - The connection indicated an EOF if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1: Get value from agent failed: ssl_handshake(): NET - Connection was reset by peer GnuTLS 3.3.16 if agent is GnuTLS 3.1.18: Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated. if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1: Get value from agent failed: zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 115 The SRP/PSK username is missing or not known OpenSSL 1.0.2c if agent is GnuTLS 3.1.18 ( FIXME after (68)): Get value from agent failed: TLS handshake returned SSL_ERROR_SYSCALL:: if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1: Get value from agent failed: TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1472: error:1409445B:SSL routines:ssl3_read_bytes:reason(1115): SSL alert number 115: TLS read fatal alert "unknown PSK identity" In case server is configured to connect with PSK to passive agents, but the PSK is longer than the library can handle. Then the error messages are as follows: mbed TLS 1.3.11 Get value from agent failed: ssl_set_psk(): SSL - Bad input parameters to function In case server is configured to accept PSK connections from active agents, but PSK values do not match. The following lines are written into the server log: mbed TLS 1.3.11 13855:20150812:154541.594 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): SSL - Verification of the message MAC failed GnuTLS 3.3.16 13743:20150812:154021.740 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -24 Decryption has failed. OpenSSL 1.0.2c 12930:20150812:151922.173 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 532: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac" In case server is configured to accept PSK connections from active agents, but PSK identities do not match. The following lines are written into the server log: mbed TLS 1.3.11 13456:20150812:153410.263 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): SSL - Unknown identity received (eg, PSK identity) GnuTLS 3.3.16 13571:20150812:153653.750 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -31 Error in password file. OpenSSL 1.0.2c 12930:20150812:152427.477 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 2767: error:1408B0DF:SSL routines:ssl3_get_client_key_exchange:psk identity not found: TLS write fatal alert "unknown PSK identity" In case a server sends a PSK identity longer than an agent can handle: mbed TLS 1.3.11 if agent is PolarSSL 1.3.9: N/A if agent is GnuTLS 3.1.18: Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30003]: ssl_handshake(): SSL - The connection indicated an EOF if agent is OpenSSL 1.0.1: Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30004]: ssl_handshake(): NET - Connection was reset by peer GnuTLS 3.3.16 N/A OpenSSL 1.0.2c N/A In case an agent sends a PSK identity longer than a server can handle: mbed TLS 1.3.11 N/A GnuTLS 3.3.16 ( FIXME should we write the first line?) 27277:20150828:093312.236 zbx_tls_accept(): gnutls_handshake() returned: -90 The SRP username supplied is illegal. 27277:20150828:093312.236 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -90 The SRP username supplied is illegal. OpenSSL 1.0.2c failed to accept an incoming connection: from 127.0.0.1: TLS connection from 127.0.0.1 has been closed during handshake: file s3_pkt.c line 1472: error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure: SSL alert number 40: TLS read fatal alert "handshake failure" Certificates In case a server connects to an agent with a certificate, but agent certificate is not trusted: mbed TLS 1.3.11 Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30002]: invalid peer certificate: self-signed or not signed by trusted CA GnuTLS 3.3.16 if agent is GnuTLS 3.1.18: Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30003]: invalid peer certificate: The certificate is NOT trusted. The certificate issuer is unknown. if agent is PolarSSL 1.3.9 or OpenSSL 1.0.1: Get value from agent failed: TCP connection successful, cannot establish TLS to [[127.0.0.1]:30006]: zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 46 Unknown certificate OpenSSL 1.0.2c ... Andris Mednis Nice collection! Parts of it can be included in documentation upon need. CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (70) [D] This subissue is meant to collect different error messages a user can see in the agent log file.

          PSK

          • In case server's and agent's PSK values do not match (passive checks):
            • PolarSSL 1.3.9 (FIXME?: hidden under DebugLevel=4)
                8405:20150811:172752.447 End of zbx_tls_accept():FAIL error:'ssl_handshake(): SSL - Verification of the message MAC failed'
              
            • GnuTLS 3.1.18
               28377:20150810:171211.722 GnuTLS audit: "Discarded message[0] due to invalid decryption"
               28377:20150810:171211.722 zbx_tls_accept(): gnutls_handshake() returned: -24 Decryption has failed.   
              
            • OpenSSL 1.0.1 (FIXME?: this is DebugLevel=4 only)
               28386:20150810:170730.950 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac"
              
          • In case server's and agent's PSK identities do not match (passive checks):
            • PolarSSL 1.3.9 (FIXME?: this is DebugLevel=4 only)
                8098:20150811:170223.856 End of zbx_tls_accept():FAIL error:'ssl_handshake(): SSL - Unknown identity received (eg, PSK identity)'
              
            • GnuTLS 3.1.18
                2297:20150811:104208.794 zbx_tls_accept(): gnutls_handshake() returned: -31 Error in password file.
              
            • OpenSSL 1.0.1 (FIXME?: this is DebugLevel=4 only)
                8120:20150811:170228.861 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 2717: error:1408B0DF:SSL routines:SSL3_GET_CLIENT_KEY_EXCHANGE:psk identity not found: TLS write fatal alert "unknown PSK identity"
              
          • In case server's and agent's PSK values do not match (active checks):
            • PolarSSL 1.3.9
               12973:20150812:151312.292 active check data upload to [127.0.0.1:10001] started to fail ([connect] ssl_handshake(): SSL - A fatal alert message was received from our peer)
               12973:20150812:151407.572 active check configuration update from [127.0.0.1:10001] started to fail (ssl_handshake(): SSL - A fatal alert message was received from our peer)
              
            • GnuTLS 3.1.18
               12984:20150812:151312.293 active check data upload to [127.0.0.1:10001] started to fail ([connect] zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 20 Bad record MAC)
               12984:20150812:151407.573 active check configuration update from [127.0.0.1:10001] started to fail (zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 20 Bad record MAC)
              
            • OpenSSL 1.0.1
               13001:20150812:151312.292 active check data upload to [127.0.0.1:10001] started to fail ([connect] TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1240: error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac: SSL alert number 20: TLS read fatal alert "bad record mac")
               13001:20150812:151407.572 active check configuration update from [127.0.0.1:10001] started to fail (TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1240: error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac: SSL alert number 20: TLS read fatal alert "bad record mac")
              
          • In case server's and agent's PSK identities do not match (active checks):
            • PolarSSL 1.3.9
               12973:20150812:145307.695 active check configuration update from [127.0.0.1:10001] started to fail (ssl_handshake(): SSL - A fatal alert message was received from our peer)
               12973:20150812:145307.696 active check data upload to [127.0.0.1:10001] started to fail ([connect] ssl_handshake(): SSL - A fatal alert message was received from our peer)
              
            • GnuTLS 3.1.18
               12984:20150812:145207.573 active check data upload to [127.0.0.1:10001] started to fail ([connect] zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 115 The SRP/PSK username is missing or not known)
               12984:20150812:145307.713 active check configuration update from [127.0.0.1:10001] started to fail (zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 115 The SRP/PSK username is missing or not known)
              
            • OpenSSL 1.0.1
               13001:20150812:145307.695 active check configuration update from [127.0.0.1:10001] started to fail (TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1240: error:1409445B:SSL routines:SSL3_READ_BYTES:reason(1115): SSL alert number 115: TLS read fatal alert "unknown PSK identity")
               13001:20150812:145307.696 active check data upload to [127.0.0.1:10001] started to fail ([connect] TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1240: error:1409445B:SSL routines:SSL3_READ_BYTES:reason(1115): SSL alert number 115: TLS read fatal alert "unknown PSK identity")
              

          Certificates

          • In case the server tries to connect with a PSK, but agent only accepts certificates:
            • PolarSSL 1.3.9
               15549:20150901:194207.217 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): SSL - The server has no ciphersuites in common with the client
              
            • GnuTLS 3.1.18
               18345:20150902:134905.695 zbx_tls_accept(): gnutls_handshake() returned: -21 Could not negotiate a supported cipher suite.
               18345:20150902:134905.695 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -21 Could not negotiate a supported cipher suite.
              
            • OpenSSL 1.0.1
               18356:20150902:134928.703 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 1353: error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher: TLS write fatal alert "handshake failure"
              
          • In case the server tries to connect with a certificate, but there is something wrong with agent certificate (maybe unsupported nameConstraints?):
            • PolarSSL 1.3.9 (FIXME: can we do something about unknown IP?)
               15549:20150901:194752.279 Cannot get socket IP address: [107] Transport endpoint is not connected
               15549:20150901:194752.279 failed to accept an incoming connection: from unknown IP: ssl_handshake(): NET - Connection was reset by peer
              
            • GnuTLS 3.1.18 (FIXME: does not write anything)
              
              
            • OpenSSL 1.0.1
               18370:20150902:135753.955 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 3291: error:140890B2:SSL routines:SSL3_GET_CLIENT_CERTIFICATE:no certificate returned: TLS write fatal alert "certificate unknown"
              
          • In case the server tries to connect with a certificate, but agent certificate is not issued by a trusted CA:
            • PolarSSL 1.3.9
               26982:20150930:171222.748 Cannot get socket IP address: [107] Transport endpoint is not connected
               26982:20150930:171222.748 failed to accept an incoming connection: from unknown IP: ssl_handshake(): NET - Connection was reset by peer
              
            • GnuTLS 3.1.18
               26673:20150930:170240.218 zbx_tls_accept(): gnutls_handshake() returned: -49 No certificate was found.
               26673:20150930:170240.218 Cannot get socket IP address: [107] Transport endpoint is not connected
               26673:20150930:170240.218 failed to accept an incoming connection: from unknown IP: zbx_tls_accept(): gnutls_handshake() failed: -49 No certificate was found.
              
            • OpenSSL 1.0.1
               26684:20150930:170730.281 Cannot get socket IP address: [107] Transport endpoint is not connected
               26684:20150930:170730.281 failed to accept an incoming connection: from unknown IP: TLS connection from 127.0.0.1 has been closed during handshake: file s3_pkt.c line 1240: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48: TLS read fatal alert "unknown CA"
              
          • In case the server tries to connect to the agent, but agent certificate issuer is incorrect:
            • PolarSSL 1.3.9 (FIXME? DebugLevel=4 only)
               20267:20150930:150008.275 Process listener error: ssl_read() failed: SSL - The peer notified us that the connection is going to be closed 
              
            • GnuTLS 3.1.18 (the first two messages are DebugLevel=4, the third - DebugLevel=3; FIXME user only sees the log file spammed with the third message)
               20278:20150930:150305.540 Requested []
               20278:20150930:150305.540 Got signal [signal:13(SIGPIPE),sender_pid:20278]. Ignoring ...
               20278:20150930:150305.540 gnutls_bye() returned: -53 Error in the push function.
              
            • OpenSSL 1.0.1 (FIXME? DebugLevel=4 only)
               20317:20150930:150456.714 Process listener error: connection closed during read
              

          Aleksandrs Saveljevs Let's add a few error messages from the proxy log, too.

          Certificates

          • In case the server connects to a proxy, but a proxy has a CA certificate:
            • PolarSSL 1.3.9
               28787:20150930:175805.107 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): SSL - None of the common ciphersuites is usable (eg, no suitable certificate, see debug messages)
              
            • GnuTLS 3.1.18
               28807:20151001:091405.190 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid.
               28807:20151001:091405.190 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -50 The request is invalid. 
              
            • OpenSSL 1.0.1
               28827:20150930:175801.084 Cannot get socket IP address: [107] Transport endpoint is not connected
               28827:20150930:175801.084 Cannot get socket IP address: [107] Transport endpoint is not connected
               28827:20150930:175801.084 failed to accept an incoming connection: from unknown IP: TLS handshake with unknown IP returned error code 5:: 
              

          Aleksandrs Saveljevs Let's add a few error messages from the server log as well.

          Certificates

          • In case the server connects to an agent, but the agent does not trust the server certificate due to subject mismatch:
            • mbed TSL 1.3.11 (does not log anything)
              
              
            • GnuTLS 3.3.16 (FIXME note that there is no identifying information)
                 896:20151001:105756.912 gnutls_bye() returned: -53 Error in the push function.
              
            • OpenSSL 1.0.2c (FIXME note that there is no identifying information)
               31720:20151001:103843.452 Cannot get socket IP address: [107] Transport endpoint is not connected
               31720:20151001:103843.452 TLS shutdown with unknown IP returned error code 5: TLS read warning alert "close notify"
              

          Andris Mednis Thanks for collection! Parts of it can be included in documentation upon need. CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (70) [D] This subissue is meant to collect different error messages a user can see in the agent log file. PSK In case server's and agent's PSK values do not match (passive checks): PolarSSL 1.3.9 ( FIXME ?: hidden under DebugLevel=4) 8405:20150811:172752.447 End of zbx_tls_accept():FAIL error:'ssl_handshake(): SSL - Verification of the message MAC failed' GnuTLS 3.1.18 28377:20150810:171211.722 GnuTLS audit: "Discarded message[0] due to invalid decryption" 28377:20150810:171211.722 zbx_tls_accept(): gnutls_handshake() returned: -24 Decryption has failed. OpenSSL 1.0.1 ( FIXME ?: this is DebugLevel=4 only) 28386:20150810:170730.950 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac" In case server's and agent's PSK identities do not match (passive checks): PolarSSL 1.3.9 ( FIXME ?: this is DebugLevel=4 only) 8098:20150811:170223.856 End of zbx_tls_accept():FAIL error:'ssl_handshake(): SSL - Unknown identity received (eg, PSK identity)' GnuTLS 3.1.18 2297:20150811:104208.794 zbx_tls_accept(): gnutls_handshake() returned: -31 Error in password file. OpenSSL 1.0.1 ( FIXME ?: this is DebugLevel=4 only) 8120:20150811:170228.861 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 2717: error:1408B0DF:SSL routines:SSL3_GET_CLIENT_KEY_EXCHANGE:psk identity not found: TLS write fatal alert "unknown PSK identity" In case server's and agent's PSK values do not match (active checks): PolarSSL 1.3.9 12973:20150812:151312.292 active check data upload to [127.0.0.1:10001] started to fail ([connect] ssl_handshake(): SSL - A fatal alert message was received from our peer) 12973:20150812:151407.572 active check configuration update from [127.0.0.1:10001] started to fail (ssl_handshake(): SSL - A fatal alert message was received from our peer) GnuTLS 3.1.18 12984:20150812:151312.293 active check data upload to [127.0.0.1:10001] started to fail ([connect] zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 20 Bad record MAC) 12984:20150812:151407.573 active check configuration update from [127.0.0.1:10001] started to fail (zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 20 Bad record MAC) OpenSSL 1.0.1 13001:20150812:151312.292 active check data upload to [127.0.0.1:10001] started to fail ([connect] TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1240: error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac: SSL alert number 20: TLS read fatal alert "bad record mac") 13001:20150812:151407.572 active check configuration update from [127.0.0.1:10001] started to fail (TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1240: error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac: SSL alert number 20: TLS read fatal alert "bad record mac") In case server's and agent's PSK identities do not match (active checks): PolarSSL 1.3.9 12973:20150812:145307.695 active check configuration update from [127.0.0.1:10001] started to fail (ssl_handshake(): SSL - A fatal alert message was received from our peer) 12973:20150812:145307.696 active check data upload to [127.0.0.1:10001] started to fail ([connect] ssl_handshake(): SSL - A fatal alert message was received from our peer) GnuTLS 3.1.18 12984:20150812:145207.573 active check data upload to [127.0.0.1:10001] started to fail ([connect] zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 115 The SRP/PSK username is missing or not known) 12984:20150812:145307.713 active check configuration update from [127.0.0.1:10001] started to fail (zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 115 The SRP/PSK username is missing or not known) OpenSSL 1.0.1 13001:20150812:145307.695 active check configuration update from [127.0.0.1:10001] started to fail (TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1240: error:1409445B:SSL routines:SSL3_READ_BYTES:reason(1115): SSL alert number 115: TLS read fatal alert "unknown PSK identity") 13001:20150812:145307.696 active check data upload to [127.0.0.1:10001] started to fail ([connect] TLS handshake returned SSL_ERROR_SSL: file s3_pkt.c line 1240: error:1409445B:SSL routines:SSL3_READ_BYTES:reason(1115): SSL alert number 115: TLS read fatal alert "unknown PSK identity") Certificates In case the server tries to connect with a PSK, but agent only accepts certificates: PolarSSL 1.3.9 15549:20150901:194207.217 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): SSL - The server has no ciphersuites in common with the client GnuTLS 3.1.18 18345:20150902:134905.695 zbx_tls_accept(): gnutls_handshake() returned: -21 Could not negotiate a supported cipher suite. 18345:20150902:134905.695 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -21 Could not negotiate a supported cipher suite. OpenSSL 1.0.1 18356:20150902:134928.703 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 1353: error:1408A0C1:SSL routines:SSL3_GET_CLIENT_HELLO:no shared cipher: TLS write fatal alert "handshake failure" In case the server tries to connect with a certificate, but there is something wrong with agent certificate (maybe unsupported nameConstraints?): PolarSSL 1.3.9 ( FIXME : can we do something about unknown IP?) 15549:20150901:194752.279 Cannot get socket IP address: [107] Transport endpoint is not connected 15549:20150901:194752.279 failed to accept an incoming connection: from unknown IP: ssl_handshake(): NET - Connection was reset by peer GnuTLS 3.1.18 ( FIXME : does not write anything) OpenSSL 1.0.1 18370:20150902:135753.955 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 3291: error:140890B2:SSL routines:SSL3_GET_CLIENT_CERTIFICATE:no certificate returned: TLS write fatal alert "certificate unknown" In case the server tries to connect with a certificate, but agent certificate is not issued by a trusted CA: PolarSSL 1.3.9 26982:20150930:171222.748 Cannot get socket IP address: [107] Transport endpoint is not connected 26982:20150930:171222.748 failed to accept an incoming connection: from unknown IP: ssl_handshake(): NET - Connection was reset by peer GnuTLS 3.1.18 26673:20150930:170240.218 zbx_tls_accept(): gnutls_handshake() returned: -49 No certificate was found. 26673:20150930:170240.218 Cannot get socket IP address: [107] Transport endpoint is not connected 26673:20150930:170240.218 failed to accept an incoming connection: from unknown IP: zbx_tls_accept(): gnutls_handshake() failed: -49 No certificate was found. OpenSSL 1.0.1 26684:20150930:170730.281 Cannot get socket IP address: [107] Transport endpoint is not connected 26684:20150930:170730.281 failed to accept an incoming connection: from unknown IP: TLS connection from 127.0.0.1 has been closed during handshake: file s3_pkt.c line 1240: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48: TLS read fatal alert "unknown CA" In case the server tries to connect to the agent, but agent certificate issuer is incorrect: PolarSSL 1.3.9 ( FIXME ? DebugLevel=4 only) 20267:20150930:150008.275 Process listener error: ssl_read() failed: SSL - The peer notified us that the connection is going to be closed GnuTLS 3.1.18 (the first two messages are DebugLevel=4, the third - DebugLevel=3; FIXME user only sees the log file spammed with the third message) 20278:20150930:150305.540 Requested [] 20278:20150930:150305.540 Got signal [signal:13(SIGPIPE),sender_pid:20278]. Ignoring ... 20278:20150930:150305.540 gnutls_bye() returned: -53 Error in the push function. OpenSSL 1.0.1 ( FIXME ? DebugLevel=4 only) 20317:20150930:150456.714 Process listener error: connection closed during read Aleksandrs Saveljevs Let's add a few error messages from the proxy log, too. Certificates In case the server connects to a proxy, but a proxy has a CA certificate: PolarSSL 1.3.9 28787:20150930:175805.107 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): SSL - None of the common ciphersuites is usable (eg, no suitable certificate, see debug messages) GnuTLS 3.1.18 28807:20151001:091405.190 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid. 28807:20151001:091405.190 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -50 The request is invalid. OpenSSL 1.0.1 28827:20150930:175801.084 Cannot get socket IP address: [107] Transport endpoint is not connected 28827:20150930:175801.084 Cannot get socket IP address: [107] Transport endpoint is not connected 28827:20150930:175801.084 failed to accept an incoming connection: from unknown IP: TLS handshake with unknown IP returned error code 5:: Aleksandrs Saveljevs Let's add a few error messages from the server log as well. Certificates In case the server connects to an agent, but the agent does not trust the server certificate due to subject mismatch: mbed TSL 1.3.11 (does not log anything) GnuTLS 3.3.16 ( FIXME note that there is no identifying information) 896:20151001:105756.912 gnutls_bye() returned: -53 Error in the push function. OpenSSL 1.0.2c ( FIXME note that there is no identifying information) 31720:20151001:103843.452 Cannot get socket IP address: [107] Transport endpoint is not connected 31720:20151001:103843.452 TLS shutdown with unknown IP returned error code 5: TLS read warning alert "close notify" Andris Mednis Thanks for collection! Parts of it can be included in documentation upon need. CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (71) In case a server with GnuTLS is configured to connect to an agent with PSK, but the agent only accepts unencrypted connections, then the server spams the log file with the following errors for each connection:

           26751:20150810:163128.098 zbx_tls_connect(): gnutls_handshake() returned: -110 The TLS connection was non-properly terminated.
          

          This spamming does not occur for mbed TLS or OpenSSL flavors of Zabbix server.

          Aleksandrs Saveljevs Similar behavior is observed when an active proxy tries to connect to a server with a bad PSK identity.

          Andris Mednis RESOLVED in r55098.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (71) In case a server with GnuTLS is configured to connect to an agent with PSK, but the agent only accepts unencrypted connections, then the server spams the log file with the following errors for each connection: 26751:20150810:163128.098 zbx_tls_connect(): gnutls_handshake() returned: -110 The TLS connection was non-properly terminated. This spamming does not occur for mbed TLS or OpenSSL flavors of Zabbix server. Aleksandrs Saveljevs Similar behavior is observed when an active proxy tries to connect to a server with a bad PSK identity. Andris Mednis RESOLVED in r55098. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (72) If server's and agent's PSK values do not match, then agent with OpenSSL crashes after a while:

          ...
           28386:20150810:170730.912 zbx_tls_accept() ciphersuites: PSK-AES128-CBC-SHA
           28386:20150810:170730.913 zbx_psk_server_cb(): requested PSK identity: "agent-openssl-1.0.1"
           28386:20150810:170730.950 End of zbx_tls_accept():FAIL error:'TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac"'
           28386:20150810:170730.950 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac"
          ...
           28386:20150810:171230.788 In zbx_tls_accept()
           28386:20150810:171230.788 zbx_tls_accept() ciphersuites: PSK-AES128-CBC-SHA
           28386:20150810:171230.788 zbx_psk_server_cb(): requested PSK identity: "agent-openssl-1.0.1"
           28386:20150810:171230.826 End of zbx_tls_accept():FAIL error:'TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac"'
           28386:20150810:171230.826 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac"
          ...
           28386:20150810:171330.965 In zbx_tls_accept()
           28386:20150810:171330.965 zbx_tls_accept() ciphersuites: PSK-AES128-CBC-SHA
           28386:20150810:171330.965 zbx_psk_server_cb(): requested PSK identity: "agent-openssl-1.0.1"
           28386:20150810:171331.003 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x19bb000]. Crashing ...
           28386:20150810:171331.003 ====== Fatal information: ======
           28386:20150810:171331.003 Program counter: 0x7efbf8cee4bd
           28386:20150810:171331.003 === Registers: ===
           28386:20150810:171331.003 r8      =          19a6944 =             26896708 =             26896708
           28386:20150810:171331.003 r9      =          19bafeb =             26980331 =             26980331
           28386:20150810:171331.003 r10     =          199ad6b =             26848619 =             26848619
           28386:20150810:171331.003 r11     =     7efbf8cee780 =      139620676200320 =      139620676200320
           28386:20150810:171331.003 r12     =               3d =                   61 =                   61
           28386:20150810:171331.003 r13     = ffffffffffffff80 = 18446744073709551488 =                 -128
           28386:20150810:171331.003 r14     =          199adeb =             26848747 =             26848747
           28386:20150810:171331.003 r15     =          19a6944 =             26896708 =             26896708
           28386:20150810:171331.003 rdi     =         7ff8ced9 =           2147012313 =           2147012313
           28386:20150810:171331.003 rsi     =         6dd77c83 =           1842838659 =           1842838659
           28386:20150810:171331.003 rbp     =         2f47e970 =            793241968 =            793241968
           28386:20150810:171331.003 rbx     =         20d94675 =            551110261 =            551110261
           28386:20150810:171331.003 rdx     =         baef906d =           3136262253 =           3136262253
           28386:20150810:171331.003 rax     =         e23e2d6c =           3795725676 =           3795725676
           28386:20150810:171331.003 rcx     =         b4154531 =           3021292849 =           3021292849
           28386:20150810:171331.003 rsp     =     7ffe96c11aa0 =      140731427658400 =      140731427658400
           28386:20150810:171331.003 rip     =     7efbf8cee4bd =      139620676199613 =      139620676199613
           28386:20150810:171331.003 efl     =            10202 =                66050 =                66050
           28386:20150810:171331.003 csgsfs  =               33 =                   51 =                   51
           28386:20150810:171331.003 err     =                4 =                    4 =                    4
           28386:20150810:171331.003 trapno  =                e =                   14 =                   14
           28386:20150810:171331.003 oldmask =                0 =                    0 =                    0
           28386:20150810:171331.003 cr2     =          19bb000 =             26980352 =             26980352
           28386:20150810:171331.003 === Backtrace: ===
           28386:20150810:171331.003 3: sbin/zabbix_agentd: listener #1 [waiting for connection](print_fatal_info+0xa5) [0x427c25]
           28386:20150810:171331.003 2: sbin/zabbix_agentd: listener #1 [waiting for connection]() [0x4280c6]
           28386:20150810:171331.003 1: /lib/x86_64-linux-gnu/libc.so.6(+0x35180) [0x7efbf81ea180]
           28386:20150810:171331.003 0: /home/asaveljevs/software/openssl-1.0.1-install/lib/libcrypto.so.1.0.0(+0x744bd) [0x7efbf8cee4bd]
           28386:20150810:171331.003 === Memory map: ===
           28386:20150810:171331.004 00400000-0044d000 r-xp 00000000 08:02 8126861                            /home/asaveljevs/ZBXNEXT-1263/agent-openssl-1.0.1-install/sbin/zabbix_agentd
           28386:20150810:171331.004 0064c000-0064e000 rw-p 0004c000 08:02 8126861                            /home/asaveljevs/ZBXNEXT-1263/agent-openssl-1.0.1-install/sbin/zabbix_agentd
           28386:20150810:171331.004 0064e000-00654000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 01977000-01998000 rw-p 00000000 00:00 0                                  [heap]
           28386:20150810:171331.004 01998000-019bb000 rw-p 00000000 00:00 0                                  [heap]
           28386:20150810:171331.004 7efbf7f9f000-7efbf7fb5000 r-xp 00000000 08:02 6293377                    /lib/x86_64-linux-gnu/libgcc_s.so.1
           28386:20150810:171331.004 7efbf7fb5000-7efbf81b4000 ---p 00016000 08:02 6293377                    /lib/x86_64-linux-gnu/libgcc_s.so.1
           28386:20150810:171331.004 7efbf81b4000-7efbf81b5000 rw-p 00015000 08:02 6293377                    /lib/x86_64-linux-gnu/libgcc_s.so.1
           28386:20150810:171331.004 7efbf81b5000-7efbf8354000 r-xp 00000000 08:02 6298479                    /lib/x86_64-linux-gnu/libc-2.19.so
           28386:20150810:171331.004 7efbf8354000-7efbf8554000 ---p 0019f000 08:02 6298479                    /lib/x86_64-linux-gnu/libc-2.19.so
           28386:20150810:171331.004 7efbf8554000-7efbf8558000 r--p 0019f000 08:02 6298479                    /lib/x86_64-linux-gnu/libc-2.19.so
           28386:20150810:171331.004 7efbf8558000-7efbf855a000 rw-p 001a3000 08:02 6298479                    /lib/x86_64-linux-gnu/libc-2.19.so
           28386:20150810:171331.004 7efbf855a000-7efbf855e000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 7efbf855e000-7efbf8572000 r-xp 00000000 08:02 6295822                    /lib/x86_64-linux-gnu/libresolv-2.19.so
           28386:20150810:171331.004 7efbf8572000-7efbf8771000 ---p 00014000 08:02 6295822                    /lib/x86_64-linux-gnu/libresolv-2.19.so
           28386:20150810:171331.004 7efbf8771000-7efbf8772000 r--p 00013000 08:02 6295822                    /lib/x86_64-linux-gnu/libresolv-2.19.so
           28386:20150810:171331.004 7efbf8772000-7efbf8773000 rw-p 00014000 08:02 6295822                    /lib/x86_64-linux-gnu/libresolv-2.19.so
           28386:20150810:171331.004 7efbf8773000-7efbf8775000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 7efbf8775000-7efbf8778000 r-xp 00000000 08:02 6293626                    /lib/x86_64-linux-gnu/libdl-2.19.so
           28386:20150810:171331.004 7efbf8778000-7efbf8977000 ---p 00003000 08:02 6293626                    /lib/x86_64-linux-gnu/libdl-2.19.so
           28386:20150810:171331.004 7efbf8977000-7efbf8978000 r--p 00002000 08:02 6293626                    /lib/x86_64-linux-gnu/libdl-2.19.so
           28386:20150810:171331.004 7efbf8978000-7efbf8979000 rw-p 00003000 08:02 6293626                    /lib/x86_64-linux-gnu/libdl-2.19.so
           28386:20150810:171331.004 7efbf8979000-7efbf8a79000 r-xp 00000000 08:02 6298477                    /lib/x86_64-linux-gnu/libm-2.19.so
           28386:20150810:171331.004 7efbf8a79000-7efbf8c78000 ---p 00100000 08:02 6298477                    /lib/x86_64-linux-gnu/libm-2.19.so
           28386:20150810:171331.004 7efbf8c78000-7efbf8c79000 r--p 000ff000 08:02 6298477                    /lib/x86_64-linux-gnu/libm-2.19.so
           28386:20150810:171331.004 7efbf8c79000-7efbf8c7a000 rw-p 00100000 08:02 6298477                    /lib/x86_64-linux-gnu/libm-2.19.so
           28386:20150810:171331.004 7efbf8c7a000-7efbf8e1d000 r-xp 00000000 08:02 2503069                    /home/asaveljevs/software/openssl-1.0.1-install/lib/libcrypto.so.1.0.0
           28386:20150810:171331.004 7efbf8e1d000-7efbf901d000 ---p 001a3000 08:02 2503069                    /home/asaveljevs/software/openssl-1.0.1-install/lib/libcrypto.so.1.0.0
           28386:20150810:171331.004 7efbf901d000-7efbf9040000 rw-p 001a3000 08:02 2503069                    /home/asaveljevs/software/openssl-1.0.1-install/lib/libcrypto.so.1.0.0
           28386:20150810:171331.004 7efbf9040000-7efbf9044000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 7efbf9044000-7efbf909f000 r-xp 00000000 08:02 2503072                    /home/asaveljevs/software/openssl-1.0.1-install/lib/libssl.so.1.0.0
           28386:20150810:171331.004 7efbf909f000-7efbf929e000 ---p 0005b000 08:02 2503072                    /home/asaveljevs/software/openssl-1.0.1-install/lib/libssl.so.1.0.0
           28386:20150810:171331.004 7efbf929e000-7efbf92a8000 rw-p 0005a000 08:02 2503072                    /home/asaveljevs/software/openssl-1.0.1-install/lib/libssl.so.1.0.0
           28386:20150810:171331.004 7efbf92a8000-7efbf92c8000 r-xp 00000000 08:02 6293411                    /lib/x86_64-linux-gnu/ld-2.19.so
           28386:20150810:171331.004 7efbf93ff000-7efbf94a0000 rw-s 00000000 00:04 103153730                  /SYSV6c0201a0 (deleted)
           28386:20150810:171331.004 7efbf94a0000-7efbf94a5000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 7efbf94c4000-7efbf94c5000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 7efbf94c5000-7efbf94c6000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 7efbf94c6000-7efbf94c8000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 7efbf94c8000-7efbf94c9000 r--p 00020000 08:02 6293411                    /lib/x86_64-linux-gnu/ld-2.19.so
           28386:20150810:171331.004 7efbf94c9000-7efbf94ca000 rw-p 00021000 08:02 6293411                    /lib/x86_64-linux-gnu/ld-2.19.so
           28386:20150810:171331.004 7efbf94ca000-7efbf94cb000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 7ffe96bf5000-7ffe96c14000 rwxp 00000000 00:00 0                          [stack]
           28386:20150810:171331.004 7ffe96c14000-7ffe96c16000 rw-p 00000000 00:00 0 
           28386:20150810:171331.004 7ffe96d99000-7ffe96d9b000 r-xp 00000000 00:00 0                          [vdso]
           28386:20150810:171331.004 7ffe96d9b000-7ffe96d9d000 r--p 00000000 00:00 0                          [vvar]
           28386:20150810:171331.004 ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
           28386:20150810:171331.005 ================================
           28386:20150810:171331.005 Please consider attaching a disassembly listing to your bug report.
           28386:20150810:171331.005 This listing can be produced with, e.g., objdump -DSswx zabbix_agentd.
           28386:20150810:171331.005 ================================
           28384:20150810:171331.005 One child process died (PID:28386,exitcode/signal:1). Exiting ...
           28384:20150810:171331.005 zbx_on_exit() called
           28388:20150810:171331.005 Got signal [signal:15(SIGTERM),sender_pid:28384,sender_uid:1000,reason:0]. Exiting ...
           28385:20150810:171331.005 Got signal [signal:15(SIGTERM),sender_pid:28384,sender_uid:1000,reason:0]. Exiting ...
          zabbix_agentd [28384]: Error on thread waiting.
           28384:20150810:171331.006 In unload_modules()
           28384:20150810:171331.006 End of unload_modules()
           28384:20150810:171331.006 Zabbix Agent stopped. Zabbix 2.5.0 (revision {ZABBIX_REVISION}).
          

          It crashes if the server uses mbed TLS or GnuTLS. With OpenSSL server, either it does not crash or I have not waited long enough.

          Aleksandrs Saveljevs The issue did not reproduce in Andris' environment. In my environment, only the OpenSSL 1.0.1 agent crashes - OpenSSL 1.0.1e and OpenSSL 1.0.2c work well. The 1.0.1 version crashes in SSL_accept() function and our zbx_psk_server_cb() seems to execute successfully. For this reason, let's postpone investigating this issue further until it turns out to be a practical problem for someone. WON'T FIX.

          Show
          Aleksandrs Saveljevs added a comment - - edited (72) If server's and agent's PSK values do not match, then agent with OpenSSL crashes after a while: ... 28386:20150810:170730.912 zbx_tls_accept() ciphersuites: PSK-AES128-CBC-SHA 28386:20150810:170730.913 zbx_psk_server_cb(): requested PSK identity: "agent-openssl-1.0.1" 28386:20150810:170730.950 End of zbx_tls_accept():FAIL error:'TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac"' 28386:20150810:170730.950 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac" ... 28386:20150810:171230.788 In zbx_tls_accept() 28386:20150810:171230.788 zbx_tls_accept() ciphersuites: PSK-AES128-CBC-SHA 28386:20150810:171230.788 zbx_psk_server_cb(): requested PSK identity: "agent-openssl-1.0.1" 28386:20150810:171230.826 End of zbx_tls_accept():FAIL error:'TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac"' 28386:20150810:171230.826 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_pkt.c line 480: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac: TLS write fatal alert "bad record mac" ... 28386:20150810:171330.965 In zbx_tls_accept() 28386:20150810:171330.965 zbx_tls_accept() ciphersuites: PSK-AES128-CBC-SHA 28386:20150810:171330.965 zbx_psk_server_cb(): requested PSK identity: "agent-openssl-1.0.1" 28386:20150810:171331.003 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x19bb000]. Crashing ... 28386:20150810:171331.003 ====== Fatal information: ====== 28386:20150810:171331.003 Program counter: 0x7efbf8cee4bd 28386:20150810:171331.003 === Registers: === 28386:20150810:171331.003 r8 = 19a6944 = 26896708 = 26896708 28386:20150810:171331.003 r9 = 19bafeb = 26980331 = 26980331 28386:20150810:171331.003 r10 = 199ad6b = 26848619 = 26848619 28386:20150810:171331.003 r11 = 7efbf8cee780 = 139620676200320 = 139620676200320 28386:20150810:171331.003 r12 = 3d = 61 = 61 28386:20150810:171331.003 r13 = ffffffffffffff80 = 18446744073709551488 = -128 28386:20150810:171331.003 r14 = 199adeb = 26848747 = 26848747 28386:20150810:171331.003 r15 = 19a6944 = 26896708 = 26896708 28386:20150810:171331.003 rdi = 7ff8ced9 = 2147012313 = 2147012313 28386:20150810:171331.003 rsi = 6dd77c83 = 1842838659 = 1842838659 28386:20150810:171331.003 rbp = 2f47e970 = 793241968 = 793241968 28386:20150810:171331.003 rbx = 20d94675 = 551110261 = 551110261 28386:20150810:171331.003 rdx = baef906d = 3136262253 = 3136262253 28386:20150810:171331.003 rax = e23e2d6c = 3795725676 = 3795725676 28386:20150810:171331.003 rcx = b4154531 = 3021292849 = 3021292849 28386:20150810:171331.003 rsp = 7ffe96c11aa0 = 140731427658400 = 140731427658400 28386:20150810:171331.003 rip = 7efbf8cee4bd = 139620676199613 = 139620676199613 28386:20150810:171331.003 efl = 10202 = 66050 = 66050 28386:20150810:171331.003 csgsfs = 33 = 51 = 51 28386:20150810:171331.003 err = 4 = 4 = 4 28386:20150810:171331.003 trapno = e = 14 = 14 28386:20150810:171331.003 oldmask = 0 = 0 = 0 28386:20150810:171331.003 cr2 = 19bb000 = 26980352 = 26980352 28386:20150810:171331.003 === Backtrace: === 28386:20150810:171331.003 3: sbin/zabbix_agentd: listener #1 [waiting for connection](print_fatal_info+0xa5) [0x427c25] 28386:20150810:171331.003 2: sbin/zabbix_agentd: listener #1 [waiting for connection]() [0x4280c6] 28386:20150810:171331.003 1: /lib/x86_64-linux-gnu/libc.so.6(+0x35180) [0x7efbf81ea180] 28386:20150810:171331.003 0: /home/asaveljevs/software/openssl-1.0.1-install/lib/libcrypto.so.1.0.0(+0x744bd) [0x7efbf8cee4bd] 28386:20150810:171331.003 === Memory map: === 28386:20150810:171331.004 00400000-0044d000 r-xp 00000000 08:02 8126861 /home/asaveljevs/ZBXNEXT-1263/agent-openssl-1.0.1-install/sbin/zabbix_agentd 28386:20150810:171331.004 0064c000-0064e000 rw-p 0004c000 08:02 8126861 /home/asaveljevs/ZBXNEXT-1263/agent-openssl-1.0.1-install/sbin/zabbix_agentd 28386:20150810:171331.004 0064e000-00654000 rw-p 00000000 00:00 0 28386:20150810:171331.004 01977000-01998000 rw-p 00000000 00:00 0 [heap] 28386:20150810:171331.004 01998000-019bb000 rw-p 00000000 00:00 0 [heap] 28386:20150810:171331.004 7efbf7f9f000-7efbf7fb5000 r-xp 00000000 08:02 6293377 /lib/x86_64-linux-gnu/libgcc_s.so.1 28386:20150810:171331.004 7efbf7fb5000-7efbf81b4000 ---p 00016000 08:02 6293377 /lib/x86_64-linux-gnu/libgcc_s.so.1 28386:20150810:171331.004 7efbf81b4000-7efbf81b5000 rw-p 00015000 08:02 6293377 /lib/x86_64-linux-gnu/libgcc_s.so.1 28386:20150810:171331.004 7efbf81b5000-7efbf8354000 r-xp 00000000 08:02 6298479 /lib/x86_64-linux-gnu/libc-2.19.so 28386:20150810:171331.004 7efbf8354000-7efbf8554000 ---p 0019f000 08:02 6298479 /lib/x86_64-linux-gnu/libc-2.19.so 28386:20150810:171331.004 7efbf8554000-7efbf8558000 r--p 0019f000 08:02 6298479 /lib/x86_64-linux-gnu/libc-2.19.so 28386:20150810:171331.004 7efbf8558000-7efbf855a000 rw-p 001a3000 08:02 6298479 /lib/x86_64-linux-gnu/libc-2.19.so 28386:20150810:171331.004 7efbf855a000-7efbf855e000 rw-p 00000000 00:00 0 28386:20150810:171331.004 7efbf855e000-7efbf8572000 r-xp 00000000 08:02 6295822 /lib/x86_64-linux-gnu/libresolv-2.19.so 28386:20150810:171331.004 7efbf8572000-7efbf8771000 ---p 00014000 08:02 6295822 /lib/x86_64-linux-gnu/libresolv-2.19.so 28386:20150810:171331.004 7efbf8771000-7efbf8772000 r--p 00013000 08:02 6295822 /lib/x86_64-linux-gnu/libresolv-2.19.so 28386:20150810:171331.004 7efbf8772000-7efbf8773000 rw-p 00014000 08:02 6295822 /lib/x86_64-linux-gnu/libresolv-2.19.so 28386:20150810:171331.004 7efbf8773000-7efbf8775000 rw-p 00000000 00:00 0 28386:20150810:171331.004 7efbf8775000-7efbf8778000 r-xp 00000000 08:02 6293626 /lib/x86_64-linux-gnu/libdl-2.19.so 28386:20150810:171331.004 7efbf8778000-7efbf8977000 ---p 00003000 08:02 6293626 /lib/x86_64-linux-gnu/libdl-2.19.so 28386:20150810:171331.004 7efbf8977000-7efbf8978000 r--p 00002000 08:02 6293626 /lib/x86_64-linux-gnu/libdl-2.19.so 28386:20150810:171331.004 7efbf8978000-7efbf8979000 rw-p 00003000 08:02 6293626 /lib/x86_64-linux-gnu/libdl-2.19.so 28386:20150810:171331.004 7efbf8979000-7efbf8a79000 r-xp 00000000 08:02 6298477 /lib/x86_64-linux-gnu/libm-2.19.so 28386:20150810:171331.004 7efbf8a79000-7efbf8c78000 ---p 00100000 08:02 6298477 /lib/x86_64-linux-gnu/libm-2.19.so 28386:20150810:171331.004 7efbf8c78000-7efbf8c79000 r--p 000ff000 08:02 6298477 /lib/x86_64-linux-gnu/libm-2.19.so 28386:20150810:171331.004 7efbf8c79000-7efbf8c7a000 rw-p 00100000 08:02 6298477 /lib/x86_64-linux-gnu/libm-2.19.so 28386:20150810:171331.004 7efbf8c7a000-7efbf8e1d000 r-xp 00000000 08:02 2503069 /home/asaveljevs/software/openssl-1.0.1-install/lib/libcrypto.so.1.0.0 28386:20150810:171331.004 7efbf8e1d000-7efbf901d000 ---p 001a3000 08:02 2503069 /home/asaveljevs/software/openssl-1.0.1-install/lib/libcrypto.so.1.0.0 28386:20150810:171331.004 7efbf901d000-7efbf9040000 rw-p 001a3000 08:02 2503069 /home/asaveljevs/software/openssl-1.0.1-install/lib/libcrypto.so.1.0.0 28386:20150810:171331.004 7efbf9040000-7efbf9044000 rw-p 00000000 00:00 0 28386:20150810:171331.004 7efbf9044000-7efbf909f000 r-xp 00000000 08:02 2503072 /home/asaveljevs/software/openssl-1.0.1-install/lib/libssl.so.1.0.0 28386:20150810:171331.004 7efbf909f000-7efbf929e000 ---p 0005b000 08:02 2503072 /home/asaveljevs/software/openssl-1.0.1-install/lib/libssl.so.1.0.0 28386:20150810:171331.004 7efbf929e000-7efbf92a8000 rw-p 0005a000 08:02 2503072 /home/asaveljevs/software/openssl-1.0.1-install/lib/libssl.so.1.0.0 28386:20150810:171331.004 7efbf92a8000-7efbf92c8000 r-xp 00000000 08:02 6293411 /lib/x86_64-linux-gnu/ld-2.19.so 28386:20150810:171331.004 7efbf93ff000-7efbf94a0000 rw-s 00000000 00:04 103153730 /SYSV6c0201a0 (deleted) 28386:20150810:171331.004 7efbf94a0000-7efbf94a5000 rw-p 00000000 00:00 0 28386:20150810:171331.004 7efbf94c4000-7efbf94c5000 rw-p 00000000 00:00 0 28386:20150810:171331.004 7efbf94c5000-7efbf94c6000 rw-p 00000000 00:00 0 28386:20150810:171331.004 7efbf94c6000-7efbf94c8000 rw-p 00000000 00:00 0 28386:20150810:171331.004 7efbf94c8000-7efbf94c9000 r--p 00020000 08:02 6293411 /lib/x86_64-linux-gnu/ld-2.19.so 28386:20150810:171331.004 7efbf94c9000-7efbf94ca000 rw-p 00021000 08:02 6293411 /lib/x86_64-linux-gnu/ld-2.19.so 28386:20150810:171331.004 7efbf94ca000-7efbf94cb000 rw-p 00000000 00:00 0 28386:20150810:171331.004 7ffe96bf5000-7ffe96c14000 rwxp 00000000 00:00 0 [stack] 28386:20150810:171331.004 7ffe96c14000-7ffe96c16000 rw-p 00000000 00:00 0 28386:20150810:171331.004 7ffe96d99000-7ffe96d9b000 r-xp 00000000 00:00 0 [vdso] 28386:20150810:171331.004 7ffe96d9b000-7ffe96d9d000 r--p 00000000 00:00 0 [vvar] 28386:20150810:171331.004 ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] 28386:20150810:171331.005 ================================ 28386:20150810:171331.005 Please consider attaching a disassembly listing to your bug report. 28386:20150810:171331.005 This listing can be produced with, e.g., objdump -DSswx zabbix_agentd. 28386:20150810:171331.005 ================================ 28384:20150810:171331.005 One child process died (PID:28386,exitcode/signal:1). Exiting ... 28384:20150810:171331.005 zbx_on_exit() called 28388:20150810:171331.005 Got signal [signal:15(SIGTERM),sender_pid:28384,sender_uid:1000,reason:0]. Exiting ... 28385:20150810:171331.005 Got signal [signal:15(SIGTERM),sender_pid:28384,sender_uid:1000,reason:0]. Exiting ... zabbix_agentd [28384]: Error on thread waiting. 28384:20150810:171331.006 In unload_modules() 28384:20150810:171331.006 End of unload_modules() 28384:20150810:171331.006 Zabbix Agent stopped. Zabbix 2.5.0 (revision {ZABBIX_REVISION}). It crashes if the server uses mbed TLS or GnuTLS. With OpenSSL server, either it does not crash or I have not waited long enough. Aleksandrs Saveljevs The issue did not reproduce in Andris' environment. In my environment, only the OpenSSL 1.0.1 agent crashes - OpenSSL 1.0.1e and OpenSSL 1.0.2c work well. The 1.0.1 version crashes in SSL_accept() function and our zbx_psk_server_cb() seems to execute successfully. For this reason, let's postpone investigating this issue further until it turns out to be a practical problem for someone. WON'T FIX.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (73) If an agent's PSK value is longer than it can support, it does not complain under DebugLevel=3 (at least with mbed TLS). The following is only written under DebugLevel=4:

           11669:20150812:130437.513 End of zbx_tls_accept():FAIL error:'ssl_set_psk(): SSL - Bad input parameters to function'
          

          Aleksandrs Saveljevs I wonder whether we can validate PSK value at startup.

          Andris Mednis PSK value validation at startup could be done in agentd, zabbix_get, and zabbix_sender. In server and proxy PSKs come from database and can change at runtime.

          Andris Mednis RESOLVED in r54889.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (73) If an agent's PSK value is longer than it can support, it does not complain under DebugLevel=3 (at least with mbed TLS). The following is only written under DebugLevel=4: 11669:20150812:130437.513 End of zbx_tls_accept():FAIL error:'ssl_set_psk(): SSL - Bad input parameters to function' Aleksandrs Saveljevs I wonder whether we can validate PSK value at startup. Andris Mednis PSK value validation at startup could be done in agentd, zabbix_get, and zabbix_sender. In server and proxy PSKs come from database and can change at runtime. Andris Mednis RESOLVED in r54889. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (74) With mbed TLS version of zabbix_get, specify a 20000 character string as the PSK identity. Observe the following error:

          *** Error in `bin/zabbix_get': free(): invalid next size (normal): 0x0000000001c9ce80 ***
          

          Andris Mednis While the bug seems to be located in mbed TLS library, the check was added to restrict PSK identity size to 512 bytes. RESOLVED in r54933.

          Aleksandrs Saveljevs mbed TLS bug was reported at https://github.com/ARMmbed/mbedtls/issues/238 .

          Aleksandrs Saveljevs The following suggestions are proposed for zbx_check_psk_identity_size() function:

          • the name refers to "size", while in other places we refer to PSK identity length ("len");
          • it seems to be misplaced between zbx_tls_library_init() and zbx_tls_library_deinit() - may be better to move it either to PSK-related functions or closer to where it is used.

          REOPENED

          Andris Mednis RESOLVED in r55018.

          Aleksandrs Saveljevs The error message still referred to "size" instead of "length". Fixed that in r55198 (to be reviewed under (16)). CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (74) With mbed TLS version of zabbix_get, specify a 20000 character string as the PSK identity. Observe the following error: *** Error in `bin/zabbix_get': free(): invalid next size (normal): 0x0000000001c9ce80 *** Andris Mednis While the bug seems to be located in mbed TLS library, the check was added to restrict PSK identity size to 512 bytes. RESOLVED in r54933. Aleksandrs Saveljevs mbed TLS bug was reported at https://github.com/ARMmbed/mbedtls/issues/238 . Aleksandrs Saveljevs The following suggestions are proposed for zbx_check_psk_identity_size() function: the name refers to "size", while in other places we refer to PSK identity length ("len"); it seems to be misplaced between zbx_tls_library_init() and zbx_tls_library_deinit() - may be better to move it either to PSK-related functions or closer to where it is used. REOPENED Andris Mednis RESOLVED in r55018. Aleksandrs Saveljevs The error message still referred to "size" instead of "length". Fixed that in r55198 (to be reviewed under (16)). CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (75) Documentation at https://www.zabbix.com/documentation/3.0/manual/encryption?&#using_pre-shared_keys_psk states that 128 bytes is the limit for PSK identity for mbed TLS. However, mbed TLS ssl_set_psk() function accepts longer PSK identities just fine (e.g., with zabbix_get).

          Aleksandrs Saveljevs Just FYI: trying to use a long PSK identity for OpenSSL zabbix_get, we get this:

          using zabbix_get with openssl-1.0.2c... zabbix_get [11969]: Warning: requested PSK identity "..." does not fit into 129-byte buffer
          using zabbix_get with openssl-1.0.1e... zabbix_get [11985]: Warning: requested PSK identity "..." does not fit into 128-byte buffer
          using zabbix_get with openssl-1.0.1... zabbix_get [12000]: Warning: requested PSK identity "..." does not fit into 128-byte buffer
          

          Note how 1.0.2c provides a 129-byte buffer, while earlier versions provide 128-byte buffers.

          Aleksandrs Saveljevs Documentation also states that for OpenSSL the limit is 128 UTF-8 characters, but according to the above, it is 128 bytes.

          Andris Mednis RESOLVED in https://www.zabbix.com/documentation/3.0/manual/encryption?&#using_pre-shared_keys_psk (parameters corrected and summarized into table for better readability).

          Aleksandrs Saveljevs Looks good. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (75) Documentation at https://www.zabbix.com/documentation/3.0/manual/encryption?&#using_pre-shared_keys_psk states that 128 bytes is the limit for PSK identity for mbed TLS. However, mbed TLS ssl_set_psk() function accepts longer PSK identities just fine (e.g., with zabbix_get). Aleksandrs Saveljevs Just FYI: trying to use a long PSK identity for OpenSSL zabbix_get, we get this: using zabbix_get with openssl-1.0.2c... zabbix_get [11969]: Warning: requested PSK identity "..." does not fit into 129-byte buffer using zabbix_get with openssl-1.0.1e... zabbix_get [11985]: Warning: requested PSK identity "..." does not fit into 128-byte buffer using zabbix_get with openssl-1.0.1... zabbix_get [12000]: Warning: requested PSK identity "..." does not fit into 128-byte buffer Note how 1.0.2c provides a 129-byte buffer, while earlier versions provide 128-byte buffers. Aleksandrs Saveljevs Documentation also states that for OpenSSL the limit is 128 UTF-8 characters, but according to the above, it is 128 bytes. Andris Mednis RESOLVED in https://www.zabbix.com/documentation/3.0/manual/encryption?&#using_pre-shared_keys_psk (parameters corrected and summarized into table for better readability). Aleksandrs Saveljevs Looks good. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (76) [F] There should be "Encryption" column for proxies, similar to hosts.

          Oleg Egorov RESOLVED IN r55100, r55129

          Aleksandrs Saveljevs CLOSED

          Ivo Kurzemnieks Host list: rename header of encryption to "Encryption (In/Out)" and add a third middle column that will contain separator "/".

          Proxies list: Move "Encryption" column after "Mode" column.

          Discussed with Oleg Egorov, Oleg Ivanivskyi, richlv, Pavel Amosov, Alexei Vladishev, Oleksiy Zagorskyi, dimir, Andris Mednis, Alexey Pustovalov.

          REOPENED

          Oleg Egorov RESOLVED in r56181

          Ivo Kurzemnieks Now host list encryption icons have rounded corners only on the left side, but proxy list has both sides with rounded corners. I heard that this will be redesigned under different branch and all CSS classes will again change, so this cannot be tested properly for now.
          CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (76) [F] There should be "Encryption" column for proxies, similar to hosts. Oleg Egorov RESOLVED IN r55100, r55129 Aleksandrs Saveljevs CLOSED Ivo Kurzemnieks Host list: rename header of encryption to "Encryption (In/Out)" and add a third middle column that will contain separator "/". Proxies list: Move "Encryption" column after "Mode" column. Discussed with Oleg Egorov , Oleg Ivanivskyi , richlv , Pavel Amosov , Alexei Vladishev , Oleksiy Zagorskyi , dimir , Andris Mednis , Alexey Pustovalov . REOPENED Oleg Egorov RESOLVED in r56181 Ivo Kurzemnieks Now host list encryption icons have rounded corners only on the left side, but proxy list has both sides with rounded corners. I heard that this will be redesigned under different branch and all CSS classes will again change, so this cannot be tested properly for now. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (77) Suppose a GnuTLS server is configured to connect to proxy with PSK, but proxy only accepts unencrypted connections. Server log file is then heavily spammed with the following line:

           20629:20150813:131101.709 gnutls_bye() returned: -53 Error in the push function.
          

          In comparison, mbed TLS and OpenSSL servers spam the log with the following lines:

           21100:20150813:131229.241 cannot obtain data from proxy "proxy-gnutls-3.2.21": ssl_read() failed: SSL - The peer notified us that the connection is going to be closed
           20877:20150813:131355.482 cannot obtain data from proxy "proxy-gnutls-3.2.21": connection closed during read
          

          These message seem to be acceptable, especially the OpenSSL one. However, the GnuTLS message should be improved.

          Andris Mednis RESOLVED in r55303.

          Aleksandrs Saveljevs Now, if server receives no data from the proxy, it logs these error messages (with suggestions offered in r55327):

           10825:20150901:160619.481 gnutls_bye() returned: -53 Error in the push function. Check allowed connection types and access rights.
           10825:20150901:160619.481 proxy "proxy-openssl-1.0.1e" at "127.0.0.1" returned no host availability data: check allowed connection types and access rights
          

          The second message is nice and we can keep it that way. Regarding the first message, " Check allowed connection types and access rights." does not seem to add any value to the error message, because it does not say for which host we should check connection types and access rights. This gnutls_bye() message is only useful in connection with the second message. Since these two messages will always appear together, a user will be able to empirically infer that they are connected. The second message always includes the suggestion. For this reason, it is proposed to revert changes in src/libs/zbxcrypto/tls.c in r55303 and remove the hint.

          This will at least make it consistent with other error messages like the following:

           11375:20150901:161319.202 zbx_tls_accept(): gnutls_handshake() returned: -24 Decryption has failed.
           11375:20150901:161319.202 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -24 Decryption has failed.
          

          REOPENED

          Andris Mednis RESOLVED in r55348.

          Aleksandrs Saveljevs r55352 makes it a bit shorter. Still RESOLVED.

          Andris Mednis Thanks! Reviewed, accepted. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (77) Suppose a GnuTLS server is configured to connect to proxy with PSK, but proxy only accepts unencrypted connections. Server log file is then heavily spammed with the following line: 20629:20150813:131101.709 gnutls_bye() returned: -53 Error in the push function. In comparison, mbed TLS and OpenSSL servers spam the log with the following lines: 21100:20150813:131229.241 cannot obtain data from proxy "proxy-gnutls-3.2.21": ssl_read() failed: SSL - The peer notified us that the connection is going to be closed 20877:20150813:131355.482 cannot obtain data from proxy "proxy-gnutls-3.2.21": connection closed during read These message seem to be acceptable, especially the OpenSSL one. However, the GnuTLS message should be improved. Andris Mednis RESOLVED in r55303. Aleksandrs Saveljevs Now, if server receives no data from the proxy, it logs these error messages (with suggestions offered in r55327): 10825:20150901:160619.481 gnutls_bye() returned: -53 Error in the push function. Check allowed connection types and access rights. 10825:20150901:160619.481 proxy "proxy-openssl-1.0.1e" at "127.0.0.1" returned no host availability data: check allowed connection types and access rights The second message is nice and we can keep it that way. Regarding the first message, " Check allowed connection types and access rights." does not seem to add any value to the error message, because it does not say for which host we should check connection types and access rights. This gnutls_bye() message is only useful in connection with the second message. Since these two messages will always appear together, a user will be able to empirically infer that they are connected. The second message always includes the suggestion. For this reason, it is proposed to revert changes in src/libs/zbxcrypto/tls.c in r55303 and remove the hint. This will at least make it consistent with other error messages like the following: 11375:20150901:161319.202 zbx_tls_accept(): gnutls_handshake() returned: -24 Decryption has failed. 11375:20150901:161319.202 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -24 Decryption has failed. REOPENED Andris Mednis RESOLVED in r55348. Aleksandrs Saveljevs r55352 makes it a bit shorter. Still RESOLVED. Andris Mednis Thanks! Reviewed, accepted. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (78) [S] This is a continuation of (64) for the server side.

          For instance, suppose we have a proxy configured in unencrypted passive mode in the frontend, but it still has PSK for active mode that we configured previously. These PSK values should not be loaded into the server's configuration cache.

          Similarly, if a host is being monitored by a proxy, the server should not load its encryption data.

          Andris Mednis There is a related ZBX-10016 .
          WON'T FIX.

          Aleksandrs Saveljevs Just to mention the rationale for closing as "Won't fix": if we have both certificates and PSK configured for a host and only certificates are accepted, then if a host is connecting using PSK we wish to be able to decrypt the connection and issue a user-friendly error message like "please use certificates, not PSK", rather than the general "cannot decrypt your message".

          Show
          Aleksandrs Saveljevs added a comment - - edited (78) [S] This is a continuation of (64) for the server side. For instance, suppose we have a proxy configured in unencrypted passive mode in the frontend, but it still has PSK for active mode that we configured previously. These PSK values should not be loaded into the server's configuration cache. Similarly, if a host is being monitored by a proxy, the server should not load its encryption data. Andris Mednis There is a related ZBX-10016 . WON'T FIX. Aleksandrs Saveljevs Just to mention the rationale for closing as "Won't fix": if we have both certificates and PSK configured for a host and only certificates are accepted, then if a host is connecting using PSK we wish to be able to decrypt the connection and issue a user-friendly error message like "please use certificates, not PSK", rather than the general "cannot decrypt your message".
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (79) [F] Suppose we change host's encryption settings from PSK to unencrypted. Currently, PSK values will remain in the database, but they should instead be cleared.

          Oleg Egorov RESOLVED IN r55300

          Aleksandrs Saveljevs Haven't tested it yet, but r55300 broke something. Suppose we go to "Administration" -> "Proxies", select a proxy, click "Disable hosts". All of the disabled hosts will have their PSK identity and PSK values cleared. Quite unexpected. REOPENED.

          Oleg Egorov RESOLVED IN r55596, r55550

          Ivo Kurzemnieks CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (79) [F] Suppose we change host's encryption settings from PSK to unencrypted. Currently, PSK values will remain in the database, but they should instead be cleared. Oleg Egorov RESOLVED IN r55300 Aleksandrs Saveljevs Haven't tested it yet, but r55300 broke something. Suppose we go to "Administration" -> "Proxies", select a proxy, click "Disable hosts". All of the disabled hosts will have their PSK identity and PSK values cleared. Quite unexpected. REOPENED. Oleg Egorov RESOLVED IN r55596, r55550 Ivo Kurzemnieks CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (80) Function zbx_socket_create() uses "%d" for outputting variable "port", which is of type "unsigned short". It should use "%hu" instead.

          Aleksandrs Saveljevs RESOLVED in r55219.

          Andris Mednis Looks good. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (80) Function zbx_socket_create() uses "%d" for outputting variable "port", which is of type "unsigned short". It should use "%hu" instead. Aleksandrs Saveljevs RESOLVED in r55219. Andris Mednis Looks good. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (81) [F] If we clone a host, its encryption settings are preserved. If we clone a proxy, its encryption settings are reset to no encryption.

          Oleg Egorov RESOLVED IN r55251

          Ivo Kurzemnieks CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (81) [F] If we clone a host, its encryption settings are preserved. If we clone a proxy, its encryption settings are reset to no encryption. Oleg Egorov RESOLVED IN r55251 Ivo Kurzemnieks CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (82) [F] This may or may not be connected to (79). Open a host, do not change anything, click "Update". The host list opens and there is a green message at the top. Expand details and observe the following:

          Only variables should be passed by reference [hosts.php:558 → CFrontendApiWrapper->update() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → call_user_func_array() → CHost->update() → CHost->massUpdate() in include/classes/api/services/CHost.php:951]
          

          Oleg Egorov RESOLVED IN r55596, r55550

          Ivo Kurzemnieks Seems like this error is gone now.
          CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (82) [F] This may or may not be connected to (79). Open a host, do not change anything, click "Update". The host list opens and there is a green message at the top. Expand details and observe the following: Only variables should be passed by reference [hosts.php:558 → CFrontendApiWrapper->update() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → call_user_func_array() → CHost->update() → CHost->massUpdate() in include/classes/api/services/CHost.php:951] Oleg Egorov RESOLVED IN r55596, r55550 Ivo Kurzemnieks Seems like this error is gone now. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (83) [S] Suppose we have the following PKI infrastructure: Root CA -> Intermediate CA -> server. In the Web world, users are expected to have Root CA certificate imported and the server is expected to serve the chain with its own certificate and Intermediate CA certificate (e.g., see SSLCertificateFile, SSLCertificateKeyFile, and SSLCertificateChainFile for Apache at http://httpd.apache.org/docs/2.2/mod/mod_ssl.html).

          This works for GnuTLS and OpenSSL versions of Zabbix agent. However, when we attempt to specify a certificate chain in mbed TLS agent's TLSCertFile, we get the following error during startup and the agent refuses to start:

            3462:20150908:160002.349 cannot parse 1 certificate(s) in file "agent-polarssl-1.3.9-install/cert/good-utf-8-dns.crt"
          

          There is a similar problem if we try to specify multiple chained certificates in TLSCAFile.

          Aleksandrs Saveljevs Actually, the problem is with OpenSSL, too: it loads the certificate file just fine, but tcpdump shows that it only sends one certificate during handshake, instead of the whole chain. GnuTLS seems to work well in this regard so far.

          Aleksandrs Saveljevs This seems to be a blocker for further testing.

          Andris Mednis Seems no problem with mbedTLS and GnuTLS. For OpenSSL RESOLVED in r55713.

          Aleksandrs Saveljevs The problem with mbed TLS is described below in (84).

          Aleksandrs Saveljevs OpenSSL fix looks good, except a small label suggestion in r55777. RESOLVED.

          Andris Mednis Thanks, CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (83) [S] Suppose we have the following PKI infrastructure: Root CA -> Intermediate CA -> server. In the Web world, users are expected to have Root CA certificate imported and the server is expected to serve the chain with its own certificate and Intermediate CA certificate (e.g., see SSLCertificateFile, SSLCertificateKeyFile, and SSLCertificateChainFile for Apache at http://httpd.apache.org/docs/2.2/mod/mod_ssl.html ). This works for GnuTLS and OpenSSL versions of Zabbix agent. However, when we attempt to specify a certificate chain in mbed TLS agent's TLSCertFile, we get the following error during startup and the agent refuses to start: 3462:20150908:160002.349 cannot parse 1 certificate(s) in file "agent-polarssl-1.3.9-install/cert/good-utf-8-dns.crt" There is a similar problem if we try to specify multiple chained certificates in TLSCAFile. Aleksandrs Saveljevs Actually, the problem is with OpenSSL, too: it loads the certificate file just fine, but tcpdump shows that it only sends one certificate during handshake, instead of the whole chain. GnuTLS seems to work well in this regard so far. Aleksandrs Saveljevs This seems to be a blocker for further testing. Andris Mednis Seems no problem with mbedTLS and GnuTLS. For OpenSSL RESOLVED in r55713. Aleksandrs Saveljevs The problem with mbed TLS is described below in (84). Aleksandrs Saveljevs OpenSSL fix looks good, except a small label suggestion in r55777. RESOLVED. Andris Mednis Thanks, CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (84) [D] During testing, I initially tried issuing certificates for IP addresses in subjectAltName. However, it did not work. In particular, OpenSSL does not support IP constraints in CA certificates (see https://www.openssl.org/docs/manmaster/crypto/X509_STORE_CTX_get_error.html ):

          X509_V_ERR_UNSUPPORTED_CONSTRAINT_TYPE: unsupported name constraint type
          An unsupported name constraint type was encountered. OpenSSL currently only supports directory name, DNS name, email and URI types.

          Error messages that Zabbix logged were a bit cryptic (like "unknown CA"). We might wish to mention somewhere that issuing certificates for IP addresses may be problematic, at least when critical IP name constraints are used.

          Aleksandrs Saveljevs It seems that mbed TLS does not understand nameConstraints and refuses to load CA certificates where this section is marked as critical. This seems to have been the initial cause for (83) and we might wish to document that.

          Aleksandrs Saveljevs If an mbed TLS agent or proxy receives a certificate chain from the server that contains a critical nameConstraints section, it logs the following:

           24232:20150925:151623.462 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): X509 - The extension tag or value is invalid : ASN1 - ASN1 tag was of an unexpected value
          

          Andris Mednis Documented in
          https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#limitations_on_using_x509_v3_certificate_extensions

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (84) [D] During testing, I initially tried issuing certificates for IP addresses in subjectAltName. However, it did not work. In particular, OpenSSL does not support IP constraints in CA certificates (see https://www.openssl.org/docs/manmaster/crypto/X509_STORE_CTX_get_error.html ): X509_V_ERR_UNSUPPORTED_CONSTRAINT_TYPE: unsupported name constraint type An unsupported name constraint type was encountered. OpenSSL currently only supports directory name, DNS name, email and URI types. Error messages that Zabbix logged were a bit cryptic (like "unknown CA"). We might wish to mention somewhere that issuing certificates for IP addresses may be problematic, at least when critical IP name constraints are used. Aleksandrs Saveljevs It seems that mbed TLS does not understand nameConstraints and refuses to load CA certificates where this section is marked as critical. This seems to have been the initial cause for (83) and we might wish to document that. Aleksandrs Saveljevs If an mbed TLS agent or proxy receives a certificate chain from the server that contains a critical nameConstraints section, it logs the following: 24232:20150925:151623.462 failed to accept an incoming connection: from 127.0.0.1: ssl_handshake(): X509 - The extension tag or value is invalid : ASN1 - ASN1 tag was of an unexpected value Andris Mednis Documented in https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#limitations_on_using_x509_v3_certificate_extensions Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (85) [S] Suppose we have a server connecting to an OpenSSL proxy and there is something wrong with proxy's certificate. Then the proxy logs the following:

          27409:20150910:122722.150 failed to accept an incoming connection: from 127.0.0.1: TLS connection from 127.0.0.1 has been closed during handshake: file s3_pkt.c line 1256: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48: TLS read fatal alert "unknown CA"
          ...
          27409:20150910:122634.639 Cannot get socket IP address: [107] Transport endpoint is not connected
          27409:20150910:122634.639 failed to accept an incoming connection: from unknown IP: TLS connection from 127.0.0.1 has been closed during handshake: file s3_pkt.c line 1256: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48: TLS read fatal alert "unknown CA"
          ...
          27409:20150910:122629.557 Cannot get socket IP address: [107] Transport endpoint is not connected
          27409:20150910:122629.557 Cannot get socket IP address: [107] Transport endpoint is not connected
          27409:20150910:122629.557 failed to accept an incoming connection: from unknown IP: TLS connection from unknown IP has been closed during handshake: file s3_pkt.c line 1256: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48: TLS read fatal alert "unknown CA"
          

          It can be seen that sometimes it obtains an IP in both places correctly, in some cases - in just one place, and in other cases - in neither of them. If possible, it would be nice to always obtain a peer's IP address.

          In comparison, here are the error messages from GnuTLS proxy. It never seems to fail to get an IP address:

           27389:20150910:122641.698 failed to accept an incoming connection: from 127.0.0.1: invalid peer certificate: The certificate is NOT trusted. The certificate issuer is unknown. 
          

          Aleksandrs Saveljevs Similarly, suppose we have an mbed TLS server, an OpenSSL proxy, and the proxy certificate is issued by an untrusted CA. The proxy logs is then as follows:

           27600:20150930:173635.629 Cannot get socket IP address: [107] Transport endpoint is not connected
           27600:20150930:173635.629 Cannot get socket IP address: [107] Transport endpoint is not connected
           27600:20150930:173635.629 failed to accept an incoming connection: from unknown IP: TLS handshake with unknown IP returned error code 5::
          

          It seems like there should be something at the end of the message.

          Aleksandrs Saveljevs For comparison, here are the corresponding messages for mbed TLS and GnuTLS proxies:

           27560:20150930:173804.460 Cannot get socket IP address: [107] Transport endpoint is not connected
           27560:20150930:173804.460 failed to accept an incoming connection: from unknown IP: ssl_handshake(): NET - Connection was reset by peer
          
           27580:20150930:173822.616 zbx_tls_accept(): gnutls_handshake() returned: -54 Error in the pull function.
           27580:20150930:173822.616 Cannot get socket IP address: [107] Transport endpoint is not connected
           27580:20150930:173822.616 failed to accept an incoming connection: from unknown IP: zbx_tls_accept(): gnutls_handshake() failed: -54 Error in the pull function.
          

          These are not ideal either, but at least they seem to provide as much information as they can (except IP address).

          Andris Mednis Getting IP address RESOLVED in r56122.

          Aleksandrs Saveljevs Looks plausible, but zbx_tls_accept() was not the only place where we called get_ip_by_socket(). For instance, zbx_tls_close() needs it, too. Since we are now doing get_ip_by_socket() in the very beginning anyway, can we save the result in the socket structure to reuse later? REOPENED.

          Aleksandrs Saveljevs Here we can see that the IP address is needed during shutdown:

           26785:20151027:151649.191 Cannot get socket IP address: [107] Transport endpoint is not connected
           26785:20151027:151649.191 TLS shutdown with unknown IP returned error code 5: 
          

          Andris Mednis RESOLVED in r57165.

          Aleksandrs Saveljevs Looks good. I have removed the use of get_ip_by_socket() from src/zabbix_server/trapper/active.c in r57332. Please take a look.

          That function is now only used in src/libs/zbxcomms/comms.c, so theoretically it can be made a static function. Do we wish to do it?

          Andris Mednis Thanks for improvement, r57332 accepted. I agree with idea to make get_ip_by_socket() static.

          Aleksandrs Saveljevs RESOLVED in r57343.

          Andris Mednis Thanks! CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (85) [S] Suppose we have a server connecting to an OpenSSL proxy and there is something wrong with proxy's certificate. Then the proxy logs the following: 27409:20150910:122722.150 failed to accept an incoming connection: from 127.0.0.1: TLS connection from 127.0.0.1 has been closed during handshake: file s3_pkt.c line 1256: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48: TLS read fatal alert "unknown CA" ... 27409:20150910:122634.639 Cannot get socket IP address: [107] Transport endpoint is not connected 27409:20150910:122634.639 failed to accept an incoming connection: from unknown IP: TLS connection from 127.0.0.1 has been closed during handshake: file s3_pkt.c line 1256: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48: TLS read fatal alert "unknown CA" ... 27409:20150910:122629.557 Cannot get socket IP address: [107] Transport endpoint is not connected 27409:20150910:122629.557 Cannot get socket IP address: [107] Transport endpoint is not connected 27409:20150910:122629.557 failed to accept an incoming connection: from unknown IP: TLS connection from unknown IP has been closed during handshake: file s3_pkt.c line 1256: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca: SSL alert number 48: TLS read fatal alert "unknown CA" It can be seen that sometimes it obtains an IP in both places correctly, in some cases - in just one place, and in other cases - in neither of them. If possible, it would be nice to always obtain a peer's IP address. In comparison, here are the error messages from GnuTLS proxy. It never seems to fail to get an IP address: 27389:20150910:122641.698 failed to accept an incoming connection: from 127.0.0.1: invalid peer certificate: The certificate is NOT trusted. The certificate issuer is unknown. Aleksandrs Saveljevs Similarly, suppose we have an mbed TLS server, an OpenSSL proxy, and the proxy certificate is issued by an untrusted CA. The proxy logs is then as follows: 27600:20150930:173635.629 Cannot get socket IP address: [107] Transport endpoint is not connected 27600:20150930:173635.629 Cannot get socket IP address: [107] Transport endpoint is not connected 27600:20150930:173635.629 failed to accept an incoming connection: from unknown IP: TLS handshake with unknown IP returned error code 5:: It seems like there should be something at the end of the message. Aleksandrs Saveljevs For comparison, here are the corresponding messages for mbed TLS and GnuTLS proxies: 27560:20150930:173804.460 Cannot get socket IP address: [107] Transport endpoint is not connected 27560:20150930:173804.460 failed to accept an incoming connection: from unknown IP: ssl_handshake(): NET - Connection was reset by peer 27580:20150930:173822.616 zbx_tls_accept(): gnutls_handshake() returned: -54 Error in the pull function. 27580:20150930:173822.616 Cannot get socket IP address: [107] Transport endpoint is not connected 27580:20150930:173822.616 failed to accept an incoming connection: from unknown IP: zbx_tls_accept(): gnutls_handshake() failed: -54 Error in the pull function. These are not ideal either, but at least they seem to provide as much information as they can (except IP address). Andris Mednis Getting IP address RESOLVED in r56122. Aleksandrs Saveljevs Looks plausible, but zbx_tls_accept() was not the only place where we called get_ip_by_socket(). For instance, zbx_tls_close() needs it, too. Since we are now doing get_ip_by_socket() in the very beginning anyway, can we save the result in the socket structure to reuse later? REOPENED. Aleksandrs Saveljevs Here we can see that the IP address is needed during shutdown: 26785:20151027:151649.191 Cannot get socket IP address: [107] Transport endpoint is not connected 26785:20151027:151649.191 TLS shutdown with unknown IP returned error code 5: Andris Mednis RESOLVED in r57165. Aleksandrs Saveljevs Looks good. I have removed the use of get_ip_by_socket() from src/zabbix_server/trapper/active.c in r57332. Please take a look. That function is now only used in src/libs/zbxcomms/comms.c, so theoretically it can be made a static function. Do we wish to do it? Andris Mednis Thanks for improvement, r57332 accepted. I agree with idea to make get_ip_by_socket() static. Aleksandrs Saveljevs RESOLVED in r57343. Andris Mednis Thanks! CLOSED
          Hide
          richlv added a comment - - edited

          (86) for sender, we may specify "--tls-psk-file" and "--tls-psk-identity" along with "--tls-connect unencrypted" or omitting "--tls-connect" - maybe we should only allow psk parameters with "--tls-connect psk" ?

          Aleksandrs Saveljevs Related issue: (33).

          Aleksandrs Saveljevs zabbix_sender currently allows to specify "--tls-connect cert" without providing any other certificate parameters. It will then establish a connection to the server, but fail during TLS handshake. We should detect the absence of certificate settings as soon as possible, without disturbing the server.

          Aleksandrs Saveljevs The latter case RESOLVED in r56745 and r56746. Additionally, attached "subissue-86-r56746.patch" makes a fully passive agent ignore TLSConnect setting and a fully active agent ignore TLSAccept setting. However, it requires CONFIG_PASSIVE_FORKS and CONFIG_ACTIVE_FORKS to be available for zabbix_agent, zabbix_get, zabbix_sender, which does not make it pretty, so the patch is open for consideration.

          Aleksandrs Saveljevs As for richlv's proposal, let's make configuration file and command-line parameter validation the same. Especially since zabbix_sender can reuse agent configuration file, so if we make it complain about PSK parameters in the configuration file if "--tls-connect cert --tls-cert-file ..." is on the command line, it will make overriding more difficult.

          Aleksandrs Saveljevs Note that ZBXNEXT-3045 makes it a bit easier for the patch.

          Andris Mednis Patch "subissue-86-r56746.patch" applied in r58121, looks good.

          Aleksandrs Saveljevs Thank you! CLOSED.

          Show
          richlv added a comment - - edited (86) for sender, we may specify "--tls-psk-file" and "--tls-psk-identity" along with "--tls-connect unencrypted" or omitting "--tls-connect" - maybe we should only allow psk parameters with "--tls-connect psk" ? Aleksandrs Saveljevs Related issue: (33). Aleksandrs Saveljevs zabbix_sender currently allows to specify "--tls-connect cert" without providing any other certificate parameters. It will then establish a connection to the server, but fail during TLS handshake. We should detect the absence of certificate settings as soon as possible, without disturbing the server. Aleksandrs Saveljevs The latter case RESOLVED in r56745 and r56746. Additionally, attached "subissue-86-r56746.patch" makes a fully passive agent ignore TLSConnect setting and a fully active agent ignore TLSAccept setting. However, it requires CONFIG_PASSIVE_FORKS and CONFIG_ACTIVE_FORKS to be available for zabbix_agent, zabbix_get, zabbix_sender, which does not make it pretty, so the patch is open for consideration. Aleksandrs Saveljevs As for richlv 's proposal, let's make configuration file and command-line parameter validation the same. Especially since zabbix_sender can reuse agent configuration file, so if we make it complain about PSK parameters in the configuration file if "--tls-connect cert --tls-cert-file ..." is on the command line, it will make overriding more difficult. Aleksandrs Saveljevs Note that ZBXNEXT-3045 makes it a bit easier for the patch. Andris Mednis Patch "subissue-86-r56746.patch" applied in r58121, looks good. Aleksandrs Saveljevs Thank you! CLOSED.
          Hide
          Raymond Kuiper added a comment -

          Created ZBXNEXT-2962 to add internal events to the crypto system.

          Show
          Raymond Kuiper added a comment - Created ZBXNEXT-2962 to add internal events to the crypto system.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (87) [I] Update to trunk in r55786 incorrectly handled (i.e., ignored) the change to build/win32/project/Makefile_agent.inc in r55782 (ZBX-9847).

          Aleksandrs Saveljevs Fixed in r55804. RESOLVED.

          Andris Zeila CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (87) [I] Update to trunk in r55786 incorrectly handled (i.e., ignored) the change to build/win32/project/Makefile_agent.inc in r55782 ( ZBX-9847 ). Aleksandrs Saveljevs Fixed in r55804. RESOLVED. Andris Zeila CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (88) [F] There is an undefined index when trying to save the following configuration:

          Oleg Egorov This issue exist in trunk, see ZBX-9949

          CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (88) [F] There is an undefined index when trying to save the following configuration: Oleg Egorov This issue exist in trunk, see ZBX-9949 CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (89) [S] Commit r55842 tries to resolve a conflict with ZBX-9274. Earlier in this task, we tried to make lines fit into 79 characters by default. Unfortunately, after ZBX-9274 they are longer:

          $ src/zabbix_agent/zabbix_agentd --help
          ...
          Options:
            -c --config config-file               Absolute path to the configuration file
                                                  (default: "/home/Developer/projects/zabbix-bin/etc/zabbix_agentd.conf")
            -p --print                            Print known items and exit
            -t --test item-key                    Test specified item and exit
            -R --runtime-control runtime-option   Perform administrative functions
          ...
          

          Commit r55842 should be reviewed and we should decide what to do with these long lines.

          Aleksandrs Saveljevs Even if we do not specify a custom path, lines are still longer that 79 characters (86 characters in the example below):

            -c --config config-file               Absolute path to the configuration file
                                                  (default: "/usr/local/etc/zabbix_agentd.conf")
          

          Andris Mednis For default values RESOLVED in r56077.

          Aleksandrs Saveljevs Some lines in proxy and server output could be joined. Please see r56105. Apart from that, looks good.

          Andris Mednis Thanks, accepted. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (89) [S] Commit r55842 tries to resolve a conflict with ZBX-9274 . Earlier in this task, we tried to make lines fit into 79 characters by default. Unfortunately, after ZBX-9274 they are longer: $ src/zabbix_agent/zabbix_agentd --help ... Options: -c --config config-file Absolute path to the configuration file (default: "/home/Developer/projects/zabbix-bin/etc/zabbix_agentd.conf") -p --print Print known items and exit -t --test item-key Test specified item and exit -R --runtime-control runtime-option Perform administrative functions ... Commit r55842 should be reviewed and we should decide what to do with these long lines. Aleksandrs Saveljevs Even if we do not specify a custom path, lines are still longer that 79 characters (86 characters in the example below): -c --config config-file Absolute path to the configuration file (default: "/usr/local/etc/zabbix_agentd.conf") Andris Mednis For default values RESOLVED in r56077. Aleksandrs Saveljevs Some lines in proxy and server output could be joined. Please see r56105. Apart from that, looks good. Andris Mednis Thanks, accepted. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (90) [S] Suppose we have a GnuTLS server connecting to OpenSSL proxy. We then have a memory leak:

          Andris Mednis Seems like the "memory leak" is in fact a growing session cache. RESOLVED in r55905.

          Andris Mednis mbed TLS and GnuTLS have no built-in and enabled by default session caching, some external infrastructure (memory, initialization, callback functions) must be provided by application to achieve it.

          Aleksandrs Saveljevs Indeed, memory usage no longer increases for OpenSSL. Nice find! CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (90) [S] Suppose we have a GnuTLS server connecting to OpenSSL proxy. We then have a memory leak: Andris Mednis Seems like the "memory leak" is in fact a growing session cache. RESOLVED in r55905. Andris Mednis mbed TLS and GnuTLS have no built-in and enabled by default session caching, some external infrastructure (memory, initialization, callback functions) must be provided by application to achieve it. Aleksandrs Saveljevs Indeed, memory usage no longer increases for OpenSSL. Nice find! CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (91) [S] Suppose we have a GnuTLS server and an OpenSSL proxy, which has a CA certificate (i.e., one that only has "Certificate Sign" and "CRL Sign" critical key usages). Communication between them works perfectly, while it shouldn't, similarly to how it fails with mbed TLS and GnuTLS proxies.

          Aleksandrs Saveljevs Although not critical, but there is something to be improved in case of GnuTLS proxy:

           30956:20151001:102408.107 cannot connect to proxy "proxy-gnutls-3.2.21": TCP connection successful, cannot establish TLS to [[127.0.0.1]:20003]: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated.
          

          The suspicious part here is "The TLS connection was non-properly terminated". Compare that with a message in case of mbed TLS proxy:

           30956:20151001:102527.676 cannot connect to proxy "proxy-mbedtls-1.3.10": TCP connection successful, cannot establish TLS to [[127.0.0.1]:20002]: zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 40 Handshake failed
          

          Aleksandrs Saveljevs Another example where communication works when it shouldn't is when the passive proxy certificate is only issued with "TLS Web Client Authentication", but not "TLS Web Server Authentication" critical extended key usage. It fails correctly for mbed TLS and OpenSSL servers, but a GnuTLS server communicates well with GnuTLS proxy and OpenSSL proxy (but not mbed TLS proxy).

          Aleksandrs Saveljevs A similar scenario is observed when a proxy is active: a GnuTLS server talks to such a proxy, which has a CA certificate. The server logs the following, but sends configuration anyway:

           23644:20151002:123602.274 GnuTLS audit: "Peer's certificate does not allow digital signatures. Key usage violation detected (ignored)."
           23644:20151002:123602.300 GnuTLS audit: "Peer's certificate does not allow digital signatures. Key usage violation detected (ignored)."
           23644:20151002:123602.302 sending configuration data to proxy "proxy-openssl-1.0.1e" at "127.0.0.1", datalen 2658
          

          Andris Mednis

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (91) [S] Suppose we have a GnuTLS server and an OpenSSL proxy, which has a CA certificate (i.e., one that only has "Certificate Sign" and "CRL Sign" critical key usages). Communication between them works perfectly, while it shouldn't, similarly to how it fails with mbed TLS and GnuTLS proxies. Aleksandrs Saveljevs Although not critical, but there is something to be improved in case of GnuTLS proxy: 30956:20151001:102408.107 cannot connect to proxy "proxy-gnutls-3.2.21": TCP connection successful, cannot establish TLS to [[127.0.0.1]:20003]: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated. The suspicious part here is "The TLS connection was non-properly terminated". Compare that with a message in case of mbed TLS proxy: 30956:20151001:102527.676 cannot connect to proxy "proxy-mbedtls-1.3.10": TCP connection successful, cannot establish TLS to [[127.0.0.1]:20002]: zbx_tls_connect(): gnutls_handshake() failed with fatal alert: 40 Handshake failed Aleksandrs Saveljevs Another example where communication works when it shouldn't is when the passive proxy certificate is only issued with "TLS Web Client Authentication", but not "TLS Web Server Authentication" critical extended key usage. It fails correctly for mbed TLS and OpenSSL servers, but a GnuTLS server communicates well with GnuTLS proxy and OpenSSL proxy (but not mbed TLS proxy). Aleksandrs Saveljevs A similar scenario is observed when a proxy is active: a GnuTLS server talks to such a proxy, which has a CA certificate. The server logs the following, but sends configuration anyway: 23644:20151002:123602.274 GnuTLS audit: "Peer's certificate does not allow digital signatures. Key usage violation detected (ignored)." 23644:20151002:123602.300 GnuTLS audit: "Peer's certificate does not allow digital signatures. Key usage violation detected (ignored)." 23644:20151002:123602.302 sending configuration data to proxy "proxy-openssl-1.0.1e" at "127.0.0.1", datalen 2658 Andris Mednis GnuTLS http://gnutls.org/faq.html#key-usage-violation : "... newer versions of GnuTLS will also allow such key usage violations (and will only output a warning message).". https://lists.gnu.org/archive/html/info-gnu/2012-11/msg00004.html: * Version 3.1.4 (released 2012-11-10) ... ** libgnutls: Always tolerate key usage violation errors from the side of the peer, but also notify via an audit message. GnuTLS ChangeLog: * NEWS: released 3.1.5 .... 2012-11-07 Nikos Mavrogiannopoulos <nmav@gnutls.org> * NEWS, lib/gnutls_int.h, lib/gnutls_sig.c: key usage violations are tolerated. Documented in https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#limitations_on_using_x509_v3_certificate_extensions . Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (92) [S] The same scenario as in (91), but the server is now OpenSSL. It logs these messages:

           31733:20151001:103024.698 cannot connect to proxy "proxy-openssl-1.0.1e": TCP connection successful, cannot establish TLS to [[127.0.0.1]:20004]: SSL_connect() returned SSL_ERROR_SSL: file s3_clnt.c line 1253: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed: TLS write fatal alert "unsupported c
          

          It can be seen that the error message is truncated. Apparently, 256 bytes for "info_buf" variable are not enough.

          Andris Mednis Message truncation RESOLVED in r57993.
          "info_buf" was not changed, so far its size is sufficient for TLS alert printing.
          See also solution of (152) where messages from OpenSSL in case of not allowed certificate use are improved.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (92) [S] The same scenario as in (91), but the server is now OpenSSL. It logs these messages: 31733:20151001:103024.698 cannot connect to proxy "proxy-openssl-1.0.1e": TCP connection successful, cannot establish TLS to [[127.0.0.1]:20004]: SSL_connect() returned SSL_ERROR_SSL: file s3_clnt.c line 1253: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed: TLS write fatal alert "unsupported c It can be seen that the error message is truncated. Apparently, 256 bytes for "info_buf" variable are not enough. Andris Mednis Message truncation RESOLVED in r57993. "info_buf" was not changed, so far its size is sufficient for TLS alert printing. See also solution of (152) where messages from OpenSSL in case of not allowed certificate use are improved. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (93) [S] Something is improper with dynamically changing logging levels.

          Assume the same scenario as in (92). A GnuTLS proxy initially logs the following every second:

           32419:20151001:104527.821 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid.
           32419:20151001:104527.821 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -50 The request is invalid.
          

          After we increase the log level once and decrease it once, it starts logging just the following:

             343:20151001:104844.013 Got signal [signal:10(SIGUSR1),sender_pid:339,sender_uid:1000,value_int:1538(0x00000602)].
             343:20151001:104844.013 log level has been decreased to 3 (warning)
             343:20151001:104845.002 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid.
             343:20151001:104846.006 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid.
             343:20151001:104847.015 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid.
             343:20151001:104848.019 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid.
             343:20151001:104849.028 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid.
             343:20151001:104850.031 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid.
             343:20151001:104851.039 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid.
          

          Glebs Ivanovskis This issue was discovered to be a big one, therefore it was decided to fix it under its own ZBX-10042. CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (93) [S] Something is improper with dynamically changing logging levels. Assume the same scenario as in (92). A GnuTLS proxy initially logs the following every second: 32419:20151001:104527.821 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid. 32419:20151001:104527.821 failed to accept an incoming connection: from 127.0.0.1: zbx_tls_accept(): gnutls_handshake() failed: -50 The request is invalid. After we increase the log level once and decrease it once, it starts logging just the following: 343:20151001:104844.013 Got signal [signal:10(SIGUSR1),sender_pid:339,sender_uid:1000,value_int:1538(0x00000602)]. 343:20151001:104844.013 log level has been decreased to 3 (warning) 343:20151001:104845.002 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid. 343:20151001:104846.006 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid. 343:20151001:104847.015 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid. 343:20151001:104848.019 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid. 343:20151001:104849.028 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid. 343:20151001:104850.031 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid. 343:20151001:104851.039 zbx_tls_accept(): gnutls_handshake() returned: -50 The request is invalid. Glebs Ivanovskis This issue was discovered to be a big one, therefore it was decided to fix it under its own ZBX-10042 . CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (94) [S] Suppose we have an mbed TLS server and an active GnuTLS proxy. The proxy uses a CA certificate. When it tries to connect to the server, it logs these messages:

           22485:20151002:112435.757 GnuTLS audit: "Peer's certificate does not allow digital signatures. Key usage violation detected (ignored)."
           22485:20151002:112435.759 Unable to connect to the server [127.0.0.1]:10001 [TCP connection successful, cannot establish TLS to [[127.0.0.1]:10001]: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated.]
          

          This may be a bit confusing, because the audit message seems to refer to the peer (the server), but it actually refers to the proxy itself. Fixing this is optional.

          Aleksandrs Saveljevs This GnuTLS audit message is also logged by itself without any identifying information in case of GnuTLS server.

          Andris Mednis Comment for issue (91) describes GnuTLS policy in case of key usage violation.

          Aleksandrs Saveljevs The audit message was not fixed, but OK. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (94) [S] Suppose we have an mbed TLS server and an active GnuTLS proxy. The proxy uses a CA certificate. When it tries to connect to the server, it logs these messages: 22485:20151002:112435.757 GnuTLS audit: "Peer's certificate does not allow digital signatures. Key usage violation detected (ignored)." 22485:20151002:112435.759 Unable to connect to the server [127.0.0.1]:10001 [TCP connection successful, cannot establish TLS to [[127.0.0.1]:10001]: zbx_tls_connect(): gnutls_handshake() failed: -110 The TLS connection was non-properly terminated.] This may be a bit confusing, because the audit message seems to refer to the peer (the server), but it actually refers to the proxy itself. Fixing this is optional. Aleksandrs Saveljevs This GnuTLS audit message is also logged by itself without any identifying information in case of GnuTLS server. Andris Mednis Comment for issue (91) describes GnuTLS policy in case of key usage violation. Aleksandrs Saveljevs The audit message was not fixed, but OK. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (95) [S] Suppose we corrupt a CRL file in some way. For instance, suppose we have CRL 1 and CRL 2, and we concatenate them together. We then remove -----END X509 CRL---- at the end of the first CRL, remove -----BEGIN X509 CRL---- at the beginning of the second CRL, and join the two ends together, removing some characters to make the middle line the same length as other lines.

          If we perform the procedure above, the mbed TLS and GnuTLS servers will complain about a bad CRL file. However, an OpenSSL server starts happily without any error messages.

          The bad news is that "openssl crl -in root-ca-1-and-2.crl -text -noout" command for CRL inspection also works on this file and only prints the contents of CRL 1. If, however, it is possible to ask OpenSSL for a stricter validation of the CRL file, then we should do it.

          Aleksandrs Saveljevs Actually, it seems that the OpenSSL command above reads only the first CRL from the file by default, similar to how it read only the first certificate in (83). It happens that way even if the CRL file is not corrupted.

          Andris Mednis Yes, "slightly corrupted" CRL file loads successfully into OpenSSL (tested with r56034) but TLS connections will fail at runtime - there will be error messages like

          TCP successful, cannot establish TLS to [[127.0.0.1]:10092]: SSL_connect() returned SSL_ERROR_SSL: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routine
          

          and

          Get value from agent failed: TCP successful, cannot establish TLS to [[127.0.0.1]:10070]: SSL_connect() returned SSL_ERROR_SSL: file s3_clnt.c line 1253: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed: TLS write fatal alert "unknown CA"
          

          so it is not silently ignored.
          Heavily corrupted CRL file is rejected by OpenSSL:

          cannot load CRL(s) from file "...zabbix_crl_file": file asn1_lib.c line 147: error:0D07209B:asn1 encoding routines:ASN1_get_object:too long file tasn_dec.c line 1186: error:0D068066:asn1 encoding routines:ASN1_CHECK_TLEN:bad object header file tasn_dec.c line 372: error:0D07803A:asn1 encoding routines:ASN1_ITEM_EX_D2I:nested asn1 error: Type=X509_CRL file pem_oth.c line 83: error:0906700D:PEM routines:PEM_ASN1_read_bio:ASN1 lib file by_file.c line 209: error:0B070009:x509 certificate routines:X509_load_crl_file:PEM lib
          

          Aleksandrs Saveljevs Not ideal, but probably good enough: it is possible to corrupt a CRL in a more delicate way (e.g., by tweaking signature bits), so that CRL validity is only known at runtime anyway. WON'T FIX.

          Show
          Aleksandrs Saveljevs added a comment - - edited (95) [S] Suppose we corrupt a CRL file in some way. For instance, suppose we have CRL 1 and CRL 2, and we concatenate them together. We then remove -----END X509 CRL---- at the end of the first CRL, remove -----BEGIN X509 CRL ---- at the beginning of the second CRL, and join the two ends together, removing some characters to make the middle line the same length as other lines. If we perform the procedure above, the mbed TLS and GnuTLS servers will complain about a bad CRL file. However, an OpenSSL server starts happily without any error messages. The bad news is that "openssl crl -in root-ca-1-and-2.crl -text -noout" command for CRL inspection also works on this file and only prints the contents of CRL 1. If, however, it is possible to ask OpenSSL for a stricter validation of the CRL file, then we should do it. Aleksandrs Saveljevs Actually, it seems that the OpenSSL command above reads only the first CRL from the file by default, similar to how it read only the first certificate in (83). It happens that way even if the CRL file is not corrupted. Andris Mednis Yes, "slightly corrupted" CRL file loads successfully into OpenSSL (tested with r56034) but TLS connections will fail at runtime - there will be error messages like TCP successful, cannot establish TLS to [[127.0.0.1]:10092]: SSL_connect() returned SSL_ERROR_SSL: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routine and Get value from agent failed: TCP successful, cannot establish TLS to [[127.0.0.1]:10070]: SSL_connect() returned SSL_ERROR_SSL: file s3_clnt.c line 1253: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed: TLS write fatal alert "unknown CA" so it is not silently ignored. Heavily corrupted CRL file is rejected by OpenSSL: cannot load CRL(s) from file "...zabbix_crl_file" : file asn1_lib.c line 147: error:0D07209B:asn1 encoding routines:ASN1_get_object:too long file tasn_dec.c line 1186: error:0D068066:asn1 encoding routines:ASN1_CHECK_TLEN:bad object header file tasn_dec.c line 372: error:0D07803A:asn1 encoding routines:ASN1_ITEM_EX_D2I:nested asn1 error: Type=X509_CRL file pem_oth.c line 83: error:0906700D:PEM routines:PEM_ASN1_read_bio:ASN1 lib file by_file.c line 209: error:0B070009:x509 certificate routines:X509_load_crl_file:PEM lib Aleksandrs Saveljevs Not ideal, but probably good enough: it is possible to corrupt a CRL in a more delicate way (e.g., by tweaking signature bits), so that CRL validity is only known at runtime anyway. WON'T FIX.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (96) [S] This is related to (95), which talked about a corrupted CRL file. Here, nothing is corrupted, but OpenSSL server does not seem to take the CRL into account (the same configuration with mbed TLS and GnuTLS servers rejects revoked certificates).

          Andris Mednis RESOLVED in r56034. Now OpenSSL requires and checks CRLs along all certificate chain.

          Aleksandrs Saveljevs The error messages had the downside that they made the executable a bit larger:

          $ strings src/zabbix_agent/zabbix_agentd | grep X509_STORE
          ...
          X509_STORE_add_lookup() 1 failed when loading CRL(s) from file "%s":
          X509_STORE_set_flags() 1 failed when loading CRL(s) from file "%s":
          X509_STORE_add_lookup() 2 failed when loading CRL(s) from file "%s":
          X509_STORE_set_flags() 2 failed when loading CRL(s) from file "%s":
          ...
          

          Commit r56079 replaces them with the following instead:

          $ strings src/zabbix_agent/zabbix_agentd | grep X509_STORE
          ...
          X509_STORE_add_lookup() #%d failed when loading CRL(s) from file "%s":
          X509_STORE_set_flags() #%d failed when loading CRL(s) from file "%s":
          ...
          

          Please take a look. Still RESOLVED.

          Andris Mednis Thanks for improvement! Accepted. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (96) [S] This is related to (95), which talked about a corrupted CRL file. Here, nothing is corrupted, but OpenSSL server does not seem to take the CRL into account (the same configuration with mbed TLS and GnuTLS servers rejects revoked certificates). Andris Mednis RESOLVED in r56034. Now OpenSSL requires and checks CRLs along all certificate chain. Aleksandrs Saveljevs The error messages had the downside that they made the executable a bit larger: $ strings src/zabbix_agent/zabbix_agentd | grep X509_STORE ... X509_STORE_add_lookup() 1 failed when loading CRL(s) from file "%s": X509_STORE_set_flags() 1 failed when loading CRL(s) from file "%s": X509_STORE_add_lookup() 2 failed when loading CRL(s) from file "%s": X509_STORE_set_flags() 2 failed when loading CRL(s) from file "%s": ... Commit r56079 replaces them with the following instead: $ strings src/zabbix_agent/zabbix_agentd | grep X509_STORE ... X509_STORE_add_lookup() #%d failed when loading CRL(s) from file "%s": X509_STORE_set_flags() #%d failed when loading CRL(s) from file "%s": ... Please take a look. Still RESOLVED. Andris Mednis Thanks for improvement! Accepted. CLOSED.
          Hide
          Andris Mednis added a comment - - edited

          (97) [S] OpenSSL generates session tickets but their use is not supported in this release.

          Andris Mednis RESOLVED in r56048 (disabled generation of tickets).

          Aleksandrs Saveljevs Looks good. Additionally, please see r56109, which makes RFC references in tls.c consistent.

          Andris Mednis Thanks, r56109 accepted. CLOSED.

          Show
          Andris Mednis added a comment - - edited (97) [S] OpenSSL generates session tickets but their use is not supported in this release. Andris Mednis RESOLVED in r56048 (disabled generation of tickets). Aleksandrs Saveljevs Looks good. Additionally, please see r56109, which makes RFC references in tls.c consistent. Andris Mednis Thanks, r56109 accepted. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (98) [S] zbx_tls_init_child() function prints debug messages for mbed TLS and OpenSSL using __function_name prefix, for GnuTLS - without.

          Andris Mednis RESOLVED in r56082.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (98) [S] zbx_tls_init_child() function prints debug messages for mbed TLS and OpenSSL using __function_name prefix, for GnuTLS - without. Andris Mednis RESOLVED in r56082. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (99) [DS] There is a discussion on the CRL checking strategy, which is driven by the OpenSSL flavor of Zabbix.

          Man page https://www.openssl.org/docs/manmaster/crypto/X509_VERIFY_PARAM_set_flags.html describes two options for CRL checking:

          X509_V_FLAG_CRL_CHECK enables CRL checking for the certificate chain leaf certificate. An error occurs if a suitable CRL cannot be found.

          X509_V_FLAG_CRL_CHECK_ALL enables CRL checking for the entire certificate chain.

          If TLSCRLFile is specified, then when a server makes a connection or accepts one, both of these options require a CRL for the peer's CA to be present in TLSCRLFile. However, flag X509_V_FLAG_CRL_CHECK_ALL additionally requires CRL to be present for all CAs in the chain. Otherwise, the peer's certificate is rejected with a cryptic message like the following in case of mbed TLS and OpenSSL peers:

           15486:20151013:095213.262 failed to accept an incoming connection: from 127.0.0.1:
              TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 3251: error:14089086:
              SSL routines:ssl3_get_client_certificate:certificate verify failed: TLS write fatal alert "unknown CA"
          

          In case of a GnuTLS peer, it rejects with the following log message:

           15818:20151013:095529.228 failed to accept an incoming connection: from 127.0.0.1:
              TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A:
              rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:
              rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin
          

          It seems like X509_V_FLAG_CRL_CHECK_ALL is the way to go. However, mbed TLS and GnuTLS flavors currently do not require CRLs for all CAs in the chain - they just use whatever CRLs are provided, so there is a discrepancy here between flavors. However, such a discrepancy would still be there if X509_V_FLAG_CRL_CHECK would be used - it would require a CRL for the peer's CA.

          It should be investigated whether mbed TLS and GnuTLS can be asked to require a CRL. In any case, the CRL requirements and OpenSSL messages should be documented.

          Andris Mednis I did not find a flag similar to X509_V_FLAG_CRL_CHECK_ALL in mbed TLS and GnuTLS. Seems like to enforce a mandatory checking of CRL for every CA in the chain one has to use low level functions in mbed TLS and GnuTLS.

          Aleksandrs Saveljevs In that case, we can make this a documentation-only task and leave further investigation until this feature is really needed by users. Currently, if they want one behavior, they can use OpenSSL flavor (which is more secure), and if they want a different behavior, they are use either mbed TLS or GnuTLS flavor (which is more convenient).

          Andris Mednis Important note about OpenSSL requiring CRL for every configured CA documented in https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#certificate_revocation_lists_crl .
          Error messages documented in https://www.zabbix.com/documentation/3.0/manual/encryption/troubleshooting/certificate_problems#openssl_used_with_crls_and_for_some_ca_in_the_certificate_chain_its_crl_is_not_included_in_tlscrlfile .

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (99) [DS] There is a discussion on the CRL checking strategy, which is driven by the OpenSSL flavor of Zabbix. Man page https://www.openssl.org/docs/manmaster/crypto/X509_VERIFY_PARAM_set_flags.html describes two options for CRL checking: X509_V_FLAG_CRL_CHECK enables CRL checking for the certificate chain leaf certificate. An error occurs if a suitable CRL cannot be found. X509_V_FLAG_CRL_CHECK_ALL enables CRL checking for the entire certificate chain. If TLSCRLFile is specified, then when a server makes a connection or accepts one, both of these options require a CRL for the peer's CA to be present in TLSCRLFile. However, flag X509_V_FLAG_CRL_CHECK_ALL additionally requires CRL to be present for all CAs in the chain. Otherwise, the peer's certificate is rejected with a cryptic message like the following in case of mbed TLS and OpenSSL peers: 15486:20151013:095213.262 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file s3_srvr.c line 3251: error:14089086: SSL routines:ssl3_get_client_certificate:certificate verify failed: TLS write fatal alert "unknown CA" In case of a GnuTLS peer, it rejects with the following log message: 15818:20151013:095529.228 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A: rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072: rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin It seems like X509_V_FLAG_CRL_CHECK_ALL is the way to go. However, mbed TLS and GnuTLS flavors currently do not require CRLs for all CAs in the chain - they just use whatever CRLs are provided, so there is a discrepancy here between flavors. However, such a discrepancy would still be there if X509_V_FLAG_CRL_CHECK would be used - it would require a CRL for the peer's CA. It should be investigated whether mbed TLS and GnuTLS can be asked to require a CRL. In any case, the CRL requirements and OpenSSL messages should be documented. Andris Mednis I did not find a flag similar to X509_V_FLAG_CRL_CHECK_ALL in mbed TLS and GnuTLS. Seems like to enforce a mandatory checking of CRL for every CA in the chain one has to use low level functions in mbed TLS and GnuTLS. Aleksandrs Saveljevs In that case, we can make this a documentation-only task and leave further investigation until this feature is really needed by users. Currently, if they want one behavior, they can use OpenSSL flavor (which is more secure), and if they want a different behavior, they are use either mbed TLS or GnuTLS flavor (which is more convenient). Andris Mednis Important note about OpenSSL requiring CRL for every configured CA documented in https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#certificate_revocation_lists_crl . Error messages documented in https://www.zabbix.com/documentation/3.0/manual/encryption/troubleshooting/certificate_problems#openssl_used_with_crls_and_for_some_ca_in_the_certificate_chain_its_crl_is_not_included_in_tlscrlfile . Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (100) [S] It can be seen in TCP dump that mbed TLS and GnuTLS servers (in TLS sense) generate a session ID in "Server Hello" message, whereas OpenSSL TLS server sends an empty session ID. Since we do not support session resumption, can we make mbed TLS and GnuTLS omit the session ID?

          Aleksandrs Saveljevs See also https://tools.ietf.org/html/rfc4507#section-3.4 on "Interaction with TLS Session ID".

          Aleksandrs Saveljevs By the way, both https://blog.cloudflare.com/tls-session-resumption-full-speed-and-secure/ and https://timtaubert.de/blog/2014/11/the-sad-state-of-server-side-tls-session-resumption-implementations/ (at the very end) mention the use of memcached for sharing sessions across multiple servers.

          Andris Mednis From mbed TLS 1.3.9 function ssl_write_server_hello() in file library/ssl_srv.c it seems there is no way to disable generating of session id in case session tickets are not used. Change r56471 disables session tickets with mbed TLS.

          Andris Mednis From GnuTLS 3.3.14 read_client_hello() (or _gnutls_read_client_hello() in GnuTLS 3.1.18) it seems that a new session id is always generated if an old session is not successfully restored.

          Andris Mednis Testing with r56478 shows that with OpenSSL the 32-byte session ID is present in Server Hello just as with mbed TLS and GnuTLS.

          Aleksandrs Saveljevs Retesting OpenSSL TLS server still shows "Session ID Length" of 0 in my environment, but anyway. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (100) [S] It can be seen in TCP dump that mbed TLS and GnuTLS servers (in TLS sense) generate a session ID in "Server Hello" message, whereas OpenSSL TLS server sends an empty session ID. Since we do not support session resumption, can we make mbed TLS and GnuTLS omit the session ID? Aleksandrs Saveljevs See also https://tools.ietf.org/html/rfc4507#section-3.4 on "Interaction with TLS Session ID". Aleksandrs Saveljevs By the way, both https://blog.cloudflare.com/tls-session-resumption-full-speed-and-secure/ and https://timtaubert.de/blog/2014/11/the-sad-state-of-server-side-tls-session-resumption-implementations/ (at the very end) mention the use of memcached for sharing sessions across multiple servers. Andris Mednis From mbed TLS 1.3.9 function ssl_write_server_hello() in file library/ssl_srv.c it seems there is no way to disable generating of session id in case session tickets are not used. Change r56471 disables session tickets with mbed TLS. Andris Mednis From GnuTLS 3.3.14 read_client_hello() (or _gnutls_read_client_hello() in GnuTLS 3.1.18) it seems that a new session id is always generated if an old session is not successfully restored. Andris Mednis Testing with r56478 shows that with OpenSSL the 32-byte session ID is present in Server Hello just as with mbed TLS and GnuTLS. Aleksandrs Saveljevs Retesting OpenSSL TLS server still shows "Session ID Length" of 0 in my environment, but anyway. CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (101) [F]

          Fatal error: Call to undefined function B() in C:\Development\ZBXNEXT-1263\frontends\php\include\views\configuration.host.massupdate.php on line 215
          

          Oleg Egorov Issue exist in trunk, broken by wiper 55663.
          Duplicated mass update fatal error fix from ZBXNEXT-2683
          Fixed in r56114

          Ivo Kurzemnieks CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (101) [F] Fatal error: Call to undefined function B() in C:\Development\ZBXNEXT-1263\frontends\php\include\views\configuration.host.massupdate.php on line 215 Oleg Egorov Issue exist in trunk, broken by wiper 55663. Duplicated mass update fatal error fix from ZBXNEXT-2683 Fixed in r56114 Ivo Kurzemnieks CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (102) [F]

          Undefined index: host [hosts.php:137 → CFrontendApiWrapper->update() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → call_user_func_array() → CHost->update() in include\classes\api\services\CHost.php:603]

          Undefined index: name [hosts.php:138 → CFrontendApiWrapper->update() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → call_user_func_array() → CHost->update() → CHost->validateUpdate() in include\classes\api\services\CHost.php:1860]

          array_key_exists() expects parameter 2 to be array, null given [hosts.php:138 → CFrontendApiWrapper->update() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → call_user_func_array() → CHost->update() → CHost->validateUpdate() → array_key_exists() in include\classes\api\services\CHost.php:1860]

          Oleg Egorov How to reproduce?

          Ivo Kurzemnieks By using API via frontend, ofcourse. For the first problem use

          API::Host()->update(['hostid' => '10129']);

          and this will make another problem (see (104)).

          For the second problem use

          API::Host()->update(['hostid' => '10129', 'host' => 'host'])

          . This, I think, has been fixed in r56117 (see(105)). When in doubt update branch back to r56114.

          Oleg Egorov

          • First issue is (104) and it RESOLVED in r56125;
          • Second issue RESOLVED in r56117 (Thank you, for yours improvements)

          RESOLVED in r56125

          Ivo Kurzemnieks CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (102) [F] Undefined index: host [hosts.php:137 → CFrontendApiWrapper->update() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → call_user_func_array() → CHost->update() in include\classes\api\services\CHost.php:603] Undefined index: name [hosts.php:138 → CFrontendApiWrapper->update() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → call_user_func_array() → CHost->update() → CHost->validateUpdate() in include\classes\api\services\CHost.php:1860] array_key_exists() expects parameter 2 to be array, null given [hosts.php:138 → CFrontendApiWrapper->update() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → call_user_func_array() → CHost->update() → CHost->validateUpdate() → array_key_exists() in include\classes\api\services\CHost.php:1860] Oleg Egorov How to reproduce? Ivo Kurzemnieks By using API via frontend, ofcourse. For the first problem use API::Host()->update(['hostid' => '10129']); and this will make another problem ( see (104) ). For the second problem use API::Host()->update(['hostid' => '10129', 'host' => 'host']) . This, I think, has been fixed in r56117 ( see(105) ). When in doubt update branch back to r56114. Oleg Egorov First issue is (104) and it RESOLVED in r56125; Second issue RESOLVED in r56117 (Thank you, for yours improvements) RESOLVED in r56125 Ivo Kurzemnieks CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (103) [F]

          • host.massupdate does not clean fields when encryption type is changed;
          • host.massupdate does not validate PSK encryption fields.

          Oleg Egorov RESOLVED in r56162, r56179, r56180

          Ivo Kurzemnieks When massupdate form receives an error, it redirects back to list view. That is wrong. It is hard to test massupdate form for errors and JS behaviour. I made a separate issue ZBX-9958 for this.
          CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (103) [F] host.massupdate does not clean fields when encryption type is changed; host.massupdate does not validate PSK encryption fields. Oleg Egorov RESOLVED in r56162, r56179, r56180 Ivo Kurzemnieks When massupdate form receives an error, it redirects back to list view. That is wrong. It is hard to test massupdate form for errors and JS behaviour. I made a separate issue ZBX-9958 for this. CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (104) [F]

          API::Host()->update(['hostid' => '10128']);
          

          results in empty host name that cannot be select from frontend afterwards

          Oleg Egorov RESOLVED in r56125

          Ivo Kurzemnieks CLOSED.

          Show
          Ivo Kurzemnieks added a comment - - edited (104) [F] API::Host()->update(['hostid' => '10128']); results in empty host name that cannot be select from frontend afterwards Oleg Egorov RESOLVED in r56125 Ivo Kurzemnieks CLOSED.
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (105) [F]

          • host.create and host.update had incorrect function description and parameters;
          • Some things were copy pasted from Proxy API to Host API leaving variable names and comments without any changes;
          • Some functionality was useless for validateCreate() due to code being copied from checkInput();
          • validateCreate() had illocical execution order of validation here are some of the things:
            • groups field existence should be checked prior to collecting groups IDs;
            • CUpdateDiscoveredValidator() is not required for validateCreate();
            • $host['host'] is the only mandatory field for create() method, so no need for array_key_exists() after is has been validated once and correct characters and should be validated before setObjectName();
            • duplicate names can be checked via CArrayHelper::findDuplicate();
            • $host['name'] will exist either way - manually set or copied from $host['host'], so no need for array_key_exists().
            • No need to select IDs when getting all hosts and templates to check existing names.
          • Also I think validateEncryption() sounds better than encryptionValidation().
            Oleg Egorov RESOLVED in r56126

          Some of this functionality has been RESOLVED in r56117

          Most of my changes took place for validateCreate(). See if anything is fixing (102) and if you can improve validateUpdate() the same way by making execution order more logical.

          Oleg Egorov Thank you! Reviewed and CLOSED

          Show
          Ivo Kurzemnieks added a comment - - edited (105) [F] host.create and host.update had incorrect function description and parameters; Some things were copy pasted from Proxy API to Host API leaving variable names and comments without any changes; Some functionality was useless for validateCreate() due to code being copied from checkInput() ; validateCreate() had illocical execution order of validation here are some of the things: groups field existence should be checked prior to collecting groups IDs; CUpdateDiscoveredValidator() is not required for validateCreate() ; $host['host'] is the only mandatory field for create() method, so no need for array_key_exists() after is has been validated once and correct characters and should be validated before setObjectName() ; duplicate names can be checked via CArrayHelper::findDuplicate() ; $host['name'] will exist either way - manually set or copied from $host['host'] , so no need for array_key_exists() . No need to select IDs when getting all hosts and templates to check existing names. Also I think validateEncryption() sounds better than encryptionValidation() . Oleg Egorov RESOLVED in r56126 Some of this functionality has been RESOLVED in r56117 Most of my changes took place for validateCreate() . See if anything is fixing (102) and if you can improve validateUpdate() the same way by making execution order more logical. Oleg Egorov Thank you! Reviewed and CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (106) [DS] There is a problem with CRLs when two CAs have identical names.

          Suppose we have "Root CA 1", which is a parent of "Intermediate CA 1" and "Intermediate CA 2", and we have "Root CA 2", which is a parent of different "Intermediate CA 1" and "Intermediate CA 2". Suppose also that we create a CRL file which includes CRL data from all of these CAs.

          If the CRLs do not have "Authority Key Identifier" extension (see https://tools.ietf.org/html/rfc5280#section-5.2.1 for a description of that extension), then the observed behavior is as follows (note that the order of CRLs in a file matters):

          • GnuTLS works well;
          • mbed TLS logs warnings like the following (FIXME: note that the message is a bit misleading):
             18776:20151014:103502.214 cannot connect to proxy "proxy-gnutls-3.2.21": TCP successful, cannot establish TLS to [[127.0.0.1]:20003]: invalid peer certificate: CRL not signed by trusted CA
             18776:20151014:103502.216 cannot connect to proxy "proxy-mbedtls-1.3.10": TCP successful, cannot establish TLS to [[127.0.0.1]:20002]: invalid peer certificate: CRL not signed by trusted CA
             18776:20151014:103502.216 cannot connect to proxy "proxy-openssl-1.0.1e": TCP successful, cannot establish TLS to [[127.0.0.1]:20004]: invalid peer certificate: CRL not signed by trusted CA
            
          • OpenSSL logs warnings like the following:
            • when making connections:
               19837:20151014:105037.737 cannot connect to proxy "proxy-gnutls-3.2.21": TCP successful, cannot establish TLS to [[127.0.0.1]:20003]: SSL_connect() returned SSL_ERROR_SSL: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routine
               19837:20151014:105037.740 cannot connect to proxy "proxy-mbedtls-1.3.10": TCP successful, cannot establish TLS to [[127.0.0.1]:20002]: SSL_connect() returned SSL_ERROR_SSL: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routine
               19837:20151014:105037.741 cannot connect to proxy "proxy-openssl-1.0.1e": TCP successful, cannot establish TLS to [[127.0.0.1]:20004]: SSL_connect() returned SSL_ERROR_SSL: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routine
              
            • when accepting connections:
               19037:20151014:104106.157 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin
               19037:20151014:104106.188 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin
               19037:20151014:104106.243 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin
               19037:20151014:104106.300 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin
              

          In case of mbed TLS, if we include a CRL for only "Intermediate CA 1" of "Root CA 1", then it only complains about certificates issued by "Intermediate CA 1" of "Root CA 2". If we include a CRL for only "Intermediate CA 1" of "Root CA 2", then it starts complaining about "Intermediate CA 1" of "Root CA 1" certificates, which hints that it tries to find a CRL by name only (this is kind of explainable, because "Authority Key Identifier" extension is not included).

          Now, the OpenSSL problem can be fixed by including "Authority Key Identifier" extension into the generated CRLs - it then starts working normally. Unfortunately, the mbed TLS problem remains. If possible, we should fix it. If not, we should probably document (and maybe report a bug or a feature request) that CRLs for CAs with identical names will not work in case of mbed TLS, even with "Authority Key Identifier" extension.

          Andris Mednis I think discussions PolarSSL skips checking keyids when looking for the issuer? and x509_crt_check_parent doesn't check AKID/SKID are relevant to this issue. They mention that RFC 5280 RECOMMENDS, but NOT REQUIRES mbedTLS to verify key identifiers, although CAs MUST include "Authority Key Identifier" in CRLs.
          Documented in https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#limitations_on_using_crl_extensions .

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (106) [DS] There is a problem with CRLs when two CAs have identical names. Suppose we have "Root CA 1", which is a parent of "Intermediate CA 1" and "Intermediate CA 2", and we have "Root CA 2", which is a parent of different "Intermediate CA 1" and "Intermediate CA 2". Suppose also that we create a CRL file which includes CRL data from all of these CAs. If the CRLs do not have "Authority Key Identifier" extension (see https://tools.ietf.org/html/rfc5280#section-5.2.1 for a description of that extension), then the observed behavior is as follows (note that the order of CRLs in a file matters): GnuTLS works well; mbed TLS logs warnings like the following ( FIXME : note that the message is a bit misleading): 18776:20151014:103502.214 cannot connect to proxy "proxy-gnutls-3.2.21": TCP successful, cannot establish TLS to [[127.0.0.1]:20003]: invalid peer certificate: CRL not signed by trusted CA 18776:20151014:103502.216 cannot connect to proxy "proxy-mbedtls-1.3.10": TCP successful, cannot establish TLS to [[127.0.0.1]:20002]: invalid peer certificate: CRL not signed by trusted CA 18776:20151014:103502.216 cannot connect to proxy "proxy-openssl-1.0.1e": TCP successful, cannot establish TLS to [[127.0.0.1]:20004]: invalid peer certificate: CRL not signed by trusted CA OpenSSL logs warnings like the following: when making connections: 19837:20151014:105037.737 cannot connect to proxy "proxy-gnutls-3.2.21": TCP successful, cannot establish TLS to [[127.0.0.1]:20003]: SSL_connect() returned SSL_ERROR_SSL: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routine 19837:20151014:105037.740 cannot connect to proxy "proxy-mbedtls-1.3.10": TCP successful, cannot establish TLS to [[127.0.0.1]:20002]: SSL_connect() returned SSL_ERROR_SSL: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routine 19837:20151014:105037.741 cannot connect to proxy "proxy-openssl-1.0.1e": TCP successful, cannot establish TLS to [[127.0.0.1]:20004]: SSL_connect() returned SSL_ERROR_SSL: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routine when accepting connections: 19037:20151014:104106.157 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin 19037:20151014:104106.188 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin 19037:20151014:104106.243 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin 19037:20151014:104106.300 failed to accept an incoming connection: from 127.0.0.1: TLS handshake with 127.0.0.1 returned error code 1: file rsa_pk1.c line 103: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01 file rsa_eay.c line 705: error:04067072:rsa routines:RSA_EAY_PUBLIC_DECRYPT:paddin In case of mbed TLS, if we include a CRL for only "Intermediate CA 1" of "Root CA 1", then it only complains about certificates issued by "Intermediate CA 1" of "Root CA 2". If we include a CRL for only "Intermediate CA 1" of "Root CA 2", then it starts complaining about "Intermediate CA 1" of "Root CA 1" certificates, which hints that it tries to find a CRL by name only (this is kind of explainable, because "Authority Key Identifier" extension is not included). Now, the OpenSSL problem can be fixed by including "Authority Key Identifier" extension into the generated CRLs - it then starts working normally. Unfortunately, the mbed TLS problem remains. If possible, we should fix it. If not, we should probably document (and maybe report a bug or a feature request) that CRLs for CAs with identical names will not work in case of mbed TLS, even with "Authority Key Identifier" extension. Andris Mednis I think discussions PolarSSL skips checking keyids when looking for the issuer? and x509_crt_check_parent doesn't check AKID/SKID are relevant to this issue. They mention that RFC 5280 RECOMMENDS, but NOT REQUIRES mbedTLS to verify key identifiers, although CAs MUST include "Authority Key Identifier" in CRLs. Documented in https://www.zabbix.com/documentation/3.0/manual/encryption/using_certificates#limitations_on_using_crl_extensions . Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (107) [DS] Suppose we have a CRL that expires during server operation.

          • mbed TLS:
            • before expiration:
               29114:20151014:161559.766 cannot connect to proxy "proxy-openssl-1.0.1e":
                  TCP successful, cannot establish TLS to [[127.0.0.1]:20004]:
                  invalid peer certificate: revoked
              
            • after expiration:
               29114:20151014:161600.768 cannot connect to proxy "proxy-openssl-1.0.1e":
                  TCP successful, cannot establish TLS to [[127.0.0.1]:20004]:
                  invalid peer certificate: revoked, CRL expired
              
          • GnuTLS:
            • before and after expiration the same:
               29273:20151014:164224.178 cannot connect to proxy "proxy-openssl-1.0.1e":
                  TCP successful, cannot establish TLS to [[127.0.0.1]:20004]:
                  invalid peer certificate: The certificate is NOT trusted. The certificate chain is revoked.
              
          • OpenSSL:
            • before expiration:
               29427:20151014:165841.400 cannot connect to proxy "proxy-openssl-1.0.1e":
                  TCP successful, cannot establish TLS to [[127.0.0.1]:20004]:
                  SSL_connect() returned SSL_ERROR_SSL: file s3_clnt.c line 1253: error:14090086:
                  SSL routines:ssl3_get_server_certificate:certificate verify failed:
                  TLS write fatal alert "certificate revoked"
              
            • after expiration:
               29427:20151014:165842.412 cannot connect to proxy "proxy-openssl-1.0.1e":
                  TCP successful, cannot establish TLS to [[127.0.0.1]:20004]:
                  SSL_connect() returned SSL_ERROR_SSL: file s3_clnt.c line 1253: error:14090086:
                  SSL routines:ssl3_get_server_certificate:certificate verify failed:
                  TLS write fatal alert "certificate expired"
              

          This is probably mostly a documentation task, because OpenSSL's message in the last case is misleading.

          Andris Mednis OpenSSL case is described in 2005 in http://svn.haxx.se/dev/archive-2005-03/0025.shtml .

          Andris Mednis Documented in https://www.zabbix.com/documentation/3.0/manual/encryption/troubleshooting/certificate_problems#crl_expired_or_expires_during_server_operation .

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (107) [DS] Suppose we have a CRL that expires during server operation. mbed TLS: before expiration: 29114:20151014:161559.766 cannot connect to proxy "proxy-openssl-1.0.1e": TCP successful, cannot establish TLS to [[127.0.0.1]:20004]: invalid peer certificate: revoked after expiration: 29114:20151014:161600.768 cannot connect to proxy "proxy-openssl-1.0.1e": TCP successful, cannot establish TLS to [[127.0.0.1]:20004]: invalid peer certificate: revoked, CRL expired GnuTLS: before and after expiration the same: 29273:20151014:164224.178 cannot connect to proxy "proxy-openssl-1.0.1e": TCP successful, cannot establish TLS to [[127.0.0.1]:20004]: invalid peer certificate: The certificate is NOT trusted. The certificate chain is revoked. OpenSSL: before expiration: 29427:20151014:165841.400 cannot connect to proxy "proxy-openssl-1.0.1e": TCP successful, cannot establish TLS to [[127.0.0.1]:20004]: SSL_connect() returned SSL_ERROR_SSL: file s3_clnt.c line 1253: error:14090086: SSL routines:ssl3_get_server_certificate:certificate verify failed: TLS write fatal alert "certificate revoked" after expiration: 29427:20151014:165842.412 cannot connect to proxy "proxy-openssl-1.0.1e": TCP successful, cannot establish TLS to [[127.0.0.1]:20004]: SSL_connect() returned SSL_ERROR_SSL: file s3_clnt.c line 1253: error:14090086: SSL routines:ssl3_get_server_certificate:certificate verify failed: TLS write fatal alert "certificate expired" This is probably mostly a documentation task, because OpenSSL's message in the last case is misleading. Andris Mednis OpenSSL case is described in 2005 in http://svn.haxx.se/dev/archive-2005-03/0025.shtml . Andris Mednis Documented in https://www.zabbix.com/documentation/3.0/manual/encryption/troubleshooting/certificate_problems#crl_expired_or_expires_during_server_operation . Aleksandrs Saveljevs CLOSED
          Hide
          Ivo Kurzemnieks added a comment - - edited

          (108) [F] Similarly to hosts, proxy API also had illogical validation order. Please, review my changes in r56155, r56187

          Oleg Egorov Thank you! Reviewed and CLOSED

          Show
          Ivo Kurzemnieks added a comment - - edited (108) [F] Similarly to hosts, proxy API also had illogical validation order. Please, review my changes in r56155, r56187 Oleg Egorov Thank you! Reviewed and CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (109) [S] There is a doubt regarding the following code:

          ssize_t	res;
          
          ...
          
          if (0 > res)
          {
          #if defined(_WINDOWS)
          	zbx_set_socket_strerror("gnutls_record_send() failed: %Id %s", res, gnutls_strerror(res));
          #else
          	zbx_set_socket_strerror("gnutls_record_send() failed: %zd %s", res, gnutls_strerror(res));
          #endif
          	return ZBX_PROTO_ERROR;
          }
          

          According to "man 3 printf":

          z A following integer conversion corresponds to a size_t or ssize_t argument, or a following n conversion
          corresponds to a pointer to a size_t argument.

          We seem to avoid "%zd" because of portability issues. At least we define "zbx_fs_size_t" and "ZBX_FS_SIZE_T" specifically for outputting variables of type "size_t".

          Glebs Ivanovskis Should we similarly add "zbx_fs_size_t" and "ZBX_FS_SSIZE_T" for outputting variables of type "ssize_t"? Like in r56189.

          Aleksandrs Saveljevs Looks plausible, but is __INT64_C() defined for Windows? REOPENED.

          Glebs Ivanovskis Good point! If not, we will define it. See r56205. RESOLVED

          Aleksandrs Saveljevs Merged r56205 as r56218 in the new branch. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (109) [S] There is a doubt regarding the following code: ssize_t res; ... if (0 > res) { # if defined(_WINDOWS) zbx_set_socket_strerror( "gnutls_record_send() failed: %Id %s" , res, gnutls_strerror(res)); # else zbx_set_socket_strerror( "gnutls_record_send() failed: %zd %s" , res, gnutls_strerror(res)); #endif return ZBX_PROTO_ERROR; } According to "man 3 printf": z A following integer conversion corresponds to a size_t or ssize_t argument, or a following n conversion corresponds to a pointer to a size_t argument. We seem to avoid "%zd" because of portability issues. At least we define "zbx_fs_size_t" and "ZBX_FS_SIZE_T" specifically for outputting variables of type "size_t". Glebs Ivanovskis Should we similarly add "zbx_fs_size_t" and "ZBX_FS_SSIZE_T" for outputting variables of type "ssize_t"? Like in r56189. Aleksandrs Saveljevs Looks plausible, but is __INT64_C() defined for Windows? REOPENED. Glebs Ivanovskis Good point! If not, we will define it. See r56205. RESOLVED Aleksandrs Saveljevs Merged r56205 as r56218 in the new branch. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (110) [S] We have zbx_socket_create() function, which we use for both TCP and UDP. Currently, we call zbx_tls_connect() if either ZBX_TCP_SEC_TLS_CERT or ZBX_TCP_SEC_TLS_PSK is specified. Since we never pass those for UDP, this is OK. However, it is deemed that adding a check for type being SOCK_STREAM would make it a bit more future-proof. Currently, it is a bit superfluous, but may aid readability. On the other hand, if we handle "SOCK_STREAM == type" case, we should also handle the "else" case, making the code more verbose.

          Glebs Ivanovskis RESOLVED in r56642.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (110) [S] We have zbx_socket_create() function, which we use for both TCP and UDP. Currently, we call zbx_tls_connect() if either ZBX_TCP_SEC_TLS_CERT or ZBX_TCP_SEC_TLS_PSK is specified. Since we never pass those for UDP, this is OK. However, it is deemed that adding a check for type being SOCK_STREAM would make it a bit more future-proof. Currently, it is a bit superfluous, but may aid readability. On the other hand, if we handle "SOCK_STREAM == type" case, we should also handle the "else" case, making the code more verbose. Glebs Ivanovskis RESOLVED in r56642. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (111) [G] As of r56181, agent does not compile on Windows:

          ..\..\..\src\libs\zbxcrypto\tls.c(2483) : error C2065: 'sigset_t' : undeclared identifier
          ..\..\..\src\libs\zbxcrypto\tls.c(2483) : error C2146: syntax error : missing ';' before identifier 'mask'
          ..\..\..\src\libs\zbxcrypto\tls.c(2483) : error C2065: 'mask' : undeclared identifier
          ..\..\..\src\libs\zbxcrypto\tls.c(2483) : error C2065: 'orig_mask' : undeclared identifier
          ..\..\..\src\libs\zbxcrypto\tls.c(2484) : error C2143: syntax error : missing ';' before 'type'
          ..\..\..\src\libs\zbxcrypto\tls.c(2650) : error C2065: 'pers' : undeclared identifier
          ..\..\..\src\libs\zbxcrypto\tls.c(2650) : warning C4047: 'function' : 'unsigned char *' differs in levels of indirection from 'int'
          ..\..\..\src\libs\zbxcrypto\tls.c(2650) : warning C4024: 'zbx_make_personalization_string' : different types for formal and actual parameter 1
          ..\..\..\src\libs\zbxcrypto\tls.c(2654) : error C2065: 'pers' : undeclared identifier
          ..\..\..\src\libs\zbxcrypto\tls.c(2654) : warning C4047: 'function' : 'const unsigned char *' differs in levels of indirection from 'int'
          ..\..\..\src\libs\zbxcrypto\tls.c(2654) : warning C4024: 'ctr_drbg_init' : different types for formal and actual parameter 4
          ..\..\..\src\libs\zbxcrypto\tls.c(2658) : error C2065: 'pers' : undeclared identifier
          ..\..\..\src\libs\zbxcrypto\tls.c(2658) : warning C4022: 'zbx_guaranteed_memset' : pointer mismatch for actual parameter 1
          ..\..\..\src\libs\zbxcrypto\tls.c(2658) : error C2065: 'pers' : undeclared identifier
          ..\..\..\src\libs\zbxcrypto\tls.c(2666) : error C2065: 'pers' : undeclared identifier
          ..\..\..\src\libs\zbxcrypto\tls.c(2666) : warning C4022: 'zbx_guaranteed_memset' : pointer mismatch for actual parameter 1
          ..\..\..\src\libs\zbxcrypto\tls.c(2666) : error C2065: 'pers' : undeclared identifier�
          

          Aleksandrs Saveljevs Fixed in r56199. RESOLVED.

          Andris Mednis Thanks! Reviewed. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (111) [G] As of r56181, agent does not compile on Windows: ..\..\..\src\libs\zbxcrypto\tls.c(2483) : error C2065: 'sigset_t' : undeclared identifier ..\..\..\src\libs\zbxcrypto\tls.c(2483) : error C2146: syntax error : missing ';' before identifier 'mask' ..\..\..\src\libs\zbxcrypto\tls.c(2483) : error C2065: 'mask' : undeclared identifier ..\..\..\src\libs\zbxcrypto\tls.c(2483) : error C2065: 'orig_mask' : undeclared identifier ..\..\..\src\libs\zbxcrypto\tls.c(2484) : error C2143: syntax error : missing ';' before 'type' ..\..\..\src\libs\zbxcrypto\tls.c(2650) : error C2065: 'pers' : undeclared identifier ..\..\..\src\libs\zbxcrypto\tls.c(2650) : warning C4047: 'function' : 'unsigned char *' differs in levels of indirection from 'int' ..\..\..\src\libs\zbxcrypto\tls.c(2650) : warning C4024: 'zbx_make_personalization_string' : different types for formal and actual parameter 1 ..\..\..\src\libs\zbxcrypto\tls.c(2654) : error C2065: 'pers' : undeclared identifier ..\..\..\src\libs\zbxcrypto\tls.c(2654) : warning C4047: 'function' : 'const unsigned char *' differs in levels of indirection from 'int' ..\..\..\src\libs\zbxcrypto\tls.c(2654) : warning C4024: 'ctr_drbg_init' : different types for formal and actual parameter 4 ..\..\..\src\libs\zbxcrypto\tls.c(2658) : error C2065: 'pers' : undeclared identifier ..\..\..\src\libs\zbxcrypto\tls.c(2658) : warning C4022: 'zbx_guaranteed_memset' : pointer mismatch for actual parameter 1 ..\..\..\src\libs\zbxcrypto\tls.c(2658) : error C2065: 'pers' : undeclared identifier ..\..\..\src\libs\zbxcrypto\tls.c(2666) : error C2065: 'pers' : undeclared identifier ..\..\..\src\libs\zbxcrypto\tls.c(2666) : warning C4022: 'zbx_guaranteed_memset' : pointer mismatch for actual parameter 1 ..\..\..\src\libs\zbxcrypto\tls.c(2666) : error C2065: 'pers' : undeclared identifier� Aleksandrs Saveljevs Fixed in r56199. RESOLVED. Andris Mednis Thanks! Reviewed. CLOSED.
          Hide
          Oleg Egorov added a comment - - edited

          (112) Changed dropdown to radio in Connections to host/proxy fields

          RESOLVED in r56185

          Ivo Kurzemnieks Please see my changes in r56191, r56182, r6183, r56192

          Oleg Egorov Thank you, reviewed and CLOSED

          Show
          Oleg Egorov added a comment - - edited (112) Changed dropdown to radio in Connections to host/proxy fields RESOLVED in r56185 Ivo Kurzemnieks Please see my changes in r56191, r56182, r6183, r56192 Oleg Egorov Thank you, reviewed and CLOSED
          Hide
          Andris Mednis added a comment - - edited

          Available in version pre-3.0.0alpha3 (trunk) rev. 56207.

          Show
          Andris Mednis added a comment - - edited Available in version pre-3.0.0alpha3 (trunk) rev. 56207.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          The following subissue fixes made it into the first merge: (1) - (15), (17) - (32), (34) - (56), (59) - (65), (68), (71) - (77), (79) - (83), (87) - (90), (95) - (98), (101) - (105), (108), (111), (112).

          Show
          Aleksandrs Saveljevs added a comment - - edited The following subissue fixes made it into the first merge: (1) - (15), (17) - (32), (34) - (56), (59) - (65), (68), (71) - (77), (79) - (83), (87) - (90), (95) - (98), (101) - (105), (108), (111), (112).
          Hide
          Andris Mednis added a comment -

          The new development branch is svn://svn.zabbix.com/branches/dev/ZBXNEXT-1263-2.

          Show
          Andris Mednis added a comment - The new development branch is svn://svn.zabbix.com/branches/dev/ZBXNEXT-1263-2.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (113) The following warning are generated by clang 3.5.0 when compiling Zabbix without TLS:

          zabbix_agentd.c:271:12: warning: array subscript is of type 'char' [-Wchar-subscripts]
                          opt_count[ch]++;
                                   ^~~
          zabbix_agentd.c:336:20: warning: array subscript is of type 'char' [-Wchar-subscripts]
                          if (1 < opt_count[ch])
                                           ^~~
          zabbix_agentd.c:263:15: warning: unused variable 'opt_mask' [-Wunused-variable]
                  unsigned int    opt_mask = 0;
                                  ^
          3 warnings generated.
          zabbix_get.c:302:12: warning: array subscript is of type 'char' [-Wchar-subscripts]
                          opt_count[ch]++;
                                   ^~~
          zabbix_get.c:391:20: warning: array subscript is of type 'char' [-Wchar-subscripts]
                          if (1 < opt_count[ch])
                                           ^~~
          2 warnings generated.
          zabbix_sender.c:608:12: warning: array subscript is of type 'char' [-Wchar-subscripts]
                          opt_count[ch]++;
                                   ^~~
          zabbix_sender.c:719:33: warning: array subscript is of type 'char' [-Wchar-subscripts]
                          if ('v' == ch && 2 < opt_count[ch])     /* '-v' or '-vv' can be specified */
                                                        ^~~
          zabbix_sender.c:727:33: warning: array subscript is of type 'char' [-Wchar-subscripts]
                          if ('v' != ch && 1 < opt_count[ch])
                                                        ^~~
          3 warnings generated.
          

          Aleksandrs Saveljevs RESOLVED in r56275.

          Andris Mednis Thanks! Reviewed. CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (113) The following warning are generated by clang 3.5.0 when compiling Zabbix without TLS: zabbix_agentd.c:271:12: warning: array subscript is of type 'char' [-Wchar-subscripts] opt_count[ch]++; ^~~ zabbix_agentd.c:336:20: warning: array subscript is of type 'char' [-Wchar-subscripts] if (1 < opt_count[ch]) ^~~ zabbix_agentd.c:263:15: warning: unused variable 'opt_mask' [-Wunused-variable] unsigned int opt_mask = 0; ^ 3 warnings generated. zabbix_get.c:302:12: warning: array subscript is of type 'char' [-Wchar-subscripts] opt_count[ch]++; ^~~ zabbix_get.c:391:20: warning: array subscript is of type 'char' [-Wchar-subscripts] if (1 < opt_count[ch]) ^~~ 2 warnings generated. zabbix_sender.c:608:12: warning: array subscript is of type 'char' [-Wchar-subscripts] opt_count[ch]++; ^~~ zabbix_sender.c:719:33: warning: array subscript is of type 'char' [-Wchar-subscripts] if ('v' == ch && 2 < opt_count[ch]) /* '-v' or '-vv' can be specified */ ^~~ zabbix_sender.c:727:33: warning: array subscript is of type 'char' [-Wchar-subscripts] if ('v' != ch && 1 < opt_count[ch]) ^~~ 3 warnings generated. Aleksandrs Saveljevs RESOLVED in r56275. Andris Mednis Thanks! Reviewed. CLOSED
          Hide
          Sandis Neilands (Inactive) added a comment - - edited

          (114) Coverity CID 131419 when scanning build without encryption. 'sock' is checked for NULL in the new code after it was already dereferenced in zbx_tcp_check_security() few lines earlier.

          ** CID 131419:  Null pointer dereferences  (FORWARD_NULL)
          /src/libs/zbxdbhigh/proxy.c: 2275 in process_mass_data()
          
          
          ________________________________________________________________________________________________________
          *** CID 131419:  Null pointer dereferences  (FORWARD_NULL)
          /src/libs/zbxdbhigh/proxy.c: 2275 in process_mass_data()
          2269     		/* If data have come from a proxy then trust them (connection with the proxy has been already checked */
          2270     		/* and it is the responsibility of proxy to check incoming data). If the data have come directly into */
          2271     		/* trapper process (active check or trapper item) then check if the connection is allowed. */
          2272     		/* It is enough to check connection type, and optionally, certificate issuer and subject, and PSK */
          2273     		/* identity only for a host. No need to check it for every item if the host is the same. */
          2274     
          >>>     CID 131419:  Null pointer dereferences  (FORWARD_NULL)
          >>>     Comparing "sock" to null implies that "sock" might be null.
          2275     		if (0 == proxy_hostid && NULL != sock)
          2276     		{
          2277     			if (hostid_prev == items[i].host.hostid)	/* host and connection already checked */
          2278     			{
          2279     				if (0 == flag_host_allow)
          2280     					continue;
          

          Andris Mednis Coverity is right - "sock" might be NULL. It is NULL in process_mass_data() in case

          ZBX_THREAD_ENTRY(proxypoller_thread, args)
          	\
                  process_proxy();
          	  \
          	  process_hist_data(NULL, &jp, proxy.hostid, NULL, 0);  <----- 'sock' is NULL here.
          	    \
          	    process_mass_data(sock, proxy_hostid, values, values_num, &processed);
          

          I think the check in line 2275 "if (0 == proxy_hostid && NULL != sock)" is correct.

          Sandis Neilands The part that checks if 'sock' is NULL is redundant.

          • Before changes: in process_mass_data() 'sock' accessed only when 'proxy_hostid' is 0.
          • After changes: the check in line 2275 implies that 'sock' can be NULL even when 'proxy_hostid' is 0.

          Can 'sock' be NULL if 'proxy_hostid' is 0?

          No.

          The callstacks.

          #1
          process_trap()                // OK: passes sock != NULL, proxy_hostid == 0
          --process_mass_data() 
          
          #2
          process_proxy()               // OK: passes sock == NULL, proxy_hostid != 0
          --process_hist_data()         // passes sock, proxy_hostid unmodified
          ----process_mass_data()
          
          #3
          recv_agenthistory()           // OK: passes sock != NULL, proxy_hostid == 0
          --process_hist_data()         // passes sock, proxy_hostid unmodified
          ----process_mass_data()
          
          #4
          recv_proxyhistory()           // OK: passes sock != NULL, proxy_hostid != 0
          --process_hist_data()         // passes sock, proxy_hostid unmodified
          ----process_mass_data()
          

          So in both cases when 'proxy_hostid' is 0 (#1, #3) 'sock' is set.

          Improvements:

          • remove redundant part of the check in line 2275 (NULL != sock);
          • fix header comment of process_mass_data(): 'sock' is set also for proxy connections - active proxy connections (see #4 - active, #2 - passive where it is NULL).

          Sandis Neilands RESOLVED in r56347:56351, 56413. Aleksandrs Saveljevs, please review.

          Aleksandrs Saveljevs Two issues:

          1. r56413 introduces a scope with { and }, without any if's or while's before {. We usually avoid doing that and it is easy to rewrite these functions without it.
          2. Function zbx_tls_get_attr() had the following comment: "This function can be used only on server-side of TLS connection (GnuTLS makes it asymmetric)." That function was split in two, zbx_tls_get_attr_cert() and zbx_tls_get_attr_psk(), and the comment was copied over to both of them. Does it really apply to both functions (there was a conjecture that this is all because of gnutls_psk_server_get_username() function, which is only used in zbx_tls_get_attr_psk())? In any case, it would be nice to clarify what exactly in GnuTLS makes it asymmetric.

          REOPENED

          Sandis Neilands RESOLVED in r56639. Also removed connection_type from zbx_tls_conn_attr_t - didn't find any uses of it. Aleksandrs Saveljevs, please review.

          Aleksandrs Saveljevs Multiple issues:

          1. Structure zbx_tls_conn_attr_t is now empty when compiled without TLS. With GCC, this is allowed, see https://gcc.gnu.org/onlinedocs/gcc/Empty-Structures.html . However, according to http://stackoverflow.com/questions/24685399/c-empty-struct-what-does-this-mean-do , empty structures yield undefined behavior.
          2. If you removed attr->connection_type from zbx_tls_get_attr_cert() and zbx_tls_get_attr_psk(), then splitting these functions into variable declarations and code was not necessary. Although it does not hurt either, can be left as is.
          3. In zbx_tls_get_attr_psk() function's comment, it refers to gnutls_psk_server_get_username() and gnutls_psk_client_get_hint(). If we mention the latter function, we may wish to mention that it is currently not used by Zabbix and is simply a theoretical antipode to the former function.
          4. That comment also lacks a final ")".

          REOPENED

          Andris Mednis Structure zbx_tls_conn_attr_t was designed with 2 purposes:

          1. to keep those connection attributes which cannot be trivially obtained from zbx_socket_t,
          2. to limit information passed around just to connection attributes (including connection_type), without exposing TLS context from zbx_socket_t.

          Sandis Neilands Fixed comment for zbx_tls_get_attr_psk() in r56750.

          Regarding conditional compilation - currently most calls to encryption code are compiled in conditionally. We should do the same for the encryption functions and data structures.

          Let's revisit this point after (143) is done.

          Aleksandrs Saveljevs Comment in r56750 looks good.

          Andris Mednis Does not compile on MS Windows without encryption because of empty zbx_tls_conn_attr_t structure:

          d:\zabbix-3.0.0alpha5\src\zabbix_agent\../libs/zbxcrypto/tls_tcp_active.h(31) :
          error C2016: C requires that a struct or union has at least one member
          NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio 9.0\
          VC\Bin\amd64\cl.exe"' : return code '0x2'
          

          Sandis Neilands RESOLVED in r57005. Andris Mednis, please review.

          Andris Mednis r57005 reviewed, accepted. CLOSED

          Show
          Sandis Neilands (Inactive) added a comment - - edited (114) Coverity CID 131419 when scanning build without encryption. 'sock' is checked for NULL in the new code after it was already dereferenced in zbx_tcp_check_security() few lines earlier. ** CID 131419: Null pointer dereferences (FORWARD_NULL) /src/libs/zbxdbhigh/proxy.c: 2275 in process_mass_data() ________________________________________________________________________________________________________ *** CID 131419: Null pointer dereferences (FORWARD_NULL) /src/libs/zbxdbhigh/proxy.c: 2275 in process_mass_data() 2269 /* If data have come from a proxy then trust them (connection with the proxy has been already checked */ 2270 /* and it is the responsibility of proxy to check incoming data). If the data have come directly into */ 2271 /* trapper process (active check or trapper item) then check if the connection is allowed. */ 2272 /* It is enough to check connection type, and optionally, certificate issuer and subject, and PSK */ 2273 /* identity only for a host. No need to check it for every item if the host is the same. */ 2274 >>> CID 131419: Null pointer dereferences (FORWARD_NULL) >>> Comparing "sock" to null implies that "sock" might be null . 2275 if (0 == proxy_hostid && NULL != sock) 2276 { 2277 if (hostid_prev == items[i].host.hostid) /* host and connection already checked */ 2278 { 2279 if (0 == flag_host_allow) 2280 continue ; Andris Mednis Coverity is right - "sock" might be NULL. It is NULL in process_mass_data() in case ZBX_THREAD_ENTRY(proxypoller_thread, args) \ process_proxy(); \ process_hist_data(NULL, &jp, proxy.hostid, NULL, 0); <----- 'sock' is NULL here. \ process_mass_data(sock, proxy_hostid, values, values_num, &processed); I think the check in line 2275 "if (0 == proxy_hostid && NULL != sock)" is correct. Sandis Neilands The part that checks if 'sock' is NULL is redundant. Before changes : in process_mass_data() 'sock' accessed only when 'proxy_hostid' is 0. After changes : the check in line 2275 implies that 'sock' can be NULL even when 'proxy_hostid' is 0. Can 'sock' be NULL if 'proxy_hostid' is 0? No. The callstacks. #1 process_trap() // OK: passes sock != NULL, proxy_hostid == 0 --process_mass_data() #2 process_proxy() // OK: passes sock == NULL, proxy_hostid != 0 --process_hist_data() // passes sock, proxy_hostid unmodified ----process_mass_data() #3 recv_agenthistory() // OK: passes sock != NULL, proxy_hostid == 0 --process_hist_data() // passes sock, proxy_hostid unmodified ----process_mass_data() #4 recv_proxyhistory() // OK: passes sock != NULL, proxy_hostid != 0 --process_hist_data() // passes sock, proxy_hostid unmodified ----process_mass_data() So in both cases when 'proxy_hostid' is 0 (#1, #3) 'sock' is set. Improvements: remove redundant part of the check in line 2275 ( NULL != sock ); fix header comment of process_mass_data(): 'sock' is set also for proxy connections - active proxy connections (see #4 - active, #2 - passive where it is NULL). Sandis Neilands RESOLVED in r56347:56351, 56413. Aleksandrs Saveljevs , please review. Aleksandrs Saveljevs Two issues: r56413 introduces a scope with { and }, without any if's or while's before {. We usually avoid doing that and it is easy to rewrite these functions without it. Function zbx_tls_get_attr() had the following comment: "This function can be used only on server-side of TLS connection (GnuTLS makes it asymmetric)." That function was split in two, zbx_tls_get_attr_cert() and zbx_tls_get_attr_psk(), and the comment was copied over to both of them. Does it really apply to both functions (there was a conjecture that this is all because of gnutls_psk_server_get_username() function, which is only used in zbx_tls_get_attr_psk())? In any case, it would be nice to clarify what exactly in GnuTLS makes it asymmetric. REOPENED Sandis Neilands RESOLVED in r56639. Also removed connection_type from zbx_tls_conn_attr_t - didn't find any uses of it. Aleksandrs Saveljevs , please review. Aleksandrs Saveljevs Multiple issues: Structure zbx_tls_conn_attr_t is now empty when compiled without TLS. With GCC, this is allowed, see https://gcc.gnu.org/onlinedocs/gcc/Empty-Structures.html . However, according to http://stackoverflow.com/questions/24685399/c-empty-struct-what-does-this-mean-do , empty structures yield undefined behavior. If you removed attr->connection_type from zbx_tls_get_attr_cert() and zbx_tls_get_attr_psk(), then splitting these functions into variable declarations and code was not necessary. Although it does not hurt either, can be left as is. In zbx_tls_get_attr_psk() function's comment, it refers to gnutls_psk_server_get_username() and gnutls_psk_client_get_hint(). If we mention the latter function, we may wish to mention that it is currently not used by Zabbix and is simply a theoretical antipode to the former function. That comment also lacks a final ")". REOPENED Andris Mednis Structure zbx_tls_conn_attr_t was designed with 2 purposes: to keep those connection attributes which cannot be trivially obtained from zbx_socket_t , to limit information passed around just to connection attributes (including connection_type ), without exposing TLS context from zbx_socket_t . Sandis Neilands Fixed comment for zbx_tls_get_attr_psk() in r56750. Regarding conditional compilation - currently most calls to encryption code are compiled in conditionally. We should do the same for the encryption functions and data structures. Let's revisit this point after (143) is done. Aleksandrs Saveljevs Comment in r56750 looks good. Andris Mednis Does not compile on MS Windows without encryption because of empty zbx_tls_conn_attr_t structure: d:\zabbix-3.0.0alpha5\src\zabbix_agent\../libs/zbxcrypto/tls_tcp_active.h(31) : error C2016: C requires that a struct or union has at least one member NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio 9.0\ VC\Bin\amd64\cl.exe"' : return code '0x2' Sandis Neilands RESOLVED in r57005. Andris Mednis , please review. Andris Mednis r57005 reviewed, accepted. CLOSED
          Hide
          Sandis Neilands (Inactive) added a comment - - edited

          (115) Coverity CID 131420 for OpenSSL build. '*attr.issuer' and '*attr.subject' might be used uninitialized in zbx_check_server_issuer_subject() in case of TLS PSK connection type (see zbx_tls_get_attr()).

          * CID 131420:    (UNINIT)
          /src/libs/zbxcrypto/tls.c: 2367 in zbx_check_server_issuer_subject()
          /src/libs/zbxcrypto/tls.c: 2374 in zbx_check_server_issuer_subject()
          
          
          ________________________________________________________________________________________________________
          *** CID 131420:    (UNINIT)
          /src/libs/zbxcrypto/tls.c: 2367 in zbx_check_server_issuer_subject()
          2361     		*error = zbx_dsprintf(*error, "cannot get connection attributes for connection from %s",
          2362     				get_ip_by_socket(sock));
          2363     		return FAIL;
          2364     	}
          2365     
          2366     	/* simplified match, not compliant with RFC 4517, 4518 */
          >>>     CID 131420:    (UNINIT)
          >>>     Using uninitialized element of array "*attr.issuer" when calling "strcmp".
          2367     	if (NULL != CONFIG_TLS_SERVER_CERT_ISSUER && 0 != strcmp(CONFIG_TLS_SERVER_CERT_ISSUER, attr.issuer))
          2368     	{
          2369     		*error = zbx_dsprintf(*error, "certificate issuer does not match for %s", get_ip_by_socket(sock));
          2370     		return FAIL;
          2371     	}
          2372     
          /src/libs/zbxcrypto/tls.c: 2374 in zbx_check_server_issuer_subject()
          2368     	{
          2369     		*error = zbx_dsprintf(*error, "certificate issuer does not match for %s", get_ip_by_socket(sock));
          2370     		return FAIL;
          2371     	}
          2372     
          2373     	/* simplified match, not compliant with RFC 4517, 4518 */
          >>>     CID 131420:    (UNINIT)
          >>>     Using uninitialized element of array "*attr.subject" when calling "strcmp".
          2374     	if (NULL != CONFIG_TLS_SERVER_CERT_SUBJECT && 0 != strcmp(CONFIG_TLS_SERVER_CERT_SUBJECT, attr.subject))
          2375     	{
          2376     		*error = zbx_dsprintf(*error, "certificate subject does not match for %s", get_ip_by_socket(sock));
          2377     		return FAIL;
          2378     	}
          2379     
          

          Sandis Neilands RESOLVED in r56347:56351, 56413. Aleksandrs Saveljevs, please review.

          Aleksandrs Saveljevs I have fixed a bit of style and a compiler warning regarding unused "attr" in DCcheck_proxy_permissions() in r56549, but I am not sure about the previous changes to that function. Even though they do not introduce any new headers into dbconfig.c, they introduce zbx_socket_t. That dbconfig.c file should have as little to do with networking code as possible.

          Aleksandrs Saveljevs By the way, what are DCcheck_proxy_permissions() and DCget_psk_by_identity() doing in tls_tcp_active.h? Wouldn't include/dbcache.h be a better place for them? We can open a new subissue for that. REOPENED.

          Sandis Neilands As for your first point - zbx_tls_conn_attr_t is networking code as well. Changing it to zbx_socket_t replaces one networking structure to another.

          You raise a valid question in your second point. Indeed having those two functions defined in libs/zbxdbcache/dbconfig.c but declared in libs/zbxcrypto/tls_tcp_active.h probably indicates some sort of compromise solution earlier in the development.

          It might have something to do with the following comment in libs/zbxcrypto/tls.c.

          /* Pointer to DCget_psk_by_identity() initialized at runtime. This is a workaround for linking. */
          /* Server and proxy link with src/libs/zbxdbcache/dbconfig.o where DCget_psk_by_identity() resides */
          /* but other components (e.g. agent) do not link dbconfig.o. */
          

          I suggest opening a new sub-issue or even ZBX for this problem and closing this one on the grounds that having networking code (either socket or TLS connection attributes) in dbconfig.c is indeed inappropriate however the underlying issue was not introduced in scope of this sub-issue.

          Aleksandrs Saveljevs As for the first point, with all due respect to zbx_tls_conn_attr_t, it is not as hardcore a networking code as zbx_socket_t. This was also the use for attr->connection_type, removed in (114). We can discuss with Andris Mednis whether the change was worth it.

          Aleksandrs Saveljevs Second point moved to (143).

          Sandis Neilands Let's see how the code looks after (143) and then revisit this point.

          Sandis Neilands Changed DCcheck_proxy_permissions() to acquire TLS connection attributes before locking the cache in r59078.

          The reason why DCcheck_proxy_permissions() is in dbconfig.c at all is that we need to lock the cache from that function. Cache locking currently can only be done from dbconfig.c file. since Zabbix can be compiled without encryption we cannot pass more specific encryption structures to the function without resorting to void pointers or other hacks.

          Aleksandrs Saveljevs Removed extra blank lines in r59079. Apart from that, looks good enough for now.

          Sandis Neilands CLOSED.

          Show
          Sandis Neilands (Inactive) added a comment - - edited (115) Coverity CID 131420 for OpenSSL build. '*attr.issuer' and '*attr.subject' might be used uninitialized in zbx_check_server_issuer_subject() in case of TLS PSK connection type (see zbx_tls_get_attr()). * CID 131420: (UNINIT) /src/libs/zbxcrypto/tls.c: 2367 in zbx_check_server_issuer_subject() /src/libs/zbxcrypto/tls.c: 2374 in zbx_check_server_issuer_subject() ________________________________________________________________________________________________________ *** CID 131420: (UNINIT) /src/libs/zbxcrypto/tls.c: 2367 in zbx_check_server_issuer_subject() 2361 *error = zbx_dsprintf(*error, "cannot get connection attributes for connection from %s" , 2362 get_ip_by_socket(sock)); 2363 return FAIL; 2364 } 2365 2366 /* simplified match, not compliant with RFC 4517, 4518 */ >>> CID 131420: (UNINIT) >>> Using uninitialized element of array "*attr.issuer" when calling "strcmp" . 2367 if (NULL != CONFIG_TLS_SERVER_CERT_ISSUER && 0 != strcmp(CONFIG_TLS_SERVER_CERT_ISSUER, attr.issuer)) 2368 { 2369 *error = zbx_dsprintf(*error, "certificate issuer does not match for %s" , get_ip_by_socket(sock)); 2370 return FAIL; 2371 } 2372 /src/libs/zbxcrypto/tls.c: 2374 in zbx_check_server_issuer_subject() 2368 { 2369 *error = zbx_dsprintf(*error, "certificate issuer does not match for %s" , get_ip_by_socket(sock)); 2370 return FAIL; 2371 } 2372 2373 /* simplified match, not compliant with RFC 4517, 4518 */ >>> CID 131420: (UNINIT) >>> Using uninitialized element of array "*attr.subject" when calling "strcmp" . 2374 if (NULL != CONFIG_TLS_SERVER_CERT_SUBJECT && 0 != strcmp(CONFIG_TLS_SERVER_CERT_SUBJECT, attr.subject)) 2375 { 2376 *error = zbx_dsprintf(*error, "certificate subject does not match for %s" , get_ip_by_socket(sock)); 2377 return FAIL; 2378 } 2379 Sandis Neilands RESOLVED in r56347:56351, 56413. Aleksandrs Saveljevs , please review. Aleksandrs Saveljevs I have fixed a bit of style and a compiler warning regarding unused "attr" in DCcheck_proxy_permissions() in r56549, but I am not sure about the previous changes to that function. Even though they do not introduce any new headers into dbconfig.c, they introduce zbx_socket_t. That dbconfig.c file should have as little to do with networking code as possible. Aleksandrs Saveljevs By the way, what are DCcheck_proxy_permissions() and DCget_psk_by_identity() doing in tls_tcp_active.h? Wouldn't include/dbcache.h be a better place for them? We can open a new subissue for that. REOPENED. Sandis Neilands As for your first point - zbx_tls_conn_attr_t is networking code as well. Changing it to zbx_socket_t replaces one networking structure to another. You raise a valid question in your second point. Indeed having those two functions defined in libs/zbxdbcache/dbconfig.c but declared in libs/zbxcrypto/tls_tcp_active.h probably indicates some sort of compromise solution earlier in the development. It might have something to do with the following comment in libs/zbxcrypto/tls.c . /* Pointer to DCget_psk_by_identity() initialized at runtime. This is a workaround for linking. */ /* Server and proxy link with src/libs/zbxdbcache/dbconfig.o where DCget_psk_by_identity() resides */ /* but other components (e.g. agent) do not link dbconfig.o. */ I suggest opening a new sub-issue or even ZBX for this problem and closing this one on the grounds that having networking code (either socket or TLS connection attributes) in dbconfig.c is indeed inappropriate however the underlying issue was not introduced in scope of this sub-issue. Aleksandrs Saveljevs As for the first point, with all due respect to zbx_tls_conn_attr_t , it is not as hardcore a networking code as zbx_socket_t . This was also the use for attr->connection_type , removed in (114). We can discuss with Andris Mednis whether the change was worth it. Aleksandrs Saveljevs Second point moved to (143). Sandis Neilands Let's see how the code looks after (143) and then revisit this point. Sandis Neilands Changed DCcheck_proxy_permissions() to acquire TLS connection attributes before locking the cache in r59078. The reason why DCcheck_proxy_permissions() is in dbconfig.c at all is that we need to lock the cache from that function. Cache locking currently can only be done from dbconfig.c file. since Zabbix can be compiled without encryption we cannot pass more specific encryption structures to the function without resorting to void pointers or other hacks. Aleksandrs Saveljevs Removed extra blank lines in r59079. Apart from that, looks good enough for now. Sandis Neilands CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (116) [F] Suppose we have a discovered host (one created from a prototype). In the "Encryption" tab, it has everything disabled, which is good. But if we clone this host, the new host will have "Connections to host" reset to "No encryption", even if the original host had it set to something else.

          Oleg Egorov RESOLVED in r56258

          Ivo Kurzemnieks CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (116) [F] Suppose we have a discovered host (one created from a prototype). In the "Encryption" tab, it has everything disabled, which is good. But if we clone this host, the new host will have "Connections to host" reset to "No encryption", even if the original host had it set to something else. Oleg Egorov RESOLVED in r56258 Ivo Kurzemnieks CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (117) [F] In the host list, "Agent encryption" column looks dubious - as of r56237, everything there is green. But if everything is green, that means that the color does not carry any semantic load, and if the color does not convey any meaning, then why use it?

          In the past, "NONE" used to be gray, which was a good idea. It was changed, perhaps due to a consideration that gray in the "Availability" column means "nothing happens" or "not present", whereas in the "Agent encryption" column "None" is a valid connection method. However, then there are two further considerations: (a) in the "Availability" column, we show all possible options, whereas in the "Agent encryption" column we show only enabled options, so the meta-meaning of these columns is different anyway, and (b) we can think about "None" like "encryption does not happen" or "encryption not present", so it would have about the same meaning as the "Availability" column. Green color gives a sense of security and if we have no encryption, then there is nothing secure about it.

          Oleg Egorov Redesigned host list encryption column.
          Added not used "Connections from host" encryption methods.
          RESOLVED IN r56238

          <richlv> note that we show (now) 'none' only of there's no incoming/outgoing encryption for that host - as soon as there's at least one encryption option enabled, we should show them all, so the meaning of on/off should be at least somewhat logical

          Ivo Kurzemnieks One more thing: let's use ZBX_STYLE_STATUS_* in views.
          REOPENED

          Oleg Egorov RESOLVED in r56285, r56288

          Ivo Kurzemnieks CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (117) [F] In the host list, "Agent encryption" column looks dubious - as of r56237, everything there is green. But if everything is green, that means that the color does not carry any semantic load, and if the color does not convey any meaning, then why use it? In the past, "NONE" used to be gray, which was a good idea. It was changed, perhaps due to a consideration that gray in the "Availability" column means "nothing happens" or "not present", whereas in the "Agent encryption" column "None" is a valid connection method. However, then there are two further considerations: (a) in the "Availability" column, we show all possible options, whereas in the "Agent encryption" column we show only enabled options, so the meta-meaning of these columns is different anyway, and (b) we can think about "None" like "encryption does not happen" or "encryption not present", so it would have about the same meaning as the "Availability" column. Green color gives a sense of security and if we have no encryption, then there is nothing secure about it. Oleg Egorov Redesigned host list encryption column. Added not used "Connections from host" encryption methods. RESOLVED IN r56238 < richlv > note that we show (now) 'none' only of there's no incoming/outgoing encryption for that host - as soon as there's at least one encryption option enabled, we should show them all, so the meaning of on/off should be at least somewhat logical Ivo Kurzemnieks One more thing: let's use ZBX_STYLE_STATUS_* in views. REOPENED Oleg Egorov RESOLVED in r56285, r56288 Ivo Kurzemnieks CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (119) [F] In the host list, suppose we click "Enable" to enable a host. The detail message in the green cloud above says the following:

          Updated status of host "TEST"
          

          According to https://zabbix.org/wiki/Docs/specs/syntax#Formatting, such messages should always end with a period.

          Oleg Egorov Old issue, fixed in r56254. RESOLVED.

          Aleksandrs Saveljevs The issue was fixed like so:

          -		info(_('Updated status of host').' "'.$host['host'].'"');
          +		info(_('Updated status of host').' "'.$host['host'].'".');
          

          Shouldn't the message use a parameter instead of concatenation? In Russian, for instance, it looks like this:

          Состояние узла сети обновлено "TEST"
          

          REOPENED

          Oleg Egorov Thank you! RESOLVED in r56287

          Ivo Kurzemnieks CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (119) [F] In the host list, suppose we click "Enable" to enable a host. The detail message in the green cloud above says the following: Updated status of host "TEST" According to https://zabbix.org/wiki/Docs/specs/syntax#Formatting , such messages should always end with a period. Oleg Egorov Old issue, fixed in r56254. RESOLVED. Aleksandrs Saveljevs The issue was fixed like so: - info(_('Updated status of host').' "'.$host['host'].'"'); + info(_('Updated status of host').' "'.$host['host'].'".'); Shouldn't the message use a parameter instead of concatenation? In Russian, for instance, it looks like this: Состояние узла сети обновлено "TEST" REOPENED Oleg Egorov Thank you! RESOLVED in r56287 Ivo Kurzemnieks CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (120) [F] In "Encryption" tab for proxies, it has two labels: "Connections to host" and "Connections from proxy". The first should probably say "Connections to proxy".

          Oleg Egorov RESOLVED in r56257

          Ivo Kurzemnieks CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (120) [F] In "Encryption" tab for proxies, it has two labels: "Connections to host" and "Connections from proxy". The first should probably say "Connections to proxy". Oleg Egorov RESOLVED in r56257 Ivo Kurzemnieks CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (121) [G] Suppose we start zabbix_sender with -vv option. We then get the following output:

          zabbix_sender [25010]: DEBUG: In zbx_tls_init_child()
          zabbix_sender [25010]: DEBUG: OpenSSL library (version OpenSSL 1.0.2c 12 Jun 2015) initialized
          zabbix_sender [25010]: DEBUG: End of zbx_tls_init_child()
          zabbix_sender [25013]: DEBUG: answer [{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000022"}]
          info from server: "processed: 1; failed: 0; total: 1; seconds spent: 0.000022"
          sent: 1; skipped: 0; total: 1
          

          The zbx_tls_init_child() debug is what is added, compared to previous non-TLS versions. It does not look that bad, but I am not sure it was intended (especially "In ..." and "End of ..." lines). Could you please check?

          Andris Mednis RESOLVED in r56259. If it is a good solution, it could be applied to zabbix_get, too.

          Aleksandrs Saveljevs Looks suspicious - we only perform initialization if TLS parameters are present, but perform deinitialization unconditionally. REOPENED.

          Aleksandrs Saveljevs Actually, my question was whether we intended to include the complete TLS debug output into -vv or not.

          Andris Mednis It was my intention to include TLS debug output with -vv only if TLS parameters are used and skip it otherwise. Seems useful, for example:

          $ zabbix_sender -c ~/zabbix_agentd.conf -z xxx.xxx.xxx.xxx -p xxx -s "Zabbix server" -k trapper_test -o ABC -vv
          zabbix_sender [12667]: DEBUG: In zbx_tls_init_child()
          zabbix_sender [12667]: DEBUG: OpenSSL library (version OpenSSL 1.0.2d 9 Jul 2015) initialized
          zabbix_sender [12667]: DEBUG: zbx_tls_init_child(): loaded CA certificate(s) from file "/home/zabbix30/zabbix_ca_file"
          zabbix_sender [12667]: DEBUG: zbx_tls_init_child(): loaded certificate(s) from file "/home/zabbix30/zabbix_agentd.crt"
          zabbix_sender [12667]: DEBUG: zbx_tls_init_child(): loaded private key from file "/home/zabbix30/zabbix_agentd.key"
          zabbix_sender [12667]: DEBUG: zbx_tls_init_child(): certificate ciphersuites: ECDHE-RSA-AES128-GCM-SHA256 ECDHE-RSA-AES128-SHA256 ECDHE-RSA-AES128-SHA AES128-GCM-SHA256 AES128-SHA256 AES128-SHA
          zabbix_sender [12667]: DEBUG: End of zbx_tls_init_child()
          zabbix_sender [12668]: DEBUG: In zbx_tls_connect(): issuer:"CN=ZBXNEXT-1263 Signing CA,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com" subject:"CN=Zabbix proxy,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com"
          zabbix_sender [12668]: DEBUG: peer certificate issuer:"CN=ZBXNEXT-1263 Signing CA,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com" subject:"CN=Zabbix proxy,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com"
          zabbix_sender [12668]: DEBUG: End of zbx_tls_connect():SUCCEED (established TLSv1.2 AES128-GCM-SHA256)
          zabbix_sender [12668]: DEBUG: answer [{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000121"}]
          info from server: "processed: 1; failed: 0; total: 1; seconds spent: 0.000121"
          sent: 1; skipped: 0; total: 1
          

          Are there reasons to not include it ?

          Aleksandrs Saveljevs The output is useful. Just wanted to clarify whether that is a bug or a feature. But your idea of skipping TLS library initialization if TLS settings are not used is a good one.

          Aleksandrs Saveljevs Extending that further: Zabbix sender, if used with agent configuration file, will load certificates, CRLs, etc., if those are present in the agent configuration. However, if --tls-connect is "psk", then doing that is not necessary.

          Andris Mednis RESOLVED in r59167.

          Aleksandrs Saveljevs Removed unnecessary TLS initialization in zabbix_get in r59265. Please review.

          Andris Mednis Thanks! Reviewed, accepted.

          Aleksandrs Saveljevs CLOSED, although as mentioned by Glebs Ivanovskis, r59265 conflicts with (158). Let's handle this in (158).

          Show
          Aleksandrs Saveljevs added a comment - - edited (121) [G] Suppose we start zabbix_sender with -vv option. We then get the following output: zabbix_sender [25010]: DEBUG: In zbx_tls_init_child() zabbix_sender [25010]: DEBUG: OpenSSL library (version OpenSSL 1.0.2c 12 Jun 2015) initialized zabbix_sender [25010]: DEBUG: End of zbx_tls_init_child() zabbix_sender [25013]: DEBUG: answer [{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000022"}] info from server: "processed: 1; failed: 0; total: 1; seconds spent: 0.000022" sent: 1; skipped: 0; total: 1 The zbx_tls_init_child() debug is what is added, compared to previous non-TLS versions. It does not look that bad, but I am not sure it was intended (especially "In ..." and "End of ..." lines). Could you please check? Andris Mednis RESOLVED in r56259. If it is a good solution, it could be applied to zabbix_get, too. Aleksandrs Saveljevs Looks suspicious - we only perform initialization if TLS parameters are present, but perform deinitialization unconditionally. REOPENED. Aleksandrs Saveljevs Actually, my question was whether we intended to include the complete TLS debug output into -vv or not. Andris Mednis It was my intention to include TLS debug output with -vv only if TLS parameters are used and skip it otherwise. Seems useful, for example: $ zabbix_sender -c ~/zabbix_agentd.conf -z xxx.xxx.xxx.xxx -p xxx -s "Zabbix server" -k trapper_test -o ABC -vv zabbix_sender [12667]: DEBUG: In zbx_tls_init_child() zabbix_sender [12667]: DEBUG: OpenSSL library (version OpenSSL 1.0.2d 9 Jul 2015) initialized zabbix_sender [12667]: DEBUG: zbx_tls_init_child(): loaded CA certificate(s) from file "/home/zabbix30/zabbix_ca_file" zabbix_sender [12667]: DEBUG: zbx_tls_init_child(): loaded certificate(s) from file "/home/zabbix30/zabbix_agentd.crt" zabbix_sender [12667]: DEBUG: zbx_tls_init_child(): loaded private key from file "/home/zabbix30/zabbix_agentd.key" zabbix_sender [12667]: DEBUG: zbx_tls_init_child(): certificate ciphersuites: ECDHE-RSA-AES128-GCM-SHA256 ECDHE-RSA-AES128-SHA256 ECDHE-RSA-AES128-SHA AES128-GCM-SHA256 AES128-SHA256 AES128-SHA zabbix_sender [12667]: DEBUG: End of zbx_tls_init_child() zabbix_sender [12668]: DEBUG: In zbx_tls_connect(): issuer:"CN=ZBXNEXT-1263 Signing CA,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com" subject:"CN=Zabbix proxy,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com" zabbix_sender [12668]: DEBUG: peer certificate issuer:"CN=ZBXNEXT-1263 Signing CA,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com" subject:"CN=Zabbix proxy,OU=C development team,O=Zabbix SIA,DC=zabbix,DC=com" zabbix_sender [12668]: DEBUG: End of zbx_tls_connect():SUCCEED (established TLSv1.2 AES128-GCM-SHA256) zabbix_sender [12668]: DEBUG: answer [{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000121"}] info from server: "processed: 1; failed: 0; total: 1; seconds spent: 0.000121" sent: 1; skipped: 0; total: 1 Are there reasons to not include it ? Aleksandrs Saveljevs The output is useful. Just wanted to clarify whether that is a bug or a feature. But your idea of skipping TLS library initialization if TLS settings are not used is a good one. Aleksandrs Saveljevs Extending that further: Zabbix sender, if used with agent configuration file, will load certificates, CRLs, etc., if those are present in the agent configuration. However, if --tls-connect is "psk", then doing that is not necessary. Andris Mednis RESOLVED in r59167. Aleksandrs Saveljevs Removed unnecessary TLS initialization in zabbix_get in r59265. Please review. Andris Mednis Thanks! Reviewed, accepted. Aleksandrs Saveljevs CLOSED, although as mentioned by Glebs Ivanovskis , r59265 conflicts with (158). Let's handle this in (158).
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (122) [S] When a server loads PSK values into configuration cache, it treats "aa" and "AA" differently:

           27594:20151019:173150.373 PSK value changed for identity "Zabbix sender"
           27594:20151019:173150.373 PSK value changed for identity "Zabbix sender"
          

          It probably should treat them in a case-insensitive manner.

          Glebs Ivanovskis RESOLVED in r56444.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (122) [S] When a server loads PSK values into configuration cache, it treats "aa" and "AA" differently: 27594:20151019:173150.373 PSK value changed for identity "Zabbix sender" 27594:20151019:173150.373 PSK value changed for identity "Zabbix sender" It probably should treat them in a case-insensitive manner. Glebs Ivanovskis RESOLVED in r56444. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (123) [G] This is probably not related to this development, but the Windows agent sometimes prints the following on startup:

            2956:20151020:152820.964 agent #0 started [main process]
            3100:20151020:152820.964 agent #1 started [collector]
            2328:20151020:152820.964 agent #3 started [listener #2]
            3220:20151020:152820.964 agent #3 started [listener #1]
            3664:20151020:152820.964 agent #4 started [active checks #1
          

          Note the same agent number for both listeners.

          Sandis Neilands Indeed, this is a regression since ZBX-8796. process_type, process_num, server_num are NOT thread local variables.

          src/zabbix_agent/zabbix_agentd.c

          ZBX_THREAD_LOCAL unsigned char  process_type    = 255;  /* ZBX_PROCESS_TYPE_UNKNOWN */
          ZBX_THREAD_LOCAL int            process_num;
          ZBX_THREAD_LOCAL int            server_num      = 0;
          

          src/zabbix_agent/listener.c, etc.

          extern unsigned char    process_type, program_type;
          extern int              server_num, process_num;
          

          According to Microsoft this should produce (compile time?) errors. We don't have neither errors, nor warnings in our build environment regarding this problem.

          You must use the thread attribute for the declaration and the definition of thread local data,
          regardless of whether the declaration and definition occur in the same file or separate files.
          For example, the following code generates an error: 
          
          #define Thread   __declspec( thread )
          extern int tls_i;     /* This generates an error, because the   */
          int Thread tls_i;     /* declaration and the definition differ. */
          

          Sandis Neilands CLOSED. Opened ZBX-10031 for this sub-issue as it's not related to encryption work.

          Show
          Aleksandrs Saveljevs added a comment - - edited (123) [G] This is probably not related to this development, but the Windows agent sometimes prints the following on startup: 2956:20151020:152820.964 agent #0 started [main process] 3100:20151020:152820.964 agent #1 started [collector] 2328:20151020:152820.964 agent #3 started [listener #2] 3220:20151020:152820.964 agent #3 started [listener #1] 3664:20151020:152820.964 agent #4 started [active checks #1 Note the same agent number for both listeners. Sandis Neilands Indeed, this is a regression since ZBX-8796 . process_type, process_num, server_num are NOT thread local variables. src/zabbix_agent/zabbix_agentd.c ZBX_THREAD_LOCAL unsigned char process_type = 255; /* ZBX_PROCESS_TYPE_UNKNOWN */ ZBX_THREAD_LOCAL int process_num; ZBX_THREAD_LOCAL int server_num = 0; src/zabbix_agent/listener.c , etc. extern unsigned char process_type, program_type; extern int server_num, process_num; According to Microsoft this should produce (compile time?) errors. We don't have neither errors, nor warnings in our build environment regarding this problem. You must use the thread attribute for the declaration and the definition of thread local data, regardless of whether the declaration and definition occur in the same file or separate files. For example, the following code generates an error: #define Thread __declspec( thread ) extern int tls_i; /* This generates an error, because the */ int Thread tls_i; /* declaration and the definition differ. */ Sandis Neilands CLOSED. Opened ZBX-10031 for this sub-issue as it's not related to encryption work.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (124) [G] Agents now print the list of enabled features on startup:

            9973:20151020:142944.687 Starting Zabbix Agent [agent-polarssl-1.3.9]. Zabbix 3.0.0alpha4 (revision {ZABBIX_REVISION}).
            9973:20151020:142944.687 **** Enabled features ****
            9973:20151020:142944.687 TLS support:           YES
            9973:20151020:142944.687 **************************
          

          Let's add IPv6 support here, too.

          Aleksandrs Saveljevs RESOLVED in r56286.

          Andris Mednis CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (124) [G] Agents now print the list of enabled features on startup: 9973:20151020:142944.687 Starting Zabbix Agent [agent-polarssl-1.3.9]. Zabbix 3.0.0alpha4 (revision {ZABBIX_REVISION}). 9973:20151020:142944.687 **** Enabled features **** 9973:20151020:142944.687 TLS support: YES 9973:20151020:142944.687 ************************** Let's add IPv6 support here, too. Aleksandrs Saveljevs RESOLVED in r56286. Andris Mednis CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (125) [GS] Suppose we have 3 Windows agents with the same PSK identity ("Windows Agent") and PSK value (all correctly configured in the agent configuration files). Among those agents, a GnuTLS host is configured in the frontend with the correct identity, but mbed TLS and OpenSSL hosts are configured with a mistyped one: "Windows Agent1". So GnuTLS data should be accepted by the server, but mbed TLS and OpenSSL data - rejected.

          It works well, but notice that in the log messages below the received PSK identity from mbed TLS sometimes has "~" character at the end:

           11895:20151020:163739.695 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent~"
           11895:20151020:163754.164 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent"
           11895:20151020:163809.117 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent"
           11895:20151020:163825.586 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent"
           11895:20151020:163840.055 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent~"
           11895:20151020:163855.523 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent~"
          

          Aleksandrs Saveljevs Might be non-trivial to reproduce. Only happened once for me so far.

          Andris Mednis Thanks for finding! RESOLVED in r58351.

          Aleksandrs Saveljevs Looks good. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (125) [GS] Suppose we have 3 Windows agents with the same PSK identity ("Windows Agent") and PSK value (all correctly configured in the agent configuration files). Among those agents, a GnuTLS host is configured in the frontend with the correct identity, but mbed TLS and OpenSSL hosts are configured with a mistyped one: "Windows Agent1". So GnuTLS data should be accepted by the server, but mbed TLS and OpenSSL data - rejected. It works well, but notice that in the log messages below the received PSK identity from mbed TLS sometimes has "~" character at the end: 11895:20151020:163739.695 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent~" 11895:20151020:163754.164 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent" 11895:20151020:163809.117 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent" 11895:20151020:163825.586 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent" 11895:20151020:163840.055 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent~" 11895:20151020:163855.523 false PSK identity for host "windows-agent-mbedtls-1.3.11" item "agent.version" (not every rejected item might be reported): configured identity "Windows Agent1", received identity "Windows Agent~" Aleksandrs Saveljevs Might be non-trivial to reproduce. Only happened once for me so far. Andris Mednis Thanks for finding! RESOLVED in r58351. Aleksandrs Saveljevs Looks good. CLOSED.
          Hide
          Oleg Egorov added a comment - - edited

          (126) [F] Translation strings
          Added translation strings:

          • Updated status of host "%1$s".
          • Cannot execute script

          Removed translation strings

          • Updated status of host
          • Cannot connect to the trapper port of zabbix server daemon, but it should be available to run the script.

          Ivo Kurzemnieks The dot was misplaced. Corrected.
          CLOSED

          Aleksandrs Saveljevs Updated list for (142). RESOLVED.

          Oleg Egorov CLOSED

          Show
          Oleg Egorov added a comment - - edited (126) [F] Translation strings Added translation strings : Updated status of host "%1$s". Cannot execute script Removed translation strings Updated status of host Cannot connect to the trapper port of zabbix server daemon, but it should be available to run the script. Ivo Kurzemnieks The dot was misplaced. Corrected. CLOSED Aleksandrs Saveljevs Updated list for (142). RESOLVED. Oleg Egorov CLOSED
          Hide
          Patrick Ernst added a comment - - edited

          (127) Server -> Agent connection to run remote commands are unencryped even if encryption is forced by configuration.
          This was tested with PSK and Build 56280.

          /var/log/zabbix/zabbix_agentd.log
          ...
          1278:20151021:102809.794 agent #4 started [listener #3]
          1277:20151021:103833.604 failed to accept an incoming connection: from 10.91.0.191: unencrypted connections are not allowed
          

          Andris Mednis Thanks, Patrick, for testing and reporting this! RESOLVED in r56446.

          Andris Mednis Does not compile if HAVE_OPENIPMI is undefined. REOPENED.

          Andris Mednis RESOLVED in r56478.

          Aleksandrs Saveljevs The following condition in escalator.c is incorrect:

          #if defined(HAVE_POLARSSL) || defined(HAVE_GNUTLS) || defined(HAVE_OPENSSL)
          if (ZBX_SCRIPT_TYPE_CUSTOM_SCRIPT == script.type)
          {
          	strscpy(host.tls_issuer, row[13 + ZBX_IPMI_FIELDS_NUM]);
          	strscpy(host.tls_subject, row[14 + ZBX_IPMI_FIELDS_NUM]);
          	strscpy(host.tls_psk_identity, row[15 + ZBX_IPMI_FIELDS_NUM]);
          	strscpy(host.tls_psk, row[16 + ZBX_IPMI_FIELDS_NUM]);
          }
          #endif
          

          Encryption settings for a host should also be filled if the script is a global script. REOPENED.

          Aleksandrs Saveljevs While at it, please take a look at IPMI SQL comment fixes in r56586.

          Andris Mednis Thanks, r56586 reviewed. RESOLVED in r56600.

          Aleksandrs Saveljevs CLOSED

          Patrick Ernst Sorry, can't confirm fix. Problem still persists with r56631.

          Aleksandrs Saveljevs The fix is available in development branch svn://svn.zabbix.com/branches/dev/ZBXNEXT-1263-2 and has not been merged into trunk yet. Have you tested the development branch?

          Glebs Ivanovskis Running remote commands was broken after the first encryption merge in revision 56207. This affects 3.0.0alpha3 and 3.0.0alpha4. Remote commands do not work in both encrypted/unencrypted modes regardless of whether encryption support is compiled in or not. The reason for that is that host->tls_connect is not initialized at all. This issue is partially solved in ZBX-10029 (and merged into trunk in revision 56950) to enable at least unencrypted remote commands:

          		host->tls_connect = (unsigned char)1;	/* temporary fix */
          

          This line will cause a conflict during the next merge of encryption branch, which has already got a proper fix of this issue. So, feel free to remove that temporary fix in favour of

          		host->tls_connect = (unsigned char)atoi(row[2]);
          

          Andris Mednis Conflict resolved, temporary fix removed in r56979 (update to the latest trunk).

          Andris Mednis CLOSED

          Show
          Patrick Ernst added a comment - - edited (127) Server -> Agent connection to run remote commands are unencryped even if encryption is forced by configuration. This was tested with PSK and Build 56280. /var/log/zabbix/zabbix_agentd.log ... 1278:20151021:102809.794 agent #4 started [listener #3] 1277:20151021:103833.604 failed to accept an incoming connection: from 10.91.0.191: unencrypted connections are not allowed Andris Mednis Thanks, Patrick, for testing and reporting this! RESOLVED in r56446. Andris Mednis Does not compile if HAVE_OPENIPMI is undefined. REOPENED. Andris Mednis RESOLVED in r56478. Aleksandrs Saveljevs The following condition in escalator.c is incorrect: # if defined(HAVE_POLARSSL) || defined(HAVE_GNUTLS) || defined(HAVE_OPENSSL) if (ZBX_SCRIPT_TYPE_CUSTOM_SCRIPT == script.type) { strscpy(host.tls_issuer, row[13 + ZBX_IPMI_FIELDS_NUM]); strscpy(host.tls_subject, row[14 + ZBX_IPMI_FIELDS_NUM]); strscpy(host.tls_psk_identity, row[15 + ZBX_IPMI_FIELDS_NUM]); strscpy(host.tls_psk, row[16 + ZBX_IPMI_FIELDS_NUM]); } #endif Encryption settings for a host should also be filled if the script is a global script. REOPENED. Aleksandrs Saveljevs While at it, please take a look at IPMI SQL comment fixes in r56586. Andris Mednis Thanks, r56586 reviewed. RESOLVED in r56600. Aleksandrs Saveljevs CLOSED Patrick Ernst Sorry, can't confirm fix. Problem still persists with r56631. Aleksandrs Saveljevs The fix is available in development branch svn://svn.zabbix.com/branches/dev/ZBXNEXT-1263-2 and has not been merged into trunk yet. Have you tested the development branch? Glebs Ivanovskis Running remote commands was broken after the first encryption merge in revision 56207. This affects 3.0.0alpha3 and 3.0.0alpha4. Remote commands do not work in both encrypted/unencrypted modes regardless of whether encryption support is compiled in or not. The reason for that is that host->tls_connect is not initialized at all. This issue is partially solved in ZBX-10029 (and merged into trunk in revision 56950) to enable at least unencrypted remote commands: host->tls_connect = (unsigned char )1; /* temporary fix */ This line will cause a conflict during the next merge of encryption branch, which has already got a proper fix of this issue. So, feel free to remove that temporary fix in favour of host->tls_connect = (unsigned char )atoi(row[2]); Andris Mednis Conflict resolved, temporary fix removed in r56979 (update to the latest trunk). Andris Mednis CLOSED
          Hide
          richlv added a comment - - edited

          (128) minor, but in app/views/administration.proxy.edit.js.php a comment says:

          if proxy is active, then disabled "Connections to proxy" field and enabled "Connections from proxy"
          if proxy is passive, then enabled "Connections to proxy" field and disabled "Connections from proxy"

          this sounds strange, depending on the intent one of these might be better :

          • if proxy is active, disable "Connections to proxy" field and enable "Connections from proxy"
          • if proxy is active, "Connections to proxy" field is disabled and "Connections from proxy" is enabled

          Oleg Egorov RESOLVED IN r59057

          Ivo Kurzemnieks CLOSED

          Show
          richlv added a comment - - edited (128) minor, but in app/views/administration.proxy.edit.js.php a comment says: if proxy is active, then disabled "Connections to proxy" field and enabled "Connections from proxy" if proxy is passive, then enabled "Connections to proxy" field and disabled "Connections from proxy" this sounds strange, depending on the intent one of these might be better : if proxy is active, disable "Connections to proxy" field and enable "Connections from proxy" if proxy is active, "Connections to proxy" field is disabled and "Connections from proxy" is enabled Oleg Egorov RESOLVED IN r59057 Ivo Kurzemnieks CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (129) [D] Unrelated to this development, but has to be fixed somewhere. It is a bug in ZBXNEXT-2637:

          ### Option: UnsafeUserParameters
          #	Allow all characters to be passed in arguments to user-defined parameters.
          #	The following characters are not allowed:
          #	\ ' ” ` * ? [ ] { } ~ $ ! & ; ( ) < > | # @
          

          Here, the double quote is not a plain double quote. If possible, both configuration files and online manual should be fixed.

          <richlv> fixed in r56443 :

          • zabbix_agent.conf
          • zabbix_agentd.conf
          • zabbix_agentd.win.conf

          fixed in the manual :

          RESOLVED

          Aleksandrs Saveljevs Fixed config/zabbix_agent as well. Please take a look. Still RESOLVED.

          <richlv> somehow managed to think that page was removed already... thanks

          CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (129) [D] Unrelated to this development, but has to be fixed somewhere. It is a bug in ZBXNEXT-2637 : ### Option: UnsafeUserParameters # Allow all characters to be passed in arguments to user-defined parameters. # The following characters are not allowed: # \ ' ” ` * ? [ ] { } ~ $ ! & ; ( ) < > | # @ Here, the double quote is not a plain double quote. If possible, both configuration files and online manual should be fixed. < richlv > fixed in r56443 : zabbix_agent.conf zabbix_agentd.conf zabbix_agentd.win.conf fixed in the manual : config/zabbix_agentd config/zabbix_agentd RESOLVED Aleksandrs Saveljevs Fixed config/zabbix_agent as well. Please take a look. Still RESOLVED. < richlv > somehow managed to think that page was removed already... thanks CLOSED
          Hide
          richlv added a comment - - edited

          (130) in src/libs/zbxcrypto/tls.c, there seems to be a typo, repeated 4 times :

          '*error' is not NULL if an error occured
          /* an error occured */

          should be 'occurred'

          Glebs Ivanovskis RESOLVED in r56612, r56620.

          Aleksandrs Saveljevs CLOSED

          Show
          richlv added a comment - - edited (130) in src/libs/zbxcrypto/tls.c, there seems to be a typo, repeated 4 times : '*error' is not NULL if an error occured /* an error occured */ should be 'occurred' Glebs Ivanovskis RESOLVED in r56612, r56620. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (131) [D] This issue is to document resource usage and performance of various Zabbix components using different libraries.

          Memory usage for Windows agent (r56284)

          • unencrypted
            proc_info[zabbix_agentd.unencrypted.exe,vmsize]   4.71 MB
            proc_info[zabbix_agentd.unencrypted.exe,wkset]    7.09 MB
            
          • mbed TLS 1.3.11
            proc_info[zabbix_agentd.mbedtls.exe,vmsize]       5.74 MB
            proc_info[zabbix_agentd.mbedtls.exe,wkset]        8.68 MB
            
          • GnuTLS 3.3.13
            proc_info[zabbix_agentd.gnutls.exe,vmsize]        7.25 MB
            proc_info[zabbix_agentd.gnutls.exe,wkset]        11.35 MB
            
          • OpenSSL 1.0.2c
            proc_info[zabbix_agentd.openssl.exe,vmsize]       7.86 MB
            proc_info[zabbix_agentd.openssl.exe,wkset]       11.32 MB
            

          CPU usage and poller busyness for server using certificates (r56284)

          This test was performed by using a server with 10 pollers that monitored about 100 frequently polled items on 6 agents (with different libraries) using certificates. CPU used: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz.

          According to the graphs, the average CPU usage for pollers was:

          • mbed TLS: 78%
          • GnuTLS: 36%
          • OpenSSL: 25%

          Average poller busyness (note that, unlike CPU usage, busyness also heavily depends on the agent):

          • mbed TLS: 55%
          • GnuTLS: 20%
          • OpenSSL: 19%

          So OpenSSL seems to be most efficient (although, as pointed out by Andris Mednis, performance should ideally be tested on the same cipher suites - see (137) for the currently used ones) and we might wish to recommend OpenSSL or GnuTLS setups for performance (although it might be good to reference some official benchmark).

          CPU usage for agent using certificates (r56284)

          An interesting observation is that for mbed TLS and GnuTLS agents, the CPU usage is higher with mbed TLS server (the first case) than with GnuTLS and OpenSSL servers.

          The average CPU usage for the second and third cases is as follows:

          • mbed TLS: 23%
          • GnuTLS: 11%
          • OpenSSL: 6%

          CPU usage and poller busyness for server using PSK (r56284)

          Same scenario as the test above, except that PSK are used instead of certificates.

          Average CPU usage for pollers:

          • mbed TLS: 42.38%
          • GnuTLS: 5.57%
          • OpenSSL: 4.14%

          Average poller busyness:

          • mbed TLS: 54.33%
          • GnuTLS: 14.60%
          • OpenSSL: 12.65%

          It can be seen that CPU usage is significantly lower, compared to certificates, especially in GnuTLS and OpenSSL cases.

          CPU usage for agent using PSK (r56284)

          An interesting observation here, too, is that for mbed TLS and GnuTLS agents, the CPU usage is higher with mbed TLS server (the first case). However, unlike certificates, CPU usage with GnuTLS server in noticeably higher than with OpenSSL server (again, as pointed out by Andris Mednis, it might simply be due to different ciphers being used).

          The average agent CPU usage for mbed TLS server:

          • mbed TLS: 24.84%
          • GnuTLS: 11.86%
          • OpenSSL: 1.02%

          The average agent CPU usage for GnuTLS server:

          • mbed TLS: 14.42%
          • GnuTLS: 4.50%
          • OpenSSL: 0.92%

          The average agent CPU usage for OpenSSL server:

          • mbed TLS: 1.15%
          • GnuTLS: 1.25%
          • OpenSSL: 1.04%

          CPU usage and poller busyness for server without encryption (r56284)

          Same scenario, except that no encryption is used.

          Omitting the graphs, the average CPU usage for pollers, regardless of encryption library, was about 1.55%, and poller busyness was around 2.50%.

          CPU usage for agent without encryption (r56284)

          Similarly here, the average CPU usage for agent listeners, regardless of encryption library, was around 0.40%.

          Performance of zabbix_get (r56284)

          This test was performed on localhost (i.e., no network latency, same as the tests above) by using a modified zabbix_get that would query "agent.ping" item 1000 times. The tables below show how many seconds it took zabbix_get to perform these queries (rows are different flavors of zabbix_get, columns - zabbix_agentd).

          No encryption:
            unencrypted mbed TLS GnuTLS OpenSSL
          unencrypted real 0m0.050s
          user 0m0.004s
          sys 0m0.032s
          - - -
          mbed TLS - - - -
          GnuTLS - - - -
          OpenSSL - - - -
          Using certificates:
            unencrypted mbed TLS GnuTLS OpenSSL
          unencrypted - - - -
          mbed TLS - real 1m36.140s
          user 0m13.300s
          sys 0m0.024s
          real 0m52.301s
          user 0m12.984s
          sys 0m0.016s
          real 0m40.002s
          user 0m3.332s
          sys 0m0.040s
          GnuTLS - real 0m45.968s
          user 0m2.408s
          sys 0m0.820s
          real 0m5.419s
          user 0m2.884s
          sys 0m0.116s
          real 0m3.710s
          user 0m2.112s
          sys 0m0.100s
          OpenSSL - real 0m48.036s
          user 0m1.888s
          sys 0m0.000s
          real 0m4.801s
          user 0m1.548s
          sys 0m0.092s
          real 0m2.921s
          user 0m1.396s
          sys 0m0.072s
          Using PSK:
            unencrypted mbed TLS GnuTLS OpenSSL
          unencrypted - - - -
          mbed TLS - real 1m32.923s
          user 0m10.108s
          sys 0m0.044s
          real 0m49.616s
          user 0m9.740s
          sys 0m0.032s
          real 0m40.002s
          user 0m0.000s
          sys 0m0.340s
          GnuTLS - real 0m40.125s
          user 0m0.516s
          sys 0m0.512s
          real 0m1.657s
          user 0m0.772s
          sys 0m0.088s
          real 0m0.254s
          user 0m0.100s
          sys 0m0.064s
          OpenSSL - real 0m40.010s
          user 0m0.000s
          sys 0m0.340s
          real 0m0.236s
          user 0m0.064s
          sys 0m0.068s
          real 0m0.233s
          user 0m0.064s
          sys 0m0.072s

          Andris Mednis A note about OpenSSL being the fastest, followed by GnuTLS added to https://www.zabbix.com/documentation/3.0/manual/encryption#compiling_zabbix_with_encryption_support . CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (131) [D] This issue is to document resource usage and performance of various Zabbix components using different libraries. Memory usage for Windows agent (r56284) unencrypted proc_info[zabbix_agentd.unencrypted.exe,vmsize] 4.71 MB proc_info[zabbix_agentd.unencrypted.exe,wkset] 7.09 MB mbed TLS 1.3.11 proc_info[zabbix_agentd.mbedtls.exe,vmsize] 5.74 MB proc_info[zabbix_agentd.mbedtls.exe,wkset] 8.68 MB GnuTLS 3.3.13 proc_info[zabbix_agentd.gnutls.exe,vmsize] 7.25 MB proc_info[zabbix_agentd.gnutls.exe,wkset] 11.35 MB OpenSSL 1.0.2c proc_info[zabbix_agentd.openssl.exe,vmsize] 7.86 MB proc_info[zabbix_agentd.openssl.exe,wkset] 11.32 MB CPU usage and poller busyness for server using certificates (r56284) This test was performed by using a server with 10 pollers that monitored about 100 frequently polled items on 6 agents (with different libraries) using certificates. CPU used: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz. According to the graphs, the average CPU usage for pollers was: mbed TLS: 78% GnuTLS: 36% OpenSSL: 25% Average poller busyness (note that, unlike CPU usage, busyness also heavily depends on the agent): mbed TLS: 55% GnuTLS: 20% OpenSSL: 19% So OpenSSL seems to be most efficient (although, as pointed out by Andris Mednis , performance should ideally be tested on the same cipher suites - see (137) for the currently used ones) and we might wish to recommend OpenSSL or GnuTLS setups for performance (although it might be good to reference some official benchmark). CPU usage for agent using certificates (r56284) An interesting observation is that for mbed TLS and GnuTLS agents, the CPU usage is higher with mbed TLS server (the first case) than with GnuTLS and OpenSSL servers. The average CPU usage for the second and third cases is as follows: mbed TLS: 23% GnuTLS: 11% OpenSSL: 6% CPU usage and poller busyness for server using PSK (r56284) Same scenario as the test above, except that PSK are used instead of certificates. Average CPU usage for pollers: mbed TLS: 42.38% GnuTLS: 5.57% OpenSSL: 4.14% Average poller busyness: mbed TLS: 54.33% GnuTLS: 14.60% OpenSSL: 12.65% It can be seen that CPU usage is significantly lower, compared to certificates, especially in GnuTLS and OpenSSL cases. CPU usage for agent using PSK (r56284) An interesting observation here, too, is that for mbed TLS and GnuTLS agents, the CPU usage is higher with mbed TLS server (the first case). However, unlike certificates, CPU usage with GnuTLS server in noticeably higher than with OpenSSL server (again, as pointed out by Andris Mednis , it might simply be due to different ciphers being used). The average agent CPU usage for mbed TLS server: mbed TLS: 24.84% GnuTLS: 11.86% OpenSSL: 1.02% The average agent CPU usage for GnuTLS server: mbed TLS: 14.42% GnuTLS: 4.50% OpenSSL: 0.92% The average agent CPU usage for OpenSSL server: mbed TLS: 1.15% GnuTLS: 1.25% OpenSSL: 1.04% CPU usage and poller busyness for server without encryption (r56284) Same scenario, except that no encryption is used. Omitting the graphs, the average CPU usage for pollers, regardless of encryption library, was about 1.55%, and poller busyness was around 2.50%. CPU usage for agent without encryption (r56284) Similarly here, the average CPU usage for agent listeners, regardless of encryption library, was around 0.40%. Performance of zabbix_get (r56284) This test was performed on localhost (i.e., no network latency, same as the tests above) by using a modified zabbix_get that would query "agent.ping" item 1000 times. The tables below show how many seconds it took zabbix_get to perform these queries (rows are different flavors of zabbix_get, columns - zabbix_agentd). No encryption:   unencrypted mbed TLS GnuTLS OpenSSL unencrypted real 0m0.050s user 0m0.004s sys 0m0.032s - - - mbed TLS - - - - GnuTLS - - - - OpenSSL - - - - Using certificates:   unencrypted mbed TLS GnuTLS OpenSSL unencrypted - - - - mbed TLS - real 1m36.140s user 0m13.300s sys 0m0.024s real 0m52.301s user 0m12.984s sys 0m0.016s real 0m40.002s user 0m3.332s sys 0m0.040s GnuTLS - real 0m45.968s user 0m2.408s sys 0m0.820s real 0m5.419s user 0m2.884s sys 0m0.116s real 0m3.710s user 0m2.112s sys 0m0.100s OpenSSL - real 0m48.036s user 0m1.888s sys 0m0.000s real 0m4.801s user 0m1.548s sys 0m0.092s real 0m2.921s user 0m1.396s sys 0m0.072s Using PSK:   unencrypted mbed TLS GnuTLS OpenSSL unencrypted - - - - mbed TLS - real 1m32.923s user 0m10.108s sys 0m0.044s real 0m49.616s user 0m9.740s sys 0m0.032s real 0m40.002s user 0m0.000s sys 0m0.340s GnuTLS - real 0m40.125s user 0m0.516s sys 0m0.512s real 0m1.657s user 0m0.772s sys 0m0.088s real 0m0.254s user 0m0.100s sys 0m0.064s OpenSSL - real 0m40.010s user 0m0.000s sys 0m0.340s real 0m0.236s user 0m0.064s sys 0m0.068s real 0m0.233s user 0m0.064s sys 0m0.072s Andris Mednis A note about OpenSSL being the fastest, followed by GnuTLS added to https://www.zabbix.com/documentation/3.0/manual/encryption#compiling_zabbix_with_encryption_support . CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (132) [S] Suppose we have the following setup:

          1. mbed TLS proxy monitors Windows GnuTLS agent;
          2. GnuTLS proxy monitors Windows OpenSSL agent;
          3. OpenSSL proxy monitors Windows mbed TLS agent.

          Suppose also that agents do not like proxy certificates because certificate subject is wrong.

          In case #2, the "Z" on the host becomes red with the message "Received empty response from Zabbix Agent at [192.168.1.2]. Assuming that agent dropped connection because of access permissions."

          In cases #1 and #3, the "Z" stays green and all queried items become not supported with messages "Get value from agent failed: ssl_read() failed: SSL - The peer notified us that the connection is going to be closed" and "Get value from agent failed: connection closed during read", respectively.

          If possible, we should do something about it, so that in cases #1 and #3 the "Z" becomes red, too.

          Aleksandrs Saveljevs This looks like a bigger problem. A server monitoring an agent that is currently unavailable makes items go unsupported, too:

           24808:20151023:140001.612 item "zabbix:proc_info[zabbix_agentd.openssl.exe,vmsize]" became not supported: Get value from agent failed: cannot connect to [[192.168.1.2]:23051]: [4] Interrupted system call
           24806:20151023:140006.613 item "zabbix:proc_info[zabbix_agentd.mbedtls.exe,vmsize]" became not supported: Get value from agent failed: cannot connect to [[192.168.1.2]:23051]: [4] Interrupted system call
          

          Andris Mednis RESOLVED in r57110.

          Aleksandrs Saveljevs Some stylistic suggestions (in particular, hiding zbx_alarm_on() and zbx_alarm_off() functions from Windows in common.h) are available in r57347.

          Aleksandrs Saveljevs Added "zbx_timed_out" handling around ZBX_TCP_READ() and ZBX_TCP_WRITE() in zbx_tls_read() and zbx_tls_write() for Windows in r57354, r57356, and r57357. Please take a look.

          Andris Mednis Added "zbx_timed_out" handling in zbx_tls_close() for mbed TLS and GnuTLS in 57386.

          Aleksandrs Saveljevs Looks good. May be nice to add "zbx_timed_out" handling to zbx_tls_accept, too.

          Andris Mednis RESOLVED in r57420 and r57421.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (132) [S] Suppose we have the following setup: mbed TLS proxy monitors Windows GnuTLS agent; GnuTLS proxy monitors Windows OpenSSL agent; OpenSSL proxy monitors Windows mbed TLS agent. Suppose also that agents do not like proxy certificates because certificate subject is wrong. In case #2, the "Z" on the host becomes red with the message "Received empty response from Zabbix Agent at [192.168.1.2]. Assuming that agent dropped connection because of access permissions." In cases #1 and #3, the "Z" stays green and all queried items become not supported with messages "Get value from agent failed: ssl_read() failed: SSL - The peer notified us that the connection is going to be closed" and "Get value from agent failed: connection closed during read", respectively. If possible, we should do something about it, so that in cases #1 and #3 the "Z" becomes red, too. Aleksandrs Saveljevs This looks like a bigger problem. A server monitoring an agent that is currently unavailable makes items go unsupported, too: 24808:20151023:140001.612 item "zabbix:proc_info[zabbix_agentd.openssl.exe,vmsize]" became not supported: Get value from agent failed: cannot connect to [[192.168.1.2]:23051]: [4] Interrupted system call 24806:20151023:140006.613 item "zabbix:proc_info[zabbix_agentd.mbedtls.exe,vmsize]" became not supported: Get value from agent failed: cannot connect to [[192.168.1.2]:23051]: [4] Interrupted system call Andris Mednis RESOLVED in r57110. Aleksandrs Saveljevs Some stylistic suggestions (in particular, hiding zbx_alarm_on() and zbx_alarm_off() functions from Windows in common.h) are available in r57347. Aleksandrs Saveljevs Added "zbx_timed_out" handling around ZBX_TCP_READ() and ZBX_TCP_WRITE() in zbx_tls_read() and zbx_tls_write() for Windows in r57354, r57356, and r57357. Please take a look. Andris Mednis Added "zbx_timed_out" handling in zbx_tls_close() for mbed TLS and GnuTLS in 57386. Aleksandrs Saveljevs Looks good. May be nice to add "zbx_timed_out" handling to zbx_tls_accept, too. Andris Mednis RESOLVED in r57420 and r57421. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (133) [G] On Windows, configure an agent with a correct PSK. Everything works fine. Now, corrupt the PSK file a bit (e.g., by removing the last digit). The agent does not start, which is correct. However, it is not possible to start the agent anymore, even when PSK file is fixed back. Windows restart helps, but it is not a solution.

          Sandis Neilands __zbx_mutex_lock() doesn't handle WAIT_ABANDONED result from WaitForSingleObject() but instead exits. For me this happens when zabbix_agent tries to LOCK_LOG.

          More info on mutexes in Windows.

          1. why this happens only when starting zabbix_agent as a service.

          • Starting from Windows Vista/Server 2008 services run in separate session (session 0). Each session has its own mutex namespace. This explains why we still can't start agent as a service after successfully launching it as console application. But why we don't have the same problem with agent as console application?
          • We have the same problem with agent as console application. We just need another agent to still remain in the background in the same session (e.g. also in console) for the mutex to remain in system. Otherwise Windows garbage collects the unreferenced mutex and when we start the agent again we get a new one rather the abandoned one.

          2. why the lock wasn't released properly in case of erroneous PSK file.

          Race condition: one listener thread calls exit() while another is still in the logging function and hasn't released the mutex yet.

          Sandis Neilands CLOSED. Opened ZBX-10033, ZBX-10034 to deal with this sub-issue. We must absolutely fix at least ZBX-10033 as otherwise with the current implementation invalid PSK file will stop all agents in the session. However the issue is generic and not isolated to encryption feature.

          Show
          Aleksandrs Saveljevs added a comment - - edited (133) [G] On Windows, configure an agent with a correct PSK. Everything works fine. Now, corrupt the PSK file a bit (e.g., by removing the last digit). The agent does not start, which is correct. However, it is not possible to start the agent anymore, even when PSK file is fixed back. Windows restart helps, but it is not a solution. Sandis Neilands __zbx_mutex_lock() doesn't handle WAIT_ABANDONED result from WaitForSingleObject() but instead exits. For me this happens when zabbix_agent tries to LOCK_LOG. More info on mutexes in Windows . 1. why this happens only when starting zabbix_agent as a service. Starting from Windows Vista/Server 2008 services run in separate session (session 0). Each session has its own mutex namespace. This explains why we still can't start agent as a service after successfully launching it as console application. But why we don't have the same problem with agent as console application? We have the same problem with agent as console application. We just need another agent to still remain in the background in the same session (e.g. also in console) for the mutex to remain in system. Otherwise Windows garbage collects the unreferenced mutex and when we start the agent again we get a new one rather the abandoned one. 2. why the lock wasn't released properly in case of erroneous PSK file. Race condition: one listener thread calls exit() while another is still in the logging function and hasn't released the mutex yet. Sandis Neilands CLOSED. Opened ZBX-10033 , ZBX-10034 to deal with this sub-issue. We must absolutely fix at least ZBX-10033 as otherwise with the current implementation invalid PSK file will stop all agents in the session. However the issue is generic and not isolated to encryption feature.
          Hide
          Andris Mednis added a comment - - edited

          (134) [D] zabbix_sender man-page says about "-c, --config config-file" option that "Only parameters Hostname, ServerActive and SourceIP are supported.". TLS parameters should be added for "-c, --config config-file" option.

          Sandis Neilands RESOLVED in r56409.

          Andris Mednis CLOSED.

          Show
          Andris Mednis added a comment - - edited (134) [D] zabbix_sender man-page says about "-c, --config config-file" option that "Only parameters Hostname, ServerActive and SourceIP are supported.". TLS parameters should be added for "-c, --config config-file" option. Sandis Neilands RESOLVED in r56409. Andris Mednis CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (135) [G] Suppose we have the following CA hierarchy: "Root CA" -> "Intermediate CA" -> "Zabbix server". Zabbix server sends its and intermediate CA's certificates. Suppose we specify "Intermediate CA" in agent's configuration file as a certificate authority. A GnuTLS agent works well. However, mbed TLS and OpenSSL agents reject the server certificate:

          • mbed TLS:
              2464:20151026:101017.484 active check configuration update from [192.168.1.2:20004] started to fail (TCP successful, cannot establish TLS to [[192.168.1.2]:20004]: invalid peer certificate: self-signed or not signed by trusted CA)
              1632:20151026:101031.578 failed to accept an incoming connection: from 192.168.1.2: invalid peer certificate: self-signed or not signed by trusted CA
            
          • OpenSSL:
              1652:20151026:101023.390 active check configuration update from [192.168.1.2:20003] started to fail (TCP successful, cannot establish TLS to [[192.168.1.2]:20003]: SSL_connect() returned SSL_ERROR_SSL: file .\ssl\s3_clnt.c line 1205: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed: TLS write fatal alert "unknown CA")
              3900:20151026:101036.531 failed to accept an incoming connection: from 192.168.1.2: TLS handshake returned error code 1: file .\ssl\s3_srvr.c line 3195: error:14089086:SSL routines:ssl3_get_client_certificate:certificate verify failed: TLS write fatal alert "unknown CA"
            

          Andris Mednis It seems that rfc5280 allows trusting Intermediate CA without Root CA but does not require it.
          Sectiion "3.2. Certification Paths and Trust" says:

          (a) Certification paths start with a public key of a CA in a
          user's own domain, or with the public key of the top of a
          hierarchy. Starting with the public key of a CA in a user's
          own domain has certain advantages. In some environments, the
          local domain is the most trusted.

          Sectiion "6. Certification Path Validation" says:

          The selection of a trust anchor is a matter of policy: it could be
          the top CA in a hierarchical PKI, the CA that issued the verifier's
          own certificate(s), or any other CA in a network PKI. The path
          validation procedure is the same regardless of the choice of trust
          anchor. In addition, different applications may rely on different
          trust anchors, or may accept paths that begin with any of a set of
          trust anchors.

          Can an intermediate CA be trusted like a self-signed root CA? discusses this situation and workarounds.

          OpenSSL in version 1.0.2a introduced flag X509_V_FLAG_PARTIAL_CHAIN. In OpenSSL 1.1.0 (currently in alpha relaease) a new option for verify command was added:

          -partial_chain
          Allow verification to succeed even if a complete chain cannot be built to a self-signed trust-anchor, provided it is possible to construct a chain to a trusted certificate that might not be self-signed.

          Long discussions in OpenLDAP X509_V_FLAG_PARTIAL_CHAIN support in OpenLDAP and cURL [PATCH] openssl: allow partial trust chains maillists show that not everybody agrees with accepting Intermediate CA certificate as a trust anchor.

          Currently Zabbix relies on default behaviour of a underlying crypto toolkit.
          If it is not sufficient then it should be configurable in Zabbix - to accept Intermediate CA certificate as a trust anchor or not.

          Aleksandrs Saveljevs Great research! CLOSED for now.

          Show
          Aleksandrs Saveljevs added a comment - - edited (135) [G] Suppose we have the following CA hierarchy: "Root CA" -> "Intermediate CA" -> "Zabbix server". Zabbix server sends its and intermediate CA's certificates. Suppose we specify "Intermediate CA" in agent's configuration file as a certificate authority. A GnuTLS agent works well. However, mbed TLS and OpenSSL agents reject the server certificate: mbed TLS: 2464:20151026:101017.484 active check configuration update from [192.168.1.2:20004] started to fail (TCP successful, cannot establish TLS to [[192.168.1.2]:20004]: invalid peer certificate: self-signed or not signed by trusted CA) 1632:20151026:101031.578 failed to accept an incoming connection: from 192.168.1.2: invalid peer certificate: self-signed or not signed by trusted CA OpenSSL: 1652:20151026:101023.390 active check configuration update from [192.168.1.2:20003] started to fail (TCP successful, cannot establish TLS to [[192.168.1.2]:20003]: SSL_connect() returned SSL_ERROR_SSL: file .\ssl\s3_clnt.c line 1205: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed: TLS write fatal alert "unknown CA") 3900:20151026:101036.531 failed to accept an incoming connection: from 192.168.1.2: TLS handshake returned error code 1: file .\ssl\s3_srvr.c line 3195: error:14089086:SSL routines:ssl3_get_client_certificate:certificate verify failed: TLS write fatal alert "unknown CA" Andris Mednis It seems that rfc5280 allows trusting Intermediate CA without Root CA but does not require it. Sectiion "3.2. Certification Paths and Trust" says: (a) Certification paths start with a public key of a CA in a user's own domain, or with the public key of the top of a hierarchy. Starting with the public key of a CA in a user's own domain has certain advantages. In some environments, the local domain is the most trusted. Sectiion "6. Certification Path Validation" says: The selection of a trust anchor is a matter of policy: it could be the top CA in a hierarchical PKI, the CA that issued the verifier's own certificate(s), or any other CA in a network PKI. The path validation procedure is the same regardless of the choice of trust anchor. In addition, different applications may rely on different trust anchors, or may accept paths that begin with any of a set of trust anchors. Can an intermediate CA be trusted like a self-signed root CA? discusses this situation and workarounds. OpenSSL in version 1.0.2a introduced flag X509_V_FLAG_PARTIAL_CHAIN. In OpenSSL 1.1.0 (currently in alpha relaease) a new option for verify command was added: -partial_chain Allow verification to succeed even if a complete chain cannot be built to a self-signed trust-anchor, provided it is possible to construct a chain to a trusted certificate that might not be self-signed. Long discussions in OpenLDAP X509_V_FLAG_PARTIAL_CHAIN support in OpenLDAP and cURL [PATCH] openssl: allow partial trust chains maillists show that not everybody agrees with accepting Intermediate CA certificate as a trust anchor. Currently Zabbix relies on default behaviour of a underlying crypto toolkit. If it is not sufficient then it should be configurable in Zabbix - to accept Intermediate CA certificate as a trust anchor or not. Aleksandrs Saveljevs Great research! CLOSED for now.
          Hide
          richlv added a comment - - edited

          (136) really minor, but for both get and sender it would be useful to specify that for --tls-connect 'unencrypted' is the default value
          this would be useful in both the manpages & the output of --help

          Sandis Neilands RESOLVED in r56408.

          <richlv> looks good to me, thanks
          CLOSED

          Show
          richlv added a comment - - edited (136) really minor, but for both get and sender it would be useful to specify that for --tls-connect 'unencrypted' is the default value this would be useful in both the manpages & the output of --help Sandis Neilands RESOLVED in r56408. < richlv > looks good to me, thanks CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (137) This subissue is to document which cipher suites are preferred (in library native terms) and which cipher suites are actually chosen (in my environment, as of r56284, in Wireshark terms). This effort complements performance documentation in (131).

          Cipher suite priorities:

          • PolarSSL 1.3.9
            • certificates
              TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256
              TLS-ECDHE-RSA-WITH-AES-128-CBC-SHA256
              TLS-ECDHE-RSA-WITH-AES-128-CBC-SHA
              TLS-RSA-WITH-AES-128-GCM-SHA256
              TLS-RSA-WITH-AES-128-CBC-SHA256
              TLS-RSA-WITH-AES-128-CBC-SHA
              
            • PSK
              TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256
              TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA
              TLS-PSK-WITH-AES-128-GCM-SHA256
              TLS-PSK-WITH-AES-128-CBC-SHA256
              TLS-PSK-WITH-AES-128-CBC-SHA
              
          • GnuTLS 3.1.18
            • certificates
              TLS_ECDHE_RSA_AES_128_GCM_SHA256
              TLS_ECDHE_RSA_AES_128_CBC_SHA256
              TLS_ECDHE_RSA_AES_128_CBC_SHA1
              TLS_RSA_AES_128_GCM_SHA256
              TLS_RSA_AES_128_CBC_SHA256
              TLS_RSA_AES_128_CBC_SHA1
              
            • PSK
              TLS_ECDHE_PSK_AES_128_CBC_SHA256
              TLS_ECDHE_PSK_AES_128_CBC_SHA1
              TLS_PSK_AES_128_GCM_SHA256
              TLS_PSK_AES_128_CBC_SHA256
              TLS_PSK_AES_128_CBC_SHA1
              
          • OpenSSL 1.0.2c
            • certificates
              ECDHE-RSA-AES128-GCM-SHA256
              ECDHE-RSA-AES128-SHA256
              ECDHE-RSA-AES128-SHA
              AES128-GCM-SHA256
              AES128-SHA256
              AES128-SHA
              
            • PSK
              PSK-AES128-CBC-SHA
              

          Used cipher suites (rows - TLS client, columns - TLS server):

          Using certificates:
            unencrypted mbed TLS GnuTLS OpenSSL
          unencrypted - - - -
          mbed TLS - TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256
          GnuTLS - TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256
          OpenSSL - TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256

          NOTE: Striked out values in OpenSSL column pertain to r56284, before (138) fix. New values are after (138) fix.

          Using PSK:
            unencrypted mbed TLS GnuTLS OpenSSL
          unencrypted - - - -
          mbed TLS - TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256 TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256 TLS-PSK-WITH-AES-128-CBC-SHA
          GnuTLS - TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256 TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256 TLS-PSK-WITH-AES-128-CBC-SHA
          OpenSSL - TLS-PSK-WITH-AES-128-CBC-SHA TLS-PSK-WITH-AES-128-CBC-SHA TLS-PSK-WITH-AES-128-CBC-SHA

          Andris Mednis Documented at https://www.zabbix.com/documentation/3.0/manual/encryption#ciphersuites .

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (137) This subissue is to document which cipher suites are preferred (in library native terms) and which cipher suites are actually chosen (in my environment, as of r56284, in Wireshark terms). This effort complements performance documentation in (131). Cipher suite priorities: PolarSSL 1.3.9 certificates TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-CBC-SHA256 TLS-ECDHE-RSA-WITH-AES-128-CBC-SHA TLS-RSA-WITH-AES-128-GCM-SHA256 TLS-RSA-WITH-AES-128-CBC-SHA256 TLS-RSA-WITH-AES-128-CBC-SHA PSK TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256 TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA TLS-PSK-WITH-AES-128-GCM-SHA256 TLS-PSK-WITH-AES-128-CBC-SHA256 TLS-PSK-WITH-AES-128-CBC-SHA GnuTLS 3.1.18 certificates TLS_ECDHE_RSA_AES_128_GCM_SHA256 TLS_ECDHE_RSA_AES_128_CBC_SHA256 TLS_ECDHE_RSA_AES_128_CBC_SHA1 TLS_RSA_AES_128_GCM_SHA256 TLS_RSA_AES_128_CBC_SHA256 TLS_RSA_AES_128_CBC_SHA1 PSK TLS_ECDHE_PSK_AES_128_CBC_SHA256 TLS_ECDHE_PSK_AES_128_CBC_SHA1 TLS_PSK_AES_128_GCM_SHA256 TLS_PSK_AES_128_CBC_SHA256 TLS_PSK_AES_128_CBC_SHA1 OpenSSL 1.0.2c certificates ECDHE-RSA-AES128-GCM-SHA256 ECDHE-RSA-AES128-SHA256 ECDHE-RSA-AES128-SHA AES128-GCM-SHA256 AES128-SHA256 AES128-SHA PSK PSK-AES128-CBC-SHA Used cipher suites (rows - TLS client, columns - TLS server): Using certificates:   unencrypted mbed TLS GnuTLS OpenSSL unencrypted - - - - mbed TLS - TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 GnuTLS - TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 OpenSSL - TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 TLS-RSA-WITH-AES-128-GCM-SHA256 TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256 NOTE: Striked out values in OpenSSL column pertain to r56284, before (138) fix. New values are after (138) fix. Using PSK:   unencrypted mbed TLS GnuTLS OpenSSL unencrypted - - - - mbed TLS - TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256 TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256 TLS-PSK-WITH-AES-128-CBC-SHA GnuTLS - TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256 TLS-ECDHE-PSK-WITH-AES-128-CBC-SHA256 TLS-PSK-WITH-AES-128-CBC-SHA OpenSSL - TLS-PSK-WITH-AES-128-CBC-SHA TLS-PSK-WITH-AES-128-CBC-SHA TLS-PSK-WITH-AES-128-CBC-SHA Andris Mednis Documented at https://www.zabbix.com/documentation/3.0/manual/encryption#ciphersuites . Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (138) It can be seen in the current revision of (137) above that cipher suite priority list for certificates for OpenSSL is the following:

          ECDHE-RSA-AES128-GCM-SHA256
          ECDHE-RSA-AES128-SHA256
          ECDHE-RSA-AES128-SHA
          AES128-GCM-SHA256
          AES128-SHA256
          AES128-SHA
          

          However, when OpenSSL acts as a TLS server, it chooses AES128-GCM-SHA256, the non-ECDHE variant, even though TLS clients offer ECDHE options with a higher priority.

          Andris Mednis RESOLVED in r57320.

          Aleksandrs Saveljevs Please take a look at some stylistic fixes in r57400.

          Andris Mednis Thanks! Accepted. Some grammar change in r57406.

          Aleksandrs Saveljevs Grammar change accepted. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (138) It can be seen in the current revision of (137) above that cipher suite priority list for certificates for OpenSSL is the following: ECDHE-RSA-AES128-GCM-SHA256 ECDHE-RSA-AES128-SHA256 ECDHE-RSA-AES128-SHA AES128-GCM-SHA256 AES128-SHA256 AES128-SHA However, when OpenSSL acts as a TLS server, it chooses AES128-GCM-SHA256, the non-ECDHE variant, even though TLS clients offer ECDHE options with a higher priority. Andris Mednis RESOLVED in r57320. Aleksandrs Saveljevs Please take a look at some stylistic fixes in r57400. Andris Mednis Thanks! Accepted. Some grammar change in r57406. Aleksandrs Saveljevs Grammar change accepted. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (139) [G] Suppose we use mbed TLS zabbix_get to obtain a value from an mbed TLS agent. It can be seen in Wireshark that there is an RST packet at the end, which there shouldn't be. There are also RST packets with different library combinations, but some combinations are OK (e.g., GnuTLS zabbix_get with GnuTLS agent).

          Andris Mednis TLS 1.2 standard http://tools.ietf.org/html/rfc5246, section 7.2.1. describes Closure Alerts and how TLS connection is to be closed.

          In my understanding it means:

          • shutting down TLS connection is not like closing underlying TCP connection,
          • after TLS connection is shut down its underlying TCP connection may be reused for other purposes (e.g. for opening a new TLS connection)
            or closed.
          • either party can initiate TLS shutdown,
          • the party initiating TLS shutdown sends "close_notify" alert.
          • the other party must respond with its own "close_notify" alert,
            (unless some other fatal alert has been received) and close TLS connection.
            This is different from closing TCP - TLS is not designed to allow to close only, for example, transmitting of data and continue receiving of data.
          • when a party has initiated TLS shutdown and sent "close_notify" alert it may choose to wait for response "close_notify" or close the TCP connection
            without waiting for response.

          Currently Zabbix sends "close_notify" alert, does not wait for response "close_notify" and closes TCP connection.
          This may lead to RST packets in response to other party's "close_notify" alert and TCP FIN/ACK packets.

          Similar discussion is at http://fixunix.com/tcp-ip/488997-ssl-tcp-connection-termination-results-rst-print.html

          Seems to me there is nothing to be fixed with RST packets.

          Aleksandrs Saveljevs According to http://linux.die.net/man/3/gnutls_bye and https://www.openssl.org/docs/manmaster/ssl/SSL_shutdown.html , we do a clean shutdown with GnuTLS by calling gnutls_bye() with GNUTLS_SHUT_RDWR and a quick shutdown with OpenSSL by calling SSL_shutdown() only once. Do we want to be consistent?

          Andris Mednis Thanks for finding it! As we currently do not reuse TCP connection there is no gain in doing bidirectional TLS shutdown. I replaced it with unidirectional shutdown for GnuTLS. Now there are fewer packets to be transmitted but some of them are RST ones. RESOLVED in r57395.

          Aleksandrs Saveljevs CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (139) [G] Suppose we use mbed TLS zabbix_get to obtain a value from an mbed TLS agent. It can be seen in Wireshark that there is an RST packet at the end, which there shouldn't be. There are also RST packets with different library combinations, but some combinations are OK (e.g., GnuTLS zabbix_get with GnuTLS agent). Andris Mednis TLS 1.2 standard http://tools.ietf.org/html/rfc5246 , section 7.2.1. describes Closure Alerts and how TLS connection is to be closed. In my understanding it means: shutting down TLS connection is not like closing underlying TCP connection, after TLS connection is shut down its underlying TCP connection may be reused for other purposes (e.g. for opening a new TLS connection) or closed. either party can initiate TLS shutdown, the party initiating TLS shutdown sends "close_notify" alert. the other party must respond with its own "close_notify" alert, (unless some other fatal alert has been received) and close TLS connection. This is different from closing TCP - TLS is not designed to allow to close only, for example, transmitting of data and continue receiving of data. when a party has initiated TLS shutdown and sent "close_notify" alert it may choose to wait for response "close_notify" or close the TCP connection without waiting for response. Currently Zabbix sends "close_notify" alert, does not wait for response "close_notify" and closes TCP connection. This may lead to RST packets in response to other party's "close_notify" alert and TCP FIN/ACK packets. Similar discussion is at http://fixunix.com/tcp-ip/488997-ssl-tcp-connection-termination-results-rst-print.html Seems to me there is nothing to be fixed with RST packets. Aleksandrs Saveljevs According to http://linux.die.net/man/3/gnutls_bye and https://www.openssl.org/docs/manmaster/ssl/SSL_shutdown.html , we do a clean shutdown with GnuTLS by calling gnutls_bye() with GNUTLS_SHUT_RDWR and a quick shutdown with OpenSSL by calling SSL_shutdown() only once. Do we want to be consistent? Andris Mednis Thanks for finding it! As we currently do not reuse TCP connection there is no gain in doing bidirectional TLS shutdown. I replaced it with unidirectional shutdown for GnuTLS. Now there are fewer packets to be transmitted but some of them are RST ones. RESOLVED in r57395. Aleksandrs Saveljevs CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (140) [S] Suppose we have the following CA hierarchy: "Root CA" -> "Intermediate CA" -> server. Suppose also we specify server certificate and "Root CA" certificate in server's TLSCertFile, without "Intermediate CA" certificate (in other words, TLSCertFile has a broken certificate chain). A GnuTLS server refuses to start with the following error message:

           18871:20151103:122240.379 cannot load certificate or private key from file ".../server.crt" or ".../server.key": 2: (unknown error code)
          

          This is because we have the following code in tls.c:

          if (GNUTLS_E_SUCCESS != gnutls_certificate_set_x509_key_file(my_cert_creds, CONFIG_TLS_CERT_FILE,
          		CONFIG_TLS_KEY_FILE, GNUTLS_X509_FMT_PEM))
          {
          	zabbix_log(LOG_LEVEL_CRIT, "cannot load certificate or private key from file \"%s\" or \"%s\":"
          			" %d: %s", CONFIG_TLS_CERT_FILE, CONFIG_TLS_KEY_FILE, res,
          			gnutls_strerror(res));
          	zbx_tls_free();
          	exit(EXIT_FAILURE);
          }
          

          Variable "res" and its decoded error message is printed, but "res" itself is not set.

          While at it, we might wish to think about the following: in the above scenario, mbed TLS and OpenSSL servers start just fine, but their connections fail. Maybe we can make mbed TLS and OpenSSL servers validate their own certificate chain on startup, too?

          Andris Mednis Thanks for finding! Setting of 'res' variable RESOLVED in r57388.

          Aleksandrs Saveljevs Looks good. The error is now the following:

           17786:20151229:102913.798 cannot load certificate or private key from file ".../server.crt" or ".../server.key": -324: The provided X.509 certificate list is not sorted (in subject to issuer order)
          

          Have you investigated whether the same validation can be performed by mbed TLS and OpenSSL?

          Andris Mednis In OpenSSL we could validate that our private key matches our certificate with SSL_CTX_check_private_key(). I did not find easy way to check own certificate chain on startup in mbed TLS and OpenSSL but it should be doable.

          Andris Mednis Solved for GnuTLS. Fixing for mbedTLS and OpenSSL is optional and of low priority, CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (140) [S] Suppose we have the following CA hierarchy: "Root CA" -> "Intermediate CA" -> server. Suppose also we specify server certificate and "Root CA" certificate in server's TLSCertFile, without "Intermediate CA" certificate (in other words, TLSCertFile has a broken certificate chain). A GnuTLS server refuses to start with the following error message: 18871:20151103:122240.379 cannot load certificate or private key from file ".../server.crt" or ".../server.key": 2: (unknown error code) This is because we have the following code in tls.c: if (GNUTLS_E_SUCCESS != gnutls_certificate_set_x509_key_file(my_cert_creds, CONFIG_TLS_CERT_FILE, CONFIG_TLS_KEY_FILE, GNUTLS_X509_FMT_PEM)) { zabbix_log(LOG_LEVEL_CRIT, "cannot load certificate or private key from file \" %s\ " or \" %s\ ":" " %d: %s" , CONFIG_TLS_CERT_FILE, CONFIG_TLS_KEY_FILE, res, gnutls_strerror(res)); zbx_tls_free(); exit(EXIT_FAILURE); } Variable "res" and its decoded error message is printed, but "res" itself is not set. While at it, we might wish to think about the following: in the above scenario, mbed TLS and OpenSSL servers start just fine, but their connections fail. Maybe we can make mbed TLS and OpenSSL servers validate their own certificate chain on startup, too? Andris Mednis Thanks for finding! Setting of 'res' variable RESOLVED in r57388. Aleksandrs Saveljevs Looks good. The error is now the following: 17786:20151229:102913.798 cannot load certificate or private key from file ".../server.crt" or ".../server.key": -324: The provided X.509 certificate list is not sorted (in subject to issuer order) Have you investigated whether the same validation can be performed by mbed TLS and OpenSSL? Andris Mednis In OpenSSL we could validate that our private key matches our certificate with SSL_CTX_check_private_key(). I did not find easy way to check own certificate chain on startup in mbed TLS and OpenSSL but it should be doable. Andris Mednis Solved for GnuTLS. Fixing for mbedTLS and OpenSSL is optional and of low priority, CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (141) [S] In function process_mass_data(), we loop through all received values. Whenever a value's host changes, we do zbx_tls_get_attr_cert() and zbx_tls_get_attr_psk(). However, these attributes are constant per connection - it is enough to call these functions to get the attributes only once.

          Glebs Ivanovskis RESOLVED in revisions 56627, 56628.

          Aleksandrs Saveljevs There was a warning regarding (... && ... || ... && ...):

          proxy.c:2201:50: warning: '&&' within '||' [-Wlogical-op-parentheses]
                                  (ZBX_TCP_SEC_TLS_PSK == sock->connection_type && SUCCEED != zbx_tls_get_attr_psk(sock, &attr) ||
                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~
          proxy.c:2201:50: note: place parentheses around the '&&' expression to silence this warning
                                  (ZBX_TCP_SEC_TLS_PSK == sock->connection_type && SUCCEED != zbx_tls_get_attr_psk(sock, &attr) ||
                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          proxy.c:2202:50: warning: '&&' within '||' [-Wlogical-op-parentheses]
                                  ZBX_TCP_SEC_TLS_CERT == sock->connection_type && SUCCEED != zbx_tls_get_attr_cert(sock, &attr)))
                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          proxy.c:2202:50: note: place parentheses around the '&&' expression to silence this warning
                                  ZBX_TCP_SEC_TLS_CERT == sock->connection_type && SUCCEED != zbx_tls_get_attr_cert(sock, &attr)))
                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          2 warnings generated.
          

          Aleksandrs Saveljevs Fixed that in r56723. Please take a look. Ideally, we should also "goto out" instead of "return", but that would require three extra lines to introduce the label, because it is only used with TLS. Decided not to.

          Andris Mednis r56723 looks good. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (141) [S] In function process_mass_data(), we loop through all received values. Whenever a value's host changes, we do zbx_tls_get_attr_cert() and zbx_tls_get_attr_psk(). However, these attributes are constant per connection - it is enough to call these functions to get the attributes only once. Glebs Ivanovskis RESOLVED in revisions 56627, 56628. Aleksandrs Saveljevs There was a warning regarding (... && ... || ... && ...): proxy.c:2201:50: warning: '&&' within '||' [-Wlogical-op-parentheses] (ZBX_TCP_SEC_TLS_PSK == sock->connection_type && SUCCEED != zbx_tls_get_attr_psk(sock, &attr) || ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~ proxy.c:2201:50: note: place parentheses around the '&&' expression to silence this warning (ZBX_TCP_SEC_TLS_PSK == sock->connection_type && SUCCEED != zbx_tls_get_attr_psk(sock, &attr) || ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ proxy.c:2202:50: warning: '&&' within '||' [-Wlogical-op-parentheses] ZBX_TCP_SEC_TLS_CERT == sock->connection_type && SUCCEED != zbx_tls_get_attr_cert(sock, &attr))) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ proxy.c:2202:50: note: place parentheses around the '&&' expression to silence this warning ZBX_TCP_SEC_TLS_CERT == sock->connection_type && SUCCEED != zbx_tls_get_attr_cert(sock, &attr))) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated. Aleksandrs Saveljevs Fixed that in r56723. Please take a look. Ideally, we should also "goto out" instead of "return", but that would require three extra lines to introduce the label, because it is only used with TLS. Decided not to. Andris Mednis r56723 looks good. CLOSED.
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (142) [F] Suppose we try to execute a global script from our frontend. Connection from frontend to server is successful, but connection from server to agent is not. We then get the following error message:

          The header complains about not being able to connect to server, but this is wrong.

          Aleksandrs Saveljevs It seems that the frontend did not distinguish between reasons of failure since the beginning, except that in r14720 the error message was more generic: "SCRIPT ERROR".

          Aleksandrs Saveljevs Replaced the error message with a more generic "Cannot execute script", similar to "Cannot add item", in r56812. If the frontend cannot connect to Zabbix server, there is another, more detailed error message for that. RESOLVED.

          Oleg Egorov CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (142) [F] Suppose we try to execute a global script from our frontend. Connection from frontend to server is successful, but connection from server to agent is not. We then get the following error message: The header complains about not being able to connect to server, but this is wrong. Aleksandrs Saveljevs It seems that the frontend did not distinguish between reasons of failure since the beginning, except that in r14720 the error message was more generic: "SCRIPT ERROR". Aleksandrs Saveljevs Replaced the error message with a more generic "Cannot execute script", similar to "Cannot add item", in r56812. If the frontend cannot connect to Zabbix server, there is another, more detailed error message for that. RESOLVED. Oleg Egorov CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (143) [S] This is a continuation of the discussion started in (115):

          Aleksandrs Saveljevs By the way, what are DCcheck_proxy_permissions() and DCget_psk_by_identity() doing in tls_tcp_active.h? Wouldn't include/dbcache.h be a better place for them? We can open a new subissue for that. REOPENED.

          Sandis Neilands You raise a valid question in your second point. Indeed having those two functions defined in libs/zbxdbcache/dbconfig.c but declared in libs/zbxcrypto/tls_tcp_active.h probably indicates some sort of compromise solution earlier in the development.

          It might have something to do with the following comment in libs/zbxcrypto/tls.c.

          /* Pointer to DCget_psk_by_identity() initialized at runtime. This is a workaround for linking. */
          /* Server and proxy link with src/libs/zbxdbcache/dbconfig.o where DCget_psk_by_identity() resides */
          /* but other components (e.g. agent) do not link dbconfig.o. */
          

          I suggest opening a new sub-issue or even ZBX for this problem...

          Sandis Neilands Another nice-to-have re-factoring related to this is to conditionally compile in definitions, declarations of encryption stuff only if we are compiling with encryption enabled. Currently some functions/data structures are compiled in in plain build but they are not used anywhere.

          Aleksandrs Saveljevs Moved DCget_psk_by_identity() and DCcheck_proxy_permissions() from src/libs/zbxcrypto/tls_tcp_active.h to include/dbcache.h in r59043. RESOLVED.

          Andris Mednis Reviewed, looks good. CLOSED

          Show
          Aleksandrs Saveljevs added a comment - - edited (143) [S] This is a continuation of the discussion started in (115): Aleksandrs Saveljevs By the way, what are DCcheck_proxy_permissions() and DCget_psk_by_identity() doing in tls_tcp_active.h? Wouldn't include/dbcache.h be a better place for them? We can open a new subissue for that. REOPENED. Sandis Neilands You raise a valid question in your second point. Indeed having those two functions defined in libs/zbxdbcache/dbconfig.c but declared in libs/zbxcrypto/tls_tcp_active.h probably indicates some sort of compromise solution earlier in the development. It might have something to do with the following comment in libs/zbxcrypto/tls.c . /* Pointer to DCget_psk_by_identity() initialized at runtime. This is a workaround for linking. */ /* Server and proxy link with src/libs/zbxdbcache/dbconfig.o where DCget_psk_by_identity() resides */ /* but other components (e.g. agent) do not link dbconfig.o. */ I suggest opening a new sub-issue or even ZBX for this problem... Sandis Neilands Another nice-to-have re-factoring related to this is to conditionally compile in definitions, declarations of encryption stuff only if we are compiling with encryption enabled. Currently some functions/data structures are compiled in in plain build but they are not used anywhere. Aleksandrs Saveljevs Moved DCget_psk_by_identity() and DCcheck_proxy_permissions() from src/libs/zbxcrypto/tls_tcp_active.h to include/dbcache.h in r59043. RESOLVED. Andris Mednis Reviewed, looks good. CLOSED
          Hide
          Oleg Egorov added a comment - - edited

          (144) [F]
          host.update

          Request

          {
              "hostid": 10379
          }
          

          Result

          Cannot update host encryption settings. Connection settings for both directions should be specified.
          

          RESOLVED IN r56724

          Ivo Kurzemnieks CLOSED

          Show
          Oleg Egorov added a comment - - edited (144) [F] host.update Request { "hostid": 10379 } Result Cannot update host encryption settings. Connection settings for both directions should be specified. RESOLVED IN r56724 Ivo Kurzemnieks CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          List of bugs and issues that have already been reported for encryption separately:

          • ZBX-10083 (3.0 Alpha 4 - FreeBSD 10.1 - Agent build fails with TLS support)
          • ZBX-10086 (cannot disable core dump at server startup)
          • ZBXNEXT-3047 (Windows Zabbix Agent Binary is not compiled with TLS Support)
          Show
          Aleksandrs Saveljevs added a comment - - edited List of bugs and issues that have already been reported for encryption separately: ZBX-10083 (3.0 Alpha 4 - FreeBSD 10.1 - Agent build fails with TLS support) ZBX-10086 (cannot disable core dump at server startup) ZBXNEXT-3047 (Windows Zabbix Agent Binary is not compiled with TLS Support)
          Hide
          Andris Mednis added a comment - - edited

          (145) [D] Document that adding encryption increases time of checks and actions, depending on network latency. For example, if packet delay is 100ms then opening a TCP connection and sending unencrypted request takes around 200ms. With encryption about 1000 ms are added for establishing TLS connection. Timeouts may need to be increased, otherwise some items and actions running remote scripts on agents may work with unencrypted connections but fail with timeout with encrypted.

          Andris Mednis Documented in https://www.zabbix.com/documentation/3.0/manual/encryption#limitations .

          Aleksandrs Saveljevs CLOSED

          Show
          Andris Mednis added a comment - - edited (145) [D] Document that adding encryption increases time of checks and actions, depending on network latency. For example, if packet delay is 100ms then opening a TCP connection and sending unencrypted request takes around 200ms. With encryption about 1000 ms are added for establishing TLS connection. Timeouts may need to be increased, otherwise some items and actions running remote scripts on agents may work with unencrypted connections but fail with timeout with encrypted. Andris Mednis Documented in https://www.zabbix.com/documentation/3.0/manual/encryption#limitations . Aleksandrs Saveljevs CLOSED
          Hide
          richlv added a comment - - edited

          (146) when importing an xml with an empty "tls_accept" field, the error message says :
          Incorrect value used for connections from host field.

          that is terribly cryptic and not easy to figure out what went wrong.

          Oleg Egorov RESOLVED in r59060 and r59067. Thank you, Ivo

          Ivo Kurzemnieks CLOSED

          Show
          richlv added a comment - - edited (146) when importing an xml with an empty "tls_accept" field, the error message says : Incorrect value used for connections from host field. that is terribly cryptic and not easy to figure out what went wrong. Oleg Egorov RESOLVED in r59060 and r59067. Thank you, Ivo Ivo Kurzemnieks CLOSED
          Hide
          Aleksandrs Saveljevs added a comment - - edited

          (147) [S] When we need to log something inside a function and prefix it with a function name, we seem to prefer the colon-less style, like so:

          zabbix_log(LOG_LEVEL_DEBUG, "%s() title:'%s'", __function_name, title);
          

          However, tls.c adds a colon after function name:

          zabbix_log(LOG_LEVEL_DEBUG, "%s(): requested PSK identity: \"%.*s\"", __function_name,
          		(int)psk_identity_len, psk_identity);
          

          The majority of this colon style is in tls.c:

          $ find -name \*.c | xargs grep -i 'zabbix_log.*"%s() ' | wc -l
          174
          $ find -name \*.c | xargs grep -i 'zabbix_log.*"%s(): ' | wc -l
          48
          $ find -name \*.c | xargs grep -i 'zabbix_log.*"%s(): ' | grep tls.c | wc -l                                                                                                            
          33
          

          Fixing this is optional, but it might help to reduce future confusion in tasks like (1) in ZBX-10001.

          Andris Mednis RESOLVED in r57331.

          Aleksandrs Saveljevs Looks good. CLOSED.

          Show
          Aleksandrs Saveljevs added a comment - - edited (147) [S] When we need to log something inside a function and prefix it with a function name, we seem to prefer the colon-less style, like so: zabbix_log(LOG_LEVEL_DEBUG, "%s() title:'%s'", __function_name, title); However, tls.c adds a colon after function name: zabbix_log(LOG_LEVEL_DEBUG, "%s(): requested PSK identity: \"%.*s\"", __function_name, (int)psk_identity_len, psk_identity); The majority of this colon style is in tls.c: $ find -name \*.c | xargs grep -i 'zabbix_log.*"%s() ' | wc -l 174 $ find -name \*.c | xargs grep -i 'zabbix_log.*"%s(): ' | wc -l 48 $ find -name \*.c | xargs grep -i 'zabbix_log.*"%s(): ' | grep tls.c | wc -l 33 Fixing this is optional, but it might help to reduce future confusion in tasks like (1) in ZBX-10001 . Andris Mednis RESOLVED in r57331. Aleksandrs Saveljevs Looks good. CLOSED.
          Hide
          Andris Mednis added a comment - - edited

          (148) [S] Debug logging with GnuTLS includes key size in bytes as the last parameter ('16' in the following example):

           14252:20151127:103659.269 End of zbx_tls_accept():SUCCEED (established TLS1.2 ECDHE-RSA-AES-128-GCM-AEAD-0-16)
          

          This is redundant, as the key size is included in 'AES-128'.

          Andris Mednis RESOLVED in r56998.

          Aleksandrs Saveljevs Looks good. All relevant cipher names in lib/algorithms/ciphers.c in GnuTLS seem to include key size. CLOSED.