  1. ZABBIX BUGS AND ISSUES
  2. ZBX-18545

Quarterly "TLS connection has been closed during handshake"

    Details

    • Type: Problem report
    • Status: Need info
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: 4.0.24
    • Fix Version/s: None
    • Component/s: Agent (G), Server (S)
    • Environment:

      Description

      Steps to reproduce:

      1. Currently not possible to reproduce on demand, as it happens at random times, about every 3 months.

      Result:

      • Lots of log messages like:
        21471:20201022:125748.424 failed to accept an incoming connection: from ***.***.***.***: TLS connection has been closed during handshake:
        21471:20201022:125748.424 failed to accept an incoming connection: from ***.***.***.***: TLS connection has been closed during handshake:
        21471:20201022:125748.424 failed to accept an incoming connection: from ***.***.***.***: TLS connection has been closed during handshake:
        
      • A test with openssl s_client, using the given PSK identity and key, results in:
        SSL handshake has read 0 bytes and written 289 bytes
        
      • Higher debug levels reveal nothing more specific:
        End of zbx_tls_accept():FAIL error:'TLS connection has been closed during handshake:'
        
      • tcpdump + Wireshark show incoming TLS handshakes with a Client Hello, but no response from the Zabbix server except for the TCP connection being closed with "encrypted alert 21". There are also lots of TCP connections in FIN_WAIT1, around 600.
      • The relevant code seems to use OpenSSL as its library. The return path for this error gets SSL_ERROR_ZERO_RETURN, which seems to indicate a handshake problem without giving any further specifics.
      • The Zabbix proxy can still receive agent data (passive) but cannot send it to the Zabbix server.
      • The only solution seems to be the following steps:
        1. Block incoming connections
        2. Restart the process or the server/VM
        3. Gradually allow more and more clients to connect
      • A different "solution" is to just wait 3-5 hours until the problem disappears as suddenly and inexplicably as it started.
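
To pin down when an episode starts and ends, and which peers are affected, the log messages quoted above could be summarized with a small script. This is only a sketch: it assumes the log line layout PID:YYYYMMDD:HHMMSS.mmm seen in the excerpt, and the names LINE_RE, SAMPLE and summarize_handshake_failures are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed log line layout (taken from the excerpt in this report):
#   PID:YYYYMMDD:HHMMSS.mmm failed to accept an incoming connection: from ADDR: TLS ...
LINE_RE = re.compile(
    r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6})\.(?P<ms>\d{3}) "
    r"failed to accept an incoming connection: from (?P<src>\S+): "
    r"TLS connection has been closed during handshake"
)

# The three lines quoted in the report, with the masked source address kept as-is.
SAMPLE = [
    "21471:20201022:125748.424 failed to accept an incoming connection: "
    "from ***.***.***.***: TLS connection has been closed during handshake:",
] * 3


def summarize_handshake_failures(lines):
    """Count failed TLS handshakes per minute and per source address."""
    per_minute = Counter()
    per_source = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # unrelated log line
        ts = datetime.strptime(m["date"] + m["time"], "%Y%m%d%H%M%S")
        per_minute[ts.strftime("%Y-%m-%d %H:%M")] += 1
        per_source[m["src"]] += 1
    return per_minute, per_source


if __name__ == "__main__":
    minutes, sources = summarize_handshake_failures(SAMPLE)
    for minute, count in sorted(minutes.items()):
        print(minute, count)
    for src, count in sources.most_common(10):
        print(src, count)
```

On a real system the function would be fed the lines of the server log file instead of SAMPLE. The per-minute counts should show how abruptly the failures begin and whether they fade out over the 3-5 hours mentioned above.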

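
The FIN_WAIT1 count mentioned above could be tracked over time by reading /proc/net/tcp. This is a rough sketch that assumes a Linux host and that the server listens on the default trapper port 10051. The sample rows are invented for illustration, and TCP_STATES, SAMPLE_PROC_NET_TCP and count_states are hypothetical names.

```python
from collections import Counter

# Linux TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

# Invented sample in /proc/net/tcp layout; port 0x2743 is 10051, 0x0016 is 22.
SAMPLE_PROC_NET_TCP = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:2743 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1 1
   1: 0100007F:2743 0100007F:C350 04 00000000:00000000 00:00000000 00000000     0        0 0 1
   2: 0100007F:2743 0100007F:C351 04 00000000:00000000 00:00000000 00000000     0        0 0 1
   3: 0100007F:0016 0100007F:C352 01 00000000:00000000 00:00000000 00000000     0        0 2 1
"""


def count_states(proc_net_tcp_text, local_port):
    """Count TCP states of all sockets whose local port equals local_port."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        port_hex = fields[1].rsplit(":", 1)[1]  # local_address is ADDR:PORT in hex
        if int(port_hex, 16) != local_port:
            continue
        state = fields[3].upper()
        counts[TCP_STATES.get(state, state)] += 1
    return counts


if __name__ == "__main__":
    print(count_states(SAMPLE_PROC_NET_TCP, 10051))
```

On the affected host, the text would come from reading /proc/net/tcp (and /proc/net/tcp6 for IPv6 listeners) at regular intervals. A FIN_WAIT1 count that climbs towards the roughly 600 seen here would mark the start of an episode.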
            People

            Assignee:
            neogan Andrei Gushchin
            Reporter:
            Aperto-SE Aperto Systemsengineering
