[ZBX-18777] Occasional unspecified certificate verification error with PSK on Windows Server 2019 Created: 2020 Dec 16 Updated: 2025 Mar 20 Resolved: 2020 Dec 18 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G) |
Affects Version/s: | 5.0.6 |
Fix Version/s: | None |
Type: | Problem report | Priority: | Trivial |
Reporter: | Markku Leiniö | Assignee: | Aleksandrs Pahomovs |
Resolution: | Won't fix | Votes: | 0 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Server: Zabbix server 5.0.6 on Debian Linux 10 (Buster) |
Attachments: |
Issue Links: |
|
Description |
Steps to reproduce:
Result: Server logs occasionally:
10377:20201216:125835.926 failed to accept an incoming connection: from 10.11.22.33: unspecified certificate verification error: TLS handshake set result code to 5:
At the same time, the client (the 10.11.22.33 agent above) logs:
12628:20201216:125833.259 active check data upload to [zabbix-server-ip:10051] started to fail ([connect] TCP successful, cannot establish TLS to [[zabbix-server-ip]:10051]: SSL_connect() timed out)
Expected:
Other information: Initially we had server 4.4.10 on Debian Linux 9 (Stretch) and agents 4.4.x, and we didn't have those errors. We then moved the server to a new machine running Debian Linux 10 (Buster), still with server 4.4.10 (new installation, configurations copied over), and that's when the error messages started. We then upgraded both server and agents to 5.0.6, but the occasional errors continued. There are fewer errors with 5.0.6, though. A notable detail is that agents on Linux, on Windows 10 or on Windows Server 2016 do not cause these errors (agents are 4.0.x, 4.4.x or 5.0.6).
Debian 9 server (old server with no problems) openssl version: OpenSSL 1.1.0l 10 Sep 2019
Debian 10 server (current) openssl version: OpenSSL 1.1.1d 10 Sep 2019
Agent TLS configuration:
|
Comments |
Comment by Markku Leiniö [ 2020 Dec 16 ] |
Additional information: Servers have been installed from the official Zabbix repo using the supplied dpkg files and instructions, as well as the Linux agents. |
Comment by Markku Leiniö [ 2020 Dec 16 ] |
Let me know if you have specific hints on how to troubleshoot this further. |
Comment by Markku Leiniö [ 2020 Dec 17 ] |
To get an idea of the recurrence pattern at the moment (timestamps are EET): 10375:20201216:182815.626 failed to accept an incoming connection: from 10.33.33.8: unspecified certificate verification error: TLS handshake set result code to 5: while all hosts have several items with 1 minute interval (the usual Windows metrics like CPU, disk and network-related). |
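To quantify a recurrence pattern like this from the server log, the failures can be counted per agent and per hour. The following is a minimal sketch, assuming the log line layout quoted in this ticket (PID:YYYYMMDD:HHMMSS.mmm followed by the message); the names `parse_failure` and `count_by_hour` are hypothetical helpers, not part of Zabbix.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout, taken from the lines quoted in this ticket:
# PID:YYYYMMDD:HHMMSS.mmm failed to accept an incoming connection: from IP: reason
FAILURE_RE = re.compile(
    r"^(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3}) "
    r"failed to accept an incoming connection: from (?P<ip>\S+?): "
    r"(?P<reason>.*)$"
)


def parse_failure(line):
    """Return (timestamp, source_ip, reason), or None for unrelated lines."""
    match = FAILURE_RE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S.%f")
    return stamp, match["ip"], match["reason"]


def count_by_hour(lines):
    """Count accept failures per (source_ip, hour) bucket."""
    counts = Counter()
    for line in lines:
        parsed = parse_failure(line)
        if parsed is None:
            continue
        stamp, ip, _reason = parsed
        hour = stamp.replace(minute=0, second=0, microsecond=0)
        counts[(ip, hour)] += 1
    return counts
```

Feeding it the lines of zabbix_server.log gives one counter entry per agent and hour, which makes it easier to see whether the failures cluster in time or are spread evenly.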
Comment by Aleksandrs Pahomovs [ 2020 Dec 17 ] |
Hello, could you please try the same setup without encryption? Encryption must be either excluded or confirmed as the cause. |
Comment by Markku Leiniö [ 2020 Dec 17 ] |
Hi, ok, I will. But first, here is one error case:
10376:20201217:120429.472 failed to accept an incoming connection: from 192.168.0.1: unspecified certificate verification error: TLS handshake set result code to 5:
Attached is a Zabbix server-side pcap export showing what it looks like. 192.168.0.1 = agent on Windows 2019, 10.10.10.1 = Zabbix server on Debian 10
Edited the text above: actually we only use TLS with selected agents, and those happen to be Windows-only. So we don't currently have data about the TLS PSK behaviour from Linux agents in this case. |
Comment by Markku Leiniö [ 2020 Dec 17 ] |
So to conclude, there are no IP connectivity errors (as the client logs also show: "TCP successful"), just TLS problems. Disabling TLS does not actually bring us closer to a solution. |
Comment by Aleksandrs Pahomovs [ 2020 Dec 17 ] |
Do you use zabbix agent version 1 or 2? |
Comment by Markku Leiniö [ 2020 Dec 17 ] |
These are v1 agents only. |
Comment by Aleksandrs Pahomovs [ 2020 Dec 17 ] |
Could you please check your PSK key size? Is it 512 bits or more? |
Comment by Markku Leiniö [ 2020 Dec 17 ] |
PSK size was originally 64 hex characters = 32 bytes = 256 bits, but as part of troubleshooting I reduced it to 62 characters = 31 bytes = 248 bits (this had no effect as far as I noticed). |
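The size arithmetic above (two hex characters per byte, eight bits per byte) can be checked with a short script. This is only a sketch; `psk_size_bits` is a hypothetical helper name, and the sample keys below are dummy values, not the keys used in this setup.

```python
def psk_size_bits(psk_hex):
    """Return the size in bits of a PSK given as a string of hex digits."""
    psk_hex = psk_hex.strip()
    if len(psk_hex) % 2 != 0:
        raise ValueError("a hex-encoded PSK needs an even number of digits")
    # bytes.fromhex raises ValueError on non-hex characters.
    return len(bytes.fromhex(psk_hex)) * 8


if __name__ == "__main__":
    # Dummy keys with the two lengths mentioned in the comment above.
    print(psk_size_bits("ab" * 32))  # 64 hex characters
    print(psk_size_bits("ab" * 31))  # 62 hex characters
```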
Comment by Aleksandrs Pahomovs [ 2020 Dec 17 ] |
Unfortunately, I can't reproduce your issue.
root@debian:/home/zabbix# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
root@debian:/home/zabbix# zabbix_server -V
zabbix_server (Zabbix) 5.0.6
Revision 93895db26b 30 November 2020, compilation time: Nov 30 2020 08:11:40
Copyright (C) 2020 Zabbix SIA
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it according to the license.
There is NO WARRANTY, to the extent permitted by law.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (http://www.openssl.org/).
Compiled with OpenSSL 1.1.1d 10 Sep 2019
Running with OpenSSL 1.1.1d 10 Sep 2019
C:\Users\Administrator>systeminfo
Host Name: WIN-D2P4GAJ25MJ
OS Name: Microsoft Windows Server 2019 Essentials
OS Version: 10.0.17763 N/A Build 17763
OS Manufacturer: Microsoft Corporation
OS Configuration: Standalone Server
OS Build Type: Multiprocessor Free
Registered Owner: Windows User
C:\Users\Administrator>zabbix_agentd -V
zabbix_agentd Win64 (service) (Zabbix) 5.0.6
Revision 93895db26b 30 November 2020, compilation time: Nov 30 2020 16:06:48
Copyright (C) 2020 Zabbix SIA
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it according to the license.
There is NO WARRANTY, to the extent permitted by law.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (http://www.openssl.org/).
Compiled with OpenSSL 1.1.1g 21 Apr 2020
Running with OpenSSL 1.1.1g 21 Apr 2020 |
Comment by Markku Leiniö [ 2020 Dec 17 ] |
I rebooted the server again about three hours ago, and I haven't got errors since. Let's see...
Server:
LogFile=/var/log/zabbix/zabbix_server.log
LogFileSize=0
PidFile=/var/run/zabbix/zabbix_server.pid
SocketDir=/var/run/zabbix
DBHost=x.x.x.x
DBName=zabbix
DBUser=zabbix
DBPassword=xxx
StartPollersUnreachable=20
StartPingers=20
StartVMwareCollectors=3
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
CacheSize=64M
HistoryCacheSize=64M
TrendCacheSize=32M
ValueCacheSize=128M
Timeout=4
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
FpingLocation=/usr/bin/fping
Fping6Location=/usr/bin/fping6
LogSlowQueries=3000
StatsAllowedIP=127.0.0.1
Agent (config created by the MSI installer, no files in conf.d):
LogFile=C:\Program Files\Zabbix Agent\zabbix_agentd.log
Server=x.x.x.x
ServerActive=x.x.x.x
Hostname=xxx
Include=C:\Program Files\Zabbix Agent\zabbix_agentd.conf.d\
TLSConnect=psk
TLSAccept=psk
TLSPSKIdentity=xxx
TLSPSKFile=C:\Program Files\Zabbix Agent\psk.key
I appreciate your attention on this. |
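When comparing agent configurations across many hosts, a quick consistency check of the PSK-related parameters can rule out simple omissions. The sketch below assumes the plain Key=Value layout shown above, with `#` starting a comment line; `parse_conf` and `missing_psk_params` are hypothetical helper names, and a repeated key (such as several Include lines) simply keeps its last value here.

```python
def parse_conf(text):
    """Parse Key=Value lines into a dict, skipping blanks and # comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a Key=Value line
        params[key.strip()] = value.strip()
    return params


def missing_psk_params(params):
    """List PSK parameters that are absent although PSK is requested."""
    uses_psk = any(
        "psk" in params.get(name, "").lower()
        for name in ("TLSConnect", "TLSAccept")
    )
    if not uses_psk:
        return []
    return [
        name
        for name in ("TLSPSKIdentity", "TLSPSKFile")
        if not params.get(name)
    ]
```

For the agent configuration quoted above, both TLSPSKIdentity and TLSPSKFile are present, so the check would return an empty list.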
Comment by Markku Leiniö [ 2020 Dec 18 ] |
To let you know: There haven't been any TLS errors since I rebooted the Zabbix server. Also, I only now found out that these had been happening as well, with active Linux agents (with no TLS):
593:20201216:190911.181 active check data upload to [zabbix-server-ip:10051] started to fail ([recv] ZBX_TCP_READ() timed out)
593:20201216:190913.525 active check data upload to [zabbix-server-ip:10051] is working again
593:20201217:094451.199 active check data upload to [zabbix-server-ip:10051] started to fail ([recv] ZBX_TCP_READ() timed out)
593:20201217:094452.199 active check data upload to [zabbix-server-ip:10051] is working again
593:20201217:132737.596 active check data upload to [zabbix-server-ip:10051] started to fail ([recv] ZBX_TCP_READ() timed out)
593:20201217:132742.776 active check data upload to [zabbix-server-ip:10051] is working again
Those started exactly when we changed the server (still with the 4.4.10 server), and continued when we upgraded the server to 5.0.6. But these are now gone since I rebooted the Zabbix server yesterday. It now looks very much as if there was something strange with the new server, and after the latest reboot that something got fixed. All system upgrades have been up to date all the time, so I cannot attribute this to any specific event or detail. I'll keep checking this and report again after a few days at the latest. |
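The agent log entries above come in pairs ("started to fail" followed by "is working again"), so the length of each interruption can be derived by pairing them. This is a minimal sketch that assumes the agent log layout quoted in this ticket; `outage_durations` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Assumed layout, taken from the agent log lines quoted in this ticket.
EVENT_RE = re.compile(
    r"^\d+:(?P<stamp>\d{8}:\d{6}\.\d{3}) "
    r"active check data upload to \[.*?\] "
    r"(?P<state>started to fail|is working again)"
)


def outage_durations(lines):
    """Pair each failure with the next recovery.

    Returns a list of (failure_start, duration_in_seconds) tuples.
    """
    durations = []
    failed_at = None
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match is None:
            continue
        stamp = datetime.strptime(match["stamp"], "%Y%m%d:%H%M%S.%f")
        if match["state"] == "started to fail":
            failed_at = stamp
        elif failed_at is not None:
            durations.append((failed_at, (stamp - failed_at).total_seconds()))
            failed_at = None
    return durations
```

Run over zabbix_agentd.log, this shows at a glance whether the interruptions are brief blips or longer outages, which helps when comparing behaviour before and after a change such as the server reboot.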
Comment by Aleksandrs Pahomovs [ 2020 Dec 18 ] |
I see that this is not a problem with Zabbix. The problem is probably in the routing table. |
Comment by Aleksandrs Pahomovs [ 2020 Dec 18 ] |
Please be advised that this section of the tracker is for bug reports only. The case you have submitted can not be qualified as one, so please reach out to [email protected] for commercial support or consultancy services. Alternatively, you can also use our IRC channel or community forum (https://www.zabbix.com/forum) for assistance. With that said, we are closing this ticket. Thank you for understanding. |
Comment by Markku Leiniö [ 2020 Dec 22 ] |
FYI, no similar errors have occurred after the last reboot. |