[ZBX-15091] TLS communication between zabbix-server and zabbix-proxies stops working after change to wintertime Created: 2018 Oct 30  Updated: 2024 Apr 10

Status: Need info
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 3.4.14
Fix Version/s: None

Type: Problem report Priority: Trivial
Reporter: Edgars Melveris Assignee: Zabbix Development Team
Resolution: Unresolved Votes: 2
Labels: encryption
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

actual version 3.4.10


Issue Links:
Causes
Duplicate
Team: Team B
Story Points: 3

 Description   

On October 28th, during the change to winter time, all communication between the Zabbix server and the Zabbix proxies suddenly stopped working.

11276:20181028:021904.224 sending configuration data to proxy "A" at "X.X.X.63", datalen 36831646
 11271:20181028:022017.939 sending configuration data to proxy "B" at "X.X.X.74", datalen 14052878
 11278:20181028:022022.380 sending configuration data to proxy "C" at "X.X.X.56", datalen 36070831
 11271:20181028:022051.518 sending configuration data to proxy "D" at "X.X.X.53", datalen 37935485
 11274:20181028:022127.225 sending configuration data to proxy "E" at "X.X.X.55", datalen 40935097
 11274:20181028:022212.521 sending configuration data to proxy "F" at "X.X.X.49", datalen 35815498
 11282:20181028:022252.250 sending configuration data to proxy "G" at "X.X.X.74", datalen 14684905
 11279:20181028:022444.342 sending configuration data to proxy "H" at "X.X.X.54", datalen 41381057
 11277:20181028:022911.060 sending configuration data to proxy "A" at "X.X.X.63", datalen 36831646
 11275:20181028:023029.085 sending configuration data to proxy "C" at "X.X.X.56", datalen 36070831
 11279:20181028:023044.988 sending configuration data to proxy "B" at "X.X.X.74", datalen 14052878
 11274:20181028:023058.625 sending configuration data to proxy "D" at "X.X.X.53", datalen 37935485
 11274:20181028:023134.601 sending configuration data to proxy "E" at "X.X.X.55", datalen 40935097
 11276:20181028:023218.856 sending configuration data to proxy "F" at "X.X.X.49", datalen 35815498
 11274:20181028:023259.784 sending configuration data to proxy "G" at "X.X.X.74", datalen 14684905
 11274:20181028:023451.778 sending configuration data to proxy "H" at "X.X.X.54", datalen 41381057
 11275:20181028:023918.121 sending configuration data to proxy "A" at "X.X.X.63", datalen 36831646
 11277:20181028:024035.549 sending configuration data to proxy "C" at "X.X.X.56", datalen 36070831
 11282:20181028:024105.781 sending configuration data to proxy "D" at "X.X.X.53", datalen 37935485
 11280:20181028:024111.567 sending configuration data to proxy "B" at "X.X.X.74", datalen 14052878
 11274:20181028:024142.038 sending configuration data to proxy "E" at "X.X.X.55", datalen 40935097
 11281:20181028:024225.339 sending configuration data to proxy "F" at "X.X.X.49", datalen 35815498
 11271:20181028:024307.378 sending configuration data to proxy "G" at "X.X.X.74", datalen 14684905
 11279:20181028:024458.978 sending configuration data to proxy "H" at "X.X.X.54", datalen 41381057
 11278:20181028:024925.014 sending configuration data to proxy "A" at "X.X.X.63", datalen 36831646
 11280:20181028:025041.940 sending configuration data to proxy "C" at "X.X.X.56", datalen 36070831
 11281:20181028:025112.798 sending configuration data to proxy "D" at "X.X.X.53", datalen 37935485
 11272:20181028:025132.167 sending configuration data to proxy "B" at "X.X.X.74", datalen 14052878
 11275:20181028:025149.643 sending configuration data to proxy "E" at "X.X.X.55", datalen 40935097
 11276:20181028:025231.784 sending configuration data to proxy "F" at "X.X.X.49", datalen 35815498
 11279:20181028:025314.788 sending configuration data to proxy "G" at "X.X.X.74", datalen 14684905
 11275:20181028:025507.153 sending configuration data to proxy "H" at "X.X.X.54", datalen 41381057
 11280:20181028:025931.875 sending configuration data to proxy "A" at "X.X.X.63", datalen 36831646
 11280:20181028:020048.275 sending configuration data to proxy "C" at "X.X.X.56", datalen 36070831
 11282:20181028:020120.518 sending configuration data to proxy "D" at "X.X.X.53", datalen 37935485
 11279:20181028:020148.842 sending configuration data to proxy "B" at "X.X.X.74", datalen 14052878
 11280:20181028:020157.451 sending configuration data to proxy "E" at "X.X.X.55", datalen 40935097
 11274:20181028:020238.388 sending configuration data to proxy "F" at "X.X.X.49", datalen 35815498
 11281:20181028:020322.559 sending configuration data to proxy "G" at "X.X.X.74", datalen 14684905
 11278:20181028:020515.240 sending configuration data to proxy "H" at "X.X.X.54", datalen 41381057
 11242:20181028:020935.444 executing housekeeper

No more useful log entries appear until the restart, only housekeeper entries.
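The point where the local clock was set back is easy to miss in a long excerpt (here the entries go from 02:59:31 to 02:00:48). A minimal sketch for locating such jumps, assuming every log line starts with the "PID:YYYYMMDD:HHMMSS.mmm" prefix seen in the excerpt above; the function name find_backward_jumps and the sample list are illustrative only and not part of Zabbix:

```python
import re

# Prefix layout assumed from the log excerpt: "PID:YYYYMMDD:HHMMSS.mmm message"
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}\.\d{3})\s")

def find_backward_jumps(lines):
    """Return (previous, current) timestamp pairs where local time goes backwards."""
    jumps = []
    prev = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # ignore lines that do not carry the expected prefix
        stamp = (m.group(2), m.group(3))  # (date, time) compare correctly as strings
        if prev is not None and stamp < prev:
            jumps.append((prev, stamp))
        prev = stamp
    return jumps

# Four lines copied from the server log above
sample = [
    '11275:20181028:025507.153 sending configuration data to proxy "H" at "X.X.X.54", datalen 41381057',
    ' 11280:20181028:025931.875 sending configuration data to proxy "A" at "X.X.X.63", datalen 36831646',
    ' 11280:20181028:020048.275 sending configuration data to proxy "C" at "X.X.X.56", datalen 36070831',
    ' 11242:20181028:020935.444 executing housekeeper',
]

for previous, current in find_backward_jumps(sample):
    print("clock moved back:", previous, "->", current)
```

This only compares the local timestamps as logged, so it reports the DST switch itself and says nothing about why the TLS sessions stopped.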

On the proxy side the logs are quite similar:

9256:20181028:025917.057 SNMP agent item "authServerUptime.['X.X.X.70']" on host "A" failed: first network error, wait for 15 seconds
  8895:20181028:025932.379 received configuration data from server at "server", datalen 36831646
  9032:20181028:025950.648 SNMP agent item "wlsxSysExtPacketLossPercent" on host "B" failed: first network error, wait for 15 seconds
  8898:20181028:020322.184 executing housekeeper
  8898:20181028:020330.795 slow query: 8.593921 sec, "delete from proxy_history where id<13790379271 and (clock<1540685002 or (id<=13790379271 and clock<1540688602))"
  8898:20181028:020330.834 housekeeper [deleted 1884192 records in 8.634713 sec, idle for 1 hour(s)]
  8895:20181028:020940.637 received configuration data from server at "server", datalen 36831646
  8898:20181028:030331.137 executing housekeeper
  8898:20181028:030331.141 housekeeper [deleted 0 records in 0.002344 sec, idle for 1 hour(s)]
  8898:20181028:040331.451 executing housekeeper
  8898:20181028:040331.455 housekeeper [deleted 0 records in 0.002180 sec, idle for 1 hour(s)]

Again, only housekeeping entries are logged from then on until the restart.

The first thing we tried was restarting the proxy without restarting the server.

These are the logs on the server side and the proxy side after this:

Server log:

cannot send configuration data to proxy "A" at "X.X.X.54": TLS write set result code to 5:

Proxy log:

Unable to connect to the server [server]:10051 [TCP successful, cannot establish TLS to [[server]:10051]: SSL_connect() I/O error: [104] Connection reset by peer]

Restarting the agent shows similar errors:

active check configuration update from [A:10051] started to fail (TCP successful, cannot establish TLS to [[A]:10051]: SSL_connect() timed out)

We restarted the server and after this restart all the proxies started working again.



 Comments   
Comment by Andrejs Kozlovs [ 2019 Feb 15 ]

Is it possible to reproduce the TLS issue after a proxy restart by moving the system time back? Do the logs come back to life after some time (the time change delta, in this case -1h) if the proxy is not restarted?

Comment by Glebs Ivanovskis [ 2019 Feb 20 ]

You don't have to wait a year or mess with the system clock to try to reproduce the issue. Instead, you can give the Zabbix daemons a custom timezone whose DST change happens at any desired moment. See man tzset:

The value of TZ can be one of three formats.
...
The second format is used when there is daylight saving time:
std offset dst [offset],start[/time],end[/time]
...
The start field specifies when daylight saving time goes into effect and the end field specifies when the change is made back to standard time.
...
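As a rough illustration of that second TZ format, the sketch below sets a rule in the current process and prints the local wall-clock time on either side of the switch back to standard time. The rule string, the helper name local_wall_clock and the chosen timestamps are assumptions made for this example (a Central European style rule matching the 03:00 to 02:00 jump in the logs above); time.tzset() is available on Unix only.

```python
import calendar
import os
import time

def local_wall_clock(tz_rule, utc_epoch):
    """Return ("HH:MM", is_dst) for utc_epoch under the given POSIX TZ rule."""
    os.environ["TZ"] = tz_rule
    time.tzset()  # Unix only: make the C library re-read TZ for this process
    lt = time.localtime(utc_epoch)
    return time.strftime("%H:%M", lt), lt.tm_isdst

# Assumed rule: std "CET" at UTC+1, dst "CEST", DST starts on the last Sunday
# of March and ends on the last Sunday of October at 03:00 local time.
# Changing the end field moves the switch to another moment.
RULE = "CET-1CEST,M3.5.0,M10.5.0/3"

before = calendar.timegm((2018, 10, 28, 0, 30, 0))  # 00:30 UTC, DST still active
after = before + 3600                               # one real hour later

print(local_wall_clock(RULE, before))
print(local_wall_clock(RULE, after))
```

Both calls report the same wall-clock time, once with the DST flag set and once without, which is the repeated hour visible in the server log. To try this against the daemons themselves, the same kind of TZ value would have to be exported in their environment before they start; whether that is enough to trigger the TLS failure is exactly what remains to be confirmed.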
