[ZBX-12957] zabbix agentd use active mode traffic abnormality Created: 2017 Oct 27 Updated: 2024 Apr 10 Resolved: 2018 Dec 21 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Proxy (P), Server (S) |
Affects Version/s: | 3.2.9 |
Fix Version/s: | 4.0.0alpha9, 4.0 (plan) |
Type: | Problem report | Priority: | Trivial |
Reporter: | yinyaliang | Assignee: | Michael Veksler |
Resolution: | Fixed | Votes: | 0 |
Labels: | agent, client_timediff, proxy_timediff |
Remaining Estimate: | Not Specified |
Time Spent: | Not Specified |
Original Estimate: | Not Specified |
Environment: |
centos 6.5 |
Sprint: | Sprint 27, Sprint 28, Sprint 29, Sprint 30, Sprint 31, Sprint 32, Sprint 33, Sprint 34, Sprint 35, Sprint 36, Sprint 37, Sprint 38, Sprint 47, Dec 2018 |
Story Points: | 1 |
Description |
Hello. I use version 3.2. When the network is interrupted for some time, or when I stop MySQL for some time, the traffic reported on the agentd side suddenly increases to several times, or even dozens of times, my bandwidth. |
Comments |
Comment by Ingus Vilnis [ 2017 Oct 27 ]
Stop creating duplicate issues here! One report is enough, and duplicates will not speed up the resolution. When the network or database recovers and becomes available, the active agent sends the collected values to the server. Due to the very low item update interval (5 seconds), it is possible that the values which are afterwards calculated as speed per second are not updated correctly because of the short intervals, and thus mathematically result in very large numbers. You may want to increase the update interval of the items to somewhat higher values (e.g. 30 or 60 seconds) and try to reproduce again.
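To make the "mathematically very large numbers" point concrete, here is a minimal sketch of a per-second rate calculation. It is not Zabbix code: the function name `rate_per_second` and all numbers are hypothetical, and it simply assumes that a counter which kept growing during a 300-second outage ends up being compared across timestamps that are only 5 seconds apart.

```python
def rate_per_second(prev, curr):
    """Return the per-second change between two (clock, counter) samples."""
    (t0, v0), (t1, v1) = prev, curr
    return (v1 - v0) / (t1 - t0)


BYTES_PER_SEC = 1_000_000  # hypothetical steady traffic on the interface

# Normal case: 5 seconds of traffic measured over a 5-second interval.
normal = rate_per_second((0, 0), (5, 5 * BYTES_PER_SEC))

# Assumed failure case: the counter advanced for 300 seconds, but the two
# samples being compared carry timestamps only 5 seconds apart.
skewed = rate_per_second((0, 0), (5, 300 * BYTES_PER_SEC))

print(normal)           # per-second rate in the normal case
print(skewed / normal)  # how many times larger the skewed rate is
```

With these made-up numbers the skewed rate is 60 times the real one, which is in the "dozens of times" range described in the report.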
Comment by yinyaliang [ 2017 Oct 29 ]
Sorry, this is the first time I have submitted a question on this platform and I am not very experienced with it. Indeed, my update interval is 5s. Thank you for your solution.
Comment by Glebs Ivanovskis (Inactive) [ 2017 Oct 29 ]
Could you please clarify what was down, database or the network between agent and server? Could you show us Zabbix server's process busyness graphs for that time interval? Are there any other active items monitored by this agent? I would love to see system.uptime or system.localtime Latest data around that time if you have those.
Comment by yinyaliang [ 2017 Oct 30 ]
This problem has appeared twice: once because of a core network problem, and once when the database service was stopped for maintenance. There are some other active items on the agentd, but those items (for example, memory) do not have the problem.
Comment by Glebs Ivanovskis (Inactive) [ 2017 Oct 30 ]
Oh, great! Can you please show the data gathering process busyness graph too? If system.uptime and system.localtime are active, can you show their values for the time interval around the moment when the problem occurred, just like in 20171028004134.png?
Comment by yinyaliang [ 2017 Oct 30 ]
Thanks for your help. Attached: data gathering process busyness, localtime and uptime.
Comment by Glebs Ivanovskis (Inactive) [ 2017 Oct 30 ]
Thank you!
Comment by Glebs Ivanovskis (Inactive) [ 2017 Nov 07 ]
Managed to get quite interesting results for system.localtime converted to Zabbix agent (active) type by suspending trappers and releasing one of them from time to time:
From the log file:
19489:20171107:004202.024 trapper got '{"request":"agent data","data":[{"host":"Zabbix server","key":"system.cpu.switches","value":"45978494","clock":1510008038,"ns":14401032},{"host":"Zabbix server","key":"system.cpu.util[,idle]","value":"95.713509","clock":1510008038,"ns":14567055},{"host":"Zabbix server","key":"system.cpu.util[,interrupt]","value":"0.000000","clock":1510008038,"ns":14719329},{"host":"Zabbix server","key":"system.cpu.util[,softirq]","value":"0.010450","clock":1510008038,"ns":14868482},{"host":"Zabbix server","key":"system.cpu.util[,steal]","value":"0.000000","clock":1510008038,"ns":15025410},{"host":"Zabbix server","key":"system.cpu.util[,iowait]","value":"0.158836","clock":1510008038,"ns":15196637},{"host":"Zabbix server","key":"system.cpu.util[,system]","value":"1.519395","clock":1510008038,"ns":15352948},{"host":"Zabbix server","key":"system.cpu.util[,nice]","value":"0.025079","clock":1510008038,"ns":15507540},{"host":"Zabbix server","key":"system.cpu.util[,user]","value":"2.572730","clock":1510008038,"ns":15665049},{"host":"Zabbix server","key":"system.swap.size[,free]","value":"2161111040","clock":1510008038,"ns":15825709},{"host":"Zabbix server","key":"system.swap.size[,pfree]","value":"100.000000","clock":1510008038,"ns":15993437},{"host":"Zabbix server","key":"system.localtime","value":"1510008038","clock":1510008038,"ns":16147397},{"host":"Zabbix server","key":"system.cpu.intr","value":"6064854","clock":1510008038,"ns":16389943},{"host":"Zabbix server","key":"system.users.num","value":"1","clock":1510008038,"ns":19364872},{"host":"Zabbix server","key":"proc.num[]","value":"322","clock":1510008038,"ns":22950414}],"clock":1510008040,"ns":23786452}'
$ date -d @1510008038
Tue Nov 7 00:40:38 EET 2017
As we see, value 1510008038 originally had timestamp 1510008038, but was received by Zabbix server at a different time and as a result the timestamp was adjusted to 00:42:00.
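The arithmetic behind the 00:42:00 figure can be reproduced with a short sketch. It assumes the server shifts each value's clock by the difference between its own receive time (00:42:02 EET, taken from the log line prefix) and the clock of the whole request (1510008040). The function name `adjust_clock` is hypothetical and this is not the actual server code.

```python
from datetime import datetime, timedelta, timezone

# EET is UTC+2, matching the `date` output in the comment above.
EET = timezone(timedelta(hours=2))


def adjust_clock(value_clock, request_clock, server_recv_clock):
    """Shift a value timestamp by the sender/receiver time difference (assumed behaviour)."""
    timediff = server_recv_clock - request_clock
    return value_clock + timediff


value_clock = 1510008038    # "clock" of the system.localtime value
request_clock = 1510008040  # "clock" of the whole request
# Receive time from the log prefix 20171107:004202 (fraction of a second dropped).
server_recv_clock = int(datetime(2017, 11, 7, 0, 42, 2, tzinfo=EET).timestamp())

adjusted = adjust_clock(value_clock, request_clock, server_recv_clock)
adjusted_hms = datetime.fromtimestamp(adjusted, EET).strftime("%H:%M:%S")
print(adjusted, adjusted_hms)
```

Under that assumption the difference is 82 seconds, so the value collected at 00:40:38 is stored with timestamp 00:42:00, which matches the observation.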
Comment by Glebs Ivanovskis (Inactive) [ 2017 Nov 07 ]
The problem is that Zabbix server tries to do ntpd's work. sasha strongly insists it needs to. Discussions on the topic happen in ZBXNEXT-3298.
Comment by Sergejs Paskevics [ 2018 Jun 12 ]
Successfully tested
Comment by Michael Veksler [ 2018 Jun 19 ]
Available in 4.0.0alpha9 r82013.
Comment by richlv [ 2018 Aug 02 ]
Looking at the issue comments, it's unclear what was done here. Could you please clarify what exact changes were in the scope here?
Comment by dimir [ 2018 Nov 22 ]
As I understand, we removed the time adjustments completely. The difference is still calculated, but only for printing that in DEBUG mode.
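A rough sketch of the behaviour described here, under the assumption that the fix amounts to "compute the difference, log it at debug level, store the original timestamp". The function and logger names are hypothetical and the log message wording is invented for illustration.

```python
import logging

logger = logging.getLogger("timediff_sketch")  # hypothetical logger name


def process_value_clock(value_clock, request_clock, server_recv_clock):
    """Keep the sender's timestamp; report the time difference only in debug output."""
    timediff = server_recv_clock - request_clock
    logger.debug("sender/receiver time difference: %d s", timediff)
    return value_clock  # no adjustment is applied


# Same numbers as in the trapper log quoted earlier in this issue.
stored = process_value_clock(1510008038, 1510008040, 1510008122)
print(stored)
```

With this approach, keeping clocks in sync is left to the hosts themselves (for example, via ntpd) rather than being compensated for by the server.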
Comment by richlv [ 2018 Nov 22 ]
Michael, can you please confirm or deny the changes mentioned?
Comment by Michael Veksler [ 2018 Nov 22 ]
According to the decision, the protocol is not changed and the difference is calculated only for debug logging. Documentation:
Comment by richlv [ 2018 Nov 22 ]
Thank you, Michael. While you link to some decision, that link only leads back to this page.
Comment by dimir [ 2018 Nov 22 ]
Yes, that link points to very secret internal information where different approaches to solving this issue are listed and the decision is made.
Comment by Glebs Ivanovskis [ 2018 Dec 12 ]
According to discussion in
Comment by dimir [ 2018 Dec 13 ]
Correct. Re-opening on behalf of previous comment.
Comment by dimir [ 2019 Mar 15 ]
Additional documentation changes:
Comment by Glebs Ivanovskis [ 2019 Mar 17 ]
Good job, dimir! Consider adding a link to the explanation of what a passive check is.
Comment by dimir [ 2019 Mar 17 ]
The time synchronization link?
Comment by Glebs Ivanovskis [ 2019 Mar 17 ]
No, I mean system.localtime, fuzzytime() and the trigger expression example mention "passive check". The reader may not know what it is.
Comment by dimir [ 2019 Mar 17 ]
There, thanks for pointing that out!