[ZBX-13027] Agent ping sometimes does not go back to OK Created: 2017 Nov 13  Updated: 2017 Nov 17  Resolved: 2017 Nov 17

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 3.4.4
Fix Version/s: 3.4.5rc1, 4.0.0alpha1, 4.0 (plan)

Type: Problem report Priority: Critical
Reporter: Dirk Bongard Assignee: Andris Zeila
Resolution: Fixed Votes: 0
Labels: triggers
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 16.04 LTS


Attachments: PNG File Screenshot_1.png     PNG File Screenshot_2.png     PNG File Screenshot_3.png     PNG File Screenshot_4.png     PNG File Screenshot_5.png     PNG File Screenshot_6.png     PNG File item-1.png     PNG File item-2.png     Text File zabbix_server.log.txt    
Team: Team A
Sprint: Sprint 21
Story Points: 0.5

 Description   

After upgrading from 3.2.5 to 3.4.3, I have the problem that the agent ping stays in the error state even though the agent is up again.
Workaround: restart the Zabbix server, or stop the Zabbix agent and start it again after approx. 10 minutes. If I use the agent option, I get the agent ping error twice (Screenshot_4). Once the agent ping is up again, both entries show OK.
I usually see this problem on WAN links. Out of 10 devices, 8 go back to OK after a failure and 2 remain in error.



 Comments   
Comment by Dirk Bongard [ 2017 Nov 13 ]

During this time none of the processes are overloaded (Screenshot_6).

Comment by Vladislavs Sokurenko [ 2017 Nov 13 ]

Do you happen to see any failed queries in the Zabbix server log?

Comment by Dirk Bongard [ 2017 Nov 13 ]

less zabbix_server.log |grep LIDSHU01
3712:20171113:121955.179 resuming Zabbix agent checks on host "LIDSHU01": connection restored

query failed:
yes, masses...
3173:20171113:130157.871 [Z3005] query failed: [1264] Out of range value for column 'value' at row 47 [insert into history (itemid,clock,ns,value) values (
49998,1510574515,56777992,0.000000),(50164,1510574515,451684664,0.034550),(111703,1510574515,451684664,0.000843),(138243,1510574515,451684664,0.001533),(4979
6,1510574517,28076406,121.632212),(69236,1510574517,42187773,0.000000),(139016,1510574517,46018291,1.000000),(54956,1510574517,49359955,1.000000),(60056,1510
574517,65257986,0.000000),(68156,1510574517,76800789,0.140000),(139017,1510574517,107525962,130.000000),(48717,1510574517,128040886,42.000000),(122397,151057
4517,132339468,23.448542),(55676,1510574517,176823300,13.280516),(60057,1510574517,217694049,0.000000),(135057,1510574517,217944035,0.037598),(68516,15105745
17,354362970,0.319444),(55316,1510574517,355919542,4.860310),(74157,1510574517,355983800,4.185430),(117717,1510574517,367709453,0.000000),(44277,1510574517,3
69807086,0.258581),(74756,1510574517,370139372,0.000096),(55796,1510574517,370980487,0.000000),(90537,1510574517,372355858,0.000000),(43077,1510574517,405365
165,2264.657341),(74876,1510574517,417458215,1.000000),(38037,1510574517,465767830,94.127942),(111717,1510574517,575779229,-0.000000),(54717,1510574517,59592
1794,50441.192744),(130557,1510574517,630824809,82.204581),(34197,1510574517,645120841,1.000000),(48597,1510574517,648808676,713.296130),(43557,1510574517,67
5236181,0.040000),(55317,1510574517,678621396,1.900137),(42837,1510574517,692615226,0.000000),(101517,1510574517,697827267,3.091666),(59037,1510574517,704774
845,360425239.893333),(96476,1510574517,742519485,0.000000),(44397,1510574517,747205993,0.000000),(72957,1510574517,754577626,1827.335527),(93357,1510574517,
782437885,0.955095),(94677,1510574517,786184088,0.000000),(54957,1510574517,808978529,1.000000),(101637,1510574517,819753316,0.000000),(69837,1510574517,8274

ZBX-12731 ?
I upgraded via the repo, following the instructions at
https://www.zabbix.com/documentation/3.4/manual/installation/upgrade_packages/debian_ubuntu

Comment by Vladislavs Sokurenko [ 2017 Nov 13 ]

This should have been fixed in ZBX-12903. Could you please upgrade and let me know if the issue persists?

Comment by Dirk Bongard [ 2017 Nov 13 ]

I have already had 3.4.4 installed for a few days.
---------------------------------------
zabbix_server (Zabbix) 3.4.4
Revision 74338 7 November 2017, compilation time: Nov 9 2017 10:50:47
(Ubuntu Repo)

Comment by Vladislavs Sokurenko [ 2017 Nov 13 ]

Please make sure that you restarted the server.

Comment by Dirk Bongard [ 2017 Nov 13 ]

Yepp I have:

Start-Date: 2017-11-08 21:42:45
Commandline: apt dist-upgrade
Requested-By: Monitor-Admin (1001)
Upgrade: openjdk-8-jdk:amd64 (8u131-b11-2ubuntu1.16.04.3, 8u151-b12-0ubuntu0.16.04.2), openjdk-8-jre:amd64 (8u131-b11-2ubuntu1.16.04.3, 8u151-b12-0ubuntu0.16.04.2), zabbix-agent:amd64 (1:3.4.3-2+xenial, 1:3.4.4-1+xenial), resolvconf:amd64 (1.78ubuntu4, 1.78ubuntu5), openjdk-8-jdk-headless:amd64 (8u131-b11-2ubuntu1.16.04.3, 8u151-b12-0ubuntu0.16.04.2), zabbix-server-mysql:amd64 (1:3.4.3-2+xenial, 1:3.4.4-1+xenial), zabbix-sender:amd64 (1:3.4.3-2+xenial, 1:3.4.4-1+xenial), zabbix-get:amd64 (1:3.4.3-2+xenial, 1:3.4.4-1+xenial), zabbix-frontend-php:amd64 (1:3.4.3-2+xenial, 1:3.4.4-1+xenial), openjdk-8-jre-headless:amd64 (8u131-b11-2ubuntu1.16.04.3, 8u151-b12-0ubuntu0.16.04.2)
End-Date: 2017-11-08 21:42:59

[...]
Start-Date: 2017-11-09 20:39:51
Commandline: apt dist-upgrade
Requested-By: Monitor-Admin (1001)
Upgrade: zabbix-agent:amd64 (1:3.4.4-1+xenial, 1:3.4.4-2+xenial), snapd:amd64 (2.27.5, 2.28.5), zabbix-server-mysql:amd64 (1:3.4.4-1+xenial, 1:3.4.4-2+xenial), ubuntu-core-launcher:amd64 (2.27.5, 2.28.5), zabbix-sender:amd64 (1:3.4.4-1+xenial, 1:3.4.4-2+xenial), zabbix-get:amd64 (1:3.4.4-1+xenial, 1:3.4.4-2+xenial), zabbix-frontend-php:amd64 (1:3.4.4-1+xenial, 1:3.4.4-2+xenial)
End-Date: 2017-11-09 20:40:03

..............
who -b
Systemstart 2017-11-11 18:11

Comment by Dirk Bongard [ 2017 Nov 13 ]

I have booted the server again:

uptime
21:33:09 up 6 min, 1 user, load average: 1,31, 1,41, 0,70

Server.log
1492:20171113:212807.401 Starting Zabbix Server. Zabbix 3.4.4 (revision 74338).
1492:20171113:212807.426 ****** Enabled features ******
1492:20171113:212807.426 SNMP monitoring: YES
1492:20171113:212807.426 IPMI monitoring: YES
1492:20171113:212807.426 Web monitoring: YES
1492:20171113:212807.426 VMware monitoring: YES
1492:20171113:212807.426 SMTP authentication: YES
1492:20171113:212807.427 Jabber notifications: YES
1492:20171113:212807.427 Ez Texting notifications: YES
1492:20171113:212807.427 ODBC: YES
1492:20171113:212807.427 SSH2 support: YES
1492:20171113:212807.427 IPv6 support: YES
1492:20171113:212807.427 TLS support: YES
1492:20171113:212807.427 ******************************

[...]
3133:20171113:213115.438 [Z3005] query failed: [1264] Out of range value for column 'value' at row 12 [insert into history (itemid,clock,ns,value) values (29602,1510605072,482840103,0.002790),(43803,1510605072,482840103,0.004477),(52366,1510605072,482840103,0.764460),(53877,1510605072,482840103,0.079063),(66768,1510605072,482840103,0.028857),(96624,1510605072,482840103,0.020260),(133122,1510605072,782396410,0.000000),(43423,1510605072,782396410,0.000000),(90522,1510605072,782396410,0.000000),(96361,1510605072,782396410,0.000000),(101834,1510605074,637116628,0.000423),(139274,1510605074,639784805,2622545199104.000000),(36914,1510605074,640270612,0.000000),(117674,1510605074,648122177,662.354206),(121214,1510605074,662143450,0.020000),(45314,1510605074,673364891,0.154759),(138314,1510605074,712328197,0.007778),(44354,1510605074,712933989,1.000000),(68331,1510605074,725212771,40.000000),(89594,1510605074,735919125,0.000000),(41954,1510605074,736097101,0.000000),(123674,1510605074,754820218,0.000000),(94754,1510605074,755653323,1.000000),(62474,1510605074,807243386,1.951111),(60074,1510605074,810699141,0.000000),(49754,1510605074,831469867,0.000000),(25274,1510605074,831799416,25.447595),(36554,1510605074,834280816,0.000000),(62594,1510605074,851050708,29751.321552),(51074,1510605074,861475692,19.021577),(64274,1510605074,895689259,0.000000),(60314,1510605074,922449519,0.000000),(42194,1510605074,928267780,0.253075),(137474,1510605074,932115104,816.027029),(61034,1510605074,932611220,0.000000),(42674,1510605074,940839114,37.329316),(73514,1510605074,985138974,201886515.200000),(68954,1510605074,985150520,27334.499674),(124154,1510605074,987120164,34.000000),(73994,1510605074,988084413,0.000000),(56234,1510605075,6022460,0.026667),(56114,1510605075,8224292,0.002222),(73274,1510605075,12627271,1.000000),(68474,1510605075,16737628,0.000254),(76274,1510605075,19087053,9916.513859),(54914,1510605075,22631661,0.000000),(129675,1510605075,23869359,0.082520),(73034,1510605075,3
9139692,0.069037),(69674,1510605075,41396215,685.267469),(76154,1510605075,48735526,456.394420),(67514,1510605075,59807784,0.000050),(137594,1510605075,73172275,0.001111),(122355,1510605075,73394193,10.757272),(64394,1510605075,74761589,0.000000),(74954,1510605075,91912179,0.000000),(74834,1510605075,96151672,1.745563),(42795,1510605075,100082549,2.940804),(115275,1510605075,100697098,98.539009),(68714,1510605075,108002076,255.000000),(66794,1510605075,122531627,14.001432),(67394,1510605075,127419963,1.000000),(36915,1510605075,128773889,0.000000),(123075,1510605075,128914015,79.024576),(101475,1510605075,145265061,0.000000),(55754,1510605075,158716513,0.000051),(55274,1510605075,158820820,0.000000),(55154,1510605075,162772323,3527.425702),(101835,1510605075,189698344,0.000044),(42915,1510605075,190675700,0.000000),(101595,1510605075,191595477,1.000000),(43515,1510605075,193287960,97.502783),(117675,1510605075,215969097,1747.785352),(38295,1510605075,232972935,0.012500),(76275,1510605075,235556720,0.060000),(60075,1510605075,262718550,0.000000),(38415,1510605075,263333308,99.954165),(139275,1510605075,280274185,89.005002),(94755,1510605075,302765964,1.000000),(51075,1510605075,309530131,8.288789),(66795,1510605075,311155919,1941.149373),(123675,1510605075,326612743,0.000000),(73035,1510605075,349100864,0.000000),(73515,1510605075,353809442,11.429592),(138315,1510605075,357288239,0.058333),(73275,1510605075,361331944,0.000000),(42675,1510605075,366605370,6902.165634),(89595,1510605075,367421526,0.000000),(36555,1510605075,374013099,255.000000),(90495,1510605075,379167519,0.000000),(54915,1510605075,388786264,0.000000),(23295,1510605075,389700596,0.040625),(25275,1510605075,391481539,-0.000000),(56115,1510605075,400461354,0.808086),(45315,1510605075,411978678,0.515863);

Comment by Vladislavs Sokurenko [ 2017 Nov 13 ]

Can you please do
select * from item where itemid=139274;
show create table history;
Also, can you attach a screenshot of the preprocessing for that item?

Comment by Dirk Bongard [ 2017 Nov 13 ]

You mean the table "Items"? I am not the real DBA. I hope this is the entry you have asked for:

itemid;type;snmp_community;snmp_oid;hostid;name;key_;delay;history;trends;status;value_type;trapper_hosts;units;snmpv3_securityname;snmpv3_securitylevel;snmpv3_authpassphrase;snmpv3_privpassphrase;formula;error;lastlogsize;logtimefmt;templateid;valuemapid;params;ipmi_sensor;authtype;username;password;publickey;privatekey;mtime;flags;interfaceid;port;description;inventory_link;lifetime;snmpv3_authprotocol;snmpv3_privprotocol;state;snmpv3_contextname;evaltype;jmx_endpoint;master_itemid
139274;3;;;11196;Free size of datastore $3;vmware.hv.datastore.size[{$URL},HOST.HOST},datastore2,free];2m;120d;1095d;0;0;;B;;0;;;;Unknown hypervisor uuid.;0;;\N;\N;;;0;{$USERNAME};{$PASSWORD};;;0;4;2042;;;0;30d;0;0;1;;0;;\N

I will search for the corresponding item...

Comment by Vladislavs Sokurenko [ 2017 Nov 13 ]

No, I need show create table history; please

Comment by Dirk Bongard [ 2017 Nov 13 ]

I hope this is the correct item...
/zabbix/items.php?form=update&hostid=11196&itemid=139274

Comment by Dirk Bongard [ 2017 Nov 13 ]

mysql> show create table history;
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                |
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| history | CREATE TABLE `history` (
  `itemid` bigint(20) unsigned NOT NULL,
  `clock` int(11) NOT NULL DEFAULT '0',
  `value` double(16,4) NOT NULL DEFAULT '0.0000',
  `ns` int(11) NOT NULL DEFAULT '0',
  KEY `history_1` (`itemid`,`clock`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)

Comment by Dirk Bongard [ 2017 Nov 13 ]

Do you need specific entries from the log, or the complete one (50 MB)?

Comment by Vladislavs Sokurenko [ 2017 Nov 13 ]

This query is the cause, thank you for your report, we will get back to you.

insert into history (itemid,clock,ns,value) values (139274,1510605074,639784805,2622545199104.000000);
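
Why this value is rejected: the `value` column shown above is declared as double(16,4), i.e. 16 digits in total with 4 after the decimal point, which leaves 12 integer digits (at most 999999999999.9999). The value 2622545199104 has 13 integer digits. The following is only a minimal sketch of that arithmetic, not Zabbix code; the helper names fits_history_value() and first_out_of_range_row() are hypothetical, and the scanner assumes the simple "(itemid,clock,ns,value),(...)" layout seen in the log.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed limit for a DOUBLE(16,4) column: 16 digits in total, 4 of them */
/* after the decimal point, so the magnitude must stay below 10^12.       */
#define HISTORY_VALUE_LIMIT 1e12

/* Hypothetical helper: returns 1 if the value fits the column, 0 if not. */
static int fits_history_value(double value)
{
    return value > -HISTORY_VALUE_LIMIT && value < HISTORY_VALUE_LIMIT;
}

/* Hypothetical helper: scans a "(itemid,clock,ns,value),(...)" list and  */
/* returns the 1-based number of the first row whose value does not fit,  */
/* or 0 if every complete row fits.                                       */
static int first_out_of_range_row(const char *values)
{
    const char *p = values;
    int row = 0;

    while (NULL != (p = strchr(p, '(')))
    {
        const char *end = strchr(p, ')');
        const char *comma;

        if (NULL == end)
            break; /* truncated tuple (e.g. a cut-off log line): stop */

        row++;

        /* the value is the field after the last comma of the tuple */
        for (comma = end; comma > p && ',' != *comma; comma--)
            ;

        if (',' == *comma && !fits_history_value(strtod(comma + 1, NULL)))
            return row;

        p = end + 1;
    }

    return 0;
}
```

Fed the values list from a "query failed" log line, first_out_of_range_row() should point at the same row number that MySQL reports in the "at row N" part of the message, which makes it easier to check whether more than one item is affected.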

Comment by Dirk Bongard [ 2017 Nov 13 ]

Fine...
thank you very much for the fast response...

Comment by Vladislavs Sokurenko [ 2017 Nov 13 ]

Meanwhile, you can disable this item and see if it helps.

Comment by Dirk Bongard [ 2017 Nov 13 ]

Are you really sure that it is only this item? What I posted is only an example covering a few seconds.

Comment by Andris Zeila [ 2017 Nov 14 ]

Missing range validation when converting variant value from uint64 to float.

Index: src/libs/zbxcommon/variant.c
===================================================================
--- src/libs/zbxcommon/variant.c        (revision 74571)
+++ src/libs/zbxcommon/variant.c        (working copy)
@@ -84,6 +84,8 @@
                case ZBX_VARIANT_DBL:
                        return SUCCEED;
                case ZBX_VARIANT_UI64:
+                       if (FAIL == zbx_validate_value_dbl((double)value->data.ui64))
+                               return FAIL;
                        zbx_variant_set_dbl(value, (double)value->data.ui64);
                        return SUCCEED;
                case ZBX_VARIANT_STR:

Actually it would be better to do it in dc_history_set_value() so the internal conversions during preprocessing would not be affected.

wiper based on this logic we should remove floating range validation in preprocessing.
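
The missing check described above can be illustrated in isolation. This is a hedged sketch, not the committed fix: SUCCEED and FAIL are defined locally to mirror the names used in the patch, the 1e12 limit is an assumption derived from the double(16,4) history column, and validate_history_dbl() and ui64_to_history_dbl() are hypothetical stand-ins rather than functions from the Zabbix source tree.

```c
#include <stdint.h>

/* Local stand-ins mirroring the return codes used in the patch. */
#define SUCCEED 0
#define FAIL    -1

/* Assumed limit of the double(16,4) history column: 12 integer digits. */
#define HISTORY_DBL_LIMIT 1e12

/* Hypothetical range check: FAIL if the value cannot be stored. */
static int validate_history_dbl(double value)
{
    if (value <= -HISTORY_DBL_LIMIT || value >= HISTORY_DBL_LIMIT)
        return FAIL;

    return SUCCEED;
}

/* Hypothetical conversion step: an unsigned 64-bit value is converted  */
/* to double only if the result fits the history column; on FAIL the    */
/* output is left untouched so the caller can report a proper error.    */
static int ui64_to_history_dbl(uint64_t in, double *out)
{
    double converted = (double)in;

    if (FAIL == validate_history_dbl(converted))
        return FAIL;

    *out = converted;

    return SUCCEED;
}
```

With this check in place, a value such as 2622545199104 is rejected before an insert is attempted, so one oversized value no longer causes the whole batched insert to fail.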

Comment by Dirk Bongard [ 2017 Nov 14 ]

After the hint about this item (itemid=139274), my workaround is to change the LLD
vmware.hv.datastore.size....,free]
from float to unsigned.
vmware.hv.datastore.size was already unsigned. After this, my "out of range" messages are gone.

I hope this was the cause of my problem.

Comment by Andris Zeila [ 2017 Nov 14 ]

The default templates use pfree (% free space), so the value type is float. I assume that you have changed the parameter from pfree to free (or maybe it's a leftover from old templates?), leaving the value type as float.

Nevertheless, the server should have caught the floating-point range error during the history syncing process, and a proper error should have been generated.

Comment by Andris Zeila [ 2017 Nov 14 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-13027

Comment by Andris Zeila [ 2017 Nov 17 ]

Released in:

  • pre-3.4.5rc1 r74732
  • pre-4.0.0alpha1 r74734