[ZBX-11525] Log/eventlog item values sometimes are displayed in wrong order. Created: 2016 Nov 25 Updated: 2017 May 30 Resolved: 2017 Jan 16 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 3.2.1 |
Fix Version/s: | 3.0.8rc1, 3.2.4rc1, 3.4.0alpha1 |
Type: | Incident report | Priority: | Critical |
Reporter: | Ekaterina Letova | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 0 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Attachments: | 25.11.png compare2.png data_values.txt eventlog.png | ||||||||||||
Issue Links: |
|
Description |
If log messages were generated one after another in a short time period and have equal Timestamp, they will be ordered incorrectly (see screenshot). Steps to reproduce: Create log/eventlog item (for example we have eventlog[System,,,,,10,skip]) Result: Notes: Most likely this behavior appeared after Database changes (p. 5.19) in version 3.2.0 |
Comments |
Comment by Glebs Ivanovskis (Inactive) [ 2016 Nov 25 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Is proxy involved? If it is, then it could be a scenario similar to
I don't think database changes you mentioned matter here. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Ekaterina Letova [ 2016 Nov 25 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
No, proxy is not ivolved.
We use agent 3.2.1, but the problem is not in agent. Steps to reproduce using zabbix_sender:
Result: Notes: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Glebs Ivanovskis (Inactive) [ 2016 Nov 25 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Could you also provide results of the following query: select h.clock,h.ns,h.value from history_<text, str or log, depending on item type> as h,items as i where i.key_='<item key>' and i.itemid=h.itemid; | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Ekaterina Letova [ 2016 Nov 28 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
select h.clock,h.ns,h.value from history_log as h,items as i where i.key_='log[D:\datalog\test2.log]' and i.itemid=h.itemid; v. 3.2.1
v. 3.0.4
Updated screenshot: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Glebs Ivanovskis (Inactive) [ 2016 Nov 28 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
I guess the problem was introduced earlier (probably with Index: trunk/src/libs/zbxdbhigh/proxy.c =================================================================== --- trunk/src/libs/zbxdbhigh/proxy.c (revision 50444) +++ trunk/src/libs/zbxdbhigh/proxy.c (revision 50445) @@ -2369,7 +2372,7 @@ } } else - av->ts.ns = -1; + av->ts.ns = proxy_timediff.ns; } else zbx_timespec(&av->ts); This makes part of DCcheck_ns() which ensures uniqueness of clock-ns pairs logically dead: static void DCcheck_ns(zbx_timespec_t *ts) { if (ts->ns >= 0) return; ... } | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by dimir [ 2016 Dec 13 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
We decided to fix it by auto-incrementing the nanosecond part of timediff for every value in case of data from sender with timestamps. The fix will be available for versions 3.0 and later. Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-11525 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Aleksandrs Saveljevs [ 2016 Dec 19 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Aleksandrs Saveljevs [ 2016 Dec 20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
(1) The fix looks plausible for this particular case, but it seems to invalidate the fix in <dimir> RESOLVED in r64645 asaveljevs As discussed, it seems to return <dimir> Actually I see no way to guarantee this. Checking this from value cache is an overhead. Another option could be to save the timestamp of the last value, but this will not help if we receive a value with an older timestamp in between, or more than 2 values with the same clock,ns. RESOLVED asaveljevs Still, it becomes a regression. For instance, if we repeat the same scenario in mysql> select * from history_log where itemid=34855; +-----+--------+------------+-----------+--------+----------+--------+------------+----+ | id | itemid | clock | timestamp | source | severity | value | logeventid | ns | +-----+--------+------------+-----------+--------+----------+--------+------------+----+ | 187 | 34855 | 1482827822 | 0 | | 0 | Line 1 | 0 | 0 | | 188 | 34855 | 1482827822 | 0 | | 0 | Line 2 | 0 | 0 | | 189 | 34855 | 1482827822 | 0 | | 0 | Line 3 | 0 | 1 | +-----+--------+------------+-----------+--------+----------+--------+------------+----+ 3 rows in set (0.00 sec) asaveljevs Decided to check how it affects zabbix_sender. It seems it got "broken" a long time ago. For instance, suppose we use unmodified Zabbix 3.0 server (from the stable branch, not this branch) and use zabbix_sender 2.2 to send values: $ for i in {1..5}; do echo "ZBX-11525 trapper 1482835555 Value $i" | bin/zabbix_sender -z 127.0.0.1 -i - -T; done This gives the following data in the database: | 4468 | 34854 | 1482835555 | 0 | | 0 | Value 1 | 0 | 0 | | 4469 | 34854 | 1482835555 | 0 | | 0 | Value 2 | 0 | 0 | | 4470 | 34854 | 1482835555 | 0 | | 0 | Value 3 | 0 | 0 | | 4471 | 34854 | 1482835555 | 0 | | 0 | Value 4 | 0 | 0 | | 4472 | 34854 | 1482835555 | 0 | | 0 | Value 5 | 0 | 0 | Doing the same with zabbix_sender 3.0: | 4473 | 34854 | 1482835555 | 0 | | 0 | Value 1 | 0 | 74252 | | 4474 | 34854 | 1482835555 | 0 | | 0 | Value 2 | 0 | 96549 | | 4475 | 34854 | 1482835555 | 0 | | 0 | Value 3 | 0 | 64633 | | 4476 | 34854 | 1482835555 | 0 | | 0 | Value 4 | 0 | 43515 | | 4477 | 34854 | 1482835555 | 0 | | 0 | Value 5 | 0 | 43217 | It can be seen that later values can have lower nanoseconds (last column) than earlier values. However, this should be left for asaveljevs As discussed, since there have always been ways to achieve the same timestamps, no objections for moving this forward. CLOSED. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by dimir [ 2016 Dec 27 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
(2) [PS] Fix for trunk had lots of conflicts. Fixed in svn://svn.zabbix.com/branches/dev/ZBX-11525-trunk. RESOLVED asaveljevs Trunk (version 3.3.0) should probably not have a ChangeLog entry for this issue. <dimir> Thanks, fixed. asaveljevs Consider process_proxy_data() function. There, we call process_proxy_history_data_33(), which can modify the value of client_timediff, after which we call process_discovery_data_contents() and process_auto_registration_contents() with that modified value. This does not seem to be correct - the original value should be used instead. asaveljevs While at it, please take a look at stylistic suggestions in r64750. REOPENED. <dimir> Looks good. RESOLVED in r64764,r64777 vso CLOSED | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Vladislavs Sokurenko [ 2017 Jan 04 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
(3) [F] Nanoseconds are good now, however frontend does not take those into account when sorting. Sort options $options['sortfield'] = ['itemid', 'clock']; Array Parameters: Array ( [history] => 4 [itemids] => Array ( [0] => 23670 ) [output] => extend [sortorder] => DESC [limit] => 500 [sortfield] => Array ( [0] => itemid [1] => clock ) ) Result: Array ( [0] => Array ( [itemid] => 23670 [clock] => 1480080946 [value] => 2016-11-25 15:10:49,187 WARN [Test] TestMessage1 [ns] => 63392 ) [1] => Array ( [itemid] => 23670 [clock] => 1480080946 [value] => 2016-11-25 15:11:12,187 WARN [Test] TestMessage2 [ns] => 63393 ) [2] => Array ( [itemid] => 23670 [clock] => 1480080946 [value] => 2016-11-25 15:12:13,187 WARN [Test] TestMessage3 [ns] => 63394 ) [3] => Array ( [itemid] => 23670 [clock] => 1480080946 [value] => 2016-11-25 15:13:44,187 WARN [Test] TestMessage4 [ns] => 63395 ) [4] => Array ( [itemid] => 23670 [clock] => 1480080946 [value] => 2016-11-25 15:14:05,187 WARN [Test] TestMessage5 [ns] => 63396 ) [5] => Array ( [itemid] => 23670 [clock] => 1480080946 [value] => 2016-11-25 15:15:19,187 WARN [Test] TestMessage6 [ns] => 63397 ) ) SQL query
SQL (0.000199): SELECT h.* FROM history_text h WHERE h.itemid='23670' ORDER BY h.itemid DESC,h.clock DESC LIMIT 500 OFFSET 0
history.php:124 → CView->render() → include() → CScreenHistory->get() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → CHistory->get() → DBselect() in include/classes/api/services/CHistory.php:191
sasha We can't fix it in old versions of Zabbix because issues with performance. Please create a new ZBX issue. vso Result before was tested on svn://svn.zabbix.com/branches/dev/ZBX-11525-trunk Array 3. history.get [CScreenHistory.php:178] Parameters: Array ( [history] => 4 [itemids] => Array ( [0] => 23667 ) [output] => extend [sortorder] => DESC [limit] => 500 [sortfield] => id ) Result: Array ( [0] => Array ( [id] => 6 [itemid] => 23667 [clock] => 1480080946 [value] => 2016-11-25 15:15:19,187 WARN [Test] TestMessage6 [ns] => 64098 ) [1] => Array ( [id] => 5 [itemid] => 23667 [clock] => 1480080946 [value] => 2016-11-25 15:14:05,187 WARN [Test] TestMessage5 [ns] => 64097 ) [2] => Array ( [id] => 4 [itemid] => 23667 [clock] => 1480080946 [value] => 2016-11-25 15:13:44,187 WARN [Test] TestMessage4 [ns] => 64096 ) [3] => Array ( [id] => 3 [itemid] => 23667 [clock] => 1480080946 [value] => 2016-11-25 15:12:13,187 WARN [Test] TestMessage3 [ns] => 64095 ) [4] => Array ( [id] => 2 [itemid] => 23667 [clock] => 1480080946 [value] => 2016-11-25 15:11:12,187 WARN [Test] TestMessage2 [ns] => 64094 ) [5] => Array ( [id] => 1 [itemid] => 23667 [clock] => 1480080946 [value] => 2016-11-25 15:10:49,187 WARN [Test] TestMessage1 [ns] => 64093 ) ) SQL
SELECT h.* FROM history_text h WHERE h.itemid='23667' ORDER BY h.id DESC LIMIT 500 OFFSET 0
Also for trunk I think that it's a bit odd that history values are sorted by itemid which is the same for the item. <dimir> We removed id fields from history tables in 3.2 ( | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by Vladislavs Sokurenko [ 2017 Jan 05 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
server side successfully tested, however frontend still display in wrong order. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comment by dimir [ 2017 Jan 13 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Fixed in
|