[ZBX-15704] Agent log item may send meta information without obvious needs Created: 2019 Feb 21  Updated: 2024 Apr 10  Resolved: 2019 May 01

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 4.0.4
Fix Version/s: 4.0.8rc1, 4.2.2rc1, 4.4.0alpha1, 4.4 (plan)

Type: Problem report Priority: Major
Reporter: Oleksii Zagorskyi Assignee: Aleksejs Sestakovs
Resolution: Fixed Votes: 0
Labels: log, metadata, redundancy
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Team: Team C
Team: Team C
Sprint: Sprint 51 (Apr 2019)
Story Points: 2

 Description   

Have a log[/tmp/aaa] item and fill the log file by a few lines in one go.

echo -e "Started-1\nFinished-1\nStarted-2\nFinished-2" >> /tmp/aaa

Agent will spot log file update and will send the firts line (see ZBX-9229) and later, here what happens in the debuglog:

 18654:20190110:122009.309 In need_meta_update() key:log[/tmp/aaa]
 18654:20190110:122009.309 End of need_meta_update():SUCCEED
 18654:20190110:122009.309 In process_value() key:'it0:log[/tmp/aaa]' value:'(null)'
 18654:20190110:122009.309 In send_buffer() host:'127.0.0.1' port:10051 entries:3/100
 18654:20190110:122009.309 send_buffer() now:1547115609 lastsent:1547115609 now-lastsent:0 BufferSend:1; will not send now
 18654:20190110:122009.309 End of send_buffer():SUCCEED
 18654:20190110:122009.309 buffer: new element 3
 18654:20190110:122009.309 End of process_value():SUCCEED
 18654:20190110:122009.309 End of process_active_checks()
 18654:20190110:122009.309 In get_min_nextcheck()
 18654:20190110:122009.309 End of get_min_nextcheck():1547115610
 18654:20190110:122009.309 __zbx_zbx_setproctitle() title:'active checks #1 [idle 1 sec]'
 18650:20190110:122010.189 __zbx_zbx_setproctitle() title:'collector [processing data]'
 18650:20190110:122010.189 In update_cpustats()
 18650:20190110:122010.189 End of update_cpustats()
 18650:20190110:122010.189 __zbx_zbx_setproctitle() title:'collector [idle 1 sec]'
 18654:20190110:122010.309 In send_buffer() host:'127.0.0.1' port:10051 entries:4/100
 18654:20190110:122010.310 JSON before sending [{"request":"agent data","session":"febb1670f21905dff57620de33b60d28","data":[{"host":"it0","key":"log[/tmp/aaa]","value":"Finished-1","lastlogsize":147,"id":17,"clock":1547115609,"ns":308704727},{"host":"it0","key":"log[/tmp/aaa]","value":"Started-2","lastlogsize":157,"id":18,"clock":1547115609,"ns":308916709},{"host":"it0","key":"log[/tmp/aaa]","value":"Finished-2","lastlogsize":168,"id":19,"clock":1547115609,"ns":309122137},{"host":"it0","key":"log[/tmp/aaa]","lastlogsize":168,"id":20,"clock":1547115609,"ns":309472514}],"clock":1547115610,"ns":310209522}]
 18654:20190110:122010.310 JSON back [{"response":"success","info":"processed: 4; failed: 0; total: 4; seconds spent: 0.000264"}]
 18654:20190110:122010.310 In check_response() response:'{"response":"success","info":"processed: 4; failed: 0; total: 4; seconds spent: 0.000264"}'
 18654:20190110:122010.310 info from server: 'processed: 4; failed: 0; total: 4; seconds spent: 0.000264'
 18654:20190110:122010.311 End of check_response():SUCCEED
 18654:20190110:122010.311 OK
 18654:20190110:122010.311 End of send_buffer():SUCCEED

note please how this meta information only (without "value":) was included (duplicated) without obvious needs:

{"host":"it0","key":"log[/tmp/aaa]","lastlogsize":168,"id":20,"clock":1547115609,"ns":309472514}

Here is related agent code:

	if (0 != (ZBX_METRIC_FLAG_LOG & metric->flags))
	{
		/* meta information update is needed if:                                              */
		/* - lastlogsize or mtime changed since we last sent within this check                */
		/* - nothing was sent during this check and state changed from notsupported to normal */
		/* - nothing was sent during this check and it's a new metric                         */
		if (lastlogsize_sent != metric->lastlogsize || mtime_sent != metric->mtime ||
				(lastlogsize_last == lastlogsize_sent && mtime_last == mtime_sent &&
						(old_state != metric->state ||
						0 != (ZBX_METRIC_FLAG_NEW & metric->flags))))
		{
			/* needs meta information update */
			ret = SUCCEED;
		}
	}

I saw this issue many times when troubleshooted a server side issue with log monitoring.
Those cases were different and looking like even more strange, but I didn't write down steps to reproduce them. Hope a root cause is the same.

Something should be improved to avoid that.



 Comments   
Comment by Oleksii Zagorskyi [ 2019 Feb 22 ]

Checked zabbix_server.log, to see does the redundant meta information cause one more "update items set lastlogsize=168 where itemid=YYY" or not.
It does NOT, luckily.

But it does 2 redundant transactions started and immediately closed, like in comment below.

Comment by Oleksii Zagorskyi [ 2019 Feb 22 ]

It's probably related.
One more scenario.

Just restart an agent, which monitors a log file, wait 5 seconds (BufferSend config param) and make this grep:

# egrep "JSON|got \[" zabbix_agentd.log
  4655:20190222:110803.524 got [{"response":"success","data":[{"key":"log[/tmp/aaa]","delay":1,"lastlogsize":494,"mtime":0}]}]
  4655:20190222:110808.529 JSON before sending [{"request":"agent data","session":"6a2daf55e47a6d59c27d9fc6440ae57e","data":[{"host":"it00","key":"log[/tmp/aaa]","lastlogsize":494,"id":1,"clock":1550826483,"ns":525275354}],"clock":1550826488,"ns":529629386}]
  4655:20190222:110808.530 JSON back [{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000093"}]

No "value" here, "lastlogsize" was not changed.
Question - why zabbix agent sends this unneeded data to server?

It does cause SQLs on server side (2 transactions started and closed) and some redundant work, IMO, for history syncers, here is debuglog:

  3321:20190222:110808.567 __zbx_zbx_setproctitle() title:'history syncer #1 [processed 0 values, 0 triggers in 0.000260 sec, syncing history]'
  3321:20190222:110808.567 In zbx_sync_history_cache() history_num:1
  3321:20190222:110808.567 In DCmass_prepare_history() history_num:1
  3321:20190222:110808.567 End of DCmass_prepare_history()
  3321:20190222:110808.567 In DBmass_add_history()
  3321:20190222:110808.567 End of DBmass_add_history()
  3321:20190222:110808.567 In DCmass_update_trends()
  3321:20190222:110808.567 End of DCmass_update_trends()
  3321:20190222:110808.567 query [txnlev:1] [begin;]
  3321:20190222:110808.568 In DBmass_update_items()
  3321:20190222:110808.568 End of DBmass_update_items()
  3321:20190222:110808.568 In process_events() events_num:0
  3321:20190222:110808.568 End of process_events() processed:0
  3321:20190222:110808.568 query [txnlev:1] [commit;]
  3321:20190222:110808.568 In DCupdate_trends()
  3321:20190222:110808.568 End of DCupdate_trends()
  3321:20190222:110808.568 query [txnlev:1] [begin;]
  3321:20190222:110808.568 In recalculate_triggers()
  3321:20190222:110808.568 End of recalculate_triggers()
  3321:20190222:110808.568 In process_events() events_num:0
  3321:20190222:110808.568 End of process_events() processed:0
  3321:20190222:110808.569 query [txnlev:1] [commit;]
  3321:20190222:110808.569 In DBupdate_itservices()
  3321:20190222:110808.569 End of DBupdate_itservices():SUCCEED
  3321:20190222:110808.569 End of zbx_sync_history_cache()
  3321:20190222:110808.569 __zbx_zbx_setproctitle() title:'history syncer #1 [processed 1 values, 0 triggers in 0.002150 sec, idle 1 sec]'
Comment by Oleksii Zagorskyi [ 2019 Feb 22 ]

This should be retested after ZBX-9229 fix. It can be related.

Comment by Oleksii Zagorskyi [ 2019 Feb 22 ]

When send the one line value to log in first 5 seconds after agent restart, it will send the meta information twice:

# egrep "JSON|got \[" zabbix_agentd.log
  6033:20190222:115146.481 got [{"response":"success","data":[{"key":"log[/tmp/aaa]","delay":1,"lastlogsize":630,"mtime":0}]}]
  6033:20190222:115151.488 JSON before sending [{"request":"agent data","session":"57adc7302211dbb16c16fe264b68c4d9","data":[{"host":"it00","key":"log[/tmp/aaa]","lastlogsize":630,"id":1,"clock":1550829106,"ns":481661750},{"host":"it00","key":"log[/tmp/aaa]","value":"Started-1","lastlogsize":640,"id":2,"clock":1550829108,"ns":483791157},{"host":"it00","key":"log[/tmp/aaa]","lastlogsize":640,"id":3,"clock":1550829108,"ns":484096688}],"clock":1550829111,"ns":488124666}]
  6033:20190222:115151.488 JSON back [{"response":"success","info":"processed: 3; failed: 0; total: 3; seconds spent: 0.000214"}]

it's kind of combined issue described in summary and in the comment above.
Maybe root cause of these two is the same.

Comment by Aleksejs Sestakovs [ 2019 Apr 30 ]

Available in:

  • 4.0.8rc1 9ff9a1332cd
  • 4.2.2rc1 021fc8e865b
  • 4.4.0alpha1 (trunk) 65ba5f227c5
Generated at Sat Apr 20 07:20:08 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.