[ZBX-12250] Item logrt skips created new log file after agent restarted Created: 2017 Jun 01  Updated: 2018 Jun 11  Resolved: 2018 Jun 08

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 2.2.17, 4.0.0alpha7
Fix Version/s: 3.0.19rc1, 3.4.11rc1, 4.0.0alpha8, 4.0 (plan)

Type: Incident report Priority: Minor
Reporter: Junichi Karikomi Assignee: Andris Mednis
Resolution: Fixed Votes: 0
Labels: logmonitoring, logrt
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Team: Team A
Sprint: Sprint 32, Sprint 33, Sprint 34, Sprint 35
Story Points: 4

 Description   

This problem can be observed in the following situation:
1. Create a log file.

 # echo abcd > /tmp/logtest/f1

2. Create item "logrt[/tmp/logtest/f1]" and begin monitoring.
The string "abcd" is registered in history.
3. Stop the agent.

 # service zabbix-agent stop

4. Remove the log file.

 # rm /tmp/logtest/f1

5. Restart the agent.

 # service zabbix-agent start

6. Create a new log file.
No string is registered in history.

 # echo ABCD > /tmp/logtest/f1

7. Append a string to the log file.
The string "XYZ" is registered in history.

 # echo XYZ >> /tmp/logtest/f1

In step 5, the agent process sees that no files exist.
When it finds a new file at step 6, the file should be treated as NEW.
This file should be read and its strings registered in history.
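
A detail of the steps above that is easy to miss: "echo abcd" and "echo ABCD" both write exactly 5 bytes (4 characters plus a newline), so the recreated file has the same size as the removed one. The following is a minimal sketch that only checks the sizes; it uses a temporary directory (an assumption, in place of /tmp/logtest) and does not touch the agent.

```shell
# Illustration only: uses a throwaway directory instead of /tmp/logtest.
dir=$(mktemp -d)

# Step 1: the original log file.
echo abcd > "$dir/f1"
old_size=$(wc -c < "$dir/f1" | tr -d ' ')

# Step 4: remove it.
rm "$dir/f1"

# Step 6: the recreated log file.
echo ABCD > "$dir/f1"
new_size=$(wc -c < "$dir/f1" | tr -d ' ')

echo "old=$old_size new=$new_size"
rm -r "$dir"
```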



 Comments   
Comment by Andris Mednis [ 2018 May 11 ]

Successfully reproduced in 2.2 and trunk, so most likely all versions are affected.
After agent is restarted, it:

  1. receives from server "lastlogsize:5 mtime:1526025056" (for example).
  2. sees that there are no files matching. No worry: for logrt[], files may become inaccessible, and that is not an error; it could happen during rotation, right?
  3. sees a new log file with mtime:1526025466 size:5 (for example). As the log file size is 5 (the file size has not changed), agent does nothing. And this is the problem.

What could the agent have done better? The agent could notice that after start it received "mtime:1526025056" from the server, but the new log file has a different mtime:1526025466. That is the sign that things have changed and the log file should be analyzed from the start.

What do you think? Would it be a good solution (for both small and large log files) to trigger analysis from the start in this situation on an 'mtime' change?

If the size is smaller, the agent analyzes from the start. If the size is larger, the agent analyzes from lastlogsize (e.g. 5). When the size is equal, the problem arises as you described.
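
The three size cases above can be sketched as a small shell function. This is only an illustration of the behavior described in this comment, not the agent's actual code; the function name and the "skip" marker are hypothetical.

```shell
# Hypothetical helper, not Zabbix code: prints the offset from which the
# file would be read, or "skip" when the file is treated as unchanged.
#   $1 = lastlogsize received from the server
#   $2 = current size of the log file
start_offset_before_fix() {
    lastlogsize=$1
    size=$2
    if [ "$size" -lt "$lastlogsize" ]; then
        echo 0                 # file is smaller: analyze from the start
    elif [ "$size" -gt "$lastlogsize" ]; then
        echo "$lastlogsize"    # file is larger: continue from lastlogsize
    else
        echo skip              # same size: nothing is read
    fi
}

start_offset_before_fix 5 3    # smaller file
start_offset_before_fix 5 9    # larger file
start_offset_before_fix 5 5    # equal size, the case in this report
```

With the values from the report, the recreated 5-byte file falls into the "equal" case, and after "XYZ" is appended the size is 9, so reading continues from offset 5 and only "XYZ" is registered.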

Comment by Junichi Karikomi [ 2018 May 14 ]

My thoughts are as follows.

If no file exists when the agent is restarted, and the agent finds a file afterwards, that file should be treated as a new file.

The agent should analyze it from the start regardless of the file size.

 

If the file is recreated between stopping and restarting the agent, the agent cannot judge whether the file has been recreated or not.

In this case, the agent works the same as it does currently.

 

Comment by Andris Mednis [ 2018 May 16 ]

So, it was proposed to modify logrt[] so that if there are no files to analyze, it keeps 'mtime' as is but resets 'lastlogsize' to 0.

Comment by Junichi Karikomi [ 2018 May 17 ]

I'm worried that the skip option will work incorrectly when the log file is recreated.
Could you look at ZBX-9869?

 

Comment by Andris Mednis [ 2018 May 25 ]

As for ZBX-9869: logrt[] with the 'skip' parameter was fixed in ZBX-13253 (from 3.0 and up).

Comment by Andris Mednis [ 2018 May 28 ]

I propose to modify agent as follows:

  • if the agent at start has received for logrt[] item 'lastlogsize' which is NOT 0, AND the log directory is accessible AND there are no log files, then set lastlogsize = 0. So, the initial 'lastlogsize' value from server is not used.
  • if the agent has seen log files (old log file list is not empty) AND the log directory is accessible AND there are no log files, then the agent keeps the old log file list and does NOT reset lastlogsize to 0. The old file list contains data about the files (sizes, initial block MD5 sums, i-nodes etc.), and we do not throw it away. When log files appear again, the agent uses the old list to determine where to analyze from. This reduces "false positives" caused by re-analyzing everything again.
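
The two rules above can be sketched as one shell function. This is a hedged illustration, not the committed fix: the function name and parameters are hypothetical, and "at start" is modeled here as "the old log file list is empty", which is an assumption.

```shell
# Hypothetical helper, not Zabbix code: prints the lastlogsize to keep
# after a scan of the log directory.
#   $1 = lastlogsize currently held by the agent
#   $2 = number of entries in the old log file list (assumed 0 right after start)
#   $3 = 1 if the log directory is accessible, 0 otherwise
#   $4 = number of matching log files found now
lastlogsize_after_scan() {
    lastlogsize=$1
    old_list_count=$2
    dir_accessible=$3
    files_found=$4

    if [ "$dir_accessible" -eq 1 ] && [ "$files_found" -eq 0 ] \
        && [ "$old_list_count" -eq 0 ] && [ "$lastlogsize" -ne 0 ]; then
        echo 0                 # first rule: ignore the value from the server
    else
        echo "$lastlogsize"    # second rule and all other cases: keep it
    fi
}

lastlogsize_after_scan 5 0 1 0    # just started, no files: reset
lastlogsize_after_scan 5 1 1 0    # files seen earlier, none now: keep
lastlogsize_after_scan 5 0 0 0    # directory not accessible: keep
```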

Comment by Andris Mednis [ 2018 May 30 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-12250-30 for Zabbix 3.0.

Comment by Andris Mednis [ 2018 Jun 06 ]

Available in versions:

  • pre-3.0.19rc1 r81607
  • pre-3.4.11rc1 r81610
  • pre-4.0.0alpha8 (trunk) r81612

Comment by Andris Mednis [ 2018 Jun 06 ]

Documented in:
https://www.zabbix.com/documentation/3.0/manual/config/items/itemtypes/log_items#important_notes
https://www.zabbix.com/documentation/3.4/manual/config/items/itemtypes/log_items#important_notes
https://www.zabbix.com/documentation/4.0/manual/config/items/itemtypes/log_items#important_notes

sasha Thanks! CLOSED

Generated at Fri Mar 29 16:19:46 EET 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.