[ZBX-12260] log item type with UTF-16 encoding Created: 2017 Jun 06  Updated: 2018 Oct 09  Resolved: 2017 Nov 05

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: None
Fix Version/s: 3.0.13rc1, 3.2.10rc1, 3.4.4rc1, 4.0.0alpha1, 4.0 (plan)

Type: Problem report Priority: Minor
Reporter: Olegs Vasiljevs (Inactive) Assignee: Vladislavs Sokurenko
Resolution: Fixed Votes: 1
Labels: agent, bom, log
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Zabbix server v3.0.9 (Linux, SLES12sp1), v3.2.4 (Centos 7)
Zabbix agent v2.4.4 (2.2.7, 3.2.0) on Windows 7 Pro sp1 (64-bit), Zabbix agent v2.4.4, v3.0.4, 3.0.9, v3.2.0 on Windows 10 Pro


Attachments: results.log
Issue Links:
Duplicate
Team: Team A
Sprint: Sprint 18, Sprint 19
Story Points: 1

 Description   

Log items are not monitored correctly when the file is UTF-16 encoded, and possibly also when it is UTF-8 encoded.

Item keys:

log["C:\Users\User\Desktop\results.log",,"UTF-16"]
log["C:\Users\User\Desktop\results_utf8.log",,"UTF-8"]

File type:
The file encoding is UCS-2 LE with BOM, which in a Windows environment is a synonym for UTF-16. A sample file is attached.
The problem persists after converting the file to UTF-8.

Expected behavior:
Every line of the file appears in the item values.

Visible behavior:
The last line of the file does not appear in the item values. Once the last line changes, the previous one is added successfully.



 Comments   
Comment by richlv [ 2017 Jun 06 ]

Can you please attach a test file?

Edit: oh, is "results.log" the test file?

Comment by Olegs Vasiljevs (Inactive) [ 2017 Jun 06 ]

Yes, richlv, the attached file is the test file to be monitored.

Comment by Vladislavs Sokurenko [ 2017 Jun 06 ]

Reproduced in trunk on Linux agent.

The hexdump output shows the byte order mark FEFF:

hexdump results.log 
0000000 feff 0031 0032 002e 0030 0031 002e 0032
0000010 0030 0031 0037 0020 0031 0032 003a 0032

Also enca confirms this

enca results.log -L none
Universal character set 2 bytes; UCS-2; BMP
  CRLF line terminators
  Byte order reversed in pairs (1,2 -> 2,1)

Item

log["/home/vso/Downloads/results.log",,"UCS-2LE"]

Expected:
all logs are correctly shown

Actual:
There is an empty line after each log entry.

zalex_ua Vladislavs, I have to note that a BOM of FEFF means BE order. https://en.wikipedia.org/wiki/Byte_order_mark
I have such a file and got it working (with or without a BOM in the file) by specifying the encoding as UTF-16BE, but only on Linux. I could not get it working on Windows.
Heh, you need to be careful with the "hexdump" command because of its own print order:

# cat sdf
1234567890

# hexdump sdf
0000000 3231 3433 3635 3837 303

# hexdump -C -n6 results.log
00000000  ff fe 31 00 32 00                                 |..1.2.|

# hexdump -C -n6 bomBE.debug
00000000  fe ff 00 32 00 30                                 |...2.0|

So this comment is just an FYI.
vso thanks, so the attached file works for you on Linux, but not on Windows?

zalex_ua The results.log attached here is in LE order, and yes, I was able to monitor it on both Windows and Linux by specifying the encoding as just UTF-16.
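
The point about hexdump's print order can be illustrated with a short Python sketch. Python is used only for illustration and is not part of the Zabbix agent. The sketch assumes a little-endian host, where plain hexdump prints 16-bit words in host byte order; the helper names bom_byte_order and hexdump_words are hypothetical. The byte strings are the ones shown by hexdump -C above.

```python
import codecs
import struct


def bom_byte_order(data: bytes) -> str:
    """Report the byte order implied by a leading UTF-16 BOM, if any."""
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes ff fe
        return "UTF-16LE"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes fe ff
        return "UTF-16BE"
    return "unknown"


def hexdump_words(data: bytes) -> str:
    """Format bytes the way plain hexdump does on a little-endian host:
    as 16-bit words, so each pair of bytes appears swapped."""
    if len(data) % 2:
        data += b"\x00"  # pad an odd trailing byte, as hexdump does
    words = struct.unpack("<%dH" % (len(data) // 2), data)
    return " ".join("%04x" % w for w in words)


# First six bytes of each file, taken from the hexdump -C output above.
le_head = bytes.fromhex("fffe31003200")  # results.log
be_head = bytes.fromhex("feff00320030")  # bomBE.debug

print(bom_byte_order(le_head), "|", hexdump_words(le_head))
print(bom_byte_order(be_head), "|", hexdump_words(be_head))
```

For results.log the word view starts with feff even though the bytes on disk are ff fe, which is why hexdump -C is the safer way to read a BOM.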

Comment by Vladislavs Sokurenko [ 2017 Jun 06 ]

olegs.vasiljevs, can you please run enca results_utf8.log -L none on your log file and provide the results?

Comment by Olegs Vasiljevs (Inactive) [ 2017 Jun 07 ]

Sorry for the late response. In case it matters, I converted the file to both UTF-8 and UTF-8 with BOM.

results_utf8.log

enca results_utf8.log -L none
7bit ASCII characters
  CRLF line terminators

results_utf8-bom.log

enca results_utf8-bom.log -L none
Universal transformation format 8 bits; UTF-8
  CRLF line terminators
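
The two enca results above are consistent with each other: text that contains only ASCII characters is byte-for-byte identical in ASCII and in UTF-8 without a BOM, so only the BOM variant is recognisable as UTF-8. A minimal Python sketch of this, for illustration only; the sample line is an assumption reconstructed from the hexdump earlier in this issue, and is_7bit is a hypothetical helper.

```python
import codecs

# Assumed sample: the line prefix visible in the hexdump, with a CRLF terminator.
line = "12.01.2017 12:2\r\n"

plain = line.encode("utf-8")         # UTF-8 without BOM
with_bom = codecs.BOM_UTF8 + plain   # UTF-8 with BOM (bytes ef bb bf first)


def is_7bit(data: bytes) -> bool:
    """True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)


print(plain == line.encode("ascii"))   # same bytes as plain ASCII
print(is_7bit(plain), is_7bit(with_bom))

# The "utf-8-sig" codec strips a leading BOM; plain "utf-8" keeps it as U+FEFF.
print(repr(with_bom.decode("utf-8-sig")))
print(repr(with_bom.decode("utf-8")))
```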

Comment by Vladislavs Sokurenko [ 2017 Oct 09 ]

I see you are using v3.2.4. Can you reproduce the issue with the latest version of the server and agent, 3.2.8?

Comment by Vladislavs Sokurenko [ 2017 Oct 09 ]

To sum up, zalex_ua states that UTF-16 works correctly on both Linux and Windows.
The original bug report is for 3.2.4, which has a known issue, ZBX-11855.

So what needs fixing is UTF-16BE on Windows; UTF-16LE must also be accepted.
It would also be good to accept UCS-2 and UCS-2BE as encoding strings, even though UTF-16 can be used in their place.

Regarding the original report, waiting for olegs.vasiljevs to confirm that the issue is not present in the latest versions.
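
As a rough illustration of why the encoding string matters, the sketch below decodes the same LE-with-BOM bytes under three names. It uses Python's codec names as a stand-in; the agent's own conversion code is not shown in this issue and may behave differently. The sample text is an assumption reconstructed from the hexdump earlier in this issue.

```python
import codecs

# Assumed sample: the line prefix visible in the hexdump, with a CRLF terminator.
text = "12.01.2017 12:2\r\n"
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")  # LE with BOM, like results.log

as_utf16 = data.decode("utf-16")       # BOM selects the byte order and is consumed
as_utf16le = data.decode("utf-16-le")  # fixed order, BOM survives as U+FEFF
as_utf16be = data.decode("utf-16-be")  # wrong order, every code unit is byte-swapped

print(repr(as_utf16))
print(repr(as_utf16le))
print(repr(as_utf16be))
```

In this stand-in, only the generic "utf-16" name yields the clean text for a file that carries a BOM; the explicit LE name keeps the BOM as a leading character, and the BE name produces unrelated characters.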

Comment by Andris Mednis [ 2017 Oct 11 ]

Successfully tested. See minor modification in r73425.

Comment by Andris Zeila [ 2017 Oct 20 ]

Released in:

  • pre-3.0.13rc1 r73776,r73777
  • pre-3.2.8rc1 r73778
  • pre-3.4.4rc1 r73781
  • pre-4.0.0alpha1 r73782