[ZBX-9915] agent on windows may crash on eventlog when perform FormatMessageW Created: 2015 Sep 28  Updated: 2019 Dec 10

Status: Open
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: None
Fix Version/s: None

Type: Incident report Priority: Trivial
Reporter: Oleksii Zagorskyi Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: crash, eventlog, windows
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate

 Description   

First of all this is an attempt to leave some report which may help if the crash will happen once again for someone.
This issue could not be troubleshooted to possibly understand what was exactly wrong and possibly avoid agent crashes.
See comments below.



 Comments   
Comment by Oleksii Zagorskyi [ 2015 Sep 28 ]

Zabbix agent on Windows Server 2003 x64 (Version: 5.2.3790 Service Pack 2 Build 3790) started to crash after 5 minutes of runtime.
Agent version 2.4.5 (exe for x64).
It appeared to be because of item key:

eventlog[Application,,"Error|Critical",VERY_UNIQUE_THING,,,skip]

Update interval 300 sec, "lastlogsize" in database 190414.

As currently we know, in case if "lastlogsize" is not 0 the "skip" parameter is actually does not have effect.
So after agent started, it scans the Application log from very beginning, as currently it have a lot of events logged after 190414.
Ant it meat an event with recordID=319487 which it cannot parse and crashes.
In agent log with DebugLevel=4 we can see following:

 36316:20150923:083025.951 In process_eventlog() source:'Application' lastlogsize:319486
 36316:20150923:083025.951 In zbx_open_eventlog()
 36316:20150923:083025.951 zbx_open_eventlog() pNumRecords:91805 pLatestRecord:236773
 36316:20150923:083025.951 End of zbx_open_eventlog():SUCCEED
 36316:20150923:083025.982 In zbx_get_eventlog_message() lastlogsize:319487
 65624:20150923:083026.591 In collect_perfstat()
 65624:20150923:083026.591 End of collect_perfstat()
 65624:20150923:083027.762 In collect_perfstat()
 65624:20150923:083027.762 End of collect_perfstat()
 65624:20150923:083029.075 In collect_perfstat()
 65624:20150923:083029.075 End of collect_perfstat()

Note that last 3 seconds we don't see activity for "active checks" agent process (pid=36316), though all previous time it logged a lot of lines for eventlog parsing.
It use specially compiled zabbix agent (with extended output after crash, and latest trunk 3.0.0alpha3) we can see next:

 35700:20150923:143910.962 End of zbx_get_eventlog_message():SUCCEED
 35700:20150923:143910.962 End of process_eventlog():SUCCEED
 35700:20150923:143910.962 In process_eventlog() source:'Application' lastlogsize:319485
 35700:20150923:143910.962 In zbx_open_eventlog()
 35700:20150923:143910.962 End of zbx_open_eventlog():FAIL FirstID:239161 LastID:329786 numIDs:90626
 35700:20150923:143910.962 End of zbx_open_eventlog():SUCCEED
 35700:20150923:143910.962 In zbx_get_eventlog_message() lastlogsize:319486
 35700:20150923:143910.962 End of zbx_get_eventlog_message():SUCCEED
 35700:20150923:143910.962 End of process_eventlog():SUCCEED
 35700:20150923:143910.962 In process_eventlog() source:'Application' lastlogsize:319486
 35700:20150923:143910.962 In zbx_open_eventlog()
 35700:20150923:143910.962 End of zbx_open_eventlog():FAIL FirstID:239161 LastID:329786 numIDs:90626
 35700:20150923:143910.962 End of zbx_open_eventlog():SUCCEED
 35700:20150923:143910.962 In zbx_get_eventlog_message() lastlogsize:319487
 35700:20150923:143910.977 Unhandled exception c0000005 detected at 0x0000000077D67BBD. Crashing ...
 35700:20150923:143910.977 ====== Fatal information: ======
 35700:20150923:143910.977 Program counter: 0x77d67bbd
 35700:20150923:143910.977 === Registers: ===
 35700:20150923:143910.977 r8      =          1f1e9ec =             32631276 =             32631276
 35700:20150923:143910.977 r9      =          1e40220 =             31719968 =             31719968
 35700:20150923:143910.977 r10     =                1 =                    1 =                    1
 35700:20150923:143910.977 r11     =                3 =                    3 =                    3
 35700:20150923:143910.977 r12     =              400 =                 1024 =                 1024
 35700:20150923:143910.977 r13     =                0 =                    0 =                    0
 35700:20150923:143910.977 r14     =          31bf6c8 =             52164296 =             52164296
 35700:20150923:143910.977 r15     =                0 =                    0 =                    0
 35700:20150923:143910.977 rdi     =                0 =                    0 =                    0
 35700:20150923:143910.977 rsi     =                0 =                    0 =                    0
 35700:20150923:143910.977 rbp     =             29ff =                10751 =                10751
 35700:20150923:143910.977 rbx     =         c07eea46 =           3229542982 =           3229542982
 35700:20150923:143910.977 rdx     =         c07eea46 =           3229542982 =           3229542982
 35700:20150923:143910.977 rax     =                0 =                    0 =                    0
 35700:20150923:143910.977 rcx     =         a0354c9a =           2687847578 =           2687847578
 35700:20150923:143910.977 rsp     =          31bf510 =             52163856 =             52163856
 35700:20150923:143910.977 efl     =            10246 =                66118 =                66118
 35700:20150923:143910.977 csgsfs  =         332b5300 =            858477312 =            858477312
 35700:20150923:143910.977 === Backtrace: ===
 35700:20150923:143911.071 11: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(DeviceIoControl+0x29d) [0x77d67bbd]
 35700:20150923:143911.071 10: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(FormatMessageW+0x37) [0x77d67d27]
 35700:20150923:143911.071 9: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(zbx_format_message+0x35 s:\trunk\src\zabbix_agent\eventlog.c:216) [0x4001e215]
 35700:20150923:143911.071 8: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(zbx_get_eventlog_message+0x246 s:\trunk\src\zabbix_agent\eventlog.c:342) [0x4001f776]
 35700:20150923:143911.071 7: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(process_eventlog+0x185 s:\trunk\src\zabbix_agent\eventlog.c:465) [0x4001fab5]
 35700:20150923:143911.071 6: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(process_eventlog_check+0xbfb s:\trunk\src\zabbix_agent\active.c:1322) [0x4001d63b]
 35700:20150923:143911.071 5: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(process_active_checks+0x145 s:\trunk\src\zabbix_agent\active.c:1475) [0x4001d9a5]
 35700:20150923:143911.071 4: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(active_checks_thread+0x196 s:\trunk\src\zabbix_agent\active.c:1580) [0x4001dd26]
 35700:20150923:143911.071 3: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(zbx_win_thread_entry+0xa s:\trunk\src\libs\zbxsys\threads.c:78) [0x4000cbca]
 35700:20150923:143911.071 2: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(_callthreadstartex+0x17 f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c:348) [0x4003794f]
 35700:20150923:143911.071 1: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(_threadstartex+0x9f f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c:326) [0x40037a03]
 35700:20150923:143911.071 0: C:\Program Files (x86)\Zabbix Agent\zabbix_agentd.exe(BaseThreadStart+0x3a) [0x77d6b71a]
 35700:20150923:143911.071 ================================
 38436:20150923:143911.149 In collect_perfstat()
 38436:20150923:143911.149 End of collect_perfstat()
 38436:20150923:143912.524 In collect_perfstat()
 38436:20150923:143912.524 End of collect_perfstat()
 

note the same recordID=319487.

In Application log every time for agent crash I can see the following:

'The description for Event ID:1000 in Source:'Application Error' cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. The following information is part of the event: zabbix_agentd.exe; 2.4.5.53233; kernel32.dll; 5.2.3790.4480; 0000000000027bbd'

It was not possible to read what contains recordID=319487 on the same windows host. Tried MMC event viewer, wmi request - they all crashed too.

I was able to save the Application log as .evt file and open it on another windows computer.
The recordID=319487 appeared to be:

'The description for Event ID:0 in Source:'TreeSize Free' cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. The following information is part of the event: EAccessViolation:
Access violation at address 0053D6FA in module 'TreeSizeFree.exe'. Read of address 00000250'

Detailed information (if open .evt) on Win7+:

  • <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  • <System>
    <Provider Name="TreeSize Free" />
    <EventID Qualifiers="0">0</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2015-09-21T07:57:08.000000000Z" />
    <EventRecordID>319487</EventRecordID>
    <Channel>C:\app.evt</Channel>
    <Computer>obfuscated</Computer>
    <Security UserID="obfuscated" />
    </System>
  • <EventData>
    <Data>EAccessViolation: Access violation at address 0053D6FA in module 'TreeSizeFree.exe'. Read of address 00000250</Data>
    </EventData>
    </Event>

"TreeSize Free" source from the failed windows server:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Eventlog\Application\TreeSize Free]
"EventMessageFile"="C:\\TreeSizeFree.exe"

The TreeSizeFree application (free and available in the Internet) being used on the windows server time to time.

There were not other way than erase all events in the Application log to avoid crashes as this server in production and have to be monitored.

There was a hope that saved .evt file may help to investigate the crash.
I've imported the same registry key, downloaded the TreeSizeFree.exe app to C:\ (although the event stayed the same "The description for Event ID:0 in Source:'TreeSize Free' cannot be found ...")
Then, using a hackish approach, I even successfully "attached" the .evt file as a "Test" eventlog (on WinXP x32) and tested it using zabbix agent the same version 3.0.0alpha3.
I had no agent crashes.

I have no more idea how to try to reproduce it.
But it definitely has once happened so may happen in future again ...

Generated at Wed Apr 17 01:41:25 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.