[ZBX-8470] On CentOS/RHEL 5 64 bit unload module gets SIGSEGV Created: 2014 Jul 14 Updated: 2017 May 30 Resolved: 2014 Aug 01 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G), Proxy (P), Server (S) |
Affects Version/s: | 2.2.3 |
Fix Version/s: | 2.5.0 |
Type: | Incident report | Priority: | Minor |
Reporter: | Bruno Alberto Crespo Garcia | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 0 |
Labels: | loadablemodule, patch | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
CentOS/RHEL 5 64bit |
Attachments: | zabbix-module.patch zabbix-module.patch zabbix_agentd.log zabbix_agentd.log | ||||||||
Issue Links: |
|
Description |
The line *(void **)(&func_uninit) = dlsym(*module, ZBX_MODULE_FUNC_UNINIT); seems to hit a bug in the C compiler that ships with EL 5 and the process received a SIGSEGV. On the other hand this C construction is not very good practice. I created an small patch that fixes the problem and must be OK in any UNIX dialect. |
Comments |
Comment by Juris Miščenko (Inactive) [ 2014 Jul 15 ] |
Bruno, I couldn't reproduce the behaviour on our RHEL5 machine using the stock example 'dummy' module that ships with our distributions (src/modules/dummy/). If possible, could you provide the source of your module (at least the init and uninit portions of it). Thank you. |
Comment by Bruno Alberto Crespo Garcia [ 2014 Jul 15 ] |
Agent log on crash |
Comment by Bruno Alberto Crespo Garcia [ 2014 Jul 15 ] |
I reproduce the issue with the following steps:
And the agent crashes. I uploaded the log file for you. If you are interested I can provide the binary rpm files that my machine generates. I'm trying to do the same thing with an original rhel 5 system. Please let me know if you need some other thing. |
Comment by Bruno Alberto Crespo Garcia [ 2014 Jul 15 ] |
Just to be sure, I created a new virtual machine, installed RHEL 5.9 from DVD, no updates, repeat the process (including rebuilding the rpms in the rhel vm) and the agent crash with your dummy module. I attached the log file. |
Comment by Bruno Alberto Crespo Garcia [ 2014 Jul 15 ] |
BTW: As I said in the title, the issue appear in 64bit version, but not in 32bit version. |
Comment by Aleksandrs Saveljevs [ 2014 Jul 30 ] |
According to "man dlsym", the solution proposed in the patch may not be optimal: cosine = (double (*)(double)) dlsym(handle, "cos"); /* According to the ISO C standard, casting between function pointers and 'void *', as done above, produces undefined results. POSIX.1-2003 and POSIX.1-2008 accepted this state of affairs and proposed the following workaround: *(void **) (&cosine) = dlsym(handle, "cos"); This (clumsy) cast conforms with the ISO C standard and will avoid any compiler warnings. The 2013 Technical Corrigendum to POSIX.1-2008 (a.k.a. POSIX.1-2013) improved matters by requiring that conforming implementations support casting 'void *' to a function pointer. Nevertheless, some compilers (e.g., gcc with the '-pedantic' option) may complain about the cast used in this program. */ I remember doing approximately the following elsewhere and it seemed to work without problems:
typedef union
{
void *ptr;
void (*func)();
}
ZBX_FUNCTION_POINTER;
...
ZBX_FUNCTION_POINTER fp;
fp.ptr = dlsym(handle, "cos");
Then, when a function pointer was needed, I used "fp.func". We might or might not wish to try the same approach here. HP-UX man page: void *handle; int i, *iptr; int (*fptr)(int); /* open the needed object */ handle = dlopen("/usr/mydir/mylib.sl", RTLD_LAZY); /* find address of function and data objects */ fptr = (int (*)(int))dlsym(handle, "some_function"); iptr = (int *)dlsym(handle, "int_object"); ... Solaris man page: int *iptr, (*fptr)(int); /* open the needed object */ handle = dlopen("/usr/home/me/libfoo.so.1", RTLD_LAZY); /* find the address of function and data objects */ fptr = (int (*)(int))dlsym(handle, "my_function"); iptr = (int *)dlsym(handle, "my_object"); ... |
Comment by Bruno Alberto Crespo Garcia [ 2014 Jul 30 ] |
Sorry, I didn't know the POSIX workaround, anyway the compiler in RHEL/CentOS has a problem, another simple fix that I checked and seems to work is: typedef int (*int_func_ptr)(); void unload_modules() { const char *__function_name = "unload_modules"; volatile int_func_ptr func_uninit; void **module; zabbix_log(LOG_LEVEL_DEBUG, "In %s()", __function_name); /* there is no registered modules */ if (NULL == modules) return; for (module = modules; NULL != *module; module++) { *(void **)(&func_uninit) = dlsym(*module, ZBX_MODULE_FUNC_UNINIT); if (NULL == func_uninit) { .... I attach a complete patch.... |
Comment by Bruno Alberto Crespo Garcia [ 2014 Jul 30 ] |
Alexander, I used to think the same that the HP-UX and Solaris manuals, but it seems that Aleksandrs Saveljevs is right, the manual page from the OpenGroup: void *handle; int *iptr, (*fptr)(int); /* open the needed object */ handle = dlopen("/usr/home/me/libfoo.so", RTLD_LOCAL | RTLD_LAZY); /* find the address of function and data objects */ *(void **)(&fptr) = dlsym(handle, "my_function"); iptr = (int *)dlsym(handle, "my_object"); /* invoke function, passing value of integer as a parameter */ (*fptr)(*iptr); |
Comment by Aleksandrs Saveljevs [ 2014 Jul 31 ] |
There are two descriptions of dlsym() on OpenGroup's website. The one referred to by Bruno above is the first one:
The first (older, issue 6) suggests the *(void **)(&fptr) way and the second (newer, issue 7) suggests the (int (*)(int))dlsym(handle, "my_function") way. My workaround with union is described at http://en.wikipedia.org/wiki/Dynamic_loading#Converting_Extracted_Library_Contents . However, http://www.trilithium.com/johan/2004/12/problem-with-dlsym/ suggests that according to the C standard this is still undefined behavior. The latter link also suggests two other workarounds: casting through an integral type (like (int (*)(int))(intptr_t)dlsym()) and memcpy() of void pointer to function pointer, which seems to work in practice, but still there are no guarantees. So there are two standards here that contradict each other: C standard and POSIX. The C standard does not define a conversion of a function pointer to "void *", because function pointers and object pointer are not required to have the same size. The latest version of POSIX, on the other hand, says that function pointers have to be convertible to "void *". Consequently, there is no way to achieve what we need according to the C standard, so we have to aim at POSIX. In other words, there is no theoretically correct solution, so we have to choose a practical solution - the one that works in practice. It seems suggested by all the pages above to use the (int (*)(int))dlsym(handle, "my_function") style and this will be POSIX-compliant. I also took a look at other software (mostly Firefox, a bit in BIND and Net-SNMP) and all use that style, too. So it might be that we should change to this style as well. |
Comment by Juris Miščenko (Inactive) [ 2014 Aug 01 ] |
Fix implemented in svn://svn.zabbix.com/branches/dev/ZBX-8470 |
Comment by Andris Zeila [ 2014 Nov 03 ] |
Successfully tested |
Comment by Juris Miščenko (Inactive) [ 2014 Nov 03 ] |
Fix merged in 2.5.0 (trunk) r50361. |