[ZBX-7648] After using net.dns item, some of the host resolution (web.page.get) uses the name server used in net.dns instead of the configured name servers in /etc/resolv.conf Created: 2014 Jan 14 Updated: 2017 May 30 Resolved: 2014 Jan 21 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G) |
Affects Version/s: | 2.2.1 |
Fix Version/s: | 2.0.11rc1, 2.2.2rc1, 2.3.0 |
Type: | Incident report | Priority: | Major |
Reporter: | Adrian Tan | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 0 |
Labels: | dns | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Ubuntu 12.04 LTS |
Attachments: | tcpdump_includecommands.txt |
Description |
Prior to using the net.dns item key, all host resolutions (using the web.page.get item key) use the name servers configured in /etc/resolv.conf. After a net.dns item has been used, some host resolutions use the name server specified in net.dns instead. This causes some web.page.get checks to fail, as the host name cannot be resolved. /etc/resolv.conf: name server used with net.dns web.page.get A restart of the Zabbix agent is needed in order to remove this "cached" name server entry. |
Comments |
Comment by Juris Miščenko (Inactive) [ 2014 Jan 16 ] |
Name resolver state now gets restored after net.dns requests that specify a nameserver for fulfilling the request. Fixed in svn://svn.zabbix.com/branches/dev/ZBX-7648 |
Comment by Aleksandrs Saveljevs [ 2014 Jan 17 ] |
(1) How about we put the original nameserver back after the call to res_send()? That way it only needs to be done once, instead of before every return statement. Also, note that we have to restore not only the old servers, but also _res.retrans and _res.retry. Finally, I think we could also work in the case where _res.nscount equals 0, for instance, if a DNS server is specified in the item.
Nameserver information is now restored after the final use of the _res structure. The `retrans' and `retry' variables are also saved and restored before and after NS entry manipulation. _res.nscount should NEVER equal 0 unless res_init() hasn't been called (or any of the query functions, which call res_init() internally if they detect that it hasn't been done previously). A 0 value for nscount indicates a broken implementation of `resolv' on the system or a severe system failure. Handling of such a case varies from implementation to implementation, but it should be noted that a 0 value of nscount does not indicate an error and does not result in the setting of the `_res.res_h_errno' variable responsible for reporting error status. It is safe to assume that with an nscount of 0 further calls to resolver functions also fail. The case will not be handled directly. RESOLVED.
asaveljevs Yesterday we discussed that we should check whether res_init() was successful. That was not done, however. Could you please look into it? REOPENED.
jurism Failure check added to the resolver structure initialization routine. RESOLVED.
asaveljevs Looks good. Please see r41722 before merging. RESOLVED.
jurism Looks good! CLOSED. |
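The save/restore approach discussed in this comment can be sketched roughly as follows. This is a minimal illustration, not the code committed for ZBX-7648: the type resolver_snapshot_t and the functions resolver_save(), resolver_override() and resolver_restore() are hypothetical names introduced here. It assumes a resolver header that exposes nscount, nsaddr_list, retrans and retry in struct __res_state, as glibc does.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/* Hypothetical snapshot of the resolver fields that a per-request */
/* name server override modifies.                                  */
typedef struct
{
	int			nscount;
	struct sockaddr_in	nsaddr_list[MAXNS];
	int			retrans;
	int			retry;
}
resolver_snapshot_t;

/* Copy the name server list and retry settings out of the resolver state. */
void	resolver_save(const struct __res_state *rs, resolver_snapshot_t *snap)
{
	assert(NULL != rs && NULL != snap);

	snap->nscount = rs->nscount;
	memcpy(snap->nsaddr_list, rs->nsaddr_list, sizeof(snap->nsaddr_list));
	snap->retrans = rs->retrans;
	snap->retry = rs->retry;
}

/* Point the resolver at a single name server for one request. */
void	resolver_override(struct __res_state *rs, const struct sockaddr_in *ns, int retrans, int retry)
{
	assert(NULL != rs && NULL != ns);

	rs->nsaddr_list[0] = *ns;
	rs->nscount = 1;
	rs->retrans = retrans;
	rs->retry = retry;
}

/* Put back everything that resolver_save() captured. */
void	resolver_restore(struct __res_state *rs, const resolver_snapshot_t *snap)
{
	assert(NULL != rs && NULL != snap);

	rs->nscount = snap->nscount;
	memcpy(rs->nsaddr_list, snap->nsaddr_list, sizeof(rs->nsaddr_list));
	rs->retrans = snap->retrans;
	rs->retry = snap->retry;
}
```

In an agent-like caller, one would presumably check that res_init() does not return -1, call resolver_save(&_res, &snap), apply resolver_override(), issue the query, and call resolver_restore(&_res, &snap) once after the final use of _res, so that later lookups see the servers from /etc/resolv.conf again.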
Comment by Juris Miščenko (Inactive) [ 2014 Jan 30 ] |
Fixed in pre-2.0.11 r.41995, pre-2.2.2 r.41998, pre-2.3.0 r.42000. |