[ZBX-6047] system.boottime and system.uptime broken on solaris 10/11 zones Created: 2013 Jan 02 Updated: 2017 May 30 Resolved: 2015 Jan 08 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G) |
Affects Version/s: | 2.0.4 |
Fix Version/s: | 2.2.9rc1, 2.4.4rc1, 2.5.0 |
Type: | Incident report | Priority: | Blocker |
Reporter: | Andrew Howell | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 1 |
Labels: | solaris, uptime | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
solaris 11 non global zone on Sun Fire X4270 |
Attachments: | uptime.png var-adm-utmpx.patch | ||||
Issue Links: |
|
Description |
system.boottime on a solaris 11 zone returns a 1970 date/time instead of the correct boot time. In the global zone it returns the correct time. |
Comments |
Comment by richlv [ 2013 Dec 04 ] |
confirming with 2.2.1rc1 agent on SunOS 5.11 11.1 sun4v |
Comment by richlv [ 2013 Dec 05 ] |
probably closely related - system.uptime also shows quite... impossible numbers
|
Comment by Alexey Pustovalov [ 2014 Mar 19 ] |
I have reproduced a different issue, but similar to the one described:
root@appserv:~/zabbix-2.2.2# uptime
 1:01am up 622 day(s), 2 hr(s), 1 user, load average: 0.01, 0.08, 0.10
root@appserv:~/zabbix-2.2.2# ./src/zabbix_get/zabbix_get -s localhost -k system.boottime
26975 |
Comment by Alexey Pustovalov [ 2014 Mar 20 ] |
Is it possible to check the latest Zabbix agent version from the 2.2 branch? The issue has been solved there. |
Comment by Andrew Howell [ 2014 Mar 26 ] |
I've checked again on 2.2.2 and it still has the same issue with system.boottime and system.uptime |
Comment by Aleksandrs Saveljevs [ 2014 Dec 09 ] |
Currently, in our test environment, I have not observed "system.boottime" showing a small value or "system.uptime" showing a big value, but I have observed that "system.boottime" and "system.uptime" show the same numbers in the non-global zone as they do in the global zone. Here is a bit of research into the possible ways to fix the problem.

In the present implementation we use the "boot_time" kstat counter:
$ kstat -p -s '*boot*'
unix:0:system_misc:boot_time 1417777382
If that counter is obtained in the non-global zone, it still returns the value for the global zone.

One idea that came up in the process was to look at the system calls that system utilities perform. For instance, here is the output of "psrinfo":
$ /usr/sbin/psrinfo
0 on-line since 12/05/2014 13:03:02
Running "truss" on it shows that it looks at the "cpu_info0" kstat counter:
...
time() = 1418141200
ioctl(3, KSTAT_IOC_READ, "cpu_info0") = 8068
...
Its value is:
$ kstat -p -n cpu_info0 | grep 141
cpu_info:0:cpu_info0:state_begin 1417777382
That would probably be another way to get the system boot time, but it still returns the same boot time as the global zone. Code in http://fossies.org/linux/monit/src/process/sysdep_SOLARIS.c also gave a hint that kstat might not be the way to go for non-global zones.

A more likely candidate for deeper inspection is the "uptime" utility:
$ uptime
6:21pm up 36 min(s), 1 user, load average: 0,00, 0,00, 0,02
Running "truss" on it shows that it looks in "/var/adm/utmpx" using getutxent() and related functions, see http://docs.oracle.com/cd/E19109-01/tsolaris8/817-0882/6mglcr99g/index.html for their documentation. This idea is further confirmed in http://compgroups.net/comp.unix.solaris/source-of-boottime-for-uptime-other-than-ut/41559 . (That discussion also suggests calling stat() on /proc/0 and looking at its atime/mtime/ctime to get the system boot time.)

An implementation based on "/var/adm/utmpx" is currently available in development branch svn://svn.zabbix.com/branches/dev/ZBX-6047 .
The patch that implements the change is attached as "var-adm-utmpx.patch". It would be nice if you could test whether that patch also solves the reported problem. On our test system, with the "/var/adm/utmpx" approach, the agent reports good values:
$ ./zabbix_agentd -t system.boottime
system.boottime [u|1418139894]
$ ./zabbix_agentd -t system.uptime
system.uptime [u|2171] |
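For readers who want to see what the "/var/adm/utmpx" approach described above might look like, here is a minimal C sketch. It is not the attached patch: the function names boot_time_from_utmpx() and uptime_seconds() are hypothetical, and the code assumes only the standard getutxent() family and the BOOT_TIME record type from <utmpx.h>.

```c
#include <stddef.h>
#include <time.h>
#include <utmpx.h>

/* Hypothetical helper (not from the Zabbix sources): scan the utmpx
 * database and return the timestamp of the first BOOT_TIME record,
 * or (time_t)-1 if no such record is found. */
static time_t boot_time_from_utmpx(void)
{
	struct utmpx	*ut;
	time_t		boot = (time_t)-1;

	setutxent();	/* rewind to the start of the database */

	while (NULL != (ut = getutxent()))
	{
		if (BOOT_TIME == ut->ut_type)
		{
			boot = (time_t)ut->ut_tv.tv_sec;
			break;
		}
	}

	endutxent();	/* close the database */

	return boot;
}

/* Hypothetical helper: uptime is the difference between the current
 * time and the boot time; returns -1 if the boot time is unknown. */
static long uptime_seconds(time_t now, time_t boot)
{
	if ((time_t)-1 == boot)
		return -1;

	return (long)(now - boot);
}
```

A caller would pass time(NULL) and the result of boot_time_from_utmpx() to uptime_seconds(). Because each zone keeps its own utmpx file, this lookup would be expected to reflect the zone's boot time rather than the global zone's, which is the behaviour the comment above is after.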
Comment by Oleg Ivanivskyi [ 2014 Dec 12 ] |
Looks like it's corrected:

IN ZONE
-----------
uptime
 10:24am up 21 day(s), 10:17, 1 user, load average: 5.22, 4.71, 4.02
who -r
 . run-level 3 Nov 21 00:07 3 0 S
Zabbix information :
Host uptime (in sec) 2014-12-12 10:21:48 21 days, 10:14:35 -15903 days, 12:03:45
Host boot time 2014-12-12 10:21:48 2014-11-21 00:07:13 +15903 days, 12:04:28

IN GLOBAL
---------------
uptime
 10:26am up 21 day(s), 11:58, 4 users, load average: 5.29, 4.83, 4.13
who -r
 . run-level 3 Nov 20 22:35 3 0 S
Zabbix information :
Host uptime (in sec) 2014-12-12 10:18:57 21 days, 11:51:42
Host boot time 2014-12-12 10:18:57 2014-11-20 22:27:15 - |
Comment by Aleksandrs Saveljevs [ 2014 Dec 16 ] |
Implementation details remain to be discussed, but otherwise it is "Resolved". |
Comment by Andris Zeila [ 2015 Jan 06 ] |
Successfully tested |
Comment by Aleksandrs Saveljevs [ 2015 Jan 06 ] |
(1) The agent does not compile on Solaris 8:
boottime.c:23:18: zone.h: No such file or directory
boottime.c: In function `SYSTEM_BOOTTIME':
boottime.c:30: error: `GLOBAL_ZONEID' undeclared (first use in this function)
boottime.c:30: error: (Each undeclared identifier is reported only once
boottime.c:30: error: for each function it appears in.)
asaveljevs According to http://en.wikipedia.org/wiki/Solaris_Containers , zones are available since Solaris 10. Therefore, added a check for zone.h during configuration in r51408. RESOLVED.
wiper CLOSED |
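The configuration-time check for zone.h mentioned above could be consumed in the code roughly as follows. This is only a sketch under assumptions: the macro name HAVE_ZONE_H follows the usual autoconf convention and is not confirmed by this ticket, and in_global_zone() is a hypothetical helper, not a function from the Zabbix sources.

```c
/* HAVE_ZONE_H is assumed to be defined by the configure script when
 * <zone.h> is present (Solaris 10 and later). */
#ifdef HAVE_ZONE_H
#include <zone.h>
#endif

/* Hypothetical helper: returns 1 when running in the global zone or on
 * a system without zone support, 0 when running in a non-global zone. */
static int in_global_zone(void)
{
#ifdef HAVE_ZONE_H
	return GLOBAL_ZONEID == getzoneid();
#else
	return 1;	/* no zones (e.g. Solaris 8): treat as global */
#endif
}
```

With a guard of this kind, a system such as Solaris 8 never sees the zone.h include or the GLOBAL_ZONEID reference, which is what the compile errors above complain about.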
Comment by Aleksandrs Saveljevs [ 2015 Jan 06 ] |
A couple of topics were discussed with wiper:
|
Comment by Aleksandrs Saveljevs [ 2015 Jan 07 ] |
(2) Code in src/libs/zbxsysinfo/solaris/boottime.c in 2.4 was very different from 2.2 due to wiper CLOSED |
Comment by Andris Zeila [ 2015 Jan 07 ] |
Successfully tested |
Comment by Aleksandrs Saveljevs [ 2015 Jan 07 ] |
Fixed in pre-2.2.9 r51419, pre-2.4.4 r51420, and pre-2.5.0 (trunk) r51421. |