[ZBX-6047] system.boottime and system.uptime broken on solaris 10/11 zones Created: 2013 Jan 02  Updated: 2017 May 30  Resolved: 2015 Jan 08

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 2.0.4
Fix Version/s: 2.2.9rc1, 2.4.4rc1, 2.5.0

Type: Incident report Priority: Blocker
Reporter: Andrew Howell Assignee: Unassigned
Resolution: Fixed Votes: 1
Labels: solaris, uptime
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

solaris 11 non global zone on Sun Fire X4270


Attachments: PNG File uptime.png     File var-adm-utmpx.patch    
Issue Links:
Duplicate

 Description   

system.boottime on a solaris 11 zone returns a 1970 date/time instead of the correct boot time. In the global zone it returns the correct time.



 Comments   
Comment by richlv [ 2013 Dec 04 ]

confirming with 2.2.1rc1 agent on SunOS 5.11 11.1 sun4v

Comment by richlv [ 2013 Dec 05 ]

probably closely related - system.uptime also shows quite... impossible numbers

Comment by Alexey Pustovalov [ 2014 Mar 19 ]

I have reproduced a slightly different issue, similar to the one described:

root@appserv:~/zabbix-2.2.2# uptime 
  1:01am  up 622 day(s), 2 hr(s),  1 user,  load average: 0.01, 0.08, 0.10
root@appserv:~/zabbix-2.2.2# ./src/zabbix_get/zabbix_get -s localhost -k system.boottime
26975
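
For context, a value this small read as seconds since the Unix epoch lands on the first day of 1970, which matches the symptom in the description. A minimal C sketch that renders such a value in UTC (the helper name `format_epoch_utc` is hypothetical and not part of the Zabbix sources):

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper, not Zabbix code: formats a Unix timestamp as
 * "YYYY-MM-DD HH:MM:SS" in UTC. Returns 0 on success, -1 on failure. */
int format_epoch_utc(time_t ts, char *buf, size_t len)
{
    struct tm tm_utc;

    if (gmtime_r(&ts, &tm_utc) == NULL)
        return -1;

    /* strftime() returns 0 when the result does not fit into the buffer */
    if (strftime(buf, len, "%Y-%m-%d %H:%M:%S", &tm_utc) == 0)
        return -1;

    return 0;
}
```

Called with 26975 and a 32-byte buffer, this yields "1970-01-01 07:29:35", i.e. only about seven and a half hours after the epoch.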

Comment by Alexey Pustovalov [ 2014 Mar 20 ]

Is it possible to check the latest Zabbix agent version from the 2.2 branch? The issue has been solved there.

Comment by Andrew Howell [ 2014 Mar 26 ]

I've checked again on 2.2.2 and it still has the same issue with system.boottime and system.uptime

Comment by Aleksandrs Saveljevs [ 2014 Dec 09 ]

Currently, in our test environment, I have not observed "system.boottime" showing a small value or "system.uptime" showing a big value, but I have observed that "system.boottime" and "system.uptime" show the same numbers in the non-global zone as they do in the global zone.

Here is some research into possible ways to fix the problem. The present implementation uses the "boot_time" kstat counter:

$ kstat -p -s '*boot*'                                                                                    
unix:0:system_misc:boot_time    1417777382

If that counter is obtained in the non-global zone, it still returns the value for the global zone.
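
To compare the counter between zones it can help to extract the number from the parseable output shown above. The following is only a sketch that parses one "kstat -p" line as text; the function name `parse_kstat_p_line` is hypothetical, and this is not how the agent reads the counter:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extracts the value from one line of "kstat -p"
 * output, e.g. "unix:0:system_misc:boot_time    1417777382".
 * Returns 0 and stores the value on success, -1 if the line is malformed
 * or its last colon-separated field is not the requested statistic. */
int parse_kstat_p_line(const char *line, const char *statistic,
        unsigned long long *value)
{
    const char *sep, *name, *p;
    char *end;
    unsigned long long tmp;
    size_t name_len;

    /* the statistic path and its value are separated by whitespace */
    if ((sep = strpbrk(line, " \t")) == NULL)
        return -1;

    /* walk back to the start of the last colon-separated field */
    name = sep;
    while (name > line && name[-1] != ':')
        name--;

    name_len = (size_t)(sep - name);
    if (name == line || strlen(statistic) != name_len ||
            strncmp(name, statistic, name_len) != 0)
        return -1;

    /* skip the separator and require a decimal number */
    p = sep + strspn(sep, " \t");
    if (*p < '0' || *p > '9')
        return -1;

    errno = 0;
    tmp = strtoull(p, &end, 10);
    if (errno != 0 || (*end != '\0' && *end != '\n' && *end != '\r'))
        return -1;

    *value = tmp;
    return 0;
}
```

Running it over the same line captured in the global and in the non-global zone would make the identical values easy to spot in a script.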

One idea that appeared in the process was to look at system calls that system utilities perform. For instance, here is the output of "psrinfo":

$ /usr/sbin/psrinfo
0       on-line   since 12/05/2014 13:03:02

Running "truss" on it shows that it reads the "cpu_info0" kstat counter:

...
time()                                          = 1418141200
ioctl(3, KSTAT_IOC_READ, "cpu_info0")           = 8068
...

Its value is:

$ kstat -p -n cpu_info0 | grep 141
cpu_info:0:cpu_info0:state_begin        1417777382

That would probably be another way to get system boot time, but that still returns the same boot time as the global zone.

Code in http://fossies.org/linux/monit/src/process/sysdep_SOLARIS.c also gave a hint that kstat might not be the way to go for non-global zones.

A more likely candidate for deeper inspection is the "uptime" utility:

$ uptime
  6:21pm  up 36 min(s),  1 user,  load average: 0,00, 0,00, 0,02

Running "truss" on it shows that it reads "/var/adm/utmpx" using getutxent() and related library functions, see http://docs.oracle.com/cd/E19109-01/tsolaris8/817-0882/6mglcr99g/index.html for their documentation. This idea is further confirmed in http://compgroups.net/comp.unix.solaris/source-of-boottime-for-uptime-other-than-ut/41559 . (That discussion also suggests to stat() /proc/0 and look at its atime/mtime/ctime to get system boot time.)
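
A minimal sketch of the utmpx idea, assuming the standard <utmpx.h> interface. It is not taken from the attached patch; the function names `boottime_from_utmpx` and `uptime_from_boottime` are hypothetical:

```c
#include <stddef.h>
#include <time.h>
#include <utmpx.h>

/* Sketch only: scans the utmpx database (/var/adm/utmpx on Solaris) for
 * the first record of type BOOT_TIME. Returns 0 and stores the boot time
 * on success; returns -1 and leaves *boottime untouched when no such
 * record is found. */
int boottime_from_utmpx(time_t *boottime)
{
    struct utmpx *ut;
    int ret = -1;

    setutxent();    /* rewind to the start of the database */

    while ((ut = getutxent()) != NULL)
    {
        if (ut->ut_type == BOOT_TIME)
        {
            *boottime = (time_t)ut->ut_tv.tv_sec;
            ret = 0;
            break;
        }
    }

    endutxent();    /* close the database */

    return ret;
}

/* Uptime is the distance from the boot time to "now"; -1 means the
 * boot time lies in the future and the result would be meaningless. */
long uptime_from_boottime(time_t now, time_t boottime)
{
    return now >= boottime ? (long)(now - boottime) : -1;
}
```

Because each zone keeps its own utmpx file, the record found this way should reflect the boot of the zone rather than of the global zone.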

An implementation based on "/var/adm/utmpx" is currently available in the development branch svn://svn.zabbix.com/branches/dev/ZBX-6047 . The patch that implements the change is attached as "var-adm-utmpx.patch". It would be nice if you could test whether that patch also solves the reported problem.

On our test system with the "/var/adm/utmpx" approach the agent reports good values:

$ ./zabbix_agentd -t system.boottime
system.boottime                               [u|1418139894]
$ ./zabbix_agentd -t system.uptime                                                                        
system.uptime                                 [u|2171]

Comment by Oleg Ivanivskyi [ 2014 Dec 12 ]

Looks like it's corrected:

IN ZONE
-----------
uptime
10:24am up 21 day(s), 10:17, 1 user, load average: 5.22, 4.71, 4.02
who -r
. run-level 3 Nov 21 00:07 3 0 S
Zabbix information :
Host uptime (in sec)	2014-12-12 10:21:48	21 days, 10:14:35	-15903 days, 12:03:45
Host boot time	2014-12-12 10:21:48	2014-11-21 00:07:13	+15903 days, 12:04:28
IN GLOBAL
---------------
uptime
10:26am up 21 day(s), 11:58, 4 users, load average: 5.29, 4.83, 4.13
who -r
. run-level 3 Nov 20 22:35 3 0 S
Zabbix information :
Host uptime (in sec)	2014-12-12 10:18:57	21 days, 11:51:42
Host boot time	2014-12-12 10:18:57	2014-11-20 22:27:15	-

Comment by Aleksandrs Saveljevs [ 2014 Dec 16 ]

Implementation details remain to be discussed, but otherwise it is "Resolved".

Comment by Andris Zeila [ 2015 Jan 06 ]

Successfully tested

Comment by Aleksandrs Saveljevs [ 2015 Jan 06 ]

(1) The agent does not compile on Solaris 8:

boottime.c:23:18: zone.h: No such file or directory
boottime.c: In function `SYSTEM_BOOTTIME':
boottime.c:30: error: `GLOBAL_ZONEID' undeclared (first use in this function)
boottime.c:30: error: (Each undeclared identifier is reported only once
boottime.c:30: error: for each function it appears in.)

asaveljevs According to http://en.wikipedia.org/wiki/Solaris_Containers , zones are available since Solaris 10. Therefore, added a check for zone.h during configuration in r51408. RESOLVED.

wiper CLOSED
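
A sketch of how such a configure-time check might guard the zone code so that it still compiles where zone.h is absent. The macro name `HAVE_ZONE_H` and the function name `in_non_global_zone` are assumptions for illustration and are not taken from r51408:

```c
/* HAVE_ZONE_H is assumed to be defined by the configure script when
 * <zone.h> exists; the macro name is a guess, not taken from r51408. */
#ifdef HAVE_ZONE_H
#include <zone.h>
#endif

/* Returns 1 when running in a non-global zone, 0 otherwise. Without
 * <zone.h> (e.g. on Solaris 8) zones do not exist, so the answer is 0. */
int in_non_global_zone(void)
{
#ifdef HAVE_ZONE_H
    return getzoneid() != GLOBAL_ZONEID;
#else
    return 0;
#endif
}
```

With a guard of this kind, a caller could keep the kstat path for the global zone and older systems and switch to utmpx only when the function returns 1.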

Comment by Aleksandrs Saveljevs [ 2015 Jan 06 ]

A couple of topics were discussed with wiper:

  • Currently, the first entry of type BOOT_TIME is obtained from /var/adm/utmpx. There was a question whether there can be multiple such entries. It was decided that we can assume that /var/adm/utmpx only contains one such entry and the historical information is kept in /var/adm/wtmpx.
  • The suggestion "to stat() /proc/0 and look at its atime/mtime/ctime to get system boot time" mentioned above was deemed too hackish to be implemented as a fallback solution.
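
For reference, the rejected stat() fallback could look roughly like the sketch below. The function name `boottime_from_stat` is hypothetical, the path is a parameter so the helper can be tried on any system, and the choice of mtime over atime or ctime is arbitrary here:

```c
#include <sys/stat.h>
#include <time.h>

/* Sketch of the rejected fallback: treats a timestamp of the given path
 * (on Solaris that would be "/proc/0") as an approximation of the boot
 * time. Returns 0 on success, -1 if the path cannot be examined. */
int boottime_from_stat(const char *path, time_t *boottime)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    /* mtime is picked arbitrarily; the discussion mentions all three */
    *boottime = st.st_mtime;

    return 0;
}
```

The result depends on how the system maintains timestamps for /proc entries, which is one reason to treat it as a hack rather than a reliable source.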

Comment by Aleksandrs Saveljevs [ 2015 Jan 07 ]

(2) Code in src/libs/zbxsysinfo/solaris/boottime.c in 2.4 was very different from 2.2 due to ZBXNEXT-2203. Resolved conflicts in svn://svn.zabbix.com/branches/dev/ZBX-6047-2.4. Please take a look.

wiper CLOSED

Comment by Andris Zeila [ 2015 Jan 07 ]

Successfully tested

Comment by Aleksandrs Saveljevs [ 2015 Jan 07 ]

Fixed in pre-2.2.9 r51419, pre-2.4.4 r51420, and pre-2.5.0 (trunk) r51421.
