[ZBX-11439] 3.2 server could not close PROBLEM events Created: 2016 Nov 08  Updated: 2017 May 30  Resolved: 2017 Feb 06

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 3.2.1
Fix Version/s: None

Type: Incident report Priority: Critical
Reporter: Oleksii Zagorskyi Assignee: Unassigned
Resolution: Duplicate Votes: 4
Labels: events, problems
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates ZBX-11454 Force OK events to update trigger val... Closed
is duplicated by ZBX-11526 Trigger status stuck in problem state Closed
is duplicated by ZBX-11490 Multiple OK event Closed
is duplicated by ZBX-11412 Problem with trigger recovery Closed
is duplicated by ZBX-11562 Issue not closing (when older and wit... Closed

 Description   

Our server upgraded to 3.2.0 a month ago and a few days ago we got an issue. During the issue we upgraded to 3.2.1 but it did not help.
History of used versions: 1.8 2.2 2.4 3.0 3.2
We don't use any new feature (like events corellations etc).
Nothing has changed in zabbix close to the failure point.
DB server is separate and had not any issue (although had very high CPU user time some time ago).
But zabbix server host VM has migrated to another ESX because of some reason and after that the issue appeared.
History syncers were too busy often and for long period (maybe because of DebugLevel=4 all the time), it could be related.
(after decreased debug to 3, syncers became 5% busy, while with level 4 they were normally 70% busy).

We use GTID-based master-master replication on MySQL. (mysql Ver 15.1 Distrib 10.0.25-MariaDB, for Linux (x86_64) using readline 5.1)

During investigation it gets clear that many triggers are in PROBLEM state (value=1 in "triggers") generate OK events each time when get new value or when being recalculated by timer.
In the same time these triggers DO NOT have any record in "problem" table.
In the same time these triggers have OK/PROBLEM events in "events" table.

The picture looks this way - if you visit Problems page - you see nothing bad, i.e you see nothing, in any view mode.
on dashboard you can see problem state for many triggers in host groups,
on Last 20 issues many triggers being updated constantly and their Age gets small all the time.
on "Triggers top 100" report you can see huge "Number of status changes" for these triggers.

Difference in debug log:
Affected zabbix:

 17023:20161108:153349.436 End of DCget_nextid() table:'events' [96:95]
 17023:20161108:153349.436 query [txnlev:1] [insert into events (eventid,source,object,objectid,clock,ns,value) values (95,0,0,13569,1478612028,511579021,0);
]
 17023:20161108:153349.436 In process_actions() events_num:1
 17023:20161108:153349.436 In zbx_dc_get_actions_eval()
 17023:20161108:153349.436 End of zbx_dc_get_actions_eval() actions:1
 17023:20161108:153349.436 End of process_actions()
 17023:20161108:153349.436 In DBupdate_itservices()
 17023:20161108:153349.436 End of DBupdate_itservices():SUCCEED
 17023:20161108:153349.436 End of process_trigger_events() processed:1
 17023:20161108:153349.436 In zbx_save_trigger_changes()
 17023:20161108:153349.436 query [txnlev:1] [update triggers set lastchange=1478612028 where triggerid=13569;
]

another, not affected zabbix:

  6242:20161108:103022.270 End of DCget_nextid() table:'events' [85:84]
  6242:20161108:103022.270 query [txnlev:1] [insert into events (eventid,source,object,objectid,clock,ns,value) values (84,0,0,13569,1478593821,591608632,0);
]
  6242:20161108:103022.271 query [txnlev:1] [insert into event_recovery (eventid,r_eventid,correlationid,c_eventid,userid) values (83,84,null,null,null);
]
  6242:20161108:103022.271 query [txnlev:1] [update problem set r_eventid=84,r_clock=1478593821,r_ns=591608632,userid=0 where eventid=83;
]
  6242:20161108:103022.272 In process_actions() events_num:1
  6242:20161108:103022.272 In zbx_dc_get_actions_eval()
  6242:20161108:103022.272 End of zbx_dc_get_actions_eval() actions:1
  6242:20161108:103022.272 query [txnlev:1] [select actionid,eventid,escalationid from escalations where eventid=83]
  6242:20161108:103022.272 query [txnlev:1] [update escalations set r_eventid=84 where escalationid=1;
]

We resolved the issue using this direct SQL:

update triggers set value=0 where value=1 and triggerid not in (select distinct objectid from problem);

Question why this issue appeared is still opened, but I think about 2 points:
1. In any conditions it should not happen that "problem" table is not consistent with triggers and/or events.
2. Zabbix server should be able to "survive" that if it still happens.

This issue is somehow similar to ZBX-11426, but still scenario how it happened is different.

Below is some collected information, just in case:

MariaDB [zabbix]> SELECT *, FROM_UNIXTIME(clock) FROM events WHERE objectid=129403 AND clock>=unix_timestamp('2016-11-08 09:55:00') AND clock<=unix_timestamp('2016-11-08 10:00:00') ORDER BY clock DESC;
+-----------+--------+--------+----------+------------+-------+--------------+-----------+----------------------+
| eventid   | source | object | objectid | clock      | value | acknowledged | ns        | FROM_UNIXTIME(clock) |
+-----------+--------+--------+----------+------------+-------+--------------+-----------+----------------------+
| 279907078 |      0 |      0 |   129403 | 1478595600 |     0 |            0 |  26280341 | 2016-11-08 10:00:00  |
| 279906763 |      0 |      0 |   129403 | 1478595574 |     0 |            0 |   6126234 | 2016-11-08 09:59:34  |
| 279906657 |      0 |      0 |   129403 | 1478595570 |     0 |            0 | 977226025 | 2016-11-08 09:59:30  |
| 279906092 |      0 |      0 |   129403 | 1478595544 |     0 |            0 | 993842012 | 2016-11-08 09:59:04  |
...
| 279901329 |      0 |      0 |   129403 | 1478595334 |     0 |            0 |  65403047 | 2016-11-08 09:55:34  |
| 279900981 |      0 |      0 |   129403 | 1478595330 |     0 |            0 | 249855840 | 2016-11-08 09:55:30  |
| 279900642 |      0 |      0 |   129403 | 1478595304 |     0 |            0 |  64580586 | 2016-11-08 09:55:04  |
| 279900299 |      0 |      0 |   129403 | 1478595300 |     0 |            0 | 438725918 | 2016-11-08 09:55:00  |
+-----------+--------+--------+----------+------------+-------+--------------+-----------+----------------------+
MariaDB [zabbix]> select * from triggers where triggerid=129403\G
*************************** 1. row ***************************
          triggerid: 129403
         expression: {321462}=0 or {321463}=1
        description: ESXi is unreachable
                url: http://some
             status: 0
              value: 1
           priority: 5
         lastchange: 1478601964
           comments:
              error:
         templateid: 129305
               type: 0
              state: 0
              flags: 0
      recovery_mode: 0
recovery_expression:
   correlation_mode: 0
    correlation_tag:
       manual_close: 1
1 row in set (0.00 sec)
MariaDB [zabbix]> SELECT DATE(FROM_UNIXTIME(p.clock)) AS date, HOUR(FROM_UNIXTIME(p.clock)) AS hour, p.source, count(*) FROM problem p JOIN triggers t ON p.objectid = t.triggerid WHERE p.object='0' AND p.source='0' AND p.clock>=unix_timestamp('2016-11-01 00:00:00') AND p.clock<=unix_timestamp('2016-11-08 18:00:00') GROUP BY 1,2,3 ORDER BY p.clock DESC,p.eventid DESC;
+------------+------+--------+----------+
| date       | hour | source | count(*) |
+------------+------+--------+----------+
| 2016-11-08 |   10 |      0 |      160 |
| 2016-11-08 |    9 |      0 |      280 |
...
| 2016-11-07 |   11 |      0 |      392 |
| 2016-11-07 |   10 |      0 |      431 |
| 2016-11-07 |    9 |      0 |      104 |
| 2016-11-07 |    8 |      0 |       60 |
| 2016-11-07 |    7 |      0 |       61 |
| 2016-11-07 |    6 |      0 |       61 |
| 2016-11-07 |    5 |      0 |       60 |
...
| 2016-11-06 |    9 |      0 |       60 |
| 2016-11-06 |    8 |      0 |       60 |
| 2016-11-06 |    7 |      0 |       61 |
| 2016-11-06 |    6 |      0 |       60 |
| 2016-11-06 |    5 |      0 |       60 |
| 2016-11-06 |    4 |      0 |       46 |
| 2016-11-05 |   22 |      0 |        1 |
| 2016-11-05 |   14 |      0 |        1 |
| 2016-11-05 |    6 |      0 |        1 |
| 2016-11-05 |    2 |      0 |        1 |
...
| 2016-11-03 |   11 |      0 |        1 |
| 2016-11-03 |    9 |      0 |        1 |
| 2016-11-02 |   21 |      0 |        2 |
| 2016-11-02 |   16 |      0 |        1 |
| 2016-11-01 |    9 |      0 |        1 |
+------------+------+--------+----------+
MariaDB [zabbix]> SELECT DATE(FROM_UNIXTIME(e.clock)) AS date, HOUR(FROM_UNIXTIME(e.clock)) AS hour, e.source, count(*) FROM events e JOIN triggers t ON e.objectid = t.triggerid WHERE e.object='0' AND e.source='0' AND e.clock>=unix_timestamp('2016-11-01 00:00:00') AND e.clock<=unix_timestamp('2016-11-08 18:00:00') GROUP BY 1,2,3 ORDER BY e.clock DESC,e.eventid DESC;
+------------+------+--------+----------+
| date       | hour | source | count(*) |
+------------+------+--------+----------+
| 2016-11-08 |   10 |      0 |    46782 |
| 2016-11-08 |    9 |      0 |    72477 |
| 2016-11-08 |    8 |      0 |    72392 |
| 2016-11-08 |    7 |      0 |    72258 |
...
| 2016-11-04 |   13 |      0 |    53050 |
| 2016-11-04 |   12 |      0 |    53467 |
| 2016-11-04 |   11 |      0 |    53062 |
| 2016-11-04 |   10 |      0 |    38379 |
| 2016-11-04 |    9 |      0 |     3502 |
| 2016-11-04 |    8 |      0 |      510 |
| 2016-11-04 |    7 |      0 |      424 |
| 2016-11-04 |    6 |      0 |      325 |
...
| 2016-11-03 |   19 |      0 |      447 |
| 2016-11-03 |   18 |      0 |      515 |
| 2016-11-03 |   17 |      0 |      531 |
+------------+------+--------+----------+
MariaDB [zabbix]> select from_unixtime(min(clock)) from problem;
+---------------------------+
| from_unixtime(min(clock)) |
+---------------------------+
| 2015-11-09 13:47:09       |
+---------------------------+
MariaDB [zabbix]> select from_unixtime(min(clock)) from events where objectid=129403;
+---------------------------+
| from_unixtime(min(clock)) |
+---------------------------+
| 2016-02-17 15:19:04       |
+---------------------------+
MariaDB [zabbix]> select count(*) from problem where objectid=129403;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
MariaDB [zabbix]> select count(*) from triggers where value=1 and triggerid not in (select distinct objectid from events);
+----------+
| count(*) |
+----------+
|      573 |
+----------+
MariaDB [zabbix]> select count(*) from triggers where value=1 and triggerid not in (select distinct objectid from events where source=0 and value=1);
+----------+
| count(*) |
+----------+
|      577 |
+----------+
MariaDB [zabbix]> update triggers set value=0 where value=1 and triggerid not in (select distinct objectid from problem);
Query OK, 887 rows affected (1 min 36.75 sec)
Rows matched: 887  Changed: 887  Warnings: 0
MariaDB [zabbix]> select description, from_unixtime(lastchange) from triggers where value=1 and triggerid not in (select distinct objectid from events);
Empty set (2.70 sec)


 Comments   
Comment by Aleksandrs Saveljevs [ 2016 Nov 08 ]

ZBX-11412 may be related, too.

zalex_ua It's duplicate of current issue.

Comment by Andris Zeila [ 2016 Nov 10 ]

ZBX-11454 was created to deal with the fallout while this issue will be kept open to fix the cause.

Comment by Oleksii Zagorskyi [ 2016 Nov 11 ]

Got it reproduced accidentally on my home raspberry pi with NVPS ~1.
Simply power failure happened.
One point which COULD BE possible is - when mysql and zabbix_server daemon are starting after power lost - system clock is set to unixtime=0, so colected values (and events etc) could be stored with very old clock and could be removed by first housekeeper invocation (there were 17 events deleted and "problem" could be deleted as constrains).
After some seconds after the daemon started system updates system clock.
Here is collected info, gathered after ~4 hours after the power lost.

PROBLEM event for the trigger is missing:

select *, from_unixtime(clock)  from events where objectid=13625;
...
|    5042 |      0 |      0 |    13625 | 1478433621 |     0 |            0 | 409672413 | 2016-11-06 14:00:21  |
|    5092 |      0 |      0 |    13625 | 1478804421 |     0 |            0 | 950535274 | 2016-11-10 21:00:21  |
|    5094 |      0 |      0 |    13625 | 1478804481 |     0 |            0 | 822419357 | 2016-11-10 21:01:21  |
|    5095 |      0 |      0 |    13625 | 1478804541 |     0 |            0 | 875781585 | 2016-11-10 21:02:21  |
|    5096 |      0 |      0 |    13625 | 1478804601 |     0 |            0 | 276318916 | 2016-11-10 21:03:21  |
|    5097 |      0 |      0 |    13625 | 1478804661 |     0 |            0 | 228924257 | 2016-11-10 21:04:21  |
|    5098 |      0 |      0 |    13625 | 1478804721 |     0 |            0 | 499307034 | 2016-11-10 21:05:21  |
|    5099 |      0 |      0 |    13625 | 1478804781 |     0 |            0 | 146053814 | 2016-11-10 21:06:21  |
|    5100 |      0 |      0 |    13625 | 1478804841 |     0 |            0 | 257003020 | 2016-11-10 21:07:21  |
|    5101 |      0 |      0 |    13625 | 1478804901 |     0 |            0 | 865698839 | 2016-11-10 21:08:21  |
|    5102 |      0 |      0 |    13625 | 1478804961 |     0 |            0 | 328427889 | 2016-11-10 21:09:21  |
...

problem record is missing too:

mysql> select *, from_unixtime(clock)  from problem where objectid=13625;
Empty set (0.01 sec)

trigger, item settings:

mysql> select triggerid, expression, from_unixtime(lastchange) from triggers where value=1 and triggerid not in (select distinct objectid from problem);      
+-----------+------------+---------------------------+
| triggerid | expression | from_unixtime(lastchange) |
+-----------+------------+---------------------------+
|     13625 | {13238}=0  | 2016-11-11 00:55:21       |
+-----------+------------+---------------------------+
1 row in set (0.01 sec)
mysql> select * from functions where functionid=13238;
+------------+--------+-----------+----------+-----------+
| functionid | itemid | triggerid | function | parameter |
+------------+--------+-----------+----------+-----------+
|      13238 |  24141 |     13625 | last     |           |
+------------+--------+-----------+----------+-----------+
1 row in set (0.01 sec)
mysql> select * from items where itemid=24141\G
*************************** 1. row ***************************
               itemid: 24141
                 type: 0
       snmp_community:
             snmp_oid:
               hostid: 10113
                 name: Number of "$1"  processes running
                 key_: proc.num[znc]
                delay: 60
              history: 3
               trends: 7
               status: 0
           value_type: 3
        trapper_hosts:
                units:
           multiplier: 0
                delta: 0
  snmpv3_securityname:
 snmpv3_securitylevel: 0
snmpv3_authpassphrase:
snmpv3_privpassphrase:
              formula: 1
                error:
          lastlogsize: 0
           logtimefmt:
           templateid: NULL
           valuemapid: NULL
           delay_flex:
               params:
          ipmi_sensor:
            data_type: 0
             authtype: 0
             username:
             password:
            publickey:
           privatekey:
                mtime: 0
                flags: 0
          interfaceid: 10
                 port:
          description:
       inventory_link: 0
             lifetime: 30
  snmpv3_authprotocol: 0
  snmpv3_privprotocol: 0
                state: 0
   snmpv3_contextname:
             evaltype: 0
1 row in set (0.00 sec)

server log before and after:

  1308:20161110:153703.248 executing housekeeper
  1308:20161110:153723.456 housekeeper [deleted 487 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 20.204430 sec, idle for 1 hour(s)]
  1308:20161110:163724.249 executing housekeeper
  1308:20161110:163743.647 housekeeper [deleted 489 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 19.394272 sec, idle for 1 hour(s)]
  1308:20161110:173744.283 executing housekeeper
  1308:20161110:173750.671 housekeeper [deleted 487 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 6.383478 sec, idle for 1 hour(s)]
  1308:20161110:183751.335 executing housekeeper
  1308:20161110:183816.057 housekeeper [deleted 487 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 24.718003 sec, idle for 1 hour(s)]
  1315:20161110:195019.151 item "Hikvision:hikvision[storage,status]" became supported
  1315:20161110:200019.514 item "Hikvision:hikvision[storage,capacity]" became supported
  1315:20161110:200023.446 item "Hikvision:hikvision[storage,freeSpace]" became supported
>>>> POWER LOSS HERE <<<<<<
  750:19700101:030024.246 Starting Zabbix Server. Zabbix 3.3.0 (revision 63524).
   750:19700101:030024.304 ****** Enabled features ******
   750:19700101:030024.304 SNMP monitoring:           YES
   750:19700101:030024.304 IPMI monitoring:           YES
   750:19700101:030024.305 Web monitoring:            YES
   750:19700101:030024.305 VMware monitoring:         YES
   750:19700101:030024.306 SMTP authentication:       YES
   750:19700101:030024.306 Jabber notifications:      YES
   750:19700101:030024.306 Ez Texting notifications:  YES
   750:19700101:030024.307 ODBC:                      YES
   750:19700101:030024.307 SSH2 support:              YES
   750:19700101:030024.307 IPv6 support:              YES
   750:19700101:030024.308 TLS support:                NO
   750:19700101:030024.308 ******************************
   750:19700101:030024.309 using configuration file: /etc/zabbix/zabbix_server.conf
   750:19700101:030024.418 [Z3001] connection to database 'trunk' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
   750:19700101:030024.575 database is down: reconnecting in 10 seconds
   750:19700101:030034.601 [Z3001] connection to database 'trunk' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
   750:19700101:030034.782 database is down: reconnecting in 10 seconds
   750:19700101:030044.804 database connection re-established
   750:19700101:030044.875 current database version (mandatory/optional): 03030011/03030011
   750:19700101:030044.876 required mandatory version: 03030011
   750:19700101:030045.486 server #0 started [main process]
  1301:19700101:030045.507 server #1 started [configuration syncer #1]
  1302:19700101:030045.511 server #2 started [db watchdog #1]
  1303:19700101:030045.525 server #3 started [poller #1]
  1304:19700101:030045.548 server #4 started [unreachable poller #1]
  1306:19700101:030045.555 server #5 started [trapper #1]
  1307:19700101:030045.629 server #6 started [icmp pinger #1]
  1309:19700101:030045.685 server #7 started [alerter #1]
  1312:19700101:030045.749 server #9 started [timer #1]
  1311:19700101:030045.784 server #8 started [housekeeper #1]
  1313:19700101:030045.819 server #10 started [http poller #1]
  1314:19700101:030045.861 server #11 started [discoverer #1]
  1317:19700101:030045.869 server #12 started [history syncer #1]
  1320:19700101:030045.952 server #13 started [escalator #1]
  1323:19700101:030046.019 server #14 started [self-monitoring #1]
  1324:19700101:030046.055 server #15 started [task manager #1]
  1317:20161110:201422.488 item "trim became not supported: Timeout while executing a shell script.
  1317:20161110:201517.281 item "trim" became supported
  1311:20161110:204245.780 executing housekeeper
  1311:20161110:204349.036 housekeeper [deleted 35212 hist/trends, 0 items, 17 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 63.250406 sec, idle for 1 hour(s)]
  1320:20161110:204358.446 escalation cancelled: event id:5083 deleted.
  1311:20161110:214349.682 executing housekeeper
  1311:20161110:214419.415 housekeeper [deleted 1991 hist/trends, 0 items, 4 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 29.728824 sec, idle for 1 hour(s)]
  1311:20161110:224420.104 executing housekeeper
  1311:20161110:224436.646 housekeeper [deleted 1991 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 16.538540 sec, idle for 1 hour(s)]
  1311:20161110:234437.439 executing housekeeper
  1311:20161110:234452.312 housekeeper [deleted 2003 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 14.868972 sec, idle for 1 hour(s)]
  1311:20161111:004453.004 executing housekeeper
  1311:20161111:004500.611 housekeeper [deleted 885 hist/trends, 0 items, 0 events, 1 problems, 0 sessions, 0 alarms, 0 audit items in 7.602953 sec, idle for 1 hour(s)]
   750:20161111:005842.912 Got signal [signal:15(SIGTERM),sender_pid:23345,sender_uid:0,reason:0]. Exiting ...
   750:20161111:005845.026 syncing history data...
   750:20161111:005845.027 syncing history data done
   750:20161111:005845.027 syncing trend data...
   750:20161111:005845.081 syncing trend data done
   750:20161111:005845.083 Zabbix Server stopped. Zabbix 3.3.0 (revision 63524).
>>>> the trigger value manually fixed here, server started <<<<<<<<
 23362:20161111:012846.181 executing housekeeper
 23362:20161111:012948.102 housekeeper [deleted 11367 hist/trends, 746 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 61.916295 sec, idle for 1 hour(s)]
 23362:20161111:022948.845 executing housekeeper
 23362:20161111:023035.515 housekeeper [deleted 11613 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 46.665632 sec, idle for 1 hour(s)]
 23362:20161111:033036.249 executing housekeeper
 23362:20161111:033119.040 housekeeper [deleted 11623 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 42.786623 sec, idle for 1 hour(s)]
 23362:20161111:043119.895 executing housekeeper
 23362:20161111:043157.343 housekeeper [deleted 11605 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 37.444627 sec, idle for 1 hour(s)]
 23362:20161111:053157.964 executing housekeeper
 23362:20161111:053316.655 housekeeper [deleted 11598 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 78.687485 sec, idle for 1 hour(s)]

number of deleted history before/in/after restart, hmmm, while NVPS is very stable =1.02 and all items have full history/trends .... strange ...

the item history:

mysql> select *, from_unixtime(clock) from history_uint where itemid=24141 and clock > unix_timestamp('2016-11-10 19:56:00') and clock < unix_timestamp('2016-11-10 21:03:00') order by clock asc; 
+--------+------------+-------+-----------+----------------------+
| itemid | clock      | value | ns        | from_unixtime(clock) |
+--------+------------+-------+-----------+----------------------+
|  24141 | 1478800581 |     1 | 796263667 | 2016-11-10 19:56:21  |
|  24141 | 1478800641 |     1 | 433648777 | 2016-11-10 19:57:21  |
|  24141 | 1478800701 |     1 | 594466441 | 2016-11-10 19:58:21  |
|  24141 | 1478800762 |     1 | 215624657 | 2016-11-10 19:59:22  |
|  24141 | 1478800821 |     1 | 404095308 | 2016-11-10 20:00:21  |
|  24141 | 1478801638 |     0 | 117522365 | 2016-11-10 20:13:58  |
|  24141 | 1478801664 |     0 | 897324614 | 2016-11-10 20:14:24  |
|  24141 | 1478801721 |     0 | 693859126 | 2016-11-10 20:15:21  |
|  24141 | 1478801781 |     0 | 864936317 | 2016-11-10 20:16:21  |
trim
|  24141 | 1478804181 |     0 | 904726744 | 2016-11-10 20:56:21  |
|  24141 | 1478804241 |     0 | 984643395 | 2016-11-10 20:57:21  |
|  24141 | 1478804301 |     0 | 355551816 | 2016-11-10 20:58:21  |
|  24141 | 1478804363 |     0 | 112497073 | 2016-11-10 20:59:23  |
|  24141 | 1478804421 |     1 | 950535274 | 2016-11-10 21:00:21  |
|  24141 | 1478804481 |     1 | 822419357 | 2016-11-10 21:01:21  |
|  24141 | 1478804541 |     1 | 875781585 | 2016-11-10 21:02:21  |
+--------+------------+-------+-----------+----------------------+
55 rows in set (0.01 sec)

syslog:

Jan  1 03:00:34 kot2 mysqld: 700101  3:00:34 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
Jan  1 03:00:35 kot2 mysqld: 700101  3:00:35 [Note] Plugin 'FEDERATED' is disabled.
Jan  1 03:00:35 kot2 mysqld: 700101  3:00:35 InnoDB: The InnoDB memory heap is disabled
Jan  1 03:00:35 kot2 mysqld: 700101  3:00:35 InnoDB: Mutexes and rw_locks use GCC atomic builtins
Jan  1 03:00:35 kot2 mysqld: 700101  3:00:35 InnoDB: Compressed tables use zlib 1.2.8
Jan  1 03:00:35 kot2 mysqld: 700101  3:00:35 InnoDB: Using Linux native AIO
Jan  1 03:00:35 kot2 mysqld: 700101  3:00:35 InnoDB: Initializing buffer pool, size = 128.0M
Jan  1 03:00:35 kot2 mysqld: 700101  3:00:35 InnoDB: Completed initialization of buffer pool
Jan  1 03:00:35 kot2 mysqld: 700101  3:00:35 InnoDB: highest supported file format is Barracuda.
Jan  1 03:00:35 kot2 mysqld: InnoDB: Log scan progressed past the checkpoint lsn 30017775804
Jan  1 03:00:35 kot2 mysqld: 700101  3:00:35  InnoDB: Database was not shut down normally!
Jan  1 03:00:35 kot2 mysqld: InnoDB: Starting crash recovery.
Jan  1 03:00:35 kot2 mysqld: InnoDB: Reading tablespace information from the .ibd files...
Jan  1 03:00:35 kot2 mysqld: InnoDB: Restoring possible half-written data pages from the doublewrite
Jan  1 03:00:35 kot2 mysqld: InnoDB: buffer...
Jan  1 03:00:36 kot2 mysqld: InnoDB: Doing recovery: scanned up to log sequence number 30017777223
Jan  1 03:00:37 kot2 mysqld: 700101  3:00:37  InnoDB: Starting an apply batch of log records to the database...
Jan  1 03:00:37 kot2 mysqld: InnoDB: Progress in percents: 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
Jan  1 03:00:37 kot2 mysqld: InnoDB: Apply batch completed
Jan  1 03:00:37 kot2 mysqld: 700101  3:00:37  InnoDB: Waiting for the background threads to start
Jan  1 03:00:37 kot2 kernel: [   37.888198] cfg80211: Calling CRDA to update world regulatory domain
Jan  1 03:00:38 kot2 mysqld: 700101  3:00:38 InnoDB: 5.5.52 started; log sequence number 30017777223
Jan  1 03:00:38 kot2 mysqld: 700101  3:00:38 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
Jan  1 03:00:38 kot2 mysqld: 700101  3:00:38 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
Jan  1 03:00:38 kot2 mysqld: 700101  3:00:38 [Note] Server socket created on IP: '0.0.0.0'.
Jan  1 03:00:39 kot2 mysqld: 700101  3:00:39 [Note] Event Scheduler: Loaded 0 events
Jan  1 03:00:39 kot2 mysqld: 700101  3:00:39 [Note] /usr/sbin/mysqld: ready for connections.
Jan  1 03:00:39 kot2 mysqld: Version: '5.5.52-0+deb8u1'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Raspbian)
Jan  1 03:00:40 kot2 mysql[402]: Starting MySQL database server: mysqld . . . . . . . . ..
Jan  1 03:00:40 kot2 mysql[402]: Checking for tables which need an upgrade, are corrupt or were
Jan  1 03:00:40 kot2 systemd[1]: Started LSB: Start and stop the mysql database server daemon.
Jan  1 03:00:40 kot2 mysql[402]: not closed cleanly..
Jan  1 03:00:40 kot2 systemd[1]: Starting Multi-User System.
Jan  1 03:00:40 kot2 systemd[1]: Reached target Multi-User System.
Jan  1 03:00:40 kot2 systemd[1]: Starting Graphical Interface.
Jan  1 03:00:40 kot2 systemd[1]: Reached target Graphical Interface.
Jan  1 03:00:40 kot2 systemd[1]: Starting Update UTMP about System Runlevel Changes...
Jan  1 03:00:40 kot2 /etc/mysql/debian-start[1258]: Upgrading MySQL tables if necessary.
Jan  1 03:00:40 kot2 systemd[1]: Started Update UTMP about System Runlevel Changes.
Jan  1 03:00:40 kot2 systemd[1]: Startup finished in 2.514s (kernel) + 38.023s (userspace) = 40.538s.
Jan  1 03:00:40 kot2 kernel: [   41.048227] cfg80211: Calling CRDA to update world regulatory domain
Jan  1 03:00:41 kot2 /etc/mysql/debian-start[1265]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored
Jan  1 03:00:41 kot2 /etc/mysql/debian-start[1265]: Looking for 'mysql' as: /usr/bin/mysql
Jan  1 03:00:41 kot2 /etc/mysql/debian-start[1265]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck
Jan  1 03:00:41 kot2 /etc/mysql/debian-start[1265]: This installation of MySQL is already upgraded to 5.5.52, use --force if you still need to run mysql_upgrade
Jan  1 03:00:41 kot2 /etc/mysql/debian-start[1277]: Checking for insecure root accounts.
Jan  1 03:00:41 kot2 /etc/mysql/debian-start[1282]: WARNING: mysql.user contains 4 root accounts without password!
Jan  1 03:00:41 kot2 /etc/mysql/debian-start[1283]: Triggering myisam-recover for all MyISAM tables
Jan  1 03:00:43 kot2 ntpd_intres[469]: host name not found: 1.debian.pool.ntp.org
Jan  1 03:00:43 kot2 ntpd_intres[469]: DNS 2.debian.pool.ntp.org -> 195.138.82.247
Jan  1 03:00:43 kot2 ntpd_intres[469]: DNS 3.debian.pool.ntp.org -> 78.154.163.166
Jan  1 03:00:44 kot2 kernel: [   44.208998] cfg80211: Calling CRDA to update world regulatory domain
Jan  1 03:00:47 kot2 kernel: [   47.368453] cfg80211: Calling CRDA to update world regulatory domain
Jan  1 03:00:50 kot2 kernel: [   50.528469] cfg80211: Calling CRDA to update world regulatory domain
Jan  1 03:00:53 kot2 kernel: [   53.688559] cfg80211: Exceeded CRDA call max attempts. Not calling CRDA
Nov 10 20:13:56 kot2 systemd[1]: Time has been changed

This case could be more close to already mentioned ZBX-11426, but let's keep it here.
ADDED: yes, remembered that in email client I have those alerts, and there I can see header "Date: Thu, 01 Jan 1970 03:00:54 +0300", while actually received 10 Nov 2016 20:13:29 +0200
So this comment is really about ZBX-11426.

I had another power lost today, but the trigger recovered to OK before first housekeeper activity, so our "problem" did not appear.

Comment by Dawid Mos [ 2016 Dec 06 ]

Same issue here, but it wasn't preceded by any system or hardware crash.
It seems to happen when some issue exists longer (few days or more) and then won't close even if the trigger expression doesn't match.

Temporary solved by UPDATE listed above, but I did it already two times and it happens again, so it looks like a bug.

Comment by Pradeep Patil [ 2016 Dec 21 ]

Hello,

Since it is not sending "OK" alert I believe Actions also will not happen for these events right?

Comment by Oleksii Zagorskyi [ 2016 Dec 22 ]

Pradeep, hmm, didn't pay attention as for alerts of such OK events. I cannot recall them in my real case.
Probably multiple OK/recovery alerts are not being sent, otherwise people would complain on that too, and in first order.

Comment by Oleksii Zagorskyi [ 2017 Jan 23 ]

I think that we need to close this one as duplicate of ZBX-11454, as the issue mainly is resolved, while there are still points for investigation, but that should be performed in separate issue report.

Comment by Alexander Vladishev [ 2017 Feb 06 ]

Closed as duplicate of ZBX-11454.

Generated at Sat Apr 27 04:07:32 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.