-
Problem report
-
Resolution: Duplicate
-
Critical
-
None
-
2.4.6, 2.4.7
-
CentOS 7, MariaDB with Galera Cluster
-
Sprint 21, Sprint 23
Oleksiy's UPDATE: see the comment below to understand why this issue is related to the escalator, not the alerter.
I have a problem with the alerter process. It gets stuck (stops sending e-mails) after 2-3 days of Zabbix server uptime.
The process itself appears to be working correctly, as the `ps` command shows:
7426 ? S 0:03 \_ /usr/sbin/zabbix_server: alerter [sent alerts: 0 success, 0 fail in 0.000784 sec, idle 30 sec]
Every 30 seconds it runs:
7426 ? S 0:03 \_ /usr/sbin/zabbix_server: alerter [sent alerts: 0 success, 0 fail in 0.000736 sec, idle 30 sec]
But no emails are sent. When I kill the process (only this one zabbix_server fork), Zabbix restarts it and all the old emails are sent immediately. The same happens when I restart the zabbix_server service.
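The symptom above (the alerter cycling every 30 seconds with 0 success and 0 fail) can be spotted from the process title alone. Below is a minimal sketch, assuming the status text always has the format shown in the `ps` output in this report; the function names `parse_alerter_status` and `looks_idle` are hypothetical helpers, not part of Zabbix. Note that an idle cycle is also normal when there are simply no alerts pending, so this only flags candidates for a closer look.

```python
import re

# Pattern for the alerter status text as it appears in the `ps` output
# quoted in this report (assumed stable; not taken from Zabbix sources).
STATUS_RE = re.compile(
    r"alerter \[sent alerts: (?P<success>\d+) success, (?P<fail>\d+) fail "
    r"in (?P<elapsed>\d+\.\d+) sec, idle (?P<idle>\d+) sec\]"
)


def parse_alerter_status(line):
    """Return the counters from an alerter status line, or None if absent."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return {
        "success": int(m.group("success")),
        "fail": int(m.group("fail")),
        "elapsed": float(m.group("elapsed")),
        "idle": int(m.group("idle")),
    }


def looks_idle(line):
    """True when the alerter reports a cycle with nothing sent and nothing failed."""
    status = parse_alerter_status(line)
    return status is not None and status["success"] == 0 and status["fail"] == 0
```

Feeding it successive `ps` lines and counting how long `looks_idle` stays true while problems are known to be firing would give a rough indication that notifications have stalled.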
While it was stuck, I increased the debug level for the alerter:
7426:20151212:112149.032 alerter [sending alerts]
7426:20151212:112149.033 query [txnlev:0] [select a.alertid,a.mediatypeid,a.sendto,a.subject,a.message,a.status,mt.mediatypeid,mt.type,mt.description,mt.smtp_server,mt.smtp_helo,mt.smtp_email,mt.exec_path,mt.gsm_modem,mt.username,mt.passwd,a.retries from alerts a,media_type mt where a.mediatypeid=mt.mediatypeid and a.status=0 and a.alerttype=0 order by a.alertid]
7426:20151212:112149.034 alerter [sent alerts: 0 success, 0 fail in 0.000906 sec, idle 30 sec]
I ran this SQL query against the Zabbix database myself, and it returned 0 rows. But after the fork is restarted (or zabbix_server is restarted), the same query returns many rows and the emails are sent.
16653 ? S 0:00 \_ /usr/sbin/zabbix_server: alerter [sending alerts]
And the Zabbix server log:
16653:20151212:115132.310 query [txnlev:0] [update alerts set status=1,error='' where alertid=233846]
16653:20151212:115132.659 In execute_action(): alertid [233847] mediatype [0]
16653:20151212:115132.659 In send_email() smtp_server:'127.0.0.1'
16653:20151212:115132.726 End of send_email():SUCCEED
16653:20151212:115132.726 End of execute_action():SUCCEED
16653:20151212:115132.726 alert ID [233847] was sent successfully
16653:20151212:115132.726 query without transaction detected
16653:20151212:115132.726 query [txnlev:0] [update alerts set status=1,error='' where alertid=233847]
16653:20151212:115132.757 In execute_action(): alertid [233848] mediatype [0]
16653:20151212:115132.757 In send_email() smtp_server:'127.0.0.1'
16653:20151212:115132.815 End of send_email():SUCCEED
16653:20151212:115132.815 End of execute_action():SUCCEED
16653:20151212:115132.815 alert ID [233848] was sent successfully
16653:20151212:115132.815 query without transaction detected
16653:20151212:115132.815 query [txnlev:0] [update alerts set status=1,error='' where alertid=233848]
16653:20151212:115132.978 In execute_action(): alertid [233849] mediatype [0]
16653:20151212:115132.978 In send_email() smtp_server:'127.0.0.1'
16653:20151212:115133.407 End of send_email():SUCCEED
16653:20151212:115133.407 End of execute_action():SUCCEED
16653:20151212:115133.407 alert ID [233849] was sent successfully
16653:20151212:115133.407 query without transaction detected
16653:20151212:115133.407 query [txnlev:0] [update alerts set status=1,error='' where alertid=233849]
16653:20151212:115133.546 In execute_action(): alertid [233850] mediatype [0]
16653:20151212:115133.546 In send_email() smtp_server:'127.0.0.1'
16653:20151212:115133.918 End of send_email():SUCCEED
16653:20151212:115133.919 End of execute_action():SUCCEED
16653:20151212:115133.919 alert ID [233850] was sent successfully
16653:20151212:115133.919 query without transaction detected
16653:20151212:115133.919 query [txnlev:0] [update alerts set status=1,error='' where alertid=233850]
16653:20151212:115133.930 In execute_action(): alertid [233851] mediatype [0]
16653:20151212:115133.930 In send_email() smtp_server:'127.0.0.1'
16653:20151212:115134.091 End of send_email():SUCCEED
16653:20151212:115134.091 End of execute_action():SUCCEED
16653:20151212:115134.091 alert ID [233851] was sent successfully
16653:20151212:115134.091 query without transaction detected
16653:20151212:115134.091 query [txnlev:0] [update alerts set status=1,error='' where alertid=233851]
16653:20151212:115134.569 In execute_action(): alertid [233852] mediatype [0]
16653:20151212:115134.569 In send_email() smtp_server:'127.0.0.1'
16653:20151212:115135.334 End of send_email():SUCCEED
16653:20151212:115135.334 End of execute_action():SUCCEED
16653:20151212:115135.334 alert ID [233852] was sent successfully
16653:20151212:115135.334 query without transaction detected
16653:20151212:115135.334 query [txnlev:0] [update alerts set status=1,error='' where alertid=233852]
16653:20151212:115135.861 In execute_action(): alertid [233853] mediatype [0]
16653:20151212:115135.861 In send_email() smtp_server:'127.0.0.1'
16653:20151212:115136.639 End of send_email():SUCCEED
16653:20151212:115136.639 End of execute_action():SUCCEED
16653:20151212:115136.639 alert ID [233853] was sent successfully
16653:20151212:115136.639 query without transaction detected
16653:20151212:115136.639 query [txnlev:0] [update alerts set status=1,error='' where alertid=233853]
16653:20151212:115137.327 In execute_action(): alertid [233854] mediatype [0]
16653:20151212:115137.327 In send_email() smtp_server:'127.0.0.1'
16653:20151212:115137.574 End of send_email():SUCCEED
16653:20151212:115137.574 End of execute_action():SUCCEED
16653:20151212:115137.574 alert ID [233854] was sent successfully
16653:20151212:115137.574 query without transaction detected
16653:20151212:115137.574 query [txnlev:0] [update alerts set status=1,error='' where alertid=233854]
16653:20151212:115137.595 In execute_action(): alertid [233855] mediatype [0]
16653:20151212:115137.595 In send_email() smtp_server:'127.0.0.1'
16653:20151212:115137.957 End of send_email():SUCCEED
16653:20151212:115137.957 End of execute_action():SUCCEED
16653:20151212:115137.957 alert ID [233855] was sent successfully
16653:20151212:115137.957 query without transaction detected
..........
16653:20151212:115200.571 alerter [sent alerts: 68 success, 0 fail in 156.770028 sec, idle 30 sec]
While it is stuck, none of the remote commands in Actions (custom scripts) work either.
- duplicates
-
ZBX-12925 again. still failed transactions rollback on DB2 backend
- Closed