[ZBX-18971] improve error messages in case of unrecoverable errors from mysql Created: 2021 Feb 04  Updated: 2024 Apr 10  Resolved: 2021 Nov 30

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: None
Fix Version/s: 6.0 (plan)

Type: Documentation task Priority: Minor
Reporter: Oleksii Zagorskyi Assignee: Artjoms Rimdjonoks
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
Team: Team C
Sprint: Sprint 76 (May 2021), Sprint 77 (Jun 2021), Sprint 78 (Jul 2021), Sprint 79 (Aug 2021), Sprint 80 (Sep 2021), Sprint 81 (Oct 2021), Sprint 82 (Nov 2021)
Story Points: 1

 Description   

Take a look at this log:

 27680:20210114:171831.958 [Z3005] query failed: [2013] Lost connection to MySQL server during query [begin;]
 27669:20210114:172217.750 [Z3005] query failed: [2013] Lost connection to MySQL server during query [begin;]
 27673:20210114:173031.830 [Z3005] query failed: [2013] Lost connection to MySQL server during query [begin;]
 27489:20210114:175119.574 [Z3005] query failed: [2013] Lost connection to MySQL server during query [begin;]
 27443:20210114:180321.494 [Z3005] query failed: [2013] Lost connection to MySQL server during query [begin;]
 27722:20210114:182528.661 [Z3005] query failed: [2013] Lost connection to MySQL server during query [commit;]
 27722:20210114:182528.661 [Z3005] query failed: [2006] MySQL server has gone away [rollback;]
 27407:20210114:182528.662 [Z3005] query failed: [1053] Server shutdown in progress [select h.hostid,h.host,h.name,t.httptestid,t.name,t.agent,t.authentication,t.http_user,t.http_password,t.http_proxy,t.retries,t.ssl_cert_file,t.ssl_key_file,t.ssl_key_password,t.verify_peer,t.verify_host,t.delay from httptest t,hosts h where t.hostid=h.hostid and t.nextcheck<=1610648728 and mod(t.httptestid,50)=33 and t.status=0 and h.proxy_hostid is null and h.status=0 and (h.maintenance_status=0 or h.maintenance_type=0)]
 27386:20210114:182528.667 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select type,itemid from httptestitem where httptestid=962]
 27395:20210114:182528.672 [Z3005] query failed: [1053] Server shutdown in progress [select type,itemid from httpstepitem where httpstepid=873]
 27669:20210114:182528.672 [Z3005] query failed: [2013] Lost connection to MySQL server during query [update hosts set errors_from=0,disable_until=0 where hostid=10647]
 27737:20210114:182528.674 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select distinct t.triggerid,t.description,t.expression,t.status,t.type,t.priority,t.comments,t.url,t.recovery_expression,t.recovery_mode,t.correlation_mode,t.correlation_tag,t.manual_close,t.opdata,t.discover from triggers t,functions f,items i,item_discovery id where t.triggerid=f.triggerid and f.itemid=i.itemid and i.itemid=id.itemid and id.parent_itemid=54872]
 27429:20210114:182528.674 [Z3005] query failed: [2013] Lost connection to MySQL server during query [begin;]
 27722:20210114:182528.697 [Z3001] connection to database 'zabbix' failed: [9002] Some errors occured.
 27722:20210114:182528.697 Cannot connect to the database. Exiting...
 27367:20210114:182528.703 One child process died (PID:27722,exitcode/signal:1). Exiting ...
 27367:20210114:182528.865 [Z3001] connection to database 'zabbix' failed: [9002] Some errors occured.
 27367:20210114:182528.865 Cannot connect to the database. Exiting...
  8144:20210114:182533.144 Starting Zabbix Server. Zabbix 5.0.3 (revision 146855bff3).
  8144:20210114:182533.144 ****** Enabled features ******
  8144:20210114:182533.144 SNMP monitoring:           YES
  8144:20210114:182533.144 IPMI monitoring:           YES
  8144:20210114:182533.144 Web monitoring:            YES
  8144:20210114:182533.144 VMware monitoring:         YES
  8144:20210114:182533.144 SMTP authentication:       YES
  8144:20210114:182533.144 ODBC:                      YES
  8144:20210114:182533.144 SSH support:               YES
  8144:20210114:182533.144 IPv6 support:              YES
  8144:20210114:182533.144 TLS support:               YES
  8144:20210114:182533.144 ******************************
  8144:20210114:182533.144 using configuration file: /etc/zabbix/zabbix_server.conf
  8144:20210114:182533.189 [Z3001] connection to database 'zabbix' failed: [9002] Some errors occured.
  8144:20210114:182533.189 Cannot connect to the database. Exiting...

It looks like the exit is caused by this specific error:
[9002] Some errors occured.

I looked at the related code and it looks a bit strange:

/******************************************************************************
 *                                                                            *
 * Function: DBconnect                                                        *
 *                                                                            *
 * Purpose: connect to the database                                           *
 *                                                                            *
 * Parameters: flag - ZBX_DB_CONNECT_ONCE (try once and return the result),   *
 *                    ZBX_DB_CONNECT_EXIT (exit on failure) or                *
 *                    ZBX_DB_CONNECT_NORMAL (retry until connected)           *
 *                                                                            *
 * Return value: same as zbx_db_connect()                                     *
 *                                                                            *
 ******************************************************************************/
int	DBconnect(int flag)
{
	int	err;

	zabbix_log(LOG_LEVEL_DEBUG, "In %s() flag:%d", __func__, flag);

	while (ZBX_DB_OK != (err = zbx_db_connect(CONFIG_DBHOST, CONFIG_DBUSER, CONFIG_DBPASSWORD,
			CONFIG_DBNAME, CONFIG_DBSCHEMA, CONFIG_DBSOCKET, CONFIG_DBPORT, CONFIG_DB_TLS_CONNECT,
			CONFIG_DB_TLS_CERT_FILE, CONFIG_DB_TLS_KEY_FILE, CONFIG_DB_TLS_CA_FILE, CONFIG_DB_TLS_CIPHER,
			CONFIG_DB_TLS_CIPHER_13)))
	{
		if (ZBX_DB_CONNECT_ONCE == flag)
			break;

		if (ZBX_DB_FAIL == err || ZBX_DB_CONNECT_EXIT == flag)
		{
			zabbix_log(LOG_LEVEL_CRIT, "Cannot connect to the database. Exiting...");
			exit(EXIT_FAILURE);
		}

Sorry, I could be wrong as I'm not a programmer, but it looks strange to me.
So maybe the failure reasons in the code could be separated, with a different error message printed for each.

It's not very clear why the Zabbix server terminated itself, when we know that it usually tries to reconnect after 10 seconds.
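
The suggestion above, separating the exit reasons and printing a different message for each, could be sketched roughly as follows. This is only an illustration under assumptions: the helper name db_connect_exit_reason and the message texts are hypothetical, and the constant values are stand-ins for the real definitions in the Zabbix headers.

```c
#include <stddef.h>

/* Stand-in values for illustration only; the real definitions live in the Zabbix headers. */
enum { ZBX_DB_OK = 0, ZBX_DB_FAIL = -1, ZBX_DB_DOWN = -2 };
enum { ZBX_DB_CONNECT_NORMAL = 0, ZBX_DB_CONNECT_EXIT = 1, ZBX_DB_CONNECT_ONCE = 2 };

/*
 * Hypothetical helper (not part of Zabbix): returns the message to log
 * before exiting, or NULL when the caller should not exit (success,
 * single attempt requested, or a retry is still allowed).
 */
static const char	*db_connect_exit_reason(int err, int flag)
{
	if (ZBX_DB_OK == err || ZBX_DB_CONNECT_ONCE == flag)
		return NULL;

	/* the database library reported an error that is treated as unrecoverable */
	if (ZBX_DB_FAIL == err)
		return "Cannot connect to the database: unrecoverable error, not retrying. Exiting...";

	/* the error is recoverable, but the caller asked to exit on any failure */
	if (ZBX_DB_CONNECT_EXIT == flag)
		return "Cannot connect to the database: retrying is disabled for this connection. Exiting...";

	return NULL;	/* recoverable error with ZBX_DB_CONNECT_NORMAL: wait and retry */
}
```

With such a split, the log would show whether the process exited because the error itself was considered unrecoverable or because the caller requested an exit on any failure.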



 Comments   
Comment by Artjoms Rimdjonoks [ 2021 May 17 ]

Investigation

The error message that gets observed in the Zabbix logs:

"[9002] Some errors occured."

is what the database sends to the MySQL connector library (and then on to Zabbix) when it refuses to set up a connection:
"9002" is the error code, returned by the library call mysql_errno(conn),
and
"Some errors occured" is the error text, returned by mysql_error(conn).

./src/libs/zbxdb/db.c:
        if (ZBX_DB_OK == ret &&
                        NULL == mysql_real_connect(conn, host, user, password, dbname, port, dbsocket,
                                CLIENT_MULTI_STATEMENTS))
        {
                zbx_db_errlog(ERR_Z3001, mysql_errno(conn), mysql_error(conn), dbname);
                ret = ZBX_DB_FAIL;
        }

The 9002 error code is not defined in the MySQL source code or in Zabbix; it is Azure-specific.
The only way to debug further why the connection was refused is to investigate the Azure logs (or to find official Azure MySQL documentation that references this error; none seems to be publicly available).

Neither Zabbix nor the MySQL connector has any other data available that could hint at why the connection was not successful.
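
To illustrate how the two values end up in the log line, here is a minimal sketch that reproduces the layout of the observed message. The helper name format_connect_error is hypothetical and the format string is inferred from the log excerpt in the description, not copied from the Zabbix source; in the real code the values would come from mysql_errno(conn) and mysql_error(conn).

```c
#include <stdio.h>

/*
 * Hypothetical helper: builds a message with the same layout as the
 * "[Z3001] connection to database ..." line seen in the log.
 * errcode stands in for mysql_errno(conn), errtext for mysql_error(conn).
 */
static int	format_connect_error(char *buf, size_t size, const char *dbname,
		unsigned int errcode, const char *errtext)
{
	return snprintf(buf, size, "[Z3001] connection to database '%s' failed: [%u] %s",
			dbname, errcode, errtext);
}
```

Both the number and the text are passed through unchanged, which is why Zabbix cannot say more than the server itself reported.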

Comment by dimir [ 2021 May 17 ]

Can we detect 9002 and add a hint to the log message to get more details from Azure logs?

arimdjonoks: We can detect 9002, but I would rather avoid writing code around it because:
1) this code is not part of MySQL itself: Azure invented it and injected it into its hosted MySQL distribution. What if Amazon ships its own MySQL distribution with its own meaning for error 9002? What if Azure changes this code later?
2) I have not found any official documentation that mentions code 9002; it is mentioned only a few times by users in unofficial sources.

It is easy to add this code, but testing and maintaining it would be quite expensive. If the issue repeats, I would rather mention this error in the "Known Issues". (It is really not Zabbix's fault that Azure has such vague errors.)

andris: The current message "[9002] Some errors occured" may leave the user feeling helpless. It seems that "[9002]" can come with other text messages, not only with "Some errors occured".

I propose that we detect this code, log what we got from Azure, and append a hint such as: See Zabbix documentation "Known issues". There, in the Zabbix online documentation, we can describe and maintain everything we know about error 9002 without changing the Zabbix source code.

Another thing: is terminating the Zabbix server (instead of retrying endlessly) the best action when error 9002 shows up? zalex_ua?

zalex_ua: I do not know any more about 9002 than you do, sorry. Google is our only help here. I do not know whether we must terminate on this error or could retry instead.
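
For reference, the proposal discussed above (detect 9002, log what the server returned, and append a pointer to the documentation) might look roughly like the sketch below. The macro name AZURE_ERR_9002, the helper name, and the exact hint wording are assumptions for illustration only; nothing here is taken from the Zabbix source.

```c
#include <stdio.h>

#define AZURE_ERR_9002	9002	/* hypothetical name; the value is the code seen in the log */

/*
 * Hypothetical helper: formats the connection error exactly as received
 * and appends a documentation hint only when the code is 9002.
 */
static int	format_connect_error_with_hint(char *buf, size_t size, const char *dbname,
		unsigned int errcode, const char *errtext)
{
	const char	*hint = (AZURE_ERR_9002 == errcode ?
			" See Zabbix documentation \"Known issues\"." : "");

	return snprintf(buf, size, "connection to database '%s' failed: [%u] %s%s",
			dbname, errcode, errtext, hint);
}
```

The original code and text are kept intact, so whatever message accompanies 9002 is still logged, and all explanation stays in the documentation rather than in the source.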

Comment by Alexander Vladishev [ 2021 Nov 30 ]

Documentation updated:
