[ZBXNEXT-3681] Implement bulk inserts with array binds for Oracle Created: 2017 Feb 03  Updated: 2017 Feb 14  Resolved: 2017 Feb 14

Status: Closed
Project: ZABBIX FEATURE REQUESTS
Component/s: Proxy (P), Server (S)
Affects Version/s: None
Fix Version/s: 3.0.8, 3.2.4, 3.4.0alpha1

Type: Change Request Priority: Major
Reporter: Andris Zeila Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: bulk, history, oracle
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate

 Description   

When inserting rows into Oracle database Zabbix binds columns and executes insert statement for every row. This can cause performance problems, especially in high latency situations.

This can be improved by binding arrays instead of single values. Array binding had limitations, so dynamic parameter binding was implemented instead.



 Comments   
Comment by Oleksii Zagorskyi [ 2017 Feb 03 ]

To keep things linked - bind inserts for Oracle were implemented in ZBXNEXT-166.

Comment by Andris Zeila [ 2017 Feb 06 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBXNEXT-3681

Comment by Vladislavs Sokurenko [ 2017 Feb 07 ]

(1) There is TODO: left, is this intentional ?

case ZBX_TYPE_ID: /* TODO: handle 0 -> NULL conversion */

wiper 'TODO' is a leftover, the rest of the comment is valid.

RESOLVED in r65557
vso CLOSED

Comment by Vladislavs Sokurenko [ 2017 Feb 07 ]

(2) Small Memory leak on database failure.

As you see below, if zbx_db_bind_parameter_dyn fails it is called again without performing clear

	rc = zbx_db_bind_parameter_dyn(context, position, type, rows, rows_num);

	while (ZBX_DB_DOWN == rc)
	{
		DBclose();
		DBconnect(ZBX_DB_CONNECT_NORMAL);

		if (ZBX_DB_DOWN == (rc = zbx_db_bind_parameter_dyn(context, position, type, rows, rows_num)))
		{
			zabbix_log(LOG_LEVEL_ERR, "database is down: retrying in %d seconds", ZBX_DB_WAIT_DOWN);
			connection_failure = 1;
			sleep(ZBX_DB_WAIT_DOWN);
		}
	}

Possibly related to this
Something wrong in database down handling
If zbx_db_bind_parameter_dyn returns ZBX_DB_DOWN there is memory corruption. investigating why..

 26238:20170207:172927.016 [Z3007] query failed: [-1] ORA-01036: illegal variable name/number
 26238:20170207:172927.016 [Z3007] query failed: [-1] ORA-01036: illegal variable name/number
 26238:20170207:172927.051 [Z3007] query failed: [-1] ORA-01036: illegal variable name/number
*** Error in `./install/sbin/zabbix_server: history syncer #1 [synced 0 items in 0.000001 sec, syncing history]': free(): invalid pointer: 0x0000000000d2a6a0 ***
*** Error in `./install/sbin/zabbix_server: history syncer #1 [synced 0 items in 0.000001 sec, syncing history]': corrupted double-linked list: 0x0000000000d2db80 ***
 26228:20170207:172927.417 cannot parse host availability data from active proxy at "127.0.0.1": missing name of proxy
 26228:20170207:172928.418 cannot parse host availability data from active proxy at "127.0.0.1": missing name of proxy
 26228:20170207:172929.420 cannot parse host availability data from active proxy at "127.0.0.1": missing name of proxy
 26227:20170207:172929.844 [Z3007] query failed: [-1] ORA-01036: illegal variable name/number
*** Error in `./install/sbin/zabbix_server: trapper #1 [connecting to the database]': free(): invalid pointer: 0x0000000000d21ae0 ***
 26228:20170207:172930.422 cannot parse host availability data from active proxy at "127.0.0.1": missing name of proxy
 26231:20170207:172931.423 cannot parse host availability data from active proxy at "127.0.0.1": missing name of proxy
 26228:20170207:172932.424 cannot parse host availability data from active proxy at "127.0.0.1": missing name of proxy
 26218:20170207:172932.925 One child process died (PID:26227,exitcode/signal:6). Exiting ...
 26218:20170207:172934.959 syncing history data...
 26218:20170207:172934.995 [Z3007] query failed: [-1] ORA-01036: illegal variable name/number
*** Error in `./install/sbin/zabbix_server': free(): invalid pointer: 0x00007f6cfcaf87b8 ***

wiper Fixed memory leak and lost database connection handling. Not sure about the errors, couldn't repeat it.
RESOLVED in r65579

vso REOPENED it still leaks.

Here zbx_db_bind_parameter_dyn allocates

retry_oracle:
	DBstatement_prepare(sql_command);

	for (j = 0; j < self->fields.values_num; j++)
	{
		field = (ZBX_FIELD *)self->fields.values[j];

		if (ZBX_DB_OK > zbx_db_bind_parameter_dyn(&contexts[j], j, field->type,
				(zbx_db_value_t **)self->rows.values, self->rows.values_num))
		{
			goto out;
		}
	}

Here on database failure it will retry, there is no zbx_db_clean_bind_context in between.
Only if zbx_db_bind_parameter_dyn fails which is unlikely.

if (ZBX_DB_DOWN == (rc = zbx_db_statement_execute(self->rows.values_num)))
	{
		if (0 < tries++)
		{
			zabbix_log(LOG_LEVEL_ERR, "database is down: retrying in %d seconds", ZBX_DB_WAIT_DOWN);
			connection_failure = 1;
			sleep(ZBX_DB_WAIT_DOWN);
		}

		DBclose();
		DBconnect(ZBX_DB_CONNECT_NORMAL);

		goto retry_oracle;
	}

wiper RESOLVED in r65581
vso CLOSED

Comment by Vladislavs Sokurenko [ 2017 Feb 08 ]

(3) Dangling pointer.
As you see here contexts are allocated depending on self->fields.values_num and not initialized

contexts = (zbx_db_bind_context_t *)zbx_malloc(NULL, sizeof(zbx_db_bind_context_t) * self->fields.values_num);

Later on as you see there is no guarantee that all contexts will be initialized, because it can stop in the middle of the work

	for (j = 0; j < self->fields.values_num; j++)
	{
		field = (ZBX_FIELD *)self->fields.values[j];

		if (ZBX_DB_OK > zbx_db_bind_parameter_dyn(&contexts[j], j, field->type,
				(zbx_db_value_t **)self->rows.values, self->rows.values_num))
		{
			goto out;
		}
	}

But when freeing it check not initialized count but same parameter values_num

	for (j = 0; j < self->fields.values_num; j++)
		zbx_db_clean_bind_context(&contexts[j]);

wiper RESOLVED in r65582
vso CLOSED

Comment by Vladislavs Sokurenko [ 2017 Feb 08 ]

(4) Unused function DBstatement_execute()

wiper RESOLVED in r65591

The zbx_db_statement_prepare() function succeeds even when connection is down, so we could probably use it directly and remove DBstatement_prepare() function. However let's not break too many things in 3.0

vso CLOSED with small style fixes in r65592

wiper Reviewed, thanks!

Comment by Vladislavs Sokurenko [ 2017 Feb 09 ]

Successfully tested

Comment by Andris Zeila [ 2017 Feb 10 ]

Released in:

  • pre-3.0.8rc1 r65629
  • pre-3.2.4rc1 r65630
  • pre-3.3.0 65631
Comment by Andris Zeila [ 2017 Feb 14 ]

Documented in:

vso Looks good.

sasha CLOSED

Comment by Oleksii Zagorskyi [ 2017 Feb 14 ]

Tested on the affected production. It resolved the issue where 16 syncers were on "an edge" (while being busy for ~60% only) when syncing ~1K incomming NVPS. Write cache slowly dropped all the time.
Delay to db server is ~5 ms and it was not possible to decrease it.

Now the 16 syncers are ~10% busy and write cache is stable on ~99% free.
THANKS !

Generated at Thu Mar 28 21:35:56 EET 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.