[ZBX-11765] Transactions are not properly implemented for IBM DB2 Created: 2017 Feb 02  Updated: 2017 May 30  Resolved: 2017 Feb 28

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 3.0.6
Fix Version/s: 3.0.9rc1, 3.2.5rc1, 3.4.0alpha1

Type: Incident report Priority: Blocker
Reporter: Vjaceslavs Bogdanovs Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: db2, server, transaction
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
Team: Team A
Sprint: Sprint 1, Sprint 2
Story Points: 0.5

 Description   

Current DB solution does not implement valid transactions for DB2 as DBrollback is not related to DBbegin. This causes the problem when single rollback terminates all cursors, prepared statements, rows, temp files, etc. as no SAVEPOINT is set with DBbegin.

IBM DB2 documentation states:

ROLLBACK without the TO SAVEPOINT clause also causes the following actions to occur:

All locks that are implicitly acquired during the unit of recovery are released.
All cursors are closed, all prepared statements are destroyed, and any cursors that are associated with the prepared statements are invalidated.
All rows and all logical work files of every created temporary table of the application process are deleted. (All the rows of a declared temporary table are not implicitly deleted. As with base tables, any changes that are made to a declared temporary table during the unit of recovery are undone to restore the table to its state at the last commit point.)
All LOB locators, including those that are held, are freed.

Consider the following example:

DBselect                  // select some data
while( DBfetch )          // while data is present
{
        DBbegin           // begins transaction
        DBexecute         // executes statement
        DBrollback        // performs rollback
}          

Expected result is All records are processed by DBfetch, no data is altered by DBexecute (as rollback is called), but current result is DBrollback breaks while loop execution as DBfetch returns NULL after first DBrollback call



 Comments   
Comment by Sergejs Paskevics [ 2017 Feb 13 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-11765

Comment by Vladislavs Sokurenko [ 2017 Feb 14 ]

(1) Savepoints are only cleared on success leaving next point saving to fail.

Here we see that savepoints are only added if there are no already added, else set ZBX_DB_FAIL

if (FAIL == zbx_vector_str_search(&ibm_db2_savepoints, new_savepoint, ZBX_DEFAULT_STR_COMPARE_FUNC) &&
			0 <= (rc = zbx_db_execute("savepoint %s on rollback retain cursors;", new_savepoint)))

Later on we see that savepoints are only removed on success.

if (ZBX_DB_OK == rc)
	{
		zbx_ibm_db2_remove_savepoints();
		zbx_vector_str_destroy(&ibm_db2_savepoints);
	}

Which leaves us in peculiar situation as we will try to add new save point on database down we will get error.

Note that rollback is not performed on ZBX_DB_DOWN so it will not clear.


	if (ZBX_DB_FAIL == rc)
	{
rollback:
		zabbix_log(LOG_LEVEL_DEBUG, "commit called on failed transaction, doing a rollback instead");
		return zbx_db_rollback();
	}

s.paskevics RESOLVED in r65731. savepoint vector has be removed. Please, check new implementation.

vso CLOSED

Comment by Vladislavs Sokurenko [ 2017 Feb 14 ]

(2) Redundant vector and logic around it.

To create vector we search if savepoint already exists, even though there can be only one save point at a time, we still use array for some reason.
Also we fail if save point already added, this logic is redundant as it shall never happen because there is already check for nested transaction.
This only adds problems if on some condition we forget to clear vector, then it fail, but actually for no reason.

	int	rc = ZBX_DB_FAIL;

	if (FAIL == zbx_vector_str_search(&ibm_db2_savepoints, new_savepoint, ZBX_DEFAULT_STR_COMPARE_FUNC) &&
			0 <= (rc = zbx_db_execute("savepoint %s on rollback retain cursors;", new_savepoint)))
	{
		zbx_vector_str_append(&ibm_db2_savepoints, zbx_strdup(NULL, new_savepoint));

		rc = ZBX_DB_OK;
	}

Same for rollback, we search for savepoint in array while there can be only one savepoint, we also clear safe points after our safe point for some reason, while there can be only one safepoint.

	if (FAIL != (idx = zbx_vector_str_search(&ibm_db2_savepoints, savepoint, ZBX_DEFAULT_STR_COMPARE_FUNC)) &&
			0 <= (rc = zbx_db_execute("rollback to savepoint %s;", ibm_db2_savepoints.values[idx])))
	{
		rc = ZBX_DB_OK;
	}

	if (ZBX_DB_OK == rc)
	{
		/* remove current savepoint and savepoints that were created after */
		while (idx < ibm_db2_savepoints.values_num)
			zbx_vector_str_remove(&ibm_db2_savepoints, idx++);
	}

s.paskevics RESOLVED in r65731.

vso CLOSED

Comment by Vladislavs Sokurenko [ 2017 Feb 14 ]

(3) Memory leak during each savepoint when rollback performed

This is because zbx_ibm_db2_remove_savepoints is only called for commit but not for rollback.

In case of rollback pointers are simply removed from vector.
Actually we don't need to allocate strings when using static string literals so we could avoid malloc/free here.

==25383== 2,000 bytes in 100 blocks are definitely lost in loss record 403 of 409
==25383==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25383==    by 0x7174839: strdup (strdup.c:42)
==25383==    by 0x4645BE: zbx_strdup2 (misc.c:518)
==25383==    by 0x4A6607: zbx_ibm_db2_create_savepoint (db.c:768)
==25383==    by 0x4A66F1: zbx_db_begin (db.c:840)
==25383==    by 0x485B5B: DBtxn_operation (db.c:128)
==25383==    by 0x485BFE: DBbegin (db.c:159)
==25383==    by 0x4223C4: test_function (trapper.c:680)
==25383==    by 0x422529: trapper_thread (trapper.c:713)
==25383==    by 0x46110F: zbx_thread_start (threads.c:128)
==25383==    by 0x415309: MAIN_ZABBIX_ENTRY (server.c:957)
==25383==    by 0x45FE4E: daemon_start (daemon.c:392)

s.paskevics RESOLVED in r65731.

vso CLOSED

Comment by Vladislavs Sokurenko [ 2017 Feb 16 ]

(4) Are you sure that we need to RETAIN CURSORS ? Do we use those ?

s.paskevics The ON ROLLBACK RETAIN CURSORS is a mandatory clause that describes cursor behavior within the savepoint.

vso OK, CLOSED

Comment by Vladislavs Sokurenko [ 2017 Feb 17 ]

(5) We used to rollback by using SQLEndTran function and no savepoints where made.

It would do following:

The ROLLBACK statement backs out, or cancels, the database changes that are made by the current transaction and restores changed data to the state before the transaction began.

Source https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.apdv.gs.doc/doc/c0005018.html

The code for it was:

-	if (SUCCEED != zbx_ibm_db2_success(SQLEndTran(SQL_HANDLE_DBC, ibm_db2.hdbc, SQL_ROLLBACK)))
-		rc = ZBX_DB_DOWN;
+
+	/* Rollback to begin that is marked with savepoint. This move undo all transactions. */
+	if (0 <= (rc = zbx_db_execute("rollback to savepoint zbx_begin_savepoint;")))
+		rc = ZBX_DB_OK;
+

However there were problems with cursors.
Maybe it can be fixed by adding additional flags such as this one:

If the SQL_CURSOR_ROLLBACK_BEHAVIOR or SQL_CURSOR_COMMIT_BEHAVIOR value equals SQL_CB_PRESERVE, SQLEndTran() does not affect open cursors that are associated with the connection. Cursors remain at the row that they pointed to before the call to SQLEndTran().

Source: https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.apdv.gs.doc/doc/c0005018.html

I think that both solutions should be tried and weighted and then decision made on which one to keep.
I Personally would be more in favor of second solution if it's possible, since it is more consistent with the rest of code
For example we do

SQLEndTran(SQL_HANDLE_DBC, ibm_db2.hdbc, SQL_COMMIT)

Also I am not sure if creating savepoints will affect performance, this should also be evaluated.

I have this strange note in documentation, so it might not be possible to do it that way for some reason.

Note: Cursors are always closed when transactions are rolled back.

https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.swg.im.dbclient.config.doc/doc/r0054630.html

vso CLOSED.

Comment by Vladislavs Sokurenko [ 2017 Feb 17 ]

(6) I have tried old and new version with 1000 transactions, unfortunately new version takes 8 seconds while old takes 5 seconds
Have you done any performance measurement to confirm this finding ?

code was

	for (i = 0; i < 1000; i++)
	{
		DBbegin();
		DBexecute(...);
		DBcommit();
	}

Result:

Before:
25711:20170217:170241.952 It took me 389791 clicks (0.389791 seconds).

25711:20170217:170241.952 sec diff 5 microsec diff 5742088

After
26093:20170217:170406.723 It took me 431583 clicks (0.431583 seconds).

26093:20170217:170406.723 sec diff 8 microsec diff 8183533

s.paskevics Do you mean microseconds or seconds?

vso both, it was 5742088 microseconds, now it's 8183533 according to my test,can you reproduce?

s.paskevics My test results:

Case without savepoint with savepoint
1 3.178 4.274
2 3.246 3.486
3 2.260 3.374
4 2.055 6.809

vso Maybe this will help

        for (i = 0; i < 1000; i++)
        {
                DBbegin();
                zbx_snprintf(buffer, sizeof(buffer), "%d", i);
                DBexecute("insert into maintenances (description,maintenanceid,name) VALUES ('%s', " ZBX_FS_UI64 ",'%s')", "hello", DBget_maxid("maintenances") + i, buffer);
                DBcommit();
        } 

s.paskevics WON'T FIX. 3 microsec is not critical for one begin block.

vso CLOSED

Comment by Vladislavs Sokurenko [ 2017 Feb 20 ]

Successfully tested

Comment by Viktors Tjarve [ 2017 Feb 24 ]

(7) Unrelated warning errors:

db.c: In function ‘zbx_db_fetch’:
db.c:1951:39: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     if (PQftype(result->pg_result, i) == ZBX_PG_BYTEAOID) /* binary data type BYTEAOID */
                                       ^
housekeeper.c: In function ‘hk_history_delete_queue_append’:
housekeeper.c:220:42: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
  if ((zbx_uint64_t)history * SEC_PER_DAY > now)
                                          ^
ar: `u' modifier ignored since `D' is the default (see `U')
	
httptest.c: In function ‘process_httptest’:
httptest.c:349:17: warning: unused variable ‘stat’ [-Wunused-variable]
  zbx_httpstat_t stat;
                 ^
httptest.c:344:10: warning: unused variable ‘row’ [-Wunused-variable]
  DB_ROW  row;
          ^
httptest.c: At top level:
httptest.c:203:13: warning: ‘process_step_data’ defined but not used [-Wunused-function]
 static void process_step_data(zbx_uint64_t httpstepid, zbx_httpstat_t *stat, zbx_timespec_t *ts)
             ^

viktors.tjarve RESOLVED in r65959 and r65986.

s.paskevics CLOSED

Comment by Sergejs Paskevics [ 2017 Feb 28 ]

Fixed in:

  • pre-3.0.9rc1 r66006,
  • pre-3.2.5rc1 r66007,
  • pre-3.3.0 (trunk) r66008.
Comment by Oleksii Zagorskyi [ 2017 May 16 ]

Looks like not everything was resolved. ZBX-12186 has been created.

Generated at Thu Mar 28 20:38:41 EET 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.