[ZBXNEXT-1006] in PostgreSQL use SERIAL instead of IDs Created: 2011 Oct 20  Updated: 2021 Oct 15  Resolved: 2021 Oct 15

Status: Closed
Project: ZABBIX FEATURE REQUESTS
Component/s: Server (S)
Affects Version/s: None
Fix Version/s: None

Type: Change Request Priority: Minor
Reporter: Stefan Assignee: Unassigned
Resolution: Declined Votes: 9
Labels: database, performance, postgresql
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by ZBXNEXT-1717 must talk with SQL-devops.. Closed

 Description   

Hello,

why don't you use SERIAL instead of an "ids" table, i.e. instead of calculating the next ID yourself?
example:
http://www.zabbix.com/forum/showpost.php?p=86607&postcount=18

http://neilconway.org/docs/sequences
http://www.postgresql.org/docs/current/static/functions-info.html
http://www.postgresql.org/docs/current/static/datatype-numeric.html#DATATYPE-SERIAL
http://www.varlena.com/GeneralBits/72.php



 Comments   
Comment by Aleksandrs Saveljevs [ 2011 Oct 21 ]

Zabbix used autoincremental fields prior to the introduction of distributed monitoring. With distributed monitoring and the structure of its IDs, it is no longer possible (see question #8 at http://www.zabbix.com/documentation/2.0/manual/appendix/faq).

Comment by Stefan [ 2011 Oct 21 ]

yes, but you can use SERIAL for the DDDDDDDDDDD part.. and why don't you use several fields for each number?!

Comment by Stefan [ 2011 Oct 24 ]

reopen for answer

Comment by Aleksandrs Saveljevs [ 2011 Oct 24 ]

Discussing the existing structure, suppose we have a database record with ID "N2N2DDDDDDDDDDD" on node N2. How can we insert a record with that ID into a table having an autoincremental ID field into the database on node N1?

Now, your proposal. Suppose we use several fields for every ID: a triple (destination node number, source node number, ID). Suppose also that there is a record (N1, N1, 12345) on node N1. When transferring a record with ID (N2, N2, 12345) from node N2 to node N1, how would it be possible to use autoincremental property of table IDs and wouldn't "12345" from both nodes conflict with each other?
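For concreteness, the composed-ID scheme under discussion can be sketched as follows. The digit widths (3-digit node prefixes, 11-digit local counter) are an assumption based on the NNNSSSDDDDDDDDDDD description in the FAQ, and the helper names are hypothetical; the sketch shows why a plain autoincrement column on node N1 could never produce node N2's values.

```python
# Sketch of the composed-ID scheme (digit widths assumed from the
# NNNSSSDDDDDDDDDDD description; helper names are hypothetical).

LOCAL_SPACE = 10 ** 11   # DDDDDDDDDDD: 11-digit per-node counter
SRC_SPACE = 10 ** 14     # SSS sits in the 3 digits above the counter

def compose(dst_node, src_node, local_id):
    """Pack (destination node, source node, local id) into one integer."""
    return dst_node * SRC_SPACE + src_node * LOCAL_SPACE + local_id

def decompose(record_id):
    """Split a composed ID back into its three components."""
    dst_node, rest = divmod(record_id, SRC_SPACE)
    src_node, local_id = divmod(rest, LOCAL_SPACE)
    return dst_node, src_node, local_id

# A record with local id 12345 created on node N2, destined for N2:
rid = compose(2, 2, 12345)
print(rid)             # 200200000012345
print(decompose(rid))  # (2, 2, 12345)
```

An autoincrement column on N1 would hand out 1, 2, 3, ... which can never equal 200200000012345, so replicated records cannot reuse the column's generator.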

Comment by Markus Wanner [ 2011 Oct 24 ]

Hello Aleksandrs, I'm a Postgres developer and discussed this with Stefan on the German Postgres channel a bit already.

I don't quite understand your first question. At least in Postgres, it's easily possible to write custom data into an AUTOINC column (it's just a DEFAULT pointing to a sequence, there). I'd expect MySQL to work the same.

One common way to solve the unique identifier problem for multiple databases is having a pair (origin node id, SERIAL), where each node increments its own serial. The UNIQUE constraint is then on that pair. Assuming your node ids don't collide, that usually results in unique identifiers.
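The pair scheme described here can be demonstrated with a few lines of SQL. The sketch below uses SQLite (bundled with Python) in place of PostgreSQL, and the table and column names are made up for illustration; in PostgreSQL the local_id column would be a per-node SERIAL.

```python
import sqlite3

# Sketch of the (origin node id, per-node serial) pair: uniqueness
# is enforced on the pair, not on either column alone.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        node_id  INTEGER NOT NULL,   -- origin node
        local_id INTEGER NOT NULL,   -- per-node serial
        payload  TEXT,
        PRIMARY KEY (node_id, local_id)
    )
""")

# The same local id may exist on different nodes without conflict:
conn.execute("INSERT INTO items VALUES (1, 12345, 'from node N1')")
conn.execute("INSERT INTO items VALUES (2, 12345, 'from node N2')")

# ...but a duplicate (node, local) pair is rejected:
try:
    conn.execute("INSERT INTO items VALUES (1, 12345, 'dup')")
except sqlite3.IntegrityError as err:
    print("rejected:", err)
```

So Aleksandrs's "12345 from both nodes" case does not conflict, at the price of a two-column key everywhere the id is referenced.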

Of course, that doesn't guarantee monotonically incrementing numbers. Nor gap-less increments. I don't know your specific needs, though.

Comment by Aleksandrs Saveljevs [ 2011 Oct 25 ]

There are two ID structures we are discussing in this issue. The first structure is the existing one (described at http://www.zabbix.com/documentation/2.0/manual/appendix/faq), the second one is splitting the ID into several columns.

For the first structure, we need to generate IDs in the form NNNSSSDDDDDDDDDDD, which I am not sure can be done automatically with autoincremental fields. Another problem is that, as an optimization, we use multirow inserts and then reference the inserted IDs. We know how to get the inserted ID for a single insert statement, but how do we get the inserted IDs when multiple rows were inserted? That is why we use the "ids" table to manage the generated IDs ourselves.
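The "ids" table technique mentioned here can be sketched as follows. This is a simplified illustration with a hypothetical schema (the real Zabbix table is keyed by table and field name, and the counter would feed into the composed NNNSSS... ids); the point is that a whole block of ids is reserved atomically before a multirow insert, so every inserted row's id is known up front.

```python
import sqlite3

# Simplified sketch of the "ids" table technique: reserve a contiguous
# block of ids in one transaction, then hand them out locally.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ids (table_name TEXT PRIMARY KEY, nextid INTEGER)")
conn.execute("INSERT INTO ids VALUES ('history', 1)")

def reserve_ids(conn, table, count):
    """Atomically reserve `count` consecutive ids for `table`."""
    with conn:  # one transaction: read the counter, then advance it
        (start,) = conn.execute(
            "SELECT nextid FROM ids WHERE table_name = ?", (table,)
        ).fetchone()
        conn.execute(
            "UPDATE ids SET nextid = nextid + ? WHERE table_name = ?",
            (count, table),
        )
    return range(start, start + count)

# All ids for a multirow insert are known before the INSERT runs,
# so the new rows can be referenced immediately afterwards.
batch = reserve_ids(conn, "history", 3)
print(list(batch))  # [1, 2, 3]
```

The cost, as noted later in this thread, is that the counter row itself becomes a point of contention under concurrent writers.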

For the second structure, that would mean splitting a single-column ID into three columns, which also includes foreign keys, which is not very convenient. Also, the problem with multirow inserts from the first structure remains.

In any case, the structure of the database is not going to change for 1.8 and is pretty much fixed for 2.0 as well. We might return to autoincremental fields when we redesign distributed monitoring, but it is probably not going to happen anytime soon.

Comment by richlv [ 2011 Oct 25 ]

please also note that the solution would have to work on 5 different databases (as currently supported by zabbix - mysql, pgsql, oracle, sqlite, db2)

Comment by Markus Wanner [ 2011 Oct 25 ]

I agree that only the DDDDDDDDDDD part can usually be filled by a SERIAL or AUTOINC column. However, I see no reason for stuffing all of that information into a single column. I personally consider the separation good practice, whereas the combined variant seems inconvenient to me.

I'm not sure what the issue with multirow inserts is. Postgres offers INSERT .. RETURNING, which can perfectly well return multiple rows' ids (possibly consisting of multiple columns). At least Oracle, DB2 and SQL Server offer that or a similar feature as well. SQLite is quite a different kind of database anyway, which I think should be coded for differently (it runs in the same process as the client, doesn't support concurrent writers, etc.: it's "embedded").
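The INSERT .. RETURNING behaviour Markus refers to can be shown directly. The sketch below uses SQLite, which has supported RETURNING since version 3.35 (PostgreSQL has had it since 8.2); the table is hypothetical. One statement inserts several rows and returns every generated id, with no separate id bookkeeping.

```python
import sqlite3

# INSERT .. RETURNING on a multirow insert: all generated ids come
# back from the single statement (requires SQLite 3.35+).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (hostid INTEGER PRIMARY KEY, name TEXT)")

rows = conn.execute(
    "INSERT INTO hosts (name) VALUES (?), (?), (?) RETURNING hostid",
    ("web1", "web2", "db1"),
).fetchall()
ids = [r[0] for r in rows]
print(ids)  # one autogenerated id per inserted row
```

In PostgreSQL the equivalent is `INSERT INTO hosts (name) VALUES (...), (...) RETURNING hostid`, which addresses the multirow-insert objection raised above, though only on databases that support the feature.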

I understand that you don't want to change the database structure. This usually is a huge task. And I understand the wish to support multiple databases. However, I'd argue that the current scheme is sub-optimal for enterprise class databases like Postgres or Oracle.

Comment by Stefan [ 2011 Nov 03 ]

@richlv: ok, but why must I compile Zabbix for a specific database (--with-pgsql) when I don't use special database features?!

Comment by richlv [ 2011 Nov 03 ]

to use the appropriate db library ?

Comment by Stefan [ 2012 Nov 09 ]

postgresql has serial
mysql has auto_increment
oracle has sequence

"I understand that you don't want to change the database structure. This usually is a huge task. And I understand the wish to support multiple databases. However, I'd argue that the current scheme is sub-optimal for enterprise class databases like Postgres or Oracle."
/sign

maybe talk more in the postgresql/mysql IRC channels about better performance solutions..

Comment by Randall Svancara [ 2013 Mar 01 ]

The way that Zabbix performs multi-row or grouped inserts/updates in one large transaction is a performance killer in terms of row locking. This approach simply does not scale. First of all, I think there needs to be some supervisor process that can intelligently broker grouped updates so that no two transactions contain an update for the same record. This would reduce deadlocks. Deadlocks are bad because the database must detect them, and there is usually a long delay before they are discovered.

Additionally, the "ids" table, responsible for storing the next ID, is another point of contention in the database. I am seeing error messages in the logs like "Something impossible just happened": again, a deadlock situation where a process attempting to grab the "next id" for some table fails.

Why not move to a random string with a very large key space? Zabbix is rarely CPU bound, so generating large keys should not be a problem, and disk space is cheap, so storing them should not be a problem either. This also opens the way for other possibilities, like a sharded database, which would provide a greater ability to scale, especially in the cloud, where running six server instances with 2 GB of RAM each is cheaper than running one instance with 96 GB of RAM.
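Randall's random-key idea can be sketched in a few lines. This is purely illustrative (not anything Zabbix implements): drawing ids from a 128-bit space means no shared counter row to contend on, and collisions are astronomically unlikely, which is the same reasoning behind UUIDv4.

```python
import secrets

# Sketch of the random-key proposal: ids come from a 128-bit space
# instead of a central counter, so any shard can generate keys
# independently with no shared "ids" row to lock.
def random_key() -> str:
    return secrets.token_hex(16)  # 32 hex chars = 128 random bits

keys = {random_key() for _ in range(100_000)}
print(len(keys))  # duplicates in 128 bits are astronomically unlikely
```

The trade-off, relative to the current scheme, is losing monotonically increasing ids (which sequential scans and range queries benefit from) in exchange for contention-free generation.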

Comment by Marc [ 2015 Sep 04 ]

Since ZBXNEXT-1343, I consider it valid to pick up the discussion on this again.

Edit:
For better differentiation from the original description of this ticket, a dedicated ticket ZBXNEXT-2941 has been created to address the phase-out in general.

Comment by Alexei Vladishev [ 2021 Oct 15 ]

No need to keep this ticket open for so long, I am closing it. We won't use serial data type.

Generated at Fri May 09 06:50:12 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.