[ZBXNEXT-2363] DB Schema for MariaDB with TokuDB Created: 2014 Jun 30 Updated: 2017 Nov 15 |
|
Status: | Open |
Project: | ZABBIX FEATURE REQUESTS |
Component/s: | Proxy (P), Server (S) |
Affects Version/s: | 2.2.3, 2.3.1 |
Fix Version/s: | None |
Type: | New Feature Request | Priority: | Major |
Reporter: | Vadim Nesterov | Assignee: | Unassigned |
Resolution: | Unresolved | Votes: | 6 |
Labels: | db, tokudb | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
CentOS 6.5, MariaDB 10.0.12 with latest TokuDB engine |
Issue Links: |
|
Description |
We have converted zabbix db from 2.2 to MariaDB with TokuDB. Benefits of TokuDB:
We need special db schema for TokuDB with sevral rules: On history tables, there is must only one primary index, but it must be CLUSTERING and COVERING fields item_id, clock Trend tables must indexed the same way to get more perfomance on select queries some options must be set: my.cnf: [mysqld] query cache must be enabled, |
Comments |
Comment by richlv [ 2014 Jun 30 ] | ||||||||||||||||||
hmm, what's the feature request aspect here ? | ||||||||||||||||||
Comment by Vadim Nesterov [ 2014 Jun 30 ] | ||||||||||||||||||
there is a db schema called mysql.sql in zabbix sources, so need to create new one tokudb.sql with different indexes than mysql, also this schema must be supported by zabbix server, when it auto updates with new versions. TokuDB is another DB engine like Mysql, Oracle, Postgres, so I ask to support it. | ||||||||||||||||||
Comment by richlv [ 2014 Jun 30 ] | ||||||||||||||||||
according to our experience, tokudb doesn't perform that great for zabbix workload, thus supporting it might not be that useful... leaving this open for now, though | ||||||||||||||||||
Comment by Vadim Nesterov [ 2014 Jun 30 ] | ||||||||||||||||||
we have done all possible optimization to InnoDB and haven't got performance like TokuDB, | ||||||||||||||||||
Comment by Andrejs Čirkovs (Inactive) [ 2014 Jul 01 ] | ||||||||||||||||||
I highly doubt query cache as it is done in Mysql / Mariadb is usable with zabbix (except there's someone pressing Ctrl + F5 constantly generating the same query). Vadim, can you please attach schema generated? I'm a bit curious about the solutions like "and this index includes all fields of each row". | ||||||||||||||||||
Comment by Tim Callaghan [ 2014 Jul 01 ] | ||||||||||||||||||
(full disclosure, I work at Tokutek, creators of TokuDB) @richlv, Can you share your past TokuDB experiences further? We continue to make improvements to the performance of TokuDB and I'd be curious to understand workloads that aren't performant. @everyone, We'd be happy to see broader usage of TokuDB with products like Zabbix, and I'm happy to assist in this effort. I assume a large part of the Zabbix workload is time based insertion on tables with secondary indexes, so our ability to ingest at high rates of speed along with our built-in compression should be a great fit. | ||||||||||||||||||
Comment by richlv [ 2014 Jul 01 ] | ||||||||||||||||||
tim, there was some communication between tokutek/zabbix may 2012 - october 2013, maybe that allows to dig up more records. | ||||||||||||||||||
Comment by Tim Callaghan [ 2014 Jul 01 ] | ||||||||||||||||||
Yes, I remember the early portion of that timeline well. I'm confident that the current version of TokuDB will perform much better than back then, as is evidenced by the creator of this ticket. Please let me know if there is any way we can help. Does Zabbix by chance have any benchmarking code to test their workload's performance? That would be a great place to start. | ||||||||||||||||||
Comment by Vadim Nesterov [ 2014 Jul 01 ] | ||||||||||||||||||
Lets begin deeply into TokuDB Here is: CREATE TABLE `history` ( – DROP TABLE IF EXISTS `history_log`; – DROP TABLE IF EXISTS `history_str`; – DROP TABLE IF EXISTS `history_str_sync`; – DROP TABLE IF EXISTS `history_sync`; – DROP TABLE IF EXISTS `history_text`; – DROP TABLE IF EXISTS `history_uint`; Such indexes KEY `clustering_idx_clock_itemid` (`clock`,`itemid`) `clustering`=yes are wrong from my side, but we work now with it ok. Cause deleting indexes in TokuDB locks tables, I kept them, beacause it almost recreation of table. As you see at history table, I deleted primary index on item_id, @Andrejs Čirkovs, let look at db files:
Look at size of clustering indexes, they are almost the same as main table. Here is the stat of Qcache: show status like '%Qca%';
------------------------
------------------------ Just try it and you will see that front end will respond much faster, don't forget about pushdown -> off | ||||||||||||||||||
Comment by richlv [ 2014 Jul 02 ] | ||||||||||||||||||
tim, unfortunately we don't have any public unified performance tests available at this time | ||||||||||||||||||
Comment by Tim Callaghan [ 2014 Jul 02 ] | ||||||||||||||||||
@vadim, I'll try to respond to your proposed schema and questions. Clustering secondary index size Clustering index advantages Compression USING BTREE syntax Replacing Primary Keys | ||||||||||||||||||
Comment by Vadim Nesterov [ 2014 Jul 02 ] | ||||||||||||||||||
@Tim, Compression: we have 24 cores on server and we the load is very small, load average is about 2, we have 16GB RAM, and have 4GB free, tokudb is set to use 9GB. We woud like to have samller db. KEY `clustering_idx_clock_itemid` (`clock`,`itemid`) USING BTREE - it is unchanged table from original zabbix schema. Replacing Primary Keys - the main topic. So if clustering is table, so I don't understand why to have secondary keys on history data? So I think it is enough to have one PK clustering (item_id,clock) and don't have secondary indexes. Am I right? Tim, another question: is it any value to use desc order while creating index (item_id,clock desc), cause I read that InnoDB don't look at it? What about TokuDB? | ||||||||||||||||||
Comment by Tim Callaghan [ 2014 Jul 02 ] | ||||||||||||||||||
@vadim.
Good question. One potential reason to cluster a secondary index rather than just making what would be the clustering secondary index the primary key is insert performance. If you have a PK on an auto-incrementing column then you are always inserting on the right side of the index, which means inserts are very fast. The same is true if you don't declare a primary key, as the hidden PK is auto-increment. The reason these are fast is that a primary key must be unique, and the unique check can be expensive, as in require an IO, if the insert pattern for the primary key is random. The clustered secondary index doesn't need a unique check so TokuDB's Fractal Tree indexing technology is fast forever, as opposed to InnoDB's which will slow down significantly when your secondary indexes don't fit in the cache. Does that help? To continue this topic I'd need to better understand the particular table in question (lets take this 1 table at a time), and understand the insertion pattern as well as update/delete/select statements.
There isn't value in declaring it desc, an index can efficiently be scanned in either direction. | ||||||||||||||||||
Comment by Vadim Nesterov [ 2014 Jul 03 ] | ||||||||||||||||||
So here is from zabbix 2.2.4 original schema: CREATE TABLE `history` ( Must be index history_1 be clustering? We talk only about for TokuDB. | ||||||||||||||||||
Comment by Tim Callaghan [ 2014 Jul 03 ] | ||||||||||||||||||
@vadim.
In TokuDB you can optionally make secondary indexes clustering. A clustered secondary index is much like the primary key in that a complete copy of the row is stored in the index, which means the following:
Now on to the good stuff! Since the clustered secondary index has a full copy of the row, any select that uses the index has all the columns in the row available to it without having to do a look-up into the primary key index to get the other columns. Both InnoDB and TokuDB have this "look the rest of the row up in the primary key" behavior (which TokuDB avoids if the secondary index is clustered). You can think of clustered secondary indexes as a better version of a covered index. So, I think for the above table you'd be best off leaving the create table alone (keep the hidden PK), but making the index clustering, as in: CREATE TABLE `history` ( `itemid` bigint unsigned NOT NULL, `clock` integer DEFAULT '0' NOT NULL, `value` double(16,4) DEFAULT '0.0000' NOT NULL, `ns` integer DEFAULT '0' NOT NULL, KEY `history_1` (`itemid`,`clock`) CLUSTERING=yes ) ENGINE=TokuDB `COMPRESSION`=TOKUDB_LZMA; |