[ZBXNEXT-3706] mysql utf8 vs utf8mb4 Created: 2015 May 07  Updated: 2024 Apr 10  Resolved: 2022 Jan 25

Status: Closed
Project: ZABBIX FEATURE REQUESTS
Component/s: Installation (I)
Affects Version/s: None
Fix Version/s: 6.0.0alpha7, 6.0 (plan)

Type: Change Request Priority: Trivial
Reporter: richlv Assignee: Jurijs Klopovskis
Resolution: Fixed Votes: 7
Labels: encoding, mysql, utf8
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File screenshot-1.png     File utf8mb4_convert.sql    
Issue Links:
Causes
causes ZBXNEXT-7048 Rework how `config` table works in my... Reopened
causes ZBX-20450 Issue with cyrillic names of items an... Closed
causes ZBX-20425 Update SQL conversion script to use u... Closed
causes ZBX-21385 utf8mb3 error message in zabbix serve... Closed
Duplicate
duplicates ZBX-16935 utf8mb4 supported? Closed
is duplicated by ZBXNEXT-3238 Emoji support Closed
is duplicated by ZBX-15948 mysql database upgrade fails, 4.0 - 4... Closed
is duplicated by ZBX-20905 Zabbix server upgrade from 4.0 to 6.0... Closed
is duplicated by ZBX-13488 Four byte UTF-8 symbol causes "query ... Closed
Team: Team B
Sprint: Sprint 81 (Oct 2021), Sprint 82 (Nov 2021), Sprint 83 (Dec 2021), Sprint 84 (Jan 2022)
Story Points: 2

 Description   

we might want to change from utf8 to utf8mb4 for mysql installation instructions and frontend - see https://mathiasbynens.be/notes/mysql-utf8mb4

include/db.inc.php:                                   DBexecute('SET NAMES utf8');


 Comments   
Comment by richlv [ 2015 May 11 ]

(1) potential issue :

Performance of 4-byte UTF-8 (utf8mb4) is slower than for 3-byte UTF-8 (utf8). If you do not want to incur this penalty, continue to use utf8.

from https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-upgrading.html

Comment by richlv [ 2015 May 11 ]

(2) attempting to load zabbix 2.4 schema in an utf8mb4 database fails :

ERROR 1071 (42000) at line 116: Specified key was too long; max key length is 767 bytes

line 116 is :

CREATE UNIQUE INDEX `slideshows_1` ON `slideshows` (`name`);

from https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-upgrading.html :

InnoDB has a maximum index length of 767 bytes for tables that use a COMPACT or REDUNDANT row format, so for utf8 or utf8mb4 columns, you can index a maximum of 255 or 191 characters, respectively.

slideshow name column is 255, which is more than 191. zabbix db schema has more indexes like that, so this issue would be significantly more work

Comment by Alexander Vladishev [ 2017 Feb 16 ]

I move this issue into ZBXNEXT project. Currently Zabbix does not support utf8mb4 character set.

Comment by Alexei Vladishev [ 2021 Oct 15 ]

I believe we already support utf8mb4 in Zabbix 5.x, closing this ticket.

Comment by Alexei Vladishev [ 2021 Oct 18 ]

Actually I was wrong, utf8mb4 is not supported yet, we will implement it in 6.0. Re-opening.

Comment by Jurijs Klopovskis [ 2021 Nov 10 ]

Available in 40c2d49279e (5.5)

Comment by Alexei Vladishev [ 2021 Nov 12 ]

Just a quick update. This is coming in 6.0.0alpha7, already implemented.

Comment by Marco Hofmann [ 2021 Nov 21 ]

Hi again =) In ZBX-17357 we talked a lot about the great db charset/collation conversion script (https://www.zabbix.com/documentation/current/manual/appendix/install/db_charset_coll)).
With the help of this script, we switched to utf8 & utf8_bin for the Zabbix 5.x era.

It seems that with Zabbix 6.0+ and MariaDB 10.6+, this change won't be enough to be future proof. I feel like there will be another DB conversion necessary in the near future.

If I compare the setup instruction from Zabbix 5.4 vs 6.0, we can clearly see a change in the charset/collation command.
(https://www.zabbix.com/documentation/5.4/manual/appendix/install/db_scripts VS. https://www.zabbix.com/documentation/6.0/manual/appendix/install/db_scripts)

mysql> create database zabbix character set utf8 collate utf8_bin;
VS.
mysql> create database zabbix character set utf8mb4 collate utf8mb4_bin;

Add this to the notes gerry posted here ZBXNEXT-6820, that utf8 (aka utf8mb3) will be deprecated and we all might have a lot of work to do.

So my questions are:

  1. Will Zabbix provide a second conversion script, to continue the work you have already started? You helped us to switch from the old 2.x standards, to the new 5.x standards. Now seems to be the time to provide a script to convert from 5.x to 6.x DB standards, and to be future proof for upcoming MySQL and MariaDB changes. Because otherwise, Zabbix 6.4 (or any other future version) might switch it's recommended MariaDB version to 10.7 or 10.8 (or any other future version) and utf8 (aka utf8mb3) would stop working suddenly?
  2. Or can we just use the very same script, and change utf8 to utf8mb4 & utf8_bin to utf8mb4_bin and let it run once more?

Thank you very much for feedback. I have the strong feeling, this will become a big topic, as soon as Zabbix and MySQL/MariaDB remove the deprecated utf8 support.

(Maybe this topic needs a separate ZBX issue?)

Comment by Alexei Vladishev [ 2021 Nov 22 ]

starko , you may find more information including conversion script here: https://www.zabbix.com/documentation/6.0/manual/appendix/install/db_charset_coll

EDIT: Ignore my comment, I just re-read your comment and discovered that you are already aware of this.

Comment by Sergey Simonenko (Inactive) [ 2022 Jan 17 ]

Hello, starko!

We have recently updated our charset reparation documentation for 6.0: https://www.zabbix.com/documentation/6.0/en/manual/appendix/install/db_charset_coll.

The script basically is still the same, only updated from old pseudo-utf8 to utf8mb4. We also mentioned that 'utf8mb3' is about to become deprecated at some point in future soon and encourage users to use 'utf8mb4'.

We also observed different reporting of MySQL client when running this script on MySQL and MariaDB (in case of MySQL as of recent versions you might see '0 rows affected' but it does its job).

Kind regards,
Sergey

Comment by Marco Hofmann [ 2022 Jan 17 ]

Thank you very much for honoring my feedback ssimonenko !
I read the article, but the link to the script is broken -> 404. Can you check that, please?

Comment by Sergey Simonenko (Inactive) [ 2022 Jan 17 ]

Apologies, thanks for pointing – we'll fix it very soon. At this time you can download it from attachment: utf8mb4_convert.sql .

Comment by Sergey Simonenko (Inactive) [ 2022 Jan 25 ]

starko, we fixed the script URL, again, thanks!

Comment by Marco Hofmann [ 2022 Jan 26 ]

Thanks a lot! I will do the conversion in a few months!

Comment by Marco Hofmann [ 2022 Jul 12 ]

ssimonenko I wanted to let you know, that we did the switch from 5.4.12 with MariaDB 10.5 on Debian 11 to 6.0.6 with MariaDB 10.6 on Debian 11 last Friday.
After the upgrade, we switched the charset from utf8 (aka utf8mb3) to utf8mb4 and it worked very well and was very fast. Took just a few minutes. Thank you very much!

Generated at Fri Apr 26 20:39:59 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.