[ZBXNEXT-6420] encoding html contents in web monitoring Created: 2020 Oct 21 Updated: 2023 Dec 28 |
|
| Status: | Confirmed |
| Project: | ZABBIX FEATURE REQUESTS |
| Component/s: | Agent (G) |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature Request | Priority: | Minor |
| Reporter: | Alexey | Assignee: | Zabbix Development Team |
| Resolution: | Unresolved | Votes: | 4 |
| Labels: | encoding, web-scenario, web.page.get | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
ubuntu 18.04 |
||
| Description |
|
Steps to reproduce:
I created an item "web.page.get["site.ru",,]", and the returned HTML contains question marks instead of Cyrillic characters.
Result: It does not work if the web page uses windows-1251. If the web page uses UTF-8, everything is fine.
Expected: How can I change the encoding so that searching for a string works correctly?
|
| Comments |
| Comment by Alexey [ 2020 Oct 26 ] |
|
help me, friends! |
| Comment by Aigars Kadikis [ 2020 Nov 02 ] |
|
Do you have the correct character set configured at the database level? Please check with:
show create database zabbix\G
show create table history_text\G
The database should have: DEFAULT CHARACTER SET utf8 COLLATE utf8_bin
The table should have: DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
| Comment by Alexey [ 2020 Nov 02 ] |
|
Yes:
mysql> show create database zabbix\G
*************************** 1. row ***************************
Database: zabbix
Create Database: CREATE DATABASE `zabbix` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_bin */
1 row in set (0.00 sec)
and:
mysql> show create table history_text\G
*************************** 1. row ***************************
Table: history_text
Create Table: CREATE TABLE `history_text` (
`itemid` bigint(20) unsigned NOT NULL,
`clock` int(11) NOT NULL DEFAULT '0',
`value` text COLLATE utf8_bin NOT NULL,
`ns` int(11) NOT NULL DEFAULT '0',
KEY `history_text_1` (`itemid`,`clock`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin
1 row in set (0.00 sec)
The problem occurs only with pages encoded in windows-1251. The output of the command "curl https://site.ru" also contains question marks; only piping it through "iconv" in the shell solves the problem. |
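The iconv conversion mentioned above can be sketched in Python, assuming the page body really is encoded in windows-1251 (Python codec name `cp1251`). The function name `cp1251_to_utf8` and the sample string are illustrative only and are not part of Zabbix.

```python
def cp1251_to_utf8(raw: bytes) -> bytes:
    """Re-encode windows-1251 bytes as UTF-8.

    Roughly what `iconv -f WINDOWS-1251 -t UTF-8` does in the shell.
    Raises UnicodeDecodeError if a byte has no mapping in cp1251.
    """
    return raw.decode("cp1251").encode("utf-8")


if __name__ == "__main__":
    # Hypothetical sample: the word "Привет" as a windows-1251 server would send it.
    sample = "Привет".encode("cp1251")
    converted = cp1251_to_utf8(sample)
    print(converted.decode("utf-8"))
```

Running a conversion like this outside Zabbix (for example in a wrapper script) is one way to hand it UTF-8 data; whether that fits a given setup is a separate question.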
| Comment by Aigars Kadikis [ 2020 Nov 05 ] |
|
Indeed, there is no way to store the HTML of https://pikabu.ru/ in the Zabbix database. The item goes into the unsupported state with the message: Server returned invalid UTF-8 sequence |
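To illustrate why a windows-1251 page can trigger an "invalid UTF-8 sequence" error, here is a minimal Python sketch. It only shows that Cyrillic text encoded in windows-1251 is generally not a valid UTF-8 byte sequence; it is not the check Zabbix performs internally, and the helper name `is_valid_utf8` is an assumption made for this example.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    text = "Привет"  # hypothetical sample text
    print(is_valid_utf8(text.encode("utf-8")))   # same text, sent as UTF-8
    print(is_valid_utf8(text.encode("cp1251")))  # same text, sent as windows-1251
```

In windows-1251 each Cyrillic letter is a single byte in the range 0xC0-0xFF, which UTF-8 reads as the start of a multi-byte sequence that is never completed, so strict decoding fails.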
| Comment by Alexey [ 2020 Dec 06 ] |
|
|
| Comment by Alexander Vladishev [ 2021 Jan 04 ] |
|
Currently Zabbix does not support character encodings other than UTF-8 for the item web.page.get[]. A workaround for this problem is to do the conversion with JS in the preprocessing steps. We cannot consider this a bug. Therefore, I am moving this ticket to the ZBXNEXT project. |
| Comment by Konstantīns Ošmjans [ 2021 Jun 02 ] |
|
sasha wrote:
It seems that the error occurs before the preprocessing steps have a chance to handle this problem? |
| Comment by Konstantīns Ošmjans [ 2023 Oct 25 ] |
|
Just a reminder: this problem still exists (at least in the current version 6.0.x). The workaround ("to do the conversion with JS in preprocessing steps") is not possible, because the error (and the item's transition to the unsupported state) occurs before preprocessing, so there is nothing left to do in preprocessing. |
| Comment by Konstantīns Ošmjans [ 2023 Dec 28 ] |
|
Was this problem resolved in |