ZABBIX BUGS AND ISSUES
ZBX-7373

Very slow SQL query when getting last values from history; it can sometimes fail with an error

    Details

      Description

      On Monitoring->Overview and Monitoring->Latest data we have the function CHistoryManager->getLast(), which gets values from history using a UNION. Sometimes:
      1. If the page contains a lot of items
      2. If some items received their latest values ages ago
      it can be very slow and can fail with a "memory is exhausted" error (increasing memory_limit to 512M or 1G does not help).
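The difference between the UNION approach described above and one small query per item can be sketched as follows. This is an illustrative sketch, not Zabbix source: the helper names are hypothetical, and real code must use parameter binding instead of string interpolation.

```python
# Illustrative sketch (not Zabbix source) of the two query-building strategies
# discussed in this issue. Table/column names follow the history schema.

def union_query(itemids, table="history"):
    """One big UNION ALL over every item -- the pattern getLast() used."""
    parts = [
        f"(SELECT * FROM {table} h WHERE h.itemid={itemid} ORDER BY h.clock DESC)"
        for itemid in itemids
    ]
    return " UNION ALL ".join(parts)

def per_item_queries(itemids, table="history", limit=1):
    """One small LIMITed query per item -- the pattern the fix switches to."""
    return [
        f"SELECT * FROM {table} h WHERE h.itemid={itemid} "
        f"ORDER BY h.clock DESC LIMIT {limit}"
        for itemid in itemids
    ]

# One query per item instead of one query spanning thousands of items.
print(len(per_item_queries([10001, 10002])))
```

The per-item form lets the database satisfy each query from the (itemid, clock) index and stop after `limit` rows, instead of materializing every branch of a huge UNION.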

      1. history2.1.patch
        2 kB
        Pavels Jelisejevs
      2. history2.2.patch
        2 kB
        Pavels Jelisejevs
      3. history2.patch
        2 kB
        Oleg Ivanivskyi
      4. test.txt
        98 kB
        Ткаченко Максим Александрович
      1. zabbix_2.2.4_latestdata.png
        195 kB
      2. zabbix-2.0.4.PNG
        45 kB
      3. zabbix-2.2.5.PNG
        44 kB

        Issue Links

          Activity

          Alexey Pustovalov added a comment -

          I ran a few tests to check the performance issues:
          1. Vanilla Latest data with about 3000 items:

          ******************** Script profiler ********************
          Total time: 38.640018
          Total SQL time: 26.696099
          SQL count: 782 (selects: 392 | executes: 390)
          Peak memory usage: 111.5M
          Memory limit: 2048M
          

          2. Separated queries by 10 items per query:

          ******************** Script profiler ********************
          Total time: 14.392597
          Total SQL time: 2.572941
          SQL count: 1252 (selects: 863 | executes: 389)
          Peak memory usage: 113.25M
          Memory limit: 2048M
          

          3. Added a limit of 3 refresh intervals to the big query:

          ******************** Script profiler ********************
          Total time: 38.431007
          Total SQL time: 25.909821
          SQL count: 782 (selects: 393 | executes: 389)
          Peak memory usage: 111.5M
          Memory limit: 2048M
          

          The last test does not help for MySQL, but it helps considerably for Oracle DB.
          The second test crashes Oracle DB because it reaches the maximum number of open cursors.


          Ткаченко Максим Александрович added a comment -

          A new trace for the changed SQL is in test.txt.

          Kenneth Palmertree added a comment -

          Applied the patch in ZBX-7373 and we noticed improved performance for latest data: it went from about 90 seconds to about 1 second to bring up latest data for a host with about 1300 items.

          Kenneth Palmertree added a comment - edited

          Getting "Undefined variable: rs" [dashboard.php:158 → make_webmon_overview() → CHttpTestManager->getLastData() → CHistoryManager->getLast() in /var/www/html/WEBSITE/api/classes/managers/CHistoryManager.php:50] for the web widget if nothing is being monitored. Fixed the issue by adding

          $rs = array();

          class CHistoryManager {
          
          	/**
          	 * Returns the last $limit history objects for the given items.
          	 *
          	 * @param array $items  an array of items with the 'itemid' and 'value_type' properties
          	 * @param int $limit
          	 *
          	 * @return array    an array with items IDs as keys and arrays of history objects as values
          	 */
          	public function getLast(array $items, $limit = 1) {
          		$rs = array();
          		foreach ($items as $item) {
          			$table = self::getTableName($item['value_type']);
          			$query = DBselect(
          				'SELECT *'.
          				' FROM '.$table.' h'.
          				' WHERE h.itemid='.zbx_dbstr($item['itemid']).
          				' ORDER BY h.clock DESC',
          				$limit
          			);
          			while ($history = DBfetch($query)) {
          				$rs[$history['itemid']][] = $history;
          			}
          		}
          
          		return $rs;
          	}
          
          Pavels Jelisejevs (Inactive) added a comment -

          Kenneth, thanks for the feedback and the patch. We've decided to include the fix in 2.2.1 and, possibly, improve it later.

          Pavels Jelisejevs (Inactive) added a comment -

          RESOLVED in svn://svn.zabbix.com/branches/dev/ZBX-7373.

          Pavels Jelisejevs (Inactive) added a comment -

          Fixed in 2.2.1rc1 r40684 and 2.3.0 r40685.

          CLOSED.

          Stephen Dayman added a comment -

          I have added Kenneth's update function to my app and the page is now loading in less than a second (used to take several minutes). Thank you!

          Oleksiy Zagorskyi added a comment -

          Reopened to specify the Fix Version/s correctly.

          Oleksiy Zagorskyi added a comment -

          Closed again.

          Pascal Schmiel added a comment -

          There is a chance to improve performance by reducing database overhead: split the UNIONs into chunks instead of using individual queries:

          getLast.patch
          --- /opt/app/zabbix/pschmiel/zabbix-2.2.0/frontends/php/api/classes/managers/CHistoryManager.php 2013-11-12 07:07:38.000000000 +0100
          +++ /opt/app/zabbix/frontend/api/classes/managers/CHistoryManager.php   2013-12-05 09:40:26.101923782 +0100
          @@ -31,7 +31,7 @@
                   * @param int $limit
                   *
                   * @return array    an array with items IDs as keys and arrays of history objects as values
          -        */
          +        */
                  public function getLast(array $items, $limit = 1) {
                          $queries = array();
                          foreach ($items as $item) {
          @@ -47,9 +47,12 @@
           
                          $rs = array();
                          foreach ($queries as $tableQueries) {
          -                       $query = DBunion($tableQueries);
          -                       while ($history = DBfetch($query)) {
          -                               $rs[$history['itemid']][] = $history;
          +                       $tableQueriesChunk = array_chunk($tableQueries, 200); //value may be optimized
          +                       foreach ($tableQueriesChunk as $tableQueriesSmall) {
          +                               $query = DBunion($tableQueriesSmall);
          +                               while ($history = DBfetch($query)) {
          +                                       $rs[$history['itemid']][] = $history;
          +                               }
                                  }
                          }
          
          Dimitri Bellini added a comment - edited

          I have tested the latest 2.2 SVN branch (05/12/2013) against the old Zabbix 2.0.9 with the same host configuration. Running the "Latest data" page shows a huge difference; details below:

          HOST Configuration

          HOSTNAME01 Applications (7) Items (4422) Triggers (2639) Graphs (198) Discovery (2)

          Zabbix 2.0.9

          ******************** Script profiler ********************
          Total time: 1.073098
          Total SQL time: 0.563685
          SQL count: 44 (selects: 21 | executes: 23)
          Peak memory usage: 11.450989M
          Memory limit: 512M
          

          Zabbix 2.2 svn (05/12/2013)

          ******************** Script profiler ********************
          Total time: 11.397692
          Total SQL time: 4.470916
          SQL count: 4556 (selects: 4517 | executes: 39)
          Peak memory usage: 103M
          Memory limit: 1G
          
          

          On the new 2.2 you can see a huge difference overall, but in particular the "Total time" is very high compared to the "Total SQL time".
          If you need other info, please ask.
          Thanks

          Pascal Schmiel added a comment -

          Dimitri, try my patch on the latest SVN. (You have to change the getLast method by hand because the patch file is for Zabbix 2.2.0.)
          If it makes things better, I will prepare a patch file for the current 2.2.1.

          (Maybe you want to try different values for the array_chunk())

          Best
          Pascal

          Dimitri Bellini added a comment - edited

          Hi Pascal,
          thanks for your reply. I have tested your patch on Zabbix 2.2.0 because the trunk is very different.
          Below I report the same test:

          ******************** Script profiler ********************
          Total time: 18.430855
          Total SQL time: 11.496547
          SQL count: 157 (selects: 120 | executes: 37)
          Peak memory usage: 84.5M
          Memory limit: 1G
          

          It is much better than the original code; the Zabbix trunk code seems better still, but it is not as fast as 2.0.9.
          Many thanks

          Pascal Schmiel added a comment -

          Try raising the chunk size from 200 to 1000.
          I'm not familiar with the 2.3.0 trunk, but maybe I'll have a look at it later.

          Dimitri Bellini added a comment - edited

          Hi Pascal,
          you are right! After changing the chunk size to 10 or 5, I get these values:

          ******************** Script profiler ********************
          Total time: 7.750821
          Total SQL time: 1.960531
          SQL count: 1018 (selects: 981 | executes: 37)
          Peak memory usage: 88M
          Memory limit: 1G
          

          Now it seems much better! But I don't understand why the SQL time is near 2 s while the total time is 8 s...
          Do you know why?

          Oleksiy Zagorskyi added a comment -

          Dimitri, please use comment formatting when pasting SQL output, i.e. the {code} or {noformat} tags.

          Then it will look like this:

          Total time: 7.750821
          
          Dimitri Bellini added a comment -

          Thanks Oleksiy,
          next time I will do that.
          But please consider this issue not solved; I still have slow queries on the "Latest data" page.
          As I understand it, I'm not the only one with this problem.
          If you have suggestions or need details, please tell me.

          Pavels Jelisejevs (Inactive) added a comment -

          Reopening since this issue still requires work.

          Stas Shaklein added a comment -

          Hello. I made a patch for this issue.
          (It already works for my installation: 13000 hosts, 150000 items, 90000 triggers, incoming data ~5 GB per day [Debian Wheezy + pgpool2 + PostgreSQL 9.3 + Zabbix 2.1.6], partitioning from https://www.zabbix.com/wiki/non-english/ru/partitioning_in_postgresql; current DB size: 380 GB.)

          Patch: http://pastebin.com/kzbXxGy9

          Variables

          // partitions.history_uint_p2013_01_02
          // |          |           | \ PARTITION_DATEFORMAT
          // |          |           \ PARTITION_DATE_PREFIX
          // |          \ normal table name
          // \ PARTITION_SCHEMA
          define('PARTITION_SCHEMA', 'partitions');
          define('PARTITION_TABLES_DAILY', '"history" "history_log" "history_str" "history_text" "history_uint"');
          define('PARTITION_DATE_PREFIX', '_p');
          define('PARTITION_DATEFORMAT', 'Y_m_d');
          define('LATEST_DATA_MULTIPLIER', 3);
          
          • PARTITION_SCHEMA [string] - name of the DB schema holding the partition tables
          • PARTITION_TABLES_DAILY [string] - which tables the patch must handle
          • PARTITION_DATE_PREFIX [string] - characters between the table name and the date suffix
          • PARTITION_DATEFORMAT [string] - date format for your tables (syntax - http://php.net/manual/en/function.date.php)
          • LATEST_DATA_MULTIPLIER [unsigned integer] - "clock" limit for the SQL query (this value is multiplied by each item's delay):
            ... WHERE clock >= <NOW> - ( <ITEM.DELAY> * LATEST_DATA_MULTIPLIER )
                AND clock < <NOW>
            
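As a rough illustration of how the defines above could map an item's clock window onto daily partition names, here is a hypothetical sketch. The function and its logic are my assumptions for illustration, not the patch's actual code (the patch itself is at the pastebin link above); only the naming scheme follows the defines.

```python
# Hypothetical sketch: which daily partitions cover the window
# [now - delay * LATEST_DATA_MULTIPLIER, now] for a given history table.
# Naming follows the defines above: <schema>.<table>_p<Y_m_d>.
from datetime import datetime, timedelta

PARTITION_SCHEMA = "partitions"
PARTITION_DATE_PREFIX = "_p"
PARTITION_DATEFORMAT = "%Y_%m_%d"   # PHP date() format 'Y_m_d'
LATEST_DATA_MULTIPLIER = 3

def partition_tables(table, now, delay):
    """Return the daily partition names the query would have to touch."""
    start = now - timedelta(seconds=delay * LATEST_DATA_MULTIPLIER)
    day, names = start.date(), []
    while day <= now.date():
        names.append(f"{PARTITION_SCHEMA}.{table}{PARTITION_DATE_PREFIX}"
                     f"{day.strftime(PARTITION_DATEFORMAT)}")
        day += timedelta(days=1)
    return names

# An item polled every 60 s, checked at noon, stays within a single partition.
print(partition_tables("history_uint", datetime(2013, 1, 2, 12, 0, 0), delay=60))
# → ['partitions.history_uint_p2013_01_02']
```

This is why the clock limit matters: a small window usually resolves to one daily partition instead of a scan across the whole history table.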

          Tests with latest data for 50000 items:

          Env                                               Time to generate "Latest data" page
          Zabbix 2.1.6                                      163 s
          Zabbix 2.2                                        118 s
          Zabbix 2.[1|2] + patch without partition vars     ~10-12 s
          Zabbix 2.[1|2] + patch with partition vars        ~3-4 s
          richlv added a comment - edited

          ZBX-7205 might have the same cause.
          Alexey Pustovalov: not quite right. It can also happen when there are a lot of queries to the database, or when PHP selects a lot of data from the database.

          Eduardo Ruiz added a comment - edited

          Stas Shaklein,

          On version 2.2.1, CHistoryManager.php is almost the same, and I think there will be no problem.

          However, latest.php is slightly different. What would be the best approach?

          Thanks.

          Stas Shaklein added a comment -

          Hello Eduardo,

          For 2.2 I have tested "r41010".

           ~$ svn co svn://svn.zabbix.com/trunk zabbix-trunk_r41010 -r r41010 

          Compared to the previous version, r41010 performs a separate SQL query for each item. But that does not really resolve the issue under high load.

          Anyway, the best approach differs for each installation, so this patch is not the universal answer. (But I hope the patch contains a helpful idea for the main Zabbix release.)

          Eduardo Ruiz added a comment - edited

          Stas (and folks),

          I've tested the solution on 2.2.1 in a testing environment. It seems to be working without any modifications to latest.php. Tables are being created:

          Table Lines
          history_p2014_01_23 1258860
          history_p2014_01_24 3948204
          history_p2014_01_25 3954510
          history_p2014_01_26 3934371
          history_p2014_01_27 1834014
          history_str_p2014_01_23 0
          history_str_p2014_01_24 235
          history_str_p2014_01_25 235
          history_str_p2014_01_26 235
          history_str_p2014_01_27 51
          history_text_p2014_01_23 2644
          history_text_p2014_01_24 11538
          history_text_p2014_01_25 11565
          history_text_p2014_01_26 11818
          history_text_p2014_01_27 6457
          history_uint_p2014_01_23 632804
          history_uint_p2014_01_24 1944413
          history_uint_p2014_01_25 1887237
          history_uint_p2014_01_26 2023270
          history_uint_p2014_01_27 963284
          trends_p2014_01 1094036
          trends_uint_p2014_01 262199

          And they can be consulted normally with CHistoryManager.php and defines.inc.php.

          Would we say that, with these two modifications, a Zabbix environment is ready for partitioning?

          Will Zabbix internal functions (like last(), the most commonly used) handle the data correctly?

          Regards,

          Eduardo

          Dimitri Bellini added a comment - edited

          Hi guys,
          I have tested the new Zabbix 2.2.2rc2 but still see very slow performance on a host with a lot of items; the first run sometimes times out or takes a long time to load.
          The strange thing is that with the debug option open, the time is not as large as I expect the first time I look at the latest data for that host.

          ******************** Script profiler ********************
          Total time: 8.693752
          Total SQL time: 2.208125
          SQL count: 4536 (selects: 4512 | executes: 24)
          Peak memory usage: 104.25M
          Memory limit: 1G
          

          After the first run (MySQL cache?) the performance is better:

          ******************** Script profiler ********************
          Total time: 1.133612
          Total SQL time: 0.349234
          SQL count: 699 (selects: 670 | executes: 29)
          Peak memory usage: 27.5M
          Memory limit: 1G
          
          Ben Lambrey added a comment -

          Still a problem using 2.2.2 final.

          Synthead added a comment -

          I can also verify that this is still an issue with 2.2.2. I have a Zabbix server with 570 hosts.

          In my case, in "Latest Data", if I select host "all", Apache and MySQL chew on a CPU for approximately 30 seconds, then the page returns to the previously selected host (instead of "all"). Because of this, I can only view the latest data for specific hosts via the UI. Setting the row limit in the user's profile doesn't seem to change the behavior. There are no error messages in Apache's error log, either.

          Aleksandrs Saveljevs added a comment -

          One of the "Latest data" bugs we might wish to fix under this issue is the following. Suppose we have just configured a numeric item and it received a value of 5. The change column will then show a change of "+5", implying that the previous value was 0. It should not.
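A minimal sketch of the behavior Aleksandrs suggests (a hypothetical helper for illustration, not frontend code): display a change only when a previous value actually exists, instead of assuming an implicit 0.

```python
# Hypothetical sketch of the change-column logic: no previous value, no delta.

def format_change(values):
    """values: history values for one item, newest first."""
    if len(values) < 2:
        return ""            # only one value so far -> show no change
    delta = values[0] - values[1]
    return f"{delta:+g}" if delta else ""

print(format_change([5]))      # "" -- a freshly configured item shows nothing
print(format_change([7, 5]))   # "+2"
```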

          Rob Roovers added a comment -

          Still having issues with Zabbix 2.2.2: the Latest data page is blank.

          Oleg Ivanivskyi added a comment -

          Parameter "Dropdown first entry = none" may be used as a temporary solution (https://www.zabbix.com/documentation/2.2/manual/web_interface/frontend_sections/administration/general#gui).

          Rob Roovers added a comment -

          Thanks for that. I do see the dropdowns now, although if I select a single entity it doesn't show any data and the page goes blank again.

          Janis Jansons added a comment -

          My page doesn't go blank, but it's soooo slow... Latest data is almost unusable after upgrade from 2.0.x

          Adam K added a comment -

          I noticed that the problem was getting worse after the upgrade because the housekeeper is disabled by default in 2.2.0. The size of the history tables was just growing endlessly, and the Latest data page was getting slower and slower.

          After re-enabling the housekeeper in Administration -> General settings, manually cleaning up 3 months of built-up history, and rebuilding the table, the Latest data page was usable again. Still not quite as good as 2.0.x, though.

          Ben Lambrey added a comment -

          Janis, Oleg's temporary solution worked for us.

          Shane McEwan added a comment -

          I just tried Oleg's "history2" patch and it has improved things tremendously! I have a slow database so even with only 200 items on a host I couldn't use the Latest Data page. After installing Oleg's patch the page loads in a couple of seconds.

          Thanks for everyone's work on this issue. We're nearly there!

          Janis Jansons added a comment -

          Oleg's "history2" patch seems to make a difference. The performance is still not perfect, but way better than before patching.

          FYI, the patch adds an extra ":" on line 52 of api/classes/managers/CHistoryManager.php, which causes PHP to fail. Just remove it manually and everything should work fine.

          larry j prikockis added a comment -

          Any chance this fix will make it into the 2.2.3 release? Or is it further down the road? Latest data has been painfully slow since the last upgrade. Guess I'll try the patch.

          Pavels Jelisejevs (Inactive) added a comment -

          Larry, the 2.2.3 release candidate is already out and we'll need more time to figure out a better way to resolve this issue, so this problem will be fixed only in one of the future releases.

          Alexey Pustovalov added a comment - - edited

          Possible MySQL solution...

          1.

          CREATE TABLE item_values(itemid bigint(20), prevvalue varchar(512), prevclock int(11), lastvalue varchar(512), lastclock int(11));
          ALTER TABLE item_values ADD PRIMARY KEY(itemid);
          
          delimiter |
          CREATE TRIGGER [history,history_uint,history_str,history_text,history_log]_trg BEFORE INSERT ON [history,history_uint,history_str,history_text,history_log]
            FOR EACH ROW
          BEGIN
              INSERT INTO item_values (itemid,lastvalue,lastclock) VALUES (NEW.itemid,NEW.value,NEW.clock)
                  ON DUPLICATE KEY UPDATE prevvalue = lastvalue, prevclock = lastclock, lastvalue = NEW.value, lastclock = NEW.clock;
          END;
          |
          
          delimiter ;
          

          Replace [history,history_uint,history_str,history_text,history_log] with the actual table name; create the trigger for each table.

          2. Apply the following patch to the frontend:

          Index: api/classes/managers/CHistoryManager.php
          ===================================================================
          --- api/classes/managers/CHistoryManager.php	(revision 43862)
          +++ api/classes/managers/CHistoryManager.php	(working copy)
          @@ -35,17 +35,37 @@
           	public function getLast(array $items, $limit = 1) {
           		$rs = array();
           		foreach ($items as $item) {
          -			$table = self::getTableName($item['value_type']);
          -			$query = DBselect(
          -				'SELECT *'.
          -				' FROM '.$table.' h'.
          -				' WHERE h.itemid='.zbx_dbstr($item['itemid']).
          -				' ORDER BY h.clock DESC',
          -				$limit
          -			);
          -			while ($history = DBfetch($query)) {
          -				$rs[$history['itemid']][] = $history;
          +			if ($limit > 2) {
          +				$table = self::getTableName($item['value_type']);
          +				$query = DBselect(
          +					'SELECT *'.
          +					' FROM '.$table.' h'.
          +					' WHERE h.itemid='.zbx_dbstr($item['itemid']).
          +					' ORDER BY h.clock DESC',
          +					$limit
          +				);
          +				while ($history = DBfetch($query)) {
          +					$rs[$history['itemid']][] = $history;
          +				}
          +
           			}
          +			else {
          +				$query = DBselect(
          +					'SELECT *'.
          +					' FROM item_values iv'.
          +					' WHERE iv.itemid='.zbx_dbstr($item['itemid'])
          +				);
          +				while ($history = DBfetch($query)) {
          +					$rs[$history['itemid']][] = ['itemid' => $history['itemid'],
          +						'clock' => $history['lastclock'],
          +						'value' => $history['lastvalue']];
          +
          +					$rs[$history['itemid']][] = ['itemid' => $history['itemid'],
          +						'clock' => $history['prevclock'],
          +						'value' => $history['prevvalue']];
          +
          +				}
          +			}
           		}
           
           		return $rs;
          
          

          Possible issues:
          1. Deadlocks: tested on a system with 2.5M items; no deadlocks were observed.
          2. Truncated data for long values, e.g. in the history_text and history_log tables.

          Known bugs:
          1. Removed items remain in the table. To remove deleted items:

          DELETE FROM item_values WHERE itemid NOT IN (SELECT itemid FROM items);
          

          Can anybody test that solution?
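
          For anyone wanting to try this on PostgreSQL: below is a rough, untested sketch of an equivalent trigger, assuming the same item_values table as above. Note that PostgreSQL 9.3 has no ON DUPLICATE KEY / ON CONFLICT upsert (ON CONFLICT arrived in 9.5), so the update-then-insert pattern can race under concurrent inserts for a brand-new itemid.

```sql
-- Hypothetical PostgreSQL 9.x port of the MySQL trigger above (untested sketch).
CREATE OR REPLACE FUNCTION item_values_upsert() RETURNS trigger AS $$
BEGIN
    UPDATE item_values
       SET prevvalue = lastvalue, prevclock = lastclock,
           lastvalue = NEW.value, lastclock = NEW.clock
     WHERE itemid = NEW.itemid;
    IF NOT FOUND THEN
        -- Can race with a concurrent first insert for the same itemid.
        INSERT INTO item_values (itemid, lastvalue, lastclock)
        VALUES (NEW.itemid, NEW.value, NEW.clock);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Create one trigger per history table, e.g. for history and history_uint:
CREATE TRIGGER history_trg BEFORE INSERT ON history
    FOR EACH ROW EXECUTE PROCEDURE item_values_upsert();
CREATE TRIGGER history_uint_trg BEFORE INSERT ON history_uint
    FOR EACH ROW EXECUTE PROCEDURE item_values_upsert();
```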

          Marc added a comment -

          Doesn't work with PostgreSQL:

          pg_query(): Query failed: ERROR: syntax error at or near "WHERE" LINE 1: ...id='577847' ORDER BY h.clock DESC LIMIT 2) as sel WHERE h2.i... ^ [latest.php:213 → CHistoryManager->getLast() → DBselect() → pg_query() in /usr/share/zabbix/include/db.inc.php:411]
          

          Just in case this was not intended to work for MySQL only.

          Alexey Pustovalov added a comment - - edited

          Marc, please try the corrected patch.

          Marc added a comment - - edited

          The corrected patch works (syntactically).

          Unfortunately, it doesn't have a positive effect on performance (at least in my environment).
          I've repeatedly refreshed Latest data for a host with 3576 items having data (3596 in total):

          • without patch ~8 sec
          • patch applied ~11 sec

          Edit:

          Total time: 8.064284
          Total SQL time: 5.008266
          SQL count: 3660 (selects: 3632 | executes: 28)
          Peak memory usage: 79.75M
          Memory limit: 512M
          
          Total time: 11.609592
          Total SQL time: 8.816184
          SQL count: 3657 (selects: 3633 | executes: 24)
          Peak memory usage: 80.25M
          Memory limit: 512M
          
          Dimitri Bellini added a comment -

          I have tried to test Alexey's patch. As a first step I created the MySQL table, but...

          mysql> drop table item_values;
          Query OK, 0 rows affected (0.01 sec)
          
          mysql> CREATE TABLE item_values(itemid bigint(20), prevvalue varchar(512), prevclock int(11), lastvalue varchar(512), lastclock int(11));
          Query OK, 0 rows affected (0.03 sec)
          
          mysql> ALTER TABLE item_values ADD PRIMARY KEY(itemid);
          Query OK, 0 rows affected (0.04 sec)
          Records: 0  Duplicates: 0  Warnings: 0
          
          mysql> delimiter |
          mysql> CREATE TRIGGER [history,history_uint,history_str,history_text,history_log]_trg BEFORE INSERT ON [history,history_uint,history_str,history_text,history_log]
              -> FOR EACH ROW
              -> BEGIN
              -> INSERT INTO item_values (itemid,lastvalue,lastclock) VALUES (NEW.itemid,NEW.value,NEW.clock)
              -> ON DUPLICATE KEY UPDATE prevvalue = lastvalue, prevclock = lastclock, lastvalue = NEW.value, lastclock = NEW.clock;
              -> END;
              -> |
          ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[history,history_uint,history_str,history_text,history_log]_trg BEFORE INSERT ON' at line 1
          

          My Zabbix DB version is:

          mysql> select * from dbversion;
          +-----------+----------+
          | mandatory | optional |
          +-----------+----------+
          |   2020000 |  2020001 |
          +-----------+----------+
          1 row in set (0.00 sec)
          

          Suggestion?

          Alexey Pustovalov added a comment -

          Dimitri, you need to create the trigger for each table. For example,

          CREATE TRIGGER [history,history_uint,history_str,history_text,history_log]_trg BEFORE INSERT ON [history,history_uint,history_str,history_text,history_log]
          

          must be replaced for each history-related table, like

          CREATE TRIGGER history_trg BEFORE INSERT ON history
          

          or

          CREATE TRIGGER history_uint_trg BEFORE INSERT ON history_uint
          
          Dimitri Bellini added a comment -

          Hi Alexey,
          After some work I have now tested your solution (MySQL DB patch and PHP file patch), and it seems better than every other test; details below:

          ******************** Script profiler ********************
          Total time: 9.671476
          Total SQL time: 1.992437
          SQL count: 5718 (selects: 5690 | executes: 28)
          Peak memory usage: 119.75M
          Memory limit: 1G
          

          What do you think? Is this the expected result?
          Thanks so much

          Alexey Pustovalov added a comment -

          Dimitri, do you see deadlocks in zabbix_server.log file?

          epingnet added a comment - - edited

          I upgraded Zabbix from 2.0.11 to 2.2.3, and now opening Monitoring -> Latest data is very slow. Which version will fix this bug?
          Thanks

          Justin McNutt added a comment -

          We have observed this same problem on Zabbix 2.2.x, as compared to Zabbix 2.0.x. We rebuilt our Zabbix server from scratch, so it definitely has nothing to do with the upgrade process. Our database server is MySQL 5.5.23 (RHEL 6, 64-bit) and our Zabbix servers and proxies are Zabbix 2.2.1.

          The problem definitely only occurs on the Latest data page when one or more hosts have a large number of items.

          Does this need more testing or is there a patch that will be published in an upcoming version?

          Thanks!

          Bruce Reed added a comment -

          I have a problem under 2.2.3 with the Latest data page query timing out and leaving me on the last page. I tried Alexey's patch above and we get an empty page when hitting Latest data. Had to revert.

          Gertjan Awater added a comment -

          We have reverted to 2.0.x because of this.
          I can set up remote access to our test server, which is still at 2.2.3, if that's any help.

          Gerd Oberlechner added a comment - - edited

          From a database aesthetics viewpoint it was a good idea to remove the lastvalue and lastclock columns from the items table, since they were nothing more than a duplication of data, considering that the values are also saved in the history* tables. But from the performance viewpoint the change was problematic, especially in large environments, where the N+1 query problem rockets off to new dimensions. I tried to optimize the query on my own but did not find any solution with an acceptable complexity class (http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql).

          Pavels Jelisejevs, you mentioned you will solve this in a future release. Can you tell us if this will be a 2.2.x release or 2.4?
          Can you give us any insight into possible solutions?

          I'm considering Alexey Pustovalov's trigger-based solution for the time being (without any know-how about MySQL's trigger performance implications for the whole database).
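
          For reference, the classic greatest-n-per-group shape from that article, applied to a Zabbix history table (sketch only; this is exactly the kind of query that degrades badly as the table grows, and MySQL 5.5 has no window functions to help):

```sql
-- Last value per item via a correlated subquery (sketch; the subquery runs
-- once per row and stays heavy even with an (itemid, clock) index).
SELECT h.itemid, h.clock, h.value
  FROM history h
 WHERE h.clock = (SELECT MAX(h2.clock)
                    FROM history h2
                   WHERE h2.itemid = h.itemid);
```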

          Slash added a comment -

          We are currently using PostgreSQL 9.3; is the trigger solution for MySQL usable with Postgres?

          By the way, this issue is very problematic; is there any chance that a fix could be included in the next minor 2.2 release?

          Paweł Sasin added a comment -

          A colleague of mine has some thoughts on this topic, documented here:
          http://remigium.blogspot.com/2014/05/zabbix-223-and-zbx-7373.html

          Pavels Jelisejevs (Inactive) added a comment -

          We're currently considering the idea of limiting the history values displayed on certain pages like "Latest data" and "Overview" to the last 24 hours.

          It would be a great help if someone could test these two patches: history2.1.patch and history2.2.patch. This is a simplified version of the fix: it applies a global 24-hour limit to history data everywhere. Our lab tests show good results, but we would also like to try them out on real environments. The patch can be applied to any version newer than 2.2.3. Please post any testing results here.
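
          Conceptually, these patches bound every per-item lookup with a time window, so queries never walk deep into old history. Illustrative only (MySQL syntax; the itemid is a placeholder, and the literal patch SQL may differ):

```sql
-- Illustrative: last two values for one item, limited to the past 24 hours
-- (86400 seconds), so old rows/partitions are never scanned.
SELECT *
  FROM history h
 WHERE h.itemid = 12345
   AND h.clock > UNIX_TIMESTAMP() - 86400
 ORDER BY h.clock DESC
 LIMIT 2;
```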

          Aleksandrs Saveljevs added a comment -

          Will 24 hours be configurable?

          Marc added a comment - - edited

          Pavels Jelisejevs, I just made some quick-and-dirty tests on our live system.
          With warm buffer caches, meaning that Latest data was already accessed previously, there is no difference (which is what I expected).
          For initial access there appears to be a huge difference, in a positive way:

          Cache  PHP        Total time  SQL time    SQL count
          warm   unpatched    8.009978    5.056313  3659
          warm   patch2.1     8.397428    5.319804  4400
          warm   patch2.2     7.52639     4.408044  3659
          cold   unpatched  252.204879  248.803407  3678
          cold   patch2.1     9.018636    5.882898  4437
          cold   patch2.2     8.246099    5.260707  3676

          Tests with 'warm' caches are done on the same host.
          Tests with 'cold' caches are done on separate hosts, of course.

          All hosts have approximately 3500 items and 15 applications.
          I'll keep one of the patches applied and an eye on it.

          Addendum:
          GNU/Linux
          Zabbix 2.2.2
          PostgreSQL 9.3.4
          Partition range: weekly
          Partition count: 11
          Partition volume (float/uint): ~450/~250 million
          Partition size (float/uint): ~25/~45 GB

          Pavels Jelisejevs (Inactive) added a comment -

          Aleksandrs, yes: in 2.4 it will be configurable from the frontend, and in 2.2 it will be configurable from defines.inc.php.

          Dimitri Bellini added a comment - - edited

          Hi Pavels,
          I have made some tests on my Zabbix test server (Zabbix 2.2.3); results below, but it seems there are not many differences.

          Cache	PHP	Total time	SQL time	SQL count
          warm	unpatched	2,0929	0,248218	2134
          warm	patch2.2	2,2705	0,696319	2078
          
          Krists Krigers (Inactive) added a comment - - edited

          As discussed with Alexander Vladishev, merged Pavels Jelisejevs's patch to 2.2.4rc1 in r46266 and to 2.3.2 (trunk) in r46268.

          A new development branch svn://svn.zabbix.com/branches/dev/ZBX-7373-patch was made for this patch.

          Pavels Jelisejevs (Inactive) added a comment - - edited

          (1) This needs to be documented: on the "Latest data" and "Overview" pages and in "Data overview" screen items we only display values for the last 24 hours.

          <richlv> also whatsnew and https://www.zabbix.com/documentation/2.2/manual/web_interface/definitions

          Martins Valkovskis Updated documentation:

          https://www.zabbix.com/documentation/2.2/manual/introduction/whatsnew224#latest_data_from_24_hours_only
          https://www.zabbix.com/documentation/2.2/manual/web_interface/frontend_sections/monitoring/latest_data
          https://www.zabbix.com/documentation/2.2/manual/web_interface/frontend_sections/monitoring/overview
          https://www.zabbix.com/documentation/2.2/manual/web_interface/definitions
          https://www.zabbix.com/documentation/2.4/manual/web_interface/frontend_sections/monitoring/latest_data
          https://www.zabbix.com/documentation/2.4/manual/web_interface/frontend_sections/monitoring/overview
          https://www.zabbix.com/documentation/2.4/manual/web_interface/definitions

          RESOLVED.

          <richlv> in all the places where we say "It is also possible to change this limitation by changing the value of ZBX_HISTORY_PERIOD constant in include/defines.inc.php.", let's link to the definitions page

          Martins Valkovskis Done so. RESOLVED.

          <richlv> thanks, that looks good - but I got confused about "item overview pages". Let's explicitly mention the overview page & screen element. Do we need links to the corresponding frontend section pages from here?

          Martins Valkovskis Right, that was just copy-paste from code comments. Made more explicit now, perhaps can do without links here. RESOLVED.

          <richlv> it says "for large pages of latest data" in these two pages, which seems wrong:
          https://www.zabbix.com/documentation/2.2/manual/web_interface/frontend_sections/monitoring/overview
          https://www.zabbix.com/documentation/2.4/manual/web_interface/frontend_sections/monitoring/overview

          REOPENED

          Martins Valkovskis Removed the 'latest data' mention there. Hopefully that resolves it.

          Alexander Vladishev CLOSED

          Tatapoum added a comment - - edited

          In my setup it doesn't help much. I have about 400 hosts and 100,000 items so far. This is a distributed setup with an external MySQL DB server with 12 GB of memory.
          Opening the latest data page is VERY slow if the last selection was "all" (up to 3-4 minutes). PHP memory gets exhausted if the limit is under 2 GB. Here are my PHP memory parameters:

              php_value max_execution_time 300
              php_value memory_limit 2048M
              php_value post_max_size 128M
              php_value upload_max_filesize 2M
              php_value max_input_time 300
          

          If I enable debug mode, the queries use too much PHP memory and the page doesn't appear. There are no MySQL slow queries.

          For example, in my setup, does it make sense to display 100,000 items on one page? Wouldn't it be better to implement some kind of paging (1000 items per page is enough) using the LIMIT clause? Sorry if my question is stupid!
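As an aside, the LIMIT-based paging suggested here can be sketched in a few lines against any SQL backend. This is a minimal, hypothetical illustration in Python with SQLite, not Zabbix code; the table and column names are made up to loosely mirror `items`:

```python
import sqlite3

# Stand-in for Zabbix's `items` table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (itemid INTEGER PRIMARY KEY, hostid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO items (itemid, hostid, name) VALUES (?, ?, ?)",
    [(i, i // 100, "item %d" % i) for i in range(1, 5001)],
)

def get_items_page(page, per_page=1000):
    # Fetch one page at a time instead of all rows at once:
    # the LIMIT/OFFSET pair bounds both query time and client-side memory.
    offset = (page - 1) * per_page
    return conn.execute(
        "SELECT itemid, hostid, name FROM items"
        " ORDER BY hostid, itemid LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()

page3 = get_items_page(3)
print(len(page3), page3[0][0])  # → 1000 2001
```

Deterministic ordering (here `hostid, itemid`) is essential, otherwise pages can overlap or skip rows between requests.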

          Tatapoum added a comment -

          OK, as I understand it, the CHistoryManager class isn't really an issue, as the getLastForItem() function can now be limited to a one-day scope, which should be appropriate for most setups.
          Correct me if I'm wrong, but as I said above, the biggest issue is that the latest data page tries to fetch all the items, which is useless if you have a large number of items and no selection filter. Moreover, it's not only the calls to getLastForItem() that are slow: the first call to the item->get() API method is VERY slow if you fetch 100,000+ items. Pagination seems to be the only possible solution. As latest.php uses an API call to get the item list, we need to add pagination to item->get().

          As a test, I hacked CItem.php to add support for simple pagination of the item->get() query, and also support for sorting by 'hostid', which is more relevant than 'name' or 'itemid':

          @@ -28,7 +28,7 @@
          
                  protected $tableName = 'items';
                  protected $tableAlias = 'i';
          -       protected $sortColumns = array('itemid', 'name', 'key_', 'delay', 'history', 'trends', 'type', 'status');
          +       protected $sortColumns = array('itemid', 'name', 'key_', 'delay', 'history', 'trends', 'type', 'status', 'hostid');
          
                  public function __construct() {
                          parent::__construct();
          @@ -55,6 +55,7 @@
                   * @param bool   $options['count']
                   * @param string $options['pattern']
                   * @param int    $options['limit']
          +         * @param int    $options['offset']
                   * @param string $options['order']
                   *
                   * @return array|int item data as array or false if error
          @@ -116,6 +117,7 @@
                                  'sortfield'                                     => '',
                                  'sortorder'                                     => '',
                                  'limit'                                         => null,
          +                        'offset'                                        => 0,
                                  'limitSelects'                          => null
                          );
                          $options = zbx_array_merge($defOptions, $options);
          @@ -370,10 +372,11 @@
                          }
          
                          $itemids = array();
          +                $offset = $options['offset'];
                          $sqlParts = $this->applyQueryOutputOptions($this->tableName(), $this->tableAlias(), $options, $sqlParts);
                          $sqlParts = $this->applyQuerySortOptions($this->tableName(), $this->tableAlias(), $options, $sqlParts);
                          $sqlParts = $this->applyQueryNodeOptions($this->tableName(), $this->tableAlias(), $options, $sqlParts);
          -               $res = DBselect($this->createSelectQueryFromParts($sqlParts), $sqlParts['limit']);
          +               $res = DBselect($this->createSelectQueryFromParts($sqlParts), $sqlParts['limit'], $offset);
                          while ($item = DBfetch($res)) {
                                  if (!is_null($options['countOutput'])) {
                                          if (!is_null($options['groupCount'])) {
          

          As DBselect() already implements offsets, it is pretty trivial.

          Then I modified latest.php with static pagination:

          @@ -193,7 +193,10 @@
                          'filter' => array(
                                  'status' => array(ITEM_STATUS_ACTIVE)
                          ),
          -               'preservekeys' => true
          +               'preservekeys' => true,
          +                'limit' => 1000,
          +                'offset' => 2000,
          +                'sortfield' => 'hostid'
                  ));
           }
          

          Of course, the latest data page now loads in a few seconds, even if the selection fields are set to "all". It's not perfect due to the sorting options, but it's better than nothing.

          ******************** Script profiler ********************
          Total time: 6.296404
          Total SQL time: 5.470514
          SQL count: 1079 (selects: 1042 | executes: 37)
          Peak memory usage: 39.75M
          Memory limit: 2048M
          

          I know this would introduce an API change, but wouldn't it be possible to add support for paging in get() methods (maybe first in just a few methods like item->get())?
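One caveat worth noting with the offset approach: the database still has to scan and discard every skipped row, so deep pages get progressively slower. Keyset ("seek") pagination avoids that by filtering on the last key seen. A hedged sketch of the two variants (plain SQLite in Python, not the Zabbix API):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (itemid INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, "item %d" % i) for i in range(1, 10001)])

def page_by_offset(page, per_page=1000):
    # OFFSET paging: server-side cost grows with the offset,
    # since the skipped rows are still scanned.
    return conn.execute(
        "SELECT itemid FROM items ORDER BY itemid LIMIT ? OFFSET ?",
        (per_page, (page - 1) * per_page)).fetchall()

def page_by_keyset(last_itemid, per_page=1000):
    # Keyset paging: resume after the last itemid seen; the primary-key
    # index jumps straight to the right place regardless of page depth.
    return conn.execute(
        "SELECT itemid FROM items WHERE itemid > ? ORDER BY itemid LIMIT ?",
        (last_itemid, per_page)).fetchall()

# Page 3 either way: itemids 2001..3000.
assert page_by_offset(3) == page_by_keyset(2000)
```

Keyset paging cannot jump to an arbitrary page number, so a UI like Latest data would typically combine it with "next/previous" navigation rather than numbered pages.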

          Marc added a comment -

          Tatapoum, just for information:
          You might be interested in ZBXNEXT-370 as well, if not already aware of.
          Btw, in my case (200k items, 1.5k hosts), since 2.2.4 I can even omit the patch in ZBXNEXT-370 and just enforce 'None' (resp. 'Not selected') for the combo boxes in Latest data (resp. Overview).

          Tatapoum added a comment -

          Thank you Marc, I wasn't aware of this patch. I slightly modified it to make it work with 2.2.4 and it helps. People now have to choose All/All explicitly, as the Latest Data page takes minutes to display with the All/All selections (300 hosts, almost 100k items checked quite often).
          However it doesn't really solve the issue, it's just a hack... The API really needs support for pagination; you can't download the full list of items/events/etc. just to display a page. It works for small environments, but are small environments the target for a complex product like Zabbix?
          AFAIK it shouldn't be very difficult to implement, even though the result isn't perfect. I'd like to help, but I'm not sure if Zabbix developers are willing to include patches to the API.

          Thomas Daugherty added a comment -

          I see this issue as having been marked resolved, stemming from the change to limit the latest data page to 24 hours. That may solve the issue for smaller setups, but for larger setups we still can't use the latest data page to display large groups of hosts' data. With the filter set to all/all it takes nearly 10 minutes to load, with the httpd process's memory high-water mark at nearly 2.5 GB... The latest data page worked fine prior to 2.2.

          Are there plans to revisit this issue?

          Slash added a comment -

          I have the same issue as Thomas; for me the latest data page is still unusable. I even changed the limit from 24 hours to 1 hour, but it still doesn't work for some groups with many hosts.

          Could you please reopen this ticket?
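For context, the 24-hour cap discussed in this thread effectively turns each "last value" lookup into a clock-bounded query, which an (itemid, clock) index can serve cheaply. A rough stand-alone sketch of that query shape (SQLite in Python; the table loosely mirrors Zabbix's `history`, and `HISTORY_PERIOD` plays the role of ZBX_HISTORY_PERIOD):

```python
import sqlite3

now = 1_700_000_000
HISTORY_PERIOD = 86400  # plays the role of ZBX_HISTORY_PERIOD (24 hours)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (itemid INTEGER, clock INTEGER, value REAL)")
conn.execute("CREATE INDEX history_1 ON history (itemid, clock)")
conn.executemany("INSERT INTO history VALUES (?, ?, ?)", [
    (1, now - 60, 3.14),           # item 1: fresh data
    (1, now - 120, 2.71),
    (2, now - 30 * 86400, 9.99),   # item 2: last value ~30 days ago
])

def get_last(itemid):
    # The clock bound lets the (itemid, clock) index stop early instead of
    # walking all history; the trade-off is that items silent for longer
    # than the period show no value at all.
    row = conn.execute(
        "SELECT value FROM history WHERE itemid = ? AND clock > ?"
        " ORDER BY clock DESC LIMIT 1",
        (itemid, now - HISTORY_PERIOD)).fetchone()
    return row[0] if row else None

print(get_last(1), get_last(2))  # → 3.14 None
```

This trade-off is exactly what the comments above describe: shrinking the period makes the query cheaper per item, but it does nothing about the number of items the page tries to fetch in the first place.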

          Rik Bruggink added a comment -

          Latest data page is still unusable with all settings in 2.2.5. Page load time was 27 seconds in 2.2.1 and 28 seconds in 2.2.5, vs 5 seconds in 2.0.4.
          Could you please reopen this issue?

          Rik Bruggink added a comment -

          Load time on 2.2.5.

          Slash added a comment -

          Zabbix 2.2.4, trying to display latest data for a group with many hosts.

          David Parker added a comment -

          Quick patch for 2.2.4 UI to disable All in "Latest Data" - this is just a hack. With this you can retain the ability to select all in e.g. Events without the liability of having users be able to select All/All in latest data.

          --- latest.php.stock 2014-08-07 10:02:53.708501638 -0500
          +++ latest.php 2014-08-07 10:02:13.756489939 -0500
          @@ -139,7 +139,8 @@
                   'with_monitored_items' => true
               ),
               'hostid' => getRequest('hostid', null),
          -    'groupid' => getRequest('groupid', null)
          +    'groupid' => getRequest('groupid', null),
          +    'config' => array('deny_all' => 1)
           ));

          $_REQUEST['groupid'] = $pageFilter->groupid;

          and for overview:

          --- overview.php.stock 2014-08-07 10:03:11.996506959 -0500
          +++ overview.php 2014-08-07 09:54:40.324349339 -0500
          @@ -76,6 +76,7 @@
          'applications' => array('templated' => false),
          'hostid' => get_request('hostid', null),
          'groupid' => get_request('groupid', null),
          + 'config' => array('deny_all' => 1),
          'application' => get_request('application', null)
          ));

          Marc added a comment -

          The previous patch snippet is actually not complete.
          See zbxNext370-light.patch found in ZBXNEXT-370

          iain wright added a comment -

          This is impacting us as well: latest data is significantly slower on 2.2.3. Will 2.4.0 help?

          richlv added a comment -

          looks like this issue was handled in a very confusing way. there was something changed for 2.2.1 - but then, as per the comments, limiting to 24 hours was introduced in "2.2.4rc1 in r46266 and to 2.3.2 (trunk) in r46268"

          i'd suggest adding those two versions to the "fix version" field.


            People

            • Assignee:
              Unassigned
              Reporter:
              Alexey Pustovalov
            • Votes:
              36
              Watchers:
              47
