ZABBIX BUGS AND ISSUES / ZBX-15956

Configuration Cache Fragmentation


    Details

    • Type: Problem report
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 4.0.5
    • Component/s: Server (S)
    • Environment:
      Ubuntu 16.04
      Linux zabbix-backend 4.4.0-143-generic #169-Ubuntu SMP Thu Feb 7 07:56:38 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
    • Team:
      Team A
    • Sprint:
      Sprint 51 (Apr 2019)
    • Story Points:
      3

      Description

      I'm submitting this based on advice received in ZBXNEXT-5113.

      I've run into an issue where our zabbix_server process crashes because the configuration cache is exhausted: specifically, no free memory chunk is large enough to fulfill the allocation request.  A log message suggested increasing CacheSize, but we were already at the 8GB limit.  I submitted a feature request, ZBXNEXT-5113, to have the limit raised and provided a proposed patch, requesting that it be reviewed for validity.  However, I had also noticed that the historical data for configuration cache % free showed 72% free.  The memory statistics logged with the error indicate that the memory is highly fragmented:

       

      memory of total size 12884901512 bytes fragmented into 22921433 chunks
      of those, 10416940744 bytes are in   195410 free chunks
      of those, 2101217856 bytes are in 22726023 used chunks
      

       

      I applied the patch I had proposed, rebuilt, increased the configuration cache size from 8GB to 12GB, and waited.  Previously, with CacheSize at 8GB, the Zabbix server crashed every 10-15 days (every 2-3 days when CacheSize was 4GB).  With 12GB, we went 21 days before a crash occurred (the log snippet above is from this crash).  I have now increased CacheSize to 16GB, but I doubt this is the best fix: most of the free configuration cache appears to be tied up in fragments too small to satisfy the request, which will inevitably lead to another out-of-memory crash.

      Is there something that can be done to improve memory allocation so it doesn't become so fragmented?  Thank you for your time.
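
      To make the fragmentation claim concrete, here is a rough sketch that derives a few indicators from the 12GB statistics quoted above. The helper name `summarize` and the dictionary keys are made up for illustration and are not part of Zabbix; the numbers are copied from the log lines in this report.

```python
# Figures copied from the 2019-04-04 (CacheSize 12GB) memory statistics.
STATS = {
    "total_bytes": 12884901512,
    "free_bytes": 10416940744,
    "free_chunks": 195410,
    "used_bytes": 2101217856,
    "used_chunks": 22726023,
    "requested_bytes": 12903760,     # "asked" / "requested" in the log
    "largest_free_bytes": 12862152,  # "max chunk size" in the log
}


def summarize(stats):
    """Derive simple fragmentation indicators (hypothetical helper, not Zabbix code)."""
    return {
        # Share of the cache that is free in total, regardless of chunk size.
        "free_pct": 100.0 * stats["free_bytes"] / stats["total_bytes"],
        # Mean size of a free chunk and of a used chunk.
        "avg_free_chunk_bytes": stats["free_bytes"] / stats["free_chunks"],
        "avg_used_chunk_bytes": stats["used_bytes"] / stats["used_chunks"],
        # How far the largest free chunk falls short of the failed request.
        "shortfall_bytes": stats["requested_bytes"] - stats["largest_free_bytes"],
    }


summary = summarize(STATS)
for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```

      With these figures, roughly 81% of the cache is free in total, yet the largest single free chunk is 41608 bytes smaller than the 12903760-byte request, which is why the allocation fails.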

      2019-01-24 Incident - cachesize 4GB (similar instances occurred on 2019-01-22, 2019-01-18, 2019-01-15)

       

        7054:20190124:185053.886 __mem_malloc: skipped 112009 asked 12903760 skip_min 256 skip_max 12628776
        7054:20190124:185053.886 [file:dbconfig.c,line:94] zbx_mem_realloc(): out of memory (requested 12903760 bytes)
        7054:20190124:185053.886 [file:dbconfig.c,line:94] zbx_mem_realloc(): please increase CacheSize configuration parameter
        7054:20190124:185053.886 === memory statistics for configuration cache ===
        7054:20190124:185053.886 free chunks of size     24 bytes:        3
        7054:20190124:185053.886 free chunks of size     48 bytes:        1
        7054:20190124:185053.886 free chunks of size     56 bytes:        1
        7054:20190124:185053.887 free chunks of size     64 bytes:        1
        7054:20190124:185053.887 free chunks of size     80 bytes:        2
        7054:20190124:185053.887 free chunks of size     96 bytes:     1614
        7054:20190124:185053.887 free chunks of size    104 bytes:        2
        7054:20190124:185053.908 free chunks of size >= 256 bytes:   112009
        7054:20190124:185053.908 min chunk size:         24 bytes
        7054:20190124:185053.908 max chunk size:   12628776 bytes
        7054:20190124:185053.908 memory of total size 4294966920 bytes fragmented into 14429213 chunks
        7054:20190124:185053.908 of those, 2736242192 bytes are in   113633 free chunks
        7054:20190124:185053.908 of those, 1327857336 bytes are in 14315580 used chunks
        7054:20190124:185053.908 ================================
        7054:20190124:185053.908 === Backtrace: ===
        7054:20190124:185053.909 13: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](zbx_backtrace+0x44) [0x5603ca76b9a2]
        7054:20190124:185053.909 12: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](__zbx_mem_realloc+0x14d) [0x5603ca7679b0]
        7054:20190124:185053.909 11: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](+0xc4eb6) [0x5603ca731eb6]
        7054:20190124:185053.909 10: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](+0x1009e9) [0x5603ca76d9e9]
        7054:20190124:185053.909 9: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](zbx_binary_heap_insert+0x69) [0x5603ca76df46]
        7054:20190124:185053.909 8: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](+0xd0549) [0x5603ca73d549]
        7054:20190124:185053.909 7: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](DCsync_configuration+0x1134) [0x5603ca73e762]
        7054:20190124:185053.909 6: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](dbconfig_thread+0x168) [0x5603ca6ad0de]
        7054:20190124:185053.909 5: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](zbx_thread_start+0x37) [0x5603ca7796cf]
        7054:20190124:185053.909 4: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](MAIN_ZABBIX_ENTRY+0x975) [0x5603ca6a4faa]
        7054:20190124:185053.909 3: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](daemon_start+0x315) [0x5603ca76b131]
        7054:20190124:185053.909 2: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](main+0x305) [0x5603ca6a461f]
        7054:20190124:185053.909 1: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f3be6306830]
        7054:20190124:185053.909 0: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1548380855.229265 sec, syncing configuration](_start+0x29) [0x5603ca6a37e9]
        7052:20190124:185054.437 One child process died (PID:7054,exitcode/signal:1). Exiting ...
      

       

       

      2019-03-14 Incident - cachesize 8GB (similar instances occurred on 2019-02-28, 2019-02-19, 2019-02-05)

       

        1868:20190314:034026.792 __mem_malloc: skipped 48828 asked 12903760 skip_min 256 skip_max 12857992
        1868:20190314:034026.792 [file:dbconfig.c,line:94] zbx_mem_realloc(): out of memory (requested 12903760 bytes)
        1868:20190314:034026.792 [file:dbconfig.c,line:94] zbx_mem_realloc(): please increase CacheSize configuration parameter
        1868:20190314:034026.792 === memory statistics for configuration cache ===
        1868:20190314:034026.792 free chunks of size     32 bytes:        8
        1868:20190314:034026.792 free chunks of size     48 bytes:        8
        1868:20190314:034026.792 free chunks of size     80 bytes:        1
        1868:20190314:034026.793 free chunks of size     96 bytes:     1592
        1868:20190314:034026.793 free chunks of size    104 bytes:        3
        1868:20190314:034026.793 free chunks of size    112 bytes:        1
        1868:20190314:034026.805 free chunks of size >= 256 bytes:    48828
        1868:20190314:034026.805 min chunk size:         32 bytes
        1868:20190314:034026.805 max chunk size:   12857992 bytes
        1868:20190314:034026.805 memory of total size 8589934216 bytes fragmented into 22273991 chunks
        1868:20190314:034026.805 of those, 6183292960 bytes are in    50441 free chunks
        1868:20190314:034026.805 of those, 2050257416 bytes are in 22223550 used chunks
        1868:20190314:034026.805 ================================
        1868:20190314:034026.805 === Backtrace: ===
        1868:20190314:034026.811 13: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](zbx_backtrace+0x44) [0x55b0add47ec6]
        1868:20190314:034026.811 12: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](__zbx_mem_realloc+0x14d) [0x55b0add43e09]
        1868:20190314:034026.811 11: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](+0xc6335) [0x55b0add0e335]
        1868:20190314:034026.811 10: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](+0x101f0d) [0x55b0add49f0d]
        1868:20190314:034026.811 9: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](zbx_binary_heap_insert+0x69) [0x55b0add4a46a]
        1868:20190314:034026.811 8: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](+0xd19a2) [0x55b0add199a2]
        1868:20190314:034026.811 7: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](DCsync_configuration+0x1134) [0x55b0add1abbb]
        1868:20190314:034026.811 6: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](dbconfig_thread+0x168) [0x55b0adc8883e]
        1868:20190314:034026.811 5: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](zbx_thread_start+0x37) [0x55b0add55bf3]
        1868:20190314:034026.811 4: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](MAIN_ZABBIX_ENTRY+0x975) [0x55b0adc8070a]
        1868:20190314:034026.811 3: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](daemon_start+0x315) [0x55b0add47655]
        1868:20190314:034026.811 2: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](main+0x305) [0x55b0adc7fd7f]
        1868:20190314:034026.811 1: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f22d47a3830]
        1868:20190314:034026.811 0: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1552556048.314516 sec, syncing configuration](_start+0x29) [0x55b0adc7ef49]
        1488:20190314:034027.817 One child process died (PID:1868,exitcode/signal:1). Exiting ...
      

       

      2019-04-04 Incident - cachesize 12GB

        2288:20190404:095916.682 __mem_malloc: skipped 193611 asked 12903760 skip_min 256 skip_max 12862152
        2288:20190404:095916.682 [file:dbconfig.c,line:94] zbx_mem_realloc(): out of memory (requested 12903760 bytes)
        2288:20190404:095916.682 [file:dbconfig.c,line:94] zbx_mem_realloc(): please increase CacheSize configuration parameter
        2288:20190404:095916.682 === memory statistics for configuration cache ===
        2288:20190404:095916.682 free chunks of size     32 bytes:       12
        2288:20190404:095916.682 free chunks of size     40 bytes:        2
        2288:20190404:095916.682 free chunks of size     48 bytes:        4
        2288:20190404:095916.682 free chunks of size     56 bytes:        1
        2288:20190404:095916.682 free chunks of size     80 bytes:        2
        2288:20190404:095916.682 free chunks of size     96 bytes:     1777
        2288:20190404:095916.682 free chunks of size    112 bytes:        1
        2288:20190404:095916.725 free chunks of size >= 256 bytes:   193611
        2288:20190404:095916.725 min chunk size:         32 bytes
        2288:20190404:095916.725 max chunk size:   12862152 bytes
        2288:20190404:095916.725 memory of total size 12884901512 bytes fragmented into 22921433 chunks
        2288:20190404:095916.725 of those, 10416940744 bytes are in   195410 free chunks
        2288:20190404:095916.725 of those, 2101217856 bytes are in 22726023 used chunks
        2288:20190404:095916.725 ================================
        2288:20190404:095916.725 === Backtrace: ===
        2288:20190404:095916.726 11: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](zbx_backtrace+0x3c) [0x4a0b7c]
        2288:20190404:095916.726 10: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](__zbx_mem_realloc+0x427) [0x49de97]
        2288:20190404:095916.727 9: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](zbx_binary_heap_insert+0xac) [0x4a23fc]
        2288:20190404:095916.727 8: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration]() [0x41dc8b]
        2288:20190404:095916.727 7: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](DCsync_configuration+0x8f1) [0x482751]
        2288:20190404:095916.727 6: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](dbconfig_thread+0xfe) [0x428fee]
        2288:20190404:095916.727 5: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](zbx_thread_start+0x3e) [0x4aa09e]
        2288:20190404:095916.727 4: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](MAIN_ZABBIX_ENTRY+0x6e6) [0x423f36]
        2288:20190404:095916.727 3: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](daemon_start+0x1bb) [0x4a060b]
        2288:20190404:095916.727 2: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](main+0x350) [0x423020]
        2288:20190404:095916.727 1: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fe1aeef0830]
        2288:20190404:095916.727 0: /usr/sbin/zabbix_server: configuration syncer [synced configuration in 1554393163.208568 sec, syncing configuration](_start+0x29) [0x4233a9]
        1497:20190404:095917.844 One child process died (PID:2288,exitcode/signal:1). Exiting ...
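
      For comparison across the three incidents, a small sketch that parses the `__mem_malloc: skipped ...` line of each crash and reports how far the largest free chunk fell short of the request. The regular expression and the function name `parse_failed_alloc` are assumptions for illustration. The reading of the fields is inferred only from the logs above, where `skipped` equals the count of free chunks >= 256 bytes and `skip_max` equals the reported max chunk size.

```python
import re

# Pattern for the allocator diagnostic line as it appears in the logs above.
MALLOC_RE = re.compile(
    r"__mem_malloc: skipped (\d+) asked (\d+) skip_min (\d+) skip_max (\d+)"
)

# One diagnostic line per incident, copied from this report.
LOG_LINES = [
    "7054:20190124:185053.886 __mem_malloc: skipped 112009 asked 12903760 skip_min 256 skip_max 12628776",
    "1868:20190314:034026.792 __mem_malloc: skipped 48828 asked 12903760 skip_min 256 skip_max 12857992",
    "2288:20190404:095916.682 __mem_malloc: skipped 193611 asked 12903760 skip_min 256 skip_max 12862152",
]


def parse_failed_alloc(line):
    """Return the parsed fields of a failed allocation line, or None if no match."""
    match = MALLOC_RE.search(line)
    if match is None:
        return None
    skipped, asked, skip_min, skip_max = map(int, match.groups())
    return {
        "skipped": skipped,
        "asked": asked,
        "skip_min": skip_min,
        "skip_max": skip_max,
        # Bytes by which the largest skipped chunk was too small.
        "shortfall": asked - skip_max,
    }


results = [parse_failed_alloc(line) for line in LOG_LINES]
for result in results:
    print(result)
```

      In all three crashes the request is the same 12903760 bytes, and the largest free chunk misses it by a comparatively small margin, even though gigabytes are free overall.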
      

       

            People

            Assignee:
            vso Vladislavs Sokurenko
            Reporter:
            brian.lloyd Brian Lloyd