  1. ZABBIX BUGS AND ISSUES
  2. ZBX-7566

Zabbix 2.2.0 server crash - out of memory CacheSize


    • Type: Incident report
    • Resolution: Duplicate
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: 2.2.0
    • Component/s: Server (S)

      After enabling a host group (containing 40 hosts with 800 items each), the Zabbix server crashed. From zabbix_server.log:

      8063:20131218:111916.035 __mem_malloc: skipped 59 asked 40754872 skip_min 256 skip_max 36587592
      8063:20131218:111916.080 __mem_malloc: skipped 60 asked 40754872 skip_min 256 skip_max 36587592
      8063:20131218:111916.080 file:strpool.c,line:53 zbx_mem_realloc(): out of memory (requested 40754872 bytes)
      8063:20131218:111916.081 file:strpool.c,line:53 zbx_mem_realloc(): please increase CacheSize configuration parameter
      7941:20131218:111916.338 One child process died (PID:8063,exitcode/signal:255). Exiting ...
      7941:20131218:111921.624 syncing history data...
      7941:20131218:111921.624 syncing history data done
      7941:20131218:111921.624 syncing trends data...
      7941:20131218:112029.511 syncing trends data done
      7941:20131218:112029.512 Zabbix Server stopped. Zabbix 2.2.0 (revision 40163).
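
      The two __mem_malloc lines above can be read mechanically. The following is a minimal sketch, not part of the original report, that extracts the numeric fields and compares the requested size with skip_max. It assumes that skip_max is the size of the largest free chunk the allocator examined and rejected; the helper names parse_mem_malloc and shortfall are hypothetical.

```python
import re

# Pattern for the "__mem_malloc: skipped ..." lines quoted in the report.
MALLOC_RE = re.compile(
    r"__mem_malloc: skipped (?P<skipped>\d+) asked (?P<asked>\d+) "
    r"skip_min (?P<skip_min>\d+) skip_max (?P<skip_max>\d+)"
)


def parse_mem_malloc(line):
    """Return the numeric fields of a __mem_malloc log line, or None."""
    match = MALLOC_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def shortfall(fields):
    """Bytes by which the request exceeds the largest skipped chunk.

    Assumption: skip_max is the largest free chunk that was considered.
    """
    return fields["asked"] - fields["skip_max"]


line = (
    "8063:20131218:111916.080 __mem_malloc: skipped 60 asked 40754872 "
    "skip_min 256 skip_max 36587592"
)
fields = parse_mem_malloc(line)
print(fields)
print("shortfall in bytes:", shortfall(fields))
```

      Under that assumption, a positive shortfall means no single free chunk was large enough for the request, regardless of how much memory was free in total.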

      CacheSize is already at the highest value we could set in zabbix_server.conf: CacheSize=2047M works, whereas CacheSize=2G does not.

      Current cache settings:
      CacheSize=2047M
      HistoryCacheSize=2047M
      TrendCacheSize=2047M
      HistoryTextCacheSize=2047M
      ValueCacheSize=60G

      We have numerous host groups with the same number of hosts and items (the crash happened while we were bringing a new host group online), so we have a good understanding of the impact each added host group has on CacheSize. This is based on the internal configuration cache size item, which showed 42% of the configuration cache free prior to the crash. We did not run out of memory at the OS level.
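
      To put the figures above in proportion, here is a rough back-of-the-envelope sketch, not part of the original report. It assumes the "M" suffix means MiB and that the 42% free figure applies to the full 2047M configuration cache as measured before the new host group was enabled.

```python
MIB = 1024 * 1024

cache_size_bytes = 2047 * MIB   # CacheSize=2047M from the report (assumed MiB)
free_fraction = 0.42            # configuration cache free before the crash
asked_bytes = 40754872          # size of the failed request in the log

free_bytes = cache_size_bytes * free_fraction
ratio = asked_bytes / free_bytes

print(f"free before crash: {free_bytes / MIB:.0f} MiB")
print(f"failed request:    {asked_bytes / MIB:.1f} MiB")
print(f"request as share of free space: {ratio:.1%}")
```

      On these assumptions the failed request is a small fraction of the space reported free, which is consistent with the reporter's expectation that the cache should not have been exhausted.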

      We have checked the kernel memory settings and found nothing that looks like a problem, and the kernel logged nothing at the time of the crash.

        1. 4GB_cache.png
          204 kB
          Jeff C.
        2. cache.png
          22 kB
          Jeff C.
        3. configuration_cache.png
          25 kB
          Mike Davis
        4. crash_after_split.png
          71 kB
          Jeff C.
        5. memory.png
          41 kB
          Jeff C.
        6. nvps.png
          29 kB
          Jeff C.
        7. status.png
          27 kB
          Jeff C.

            Assignee: Unassigned
            Reporter: Jeff C. (tehsuq)
            Votes: 1
            Watchers: 3
