[ZBX-23076] Zabbix Agent2 high memory usage Created: 2023 Jul 09 Updated: 2025 Mar 19 Resolved: 2023 Aug 18 |
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G) |
Affects Version/s: | 6.4.4 |
Fix Version/s: | 6.0.22rc1, 6.4.7rc1, 7.0.0alpha4, 7.0 (plan) |
Type: | Problem report | Priority: | Blocker |
Reporter: | Alpha | Assignee: | Eriks Sneiders |
Resolution: | Duplicate | Votes: | 2 |
Labels: | agent | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Debian 12 latest |
Attachments: | 10 attachments (not reproduced in this export) |
Sprint: | Sprint 103 (Aug 2023) |
Description |
Result:

● zabbix-agent2.service - Zabbix Agent 2
     Loaded: loaded (/lib/systemd/system/zabbix-agent2.service; enabled; preset: enabled)
     Active: active (running) since Sat 2023-07-08 11:25:26 HKT; 23h ago
   Main PID: 653408 (zabbix_agent2)
      Tasks: 8 (limit: 2322)
     Memory: 553.4M
        CPU: 10min 33.137s
     CGroup: /system.slice/zabbix-agent2.service
             └─653408 /usr/sbin/zabbix_agent2 -c /etc/zabbix/zabbix_agent2.conf

I recently updated to version 6.4.4, and the agent now uses a lot of memory. I can't find the cause of the problem, and I haven't changed the configuration file since Zabbix 6.4.0. Even if I restart the Zabbix service, memory usage climbs again very quickly. I have this problem on more than one server.

OS version: Debian 12 |
Comments |
Comment by Alpha [ 2023 Jul 09 ] |
Additional information: I am currently not using any plugins; I just installed zabbix-agent2. The last agent2 version I used was 6.4.3, and it didn't have this problem. |
Comment by Edgar Akhmetshin [ 2023 Jul 10 ] |
Hello,

Please attach to the issue, as a text file, the output of:

    zabbix_agent2 -R metrics
    cat /proc/653408/smaps

where 653408 is the PID from the example you gave above; if it has changed, find it first with ps or pgrep.

Regards, |
Comment by Alpha [ 2023 Jul 10 ] |
metrics and smaps: |
Comment by Edgar Akhmetshin [ 2023 Jul 10 ] |
7ffa46926000-7ffa56e9f000 ---p 00000000 00:00 0
Size:             267748 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                   0 kB
Pss:                   0 kB
Pss_Dirty:             0 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:            0 kB
Anonymous:             0 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
FilePmdMapped:         0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    1
VmFlags: mr mw me sd

Please install GDB and execute:

    gdb --pid 653408
    dump memory /tmp/ZBX-23076.mem.dump 0x7ffa46926000 0x7ffa56e9f000

Also provide:

    cat /proc/653408/maps

Compress and attach to the issue '/tmp/ |
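A smaps file has one header line per mapping followed by attribute lines, so the large mappings can be picked out mechanically. The following is a minimal sketch, not part of the ticket: the type and function names are made up for illustration, the first sample entry is the mapping quoted above (trimmed to three attributes), and the second entry is invented. It shows that the quoted region has a Size of 267748 kB but an Rss of 0 kB, so Size and Rss need to be read separately.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// Mapping holds the smaps fields this sketch cares about (hypothetical type).
type Mapping struct {
	Range  string // address range, e.g. "7ffa46926000-7ffa56e9f000"
	Perms  string // permission flags, e.g. "---p"
	SizeKB int64
	RssKB  int64
}

// sample: the first entry is taken from the ticket, the second is invented.
const sample = `7ffa46926000-7ffa56e9f000 ---p 00000000 00:00 0
Size:             267748 kB
Rss:                   0 kB
VmFlags: mr mw me sd
c000000000-c000400000 rw-p 00000000 00:00 0
Size:               4096 kB
Rss:                2048 kB
VmFlags: rd wr mr mw me ac sd
`

// parseSmaps reads smaps-formatted text and returns one Mapping per region.
func parseSmaps(text string) []Mapping {
	var out []Mapping
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		// Header lines look like "start-end perms ..."; attribute keys end in ":".
		if !strings.HasSuffix(fields[0], ":") && strings.Contains(fields[0], "-") {
			out = append(out, Mapping{Range: fields[0], Perms: fields[1]})
			continue
		}
		if len(out) == 0 {
			continue
		}
		cur := &out[len(out)-1]
		switch fields[0] {
		case "Size:":
			cur.SizeKB, _ = strconv.ParseInt(fields[1], 10, 64)
		case "Rss:":
			cur.RssKB, _ = strconv.ParseInt(fields[1], 10, 64)
		}
	}
	return out
}

func main() {
	var size, rss int64
	for _, m := range parseSmaps(sample) {
		fmt.Printf("%-28s %s size=%d kB rss=%d kB\n", m.Range, m.Perms, m.SizeKB, m.RssKB)
		size += m.SizeKB
		rss += m.RssKB
	}
	fmt.Printf("total size=%d kB, total rss=%d kB\n", size, rss)
}
```

Against a live process, the same function could be fed the contents of /proc/<pid>/smaps read with os.ReadFile.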
Comment by Alpha [ 2023 Jul 11 ] |
cat /proc/79151/maps:

GNU gdb (Debian 13.1-3) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 79151
[New LWP 79152]
[New LWP 79153]
[New LWP 79154]
[New LWP 79155]
[New LWP 79164]
[New LWP 79171]
[New LWP 79607]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x000000000048d883 in ?? ()
(gdb) dump memory /tmp/ZBX-23076.mem.dump 0x7ffa46926000 0x7ffa56e9f000
(gdb) exit |
Comment by Edgar Akhmetshin [ 2023 Jul 11 ] |
Thank you. Also is it possible to share the configuration file? |
Comment by Alpha [ 2023 Jul 11 ] |
Already shared, please check the attachment. zabbix_agent2.conf |
Comment by Edgar Akhmetshin [ 2023 Jul 15 ] |
Apply this patch:

Build Agent 2 on this system, then execute:

    go tool pprof http://localhost:8080/debug/pprof/heap
    go tool pprof http://localhost:8080/debug/pprof/allocs |
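The patch itself is not included in this export. As a rough idea of what makes the two URLs above available in a Go program, here is a minimal sketch using the standard net/http/pprof package. It is an assumption that the patch does something similar; the function names newPprofMux and probe are invented for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newPprofMux returns a mux that serves the standard pprof endpoints.
func newPprofMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index) // also serves heap, allocs, goroutine, ...
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// probe issues an in-process GET against h and returns the HTTP status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	// A long-running process would instead start a listener, for example:
	//   go http.ListenAndServe("localhost:8080", newPprofMux())
	// Here the handlers are probed in-process so that the sketch terminates.
	for _, p := range []string{"/debug/pprof/heap", "/debug/pprof/allocs"} {
		fmt.Println(p, "->", probe(newPprofMux(), p))
	}
}
```

Binding the listener to localhost rather than all interfaces keeps the profiling endpoints off the network, which matters on a monitored production host.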
Comment by Alpha [ 2023 Jul 15 ] |
    ./configure --enable-agent2 --enable-ipv6 --with-libcurl --with-libxml2 --with-openssl
    make -j4
    cp src/go/bin/zabbix_agent2 /usr/sbin/zabbix_agent2

Executing the commands generates these two files:

    /root/pprof/pprof.zabbix_agent2.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz
    /root/pprof/pprof.zabbix_agent2.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz

Do you need both files?
|
Comment by João Pedro Dezembro [ 2023 Jul 19 ] |
Same problem here. RAM usage starts at 22 MB and climbs until it reaches 100% of the memory available on the server. |
Comment by Jürgen Narits [ 2023 Jul 19 ] |
Same issue here too, on 6.0.19. |
Comment by Edgar Akhmetshin [ 2023 Jul 31 ] |
Kane, yes, please. You can share the files via a public cloud service if their size exceeds 15 MB. |
Comment by Alpha [ 2023 Jul 31 ] |
pprof.zabbix_agent2.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz
pprof.zabbix_agent2.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz

@edgar.akhmetshin Can you test it yourself on a VM? This is a common problem that you can easily notice. You can locate the cause of the problem faster. |
Comment by Edgar Akhmetshin [ 2023 Aug 01 ] |
Thank you, we will check. |
Comment by Edgar Akhmetshin [ 2023 Aug 01 ] |
Hello Kane,

Is it possible to run the pprof-enabled build for a longer period (wait until usage reaches about 500 MB, then run the go tool commands again to collect the files)? Unfortunately, we are not able to reproduce the issue locally.

Currently this is all we can see from the debug data provided:

File: zabbix_agent2
Build ID: 42d6d7b71890141fad6b4bf9aee06eceb70766e1
Type: alloc_space
Time: Jul 15, 2023 at 2:32pm (UTC)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 31737.49kB, 89.85% of 35322.17kB total
Showing top 10 nodes out of 92
      flat  flat%   sum%        cum   cum%
22564.65kB 63.88% 63.88% 27036.29kB 76.54%  compress/flate.NewWriter
 3821.02kB 10.82% 74.70%  3821.02kB 10.82%  compress/flate.(*compressor).initDeflate (inline)
 1056.33kB  2.99% 77.69%  1056.33kB  2.99%  compress/flate.(*dictDecoder).init (inline)
    1026kB  2.90% 80.60%     1026kB  2.90%  zabbix.com/plugins/proc.read2k
  650.62kB  1.84% 82.44%  4471.64kB 12.66%  compress/flate.(*compressor).init
  544.67kB  1.54% 83.98%   544.67kB  1.54%  net.open
  528.17kB  1.50% 85.47%   528.17kB  1.50%  regexp.(*bitState).reset
  519.03kB  1.47% 86.94%   519.03kB  1.47%  sync.(*Map).dirtyLocked
     514kB  1.46% 88.40%      514kB  1.46%  bufio.(*Scanner).Scan
     513kB  1.45% 89.85%  1569.33kB  4.44%  zabbix.com/pkg/zbxcomms.(*Connection).read

Regards, |
Comment by Vladislavs Sokurenko [ 2023 Aug 01 ] |
This could be related; please try building with the latest Go version and see whether the issue persists. |
Comment by Edgar Akhmetshin [ 2023 Aug 01 ] |
The latest packages are already built with Go 1.20.6; please try the 6.4.5 package. |
Comment by Alpha [ 2023 Aug 01 ] |
@edgar.akhmetshin I've tried the latest version, 6.4.5, and the problem is not fixed. |
Comment by Edgar Akhmetshin [ 2023 Aug 02 ] |
Kane |
Comment by Johan Van de Wauw [ 2023 Aug 02 ] |
A question as a user, what would be the best way to mitigate this issue? Installing 6.4.3? Rather than monitoring, zabbix-agent is now causing problems. |
Comment by Edgar Akhmetshin [ 2023 Aug 02 ] |
Hello johanvdw,

You can follow this: start such an Agent and then collect the pprof files for heap and allocs. This will help us find the root cause. Alternatively, maybe you can share the exact steps to reproduce the issue, including the templates and metrics to use.

Regards, |
Comment by Johan Van de Wauw [ 2023 Aug 03 ] |
I'll try applying the patch. We observe the issue on our Debian 11 servers. Servers were configured using the zabbix_agent Ansible role. We use TLS.

```
roles:
  - name: monitor in zabbix
```

I made the switch to both SSL and agent2 in the same run on those hosts, so it's hard to isolate the cause. The installed zabbix-agent version was 6.4.4. |
Comment by Alpha [ 2023 Aug 04 ] |
root@localhost:~# go tool pprof http://localhost:8080/debug/pprof/heap
Fetching profile over HTTP from http://localhost:8080/debug/pprof/heap
Saved profile in /root/pprof/pprof.zabbix_agent2.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz
File: zabbix_agent2
Build ID: 42d6d7b71890141fad6b4bf9aee06eceb70766e1
Type: inuse_space
Time: Aug 5, 2023 at 12:41am (HKT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 1418.60kB, 100% of 1418.60kB total
Showing top 10 nodes out of 18
      flat  flat%   sum%        cum   cum%
  902.59kB 63.63% 63.63%   902.59kB 63.63%  compress/flate.NewWriter
  516.01kB 36.37%   100%   516.01kB 36.37%  reflect.addReflectOff
         0     0%   100%   902.59kB 63.63%  compress/flate.NewWriterDict
         0     0%   100%   902.59kB 63.63%  compress/zlib.(*Writer).Write
         0     0%   100%   902.59kB 63.63%  compress/zlib.(*Writer).writeHeader
         0     0%   100%   516.01kB 36.37%  git.zabbix.com/ap/plugin-support/plugin.RegisterMetrics
         0     0%   100%   516.01kB 36.37%  git.zabbix.com/ap/plugin-support/plugin.registerMetric
         0     0%   100%   516.01kB 36.37%  reflect.(*rtype).Method
         0     0%   100%   516.01kB 36.37%  reflect.FuncOf
         0     0%   100%   516.01kB 36.37%  reflect.resolveReflectName (inline)
(pprof) exit

root@localhost:~# go tool pprof http://localhost:8080/debug/pprof/allocs
Fetching profile over HTTP from http://localhost:8080/debug/pprof/allocs
Saved profile in /root/pprof/pprof.zabbix_agent2.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz
File: zabbix_agent2
Build ID: 42d6d7b71890141fad6b4bf9aee06eceb70766e1
Type: alloc_space
Time: Aug 5, 2023 at 12:41am (HKT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 31.41GB, 92.34% of 34.01GB total
Dropped 296 nodes (cum <= 0.17GB)
Showing top 10 nodes out of 55
      flat  flat%   sum%        cum   cum%
   21.43GB 62.99% 62.99%    26.15GB 76.89%  compress/flate.NewWriter
    4.58GB 13.47% 76.46%     4.58GB 13.47%  compress/flate.(*compressor).initDeflate (inline)
    1.49GB  4.39% 80.85%     1.49GB  4.39%  bytes.growSlice
    0.98GB  2.87% 83.72%     0.98GB  2.87%  compress/flate.(*dictDecoder).init (inline)
    0.88GB  2.60% 86.32%     0.88GB  2.60%  bufio.(*Scanner).Scan
    0.55GB  1.62% 87.94%     0.58GB  1.69%  zabbix.com/pkg/tls.NewServer
    0.49GB  1.43% 89.37%     0.49GB  1.44%  zabbix.com/plugins/proc.read2k
    0.40GB  1.19% 90.56%     0.40GB  1.19%  net.dnsPacketRoundTrip
    0.34GB  0.99% 91.55%     1.56GB  4.59%  zabbix.com/pkg/zbxcomms.(*Connection).read
    0.27GB  0.79% 92.34%     0.93GB  2.73%  zabbix.com/plugins/proc.getProcessState
(pprof) exit

new pprof:

@edgar.akhmetshin The problem can be reproduced on any Debian server, but it takes a long time for the memory leak to become visible. Please consider raising the priority; this issue is not trivial. |
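The two profiles above measure different things: inuse_space (1418.60kB here) is heap memory still live when the sample was taken, while alloc_space (34.01GB here) is the cumulative total allocated since the process started, including memory already freed. The sketch below illustrates that difference with runtime.MemStats. It does not reproduce or diagnose the agent's behaviour, and the helper names churn and measure are invented for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink is package-level so the compiler cannot optimise the allocations away.
var sink []byte

// churn allocates n short-lived buffers of size bytes each and retains none.
func churn(n, size int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, size)
	}
	sink = nil
}

// measure forces a GC, then returns the cumulative bytes allocated so far
// (comparable to alloc_space) and the live heap bytes (comparable to inuse_space).
func measure() (total, live uint64) {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.TotalAlloc, m.HeapAlloc
}

func main() {
	t0, _ := measure()
	churn(1000, 1<<20) // about 1 GiB allocated in total, nothing kept
	t1, live := measure()
	fmt.Printf("allocated during churn: %d MiB, live after GC: %d MiB\n",
		(t1-t0)>>20, live>>20)
}
```

A large cumulative figure with a small live figure therefore indicates heavy allocation churn rather than memory retained on the Go heap.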
Comment by Vladislavs Sokurenko [ 2023 Aug 07 ] |
Total: 34.01GB
ROUTINE ======================== compress/flate.NewWriter in /usr/local/go/src/compress/flate/deflate.go
   21.43GB    26.15GB (flat, cum) 76.89% of Total
         .          .    665:func NewWriter(w io.Writer, level int) (*Writer, error) {
   21.43GB    21.43GB    666:	var dw Writer
         .     4.73GB    667:	if err := dw.d.init(w, level); err != nil {
         .          .    668:		return nil, err
         .          .    669:	}
         .          .    670:	return &dw, nil
         .          .    671:}
         .          .    672:
ROUTINE ======================== compress/flate.NewWriterDict in /usr/local/go/src/compress/flate/deflate.go
       1MB    26.14GB (flat, cum) 76.86% of Total
         .          .    679:func NewWriterDict(w io.Writer, level int, dict []byte) (*Writer, error) {
       1MB        1MB    680:	dw := &dictWriter{w}
         .    26.14GB    681:	zw, err := NewWriter(dw, level)
         .          .    682:	if err != nil {
         .          .    683:		return nil, err
         .          .    684:	}
         .          .    685:	zw.d.fillWindow(dict)
         .          .    686:	zw.dict = append(zw.dict, dict...) // duplicate dictionary for Reset method. |
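The listing attributes the allocations to the `var dw Writer` line, which suggests that a new compressor is being created for each compression call. As an illustration only, and not the fix adopted for this ticket, the sketch below compares creating a zlib.Writer per payload with reusing one writer through its Reset method. The names compressFresh, reusingCompressor and bytesAllocated are invented for this example, and the payload is made up.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
	"runtime"
)

// compressFresh creates a new zlib.Writer for every payload, so each call
// allocates a new flate compressor on its first Write.
func compressFresh(payload []byte) []byte {
	var buf bytes.Buffer
	zw := zlib.NewWriter(&buf)
	zw.Write(payload) // errors ignored: writes to bytes.Buffer do not fail
	zw.Close()
	return buf.Bytes()
}

// reusingCompressor keeps one zlib.Writer and resets it between payloads.
// It is not safe for concurrent use.
type reusingCompressor struct {
	zw *zlib.Writer
}

func (c *reusingCompressor) compress(payload []byte) []byte {
	var buf bytes.Buffer
	if c.zw == nil {
		c.zw = zlib.NewWriter(&buf)
	} else {
		c.zw.Reset(&buf) // keeps the already allocated compressor state
	}
	c.zw.Write(payload)
	c.zw.Close()
	return buf.Bytes()
}

// decompress is used only to check that both variants produce valid output.
func decompress(data []byte) ([]byte, error) {
	zr, err := zlib.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// bytesAllocated reports the cumulative heap bytes allocated while f runs.
func bytesAllocated(f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc
}

func main() {
	payload := bytes.Repeat([]byte("zabbix agent data "), 64)
	const rounds = 100

	fresh := bytesAllocated(func() {
		for i := 0; i < rounds; i++ {
			compressFresh(payload)
		}
	})
	var rc reusingCompressor
	reused := bytesAllocated(func() {
		for i := 0; i < rounds; i++ {
			rc.compress(payload)
		}
	})
	fmt.Printf("new writer per payload: %d kB allocated\n", fresh/1024)
	fmt.Printf("one writer with Reset:  %d kB allocated\n", reused/1024)
}
```

Note that this kind of churn raises alloc_space and garbage-collector load; whether it also explains the resident memory growth reported in this ticket is a separate question.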
Comment by Vladislavs Sokurenko [ 2023 Aug 08 ] |
Is the issue reproducible when monitoring logs, when using global regular expressions, or perhaps when TLS is used? |
Comment by João Pedro Dezembro [ 2023 Aug 08 ] |
On my side, I can reproduce only when I enable an item that uses log or log.count. I use regex on those items to identify some patterns in the logs. After turning on these items, in less than 48h I already have almost 90% of RAM in use (versus 50% on average, when turned off) |
Comment by Vladislavs Sokurenko [ 2023 Aug 08 ] |
Could you please tell us whether there is anything unusual about those items, or at least whether they use a global regular expression or an ordinary one? If possible, please also share the regexp. |
Comment by João Pedro Dezembro [ 2023 Aug 08 ] |
I am using the already compiled version 6.4.4 agent2, available here: https://repo.zabbix.com/zabbix/6.4/ubuntu/pool/main/z/zabbix-release/(...).deb

ldd output:

    linux-vdso.so.1 (0x00007ffd27bdd000)

About the regex, nothing out of the ordinary; it just filters by log severity. Example:

    log[/home/ubuntu/.pm2/logs/api-out.log,.*critical 20.*,,10,all]

My host template has 8 log items like the example above. All follow the same regex pattern, changing only the keyword (or omitting it, with "ALL"). |
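In the item key above, the second parameter is the regular expression applied to each log line. The sketch below shows which lines that pattern selects, using Go's regexp package purely for illustration. The agent's own regex engine may differ, and the sample log lines and the function name matching are invented for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern is the second parameter of the log[] item key quoted above.
var pattern = regexp.MustCompile(`.*critical 20.*`)

// matching returns the lines that the pattern matches.
func matching(lines []string) []string {
	var out []string
	for _, l := range lines {
		if pattern.MatchString(l) {
			out = append(out, l)
		}
	}
	return out
}

func main() {
	// Hypothetical log lines, invented for this sketch.
	lines := []string{
		"api critical 2023-08-08 payment failed",
		"api info 2023-08-08 request served",
		"api critical 1999-01-01 old entry",
	}
	for _, l := range matching(lines) {
		fmt.Println(l)
	}
}
```

Because the match is unanchored, the leading and trailing `.*` do not change which lines are selected; `critical 20` alone would pick the same lines in this sketch.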
Comment by Vladislavs Sokurenko [ 2023 Aug 08 ] |
Also please see: |
Comment by Vladislavs Sokurenko [ 2023 Aug 08 ] |
Is encryption used? Could that be the cause? |
Comment by Aleksandre Sebiskveradze [ 2023 Aug 18 ] |
Duplicates https://support.zabbix.com/browse/ZBX-23221 |