Zabbix Integration with Kubernetes (ZBXNEXT-4635)

[ZBXNEXT-6932] Kubernetes Nodes Created: 2021 Sep 22  Updated: 2022 Mar 15  Resolved: 2022 Jan 31

Status: Closed
Project: ZABBIX FEATURE REQUESTS
Component/s: Templates (T)
Affects Version/s: None
Fix Version/s: 6.0.0rc1, 6.0 (plan)

Type: Specification change (Sub-task) Priority: Trivial
Reporter: Aleksandrs Larionovs (Inactive) Assignee: Anton Fayantsev (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File kube_get_node.json    
Team: Team INT
Sprint: Sprint 58 (Nov 2019), Sprint 59 (Dec 2019), Sprint 60 (Jan 2020), Sprint 61 (Feb 2020), Sprint 62 (Mar 2020), Sprint 63 (Apr 2020), Sprint 64 (May 2020), Sprint 65 (Jun 2020), Sprint 66 (Jul 2020), Sprint 67 (Aug 2020), Sprint 68 (Sep 2020), Sprint 69 (Oct 2020), Sprint 70 (Nov 2020), Sprint 71 (Dec 2020), Sprint 72 (Jan 2021), Sprint 73 (Feb 2021), Sprint 74 (Mar 2021), Sprint 75 (Apr 2021), Sprint 76 (May 2021), Sprint 77 (Jun 2021), Sprint 78 (Jul 2021), Sprint 79 (Aug 2021), Sprint 80 (Sep 2021), Sprint 81 (Oct 2021), Sprint 82 (Nov 2021), Sprint 83 (Dec 2021), Sprint 84 (Jan 2022)
Story Points: 3

 Comments   
Comment by Dimitri Bellini [ 2022 Jan 25 ]

I have tested the new Kubernetes template of Zabbix 6.0beta3 on a k3s v1.22.5 cluster, but after applying "Kubernetes nodes by HTTP" to my "host" I get this error:

182:20220125:144655.726 [ Kubernetes ] ERROR: Request failed with status code 403: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"forbidden: User \"system:serviceaccount:monitoring:zabbix-service-account\" cannot get path \"/v1/nodes\"","reason":"Forbidden","details":{},"code":403}

I have tried to modify the RBAC configuration but without success...
Thanks for any help
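
A quick way to narrow this down (a sketch; the service account and namespace are taken from the 403 message above, and the token/IP are placeholders) is to ask the API server whether that account may read nodes, and to call the full resource path directly:

# check RBAC for the service account named in the 403 message
kubectl auth can-i get nodes --as=system:serviceaccount:monitoring:zabbix-service-account

# the 403 complains about "/v1/nodes" (no /api prefix); the real resource path is /api/v1/nodes
curl -k -H "Authorization: Bearer <YOUR_TOKEN_HERE>" https://<YOUR_K3S_CLUSTER_IP>:6443/api/v1/nodes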

Comment by Yulia Chukina (Inactive) [ 2022 Feb 09 ]

dimitri.bellini Hi, I've just tested the helm chart on k3s 1.22.5 (Major:"1", Minor:"22", GitVersion:"v1.22.6+k3s1")
and everything works fine out of the box.

Please check that you are using the latest version of the Zabbix helm chart and the correct authorization token in the template.

You can get the token with this command:

kubectl get secret zabbix-service-account -n monitoring -o jsonpath='{.data.token}' | base64 -d

Also, you can check if the information about nodes is received correctly with curl:

 curl -k https://<YOUR_K3S_CLUSTER_IP>:6443/api/v1/nodes -H "Authorization: Bearer <YOUR_TOKEN_HERE>"


If the problem still persists, please show us the results of these commands:

# kubectl describe clusterrolebinding zabbix-zabbix-helm-chrt

# kubectl describe clusterrole zabbix-zabbix-helm-chrt
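
If those describes show that the role does not grant access to nodes, a minimal grant would look roughly like this (a sketch only; the role/binding names are placeholders, and the helm chart normally manages its own RBAC objects):

# allow the service account from the 403 message to read nodes (names are placeholders)
kubectl create clusterrole zabbix-nodes-read --verb=get,list,watch --resource=nodes
kubectl create clusterrolebinding zabbix-nodes-read --clusterrole=zabbix-nodes-read --serviceaccount=monitoring:zabbix-service-account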

Comment by Dimitri Bellini [ 2022 Feb 10 ]

@Yulia: Thanks so much, I have retested from scratch and now it works! Why?
It's easy (if you know..): the host macro {$KUBE.API.ENDPOINT} must also contain the "/api" path, i.e. "https://<YOUR_K3S_CLUSTER_IP>:6443/api", as in this example: "https://192.168.0.217:6443/api".
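
For illustration (a sketch; token and IP are placeholders, and the assumption is that the template appends "/v1/nodes" to the macro value), the difference can be seen with curl:

# macro without /api -> the request hits /v1/nodes and the API server answers 403 Forbidden
curl -k -H "Authorization: Bearer <YOUR_TOKEN_HERE>" https://192.168.0.217:6443/v1/nodes

# macro with /api -> the request hits /api/v1/nodes and returns the node list
curl -k -H "Authorization: Bearer <YOUR_TOKEN_HERE>" https://192.168.0.217:6443/api/v1/nodes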
I also did not understand why by default the "Zabbix Proxy" deployed on Kube is in passive mode (as mentioned in the helm config) while the option value is "0", which I supposed means ACTIVE...
At the moment the item "Kubernetes: Get nodes" seems to work, but the dependent "Kubernetes: Get nodes: Node LLD" is not working; I have this error:

Preprocessing failed for: {"error":"Request failed with status code 403: {\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metad...
1. Failed: Discovery error: Incorrect JSON. Check debug log for more information.

Does everything seem to work in your test environment?
Thanks so much

Comment by Richard Ostrochovský [ 2022 Feb 14 ]

ychukina / afayantsev, I found something: it seems that the /etc/passwd check in zabbix_agent2 (5.4.10) uses the file inside the container and not the one from the K8S node OS.

bash-5.1$ ls -al /etc/passwd
-rw-r--r-- 1 root root 1289 Feb  8 15:51 /etc/passwd

bash-5.1$ cat /etc/passwd
root:x:0:0:root:/root:/bin/ash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/mail:/sbin/nologin
news:x:9:13:news:/usr/lib/news:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucppublic:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
man:x:13:15:man:/usr/man:/sbin/nologin
postmaster:x:14:12:postmaster:/var/mail:/sbin/nologin
cron:x:16:16:cron:/var/spool/cron:/sbin/nologin
ftp:x:21:21::/var/lib/ftp:/sbin/nologin
sshd:x:22:22:sshd:/dev/null:/sbin/nologin
at:x:25:25:at:/var/spool/cron/atjobs:/sbin/nologin
squid:x:31:31:Squid:/var/cache/squid:/sbin/nologin
xfs:x:33:33:X Font Server:/etc/X11/fs:/sbin/nologin
games:x:35:35:games:/usr/games:/sbin/nologin
cyrus:x:85:12::/usr/cyrus:/sbin/nologin
vpopmail:x:89:89::/var/vpopmail:/sbin/nologin
ntp:x:123:123:NTP:/var/empty:/sbin/nologin
smmsp:x:209:209:smmsp:/var/spool/mqueue:/sbin/nologin
guest:x:405:100:guest:/dev/null:/sbin/nologin
nobody:x:65534:65534:nobody:/:/sbin/nologin
utmp:x:100:406:utmp:/home/utmp:/bin/false
zabbix:x:1997:1995:Zabbix monitoring system:/var/lib/zabbix/:/sbin/nologin 

Therefore, when helm redeployed the container/pod and replaced the older zabbix_agent2 image version with the new one, the "/etc/passwd has been changed" trigger fired.

I consider it a false alarm: it says nothing about the K8S node, only about the zabbix_agent2 container (and this file in the pod/container is of course updated whenever /etc/passwd changes in a new image).

(For example, for /etc/hosts this is not an issue, as it is managed by Kubernetes - host network.)
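
A quick way to confirm this behaviour (a sketch; the pod name is a placeholder, and the namespace and item key are assumptions based on the standard Linux-by-agent checks) is to test the item inside the agent container and compare it with the file on the node:

# test the checksum item inside the zabbix_agent2 container (pod name is a placeholder)
kubectl exec -n monitoring <zabbix-agent-pod> -- zabbix_agent2 -t 'vfs.file.cksum[/etc/passwd]'

# compare with the node's own file
cksum /etc/passwd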

This issue is not strictly related to K8S nodes monitoring, but to the zabbix_agent2 Docker images used in the helm chart. Should I/we create a separate Jira issue for it?

Thank you!

Comment by Dimitri Bellini [ 2022 Feb 15 ]

I forgot to attach the JSON output of the "Kubernetes: Get nodes" item (kube_get_node.json); the dependent item "Kubernetes: Get nodes: Node LLD" does not seem to work in my environment.
The error is:

Preprocessing failed for: {"error":"Request failed with status code 403: {\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metad...
1. Failed: Discovery error: Incorrect JSON. Check debug log for more information.

Thanks for any help

Update1: looking at the Zabbix server log, I can see this error:

1880406:20220215:104955.999 [ Kubernetes discovery ] Filtered node "k3s01.quadrata.it"
1880406:20220215:104955.999 [ Kubernetes discovery ] Node "k3s01.quadrata.it" is not included in the list of endpoint IPs

Where "k3s01.quadrata.it" is the only one node of my kube cluster

Comment by Richard Ostrochovský [ 2022 Feb 15 ]

dimitri.bellini, I was fixing something like this some time ago; maybe it is the same case?

After I removed the following condition from the JavaScript preprocessing, it started to work:

if (internalIP in input.endpointIPs) {

If I remember correctly, the issue was that endpointIPs was missing from the JSON input.

I know it is not an ultimate solution, just a workaround, and I hope to revisit and analyze it in more detail later, after upgrading to ZBX 6.0 (which will take some time, hopefully in 3/2022).

K8S version was: v1.20.8.

Comment by Dimitri Bellini [ 2022 Feb 15 ]

Hi Richard,
thank you for the suggestion, and yes, it is the same situation: the JSON does not contain the "endpointIPs" information.
Is this problem related to my use case (k3s), or is something wrong inside the official Kube template? Maybe this question is more for the Zabbix dev team.
Thanks so much

UPDATE: the real problem in my use case is that we do not have any "input.endpointIPs" at all, and I did not understand where the JS collects those values...
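
One way to see where those values would come from (a sketch; "zabbix-agent" is assumed to be the template's default value for {$KUBE.NODES.ENDPOINT.NAME} and "monitoring" an assumed namespace) is to query that endpoint directly:

# if nothing comes back, the LLD has no endpointIPs to match nodes against
kubectl get endpoints zabbix-agent -n monitoring -o jsonpath='{.subsets[*].addresses[*].ip}'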

Comment by Christian Anton [ 2022 Mar 15 ]

Hey!

Nice to see I am not the only one struggling, given that the documentation on this whole topic is quite ... "basic".

I have spent some hours debugging this same issue of the "Nodes LLD" item returning an empty array.

The reason was, as Dimitri noted, that the JSON produced by the "Get Nodes" item did not include any "endpointIPs". The reason for THAT is, indeed, that the "Nodes LLD" fetches endpoint IPs only for the endpoint whose name is given by the macro {$KUBE.NODES.ENDPOINT.NAME}, which is set to "zabbix-agent" by default inside the template. This default name is not consistent with the documented way to initialize this template, i.e. installing the Zabbix proxy via the helm chart:

kubectl get endpoints -A | grep zabbix

monitoring                 zabbix-kube-state-metrics                     10.42.230.39:8080                                                            26d
monitoring                 zabbix-zabbix-helm-chrt-agent                 192.168.213.29:10050,192.168.213.30:10050,192.168.213.31:10050               26d
monitoring                 zabbix-zabbix-helm-chrt-proxy                 10.42.230.38:10051                                                           26d


So the macro {$KUBE.NODES.ENDPOINT.NAME} should be set to "zabbix-zabbix-helm-chrt-agent" instead.
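
To double-check that this endpoint actually carries the node addresses the LLD filters on (a sketch; the release name and namespace are taken from the output above), compare the endpoint IPs with the nodes' InternalIPs:

# IPs behind the agent endpoint the macro now points at
kubectl get endpoints zabbix-zabbix-helm-chrt-agent -n monitoring -o jsonpath='{.subsets[*].addresses[*].ip}'

# nodes with their InternalIP for comparison
kubectl get nodes -o wide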

After this change, the "Nodes LLD" item receives correct data and the two LLD rules start to work:

{{[{"

{#NAME}":"ca-rke2-01","{#IP}":"192.168.213.29","{#ROLES}":"control-plane, etcd, master, worker","{#ARCH}":"amd64","{#OS}":"linux"},{"{#NAME}

":"ca-rke2-02","{#IP}":"192.168.213.30","{#ROLES}":"control-plane, etcd, master, worker","{#ARCH}":"amd64","{#OS}":"linux"},{"

{#NAME}

":"ca-rke2-03","{#IP}":"192.168.213.31","{#ROLES}":"control-plane, etcd, master, worker","{#ARCH}":"amd64","{#OS}":"linux"}]}}


Tested on a three-node RKE2 cluster, deployed and managed by Rancher.
