Change Request
Resolution: Unresolved
Medium
Hello,
Starting with the first part of the Kubernetes monitoring setup in Zabbix, the template documentation states:
# source https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/kubernetes_http?at=release/7.4
Templates are of two types:
Cluster node monitoring
Kubernetes nodes by HTTP template discovers cluster nodes, creates hosts in Zabbix based on prototypes and assigns the "Linux by Zabbix agent" template to them. The template collects basic node metrics via the Kubernetes API.
Main cluster components monitoring
Kubernetes cluster state by HTTP discovers cluster components and control plane nodes, creates Zabbix hosts and assigns the required templates to them.
This forces the user to create two separate hosts in Zabbix. If both templates are linked to a single host at once, an error is displayed, for example:
Cannot inherit LLD rules with key "kube.node.discovery" of both "Kubernetes nodes by HTTP" and "Kubernetes cluster state by HTTP" templates, because the key must be unique on host "K8s-master". [zabbix.php:17 → require_once() → ZBase->run() → ZBase->processRequest() → CController->run() → CControllerHostCreate->doAction() → CApiWrapper->__call() → CFrontendApiWrapper->callMethod() → CApiWrapper->callMethod() → CFrontendApiWrapper->callClientMethod() → CLocalApiClient->callMethod() → CHost->create() → CHostGeneral->updateTemplates() → CHostGeneral::linkTemplatesObjects() → CDiscoveryRule::linkTemplateObjects() → CDiscoveryRule::inherit() → CItemGeneral::checkDoubleInheritedNames() → CApiService::exception() in include/classes/api/services/CItemGeneral.php:689]
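The uniqueness rule behind this error can be illustrated with a minimal Python sketch. It assumes each template is represented simply as a set of LLD rule keys. The helper name `conflicting_lld_keys` and the `example.*` keys are hypothetical placeholders; only `kube.node.discovery` and the two template names come from the error message above. This is not how Zabbix implements the check, only an illustration of why linking both templates to one host fails.

```python
def conflicting_lld_keys(templates: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return LLD rule keys that are defined by more than one template."""
    owners: dict[str, list[str]] = {}
    for template_name, keys in templates.items():
        for key in keys:
            owners.setdefault(key, []).append(template_name)
    # A key is a conflict when at least two templates define it.
    return {key: sorted(names) for key, names in owners.items() if len(names) > 1}


# Hypothetical key sets: only "kube.node.discovery" is taken from the error above.
templates = {
    "Kubernetes nodes by HTTP": {"kube.node.discovery", "example.nodes.only"},
    "Kubernetes cluster state by HTTP": {"kube.node.discovery", "example.state.only"},
}

conflicts = conflicting_lld_keys(templates)
for key, names in conflicts.items():
    print(f'LLD rule key "{key}" is defined by: {", ".join(names)}')
```

Because both templates define the same discovery rule key, one host cannot inherit both unless one of the keys is renamed or the rule is shared.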
For ease of use, it would be much better to be able to configure this on a single Zabbix host.
Another issue is cluster node monitoring. If we follow the suggested setup, we end up with one host that has the "Kubernetes nodes by HTTP" template linked,
plus the discovered cluster nodes, each monitored with the "Linux by Zabbix agent" template.
The main issue is that the host with "Kubernetes nodes by HTTP" also contains items that monitor those same nodes.
Why store items that monitor the same node in a separate place, instead of at least adding those items to the newly created node hosts?
Another issue is "kubelet discovery": why does it create yet another host, instead of just creating items on the already created node host?
This hurts data clarity. On my test system, which has 3 nodes, I end up with:
- 3 discovered nodes,
- 3 discovered kubelets,
- 1 controller manager discovered,
- 1 API server discovered,
- 1 scheduler discovered
Architecture of the test Kubernetes cluster:
kubectl get nodes -o wide -n monitoring
NAME   STATUS   ROLES           AGE    VERSION    INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
k8s    Ready    control-plane   117d   v1.30.14   192.168.99.28   <none>        Ubuntu 24.04.2 LTS   6.8.0-85-generic   containerd://1.7.27
k8s2   Ready    <none>          117d   v1.30.14   192.168.99.27   <none>        Ubuntu 24.04.3 LTS   6.8.0-79-generic   containerd://1.7.27
k8s3   Ready    <none>          117d   v1.30.14   192.168.99.26   <none>        Ubuntu 24.04.3 LTS   6.8.0-79-generic   containerd://1.7.27
The main problem is that, for example, the host with the "Kubernetes nodes by HTTP" template contains information about pods, for example:
name: Node [k8s2] Pod [calico-node-jkpnk] Conditions: Initialized
key: kube.pod.conditions.initialized[calico-node-jkpnk]
At the same time, the hosts generated by "kubelet discovery" also contain information about pods, for example:
name: Namespace [calico-system] Pod [calico-node-jkpnk] CPU: Load average, 10s
key: kube.pod.container_cpu_load_average_10s[calico-system/calico-node-jkpnk]
Why store two important items on two different hosts when they describe the same entity, the same pod?
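To make the overlap concrete, here is a small Python sketch that groups items by the pod they refer to, using only the two item keys quoted above. The host labels `nodes-host` and `kubelet-host` and the helper names are hypothetical. The sketch also assumes the pod name is the last `/`-separated part of the key parameter, which holds for these two keys but is not a documented guarantee.

```python
import re

# Matches a single-parameter item key such as "some.key[param]".
KEY_RE = re.compile(r"^(?P<name>[^\[\]]+)\[(?P<param>[^\[\]]*)\]$")


def pod_from_item_key(key: str) -> str:
    """Extract the pod name from a key whose parameter is "pod" or "namespace/pod"."""
    match = KEY_RE.match(key)
    if match is None:
        raise ValueError(f"unsupported item key: {key!r}")
    return match.group("param").rsplit("/", 1)[-1]


def group_items_by_pod(items):
    """Group (host, key) pairs by the pod name found in the key parameter."""
    grouped = {}
    for host, key in items:
        grouped.setdefault(pod_from_item_key(key), []).append((host, key))
    return grouped


# Host labels are hypothetical; the keys are the two quoted in this ticket.
items = [
    ("nodes-host", "kube.pod.conditions.initialized[calico-node-jkpnk]"),
    ("kubelet-host", "kube.pod.container_cpu_load_average_10s[calico-system/calico-node-jkpnk]"),
]

grouped = group_items_by_pod(items)
for pod, entries in grouped.items():
    hosts = sorted({host for host, _ in entries})
    print(f"{pod}: items spread across {len(hosts)} hosts ({', '.join(hosts)})")
```

Note that one key identifies the pod by name only while the other uses namespace/name, so grouping by bare pod name would be ambiguous if two namespaces contained pods with the same name.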
Another feature that could be added is network usage monitoring, since this data is already reported by cAdvisor, for example:
# documentation:
# https://github.com/google/cadvisor/blob/master/docs/storage/prometheus.md
# HELP container_network_receive_bytes_total Cumulative count of bytes received
# TYPE container_network_receive_bytes_total counter
container_network_receive_bytes_total{container="",id="/",image="",interface="cali9e95e943e0a",name="",namespace="",pod=""} 35217 1762803290450
container_network_receive_bytes_total{container="",id="/",image="",interface="calic3fe71ecbea",name="",namespace="",pod=""} 866 1762803290450
container_network_receive_bytes_total{container="",id="/",image="",interface="calic7621231aad",name="",namespace="",pod=""} 130419 1762803290450
container_network_receive_bytes_total{container="",id="/",image="",interface="ens18",name="",namespace="",pod=""} 6.9779717e+07 1762803290450
container_network_receive_bytes_total{container="",id="/",image="",interface="vxlan.calico",name="",namespace="",pod=""} 0 1762803290450
container_network_receive_bytes_total{container="",id="/system.slice/kubepods-besteffort-pod0ae8f591_e421_4ff0_85bd_dbdbf338cd4d.slice:cri-containerd:5a0375a6604c4e2b8c2f769b7f0d6edb92c3dc310223319d97c068052ce6ca50",image="registry.k8s.io/pause:3.8",interface="eth0",name="5a0375a6604c4e2b8c2f769b7f0d6edb92c3dc310223319d97c068052ce6ca50",namespace="monitoring",pod="zabbix-kube-state-metrics-6f7df5f8c9-fn6n6"} 1.313828e+06 1762803292466
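As a rough illustration of how this data could be turned into per-pod network items, the following Python sketch parses a few of the lines above and sums received bytes per (namespace, pod). The function names and the simplified regular expressions are my own assumptions, not part of Zabbix or cAdvisor. In a real template this would more likely be done with Zabbix's Prometheus preprocessing steps, so the code only shows the shape of the data.

```python
import re
from collections import defaultdict

# Three sample lines copied from the cAdvisor output quoted in this ticket.
SAMPLE = '''\
# TYPE container_network_receive_bytes_total counter
container_network_receive_bytes_total{container="",id="/",image="",interface="cali9e95e943e0a",name="",namespace="",pod=""} 35217 1762803290450
container_network_receive_bytes_total{container="",id="/",image="",interface="ens18",name="",namespace="",pod=""} 6.9779717e+07 1762803290450
container_network_receive_bytes_total{container="",id="/system.slice/kubepods-besteffort-pod0ae8f591_e421_4ff0_85bd_dbdbf338cd4d.slice:cri-containerd:5a0375a6604c4e2b8c2f769b7f0d6edb92c3dc310223319d97c068052ce6ca50",image="registry.k8s.io/pause:3.8",interface="eth0",name="5a0375a6604c4e2b8c2f769b7f0d6edb92c3dc310223319d97c068052ce6ca50",namespace="monitoring",pod="zabbix-kube-state-metrics-6f7df5f8c9-fn6n6"} 1.313828e+06 1762803292466
'''

# Simplified patterns: enough for the lines above, not a full exposition-format parser.
LINE_RE = re.compile(
    r'^(?P<metric>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)(?:\s+(?P<ts>\d+))?$'
)
LABEL_RE = re.compile(r'(?P<k>[A-Za-z_][A-Za-z0-9_]*)="(?P<v>(?:[^"\\]|\\.)*)"')


def parse_samples(text, metric):
    """Yield (labels, value) for every sample line of the given metric."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = LINE_RE.match(line)
        if m is None or m.group("metric") != metric:
            continue
        labels = {lm.group("k"): lm.group("v") for lm in LABEL_RE.finditer(m.group("labels"))}
        yield labels, float(m.group("value"))


def rx_bytes_by_pod(text):
    """Sum received bytes per (namespace, pod); empty labels mean node-level interfaces."""
    totals = defaultdict(float)
    for labels, value in parse_samples(text, "container_network_receive_bytes_total"):
        totals[(labels.get("namespace", ""), labels.get("pod", ""))] += value
    return dict(totals)


totals = rx_bytes_by_pod(SAMPLE)
for (namespace, pod), rx in totals.items():
    print(f"{namespace or '<node>'}/{pod or '<node>'}: {rx:.0f} bytes received")
```

Samples with empty `namespace` and `pod` labels belong to node-level interfaces (for example `ens18`), so they would map naturally to the node host, while labelled samples map to individual pods.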
Considering the above, it would be great to:
- add network monitoring, since cAdvisor already reports this data,
- make the templates clearer to use: if something contains details about pods, store it in only one place, for example on the discovered kubelet host,
- do the same for nodes: store everything about a node on one Zabbix host rather than across several hosts.
This would make the out-of-the-box templates more useful for Kubernetes monitoring.