[ZBX-21571] Kubernetes: no trigger prototype associated with "Containers ready" condition Created: 2022 Sep 06 Updated: 2024 Apr 10 Resolved: 2023 Jul 07 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Templates (T) |
Affects Version/s: | 6.2.1 |
Fix Version/s: | 6.0.20rc1, 6.4.5rc1, 7.0.0alpha3, 7.0 (plan) |
Type: | Problem report | Priority: | Trivial |
Reporter: | Julien Le Huludut | Assignee: | Denis Rasikhov |
Resolution: | Fixed | Votes: | 0 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Attachments: | pod_latest_data.png |
Team: | Team INT |
Sprint: | Sprint 102 (Jul 2023) |
Story Points: | 1 |
Description |
Hi there!

We are testing monitoring of our Kubernetes cluster via Zabbix with the "Kubernetes nodes by HTTP" template, version 6.2.1.

Description: When a container in a pod is failing, Kubernetes tries to restart it until it reaches the "CrashLoopBackOff" state. But while a container inside a pod is in this state, no alert is shown in Zabbix.

Expected behaviour: When any container is in the "CrashLoopBackOff" state, a warning should be triggered by default.
Steps to reproduce:

1. Create a faulty container, let's call it gorbatchev:

    apiVersion: v1
    kind: Pod
    metadata:
      name: gorbatchev
      namespace: test-zabbix
    spec:
      containers:
        - image: "busybox"
          name: gorbatchev
          # This command will cause the container to fail
          args: ["perestroika"]

2. Wait until it reaches the CrashLoopBackOff state:

    NAME             READY   STATUS             RESTARTS   AGE
    pod/gorbatchev   0/1     CrashLoopBackOff   7          13m

Result:
Expected:
We were going to create the trigger on our own Zabbix instance, but maybe this should be the default in the template? Any pod in this state is an issue for cluster admins to investigate, in my opinion.
Thanks in advance – Julien
|
Comments |
Comment by Denis Rasikhov [ 2023 Jun 28 ] |
If some of the containers in a pod are not ready, that does not necessarily mean the pod is in the CrashLoopBackOff state. Many AlertManager rule examples still use the number of container restarts to detect that state. According to the Kubernetes documentation, containers are restarted with an increasing exponential back-off delay, capped at 5 minutes. Because of that, the trigger expression will never fire if there is only one container in the pod, and with more than one container it will fire during the first couple of minutes after pod creation but close again once the delay grows. For the trigger to work properly, the thresholds must be adjusted to cover pods with a single container as well, and the evaluation period should be increased to account for the exponential growth of the back-off delay. |
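As a rough illustration of the back-off behaviour described above, the sketch below assumes the schedule stated in the Kubernetes documentation (delay doubling after each crash, starting at 10 seconds and capped at 5 minutes); the function name and base value are illustrative, not taken from any Kubernetes or Zabbix source:

```python
def backoff_delay(restart_count, base=10, cap=300):
    """Approximate CrashLoopBackOff delay (seconds) before a given restart.

    Assumes the exponential schedule from the Kubernetes docs: the delay
    doubles after each crash, starting at `base` seconds, capped at `cap`.
    """
    return min(base * 2 ** restart_count, cap)

# Delays before the first seven restarts of a crash-looping container.
delays = [backoff_delay(n) for n in range(7)]
print(delays)  # [10, 20, 40, 80, 160, 300, 300]
```

This is why a restart-count trigger evaluated over a short window can misfire: once the delay saturates at the 5-minute cap, restarts arrive too slowly for a window of only a couple of minutes to see them, so the evaluation period has to be longer than the cap.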
Comment by Denis Rasikhov [ 2023 Jul 02 ] |
Fixed in:
|