[#ZBX-21571] Kubernetes: no trigger prototype associated with "Containers ready" condition

[ZBX-21571] Kubernetes: no trigger prototype associated with "Containers ready" condition Created: 2022 Sep 06 Updated: 2024 Apr 10 Resolved: 2023 Jul 07
Status:	Closed
Project:	ZABBIX BUGS AND ISSUES
Component/s:	Templates (T)
Affects Version/s:	6.2.1
Fix Version/s:	6.0.20rc1, 6.4.5rc1, 7.0.0alpha3, 7.0 (plan)

Type:	Problem report
Priority:	Trivial
Reporter:	Julien Le Huludut
Assignee:	Denis Rasikhov
Resolution:	Fixed
Votes:
Labels:	None
Remaining Estimate:	Not Specified
Time Spent:	Not Specified
Original Estimate:	Not Specified

Attachments:	pod_latest_data.png
Team:	Team INT
Sprint:	Sprint 102 (Jul 2023)
Story Points:

Description

Hi there!

We are testing monitoring of our Kubernetes cluster with Zabbix, using the "Kubernetes nodes by HTTP" template, version 6.2.1.

Description:

When a container is failing in a pod, Kubernetes keeps trying to restart it until it reaches the "CrashLoopBackOff" state. But while a container inside a pod is in this state, no alert is shown in Zabbix.

Expected behaviour:

When any container is in the "CrashLoopBackOff" state, a warning should be triggered by default.
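
For reference, a crash-looping container can be recognized in the pod object returned by the Kubernetes API: its entry in `status.containerStatuses` has `state.waiting.reason` set to `CrashLoopBackOff`. A minimal Python sketch of that check follows. The helper name `crashlooping_containers` and the sample pod dictionary are made up for illustration and are not part of the Zabbix template.

```python
def crashlooping_containers(pod):
    """Return names of containers whose waiting reason is CrashLoopBackOff."""
    names = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # "waiting" is only present while the container is not running.
        waiting = cs.get("state", {}).get("waiting") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            names.append(cs["name"])
    return names


# Hand-written sample shaped like a Kubernetes Pod object (illustrative only).
sample_pod = {
    "metadata": {"name": "gorbatchev", "namespace": "test-zabbix"},
    "status": {
        "containerStatuses": [
            {
                "name": "gorbatchev",
                "ready": False,
                "restartCount": 7,
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
            }
        ]
    },
}

print(crashlooping_containers(sample_pod))  # ['gorbatchev']
```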

Steps to reproduce:

1. Create a faulty container, let's call it Gorbatchev:

apiVersion: v1
kind: Pod
metadata:
  name: gorbatchev
  namespace: test-zabbix
spec: 
  containers: 
    - image: "busybox"
      name: gorbatchev
      #This command will cause the container to fail
      args: ["perestroika"]

2. Wait until it reaches the CrashLoopBackOff state:

NAME             READY   STATUS             RESTARTS   AGE
pod/gorbatchev   0/1     CrashLoopBackOff   7          13m

Result:
No alert is shown.

Expected:
A trigger prototype should be added to the template to alert when any pod reports "Conditions: Containers ready" as false.
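
As an illustration of the condition being discussed, the Kubernetes API exposes it in the pod's `status.conditions` list as an entry with `type` set to `ContainersReady` and `status` set to `"True"` or `"False"`. The Python sketch below reads that value. The function name `containers_ready` and the sample data are hypothetical, and this is not how the Zabbix template itself evaluates the item.

```python
def containers_ready(pod):
    """Return True/False from the ContainersReady condition, or None if absent."""
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") == "ContainersReady":
            # Kubernetes reports condition status as the strings "True"/"False"/"Unknown".
            return cond.get("status") == "True"
    return None


# Hand-written sample shaped like a Kubernetes Pod object (illustrative only).
failing_pod = {
    "metadata": {"name": "gorbatchev", "namespace": "test-zabbix"},
    "status": {
        "conditions": [
            {"type": "PodScheduled", "status": "True"},
            {"type": "ContainersReady", "status": "False"},
        ]
    },
}

if containers_ready(failing_pod) is False:
    print("pod has containers that are not ready")
```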

We were going to create the trigger on our Zabbix instance, but maybe this should be the default in the template? Any pod in this state is an issue for cluster admins to investigate, IMO.

Thanks in advance

–

Julien

Comments

Comment by Denis Rasikhov [ 2023 Jun 28 ]

If some of the containers in a pod are not ready, it doesn't directly mean that the pod is in the CrashLoopBackOff state. Many example Alertmanager rules still use the number of container restarts to determine that state. According to the Kubernetes documentation, containers are restarted with an exponentially increasing delay, capped at 5 minutes. Because of that, the trigger expression will never work if there is only one container in the pod. If there is more than one container, it will fire during the first couple of minutes after pod creation but will close once the delay increases. For the trigger to function properly, the thresholds must be adjusted to cover pods with only one container, and the evaluation period should be increased to account for the exponential growth of the back-off delay.
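
To illustrate the timing, here is a rough Python sketch of the back-off schedule. It assumes the delays quoted in the Kubernetes documentation (10s, 20s, 40s, and so on, capped at 300s) and a container that exits immediately on every start; the function names are illustrative only.

```python
from itertools import accumulate


def backoff_delays(restarts, base=10, cap=300):
    """Delay in seconds before each restart: doubles each time, capped at `cap`."""
    return [min(base * 2 ** i, cap) for i in range(restarts)]


def restarts_in_window(restart_times, end, window):
    """Count restarts that happened in the interval (end - window, end]."""
    return sum(1 for t in restart_times if end - window < t <= end)


delays = backoff_delays(7)
# Assumption: the container fails instantly, so restart times are cumulative delays.
restart_times = list(accumulate(delays))

print(delays)         # [10, 20, 40, 80, 160, 300, 300]
print(restart_times)  # [10, 30, 70, 150, 310, 610, 910]
print(restarts_in_window(restart_times, end=150, window=150))  # 4
print(restarts_in_window(restart_times, end=910, window=300))  # 1
```

Under these assumptions, a single container restarts at most about once per 5-minute window once the cap is reached, so a threshold of several restarts over a short period stops matching.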

Comment by Denis Rasikhov [ 2023 Jul 02 ]

Fixed in:

6.0.20rc1 3d2ac8d7ad2
6.4.5rc1 4f534198f5e
7.0.0alpha3 a5bc8ebda43

Generated at Tue Jan 07 17:25:36 EET 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.
