[ZBX-12969] Agent user scripts merging stderr with stdout has bad consequences Created: 2017 Nov 01  Updated: 2017 Nov 02  Resolved: 2017 Nov 01

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 3.2.7
Fix Version/s: None

Type: Incident report Priority: Major
Reporter: Telford Tendys Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: UserParameters, agent, mysql, ping, stderr
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

CentOS-7 using packaged Zabbix
zabbix-agent-3.2.7-1.el7.x86_64


Attachments: PNG File screenshot-1.png    
Issue Links:
Duplicate
duplicates ZBXNEXT-2410 userparameter_mysql.conf: mysql.ping ... Confirmed
duplicates ZBX-12248 mysql.ping does not detect when mysql... Closed
Story Points: 5

 Description   

The package includes a config file: /etc/zabbix/zabbix_agentd.d/userparameter_mysql.conf

It defines a UserParameter called mysql.ping which makes a good example of why this is so broken.

If MySQL is working properly, the command "mysqladmin ping" will generate the output "mysqld is alive" and pipe over to "grep -c alive" which in turn produces a numeric output "1" sent back on stdout. This is the real output that the Zabbix server is looking for. That's lovely and it gives the sysadmin a nice warm feeling that everything is in good order.

The common template also includes a trigger to report "MySQL server is down" when an output "0" is detected.

Unfortunately, here's what happens when the MySQL server actually does go down: the command "mysqladmin ping" prints a bunch of errors to stderr... hardly surprising since there's a problem. These error messages are merged with the output and sent back to the Zabbix server. They don't look like any sort of number so the server decides no data is available and thus it does not bother to trigger that "MySQL server is down" report and hence nobody gets any notifications.
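
To illustrate why the merged text is unusable as a numeric value, here is a minimal sketch. The helper `is_unsigned_int` is hypothetical and only mimics the idea of a numeric item; it is not Zabbix's actual parsing code. The sample values are modelled on the outputs quoted in this ticket.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the whole value is an unsigned integer.
# It mimics the idea of a numeric item type; it is NOT Zabbix's real parser.
is_unsigned_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains any non-digit character
        *) return 0 ;;
    esac
}

# Value returned while MySQL is up.
up_value='1'

# Value returned while MySQL is down: stderr text followed by the grep count.
down_value="mysqladmin: connect to server at 'localhost' failed
0"

for value in "$up_value" "$down_value"; do
    if is_unsigned_int "$value"; then
        echo "numeric"
    else
        echo "not numeric"
    fi
done
```

The first value passes the check and the second does not, even though its last line is the "0" the trigger is waiting for.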

This is a fundamental flaw with every type of shell command that is defined as UserParameter in any config. We must expect that the entire purpose of a monitoring system is that there will be times when something has gone wrong, therefore any command at any time should be presumed to be sending random values to stderr. These random values might be understood by a human, but they cannot be sent to the Zabbix server if we want any confidence in our alert reporting. Thus, potentially all UserParameter scripts are broken (including the examples provided as part of the standard package) unless someone edits the script to redirect stderr off to /dev/null or possibly to some logfile elsewhere.
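
A minimal sketch of the redirect mentioned above. It does not need MySQL: `fake_ping_down` is a hypothetical stand-in for "mysqladmin ping" against a stopped server, and the merged capture only approximates what the agent does. Applied to the packaged line, the same idea would presumably read `UserParameter=mysql.ping,HOME=/var/lib/zabbix mysqladmin ping 2>/dev/null | grep -c alive`, which is an assumption and not a tested fix.

```shell
#!/bin/sh
# Hypothetical stand-in for "mysqladmin ping" against a stopped server:
# nothing on stdout, an error message on stderr, non-zero exit status.
fake_ping_down() {
    echo "mysqladmin: connect to server at 'localhost' failed" >&2
    return 1
}

# stderr merged into the captured value, roughly as described in this ticket.
# "|| true" only keeps the sketch usable under "set -e", because grep -c
# exits with status 1 when the count is zero.
merged=$( { fake_ping_down | grep -c alive; } 2>&1 || true )

# stderr discarded before the pipe: only the count from grep is captured.
clean=$( fake_ping_down 2>/dev/null | grep -c alive || true )

printf 'merged: %s\n' "$merged"
printf 'clean: %s\n' "$clean"
```

With stderr discarded the captured value is just "0", which is what the "MySQL server is down" trigger compares against.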

Steps to reproduce:

  1. Just use the out of the box configuration if you want to test.
  2. Get your MySQL server monitoring, then try shutting down the server and see if you get an alert.

Result:

  1. No alert gets sent.
  2. Zabbix "Monitoring / Latest Data" page shows that it does not find the numeric value it is expecting.

NOTE:
The exact flush sequence of stderr and stdout is not guaranteed when merging the output streams, especially when we have a shell command consisting of multiple operations piped each to the next (which happens very often). Different system libraries might send stdout first in some cases or might send stderr first.

Correct behaviour would be for the Zabbix agent to send only stdout back to the Zabbix server. If you don't want to throw stderr away, log it locally, presumably in the local agent log file. If the agent buffers the two streams separately, it might be possible to guarantee that stdout is sent to the Zabbix server FIRST and stderr AFTER, but this more complex buffer handling is also more likely to fail in unexpected ways.
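
Until the agent does something like this itself, the same effect can be approximated inside the command. The sketch below keeps stdout for the value and appends stderr to a local file. `fake_ping_down` and the temporary log file are hypothetical stand-ins; a real setup would pick a persistent log path that the zabbix user can write to.

```shell
#!/bin/sh
# Hypothetical stand-in for a check command that fails and complains on stderr.
fake_ping_down() {
    echo "mysqladmin: connect to server at 'localhost' failed" >&2
    return 1
}

# Hypothetical local log file, left in place so it can be inspected afterwards.
stderr_log=$(mktemp)

# stdout goes on to grep; stderr is appended to the local log instead.
# "|| true" only guards against "set -e", since grep -c exits 1 on a zero count.
value=$( fake_ping_down 2>>"$stderr_log" | grep -c alive || true )

printf 'value: %s\n' "$value"
printf 'logged: %s\n' "$(cat "$stderr_log")"
```

The value stays a clean "0" while the error text remains available locally for debugging.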

At the very least, if you really must keep stderr, then make sure your standard example scripts and templates are tested properly.



 Comments   
Comment by Telford Tendys [ 2017 Nov 01 ]

Links to old tickets.

Old issue regarding stderr in user parameter scripts: ZBX-1240

Discussion of documentation relating to this: ZBX-8564

Example of working server

[root@z-server ~]# zabbix_get -s 192.0.2.20 -k mysql.ping 
1

Example of server down and broken response

[root@z-server ~]# zabbix_get -s 192.0.2.20 -k mysql.ping 
mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)'
Check that mysqld is running and that the socket: '/var/lib/mysql/mysql.sock' exists!
0
Comment by Vladislavs Sokurenko [ 2017 Nov 01 ]

Can you please show your user parameter and trigger? I don't see why you wouldn't get an alert with the default example that greps for the "alive" keyword.

Comment by Telford Tendys [ 2017 Nov 01 ]

UserParameter setting (out of zabbix package)

UserParameter=mysql.ping,HOME=/var/lib/zabbix mysqladmin ping | grep -c alive

RPM Package details (from yum)

Installed Packages
Name        : zabbix-agent
Arch        : x86_64
Version     : 3.2.7
Release     : 1.el7
Size        : 1.3 M
Repo        : installed
From repo   : zabbix
Summary     : Zabbix Agent
URL         : http://www.zabbix.com/
License     : GPLv2+
Description : Zabbix agent to be installed on monitored systems.

Yum repo in use

[zabbix]
name = zabbix
baseurl = http://repo.zabbix.com/zabbix/3.2/rhel/7/x86_64/
gpgcheck=0

Trigger expression viewed in browser (from standard template)

NOTE: this same template is used as the export example in the Zabbix documentation; here is the same expression in XML:

    <triggers>
        <trigger>
            <expression>{Template App MySQL:mysql.ping.last(0)}=0</expression>
            <recovery_mode>0</recovery_mode>
            <recovery_expression/>
            <name>MySQL is down</name>
            <correlation_mode>0</correlation_mode>
            <correlation_tag/>
            <url/>
            <status>0</status>
            <priority>2</priority>
            <description/>
            <type>0</type>
            <manual_close>0</manual_close>
            <dependencies/>
            <tags/>
        </trigger>
    </triggers>

See full details in zabbix documentation here – https://www.zabbix.com/documentation/3.4/manual/xml_export_import/templates

Comment by Vladislavs Sokurenko [ 2017 Nov 01 ]

I am sorry, could you also please attach a "Latest data" screenshot for this item?

Comment by Vladislavs Sokurenko [ 2017 Nov 01 ]

similar issue ZBX-12248

Comment by Vladislavs Sokurenko [ 2017 Nov 01 ]

Closing as duplicate of ZBX-12248

Comment by Paul Williamson [ 2017 Nov 02 ]

This is a serious issue.

Zabbix expects a string of '0' or '1' to be returned from mysql.ping. When this does not happen, Zabbix gets a rubbish value and does not generate an alert.

i.e. alerts are not generated when MySQL goes down with the standard configuration.

Comment by Vladislavs Sokurenko [ 2017 Nov 02 ]

Related task ZBX-1240
As you can see, it helps to debug problems.

Comment by Paul Williamson [ 2017 Nov 02 ]

It may help debug problems but it stops the alert from being triggered according to your default settings.

The trigger for mysql down is:

```
{Template App MySQL:mysql.ping.last(0)}=0
```

i.e. if mysql.ping returns 0 you will get this trigger.

When MySQL is down this NEVER happens because the stderr output is included in the value:

```
UserParameter=mysql.ping,HOME=/var/lib/zabbix mysqladmin ping | grep -c alive
```

Gives you:

```
$ HOME=/var/lib/zabbix mysqladmin ping | grep -c alive
mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)'
Check that mysqld is running and that the socket: '/var/lib/mysql/mysql.sock' exists!
0
```

So the 0 is at the end. Since the trigger expects a value of 0 but receives all of this text, it does not work.

Telford has also explained this above but you have failed to understand his issue.
