-
Problem report
-
Resolution: Unresolved
-
Trivial
-
None
-
7.0.18
-
None
-
Debian 12. Zabbix 7.0.18 server and agent.
SMART by Zabbix agent 2 active template version 7.0-4
36-90 disks per system
Bulk data interval is 10m
Disk discovery interval is 1h
-
Support backlog
Steps to reproduce:
- enable the SMART by Zabbix agent 2 active template
- watch the logs and wait for your filesystem to fail.
Result:
Errors in your logs
[Wed Sep 10 13:35:13 2025] mpt3sas_cm0: _transport_smp_handler: timeout
[Wed Sep 10 13:35:13 2025] mf:
[Wed Sep 10 13:35:13 2025] 1a00ff00
[Wed Sep 10 13:35:13 2025] 00000008
[Wed Sep 10 13:35:13 2025] 00000000
[Wed Sep 10 13:35:13 2025] 00000000
[Wed Sep 10 13:35:13 2025] 80170602
[Wed Sep 10 13:35:13 2025] 50030484
[Wed Sep 10 13:35:13 2025] 00000000
[Wed Sep 10 13:35:13 2025] 00000000
[Wed Sep 10 13:35:13 2025]
[Wed Sep 10 13:35:13 2025] ff549000
[Wed Sep 10 13:35:13 2025] 00000000
[Wed Sep 10 13:35:13 2025] 00000008
[Wed Sep 10 13:35:13 2025] 00000000
[Wed Sep 10 13:35:22 2025] sd 0:0:7:0: attempting task abort!scmd(0x00000000e5286425), outstanding for 30572 ms & timeout 30000 ms
[Wed Sep 10 13:35:22 2025] sd 0:0:7:0: [sdk] tag#1891 CDB: Read(16) 88 00 00 00 00 04 8c 3f b7 80 00 00 00 08 00 00
[Wed Sep 10 13:35:22 2025] scsi target0:0:7: handle(0x000f), sas_address(0x4433221107000000), phy(7)
[Wed Sep 10 13:35:22 2025] scsi target0:0:7: enclosure logical id(0x5003048480170602), slot(7)
[Wed Sep 10 13:35:22 2025] scsi target0:0:7: enclosure level(0x0000), connector name( )
[Wed Sep 10 13:35:22 2025] mpt3sas_cm0: mpt3sas_scsih_issue_tm: host reset in progress!
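The messages above can be spotted with a small filter over the kernel log; a rough sketch (the function name is mine, the patterns are taken from the logs in this report):

```shell
# check_smp_timeouts: filter kernel-log text (stdin) for the messages seen above.
check_smp_timeouts() {
  grep -E 'deprecated SCSI ioctl|_transport_smp_handler: timeout|attempting task abort'
}

# Example against a live system (needs read access to the kernel ring buffer):
#   dmesg | check_smp_timeouts
```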
After a while, ZFS marks the disk as missing because it does not respond quickly enough.
You can see this in the ZFS event log:
Sep 9 2025 17:36:45.636351847 ereport.fs.zfs.delay
Sep 9 2025 17:36:45.656352081 ereport.fs.zfs.delay
Sep 9 2025 17:36:45.656352081 ereport.fs.zfs.delay
Sep 9 2025 17:36:45.708352689 ereport.fs.zfs.delay
Sep 9 2025 17:36:45.724352876 ereport.fs.zfs.delay
Sep 9 2025 17:36:45.724352876 ereport.fs.zfs.delay
Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
Sep 10 2025 01:36:45.626912041 ereport.fs.zfs.delay
Sep 10 2025 01:36:45.666912505 ereport.fs.zfs.delay
Sep 10 2025 01:36:45.666912505 ereport.fs.zfs.delay
Sep 10 2025 03:36:45.354556379 ereport.fs.zfs.delay
Sep 10 2025 03:36:45.354556379 ereport.fs.zfs.delay
Sep 10 2025 03:36:45.354556379 ereport.fs.zfs.delay
Sep 10 2025 03:36:45.358556426 ereport.fs.zfs.delay
Sep 10 2025 03:36:45.358556426 ereport.fs.zfs.delay
Sep 10 2025 03:36:45.374556612 ereport.fs.zfs.delay
Sep 10 2025 03:36:45.418557126 ereport.fs.zfs.delay
Sep 10 2025 03:36:45.418557126 ereport.fs.zfs.delay
Sep 10 2025 03:36:45.438557360 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.753315954 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.777316233 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.777316233 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.781316279 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.781316279 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.785316326 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.797316466 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.805316559 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.805316559 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.809316605 ereport.fs.zfs.delay
Sep 10 2025 12:36:45.809316605 ereport.fs.zfs.delay
Sep 10 2025 13:36:44.747004437 ereport.fs.zfs.delay
Sep 10 2025 13:36:44.911006311 ereport.fs.zfs.delay
Sep 10 2025 13:36:45.251010196 ereport.fs.zfs.delay
Sep 10 2025 13:36:45.283010561 ereport.fs.zfs.delay
Sep 10 2025 13:36:45.283010561 ereport.fs.zfs.delay
Sep 10 2025 14:36:45.820883523 ereport.fs.zfs.delay
Sep 10 2025 14:36:45.820883523 ereport.fs.zfs.delay
Sep 10 2025 14:36:45.836883711 ereport.fs.zfs.delay
Sep 10 2025 16:36:43.844587007 ereport.fs.zfs.delay
Sep 10 2025 16:36:43.844587007 ereport.fs.zfs.delay
Sep 10 2025 16:36:44.328592547 ereport.fs.zfs.delay
Sep 10 2025 16:36:44.348592777 ereport.fs.zfs.delay
I also see the "program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO" error in the logs, as mentioned in ZBX-25632.
These messages always precede the mpt3sas_cm0: _transport_smp_handler: timeout errors, and they also recur at an hourly interval.
Looking at the interval between occurrences, the cadence matches the disk discovery interval (1h), not the bulk data interval (10m).
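One way to verify the cadence is to compute the gaps between successive delay events; a sketch (the `gaps` helper is mine, and the field position in the `zpool events` output may differ on your system):

```shell
# gaps: read HH:MM:SS[.frac] timestamps on stdin and print the difference in
# seconds between consecutive events.
gaps() {
  awk -F'[:.]' '{ t = $1 * 3600 + $2 * 60 + $3
                  if (prev != "") print t - prev
                  prev = t }'
}

# Example: pull the time-of-day field out of the delay events and print the
# spacing between them; values near 3600 point at the 1h discovery interval.
#   zpool events | grep ereport.fs.zfs.delay | awk '{print $4}' | gaps
```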
I see these errors on multiple systems with different hardware, backplanes and HBAs. It is a real issue, since it degrades the pools and you could lose a pool and its data.
Another issue: when I unlink the template, the plugin keeps running and keeps causing these problems. The only way to stop it is to revoke the sudo rights for the smartctl command.
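For reference, revoking the sudo rights means removing or commenting out the smartctl entry in sudoers. A typical entry looks like this (the file path, user name and smartctl path are assumptions; check your own configuration):

```
# /etc/sudoers.d/zabbix (example; user and paths may differ on your system)
zabbix ALL=(ALL) NOPASSWD: /usr/sbin/smartctl
```

Commenting out or deleting this line stops the plugin's smartctl calls even while the plugin itself keeps running.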