ZABBIX BUGS AND ISSUES / ZBX-26977

Enabling Zabbix SMART monitoring causes mpt3sas_cm0: _transport_smp_handler: timeout


    • Type: Problem report
    • Resolution: Unresolved
    • Priority: Trivial
    • Fix Version/s: None
    • Affects Version/s: 7.0.18
    • Component/s: Agent2 plugin (G)
    • Labels: None
    • Environment: Debian 12. Zabbix 7.0.18 server and agent.
      SMART by Zabbix agent 2 active template version 7.0-4
      36-90 disks per system
      Bulk data interval is 10m
      Disk discovery interval is 1h
    • Team: Support backlog

      Steps to reproduce:

      1. Enable the SMART by Zabbix agent 2 active template (the smartctl calls it triggers are sketched below).
      2. Watch the logs and wait for your filesystem to fail.
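
      For context, the agent 2 Smart plugin shells out to smartctl for both discovery and collection. A minimal sketch of the calls, assuming the documented plugin behaviour (exact flags can differ between plugin versions):

      # disk discovery (every 1h with the template defaults above)
      sudo smartctl --scan -j
      # bulk data per discovered disk (every 10m with the defaults above)
      sudo smartctl -a /dev/sdX -j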

      Result:
      Errors in your logs

      [Wed Sep 10 13:35:13 2025] mpt3sas_cm0: _transport_smp_handler: timeout
      [Wed Sep 10 13:35:13 2025] mf: 1a00ff00 00000008 00000000 00000000 80170602 50030484 00000000 00000000
      [Wed Sep 10 13:35:13 2025] ff549000 00000000 00000008 00000000
      
      [Wed Sep 10 13:35:22 2025] sd 0:0:7:0: attempting task abort!scmd(0x00000000e5286425), outstanding for 30572 ms & timeout 30000 ms
      [Wed Sep 10 13:35:22 2025] sd 0:0:7:0: [sdk] tag#1891 CDB: Read(16) 88 00 00 00 00 04 8c 3f b7 80 00 00 00 08 00 00
      [Wed Sep 10 13:35:22 2025] scsi target0:0:7: handle(0x000f), sas_address(0x4433221107000000), phy(7)
      [Wed Sep 10 13:35:22 2025] scsi target0:0:7: enclosure logical id(0x5003048480170602), slot(7) 
      [Wed Sep 10 13:35:22 2025] scsi target0:0:7: enclosure level(0x0000), connector name(     )
      [Wed Sep 10 13:35:22 2025] mpt3sas_cm0: mpt3sas_scsih_issue_tm: host reset in progress!
      

      After a while, ZFS will mark the disk as missing when it does not respond quickly enough.
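
      These delay ereports are OpenZFS slow-I/O events; as far as I can tell they fire once an I/O is outstanding longer than the zio_slow_io_ms module parameter (30000 ms by default, matching the 30000 ms SCSI timeout in the trace above):

      # current slow-I/O threshold in ms (OpenZFS 2.x on Debian 12 assumed)
      cat /sys/module/zfs/parameters/zio_slow_io_ms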

      You can see this in the ZFS event log (zpool events):

      Sep  9 2025 17:36:45.636351847 ereport.fs.zfs.delay
      Sep  9 2025 17:36:45.656352081 ereport.fs.zfs.delay
      Sep  9 2025 17:36:45.656352081 ereport.fs.zfs.delay
      Sep  9 2025 17:36:45.708352689 ereport.fs.zfs.delay
      Sep  9 2025 17:36:45.724352876 ereport.fs.zfs.delay
      Sep  9 2025 17:36:45.724352876 ereport.fs.zfs.delay
      Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
      Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
      Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
      Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
      Sep 10 2025 01:36:45.610911856 ereport.fs.zfs.delay
      Sep 10 2025 01:36:45.626912041 ereport.fs.zfs.delay
      Sep 10 2025 01:36:45.666912505 ereport.fs.zfs.delay
      Sep 10 2025 01:36:45.666912505 ereport.fs.zfs.delay
      Sep 10 2025 03:36:45.354556379 ereport.fs.zfs.delay
      Sep 10 2025 03:36:45.354556379 ereport.fs.zfs.delay
      Sep 10 2025 03:36:45.354556379 ereport.fs.zfs.delay
      Sep 10 2025 03:36:45.358556426 ereport.fs.zfs.delay
      Sep 10 2025 03:36:45.358556426 ereport.fs.zfs.delay
      Sep 10 2025 03:36:45.374556612 ereport.fs.zfs.delay
      Sep 10 2025 03:36:45.418557126 ereport.fs.zfs.delay
      Sep 10 2025 03:36:45.418557126 ereport.fs.zfs.delay
      Sep 10 2025 03:36:45.438557360 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.753315954 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.777316233 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.777316233 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.781316279 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.781316279 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.785316326 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.797316466 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.805316559 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.805316559 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.809316605 ereport.fs.zfs.delay
      Sep 10 2025 12:36:45.809316605 ereport.fs.zfs.delay
      Sep 10 2025 13:36:44.747004437 ereport.fs.zfs.delay
      Sep 10 2025 13:36:44.911006311 ereport.fs.zfs.delay
      Sep 10 2025 13:36:45.251010196 ereport.fs.zfs.delay
      Sep 10 2025 13:36:45.283010561 ereport.fs.zfs.delay
      Sep 10 2025 13:36:45.283010561 ereport.fs.zfs.delay
      Sep 10 2025 14:36:45.820883523 ereport.fs.zfs.delay
      Sep 10 2025 14:36:45.820883523 ereport.fs.zfs.delay
      Sep 10 2025 14:36:45.836883711 ereport.fs.zfs.delay
      Sep 10 2025 16:36:43.844587007 ereport.fs.zfs.delay
      Sep 10 2025 16:36:43.844587007 ereport.fs.zfs.delay
      Sep 10 2025 16:36:44.328592547 ereport.fs.zfs.delay
      Sep 10 2025 16:36:44.348592777 ereport.fs.zfs.delay
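
      The one-line summaries above come from the standard event log; the verbose form prints the full payload per event (affected vdev, measured delay), which is useful when matching these against the kernel trace:

      # verbose ZFS event log, one block per ereport
      zpool events -v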
      

       

      I also see the "program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO" warning in the logs, mentioned in issue ZBX-25632. These warnings always precede the mpt3sas_cm0: _transport_smp_handler: timeout errors, and they also recur at an hourly interval.
      Looking at the interval between occurrences, it matches the disk discovery interval (1h), not the bulk data interval (10m), so discovery appears to be the trigger; a quick way to check the correlation is sketched below.
      I see these errors on multiple systems with different hardware, backplanes and HBAs. It is a real issue, since it degrades the pools and you could lose a pool and its data.
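
      Both messages are kernel messages, so one grep over the kernel log is enough to line them up (journalctl assumed; on a stock Debian 12 install the same lines also land in /var/log/kern.log):

      # pull both message types with timestamps to compare the intervals
      journalctl -k -o short-iso | grep -E 'deprecated SCSI ioctl|transport_smp_handler: timeout'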
      Another issue: when I unlink the template, the plugin keeps running and keeps causing these errors. The only way I found to stop it is to revoke the sudo rights for the smartctl command; a config-side alternative is sketched below.
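
      Revoking sudo is what worked here; denying the item keys in the agent configuration should achieve the same from the Zabbix side. A sketch, where the sudoers path and rule are assumptions based on the common setup (DenyKey is a standard agent 2 parameter):

      # /etc/sudoers.d/zabbix - comment out the smartctl rule
      # zabbix ALL=(root) NOPASSWD: /usr/sbin/smartctl

      # /etc/zabbix/zabbix_agent2.conf - refuse all smart.* item keys
      DenyKey=smart.*

      # restart the agent afterwards
      systemctl restart zabbix-agent2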

       

        1. zfs events.txt
          3 kB
        2. syslog.txt
          22 kB
        3. image-2025-10-03-09-00-37-339.png
          228 kB

            Assignee: Zabbix Development Team
            Reporter: buijs
            Votes: 0
            Watchers: 3
