  1. ZABBIX FEATURE REQUESTS
  2. ZBXNEXT-6624

S.M.A.R.T. monitoring improvement to agent2

Details

    • Type: New Feature Request
    • Status: Elaborating
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: 5.2.6
    • Fix Version/s: None
    • Component/s: Agent2 plugin (N)
    • Labels: None
    • Team: Team INT
    • Sprint 76 (May 2021), Sprint 77 (Jun 2021), Sprint 78 (Jul 2021), Sprint 79 (Aug 2021), Sprint 80 (Sep 2021), Sprint 81 (Oct 2021), Sprint 82 (Nov 2021)

    Description

      Greetings,
      I wrote up a longer piece of feedback comparing the SMART monitoring tools we use now with the new plugin for agent2.

      https://www.zabbix.com/forum/zabbix-suggestions-and-feedback/415662-discussion-thread-for-official-zabbix-smart-disk-monitoring

      After some time thinking it over with my team, the one thing we aren't willing to give up is the smartctl exit codes for each drive. From the smartctl man page:

      EXIT STATUS
             The exit statuses of smartctl are defined by a bitmask.  If all is
             well with the disk, the exit status (return  value)  of
             smartctl is 0 (all bits turned off).  If a problem occurs, or an
             error, potential error, or fault is detected, then a non-
             zero status is returned.  In this case, the eight different bits in
             the exit status have the following  meanings  for  ATA
             disks; some of these values may also be returned for SCSI disks.       
      
             Bit 0: Command line did not parse.
             Bit 1: Device open failed, device did not return an IDENTIFY DEVICE
                    structure, or device is in a low-power mode (see '-n'
                    option above).
             Bit 2: Some SMART or other ATA command to the disk failed, or there
                    was a checksum error in a SMART data structure (see '-b'
                    option above).
             Bit 3: SMART status check returned "DISK FAILING".
             Bit 4: We found prefail Attributes <= threshold.
             Bit 5: SMART  status  check returned "DISK OK" but we found that some
                    (usage or prefail) Attributes have been <= threshold
                    at some time in the past.
             Bit 6: The device error log contains records of errors.
             Bit 7: The device self-test log contains records of errors.  [ATA
                    only] Failed self-tests outdated by a  newer  successful
                    extended self-test are ignored.
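
      To illustrate what we mean by capturing the exit codes, here is a rough sketch of decoding the bitmask described above. It is not how the agent2 plugin works today; the names SMARTCTL_EXIT_BITS and decode_smartctl_exit are made up for this example, and the bit descriptions are shortened from the man page text.

```python
from typing import List, Tuple

# Bit meanings paraphrased from the smartctl man page (EXIT STATUS section).
# The dictionary name and the function below are hypothetical, for illustration only.
SMARTCTL_EXIT_BITS = {
    0: "Command line did not parse",
    1: "Device open failed, no IDENTIFY DEVICE structure, or device in low-power mode",
    2: "A SMART or other ATA command failed, or checksum error in a SMART data structure",
    3: 'SMART status check returned "DISK FAILING"',
    4: "Prefail Attributes <= threshold",
    5: 'SMART status "DISK OK", but Attributes were <= threshold at some time in the past',
    6: "Device error log contains records of errors",
    7: "Device self-test log contains records of errors",
}


def decode_smartctl_exit(status: int) -> List[Tuple[int, str]]:
    """Return a (bit, meaning) pair for every bit set in a smartctl exit status."""
    if not 0 <= status <= 0xFF:
        raise ValueError("smartctl exit status must fit in eight bits")
    return [
        (bit, meaning)
        for bit, meaning in SMARTCTL_EXIT_BITS.items()
        if status & (1 << bit)
    ]


if __name__ == "__main__":
    # 0b01001000 has bits 3 and 6 set.
    for bit, meaning in decode_smartctl_exit(0b01001000):
        print(f"Bit {bit}: {meaning}")
```

      With something like this, each bit could become its own item or trigger, so that a "DISK FAILING" result (bit 3) alerts differently from old entries in the error log (bit 6).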
      

      This feature request is to have the agent2 SMART plugin capture the smartctl exit codes and make it possible to alert on them.
      Thank you.

          People

            mchudinov Maxim Chudinov
            cstackpole Chris Stackpole