- New Feature Request
- Resolution: Fixed
- Trivial
- 5.2.6
- None
- Sprint 76 (May 2021), Sprint 77 (Jun 2021), Sprint 78 (Jul 2021), Sprint 79 (Aug 2021), Sprint 80 (Sep 2021), Sprint 81 (Oct 2021), Sprint 82 (Nov 2021), Sprint 83 (Dec 2021), Sprint 84 (Jan 2022), Sprint 85 (Feb 2022)
- 3
Greetings,
I wrote up a longer comparison of the SMART monitoring tools we use now and the new plugin for agent2.
After some time to think about it with my team, the one thing that the team isn't willing to give up is the drive exit codes. From the smartctl man page:
EXIT STATUS
The exit statuses of smartctl are defined by a bitmask. If all is well with the disk, the exit status (return value) of smartctl is 0 (all bits turned off). If a problem occurs, or an error, potential error, or fault is detected, then a non-zero status is returned. In this case, the eight different bits in the exit status have the following meanings for ATA disks; some of these values may also be returned for SCSI disks.
Bit 0: Command line did not parse.
Bit 1: Device open failed, device did not return an IDENTIFY DEVICE structure, or device is in a low-power mode (see '-n' option above).
Bit 2: Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure (see '-b' option above).
Bit 3: SMART status check returned "DISK FAILING".
Bit 4: We found prefail Attributes <= threshold.
Bit 5: SMART status check returned "DISK OK" but we found that some (usage or prefail) Attributes have been <= threshold at some time in the past.
Bit 6: The device error log contains records of errors.
Bit 7: The device self-test log contains records of errors. [ATA only] Failed self-tests outdated by a newer successful extended self-test are ignored.
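To make the bitmask concrete, here is a minimal Python sketch of how an exit status could be decoded into the individual conditions listed above. The names `SMARTCTL_EXIT_BITS` and `decode_smartctl_exit_status` are made up for illustration and are not part of smartctl or the agent2 plugin; the bit descriptions are shortened from the man page text.

```python
# Illustrative only: these names are hypothetical, not part of smartctl or agent2.
# Bit meanings are abbreviated from the smartctl man page excerpt quoted above.
SMARTCTL_EXIT_BITS = {
    0: "Command line did not parse",
    1: "Device open failed, no IDENTIFY DEVICE structure, or device in low-power mode",
    2: "A SMART or other ATA command failed, or checksum error in a SMART data structure",
    3: 'SMART status check returned "DISK FAILING"',
    4: "Prefail Attributes <= threshold",
    5: "Some Attributes have been <= threshold at some time in the past",
    6: "Device error log contains records of errors",
    7: "Device self-test log contains records of errors",
}


def decode_smartctl_exit_status(status):
    """Return (bit, meaning) pairs for every bit set in a smartctl exit status."""
    return [
        (bit, meaning)
        for bit, meaning in SMARTCTL_EXIT_BITS.items()
        if status & (1 << bit)  # test each bit of the mask independently
    ]


if __name__ == "__main__":
    # Example: 68 = 64 + 4, i.e. bits 6 and 2 are set.
    for bit, meaning in decode_smartctl_exit_status(68):
        print(f"Bit {bit}: {meaning}")
```

An exit status of 0 decodes to an empty list, so an alert would only need to fire when the list is non-empty, or when specific bits such as 3 or 4 are present.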
This feature request is for the agent2 SMART plugin to capture the smartctl exit codes and allow alerting on them.
Thank you.