-
New Feature Request
-
Resolution: Fixed
-
Trivial
-
5.2.6
-
None
-
Sprint 76 (May 2021), Sprint 77 (Jun 2021), Sprint 78 (Jul 2021), Sprint 79 (Aug 2021), Sprint 80 (Sep 2021), Sprint 81 (Oct 2021), Sprint 82 (Nov 2021), Sprint 83 (Dec 2021), Sprint 84 (Jan 2022), Sprint 85 (Feb 2022)
-
3
Greetings,
I wrote up a longer feedback comparision between the SMART monitoring tools we use now and the new plugin for agent2.
After some time to think about it with my team, the one thing that the team isn't willing to give up is the drive exit codes. From the smartctl man page:
EXIT STATUS The exit statuses of smartctl are defined by a bitmask. If all is well with the disk, the exit status (return value) of smartctl is 0 (all bits turned off). If a problem occurs, or an error, potential error, or fault is detected, then a non- zero status is returned. In this case, the eight different bits in the exit status have the following meanings for ATA disks; some of these values may also be returned for SCSI disks. Bit 0: Command line did not parse. Bit 1: Device open failed, device did not return an IDENTIFY DEVICE structure, or device is in a low-power mode (see '-n' option above). Bit 2: Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure (see '-b' option above). Bit 3: SMART status check returned "DISK FAILING". Bit 4: We found prefail Attributes <= threshold. Bit 5: SMART status check returned "DISK OK" but we found that some (usage or prefail) Attributes have been <= threshold at some time in the past. Bit 6: The device error log contains records of errors. Bit 7: The device self-test log contains records of errors. [ATA only] Failed self-tests outdated by a newer successful extended self-test are ignored.
This feature request is to add capturing and alerting to the exit codes for the agent2 SMART plugin.
Thank you.