Problem report
Resolution: Fixed
Trivial
5.2.6
RHEL 8
Sprint 84 (Jan 2022), Sprint 85 (Feb 2022), Sprint 86 (Mar 2022)
1
Steps to reproduce:
- Deploy and configure zabbix-agent2 on RHEL 8
- Import the latest SMART monitoring template from git
- Create a sudo rule that allows the zabbix user to run smartctl
- Observe that disk discovery fails
Result:
When trying to run the discovery manually with the agent:
zabbix_agent2 -v -t smart.disk.discovery
(...) 2021/04/29 12:47:40.125137 [Smart] stopped looking for RAID devices of megaraid type, err:%!(EXTRA *errors.errorString=failed to get disk data from smartctl: Smartctl open device: /dev/bus/0 [megaraid_disk_00] failed: INQUIRY failed) (...)
Expected:
/sbin/smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/sde -d scsi # /dev/sde, SCSI device
/dev/sdf -d scsi # /dev/sdf, SCSI device
/dev/sdg -d scsi # /dev/sdg, SCSI device
/dev/sdh -d scsi # /dev/sdh, SCSI device
/dev/sdi -d scsi # /dev/sdi, SCSI device
/dev/sdj -d scsi # /dev/sdj, SCSI device
/dev/sdk -d scsi # /dev/sdk, SCSI device
/dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], SCSI device
/dev/bus/0 -d megaraid,2 # /dev/bus/0 [megaraid_disk_02], SCSI device
/dev/bus/0 -d megaraid,3 # /dev/bus/0 [megaraid_disk_03], SCSI device
/dev/bus/0 -d megaraid,4 # /dev/bus/0 [megaraid_disk_04], SCSI device
/dev/bus/0 -d megaraid,5 # /dev/bus/0 [megaraid_disk_05], SCSI device
/dev/bus/0 -d megaraid,6 # /dev/bus/0 [megaraid_disk_06], SCSI device
/dev/bus/0 -d megaraid,7 # /dev/bus/0 [megaraid_disk_07], SCSI device
/dev/bus/0 -d megaraid,8 # /dev/bus/0 [megaraid_disk_08], SCSI device
/dev/bus/0 -d megaraid,9 # /dev/bus/0 [megaraid_disk_09], SCSI device
/dev/bus/0 -d megaraid,10 # /dev/bus/0 [megaraid_disk_10], SCSI device
/dev/bus/0 -d megaraid,11 # /dev/bus/0 [megaraid_disk_11], SCSI device
/dev/bus/0 -d megaraid,12 # /dev/bus/0 [megaraid_disk_12], SCSI device
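Each entry in the scan output has the same shape: a device path, a `-d` device type, then a `#` comment. The following is a minimal Python sketch of splitting such output into path/type pairs. The function name `parse_scan` and the shortened sample text are illustrative assumptions and are not taken from the Zabbix agent 2 Smart plugin. Note that the scan lists megaraid disks starting at index 1, while the failing log line refers to megaraid_disk_00.

```python
import re

# Matches one line of `smartctl --scan` output, for example:
# "/dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], SCSI device"
SCAN_LINE = re.compile(r"^(?P<path>\S+)\s+-d\s+(?P<type>\S+)\s+#")


def parse_scan(text):
    """Return a list of (device_path, device_type) tuples, one per scan line."""
    devices = []
    for line in text.splitlines():
        match = SCAN_LINE.match(line.strip())
        if match:
            devices.append((match.group("path"), match.group("type")))
    return devices


# Shortened sample based on the scan output in this report.
sample = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], SCSI device
/dev/bus/0 -d megaraid,2 # /dev/bus/0 [megaraid_disk_02], SCSI device
"""

for path, dev_type in parse_scan(sample):
    # Each pair corresponds to a query of the form: smartctl -a <path> -d <type>
    print(path, dev_type)
```

Keeping the type string exactly as smartctl reports it (for example `megaraid,1`) avoids having to guess which disk indexes exist behind the controller.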
NOTE: smartctl uses and reports the virtual bus device /dev/bus/0, which does not actually exist in the filesystem, but SMART status can still be retrieved through it:
smartctl -a /dev/bus/0 -d megaraid,1
smartctl 7.1 2020-04-05 r5049 [x86_64-linux-4.18.0-240.10.1.el8_3.x86_64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Intel S4510/S4610/S4500/S4600 Series SSDs
Device Model:     INTEL SSDSC2KG038T8
Serial Number:    PHYG025201RH3P8EGN
LU WWN Device Id: 5 5cd2e4 152613993
Firmware Version: XCV10120
User Capacity:    3,840,755,982,336 bytes [3.84 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Apr 29 12:49:44 2021 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: ATA return descriptor not supported by controller firmware
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (    0) seconds.
Offline data collection
capabilities:                    (0x79) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (   2) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       8
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       2575
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
170 Available_Reservd_Space 0x0033   099   099   010    Pre-fail  Always       -       0
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       2
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
174 Unsafe_Shutdown_Count   0x0032   100   100   000    Old_age   Always       -       14
175 Power_Loss_Cap_Test     0x0033   100   100   010    Pre-fail  Always       -       2390 (14 65535)
183 SATA_Downshift_Count    0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error_Count  0x0033   100   100   090    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Drive_Temperature       0x0022   081   075   000    Old_age   Always       -       19 (Min/Max 16/27)
192 Unsafe_Shutdown_Count   0x0032   100   100   000    Old_age   Always       -       14
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       19
197 Pending_Sector_Count    0x0012   100   100   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
225 Host_Writes_32MiB       0x0032   100   100   000    Old_age   Always       -       3576929
226 Workld_Media_Wear_Indic 0x0032   100   100   000    Old_age   Always       -       522
227 Workld_Host_Reads_Perc  0x0032   100   100   000    Old_age   Always       -       25
228 Workload_Minutes        0x0032   100   100   000    Old_age   Always       -       154396
232 Available_Reservd_Space 0x0033   099   099   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0
234 Thermal_Throttle_Status 0x0032   100   100   000    Old_age   Always       -       0/0
235 Power_Loss_Cap_Test     0x0033   100   100   010    Pre-fail  Always       -       2390 (14 65535)
241 Host_Writes_32MiB       0x0032   100   100   000    Old_age   Always       -       3576929
242 Host_Reads_32MiB        0x0032   100   100   000    Old_age   Always       -       1226992
243 NAND_Writes_32MiB       0x0032   100   100   000    Old_age   Always       -       7374461

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
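
The vendor attribute rows in the `smartctl -a` output follow a fixed column order (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE). As a rough sketch, assuming the output is available with one attribute per line as smartctl prints it, the rows could be read like this in Python. The names `ATTR_ROW` and `parse_attributes` are hypothetical, and this is not how the Zabbix plugin itself processes smartctl data.

```python
import re

# One attribute row: ten whitespace-separated columns, where the last one
# (RAW_VALUE) may itself contain spaces, e.g. "19 (Min/Max 16/27)".
ATTR_ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flag>0x[0-9a-fA-F]+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<when_failed>\S+)\s+(?P<raw>.+)$"
)


def parse_attributes(text):
    """Return a dict mapping attribute ID to its parsed columns."""
    rows = {}
    for line in text.splitlines():
        match = ATTR_ROW.match(line)
        if not match:
            continue  # header, section titles, and other tables are skipped
        rows[int(match.group("id"))] = {
            "name": match.group("name"),
            "value": int(match.group("value")),
            "worst": int(match.group("worst")),
            "thresh": int(match.group("thresh")),
            "raw": match.group("raw").strip(),
        }
    return rows


# Shortened sample based on the attribute table in this report.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       8
190 Drive_Temperature       0x0022   081   075   000    Old_age   Always       -       19 (Min/Max 16/27)
"""

attributes = parse_attributes(sample)
print(attributes[190]["name"], attributes[190]["raw"])
```

The raw value is kept as a string because some attributes, such as the temperature row, carry extra annotations after the number.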