[ZBX-19284] Bad values in the agent2 SMART template Created: 2021 Apr 24 Updated: 2024 Apr 10 Resolved: 2021 Oct 07 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent2 plugin (G), Templates (T) |
Affects Version/s: | 5.2.6, 5.4.2 |
Fix Version/s: | 5.0.17rc1, 5.4.6rc1, 6.0.0alpha4, 6.0 (plan) |
Type: | Problem report | Priority: | Major |
Reporter: | Chris Stackpole | Assignee: | Maxim Chudinov (Inactive) |
Resolution: | Fixed | Votes: | 3 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Attachments: |
Team: | |
Sprint: | Sprint 80 (Sep 2021), Sprint 81 (Oct 2021) |
Story Points: | 0.5 |
Description |
Greetings,

I've been investigating the use of the SMART template with agent2 on several test hosts. I noticed something very odd today while working on improving the SMART template: a bunch of disks that I got at around the same time about 2 years ago were all reporting the EXACT same "Power on hours". Well, that's not right. And the drives shouldn't all be the same either. Are other items wrong too?

I went looking at the hard drive itself. This is straight from the smartctl program:

    9 Power_On_Hours 0x0032 085 085 000 Old_age Always - 13647 (130 84 0)

Ah. That makes sense. The template is pulling back the "VALUE", which is not at all what is needed here. What is needed is the "RAW_VALUE". Which means my suspicion that all the other values are wrong is probably correct too. What does that look like in JSON? (snipping only the relevant bit)

    {
      "id": 9,
      "name": "Power_On_Hours",
      "value": 85,
      "worst": 85,
      "thresh": 0,
      "when_failed": "",
      "flags": { "value": 50, "string": "-O--CK ", "prefailure": false, "updated_online": true, "performance": false, "error_rate": false, "event_count": true, "auto_keep": true },
      "raw": { "value": 39973260637519, "string": "13647 (36 91 0)" }
    },

Capturing the first value isn't as useful for monitoring as the raw value. Now to update the template template_module_smart_agent2.yaml:

    53c53
    < - '$[?(@.disk_name==''{#NAME}'')].ata_smart_attributes.table[?(@.id=={#ID})].value.first()'
    ---
    > - '$[?(@.disk_name==''{#NAME}'')].ata_smart_attributes.raw.table[?(@.id=={#ID})].value.first()'

And a recheck... Well, some values are much better. For example, my disk is NOT running at 60 degrees C! But it is at 40C. That's a plus.

    190 Airflow_Temperature_Cel 0x0022 060 052 040 Old_age Always - 40 (Min/Max 33/43)

In fact, a lot of the values make more sense. Better yet, they aren't all the same across every disk! However, Power_on_hours is still not right. It went from "85" to "19297288074576" when it should be "13647".
And I also broke other fields, as my Temperature_Celsius (which has the same temp as Airflow_Temperature_Cel) is showing "107374182440"!!! So clearly something is busted in my fix.

Investigating the SMART JSON output again, sure enough: for some attributes the raw "value" matches the "string", but for others it really is the raw (undecoded) value. Which means the string might be the better field to read in. Though I have no idea how to deal with multiple value types inside a single discovery item, and I don't really want to create a new value for every potential item here.

Here's my attempt to fix it again in template_module_smart_agent2.yaml:

    53c52,63
    < - '$[?(@.disk_name==''{#NAME}'')].ata_smart_attributes.table[?(@.id=={#ID})].value.first()'
    ---
    > - '$[?(@.disk_name==''{#NAME}'')].ata_smart_attributes.table[?(@.id=={#ID})].raw.string.first()'
    > -
    >   type: JAVASCRIPT
    >   parameters:
    >     - |
    >       var parsed = value.split(" ").filter(function(e){ return e === 0 || e })[0];
    >       parsed = parsed.split("+").filter(function(e){ return e === 0 || e })[0];
    >       return parsed
    > -
    >   type: RTRIM
    >   parameters:
    >     - h

Is it the right fix? :shrug: But it works. I'm getting correct values.

Zabbix isn't displaying "SMART [sdc sat]: Power on hours" correctly, because the item reads the value in as seconds, and changing the unit to hours just gives "13.65 Kh" instead of translating it into years/months/days/hours. I just removed the unit so that the raw value is displayed, but maybe there's a better solution there too. However, everything else is looking much better.

Hopefully this helps others with this template issue.
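The huge numbers quoted above look like several fields packed into one raw integer. The following is a minimal sketch, assuming (inferred only from the values quoted in this ticket, not from smartctl documentation) that the hours sit in the low 24 bits of the Power_On_Hours raw value and the temperature in the low 8 bits of the Temperature_Celsius raw value. `lowBits` is a hypothetical helper, and the snippet is meant for Node.js, not for use as Zabbix preprocessing.

```javascript
// Hypothetical helper: keep only the lowest `bits` bits of a raw SMART value.
// BigInt is used because JavaScript bitwise operators truncate Numbers to 32 bits.
function lowBits(raw, bits) {
    var mask = (1n << BigInt(bits)) - 1n;
    return Number(BigInt(raw) & mask);
}

// Power_On_Hours from the JSON above: raw.value 39973260637519, raw.string "13647 (36 91 0)"
// Assumption: the hour count is stored in the low 24 bits.
var hours = lowBits(39973260637519, 24);

// Temperature_Celsius value reported after the first fix attempt: 107374182440
// Assumption: the current temperature is stored in the low 8 bits.
var temperature = lowBits(107374182440, 8);

console.log(hours, temperature); // 13647 40
```

The layout differs per attribute and per drive, which is presumably why the already-decoded `raw.string` is the more practical field to read.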
|
Comments |
Comment by Chris Stackpole [ 2021 Apr 24 ] | ||||||
I forgot to mention: the reason for the double filter in the JavaScript is that items like the temperature return "40 (0 25 0 0 0)" while power on returns "13555h+05m+21.913s". Thus, I'm only grabbing the most significant part. However, without breaking out each attribute into its own item, I'm not sure of a better way of handling it. | ||||||
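To see what the two splits do to each string, here is a small sketch that runs the same logic outside Zabbix. The function names `leadingToken` and `rtrimH` are hypothetical, and `rtrimH` is only a stand-in for the template's RTRIM step with parameter `h`.

```javascript
// Same logic as the JAVASCRIPT preprocessing step in the proposed diff,
// wrapped in a function (hypothetical name) so it can be run outside Zabbix.
function leadingToken(value) {
    // First split: keep the part before the first space, e.g. "40" from "40 (0 25 0 0 0)".
    var parsed = value.split(" ").filter(function (e) { return e === 0 || e; })[0];
    // Second split: keep the part before the first "+", e.g. "13555h" from "13555h+05m+21.913s".
    parsed = parsed.split("+").filter(function (e) { return e === 0 || e; })[0];
    return parsed;
}

// Stand-in for the RTRIM step with parameter "h": strip trailing "h" characters.
function rtrimH(value) {
    return value.replace(/h+$/, "");
}

console.log(rtrimH(leadingToken("40 (0 25 0 0 0)")));    // 40
console.log(rtrimH(leadingToken("13555h+05m+21.913s"))); // 13555
```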
Comment by Chris Stackpole [ 2021 Apr 24 ] | ||||||
Dah! This was actually reported in the forums, however, I didn't find anything when I did a bug search. And it is obviously not fixed yet. I apologize if I missed an original bug report somewhere. | ||||||
Comment by Aleksey Volodin [ 2021 Jul 07 ] | ||||||
Steps to reproduce:
Result:
Expected:
Possible reason: Data was read from VALUE instead of RAW_VALUE. | ||||||
Comment by Maxim Chudinov (Inactive) [ 2021 Sep 28 ] | ||||||
Hello cstackpole. Your exploration is very helpful. Unfortunately, not all vendors and disk models provide the raw values in the string format you've shown, "raw": { "value": 39973260637519, "string": "13647 (36 91 0)" }. Other disks report, for example, "raw": { "value": 21478, "string": "21478" } or "raw": { "value": 13305293286956522, "string": "24042h+51m+37.880s" }, and ID#194 Temperature_Celsius in the format "raw": { "value": 24, "string": "24" }. I suggest adding ata_smart_attributes.table[?(@.id=={#ID})].raw.string.first() as a string value, without JavaScript preprocessing. Do you agree? | ||||||
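For readers unfamiliar with JSONPath, the lookup above can be sketched in plain JavaScript. This is an illustration only: the function name `rawString` and the disk name "sda" are hypothetical, the input shape is inferred from the template's JSONPath, and the two `raw` objects are quoted from this ticket but combined into one disk for brevity.

```javascript
// Sample input shaped the way the template's JSONPath expects it: an array of
// disks, each with disk_name and ata_smart_attributes.table.
// "sda" is a placeholder name; the raw objects are quoted from this ticket.
var disks = [
    {
        disk_name: "sda",
        ata_smart_attributes: {
            table: [
                { id: 9, raw: { value: 39973260637519, string: "13647 (36 91 0)" } },
                { id: 194, raw: { value: 24, string: "24" } }
            ]
        }
    }
];

// Plain-JavaScript equivalent of
// $[?(@.disk_name=='{#NAME}')].ata_smart_attributes.table[?(@.id=={#ID})].raw.string.first()
function rawString(data, diskName, attributeId) {
    for (var i = 0; i < data.length; i++) {
        if (data[i].disk_name !== diskName) { continue; }
        var table = data[i].ata_smart_attributes.table;
        for (var j = 0; j < table.length; j++) {
            if (table[j].id === attributeId) { return table[j].raw.string; }
        }
    }
    return null; // no matching disk or attribute
}

console.log(rawString(disks, "sda", 9));   // 13647 (36 91 0)
console.log(rawString(disks, "sda", 194)); // 24
```

Returning the string untouched leaves vendor-specific formats such as "24042h+51m+37.880s" intact instead of guessing how to parse them.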
Comment by Chris Stackpole [ 2021 Sep 28 ] | ||||||
Adding it as raw string is probably safer. I agree. Thanks! | ||||||
Comment by Maxim Chudinov (Inactive) [ 2021 Oct 01 ] | ||||||
Available in:
|