[ZBX-25497] VMware host and datastores discovery fails via Proxy Created: 2024 Nov 01  Updated: 2025 Jan 08  Resolved: 2024 Dec 16

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P)
Affects Version/s: 7.0.4, 7.0.5, 7.2.0alpha1
Fix Version/s: 7.0.7rc1, 7.2.1rc1, 7.4.0alpha1

Type: Problem report
Priority: Major
Reporter: Konstantin
Assignee: Michael Veksler
Resolution: Fixed
Votes: 1
Labels: None
Remaining Estimate: 2h
Time Spent: 2h
Original Estimate: 4h
Environment:

VMware vCenter version 8.0.3.00400
Zabbix Proxy from 7.0.0 to 7.0.5 (tried on Ubuntu 22.04 and 24.04 with PostgreSQL 16)


Attachments: discovery_rule_timeout.jpg, test-v70u24-deb.7z, timeout_on_simplecheck.jpg, zabbix_proxy_7.0.7rc1.log.zip, zabbix_proxy_trace.log
Team: Team B
Sprint: S24-W48/49, S24-W50/51/52/1
Story Points: 0.5

 Description   

Steps to reproduce:

  1. Link template VMware to vCenter server host
  2. Execute discovery rules or template items

Result:
All simple checks fail with a timeout, and the discovery rules fail as well.
See attached screenshots.
Expected:
Simple check items receive the latest data, and datastores and ESXi hosts are created from the prototypes.

In the Proxy log we observe (full log is attached):

12855:20241101:185243.670 Unknown performance counter 718 type of unitInfo:gigaBytes
12855:20241101:185243.670 adding performance counter gpu/mem.used.gb[latest]:718
12855:20241101:185243.670 Unknown performance counter 718 type of unitInfo:gigaBytes
12855:20241101:185243.670 adding performance counter gpu/mem.used.gb[latest,absolute]:718
12855:20241101:185243.670 Unknown performance counter 719 type of unitInfo:gigaBytes
12855:20241101:185243.670 adding performance counter gpu/mem.reserved.gb[latest]:719
12855:20241101:185243.670 Unknown performance counter 719 type of unitInfo:gigaBytes
12855:20241101:185243.670 adding performance counter gpu/mem.reserved.gb[latest,absolute]:719
12855:20241101:185243.670 Unknown performance counter 720 type of unitInfo:gigaBytes
12855:20241101:185243.670 adding performance counter gpu/mem.total.gb[latest]:720
12855:20241101:185243.670 Unknown performance counter 720 type of unitInfo:gigaBytes
12855:20241101:185243.670 adding performance counter gpu/mem.total.gb[latest,absolute]:720

I've tried all Proxy versions from 7.0.0 to 7.0.5 with PostgreSQL on Ubuntu 22.04 and 24.04, and the result is the same. If I add this vCenter without a Proxy, discovery works as it should. I tried to reproduce the issue on vCenter 8.0.2, but everything works there because it has no GPU perf counters (I checked this via this script).
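
A minimal sketch, not part of the original report, for tallying the unrecognized unitInfo values in a proxy log. It assumes the message format seen in the excerpt above ("Unknown performance counter <id> type of unitInfo:<unit>"); the function name `unknown_units` is hypothetical.

```python
import re

# Matches lines such as:
#   12855:20241101:185243.670 Unknown performance counter 718 type of unitInfo:gigaBytes
UNKNOWN_UNIT = re.compile(r"Unknown performance counter (\d+) type of unitInfo:(\S+)")


def unknown_units(log_lines):
    """Map each unrecognized unitInfo value to the set of counter IDs that use it."""
    units = {}
    for line in log_lines:
        match = UNKNOWN_UNIT.search(line)
        if match:
            counter_id = int(match.group(1))
            unit = match.group(2)
            units.setdefault(unit, set()).add(counter_id)
    return units
```

Feeding it the lines of the attached proxy log (for example, `unknown_units(open(path))`, where `path` is wherever the log was saved) shows which units the running build does not recognize, which makes it easy to compare vCenter versions.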

Discovery and simple checks for the vCenter 8.0.3 server work fine when it is added to Zabbix Server without a Proxy.

 Comments   
Comment by Konstantin [ 2024 Nov 11 ]

Hi, is there any update for this bug?

Comment by dimir [ 2024 Nov 27 ]

Yes, it will be fixed ASAP.

Comment by Michael Veksler [ 2024 Nov 29 ]

As I see it, there are 2 different problems:

  1. VMware added a new enum value, gigaBytes, in 8.0 U3 that is not yet documented. Please test the dev build from [^test-deb70-zbx25497.7z], which has the new enum value added.
  2. Periodic timeouts from curl, which mean that vCenter does not respond in time.

Question: what is the value of VMwareTimeout on the server and on the proxy?

Comment by Konstantin [ 2024 Nov 30 ]

Thanks for this patch, but there seems to be a small problem with it:
it works only if I increase VMwareTimeout to 60 seconds. If I set VMwareTimeout to 30 seconds or less, it fails to discover some ESXi hosts, and I can observe something like this in the Proxy's trace log (I've attached a new trace log that shows this failure and the successful discovery after the timeout was increased to 60 s):
980:20241130:112713.262 End of vmware_service_init_hv():FAIL
980:20241130:112713.262 Unable initialize hv host-4043: Timeout was reached.
980:20241130:112713.262 In vmware_service_init_hv() hvid:'host-4038'
980:20241130:112713.262 In vmware_service_get_hv_data() guesthvid:'host-4038'

This is weird behavior, because the Zabbix server has VMwareTimeout=10s and discovers all objects in the same vCenter without a problem. Also, I've noticed that the provided fix depends on the package libevent-pthreads-2.1-7, so there may be a problem with that.
I've double-checked the network connections from the Server and from the Proxy to this vCenter. The only difference is one hop, a simple L3 device without inspection/firewall/NAT, so the jitter from the Zabbix Server and from the Proxy to vCenter is the same, approximately 2-3 ms.
PS: VMwareTimeout was set to the default value of 10 s on both the Server and the Proxy; after I applied the patch, I had to increase it to 60 s.
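
A minimal sketch, not part of the original comment, for listing the hypervisors that failed to initialize in a trace log. It assumes the message format seen above ("Unable initialize hv <hvid>: <reason>"); the function name `failed_hypervisors` is hypothetical.

```python
import re

# Matches lines such as:
#   980:20241130:112713.262 Unable initialize hv host-4043: Timeout was reached.
HV_INIT_FAIL = re.compile(r"Unable initialize hv (\S+): (.*)")


def failed_hypervisors(log_lines):
    """Map each hypervisor ID to the last failure reason logged for it."""
    failures = {}
    for line in log_lines:
        match = HV_INIT_FAIL.search(line)
        if match:
            failures[match.group(1)] = match.group(2).strip()
    return failures
```

Running it over trace logs captured with different VMwareTimeout values would show whether the same hosts time out each time or whether the failures move around.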

Comment by Michael Veksler [ 2024 Dec 02 ]

Considering that the codebase is the same for the server and the proxy, and if you think that the network is OK,

I would take a close look at libcurl (its version) or at the DNS server for the proxy (maybe you have a round robin and one of the DNS servers in the pool is not responding).
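
A minimal sketch of how the DNS suggestion above could be checked from the proxy host, using only the Python standard library. It is an illustration, not a Zabbix tool: the function names are hypothetical, port 443 is an assumption about how vCenter is reached, and a local resolver cache may hide a slow upstream DNS server.

```python
import socket
import time


def lookup_times(host, port=443, attempts=5):
    """Time repeated name lookups; one slow outlier may point at a bad DNS server."""
    times = []
    for _ in range(attempts):
        start = time.monotonic()
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        times.append(time.monotonic() - start)
    return times


def resolve_all(host, port=443):
    """Return every distinct IP address the resolver offers for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def connect_time(address, port=443, timeout=5.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # covers refused connections and timeouts
        return None
```

Calling `lookup_times` with the vCenter hostname, and then `connect_time` for each address from `resolve_all`, on both the server and the proxy would show whether name resolution or one particular address is slow only from the proxy.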

I rebuilt the dev test build for Ubuntu 24.04:
test-v70u24-deb.7z

Comment by Michael Veksler [ 2024 Dec 11 ]

Available in:

Generated at Fri Apr 04 17:35:14 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.