[ZBX-9038] VMware item processing takes a lot of time Created: 2014 Nov 13  Updated: 2024 Apr 10  Resolved: 2018 Dec 02

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: None
Fix Version/s: 4.0.3rc1, 4.2.0alpha2, 4.2 (plan)

Type: Problem report Priority: Trivial
Reporter: Andris Zeila Assignee: Michael Veksler
Resolution: Fixed Votes: 1
Labels: performance, vmware
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File vmware-optimized-requests.diff    
Issue Links:
Duplicate
is duplicated by ZBX-9226 Improve vmware data collection perfor... Closed
Team: Team A
Sprint: Sprint 46, Nov 2018
Story Points: 3

 Description   

Currently most of the data retrieved from the VMware service (vCenter or hypervisor) is stored as received, in XML format. The parsing is done every time a poller requests a value.

There are two problems with it:

  1. the vmware collector is locked while a poller parses the required value. This is especially bad for discovery requests, where data for multiple hypervisors/virtual machines is parsed. However, it also adds up when single values are parsed.
  2. data from vmware services is usually retrieved less frequently than pollers query it, so the same value is parsed multiple times.

To fix this, the vmware collector must parse the data upon receiving it and store the already parsed values. This will speed up value retrieval and also reduce the shared memory requirements.
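
The parse-on-receive idea above can be sketched as follows. This is only an illustration: `hv_props_t`, `hv_props_parse()` and `hv_props_get()` are hypothetical names, not taken from the Zabbix source, and `extract_tag()` is a naive stand-in for a real XML parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical property slots; a real collector would define its own set. */
enum { PROP_NAME, PROP_STATUS, PROP_COUNT };

#define PROP_VALUE_MAX	64

/* Parsed values kept by the collector instead of the raw XML. */
typedef struct
{
	char	values[PROP_COUNT][PROP_VALUE_MAX];
}
hv_props_t;

/* Naive stand-in for an XML parser: copies the text between <tag> and </tag>. */
/* Returns 0 on success, -1 if the tag is missing or the value does not fit.  */
static int	extract_tag(const char *xml, const char *tag, char *out, size_t out_size)
{
	char		open_tag[32], close_tag[32];
	const char	*start, *end;
	size_t		len;

	assert(NULL != xml && NULL != tag && NULL != out);

	snprintf(open_tag, sizeof(open_tag), "<%s>", tag);
	snprintf(close_tag, sizeof(close_tag), "</%s>", tag);

	if (NULL == (start = strstr(xml, open_tag)))
		return -1;

	start += strlen(open_tag);

	if (NULL == (end = strstr(start, close_tag)))
		return -1;

	if ((len = (size_t)(end - start)) >= out_size)
		return -1;

	memcpy(out, start, len);
	out[len] = '\0';

	return 0;
}

/* Called once per update by the collector: parse everything up front. */
static int	hv_props_parse(const char *xml, hv_props_t *props)
{
	static const char	*tags[PROP_COUNT] = {"name", "overallStatus"};
	int			i;

	for (i = 0; i < PROP_COUNT; i++)
	{
		if (0 != extract_tag(xml, tags[i], props->values[i], PROP_VALUE_MAX))
			return -1;
	}

	return 0;
}

/* Called by pollers: a plain array lookup, no parsing while the collector is locked. */
static const char	*hv_props_get(const hv_props_t *props, int prop)
{
	return (0 <= prop && prop < PROP_COUNT) ? props->values[prop] : NULL;
}
```

With this layout the cost of parsing is paid once per collector update, and each poller request reduces to an index into `values`.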

Other places to improve:

  • increase lookup speed of performance entity counters (zbx_vmware_perf_entity_t:counters; use vector bsearch instead of plain search, or switch to a hashset)
  • store performance counters as an integer type instead of strings
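
A minimal sketch of both suggestions, using the standard `qsort()`/`bsearch()` rather than the Zabbix vector API. The `perf_counter_t` layout and the function names are assumptions made for illustration, not the actual `zbx_vmware_perf_entity_t` definition.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical counter record: numeric id plus an integer (not string) value. */
typedef struct
{
	uint64_t	counterid;
	uint64_t	value;
}
perf_counter_t;

/* Orders counters by id without risking overflow from subtraction. */
static int	perf_counter_compare(const void *a, const void *b)
{
	const perf_counter_t	*c1 = (const perf_counter_t *)a;
	const perf_counter_t	*c2 = (const perf_counter_t *)b;

	return (c1->counterid > c2->counterid) - (c1->counterid < c2->counterid);
}

/* Sort once after the counter list is built or changed. */
static void	perf_counters_sort(perf_counter_t *counters, size_t num)
{
	assert(NULL != counters || 0 == num);

	if (0 != num)
		qsort(counters, num, sizeof(perf_counter_t), perf_counter_compare);
}

/* O(log n) lookup on the sorted array; returns NULL when the id is absent. */
static perf_counter_t	*perf_counters_find(perf_counter_t *counters, size_t num, uint64_t counterid)
{
	perf_counter_t	key;

	if (0 == num)
		return NULL;

	key.counterid = counterid;
	key.value = 0;

	return (perf_counter_t *)bsearch(&key, counters, num, sizeof(perf_counter_t), perf_counter_compare);
}
```

A hashset would give O(1) average lookups instead, at the cost of extra memory; the sorted array keeps the data compact, which matters when it lives in shared memory.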


 Comments   
Comment by Andris Zeila [ 2014 Nov 28 ]

Based on the VMware support response regarding "-1" performance counter values:

The value "-1" in VPX_STAT_X tables means that performance data were not available or the value of counter is out of valid range (e.g. any negative value).
You can safely ignore this value and treat this as data not available for this time.

Regarding possible causes what comes to my mind is external environment issue like network issue during the host is sending statistics to Vcenter server.

We should ignore -1 values and store performance counter values as unsigned 64-bit integers.
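
One way this could look is sketched below. `perf_value_parse()` is a hypothetical helper, not a Zabbix function; it treats "-1" (and any other negative or malformed text) as "no data" and leaves the previously stored value untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Converts a counter value received as text into an unsigned 64-bit integer. */
/* Returns 0 on success, -1 if the value must be ignored (negative, empty,   */
/* non-numeric or out of range). On failure *value is not modified.          */
static int	perf_value_parse(const char *text, uint64_t *value)
{
	char			*end;
	unsigned long long	parsed;

	assert(NULL != text && NULL != value);

	/* strtoull() accepts a leading '-' and wraps the result around, */
	/* so anything that does not start with a digit is rejected here */
	if (0 == isdigit((unsigned char)*text))
		return -1;

	errno = 0;
	parsed = strtoull(text, &end, 10);

	if (0 != errno || '\0' != *end || parsed > UINT64_MAX)
		return -1;

	*value = (uint64_t)parsed;

	return 0;
}
```

The explicit digit check is the important part: without it `strtoull("-1", ...)` would succeed and yield the maximum unsigned value instead of signalling missing data.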

Comment by Andris Zeila [ 2015 Feb 02 ]

Currently we are retrieving top-level properties. For example, overallStatus, name, vm, summary, parent and datastore are retrieved for hypervisors. However, we often need only a small subset of the retrieved data. For example, we aren't using summary/runtime, which is quite large. So instead of retrieving the whole summary property we should specify detailed property paths:

            <urn:propSet>
               <urn:type>HostSystem</urn:type>
               <urn:pathSet>overallStatus</urn:pathSet>
               <urn:pathSet>name</urn:pathSet>
               <urn:pathSet>vm</urn:pathSet>
               <urn:pathSet>summary.quickStats</urn:pathSet>
               <urn:pathSet>summary.config</urn:pathSet>
               <urn:pathSet>summary.hardware</urn:pathSet>
               <urn:pathSet>parent</urn:pathSet>
               <urn:pathSet>datastore</urn:pathSet>
            </urn:propSet>
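
A rough sketch of how such a property set could be generated from a list of paths instead of being hardcoded. `propset_build()` is a hypothetical helper, and it assumes the type and path names are plain identifiers that need no XML escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes a <urn:propSet> element for the given object type and property  */
/* paths into buf. Returns 0 on success, -1 if the buffer is too small.   */
/* Assumption: type and paths contain no characters that need escaping.   */
static int	propset_build(const char *type, const char *const *paths, size_t paths_num, char *buf,
		size_t buf_size)
{
	size_t	offset, i;
	int	n;

	assert(NULL != type && NULL != buf && (NULL != paths || 0 == paths_num));

	n = snprintf(buf, buf_size, "<urn:propSet><urn:type>%s</urn:type>", type);

	if (0 > n || (size_t)n >= buf_size)
		return -1;

	offset = (size_t)n;

	for (i = 0; i < paths_num; i++)
	{
		n = snprintf(buf + offset, buf_size - offset, "<urn:pathSet>%s</urn:pathSet>", paths[i]);

		if (0 > n || (size_t)n >= buf_size - offset)
			return -1;

		offset += (size_t)n;
	}

	n = snprintf(buf + offset, buf_size - offset, "</urn:propSet>");

	if (0 > n || (size_t)n >= buf_size - offset)
		return -1;

	return 0;
}
```

Keeping the paths in one array makes it easy to see exactly which properties are requested and to trim the list further when a property turns out to be unused.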

wiper The largest data requests were optimized in the vmware-optimized-requests.diff patch. Tests showed that the retrieved configuration data size dropped from 250k to 112k (1 hypervisor, 12 virtual machines). Given the simplicity of this patch, maybe we should move it to a separate issue and also apply it to version 2.2.

wiper Based on this, a new issue (ZBX-9279) was created.

Comment by Andris Zeila [ 2016 Jan 26 ]

Hypervisor and virtual machine property lists were introduced in ZBXNEXT-3106. Now the vmware collector parses the received data and stores it as easily accessible property lists rather than as a single XML document that has to be parsed for every vmware item check.

While ZBXNEXT-3106 deals with the most frequently used data, there are still a few places where the XML parsing could be optimized further:

  • the following keys still use 'runtime' parsing of stored XML data to retrieve values:
    • vmware.cluster.status
    • vmware.eventlog
    • vmware.version
    • vmware.fullname
  • when creating hypervisor/virtual machine objects, the corresponding XML data is loaded into the XML parser multiple times. This loading is quite expensive, so the code should be restructured to load the XML document once and then pass it to the corresponding initialization functions.
Comment by Glebs Ivanovskis (Inactive) [ 2017 Dec 05 ]

vmware.eventlog took a step in the correct direction in ZBX-12497.

Comment by Michael Veksler [ 2018 Nov 28 ]

Available in:

  • 4.0.3rc1 r87392
  • 4.2.0alpha2 (trunk) r87393