[ZBX-7397] value cache continuously exhausting Created: 2013 Nov 18  Updated: 2017 May 30  Resolved: 2013 Dec 12

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 2.2.0
Fix Version/s: 2.2.2rc1

Type: Incident report
Priority: Trivial
Reporter: Robert Jerzak
Assignee: Andris Zeila
Resolution: Fixed
Votes: 1
Labels: valuecache
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screen Shot 2013-11-18 at 12.05.55.png, PNG File Screen Shot 2013-11-26 at 19.49.16.png, JPEG File ValueCache8M.jpg, PNG File testing.png

 Description   

The Zabbix value cache is continuously being exhausted. Cache usage grows by about 30% per week in my environment, and nothing seems to stop it.

zabbix_server.conf:
ValueCacheSize=256M



 Comments   
Comment by Arli [ 2013 Nov 18 ]

I tried smaller cache sizes, and it seems to me that usage actually levels out at some point and the cache then goes into low memory mode. See the attachment ValueCache8M.jpg.
In my test case ValueCacheSize=8M, and the cache goes into low memory mode when it reaches 6M and stays there.
Another weird thing is that it seems to never fill up to 100% (it keeps about 30% free).

Comment by Andris Zeila [ 2013 Nov 19 ]

In the second diagram, I understand the server was restarted on 15/11 around 22:00? Also, I'm wondering whether timeshift options are used in trigger functions (if yes, how many, and what is the average timeshift value)?

Comment by Arli [ 2013 Nov 19 ]

Yes, the Zabbix server was restarted at that time. I'm pretty sure that I have no timeshift parameters anywhere.
The refresh interval for most items (14K total) is 60s or more, and the majority of triggers (6K total) evaluate the last 0 to 3 values, while a minority use the last few hours' worth of data. 2/3 of the items are integers, 1/3 are floats, and there are too few of other types to mention.

Comment by Robert Jerzak [ 2013 Nov 26 ]

Here is a more descriptive screenshot. When only about 30% of the value cache was still free, the log said:

14595:20131126:123943.213 value cache is fully used: please increase ValueCacheSize configuration parameter
14517:20131126:124000.313 value cache is fully used: please increase ValueCacheSize configuration parameter
14580:20131126:124000.422 value cache is fully used: please increase ValueCacheSize configuration parameter

Comment by Andris Zeila [ 2013 Dec 03 ]

While I could not reproduce infinite growth of value cache usage, I found one issue which in theory might lead to infinite cache growth. In addition, the storage space calculations for new data chunks were improved, which halved value cache usage in my test environment (60K items, 300K triggers, max function period 2 hours).

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-7397

Comment by Aleksandrs Saveljevs [ 2013 Dec 04 ]

(1) At the very beginning of valuecache.c there is a brief description of how value cache works and there is a mention of "lastvalue storage mode". As far as I know, this mode of operation was removed. Could you please review and update the comment?

asaveljevs In a comment to vc_release_space(), there is a mention of VC_MIN_FREE_SPACE. That constant, however, is nowhere to be found.

asaveljevs There is also a comment in vc_add_item() that refers to "lastvalue cache". I wonder whether it is still current.

asaveljevs Your fixes in r40727 and r40735 look good. However, valuecache.c still refers to "history storage mode" (grep for "storage mode") in some places.

asaveljevs Additional change to documentation in r40769 looks good, too. Thank you! CLOSED.

Comment by Aleksandrs Saveljevs [ 2013 Dec 05 ]

(2) Please take a look at typo fixes in r40768 and r40772.

wiper thanks. Reviewed and CLOSED

Comment by Aleksandrs Saveljevs [ 2013 Dec 05 ]

(3) A couple of other minor questions during valuecache.c review. I wonder:

  1. why vch_item_free_cache() sets item->values_total to 0, but does not set item->head and item->tail to NULL;
  2. where in vch_item_free_chunk() objects of type zbx_history_record_t are freed and, if they are, whether in the return expression there should be "last - first + 1" instead of "last - first".

wiper

  1. In the only case where the item is not removed from the cache immediately after calling vch_item_free_cache(), the head/tail values are reset. But indeed it would be less confusing and less error-prone if the head/tail values were reset in vch_item_free_cache() even if the item is destroyed right after.
  2. The +1 is already included in sizeof(zbx_vc_chunk_t) (the zbx_history_record_t slots[1]; member).

asaveljevs 1. corresponds to what I thought (fixing that is optional), but 2. is an interesting idiom, as discussed on IRC. When we allocate zbx_vc_chunk_t, we allocate sizeof(zbx_vc_chunk_t) plus memory for additional slots. Then we use the slots[1] member to address these additional slots. CLOSED.
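
The two points discussed in (3) can be illustrated with a minimal C sketch. It is not taken from valuecache.c: the types record_t, chunk_t and item_t and the helpers chunk_size(), chunk_append() and item_free_cache() are hypothetical stand-ins, and the list layout (tail is oldest, head is newest) is an assumption made for the example. The sketch shows why only slots_num - 1 extra records are added to sizeof(chunk_t), and how a free routine can reset head and tail itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for zbx_history_record_t. */
typedef struct
{
	int	timestamp;
	double	value;
}
record_t;

/* Hypothetical stand-in for zbx_vc_chunk_t: one slot is part of the struct. */
typedef struct chunk
{
	struct chunk	*prev;
	struct chunk	*next;
	int		slots_num;
	record_t	slots[1];	/* additional slots follow in the same allocation */
}
chunk_t;

/* Hypothetical stand-in for the per-item cache data. */
typedef struct
{
	chunk_t	*head;		/* newest chunk (assumption) */
	chunk_t	*tail;		/* oldest chunk (assumption) */
	int	values_total;
}
item_t;

/* sizeof(chunk_t) already covers slots[0], so only slots_num - 1 records are added. */
static size_t	chunk_size(int slots_num)
{
	return sizeof(chunk_t) + sizeof(record_t) * (size_t)(slots_num - 1);
}

/* Allocate a chunk with room for slots_num records and link it as the new head. */
static chunk_t	*chunk_append(item_t *item, int slots_num)
{
	chunk_t	*chunk;

	if (NULL == (chunk = (chunk_t *)calloc(1, chunk_size(slots_num))))
		return NULL;

	chunk->slots_num = slots_num;
	chunk->prev = item->head;

	if (NULL != item->head)
		item->head->next = chunk;
	else
		item->tail = chunk;

	item->head = chunk;

	return chunk;
}

/* Free all chunks, return the number of bytes released and leave no dangling pointers. */
static size_t	item_free_cache(item_t *item)
{
	size_t	freed = 0;
	chunk_t	*chunk = item->tail;

	while (NULL != chunk)
	{
		chunk_t	*next = chunk->next;

		freed += chunk_size(chunk->slots_num);
		free(chunk);
		chunk = next;
	}

	item->values_total = 0;
	item->head = NULL;
	item->tail = NULL;

	return freed;
}
```

Declaring a one-element trailing array and addressing the memory allocated past it is the older form of this idiom; since C99 the same layout can be written with a flexible array member (slots[]), in which case sizeof no longer includes any slot and the size calculation would use slots_num instead of slots_num - 1.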

Comment by Aleksandrs Saveljevs [ 2013 Dec 06 ]

(4) Yesterday we talked about vch_get_new_chunk_slot_count() and the fact that comments above the calls to it are a bit outdated. Would you like to improve that?

wiper the new chunk slot count calculations are improved. RESOLVED in r40792

asaveljevs A better place for zbx_isqrt32() function is probably together with is_prime() and next_prime(), in include/zbxalgo.h and src/libs/zbxalgo/algodefs.c. When moving it, please correct its name (add "32") in the comment. You might also wish to consider returning "int" instead of "zbx_uint64_t" - that is natural, considering that argument is "int" and the result is not greater than the argument.

asaveljevs This function also fails unit tests. For instance, zbx_isqrt32(8) returns 3, whereas I would expect it to return 2 (i.e., given N, return the largest integer R such that R^2 <= N). Also, the function uses a non-straightforward algorithm, so it would be nice to describe the algorithm or link to the source. REOPENED.

wiper Fixed and moved zbx_isqrt32() (along with int128 functions) to zbxalgo.
RESOLVED in r40852

asaveljevs Thank you! CLOSED.
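
The contract stated in (4), "given N, return the largest integer R such that R^2 <= N", can be sketched in C as follows. This is not the zbx_isqrt32() implementation from the branch; isqrt32_sketch() is a hypothetical name, and the digit-by-digit (binary) square root method used here is chosen only because it is short and easy to check against that contract.

```c
#include <stdint.h>

/*
 * Digit-by-digit integer square root (hypothetical sketch).
 * Returns the largest r such that r * r <= n.
 */
static uint32_t	isqrt32_sketch(uint32_t n)
{
	uint32_t	result = 0;
	uint32_t	bit = (uint32_t)1 << 30;	/* highest power of four that fits in 32 bits */

	/* start from the highest power of four not greater than n */
	while (bit > n)
		bit >>= 2;

	while (0 != bit)
	{
		if (n >= result + bit)
		{
			/* the current binary digit of the root is 1 */
			n -= result + bit;
			result = (result >> 1) + bit;
		}
		else
			result >>= 1;	/* the current binary digit of the root is 0 */

		bit >>= 2;
	}

	return result;
}
```

With this sketch, isqrt32_sketch(8) yields 2 and isqrt32_sketch(9) yields 3, which matches the behavior expected in the review comment above.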

Comment by Aleksandrs Saveljevs [ 2013 Dec 10 ]

I have tested the branch with 1000 hosts linked to "Template OS Linux". Value cache usage is much more efficient now:

Comment by richlv [ 2013 Dec 11 ]

(5) Added a somewhat vague description of the improvement at https://www.zabbix.com/documentation/2.2/manual/introduction/whatsnew222 - please review/improve.

wiper changed the description slightly. Feel free to improve it more.

<richlv> thanks, CLOSED

Comment by Andris Zeila [ 2013 Dec 11 ]

Released in:
pre-2.2.2rc1 r40908
pre-2.3.0 r40909

Comment by richlv [ 2013 Dec 12 ]

reopen to set "fix version"
