Details

    • Type: Incident report
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.2rc1
    • Component/s: Server (S)
    • Labels:

      Description

      Zabbix value cache usage grows continuously until the cache is exhausted. Cache usage is growing about 30% per week in my environment, and nothing seems to stop it.

      zabbix_server.conf:
      ValueCacheSize=256M

        Activity

        Arli added a comment -

        I tried smaller cache sizes, and it seems to me that usage actually levels out at some point and the cache then goes into low memory mode. See the attachment ValueCache8M.jpg.
        In my test case ValueCacheSize=8M, and the cache goes into low memory mode when usage reaches 6M and stays there.
        Another weird thing is that it never seems to fill up to 100% (it keeps about 30% free).

        Andris Zeila added a comment -

        From the second diagram I understand that the server was restarted on 15/11 around 22:00? Also, I'm wondering whether timeshift options are used in trigger functions (if yes, how many, and what is the average timeshift value)?

        Arli added a comment -

        Yes, the Zabbix server was restarted at that time. I'm pretty sure that I have no timeshift parameters anywhere.
        The refresh interval for most items (14K total) is 60s or more. The majority of triggers (6K total) evaluate the last 0 to 3 values, while a minority use the last few hours' worth of data. 2/3 of the items are integers, 1/3 are floats, and other types are too few to mention.

        Robert Jerzak added a comment -

        A more descriptive screenshot is attached. When about 30% of the value cache was still free, the log said:

        14595:20131126:123943.213 value cache is fully used: please increase ValueCacheSize configuration parameter
        14517:20131126:124000.313 value cache is fully used: please increase ValueCacheSize configuration parameter
        14580:20131126:124000.422 value cache is fully used: please increase ValueCacheSize configuration parameter

        Andris Zeila added a comment -

        While I could not reproduce infinite growth of value cache usage, I found one issue which in theory might lead to infinite cache growth. In addition, the storage space calculations for new data chunks were improved, which halved value cache usage in my test environment (60K items, 300K triggers, max function period 2 hours).

        Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-7397

        Aleksandrs Saveljevs added a comment - - edited

        (1) At the very beginning of valuecache.c there is a brief description of how value cache works and there is a mention of "lastvalue storage mode". As far as I know, this mode of operation was removed. Could you please review and update the comment?

        Aleksandrs Saveljevs In a comment to vc_release_space(), there is a mention of VC_MIN_FREE_SPACE. That constant, however, is nowhere to be found.

        Aleksandrs Saveljevs There is also a comment in vc_add_item() that refers to "lastvalue cache". I wonder whether it is still current.

        Aleksandrs Saveljevs Your fixes in r40727 and r40735 look good. However, valuecache.c still refers to "history storage mode" (grep for "storage mode") in some places.

        Aleksandrs Saveljevs Additional change to documentation in r40769 looks good, too. Thank you! CLOSED.

        Aleksandrs Saveljevs added a comment - - edited

        (2) Please take a look at typo fixes in r40768 and r40772.

        Andris Zeila thanks. Reviewed and CLOSED

        Aleksandrs Saveljevs added a comment - - edited

        (3) A couple of other minor questions during valuecache.c review. I wonder:

        1. why vch_item_free_cache() sets item->values_total to 0, but does not set item->head and item->tail to NULL;
        2. where in vch_item_free_chunk() objects of type zbx_history_record_t are freed and, if they are, whether in the return expression there should be "last - first + 1" instead of "last - first".

        Andris Zeila

        1. In the only case where the item is not removed from the cache immediately after calling vch_item_free_cache(), the head/tail values are reset. But indeed it would be less confusing and less error-prone if the head/tail values were reset in vch_item_free_cache() even if the item is destroyed right after.
        2. The +1 is already included in sizeof(zbx_vc_chunk_t) (the zbx_history_record_t slots[1]; member).

        Aleksandrs Saveljevs 1. corresponds to what I thought (fixing that is optional), but 2. is an interesting idiom, as discussed on IRC. When we allocate zbx_vc_chunk_t, we allocate sizeof(zbx_vc_chunk_t) plus memory for additional slots. Then we use the slots[1] member to address these additional slots. CLOSED.
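
        The allocation idiom discussed in (3) can be sketched as follows. This is only an illustration under assumptions: record_t, chunk_t, chunk_size(), chunk_create() and chunk_slot() are hypothetical stand-ins, not the actual zbx_history_record_t / zbx_vc_chunk_t definitions from valuecache.c. It shows why the size of N slots is sizeof(chunk) plus N - 1 extra records: the one-element slots[1] member already accounts for the first slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* hypothetical stand-in for zbx_history_record_t */
typedef struct
{
	int	timestamp;
	double	value;
}
record_t;

/* hypothetical stand-in for zbx_vc_chunk_t */
typedef struct
{
	int		slots_num;	/* number of usable slots in this chunk */
	record_t	slots[1];	/* first slot; the others follow in the same allocation */
}
chunk_t;

/* bytes needed for a chunk with nslots slots; sizeof(chunk_t) already covers one slot */
static size_t	chunk_size(int nslots)
{
	return sizeof(chunk_t) + sizeof(record_t) * (size_t)(nslots - 1);
}

/* allocate a chunk large enough for nslots records, returns NULL if out of memory */
static chunk_t	*chunk_create(int nslots)
{
	chunk_t	*chunk;

	assert(nslots >= 1);

	if (NULL == (chunk = malloc(chunk_size(nslots))))
		return NULL;

	chunk->slots_num = nslots;

	return chunk;
}

/* address slot i (0 <= i < slots_num) through the one-element slots member */
static record_t	*chunk_slot(chunk_t *chunk, int i)
{
	return &chunk->slots[i];
}
```

        A caller would use chunk_create(4) and then chunk_slot(chunk, 0) through chunk_slot(chunk, 3), releasing the whole chunk with a single free(). Note that strictly conforming C99 code would declare the member as a flexible array member (slots[]) instead; the one-element form is the older, widely used variant of the same idea.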

        Aleksandrs Saveljevs added a comment - - edited

        (4) Yesterday we talked about vch_get_new_chunk_slot_count() and the fact that comments above the calls to it are a bit outdated. Would you like to improve that?

        Andris Zeila the new chunk slot count calculations are improved. RESOLVED in r40792

        Aleksandrs Saveljevs A better place for zbx_isqrt32() function is probably together with is_prime() and next_prime(), in include/zbxalgo.h and src/libs/zbxalgo/algodefs.c. When moving it, please correct its name (add "32") in the comment. You might also wish to consider returning "int" instead of "zbx_uint64_t" - that is natural, considering that argument is "int" and the result is not greater than the argument.

        Aleksandrs Saveljevs This function also fails unit tests. For instance, zbx_isqrt32(8) returns 3, whereas I would expect it to return 2 (i.e., given N, return the largest integer R such that R^2 <= N). Also, the function uses a non-straightforward algorithm, so it would be nice to describe the algorithm or link to the source. REOPENED.

        Andris Zeila Fixed and moved zbx_isqrt32() (along with int128 functions) to zbxalgo.
        RESOLVED in r40852

        Aleksandrs Saveljevs Thank you! CLOSED.
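
        The contract requested in (4), "given N, return the largest integer R such that R^2 <= N", can be sketched as below. This is not the zbx_isqrt32() implementation from the branch, which is not shown in this ticket; isqrt32_floor() is a hypothetical name, and the body uses the well-known digit-by-digit (binary) integer square root method as an assumption about one reasonable way to meet that contract.

```c
#include <stdint.h>

/* hypothetical helper: returns the largest r such that r * r <= n */
static uint32_t	isqrt32_floor(uint32_t n)
{
	uint32_t	result = 0;
	uint32_t	bit = (uint32_t)1 << 30;	/* highest power of four that fits in 32 bits */

	/* start from the highest power of four not exceeding n */
	while (bit > n)
		bit >>= 2;

	/* decide one binary digit of the root per iteration, from high to low */
	while (0 != bit)
	{
		if (n >= result + bit)
		{
			n -= result + bit;
			result = (result >> 1) + bit;
		}
		else
			result >>= 1;

		bit >>= 2;
	}

	return result;
}
```

        With this contract isqrt32_floor(8) yields 2 and isqrt32_floor(9) yields 3, which matches the expectation stated in the review comment. The method uses only shifts, additions and comparisons, so it needs no floating point.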

        Aleksandrs Saveljevs added a comment -

        I have tested the branch with 1000 hosts linked to "Template OS Linux". Value cache usage is much more efficient now:

        richlv added a comment - - edited

        (5) Added a somewhat vague description of the improvement at https://www.zabbix.com/documentation/2.2/manual/introduction/whatsnew222 - please review/improve

        Andris Zeila changed the description slightly. Feel free to improve it more.

        <richlv> thanks, CLOSED

        Andris Zeila added a comment -

        Released in:
        pre-2.2.2rc1 r40908
        pre-2.3.0 r40909

        richlv added a comment -

        reopen to set "fix version"


          People

          • Assignee:
            Andris Zeila
            Reporter:
            Robert Jerzak
          • Votes:
            1
            Watchers:
            6

            Dates

            • Created:
              Updated:
              Resolved: