[ZBX-16347] Postgresql out of memory using timescaledb Created: 2019 Jul 06  Updated: 2024 Apr 10  Resolved: 2020 Jun 04

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 4.2.4
Fix Version/s: 4.4.9rc1, 5.0.1rc1, 5.2.0alpha1, 5.2 (plan)

Type: Problem report Priority: Major
Reporter: Aaron Whiteman Assignee: Artjoms Rimdjonoks
Resolution: Fixed Votes: 34
Labels: TimescaleDB
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Db: CentOS Linux 7, Postgresql 11.4, timescaledb-11-1.2.2
Server: CentOS Linux 7, zabbix 4.2.4 (from RPM)


Attachments: PNG File image-2020-04-01-11-28-53-520.png     File pgsql.log     File zabbix_server.conf     File zabbix_server.log    
Issue Links:
Duplicate
Sub-task
depends on ZBX-18854 TimescaleDB 2.0 changes are not compa... Closed
Team: Team C
Sprint: Sprint 63 (Apr 2020), Sprint 64 (May 2020)
Story Points: 1

 Description   

Steps to reproduce:

  1. Install PgSQL 11 + timescaledb
  2. Configure with 9 proxies, with vps ~ 2200
  3. Restart zabbix-server process

Result:
On restart, the zabbix server sends a large UPDATE query that eventually exhausts available memory on the PostgreSQL server. The server is configured with 64 GB RAM and 40 GB swap, with a large work_mem to satisfy the requirements of Zabbix select queries.

 

When this occurs, the history syncer processes do not update history and the database/server falls behind, generating multiple triggers and alerts.

See log file...
See memory dump...
Expected:
History syncer does not generate queries that exhaust database memory.

 

Note: this appears to be the same issue reported in ZBX-9722.



 Comments   
Comment by Andrei Gushchin (Inactive) [ 2019 Jul 10 ]

It looks more like a TimescaleDB issue. TimescaleDB uses memory to achieve faster inserts compared to plain Postgres, then flushes the data to the DB.
The Zabbix insert code wasn't changed for TimescaleDB, so Zabbix behaves as usual.

Comment by Aaron Whiteman [ 2019 Jul 10 ]

Andrei, it may indeed be timescaledb, or it could be a behavior of pgsql when presented with a large sequence of update queries (see the previously reported ZBX-9722, which describes the same behavior on an older version of zabbix and pgsql without timescaledb).

This issue is pretty critical to us; the end result is that we need to throw out data to recover, because the update queries clog up the data insertion streams and new data can't be inserted until they complete. In some cases, we've seen delays that last several hours.

For the moment, we are more concerned about determining a workaround than a true fix; if there's an action we can take to prevent the issue from occurring after the zabbix server process is restarted, then we could implement that action.

Comment by Aaron Whiteman [ 2019 Jul 29 ]

This week, we will be restarting the zabbix server due to patching requirements. Do you have a suggestion on what we can do to avoid or mitigate the UPDATE query sequence that clobbers our database performance and results in significant delays in importing new data from our proxies?

 

Again, this is not new behavior, based on the long closed (WONTFIX) issue ZBX-9722 which reported how pgsql worked when using partitions (which are visible to pgsql when using timescaledb). ZBX-9722 was reported in 2015.

Comment by Andrei Gushchin (Inactive) [ 2019 Aug 30 ]

Aaron, thank you for the detailed update.
That UPDATE query just flushes trends from memory to the DB, so it basically happens every hour.
It's strange that Postgres behaves this way only when the Zabbix server is restarted. Do you suggest that it should be divided into several queries?

Comment by Aaron Whiteman [ 2019 Aug 30 ]

Andrei, based on a quick read of the code (and my very dated memories of C), I think the reason we only see this on a restart is due to the way the flush happens:

trend cache flushes happen at two times:

  • once an hour
  • when the server shuts down

In either case, any trend data is written to the database. During trend writes, the trend write

  1. checks to see if there is pre-existing data for that trend/hour
  2. If there is existing data 
    1. an UPDATE statement (text) is generated and pushed on to a list of UPDATE statements to be executed
    2. the item is removed from the pending trends to be written
    3. all UPDATES are written to the database as a single "statement"
  3. else
    1. the data is written as an INSERT

During normal operations (no restart), there is never an UPDATE, because the data is always new. When the server shuts down, there's no UPDATE, because the data is new. It's only the first time the trend cache is flushed after a restart that we try to do the UPDATE statements.

 

I believe, but have not verified, that when we send a large number of semicolon-separated UPDATE commands in a single statement, pgsql runs all of the updates simultaneously, resulting in large memory usage (and crashes due to OOM). If this is the case, then yes, running the updates sequentially rather than all at once would likely be an improvement.

That said, what would be even better is a way for the trend writes to be separated from the regular item data writes; our real problem is that when the trend UPDATE writes occur, all four of our database syncers get clogged and no data gets written until they are complete - end result: lots and lots of alerts that aren't real. We have an environment with 1,120,000 items (roughly 12k vps), so when a db_syncer is blocked for minutes at a time, it backs up quickly.
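To make the mechanics described above easier to follow, here is a minimal Python/psycopg2-style sketch of that flush behaviour. It is illustrative only - the real logic lives in the server's C code, and the trend_cache structure here is an assumption - but it shows why only the first flush after a restart produces the big concatenated UPDATE.

```python
# Illustrative sketch only (not the actual Zabbix C code): trends for hours
# that already exist in the DB become UPDATE statements concatenated into one
# big multi-statement string; hours that are new become plain INSERTs.
def flush_trends(cursor, trend_cache):
    updates, inserts = [], []
    for (itemid, clock), t in trend_cache.items():
        cursor.execute("SELECT 1 FROM trends WHERE itemid=%s AND clock=%s",
                       (itemid, clock))
        if cursor.fetchone():
            # pre-existing row for this item/hour -> queue an UPDATE as text
            updates.append(
                "UPDATE trends SET num=%d,value_min=%f,value_avg=%f,value_max=%f"
                " WHERE itemid=%d AND clock=%d;"
                % (t["num"], t["min"], t["avg"], t["max"], itemid, clock))
        else:
            inserts.append((itemid, clock, t["num"], t["min"], t["avg"], t["max"]))
    if updates:
        # all queued UPDATEs are sent as a single semicolon-separated statement
        cursor.execute("".join(updates))
    if inserts:
        cursor.executemany(
            "INSERT INTO trends (itemid,clock,num,value_min,value_avg,value_max)"
            " VALUES (%s,%s,%s,%s,%s,%s)", inserts)
```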

Comment by Tim Thompson [ 2019 Sep 25 ]

For what it's worth, I have this exact same issue since making the switch to timescaledb. On zabbix startup, 4 processes start pushing large UPDATEs and oom-killer eventually takes them all out. This cycles for a while but eventually sorts itself out.

Comment by Guillaume [ 2019 Oct 16 ]

We have the exact same problem here.

  • Zabbix 4.2 or 4.4
  • PostgreSQL 10 & TimescaleDB 1.4.2
  • 130 GB RAM, and appropriate PostgreSQL optimisations

If I disable all hosts and enable them again, it works again, but it may break again after one hour.

Comment by Andrei Gushchin (Inactive) [ 2019 Oct 17 ]

[email protected] Do you see the same during start? And do you have proxies as well?

Could anyone share the proxy stats from Administration → Proxies? I'm interested in the number of items, hosts and NVPS.

Comment by Guillaume [ 2019 Oct 17 ]

For now, I only have the issue during start; I do not have the issue once an hour (I still have the housekeeper running each hour).

I have only 3 proxies, and only 1 is updated to 4.4 for now (the other 2 are waiting for a package update on pfSense), so only 1 is running.

Regarding the proxy that is running:

  • Hosts: 28
  • Items: 1590
  • Required vps: 18.76
Comment by Tim Thompson [ 2019 Oct 17 ]

Single proxy. My issue only occurs during startup.

  • Host count: 7
  • Item count: 2547
  • VPS: 43.78
Comment by Aaron Whiteman [ 2019 Oct 17 ]

We have been working around this issue using the following process when we need to update the software on our zabbix server or reboot. We don't care for the process, because we are intentionally losing data, but it seems to be the best path to get our zabbix server back into a consistent state ready to read data.

  1. Create a maintenance item without data collection starting at 3 minutes before the hour for all hosts monitored by the zabbix environment. Typically, this can be a relatively short maintenance, but it must be at least long enough to go into the next hour. The key is that there is no incoming data from (for example) 6:57-7:01.
  2. Shut down the zabbix server only after the maintenance is in effect and data is no longer coming from the proxies or server processes. This will flush out the existing trend data to the database, so it may take some time to shut down.
  3. Start up the zabbix server after the next hour has begun. Because no new data is gathered for the "6 PM hour", zabbix will have no trend data in memory and will not attempt to create the "UPDATE" queries that are so problematic here.

That said, I often run into a different bug, ZBX-16130, and need to run a second maintenance of 10-15 minutes without data gathering after the server is back online to allow it to "catch up" to the proxy data.

Comment by Mark Moses [ 2019 Oct 28 ]

I have had a similar issue with PostgreSQL 10.6 (RHEL software collection) and TimescaleDB 1.4.2. With help from Zabbix support my installation just survived 2 rounds of trend updates on the whole hour (I restarted zabbix_server in between to trigger the update queries), so I feel these steps might benefit you as well.

The first thing to do is prevent memory overcommit by adding the following to /etc/sysctl.conf:

```
vm.overcommit_memory=2
```

 

Next, turn on huge pages by adding a line with vm.nr_hugepages=XXX to /etc/sysctl.conf. XXX is the number of huge pages the system should create. You can find that number by following the steps in section 18.4.5 on this page: https://www.postgresql.org/docs/10/kernel-resources.html. To be sure PostgreSQL uses huge pages you can edit postgresql.conf and set huge_pages = on (the default is try). If it cannot use huge pages, PostgreSQL won't start, so it's easy to tell.
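For reference, a rough Python sketch of the calculation described in that docs section (read the postmaster's peak virtual memory and divide by the kernel huge page size). The pid-file path below is an assumption - adjust it to your installation.

```python
# Rough sketch of the sizing method from PostgreSQL docs section 18.4.5:
# take the postmaster's peak virtual memory (VmPeak) and divide it by the
# kernel huge page size. The pid-file path is an assumption - adjust it.
def estimate_nr_hugepages(pid_file="/var/lib/pgsql/11/data/postmaster.pid"):
    with open(pid_file) as f:
        pid = f.readline().strip()        # first line of postmaster.pid is the PID
    vmpeak_kb = 0
    with open("/proc/%s/status" % pid) as f:
        for line in f:
            if line.startswith("VmPeak:"):
                vmpeak_kb = int(line.split()[1])   # reported in kB
    hugepage_kb = 2048                    # fallback; usually 2048 kB on x86_64
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("Hugepagesize:"):
                hugepage_kb = int(line.split()[1])
    return vmpeak_kb // hugepage_kb + 1

print("vm.nr_hugepages ~= %d" % estimate_nr_hugepages())
```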

 

You can activate the kernel settings by rebooting (there are other ways as well, but I'm trying to keep it simple). To check after the reboot you can use:

```
sysctl -a | grep -e vm.overcommit_memory -e vm.nr_hugepages
```

 

I also found that timescaledb-tune gave me way too high numbers for work_mem. This script can help identify whether your memory settings allocate more memory than you have system RAM available: https://raw.githubusercontent.com/jfcoz/postgresqltuner/93b48c262dc1c2dfc675df78b286584bb1331574/postgresqltuner.pl. According to that script, my PostgreSQL maximum usage is now at 84.80% of system RAM on a 32 GB shared app/db server.
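As a rough cross-check of what such a tuner reports, the worst-case PostgreSQL memory footprint is often estimated as shared_buffers plus max_connections times work_mem plus the autovacuum workers' maintenance_work_mem. A small sketch with placeholder values (not my actual settings):

```python
# Back-of-the-envelope estimate, similar in spirit to what postgresqltuner.pl
# reports: worst-case PostgreSQL memory ~= shared_buffers
#   + max_connections * work_mem + autovacuum_max_workers * maintenance_work_mem
# All numbers below are placeholders - substitute your own settings.
def pg_worst_case_mb(shared_buffers_mb, max_connections, work_mem_mb,
                     autovacuum_workers=3, maintenance_work_mem_mb=64):
    return (shared_buffers_mb
            + max_connections * work_mem_mb
            + autovacuum_workers * maintenance_work_mem_mb)

print(pg_worst_case_mb(shared_buffers_mb=8192, max_connections=100,
                       work_mem_mb=64))   # 14784 MB with these example values
```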

 

I'm actually pretty curious whether these steps help others as well, so please update the ticket with your experience if you decide to give this a try.

Comment by Oleksii Zagorskyi [ 2019 Oct 29 ]

I have to say that wsuzabbixapw is absolutely right in his understanding of how the Zabbix server behaves with trends, regarding INSERTs and UPDATEs.
And his scenario for working around the issue is probably the optimal one, for now.

Still, it's obvious that the issue is with TimescaleDB, not Zabbix.
But we will try to find some different workaround.

Comment by Mark Moses [ 2019 Oct 29 ]

To be complete: my earlier comment on resolving the issue wasn't really correct. I was so used to seeing the DB go down that I was just looking for that. Closer inspection of the log revealed that, now that the DB stays up, the UPDATE statements themselves fail with an OOM error.

 

Today I've experimented with several more settings. None of them actually worked but I'll document them here for completeness:

I've turned on:

timescaledb.max_open_chunks_per_insert=1
timescaledb.enable_runtime_exclusion=off

 

I also tested whether this issue was related: https://github.com/timescale/timescaledb/pull/1447. For this I compiled the timescaledb plugin from the master branch (last commit 3ad6a5c).

I've also tried decreasing shared_buffers in favor of more work memory while keeping the max memory consumption under 90% as reported by postgresqltuner.pl (see previous comment for a link to that script)

 

 

Comment by Denis Filinkov [ 2019 Nov 24 ]

I have the same issue: Debian 10, PostgreSQL 11, TimescaleDB 1.5.1.

PostgreSQL eats all of the memory if all proxy connections are down for 30 minutes and then send all their data to the server.

Comment by Andrei Gushchin (Inactive) [ 2019 Nov 25 ]

Hello Denis,

Thank you for the info.
Could you clarify a few points? How big is your DB?
Could you please share the NVPS for the proxies you use (an average number is fine)?
What hypertable settings do you use for the trends tables? Chunks by day or longer?
How big are the trends tables / hypertables?

I tried to reproduce it; the scenario was:
a 2 GB RAM machine with zabbix_server 4.4 and PostgreSQL 11 with TimescaleDB 1.5.1, with 5 proxies sending only numeric float and uint items, average 600 NVPS per proxy. I stopped it via cron, waited for 10-15 minutes, and tried many scenarios. The UPDATE can happen if we stop the zabbix server at any point inside an hour: when we start it again, it has to update the current trend stats in the DB.

Could you also please attach the query that causes the problem to the ticket?

Comment by Denis Filinkov [ 2019 Nov 25 ]

The DB is fresh =) about 30 GB, and it grows by about 6 GB per day.

NVPS for the server: 527

For the proxies: from 1 to 70 NVPS

The hypertable chunk interval is 2 hours.

 

I figured out that if you restart zabbix-server while it is running, then most likely Postgres will start eating memory.

 

It is resolved if I stop Zabbix and the proxies, then start Postgres, then wait for 2 minutes, then start zabbix-server.

I have 6 cores and 32 GB RAM.

 

 

Comment by Oleksii Korolov [ 2019 Nov 26 ]

Greetings, Gentlemen, 

First of all I would like to mention that the text below is not directly related to TimescaleDB, so please discard this comment if you think it is not relevant. In our team we tend to think that TimescaleDB itself is not the root cause, and that the issue is not directly related to Zabbix either. We assume that it is more related to ZBX-9722 and is more about partitioning than about TimescaleDB.
We use partitions with Zabbix 4.2 and PostgreSQL 10 and 11. The scales are different - from 50,000 items and 500 NVPS to 1,000,000 items and 10k NVPS - and we face the issue mentioned in this case on all types of setups.
The most likely scenarios in which we face it are the following:

  • random conditions, not related to any specific actions;
  • there is a significant change in item count, +-10% of the initial count;
  • there was a network outage with a Zabbix proxy and a lot of data is pushed to the server after that. That part works fine, but the next trend update makes PostgreSQL run OOM.

One of the prerequisites is a populated DB; in our case it is unlikely to hit the issue on a fresh DB.
The issue happens only for "UPDATE" actions, not "INSERT" ones. One of the tracebacks was:

 TopMemoryContext: 1327912 total in 26 blocks; 483824 free (1519 chunks); 844088 used
 TopTransactionContext: 8192 total in 1 blocks; 7768 free (3 chunks); 424 used
 pgstat TabStatusArray lookup hash table: 57344 total in 3 blocks; 22696 free (6 chunks); 34648 used
 PL/pgSQL function context: 57344 total in 3 blocks; 14640 free (3 chunks); 42704 used
 PL/pgSQL function context: 57344 total in 3 blocks; 14640 free (3 chunks); 42704 used
 Btree proof lookup cache: 8192 total in 1 blocks; 776 free (0 chunks); 7416 used
 PL/pgSQL function context: 57344 total in 3 blocks; 14640 free (3 chunks); 42704 used
 PL/pgSQL function context: 57344 total in 3 blocks; 14640 free (3 chunks); 42704 used
 Type information cache: 24472 total in 2 blocks; 2840 free (0 chunks); 21632 used
 PLpgSQL cast info: 8192 total in 1 blocks; 8072 free (0 chunks); 120 used
 PLpgSQL cast cache: 8192 total in 1 blocks; 1800 free (0 chunks); 6392 used
 PL/pgSQL function context: 57344 total in 3 blocks; 14560 free (2 chunks); 42784 used
 CFuncHash: 8192 total in 1 blocks; 776 free (0 chunks); 7416 used
 Rendezvous variable hash: 8192 total in 1 blocks; 776 free (0 chunks); 7416 used
 PLpgSQL function cache: 24528 total in 2 blocks; 2840 free (0 chunks); 21688 used
 TableSpace cache: 8192 total in 1 blocks; 2312 free (0 chunks); 5880 used
 Operator lookup cache: 24576 total in 2 blocks; 10976 free (5 chunks); 13600 used
 MessageContext: 3474997304 total in 426 blocks; 19856 free (10 chunks); 3474977448 used
 Operator class cache: 8192 total in 1 blocks; 776 free (0 chunks); 7416 used
 smgr relation table: 253952 total in 5 blocks; 30456 free (17 chunks); 223496 used
 TransactionAbortContext: 32768 total in 1 blocks; 32728 free (0 chunks); 40 used
 Portal hash: 8192 total in 1 blocks; 776 free (0 chunks); 7416 used
 PortalMemory: 8192 total in 1 blocks; 8152 free (1 chunks); 40 used
 Relcache by OID: 122880 total in 4 blocks; 46784 free (9 chunks); 76096 used
 CacheMemoryContext: 8380416 total in 10 blocks; 1541136 free (4 chunks); 6839280 used
 ...---...
 WAL record construction: 49768 total in 2 blocks; 6584 free (0 chunks); 43184 used
 PrivateRefCount: 8192 total in 1 blocks; 2840 free (0 chunks); 5352 used
 MdSmgr: 57344 total in 3 blocks; 12944 free (1 chunks); 44400 used
 LOCALLOCK hash: 516096 total in 6 blocks; 193824 free (21 chunks); 322272 used
 Timezones: 104120 total in 2 blocks; 2840 free (0 chunks); 101280 used
 ErrorContext: 8192 total in 1 blocks; 8152 free (4 chunks); 40 used
 Grand total: 3487648744 bytes in 1694 blocks; 2944136 free (1714 chunks); 3484704608 used

It seems that all the free memory was consumed either by the "SELECT" query which preceded the trends flush or by the "UPDATE trends" batch (note the ~3.4 GB MessageContext above).

The total amount of memory on that machine is 30 GB.

Hope this information helps.

Thank you.

Comment by Alex Kalimulin [ 2019 Nov 26 ]

okorolov, thanks for the detailed analysis!

Comment by ali engin [ 2019 Dec 12 ]

Hello,

When we attempted to upgrade from 4.2 to 4.4, we experienced the same issue, so we rolled back to the previous version (4.2). We are waiting for an official solution to this issue. Is there any plan for a fix?

Comment by Marco Scholl [ 2019 Dec 19 ]

I have created an issue for TimescaleDB on GitHub: https://github.com/timescale/timescaledb/issues/1569

Comment by Konstantin Kornienko [ 2020 Jan 31 ]

Same situation with Zabbix 4.4 and clean PostgreSQL 11 (AWS RDS), without Timescaledb.

 

Comment by Erhan ERTUL [ 2020 Jan 31 ]

Hello, we converted the trend tables from hypertables back into traditional tables and the issue is gone for now. We are waiting for the proper solution before moving them back to hypertables. Zabbix 4.4, PostgreSQL 11.6.

Comment by Alex Kalimulin [ 2020 Jan 31 ]

This might be related: https://www.postgresql-archive.org/memory-problems-and-crash-of-db-when-deleting-data-from-table-with-thousands-of-partitions-td6108612.html

Comment by Konstantin Kornienko [ 2020 Jan 31 ]

I found that the size of the concatenated 'update' query is regulated by this constant:

#define ZBX_MAX_SQL_SIZE  262144  /* 256KB */ 

in https://github.com/zabbix/zabbix/blob/master/include/zbxdb.h
Maybe a lower value should be used for partitioned Postgres?
It would be nice if this constant could be changed through zabbix_server.conf...
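For a sense of scale, here is a quick estimate of how many trend UPDATE statements fit into one such 256 KB batch. The statement text below is a typical example for illustration, not the exact string the server builds:

```python
# Rough illustration of the batch size implied by ZBX_MAX_SQL_SIZE.
ZBX_MAX_SQL_SIZE = 262144  # bytes, from include/zbxdb.h

sample = ("update trends set num=60,value_min=0.1,value_avg=0.2,value_max=0.3"
          " where itemid=123456 and clock=1587549600;\n")
print(ZBX_MAX_SQL_SIZE // len(sample))  # roughly a couple of thousand statements per batch
```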

Comment by Piotr Goczal [ 2020 Feb 27 ]

The solution with vm.overcommit_memory=2 and vm.nr_hugepages suggested in the comment above fixed (at least for now) my problem with the trends update eating all Postgres server memory.

Comment by Konstantin Kornienko [ 2020 Feb 27 ]

Thanks for sharing info!

But it isn't suitable for cloud DBs  

We use AWS RDS Postgres and have huge_pages=on, but it doesn't help; the partitioned DB is crashing.

Controlling vm.overcommit_memory isn't supported in AWS RDS.

Comment by Oleksii Korolov [ 2020 Feb 27 ]

In our experience vm.overcommit_memory=2 will help your PostgreSQL stay alive, but depending on your luck it will either handle the load eventually or simply get stuck in loops trying to update trends.

Also, in our situation it doesn't help on small instances where RAM <= 4 GB.

Konstantin, that's why we moved from RDS to self-managed EC2 instances with PostgreSQL.

 

Comment by Evgenii grekov [ 2020 Mar 04 ]

Hi!

We have a similar problem: TimescaleDB on PostgreSQL 11 and Zabbix 4.4. After restarting the Zabbix server, the database starts eating RAM until it has consumed the whole swap as well. If you limit the memory of the Docker container, the CPU gets overwhelmed with interrupts instead; in the end everything crashes anyway.

Erhan ERTUL, you wrote that converting trends from a hypertable to a traditional table solved the memory problem - could you tell me how to convert the table back to a traditional one?

Comment by Erhan ERTUL [ 2020 Mar 04 ]

Hello
That is quite easy to do, though it may take some time if you want to keep the data. I will post the exact steps tomorrow; my notes are not with me now. Keep in touch.
Regards.

Comment by Erhan ERTUL [ 2020 Mar 05 ]

Hello again,

Here are the exact steps to go back to traditional tables instead of hypertables for trends.

#Disable Zabbix first.
#Backup your database.

#Create traditional trend tables as the old definitions with different names.
--CREATE TABLE trends_new (LIKE trends INCLUDING ALL);
--CREATE TABLE trends_uint_new (LIKE trends_uint INCLUDING ALL);

#If you want to trim some older data before migration to save time, you can drop chunks before a specific date. This is optional.
##<epoch date> example: for 1st of January 2020 use 1577836800.
--select drop_chunks(<epoch date>,'trends')
--select drop_chunks(<epoch date>,'trends_uint')

#Then you can migrate old data into new one if needed. This is optional.
##Be sure that you have enough storage to duplicate the tables.
--insert into trends_new select * from trends;
--insert into trends_uint_new select * from trends_uint;

#Be sure that the new tables have the right data. Check the counts.
--select count(itemid) from trends_uint
--select count(itemid) from trends_uint_new

--select count(itemid) from trends
--select count(itemid) from trends_new

#Now you can drop old timescaled tables.
--drop table trends;
--drop table trends_uint;

#Lastly, rename new traditional tables as needed.
--alter table trends_new rename to trends;
--alter table trends_uint_new rename to trends_uint;

#Let Zabbix know that the trend tables are not timescaled any more.
--UPDATE config SET hk_trends_global=0;

#Now you can start your Zabbix.

Best regards.

Comment by Evgenii grekov [ 2020 Mar 05 ]

Erhan ERTUL, oh thank you very much, this is just what I need, you really helped me out. I will also try to switch to classic tables. I'll write about the results.

Comment by Evgenii grekov [ 2020 Mar 05 ]

Erhan ERTUL, after converting the trends into classic tables, the problem with memory consumption disappeared - at least so far no problems have been noticed.

We are waiting for an update to go back to hypertables.

And could you tell me whether to clean the trend history through the housekeeper or with a query directly in the database? I will not try it today, the working day is ending; tomorrow I will try to turn on the housekeeper and watch the load on the database and memory.

Comment by Alex Kalimulin [ 2020 Mar 05 ]

borin, how many chunks did you have in your trends tables?

Comment by Evgenii grekov [ 2020 Mar 05 ]

Kalimulin, in trends there were about 370 chunks; by size, trends occupied about 900 MB including the index, and trends_uint about 7 GB including the index.

Comment by Alex Kalimulin [ 2020 Mar 05 ]

borin, thanks. 370 chunks * 4.3 MB means that in your case some update operations may require as much as 1.6 GB of RAM to execute, if I understand correctly what is said here.

Comment by Evgenii grekov [ 2020 Mar 05 ]

Kalimulin, I agree, but while the trends were in a hypertable, TimescaleDB could gobble up 6 GB of allocated RAM and 4 GB of swap and crash from lack of memory.

Comment by Evgenii grekov [ 2020 Mar 05 ]

After I switched back to classic tables for trends, the runaway RAM consumption stopped.

Comment by Piotr Goczal [ 2020 Mar 11 ]

@bilbolodz:

The solution with vm.overcommit_memory=2 and vm.nr_hugepages suggested in the comment above fixed (at least for now) my problem with the trends update eating all Postgres server memory.

Unfortunately it's no longer true. After an upgrade to Zabbix 4.4.6 (probably not the cause) and a restart, my problem is back. I've converted trends and trends_uint back to normal tables. I hope it will help.

Comment by Rainer Stumbaum [ 2020 Mar 23 ]

Last week I migrated our database from MySQL to Postgres with TimescaleDB because the documentation says it is supported (https://www.zabbix.com/documentation/current/manual/appendix/install/timescaledb).

After struggling with missing data for the second time now, I am changing trends and trends_uint back.

alter table trends rename to trends_old;
CREATE TABLE trends (
        itemid                   bigint                                    NOT NULL,
        clock                    integer         DEFAULT '0'               NOT NULL,
        num                      integer         DEFAULT '0'               NOT NULL,
        value_min                numeric(16,4)   DEFAULT '0.0000'          NOT NULL,
        value_avg                numeric(16,4)   DEFAULT '0.0000'          NOT NULL,
        value_max                numeric(16,4)   DEFAULT '0.0000'          NOT NULL,
        PRIMARY KEY (itemid,clock)
);
alter table trends_uint rename to trends_uint_old;
CREATE TABLE trends_uint (
        itemid                   bigint                                    NOT NULL,
        clock                    integer         DEFAULT '0'               NOT NULL,
        num                      integer         DEFAULT '0'               NOT NULL,
        value_min                numeric(20)     DEFAULT '0'               NOT NULL,
        value_avg                numeric(20)     DEFAULT '0'               NOT NULL,
        value_max                numeric(20)     DEFAULT '0'               NOT NULL,
        PRIMARY KEY (itemid,clock)
);
INSERT INTO trends(itemid, clock, num, value_min, value_avg, value_max)
SELECT * FROM trends_old
ON CONFLICT DO NOTHING;
INSERT INTO trends_uint(itemid, clock, num, value_min, value_avg, value_max)
SELECT * FROM trends_uint_old
ON CONFLICT DO NOTHING;

This might take some time here:

table_schema | table_name      | table_owner | num_dimensions | num_chunks | table_size | index_size | toast_size | total_size
public       | trends_old      | zabbix      | 1              | 733        | 93 GB      | 37 GB      | NULL       | 130 GB
public       | trends_uint_old | zabbix      | 1              | 733        | 77 GB      | 38 GB      | NULL       | 115 GB

 

Why does the documentation still say it is supported with such a bug existing and seemingly no light at the end of the tunnel?!? Please adjust the documentation and warn about the missing stability!

Comment by Markus Fischbacher [ 2020 Apr 01 ]

We have the same problem here.

@eertul would changing only the history tables back to plain postgres interfere with housekeeper?

Comment by Rainer Stumbaum [ 2020 Apr 01 ]

I did not experience problems with the history* tables, only with trends* tables.

So my history* tables are still timescaledb-hypertables and only trends* tables are standard postgres tables now.

I start the housekeeper by using cron at fixed times so I set HousekeepingFrequency to 0.

The Administration -> General -> Housekeeping settings look like this now (see the attached screenshot):

So the history* tables are cleaned by using drop_chunk and the trends* tables are cleaned by SQL Delete statements on a per item basis.

 

Comment by Markus Fischbacher [ 2020 Apr 01 ]

Sorry ... we too have trouble with the trends tables, not the history tables! But I converted the correct one.

Edit: will try something
Edit2: @rstumbaum you're right. The key is to disable the "override item trend period" option for now!

Comment by Rainer Stumbaum [ 2020 Apr 01 ]

As always: It is a bad idea to mix additional functionality (here to use drop_chunk) with an existing switch (here "Override item history period")...
And: Tooltips were developed for a reason...

Comment by Markus Fischbacher [ 2020 Apr 09 ]

As this seems to be related to Timescale, or more specifically to Postgres partitioning, wouldn't a warning in the docs be a good idea?

Also, see the timescale issue thread for some comments on longer-term chunk partitioning instead of daily. Maybe weekly, at least for trends, and even as the standard shipped by Zabbix out of the box? This would drastically lower the number of chunks over time. And the whole problem seems to impact only trends.

Comment by Alex Kalimulin [ 2020 Apr 09 ]

rockaut 

As this seems to be related to Timescale, or more specifically to Postgres partitioning, wouldn't a warning in the docs be a good idea?

Also, see the timescale issue thread for some comments on longer-term chunk partitioning instead of daily.

Thanks for the suggestion. Given that there is not much we can do about Postgres internals and the problem is still present in PostgreSQL 12, documenting it and changing the trends defaults sounds like a good idea.

Comment by George Machitidze [ 2020 Apr 10 ]

Is there anybody NOT running CentOS 7? All the reports I see are related to CentOS 7 packages... Please confirm.
In my case the problems started after enabling zabbix-java-gateway (you can assume it's a proxy).

Comment by Piotr Goczal [ 2020 Apr 10 ]

The same problem on Ubuntu 18.10

Comment by Rainer Stumbaum [ 2020 Apr 10 ]

George Machitidze, using TimescaleDB with a proxy is not supported.

I always suggest running software in the project's own Docker container. I am running TimescaleDB and Zabbix in their containers on a Debian 10 system, so I avoid the discussion that my problem is caused by my choice of the underlying operating system.

Comment by Markus Fischbacher [ 2020 Apr 14 ]

@giomac @georgemachitidze (does anybody know how to correctly link people in Jira?) ... TimescaleDB isn't supported on the proxy, as Rainer Stumbaum already mentioned - it's documented in the official docs too. Also, I feel it would be way too much overhead. Proxies don't store datasets that large or for that long. If you feel the need for TimescaleDB on your proxy, you should maybe split your proxies up more or look into other performance-related tuning. TimescaleDB really only shines for big datasets over longer timespans; there are negligible to zero performance gains for other datasets.

@Kalimulin ... great to hear that this issue is now "accepted". Could you please link to the git commit so we can have an early look at it?

Comment by George Machitidze [ 2020 Apr 14 ]

No worries, I didn't say it's running on a proxy.
It looks like a global issue with TimescaleDB, not just with a specific build.
I don't know what the Zabbix team can do in the Zabbix code; maybe just add a note in the docs that TimescaleDB is supported but not recommended at this moment...

Comment by Markus Fischbacher [ 2020 Apr 14 ]

Sorry, so I misunderstood that. Yeah, it's definitely not a problem of specific packages. I have Ubuntu in testing and RHEL in prod - both with problems.

If you read the comments above, I think the team is accepting this issue as existing; they might highlight it in the docs and may also change the timespan for the TimescaleDB chunks. That would lead to fewer table partitions, which seem to be the problem for Postgres.

I also updated the issue over at Timescale and requested a status there. Postgres had some big changes to partitioning with version 10 (or was it 11?), and now with 12 there were some optimizations again, so I think we might see fixes there. That said, I couldn't find an official issue for Postgres itself. If someone has already found something, please state it here; if not, I'm thinking about creating an issue there too and linking the other two to it. Maybe other people with other solutions have this trouble too.

Comment by Konstantin Kornienko [ 2020 Apr 17 ]

Zabbix team, please keep in mind that "plain" partitioned Postgres (without TimescaleDB) is also affected.

Thanks for deciding to fix this!

Comment by Markus Fischbacher [ 2020 Apr 17 ]

Hmm... @konstantin.kornienko I doubt they can FIX it, as it's a problem of Timescale and Postgres. I guess they will just document it and change the timespans for TimescaleDB chunks. So if you have natively partitioned Postgres you might have to do it on your own. BTW: is natively partitioned Postgres supported? I actually like the idea, but if it isn't, maybe it would need a separate feature request.

Comment by Konstantin Kornienko [ 2020 Apr 17 ]

Markus, it can be fixed, I've posted details earlier (Jan 31, 14:14). The problem is that Zabbix performs very large SQL updates, like:

update trends set value=x where itemid=y;
update trends set value=x where itemid=y;
update trends set value=x where itemid=y;
update trends set value=x where itemid=y;
....

This kills Postgres. So the solution is simple, I think: these big updates should be split into smaller batches. For now the limit is 256 KiB.
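A minimal sketch of that splitting idea (illustrative only, not the Zabbix implementation): cap each semicolon-separated batch at a configurable byte size instead of always filling one huge statement.

```python
# Send a list of UPDATE statement strings in size-bounded batches. The default
# cap mirrors ZBX_MAX_SQL_SIZE; a partitioned setup could use something smaller.
def run_in_batches(cursor, statements, max_batch_bytes=262144):
    batch, size = [], 0
    for stmt in statements:
        if batch and size + len(stmt) > max_batch_bytes:
            cursor.execute("".join(batch))   # flush the current batch
            batch, size = [], 0
        batch.append(stmt)
        size += len(stmt)
    if batch:
        cursor.execute("".join(batch))       # flush the remainder
```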

>>>BTW: is native partitioned postgres supported?

I don't think it's supported by the Zabbix team, but we've implemented it and it works fine! I can share the solution if you want.

Comment by Rainer Stumbaum [ 2020 Apr 17 ]

IMHO the solution could be to just expose ZBX_MAX_SQL_SIZE as a parameter. Most people won't know about it and will leave it at the default; people using partitioning could adjust it to something smaller.

Comment by Konstantin Kornienko [ 2020 Apr 17 ]

Rainer, completely agreed.

Comment by Aaron Whiteman [ 2020 Apr 30 ]

Now that the status has changed to RESOLVED, where can I find the timeline for a 4.4.9 release?
<arimdjonoks> wsuzabbixapw - I believe the status "RESOLVED" currently has an incorrect description - it is supposed to be set by a developer when development is done and the change is assigned to another developer for review. The change is done and is in review at the moment. Once it passes review and/or QA testing, users will see the public commit and the status "DONE".

Comment by Rainer Stumbaum [ 2020 Apr 30 ]

There seems to be no commit related to this issue. How did it get fixed?

Comment by Artjoms Rimdjonoks [ 2020 May 12 ]

Migration note

This is a PostgreSQL issue: it is not optimised for updating tables with many partitions.
When Zabbix started to use TimescaleDB, it was thought that a 1-day interval for trends (which means the number of partitions may reach hundreds) would be optimal.
That is not a problem for tables that only get inserts. However, the trends and trends_uint tables do get updated every hour. These updates may cover thousands of items and lead to PostgreSQL running out of memory and crashing.

The natural solution to this issue is to limit the number of partitions on the system for trends and trends_uint tables.

Newer versions of Zabbix will do that by setting a significantly larger (1-month) chunk time interval in the updated database setup script (timescaledb.sql).

However, users that already have TimescaleDB configured with the old 1-day (86400-second) interval might want to upgrade the trends and trends_uint tables.

The quick solution would be to run the following:

SELECT set_chunk_time_interval('trends', 2628000);
SELECT set_chunk_time_interval('trends_uint', 2628000);
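(For reference, the 2628000-second figure is simply one twelfth of a 365-day year, i.e. roughly 30.4 days per chunk:)

```python
print(365 * 24 * 3600 // 12)   # 2628000 seconds = 1/12 of a 365-day year
print(2628000 / 86400.0)       # ~30.4 days per chunk
```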

Unfortunately, this does not affect the existing chunks which will stay in the system until they get deleted by the housekeeper (after 1 year by default).

A proper upgrade might take some time for large databases; it involves converting the existing historical 1-day chunks into 1-month chunks:
1) alter table trends rename to trends_old;
2) find the trends creation command from database/postgresql/schema.sql and run it inside psql - for my Zabbix version it looks like the following (yours could be slightly different):

CREATE TABLE trends (
        itemid                   bigint                                    NOT NULL,
        clock                    integer         DEFAULT '0'               NOT NULL,
        num                      integer         DEFAULT '0'               NOT NULL,
        value_min                DOUBLE PRECISION DEFAULT '0.0000'          NOT NULL,
        value_avg                DOUBLE PRECISION DEFAULT '0.0000'          NOT NULL,
        value_max                DOUBLE PRECISION DEFAULT '0.0000'          NOT NULL,
        PRIMARY KEY (itemid,clock)
);

3) make it a hypertable with updated 1 month interval:

SELECT create_hypertable('trends', 'clock', chunk_time_interval => 2628000, migrate_data => true);

4) move the data:

INSERT INTO trends(itemid, clock, num, value_min, value_avg, value_max)
SELECT * FROM trends_old
ON CONFLICT DO NOTHING;

5) Check that new trends table has the data and it looks reasonable:

select count(*) from trends; 

6) Check the resulting number of chunks on the system:

SELECT * FROM chunk_relation_size_pretty('trends');

It should be around 1 to 12 (it could be more if you have been running TimescaleDB for longer than a year and have not been running the housekeeper).

7) If everything is fine - delete the temporary tables:

drop table trends_old;

8) repeat all of those steps for the trends_uint table

That should be enough to deal with most use cases. However, if the issue still persists (which might happen for very large updates on systems with a low amount of RAM), the best option would be to stop using TimescaleDB (or partitioned PostgreSQL) for the trends and trends_uint tables:
run the same commands as before but without the "create_hypertable…" line, and
make sure that

SELECT * FROM chunk_relation_size_pretty('trends');

no longer returns any chunk info.
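For convenience, here is a hedged end-to-end sketch of steps 1-7 for the trends table in Python/psycopg2. The connection settings are placeholders; run it only on a backed-up database with the Zabbix server stopped, and repeat the same for trends_uint.

```python
# Sketch of the monthly-chunk migration described above (placeholder credentials).
import psycopg2

conn = psycopg2.connect(user="zabbix", password="zabbix",  # placeholders
                        host="localhost", database="zabbix")
conn.autocommit = True
cur = conn.cursor()

cur.execute("ALTER TABLE trends RENAME TO trends_old;")
cur.execute("""
    CREATE TABLE trends (
        itemid    bigint                             NOT NULL,
        clock     integer          DEFAULT '0'       NOT NULL,
        num       integer          DEFAULT '0'       NOT NULL,
        value_min DOUBLE PRECISION DEFAULT '0.0000'  NOT NULL,
        value_avg DOUBLE PRECISION DEFAULT '0.0000'  NOT NULL,
        value_max DOUBLE PRECISION DEFAULT '0.0000'  NOT NULL,
        PRIMARY KEY (itemid, clock)
    );""")
cur.execute("SELECT create_hypertable('trends', 'clock',"
            " chunk_time_interval => 2628000, migrate_data => true);")
cur.execute("INSERT INTO trends (itemid, clock, num, value_min, value_avg, value_max)"
            " SELECT * FROM trends_old ON CONFLICT DO NOTHING;")

cur.execute("SELECT count(*) FROM trends;")
print("rows copied: %d" % cur.fetchone()[0])
cur.execute("SELECT count(*) FROM chunk_relation_size_pretty('trends');")
print("chunks: %d" % cur.fetchone()[0])

# only after verifying that the counts look right:
# cur.execute("DROP TABLE trends_old;")
cur.close()
conn.close()
```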

Comment by Artjoms Rimdjonoks [ 2020 May 12 ]

Investigation 1 (why the monthly interval for trends and trends_uints is an optimal configuration)

Testing environment

PostgreSQL 12.2
TimescaleDB version 1.7.0
RAM - 8192 Mb
Ubuntu 18.04
(Zabbix database is re-recreated from scratch for every test)

1) TimescaleDB test, update on 700 TRENDS table rows

1.1) Set the interval of 3 seconds for trends in database/postgresql/timescaledb.sql

1.2)
Insert 700 items (using SQL for loop - for clock - one for each second starting from 1587549139)
Check that time_chunks were indeed created using the

SELECT * FROM chunk_relation_size_pretty('trends');

Result - 233 partitions created

1.3)
Update 700 items:

import psycopg2
from psycopg2 import Error

try:
    connection = psycopg2.connect(user = "zabbix",
                                  password = "useruser",
                                  host = "localhost",
                                  database = "zabbix")

    cursor = connection.cursor()
   
    c = 1587549139
    create_table_query = ""
    for x in range(700):

        create_table_query += " UPDATE trends set value_min=0.23,value_avg=0.24,value_max=0.25 where itemid=10062 and clock='%s';" % c 
        c = c+1

    print "executing: " + create_table_query
    cursor.execute(create_table_query)
    connection.commit()
    print "ALL READY"

except (Exception, psycopg2.DatabaseError) as error :
    print ("Error while creating PostgreSQL table", error)
finally:
        if(connection):
            cursor.close()
            connection.close()
            print("PostgreSQL connection is closed")

Result - 100% memory usage, database crash with the following error:

WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

Update is reverted

2.1) Set the interval of 4 seconds for trends in database/postgresql/timescaledb.sql
2.2) Insert 700 items the same as in 1.2)
Result - 175 partitions created
2.3) Update 700 items the same as in 1.3)
Result - 80% memory usage - no errors, update is performed successfully

2) PostgreSQL without TimescaleDB - update on 700 TRENDS table rows
Repeat the test now manually partitioning the data:

    create_trends_query = "CREATE TABLE trends ("\
        "itemid                   bigint                                    NOT NULL,"\
        "clock                    integer         DEFAULT '0'               NOT NULL,"\
        "num                      integer         DEFAULT '0'               NOT NULL,"\
        "value_min                DOUBLE PRECISION DEFAULT '0.0000'          NOT NULL,"\
        "value_avg                DOUBLE PRECISION DEFAULT '0.0000'          NOT NULL,"\
        "value_max                DOUBLE PRECISION DEFAULT '0.0000'          NOT NULL,"\
        "PRIMARY KEY (itemid,clock)"\
        ") PARTITION BY RANGE (clock);

Use the script to manually create partitions like:

   for x in range(233):
        create_table_query += " create table trends_%s partition of trends for values from (%d) to (%d);" % (c,c,c+3) 
        c = c+3

Result: the outcome is the same as for TimescaleDB - the update on 175 partitions proceeds, on 233 partitions it fails.

Conclusions
a) PostgreSQL does not handle updates on heavily partitioned tables well; this is not a TimescaleDB issue
b) the number of partitions should be limited for the tables that get updated (trends and trends_uint)
c) the current daily interval is too small - when the number of partitions reaches hundreds, a crash is likely

3) Test with a large update - 10000 items (10 times bigger than the 1000 items normally observed by support)

With a weekly interval (52 partitions) - the same crash on both TimescaleDB and PostgreSQL
With a monthly interval (12 partitions) - the update proceeds fine, no spike in memory usage detected

Conclusion - the monthly interval is an optimal solution for most systems.

Also, this value is currently suggested for MySQL partitioning at https://zabbix.org/wiki/Docs/howto/mysql_partitioning.

This might be reviewed if there is any progress made on https://www.postgresql-archive.org/memory-problems-and-crash-of-db-when-deleting-data-from-table-with-thousands-of-partitions-td6108612.html.

Comment by Markus Fischbacher [ 2020 May 12 ]

What does the community say about updating to 5.0 with the trend tables manually changed to native tables?

I would say we have to be careful with the applied .sql scripts, but the package doesn't touch the tables, so it should be fine?

Another option I'm thinking of is transforming the tables back into hypertables, but with the settings above?

Comment by Aaron Whiteman [ 2020 May 12 ]

I noticed that in your testing you are still creating a single concatenated UPDATE query, rather than running multiple updates. What happens when you run the same 700 updates in smaller batches (say 50 at a time)? What is the overall time it takes to run the 700 updates at once vs. smaller batches running in a loop?

What is the performance when you don't concatenate updates at all and just execute each update sequentially (preferably using prepare/execute methods)?

 

import psycopg2
from psycopg2 import Error


try:
    connection = psycopg2.connect(user = "zabbix",
                                  password = "useruser",
                                  host = "localhost",
                                  database = "zabbix")


    cursor = connection.cursor()


    c = 1587549139


    print "setting up prepared query for single update"
    cursor.execute("prepare myplan as UPDATE trends set value_min=0.23,value_avg=0.24,value_max=0.25 where itemid=10062 and clock=$1")
    for x in range(700):
        print "executing with variable: %d" % c
        cursor.execute("execute myplan(%s)", (c,))
        c = c+1
    
    connection.commit()
    print "ALL READY"


except (Exception, psycopg2.DatabaseError) as error :
    print ("Error while creating PostgreSQL table", error)
finally:
        if(connection):
            cursor.close()
            connection.close()
            print("PostgreSQL connection is closed")

<arimdjonoks> putting the queries in batches could help, but as I understand it this would not reflect how zabbix_proxy currently sends updates to the server, and changing this mechanism would be too big an architectural change for the system to consider.

Comment by Aaron Whiteman [ 2020 May 12 ]

Reducing the number of partitions may help, at the expense of significant variation in disk usage. Our trends table is ~27 GB and our trends_uint table ~921 GB, covering 400 days (about 13 months). At roughly 2.4 GB/day, changing the number of partitions from 400 to 13 would increase our disk usage by up to 71 GB, since the housekeeper can only drop whole chunks and would retain up to an extra month of data (admittedly a small percentage, but not a small number).

 

We're already very concerned about disk capacity and schema changes in general due to the amount of time and disk space it takes when an ALTER TABLE hits us. If this change is rolled out as described (reducing partitions), how do I test that my database will have the space/performance to make the transition? Will it be locked the entire time?

<arimdjonoks> There is no ALTER TABLE involved - a new copy of the trends table is created, and after the transition the original is deleted. If the new copy of the trends table fails to get populated due to lack of space, nothing is changed in the original trends table.
"Will it be locked the entire time?" - the Zabbix server should be off during the upgrade, so that new updates are not lost.

If there is not enough space to copy the trends tables, you can just update the chunk interval:

SELECT set_chunk_time_interval('trends', 2628000);
SELECT set_chunk_time_interval('trends_uint', 2628000);

and within a year all the original daily chunks will gradually be removed by the housekeeper, so no migration is required.

Comment by Artjoms Rimdjonoks [ 2020 May 13 ]

Investigation 2 (why we cannot just reduce the size of the update)

Using the scripts from the Investigation 1:

TimescaleDB, 7500 time_chunks, 1 row update
1.1) insert 7500 elements with chunk_interval of 1, so that ~7500 time_chunks get created
1.2) update single chunk:

for x in range(1):
        create_table_query += " UPDATE trends set value_min=0.23,value_avg=0.24,value_max=0.25 where itemid=10062 and clock='%s';" % c 
        c = c+1

1.3) Result - the same database crash as in the Investigation 1

2.1) TimescaleDB, 6000 time_chunks, 1 update
Result - update proceeds

3.1) PostgreSQL with manual partitions
Result - the same behaviour as TimescaleDB for the above experiments.

Conclusion:
Apparently PostgreSQL needs to load data from the other partitions into memory to update even a single partition. Even a single-row update can cause the crash when the number of partitions reaches a certain value. In this artificial example the number of chunks is unrealistically high, however:
a) updates that touch multiple rows (not one) would crash the DB with a smaller number of chunks
b) we cannot just split the update into thousands of tiny updates - it would surely have performance implications for the system
c) even if we implemented such (possibly complex and risky) logic it would not solve the issue, as there would eventually be systems with so little memory that even such a small update would crash the DB

Limiting the number of partitions for the tables that get regularly updated looks to be a much safer option.

Comment by Artjoms Rimdjonoks [ 2020 May 15 ]

Available in versions:

Please note that this fix is for new installations of TimescaleDB.
Existing installations must follow the upgrade notes in this comment:
https://support.zabbix.com/browse/ZBX-16347#comment-430816

Updated documentation:

Comment by Janis Ritmanis [ 2020 Jun 05 ]

I want to add one small note to the migration procedure for existing timescale setups (changing the data intervals).

The procedure worked successfully for me, but my mistake was that I connected to the DB as the postgres user for the procedure. As a result, the new tables were created with a different owner. So my suggestion would be to check the new tables using \dt to see whether the owner of the new tables is the correct one, and if it's not, change the owner.

Comment by Ran Givon [ 2020 Jun 13 ]

Thanks for the great workaround, Artjoms.

I easily managed to merge to fewer chunks - one per month.
I think two steps should be added to the process:

9) recreate the index on both the trends and trends_uint tables, for example:

CREATE INDEX trends_uint_clock_idx
 ON public.trends_uint USING btree
 (clock DESC NULLS FIRST)
 TABLESPACE pg_default;

10) run VACUUM

Comment by Tomasz Chruściel [ 2020 Jul 23 ]

Is it OK to keep the 7-day history and trends compression period after I've changed chunk_time_interval for trends and trends_uint to 1-month chunks?

Won't it try to compress "active" trends* chunks? Does it matter?

 

Comment by Artjoms Rimdjonoks [ 2020 Jul 24 ]

tomaszc It is OK to keep the 7-day history and trends compression while increasing the chunk_time_interval to 1 month. Timescale does not compress active chunks.
