[ZBX-4564] change Zabbix daemons priority on Linux Created: 2012 Jan 19  Updated: 2017 May 30  Resolved: 2012 Jan 26

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G), Proxy (P), Server (S)
Affects Version/s: None
Fix Version/s: 1.8.11, 1.9.9 (beta)

Type: Incident report Priority: Major
Reporter: Oleksii Zagorskyi Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by ZBX-4826 Something (agent?) on heavily-loaded ... Closed

 Description   

All Zabbix daemons start with priority "5", hardcoded in the sources.
See revision 228, dated 2001-10-01 (Zabbix 1.0alpha12).

It would be nice to remove this code, which would allow priority to be controlled at the init-script level.
The default priority for daemons and other processes on Linux is "0", so the hardcoded priority "5" is a sort of outdated feature.

#ifdef HAVE_SYS_RESOURCE_SETPRIORITY
	if (0 != setpriority(PRIO_PROCESS, 0, 5))
		zbx_error("Unable to set process priority to 5. Leaving default.");
#endif
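Once the hardcoded call is gone, the priority can be set at the init-script level instead, for example with nice(1). A minimal sketch (the daemon path is illustrative, not from the sources):

```shell
# An init script can choose the priority explicitly when starting the daemon,
# e.g. "nice -n 5 /usr/local/sbin/zabbix_server" (illustrative path).
# Demonstrated here with a harmless command instead of the real daemon:
nice -n 5 sh -c 'ps -o nice= -p $$'
```

An already-running daemon can likewise be adjusted with renice(8) on its PID.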



 Comments   
Comment by Alexei Vladishev [ 2012 Jan 19 ]

I think we should remove this code as suggested. I am working on it.

Comment by Alexei Vladishev [ 2012 Jan 24 ]

Fixed in dev branch branches/dev/ZBX-4564, ready to test.

<zalex> Agent and Server of dev branch tested. All is fine. Seems ready to code review and merge.

Comment by Alexander Vladishev [ 2012 Jan 25 ]

Great! Tested successfully!

Comment by Alexei Vladishev [ 2012 Jan 26 ]

Implemented in revision 25024.

Comment by Oleksii Zagorskyi [ 2012 Jan 26 ]

Note: Fixed in pre-1.8.11 r25022, pre-1.9.9 r25023





[ZBX-4535] zabbix - FTBFS with ld --as-needed Created: 2012 Jan 11  Updated: 2017 May 30  Resolved: 2012 Jan 28

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G), Server (S)
Affects Version/s: 1.8.9, 1.8.10
Fix Version/s: 1.8.11, 1.9.9 (beta)

Type: Incident report Priority: Blocker
Reporter: Leo Iannacone Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu Dev Precise (12.04)


Attachments: File fix-ftbfs-ld-as-needed.patch    
Issue Links:
Duplicate
duplicates ZBX-494 Cannot compile with --as-needed linke... Closed
duplicates ZBX-3475 Compilation error --with-sqlite3 on F... Closed
is duplicated by ZBX-3898 Fails to build with "gold" binutils l... Closed

 Description   

Dear developers,

zabbix fails to build with the "ld --as-needed" flag enabled (as set in Ubuntu), because the LDAP and PostgreSQL libraries are added in the wrong place on the link command line.

Snippet from build fail about zabbix_agent and LDAP:
i686-linux-gnu-gcc -Wall -g -O2 -I/usr/include/postgresql -I/usr/local/include -I/usr/lib/perl/5.14/CORE -I. -I/usr/include -I/usr/include -I/usr/include -I/usr/include -rdynamic -Wl,-Bsymbolic-functions -Wl,-z,relro -o zabbix_agent -L/usr/lib -lldap -llber zabbix_agent.o stats.o cpustat.o diskdevices.o perfstat.o vmstats.o zbxconf.o ../../src/libs/zbxsysinfo/libzbxagentsysinfo.a ../../src/libs/zbxsysinfo/linux/libspecsysinfo.a ../../src/libs/zbxsysinfo/common/libcommonsysinfo.a ../../src/libs/zbxsysinfo/simple/libsimplesysinfo.a ../../src/libs/zbxlog/libzbxlog.a ../../src/libs/zbxalgo/libzbxalgo.a ../../src/libs/zbxsys/libzbxsys.a ../../src/libs/zbxnix/libzbxnix.a ../../src/libs/zbxcomms/libzbxcomms.a ../../src/libs/zbxconf/libzbxconf.a ../../src/libs/zbxcommon/libzbxcommon.a ../../src/libs/zbxcrypto/libzbxcrypto.a ../../src/libs/zbxjson/libzbxjson.a ../../src/libs/zbxexec/libzbxexec.a -lm -lresolv
../../src/libs/zbxsysinfo/simple/libsimplesysinfo.a(simple.o): In function `check_ldap':
/tmp/buildd/zabbix-1.8.9/debian/tmp-build-PGSQL/src/libs/zbxsysinfo/simple/simple.c:57: undefined reference to `ldap_init'
/tmp/buildd/zabbix-1.8.9/debian/tmp-build-PGSQL/src/libs/zbxsysinfo/simple/simple.c:63: undefined reference to `ldap_search_s'
...

Snippet from build fail about zabbix_server and POSTGRESQL:
x86_64-linux-gnu-gcc -Wall -g -O2 -I/usr/include/postgresql -I/usr/local/include -I/usr/lib/perl/5.14/CORE -I. -I/usr/include -I/usr/include -I/usr/include -I/usr/include -rdynamic -Wl,-Bsymbolic-functions -Wl,-z,relro -o zabbix_server -L/usr/lib -lpq -liksemel -L/usr/lib/x86_64-linux-gnu -lcurl -L/usr/lib -lnetsnmp -L/usr/lib -lnetsnmp -L/usr/lib -L/usr/lib -L/usr/lib zabbix_server-actions.o zabbix_server-operations.o zabbix_server-events.o zabbix_server-zlog.o zabbix_server-server.o alerter/libzbxalerter.a dbsyncer/libzbxdbsyncer.a dbconfig/libzbxdbconfig.a discoverer/libzbxdiscoverer.a pinger/libzbxpinger.a poller/libzbxpoller.a housekeeper/libzbxhousekeeper.a timer/libzbxtimer.a trapper/libzbxtrapper.a nodewatcher/libzbxnodewatcher.a utils/libzbxutils.a httppoller/libzbxhttppoller.a watchdog/libzbxwatchdog.a escalator/libzbxescalator.a proxypoller/libzbxproxypoller.a selfmon/libzbxselfmon.a ../../src/libs/zbxsysinfo/libzbxserversysinfo.a ../../src/libs/zbxsysinfo/linux/libspecsysinfo.a ../../src/libs/zbxsysinfo/common/libcommonsysinfo.a ../../src/libs/zbxsysinfo/simple/libsimplesysinfo.a ../../src/libs/zbxlog/libzbxlog.a ../../src/libs/zbxdbcache/libzbxdbcache.a ../../src/libs/zbxmemory/libzbxmemory.a ../../src/libs/zbxalgo/libzbxalgo.a ../../src/libs/zbxnix/libzbxnix.a ../../src/libs/zbxsys/libzbxsys.a ../../src/libs/zbxconf/libzbxconf.a ../../src/libs/zbxmedia/libzbxmedia.a ../../src/libs/zbxcommon/libzbxcommon.a ../../src/libs/zbxcrypto/libzbxcrypto.a ../../src/libs/zbxcomms/libzbxcomms.a ../../src/libs/zbxcommshigh/libzbxcommshigh.a ../../src/libs/zbxjson/libzbxjson.a ../../src/libs/zbxexec/libzbxexec.a ../../src/libs/zbxself/libzbxself.a ../../src/libs/zbxserver/libzbxserver.a ../../src/libs/zbxicmpping/libzbxicmpping.a ../../src/libs/zbxdbhigh/libzbxdbhigh.a ../../src/libs/zbxdb/libzbxdb.a -liksemel -lcurl -lnetsnmp -lssh2 -lOpenIPMI -lOpenIPMIposix -lm -lresolv
../../src/libs/zbxdb/libzbxdb.a(db.o): In function `zbx_db_close':
/tmp/buildd/zabbix-1.8.9/debian/tmp-build-PGSQL/src/libs/zbxdb/db.c:443: undefined reference to `PQfinish'
../../src/libs/zbxdb/libzbxdb.a(db.o): In function `zbx_db_vexecute':
/tmp/buildd/zabbix-1.8.9/debian/tmp-build-PGSQL/src/libs/zbxdb/db.c:815: undefined reference to `PQexec'

The attached patch fixes both problems: it moves '-lldap -llber' into LDAP_LIBS (exporting it and using it in configure.in) and '-lpq' into POSTGRESQL_LIBS (which is already exported and defined in configure.in).
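A sketch of the patch's approach (LDAP_LIBS is named in the report; the exact configure.in context here is assumed, not taken from the patch itself): keeping the libraries in a dedicated substituted variable lets the Makefiles place them after the objects and archives that reference them, which is what ld --as-needed requires.

```
dnl configure.in (sketch): collect the LDAP linker flags into their own
dnl variable instead of prepending them to the global LDFLAGS
LDAP_LIBS="-lldap -llber"
AC_SUBST(LDAP_LIBS)

dnl with --as-needed, library order relative to the objects matters:
dnl   gcc -Wl,--as-needed -lldap foo.o    ->  undefined reference
dnl   gcc -Wl,--as-needed foo.o -lldap    ->  links correctly
```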

Could you kindly consider applying this patch?

Thanks,

Leo.



 Comments   
Comment by Alexander Vladishev [ 2012 Jan 28 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-4535

Comment by Oleksii Zagorskyi [ 2012 Jan 29 ]

Just for record: as I see this dev branch already includes fix from ZBX-3475. Great!
So ZBX-3475 can be closed after this one.

Comment by dimir [ 2012 Jan 30 ]

Great fix! Even more order in our autoconfiguration process! Just tiny formatting change in r25079.

<Sasha> CLOSED

Comment by Alexander Vladishev [ 2012 Jan 30 ]

Fixed in versions pre-1.8.11 r25085 and pre-1.9.9 r25087.





[ZBX-4526] trigger checks only first 256 chars of item value Created: 2012 Jan 09  Updated: 2017 May 30  Resolved: 2012 Jan 12

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Documentation (D), Server (S)
Affects Version/s: 1.8.10
Fix Version/s: 1.8.11, 1.9.9 (beta)

Type: Incident report Priority: Critical
Reporter: Pavel Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: triggers
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Tested on Linux.


Attachments: JPEG File zbx-4526-item-screenshot.jpg    

 Description   

If I use the trigger

{hostname:web.page.get[site].str(substr)}

then the substring is found only if it occurs within the first 256 (not sure) bytes of the value, including the HTTP header.
I did not find this limitation in the documentation, and it works OK in Zabbix 1.8.8.
The item "hostname:web.page.regexp[site,,,substr]" works well on both versions, but sometimes it is useful to get the full page content in the alert (as the value).
As a workaround I use a trigger like this:

{hostname:web.page.get[site].str("substr")}<1 & {hostname:web.page.regexp[site,,,substr].count(#2,"substr","eq")}<1

The first part gives the value for the alerter, the second part sets the real condition.


 Comments   
Comment by richlv [ 2012 Jan 09 ]

might be another items.lastvalue manifestation (although i can't find the previous issues on this right now)

Comment by Alexander Vladishev [ 2012 Jan 11 ]

what database are you using?

Comment by Alexander Vladishev [ 2012 Jan 11 ]

Cannot reproduce with PostgreSQL.

Check the structure of the 'items' table. The size of the 'lastvalue' and 'prevvalue' fields should be 255.

Comment by Pavel [ 2012 Jan 11 ]

Mysql.
Yes, lastvalue and prevvalue columns of items table have type "varchar(255)".

Comment by Pavel [ 2012 Jan 12 ]

Is it possible that it depends on 8-bit (non-ASCII) values?
In my case, the triggers

{happy.kiev.ua:web.page.get[fidonet.org.ua].str("sysopka",#2)}<1

and

{happy.kiev.ua:web.page.get[fidonet.org.ua].str("sysopka")}<1

return PROBLEM, but

{happy.kiev.ua:web.page.get[fidonet.org.ua].str("DOCTYPE",#2)}<1

returns OK.
I can open my zabbix_agentd to your server if that helps.

Comment by Alexander Vladishev [ 2012 Jan 12 ]

Please attach screenshot of "web.page.get[fidonet.org.ua]" item configuration.

Comment by Pavel [ 2012 Jan 12 ]

web.page.get[fidonet.org.ua] item configuration

Comment by Alexander Vladishev [ 2012 Jan 12 ]

Confirmed: the problem happens if a CR (carriage return, 0x0D) character is present within the first 255 characters.

Comment by Alexander Vladishev [ 2012 Jan 12 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-4526

Comment by Alexander Vladishev [ 2012 Jan 12 ]

The priority has been changed to "Critical".

Comment by dimir [ 2012 Jan 13 ]

Successfully tested. Please fix the ChangeLog message, it contains a typo.

Comment by Alexander Vladishev [ 2012 Jan 13 ]

Fixed in version pre-1.8.11, revision 24752.





[ZBX-4507] Action not removing host from group "Discovered Hosts" Created: 2012 Jan 03  Updated: 2017 May 30  Resolved: 2012 Jan 25

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.8.8, 1.8.9, 1.8.10
Fix Version/s: 1.8.11, 1.9.9 (beta)

Type: Incident report Priority: Major
Reporter: Attilla de Groot Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: actions, discovery
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux, Debian


Attachments: PNG File Screen Shot 2012-01-03 at 3.50.13 PM.png     PNG File Screen Shot 2012-01-03 at 3.50.46 PM.png     PNG File Screen Shot 2012-01-03 at 4.04.38 PM.png    

 Description   

Hi,

I've created a discovery rule that discovers switches on our platform and attaches templates to the hosts based on the snmp result. I'm also adding a host to the appropriate host group based on the result of one of the snmp discovery rules.

This all works fine, except for the removal from the Discovered Hosts group. I'd like to keep this host group clear of already-discovered hosts, but that seems to be impossible or simply not working. As you can see in the attached screenshot, I created a separate action to remove a host from the Discovered Hosts group, but that doesn't seem to work. I also tried just adding the removal operation to one of the three separate actions, but that doesn't work either.

There are no logs to debug, but I'm guessing that the delete operation is processed before the "add to host group" operation. That would leave the host without any host group, which is probably not possible. I'm guessing this because the delete operation is listed first when I add the operation to one of the separate actions.

It would be very helpful if this is solved.



 Comments   
Comment by Oleksii Zagorskyi [ 2012 Jan 04 ]

Discovery can work very slowly for a large IP address range with a large number of checks and many unavailable hosts.
Maybe you need to wait a day or two?

Comment by Attilla de Groot [ 2012 Jan 04 ]

I'm aware of that, but I have had this configuration for about 6 months now, so I think that is more than enough waiting.

If you have a better way of configuring the discovery, please tell me.

Comment by Alexander Vladishev [ 2012 Jan 11 ]

Confirmed: we always add a discovered host to the "Discovered hosts" group.

Comment by Attilla de Groot [ 2012 Jan 16 ]

That is a good thing. But I'd like to remove it again.

Because it also shows up as a host group with errors on the dashboard for example.

Comment by Alexander Vladishev [ 2012 Jan 18 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-4507

Comment by dimir [ 2012 Jan 20 ]

Interestingly, I finally managed to reproduce the problem only when I split the "add host" and "remove from discovered hosts" operations into 2 different actions. Within one action I wasn't able to reproduce it. But it's true that the server was adding the host back to the group every time the "add host" operation was executed.

Comment by dimir [ 2012 Jan 23 ]

Tested. Please review my changes in r24932, r24933.

<Sasha> CLOSED

Comment by Alexander Vladishev [ 2012 Jan 24 ]

Fixed in version pre-1.8.11, revision 24989.

Comment by Oleksii Zagorskyi [ 2012 Jan 25 ]

Just a note: Fixed in version pre-1.9.9, revision 24993 too.





[ZBX-4479] it is possible to add a dependency from a template to a host Created: 2011 Dec 22  Updated: 2017 May 30  Resolved: 2012 Jan 05

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Frontend (F), Server (S)
Affects Version/s: 1.9.8 (beta)
Fix Version/s: 1.9.9 (beta)

Type: Incident report Priority: Blocker
Reporter: richlv Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: dependencies, triggers
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by ZBXNEXT-835 Allow Trigger Dependency from a Templ... Closed

 Description   

in current trunk it is possible to add a dependency from a template trigger to a host trigger. it is an accidental change, and it is not known whether the server will operate properly this way. it should be decided whether this is allowed - if it is, it would solve ZBXNEXT-835 (and should be documented)

if not, this should be prevented.



 Comments   
Comment by richlv [ 2012 Jan 04 ]

(1) rev 24503 did:
+ for ($i = 1; $i <=2; $i++) {

i suspect this might not match the coding style
<Slava> RESOLVED r24511

- for ($i = 1; $i <=2; $i++) {
+ for ($i = 1; $i <= 2; $i++) {

<richlv> CLOSED

Comment by Pavels Jelisejevs (Inactive) [ 2012 Jan 04 ]

Server side TESTED.

Comment by richlv [ 2012 Jan 05 ]

for the record, this is supposed to allow template triggers to depend on host triggers and should also fulfill ZBXNEXT-835

Comment by richlv [ 2012 Jan 05 ]

(2) this should be documented in the trigger dependency docs

http://www.zabbix.com/documentation/2.0/manual/config/triggers/dependencies

<Sasha> Rich, thanks! CLOSED

Comment by dimir [ 2012 Jan 05 ]

Tested, somebody should review my changes in r24536 and r24547 and then it can be merged into upstream.

<Sasha> CLOSED

Comment by Alexander Vladishev [ 2012 Jan 05 ]

Available in version pre-1.9.9, rev. 24570.

Comment by Oleksii Zagorskyi [ 2012 Jan 05 ]

Reopened to remove 2.0 from "Fix version"

Comment by Oleksii Zagorskyi [ 2012 Jan 05 ]

Closed again.





[ZBX-4424] Handling of the log of long Japanese Created: 2011 Dec 08  Updated: 2017 May 30  Resolved: 2011 Dec 11

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G), Server (S)
Affects Version/s: 1.8.8, 1.8.9
Fix Version/s: 1.8.10, 1.9.9 (beta), 2.0.0

Type: Incident report Priority: Blocker
Reporter: suzuka Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: items, localization
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

RHEL CentOS 5.6, 6.0 Postgresql-8.4.7



 Description   

I am monitoring Windows EventLog.

When the Zabbix server receives a log entry of 255 or more Japanese characters (512 bytes or more?), an error is output and the data is not registered.

This problem occurs in Zabbix 1.8.8 and 1.8.9; it does not occur in 1.8.7 and 1.8.6.

I guess that there is a problem in the handling of Japanese characters after the 255th character in 1.8.8 or later.

The following are logs from my tests.

1) Zabbix v1.8.8
Japanese char 254 -> correct

[pg_log]

2011-12-08 11:06:03 JST: 4352: LOG: duration: 0.480 ms statement: insert into alerts (alertid,actionid,eventid,userid,clock,mediatypeid,sendto,subject,message,status,alerttype,esc_step) values (4703,4,40420,3,1323309963,4,'sasaki@localhost','[????] Error is output to ApplicationLog on Windows2008: PROBLEM','?????????????

Error is output to ApplicationLog on Windows2008: PROBLEM
Last value: ????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????

',0,0,0)

[zabbix_server.log]

none

2) Zabbix v1.8.8
Japanese char 255 -> Not correct

[pg_log]

2011-12-08 11:10:32 JST: 4354: ERROR: invalid byte sequence for encoding "UTF8": 0xe32720
2011-12-08 11:10:32 JST: 4354: HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
2011-12-08 11:10:32 JST: 4354: ERROR: current transaction is aborted, commands ignored until end of transaction block
2011-12-08 11:10:32 JST: 4354: STATEMENT: insert into history_log (id,itemid,clock,timestamp,source,severity,value,logeventid) values (8,25109,1323310227,1323310148,'EventCreate',4,'???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????',1000);

2011-12-08 11:10:32 JST: 4354: ERROR: current transaction is aborted, commands ignored until end of transaction block

[zabbix_server.log]

3688:20111208:111032.678 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: invalid byte sequence for encoding "UTF8": 0xe32720
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encodi
ng".
[update items set lastclock=1323310227,lastlogsize=15486,mtime=0,prevvalue=lastvalue,lastvalue='??????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????<E3>' where itemid=25109;
]
3688:20111208:111032.679 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: current transaction is aborted, commands ignored until end of tra
nsaction block
[insert into history_log (id,itemid,clock,timestamp,source,severity,value,logeventid) values (8,25109,1323310227,1323310148,'EventCreate',4,'?
??????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????',1000);
]
3688:20111208:111032.679 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: current transaction is aborted, commands ignored until end of tra
nsaction block
[select distinct t.triggerid,t.type,t.value,t.error,t.expression,f.itemid from triggers t,functions f,items i where t.triggerid=f.triggerid and
f.itemid=i.itemid and t.status=0 and f.itemid in (25109) order by t.triggerid]

3) Zabbix v1.8.7
Japanese char 254 -> correct

[pg_log]

2011-12-08 11:34:34 JST: 5504: LOG: duration: 0.747 ms statement: insert into alerts (alertid,actionid,eventid,userid,clock,mediatypeid,sendto,subject,message,status,alerttype,esc_step) values (4705,4,40681,3,1323311674,4,'sasaki@localhost','[????] Error is output to ApplicationLog on Windows2008: PROBLEM','?????????????

Error is output to ApplicationLog on Windows2008: PROBLEM
Last value: ??????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????

',0,0,0)

[zabbix_server.log]

none

4) Zabbix v1.8.7
Japanese char 255 -> correct

[pg_log]

2011-12-08 11:38:05 JST: 5504: LOG: duration: 0.764 ms statement: insert into alerts (alertid,actionid,eventid,userid,clock,mediatypeid,sendto,subject,message,status,alerttype,esc_step) values (4706,4,40682,3,1323311885,4,'sasaki@localhost','[????] Error is output to ApplicationLog on Windows2008: PROBLEM','?????????????

Error is output to ApplicationLog on Windows2008: PROBLEM
Last value: ???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????

',0,0,0)

[zabbix_server.log]

none
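The pattern above (254 characters fine, 255 characters broken) is consistent with a fixed-length truncation splitting a multi-byte UTF-8 character. A minimal sketch of a boundary-safe truncation, assuming that is the cause (this helper is illustrative, not the actual Zabbix fix):

```c
#include <string.h>

/* Illustrative sketch only (not Zabbix source): cutting a UTF-8 string at a
 * fixed byte offset, e.g. to fit a varchar(255) column, can split a
 * multi-byte Japanese character in half and produce exactly the kind of
 * "invalid byte sequence for encoding UTF8" error shown in the logs above.
 * This hypothetical helper backs up to the last complete character. */
static size_t	utf8_safe_truncate(char *s, size_t maxbytes)
{
	size_t	len = strlen(s);

	if (len <= maxbytes)
		return len;

	/* UTF-8 continuation bytes have the bit pattern 10xxxxxx */
	while (0 < maxbytes && 0x80 == ((unsigned char)s[maxbytes] & 0xc0))
		maxbytes--;

	s[maxbytes] = '\0';

	return maxbytes;
}
```

Truncating at a character boundary instead of a raw byte count keeps the stored value valid UTF-8 regardless of where the limit falls.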



 Comments   
Comment by Alexander Vladishev [ 2011 Dec 11 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-4424

Comment by dimir [ 2011 Dec 12 ]

Please review my changes in r23928.

<Sasha> Great! CLOSED

Comment by Alexander Vladishev [ 2011 Dec 13 ]

Fixed in version pre-1.8.10, revision 23950.





[ZBX-4418] zabbix_server [98798]: ERROR [file:db.c,line:1464] Something impossible has just happened. Created: 2011 Dec 05  Updated: 2017 May 30  Resolved: 2011 Dec 09

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.9.8 (beta)
Fix Version/s: 1.9.9 (beta), 2.0.0

Type: Incident report Priority: Blocker
Reporter: Danilo Chilene Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: events, sql
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

FreeBSD srv-zabbix01.trendeiras 8.2-RELEASE-p4 FreeBSD 8.2-RELEASE-p4 #5: Mon Dec 5 07:57:08 BRST 2011 [email protected]:/usr/obj/usr/src/sys/TREND amd64

FreeBSD inside vmware, typical instalation.

(root@srv-zabbix01 - ~ @14:29:40)
1: pkg_info |grep mysql
mysql-client-5.5.17 Multithreaded SQL database (client)
mysql-server-5.5.17 Multithreaded SQL database (server)
php5-mysql-5.3.8 The mysql shared extension for php

./configure --prefix=/usr/local --sysconfdir=/usr/local/etc/zabbix --enable-server --enable-agent --with-mysql --with-libcurl --with-net-snmp --with-ssh2 --with-ldap --with-openipmi



 Description   

Strange behavior of log /tmp/zabbix_server.log
I always get strange network errors when running Zabbix on FreeBSD.

Below is the error:

98798:20111205:143001.507 [Z3005] query failed: [1690] BIGINT UNSIGNED value is out of range in '(`zabbix`.`ids`.`nextid` + -(5256))' [update ids set nextid=nextid+-5256 where nodeid=0 and table_name='events' and field_name='eventid']
zabbix_server [98798]: ERROR file:db.c,line:1464 Something impossible has just happened.
98798:20111205:143031.469 [Z3005] query failed: [1690] BIGINT UNSIGNED value is out of range in '(`zabbix`.`ids`.`nextid` + -(5256))' [update ids set nextid=nextid+-5256 where nodeid=0 and table_name='events' and field_name='eventid']
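The failing UPDATE tries to store a negative value in an unsigned column: when nextid is smaller than 5256, nextid + -5256 would be below zero, which MySQL rejects for BIGINT UNSIGNED with error 1690. A small C sketch of the same arithmetic (illustrative only, not Zabbix source):

```c
#include <stdint.h>

/* Sketch of the arithmetic behind the error above: an unsigned 64-bit
 * counter cannot hold a negative result. MySQL raises error 1690 in this
 * case; in C the same operation silently wraps around modulo 2^64. */
static uint64_t	apply_delta(uint64_t nextid, int64_t delta)
{
	return nextid + (uint64_t)delta;	/* wraps when the true result is negative */
}
```

The fix therefore has to avoid ever asking the database to decrement the counter below zero.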



 Comments   
Comment by Alexander Vladishev [ 2011 Dec 09 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-4418

Comment by dimir [ 2011 Dec 12 ]

Tested.

Comment by Alexander Vladishev [ 2011 Dec 12 ]

Fixed in version pre-1.9.9, revision 23920.





[ZBX-4404] MaxHousekeeperDelete max-value is wrong either in the default zabbix_server.conf or zabbix_server/server.c Created: 2011 Nov 29  Updated: 2017 May 30  Resolved: 2012 Jan 05

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.8.8
Fix Version/s: 1.8.11, 1.9.9 (beta)

Type: Incident report Priority: Minor
Reporter: René Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: trivial
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Centos


Issue Links:
Duplicate
is duplicated by ZBX-4405 MaxHousekeeperDelete max-value is wro... Closed

 Description   

In zabbix_server.conf the range example for MaxHousekeeperDelete does not match the maximum value set in zabbix_server/server.c.

zabbix_server.conf:

# Range: 0-1048576

vs

zabbix_server/server.c:

{"MaxHousekeeperDelete", &CONFIG_MAX_HOUSEKEEPER_DELETE, TYPE_INT, PARM_OPT, 0, 1000000},



 Comments   
Comment by Alexei Vladishev [ 2011 Nov 29 ]

Thanks for reporting this. It should be fixed.

Comment by Alexander Vladishev [ 2012 Jan 05 ]

Fixed the default zabbix_server.conf in version pre-1.8.11, r24572.





[ZBX-4376] Quoted spec symbols in lld via proxy Created: 2011 Nov 22  Updated: 2017 May 30  Resolved: 2012 Feb 21

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 1.9.8 (beta)
Fix Version/s: 1.8.10, 1.9.9 (beta)

Type: Incident report Priority: Blocker
Reporter: Alexey Pustovalov Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: trivial
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File item_prototypes.png     PNG File items.png    

 Description   

24917:20111122:233650.539 In DCmass_proxy_add_history()
24917:20111122:233650.539 query [txnlev:1]
....
"snmp.discovery":[
....
{"{#SNMPINDEX}":10110,"{#SNMPVALUE}":"GigabitEthernet0\/10"},
{"{#SNMPINDEX}":10111,"{#SNMPVALUE}":"GigabitEthernet0\/11"},
....

and we end up with an incorrect item key on the Zabbix server, e.g.:
????????:?????? ?????? GigabitEthernet0\/16 ifInUcastPkts["GigabitEthernet0\/16"]



 Comments   
Comment by Alexander Vladishev [ 2011 Nov 30 ]

This is normal. All low-level discovery data is stored and processed in JSON format, and slashes (/) in JSON strings should be escaped (www.json.org).

Please attach a screenshot, if you have a problem with LLD.

Comment by Alexey Pustovalov [ 2011 Dec 01 ]

screenshots

Comment by Alexander Vladishev [ 2011 Dec 01 ]

What database you are using on server and proxy side?

Comment by Alexander Vladishev [ 2011 Dec 01 ]

Confirmed when using MySQL on the proxy side.

Comment by Alexander Vladishev [ 2011 Dec 01 ]

I could reproduce it only once. After adding extra debug information to the code and recompiling the proxy, the problem hasn't reappeared.

Try recompiling the proxy as well, using these commands:

./bootstrap.sh
./configure --enable-proxy --with<db> ...
make clean
make dbschema
make install

Comment by Alexey Pustovalov [ 2011 Dec 01 ]

The problem remains with the latest trunk proxy on PostgreSQL.

Comment by Alexey Pustovalov [ 2011 Dec 01 ]

On Debian with MySQL the problem hasn't reappeared;
the problem occurs on Gentoo with PostgreSQL only.

Comment by Alexander Vladishev [ 2011 Dec 01 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-4376

Comment by Alexey Pustovalov [ 2011 Dec 01 ]

working. Thanks

Comment by dimir [ 2011 Dec 01 ]

Do we really want to "set standard_conforming_strings to off;" ?

Comment by Alexander Vladishev [ 2011 Dec 01 ]

You are right! It is a bad idea. In PostgreSQL 8.1 this variable is read-only. I'm reopening this issue to finish it.

Comment by Alexander Vladishev [ 2011 Dec 01 ]

It has been fixed in another way. Please retest r23707.

Comment by dimir [ 2011 Dec 02 ]

I like it!

Comment by Alexander Vladishev [ 2011 Dec 02 ]

Fixed in version pre-1.9.9, revision 23728. Backported to pre-1.8.10, revision 23729.

Comment by richlv [ 2014 Jan 30 ]

note that json standard does not require escaping of slashes (only doublequotes and backslashes). from http://www.ietf.org/rfc/rfc4627.txt :

"All Unicode characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F)."

ZBX-5116 discusses a similar topic in the frontend/API





[ZBX-4298] No messages in the LEVEL_WARNING log about deleted values by housekeeper (table "housekeeper") Created: 2011 Oct 31  Updated: 2017 May 30  Resolved: 2011 Dec 19

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.8.8
Fix Version/s: 1.8.10, 1.8.11, 1.9.9 (beta)

Type: Incident report Priority: Minor
Reporter: Oleksii Zagorskyi Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: housekeeper
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

DebugLevel=3



 Description   

When zabbix_server works with the DebugLevel=3 it reports to the log this message:
"Deleted <NNN> records from history and trends"

But this message does not include values deleted by the housekeeper for items already removed from the configuration (table "housekeeper").
1. This lack of information can be confusing when troubleshooting DB performance.
2. It's inconsistent that the housekeeper reports the count of deleted outdated values but does not report the count of values for items already removed from the configuration.

Imagine a situation where a user has performed an "Unlink and clear" action on some big template linked to several hosts, and within the next hour he notices a DB performance problem and wonders what happened.

Here is part of a DebugLevel=4 log (I added several line breaks for readability):

15500:20111031:150250.029 End of housekeeping_history_and_trends():0

15500:20111031:150250.029 Deleted 0 records from history and trends

15500:20111031:150250.029 In housekeeping_process_log()
15500:20111031:150250.029 query [txnlev:0] [select housekeeperid,tablename,field,value from housekeeper order by tablename]
15500:20111031:150250.029 query without transaction detected

15500:20111031:150250.029 query [txnlev:0] [delete from history where itemid=22578 limit 100]
15500:20111031:150250.044 deleted 100 records from table 'history'

15500:20111031:150250.044 query without transaction detected

15500:20111031:150250.044 query [txnlev:0] [delete from history_uint where itemid=22577 limit 100]
15500:20111031:150250.060 deleted 100 records from table 'history_uint'

15500:20111031:150250.060 End of housekeeping_process_log():SUCCEED
15500:20111031:150250.060 In housekeeping_events() now:1320062561
15500:20111031:150250.060 query [txnlev:0] [select event_history from config]
15500:20111031:150250.060 query [txnlev:0] [select eventid from events where clock<1288526561]
15500:20111031:150250.060 End of housekeeping_events():SUCCEED

15500:20111031:150250.060 In housekeeping_alerts() now:1320062561
15500:20111031:150250.060 query [txnlev:0] [select alert_history from config]
15500:20111031:150250.060 query without transaction detected
15500:20111031:150250.060 query [txnlev:0] [delete from alerts where clock<1288526561]
15500:20111031:150250.060 deleted 0 records from table 'alerts'
15500:20111031:150250.060 End of housekeeping_alerts():SUCCEED

15500:20111031:150250.060 In housekeeping_sessions() now:1320062561
15500:20111031:150250.060 query without transaction detected
15500:20111031:150250.060 query [txnlev:0] [delete from sessions where lastaccess<1288526561]
15500:20111031:150250.061 deleted 0 records from table 'sessions'
15500:20111031:150250.061 End of housekeeping_sessions():SUCCEED
15500:20111031:150250.061 sleeping for 3600 seconds

(in this example "MaxHousekeeperDelete=100")

As you can see, in the function "housekeeping_process_log()", where values are deleted from the history and history_uint tables (and not only those), no messages are logged at LOG_LEVEL_WARNING.

I ask that such messages be added to the log. Maybe individually per table, maybe summarized for history+trends; I don't know which is better.

If no such values were deleted, there is no need to report any message.
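The batched deletes shown in the log above (bounded by MaxHousekeeperDelete) and the single summary count discussed later in this issue can be sketched in Python with SQLite; the function name, loop structure, and table layout are assumptions for illustration, not Zabbix's actual C implementation:

```python
import sqlite3

def housekeeping_cleanup(conn, itemids, max_delete):
    """One housekeeper pass: delete at most max_delete rows per item per
    table, accumulating a grand total for a single summary log line."""
    deleted = 0
    for table in ("history", "history_uint"):
        for itemid in itemids:
            # SQLite lacks DELETE ... LIMIT by default, so bound the
            # batch with a rowid subselect instead
            cur = conn.execute(
                f"DELETE FROM {table} WHERE rowid IN "
                f"(SELECT rowid FROM {table} WHERE itemid = ? LIMIT ?)",
                (itemid, max_delete))
            deleted += cur.rowcount
    conn.commit()
    return deleted
```

A caller would then always emit one summary line with this total, even when it is zero, so there is no doubt whether a deletion attempt happened.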



 Comments   
Comment by Oleksii Zagorskyi [ 2011 Oct 31 ]

(1) By the way, the name of the function "housekeeping_process_log()" does not seem optimal.

I suggest changing it to "housekeeping_process_deleted_items()" or similar.

<dimir> I chose housekeeping_cleanup(), RESOLVED in r23553

<zalex> Excellent ! CLOSED

Comment by dimir [ 2011 Nov 25 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-4298

Comment by dimir [ 2011 Nov 25 ]

I've chosen "housekeeping_cleanup()", if nobody minds. So the report will look like this:

11921:20111125:180134.824 housekeeper deleted 0 records from history and trends, 28561 records of deleted items, 0 events, 0 alerts and 0 sessions

This will be logged on every housekeeper step even if nothing was removed. Agreed with sasha that this way there will be no question whether an attempt to delete the data actually took place.

Comment by Oleksii Zagorskyi [ 2011 Nov 27 ]

Dev branch tested. Works as expected. A single line for the report is very, very good.

(2) I would suggest writing that line as:
housekeeper deleted: 0 records from history and trends, 28561 records of deleted items, 0 events, 0 alerts, 0 sessions
It seems to me this would be more readable.

<dimir> RESOLVED in r23560

<zalex> Many thanks. CLOSED.

Comment by dimir [ 2011 Nov 30 ]

Fixed in pre-1.8.10 r23632, pre-1.9.9 r23633.

Comment by Alexander Vladishev [ 2011 Nov 30 ]

(1) Broken compilation of the latest trunk.

housekeeper.c: In function ‘housekeeping_cleanup’:
housekeeper.c:135: error: ‘ids_alloc’ undeclared (first use in this function)
housekeeper.c:135: error: (Each undeclared identifier is reported only once
housekeeper.c:135: error: for each function it appears in.)
housekeeper.c:135: error: ‘ids_num’ undeclared (first use in this function)
housekeeper.c: In function ‘housekeeping_alerts’:
housekeeper.c:186: warning: ‘return’ with a value, in function returning void
housekeeper.c: In function ‘housekeeping_events’:
housekeeper.c:213: warning: ‘return’ with a value, in function returning void
housekeeper.c: In function ‘main_housekeeper_loop’:
housekeeper.c:328: error: void value not ignored as it ought to be
housekeeper.c:331: error: void value not ignored as it ought to be

<dimir> sorry for the broken trunk, RESOLVED in r23663 directly in trunk

<zalex> tested. trunk r23663 compiled ok and it works.

<sasha> CLOSED with small change in r23688.

Comment by richlv [ 2011 Nov 30 ]

(2) also :
what's the difference between "history and trends" and "records of deleted items" ? i assume the latter is not the amount of items, but the amount of history and trends values for those items ?

<dimir> "Old history and trends" is the outdated information (as configured in the item's "keep history"/"keep trends" settings); "records of deleted items" is all the data (basically what's in the "housekeeper" table) related to removed items.

<zalex> maybe it would be better to replace the word "records" with "values"? It would be clearer.

<dimir> For me "deleted 2 values from history and trends" is no clearer than "deleted 2 records from history and trends". What I'd add is singular value support.

<dimir> if there are no objections, RESOLVED in r23681

<zalex> dev branch r23681 tested. it works (see 1 event):
"housekeeper deleted: 7056 records from history and trends, 0 records of deleted items, 1 event, 0 alerts, 0 sessions"
he-he, it's some "alternative" to gettext

<dimir> We decided to discard these changes as we don't have anything like it anywhere. CLOSED

Comment by dimir [ 2011 Dec 01 ]

Oleksiy, thank you for testing!

Comment by dimir [ 2011 Dec 01 ]

Fixed in trunk r23663.

Comment by richlv [ 2011 Dec 01 ]

(3) i suspect "d_clenup" is a typo

<dimir> Right, RESOLVED in pre-1.8.10 r23732, pre-1.9.9 r23733.
<sasha> CLOSED

Comment by Alexander Vladishev [ 2011 Dec 02 ]

Closing resolved issue

Comment by dimir [ 2011 Dec 02 ]

Reopening to assign to myself.

Comment by dimir [ 2011 Dec 02 ]

Closed.

Comment by richlv [ 2011 Dec 19 ]

48 deleted events reported as 1 in 1.8.10rc1

strace output :

event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=11", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=18", 40) = 40
event_strace.13938:write(6, "#\0\0\0\3delete from events where eventid=5", 39) = 39
event_strace.13938:write(6, "#\0\0\0\3delete from events where eventid=9", 39) = 39
event_strace.13938:write(6, "#\0\0\0\3delete from events where eventid=1", 39) = 39
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=47", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=38", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=44", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=43", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=36", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=42", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=34", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=35", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=31", 40) = 40
event_strace.13938:write(6, "#\0\0\0\3delete from events where eventid=8", 39) = 39
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=30", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=40", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=37", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=13", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=39", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=33", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=46", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=29", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=48", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=41", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=14", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=32", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=10", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=27", 40) = 40
event_strace.13938:write(6, "#\0\0\0\3delete from events where eventid=6", 39) = 39
event_strace.13938:write(6, "#\0\0\0\3delete from events where eventid=7", 39) = 39
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=12", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=26", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=45", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=24", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=25", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=17", 40) = 40
event_strace.13938:write(6, "#\0\0\0\3delete from events where eventid=2", 39) = 39
event_strace.13938:write(6, "#\0\0\0\3delete from events where eventid=3", 39) = 39
event_strace.13938:write(6, "#\0\0\0\3delete from events where eventid=4", 39) = 39
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=15", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=16", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=20", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=21", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=22", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=23", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=19", 40) = 40
event_strace.13938:write(6, "$\0\0\0\3delete from events where eventid=28", 40) = 40
event_strace.13938:write(7, " 13938:20111217:175347.261 housekeeper deleted: 0 records from history and trends, 0 records of deleted items, 1 events, 0 alerts, 0 sessions\n", 142) = 142
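The strace above shows 48 single-row DELETE statements but a summary of "1 events". One plausible cause of such a miscount (purely a hypothetical illustration, not necessarily the actual fix made in the dev branch) is a per-row counter being assigned rather than accumulated:

```python
def delete_events_buggy(eventids, delete_one):
    """Counter is overwritten each iteration: the report ends up showing
    only the row count of the very last DELETE."""
    deleted = 0
    for eventid in eventids:
        deleted = delete_one(eventid)   # overwrites previous total
    return deleted

def delete_events_fixed(eventids, delete_one):
    """Counter accumulates across all per-row DELETEs."""
    deleted = 0
    for eventid in eventids:
        deleted += delete_one(eventid)  # adds each row's count
    return deleted
```

With 48 eventids each deleting one row, the buggy variant reports 1 while the fixed variant reports 48, matching the symptom above.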

Comment by Alexander Vladishev [ 2011 Dec 19 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-4298

Comment by dimir [ 2011 Dec 19 ]

Tested successfully.

Comment by Alexander Vladishev [ 2011 Dec 28 ]

Available in version pre-1.8.11, r24309.





[ZBX-4277] First and another errors about snmp Created: 2011 Oct 26  Updated: 2018 Feb 09  Resolved: 2012 Jul 28

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 1.9.7 (beta)
Fix Version/s: 1.9.9 (beta)

Type: Incident report Priority: Critical
Reporter: Alexey Pustovalov Assignee: Unassigned
Resolution: Won't fix Votes: 0
Labels: snmp
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

lastest rev 22669



 Description   

I have errors in the Zabbix proxy and server logs:
dotneft # tail -f /var/log/zabbix/zabbix_proxy.log | grep c2r3ups01.test.ru
14513:20111026:152804.038 SNMP item [apc.BatteryTemperature] on host [c2r3ups01.test.ru] failed: another network error, wait for 15 seconds
14513:20111026:152824.153 temporarily disabling SNMP checks on host [c2r3ups01.test.ru]: host unavailable
14510:20111026:152924.166 enabling SNMP checks on host [c2r3ups01.test.ru]: host became available

but I don't see the corresponding queries in tcpdump:
zabbix-trunk # tcpdump -np host 10.100.52.8
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
15:25:12.276711 IP 10.100.11.45.44628 > 10.100.52.8.161: C=public GetRequest(36) .1.3.6.1.4.1.318.1.1.10.2.3.2.1.4[|snmp]
15:25:12.325343 IP 10.100.52.8.161 > 10.100.11.45.44628: C=public GetResponse(41) .1.3.6.1.4.1.318[|snmp]
15:25:49.613001 arp who-has 10.100.52.1 tell 10.100.52.8
15:27:03.381601 IP 10.100.11.45.46681 > 10.100.52.8.161: C=public GetRequest(34) .1.3.6.1.4.1.318.1.1.1.4.2.4.0[|snmp]
15:27:03.416751 IP 10.100.52.8.161 > 10.100.11.45.46681: C=public GetResponse(39) .1.3.6.1.4.1.318[|snmp]
15:27:04.407535 IP 10.100.11.45.59624 > 10.100.52.8.161: C=public GetRequest(34) .1.3.6.1.4.1.318.1.1.1.7.2.6.0[|snmp]
15:27:04.445123 IP 10.100.52.8.161 > 10.100.11.45.59624: C=public GetResponse(39) .1.3.6.1.4.1.318[|snmp]
15:27:12.534674 IP 10.100.11.45.33600 > 10.100.52.8.161: C=public GetRequest(34) .1.3.6.1.4.1.318.1.1.1.2.2.8.0[|snmp]
15:27:12.572059 IP 10.100.52.8.161 > 10.100.11.45.33600: C=public GetResponse(39) .1.3.6.1.4.1.318[|snmp]
15:27:13.549699 IP 10.100.11.45.36081 > 10.100.52.8.161: C=public GetRequest(34) .1.3.6.1.4.1.318.1.1.1.2.2.1.0[|snmp]
15:27:13.582344 IP 10.100.52.8.161 > 10.100.11.45.36081: C=public GetResponse(39) .1.3.6.1.4.1.318[|snmp]
15:27:14.576940 IP 10.100.11.45.49594 > 10.100.52.8.161: C=public GetRequest(34) .1.3.6.1.4.1.318.1.1.1.2.2.3.0[|snmp]
15:27:14.610833 IP 10.100.52.8.161 > 10.100.11.45.49594: C=public GetResponse(41) .1.3.6.1.4.1.318[|snmp]
15:27:16.609383 IP 10.100.11.45.44065 > 10.100.52.8.161: C=public GetRequest(34) .1.3.6.1.4.1.318.1.1.1.3.2.4.0[|snmp]
15:27:16.643131 IP 10.100.52.8.161 > 10.100.11.45.44065: C=public GetResponse(39) .1.3.6.1.4.1.318[|snmp]
15:27:17.645861 IP 10.100.11.45.41856 > 10.100.52.8.161: C=public GetRequest(34) .1.3.6.1.4.1.318.1.1.1.3.2.1.0[|snmp]
15:27:17.681825 IP 10.100.52.8.161 > 10.100.11.45.41856: C=public GetResponse(40) .1.3.6.1.4.1.318[|snmp]
15:27:20.701552 IP 10.100.11.45.50608 > 10.100.52.8.161: C=public GetRequest(34) .1.3.6.1.4.1.318.1.1.1.4.1.1.0[|snmp]
15:27:20.739801 IP 10.100.52.8.161 > 10.100.11.45.50608: C=public GetResponse(39) .1.3.6.1.4.1.318[|snmp]
15:27:28.021190 IP 10.100.11.45 > 10.100.52.8: ICMP echo request, id 14907, seq 0, length 76
15:27:28.024483 IP 10.100.52.8 > 10.100.11.45: ICMP echo reply, id 14907, seq 0, length 76
15:27:29.023347 IP 10.100.11.45 > 10.100.52.8: ICMP echo request, id 14907, seq 1, length 76
15:27:29.026740 IP 10.100.52.8 > 10.100.11.45: ICMP echo reply, id 14907, seq 1, length 76
15:27:30.023728 IP 10.100.11.45 > 10.100.52.8: ICMP echo request, id 14907, seq 2, length 76
15:27:30.034321 IP 10.100.52.8 > 10.100.11.45: ICMP echo reply, id 14907, seq 2, length 76
15:29:24.123489 IP 10.100.11.45.36498 > 10.100.52.8.161: C=public GetRequest(34) .1.3.6.1.4.1.318.1.1.1.7.2.3.0[|snmp]
15:29:24.165822 IP 10.100.52.8.161 > 10.100.11.45.36498: C=public GetResponse(39) .1.3.6.1.4.1.318[|snmp]

apc.BatteryTemperature has OID .1.3.6.1.4.1.318.1.1.1.2.2.2.0



 Comments   
Comment by richlv [ 2011 Oct 26 ]

please, attach a screenshot of item configuration. also, you could try snmpget for the oid exactly as it's specified in the item configuration

Comment by Alexey Pustovalov [ 2011 Oct 26 ]

time snmpget -v2c -c public c2r3ups01.88.ru .1.3.6.1.4.1.318.1.1.1.2.2.2.0
SNMPv2-SMI::enterprises.318.1.1.1.2.2.2.0 = Gauge32: 21

real 0m0.051s
user 0m0.028s
sys 0m0.000s
dotneft ~ $ time snmpget -v2c -c public c2r3ups01.88.ru .1.3.6.1.4.1.318.1.1.1.2.2.2.0
SNMPv2-SMI::enterprises.318.1.1.1.2.2.2.0 = Gauge32: 21

real 0m0.050s
user 0m0.024s
sys 0m0.000s
dotneft ~ $ time snmpget -v2c -c public c2r3ups01.88.ru .1.3.6.1.4.1.318.1.1.1.2.2.2.0
SNMPv2-SMI::enterprises.318.1.1.1.2.2.2.0 = Gauge32: 21

real 0m0.051s
user 0m0.020s
sys 0m0.000s
dotneft ~ $ time snmpget -v2c -c public c2r3ups01.88.ru .1.3.6.1.4.1.318.1.1.1.2.2.2.0
SNMPv2-SMI::enterprises.318.1.1.1.2.2.2.0 = Gauge32: 21

real 0m0.050s
user 0m0.020s
sys 0m0.004s

Comment by Alexey Pustovalov [ 2011 Oct 26 ]

item settings

Comment by Alexey Pustovalov [ 2011 Oct 27 ]

We checked the problem further and found the following relationships:
1. The SNMP troubles mostly occurred with hosts (checked by IP) that had missing or incorrect A or PTR DNS records.
2. After we fixed the A and PTR DNS records, the "first network error" and "another network error" messages disappeared from the logs.

But the question remains: how can the availability of DNS records affect hosts that are checked by IP?

Comment by Alexei Vladishev [ 2011 Nov 17 ]

Perhaps the SNMP device relies on the availability of DNS somehow, say for security checks, audits, or something else?

Comment by Alexey Pustovalov [ 2011 Nov 17 ]

Maybe. I think net-snmp has internal checks, audits, etc.
The error messages disappeared from the Zabbix log after the DNS records were put in order.

Comment by Alexey Pustovalov [ 2012 Jul 28 ]

I think we can close the issue.





[ZBX-4262] can't select item prototypes for graph y axis min/max Created: 2011 Oct 21  Updated: 2017 May 30  Resolved: 2011 Dec 05

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Frontend (F), Server (S)
Affects Version/s: None
Fix Version/s: 1.9.9 (beta), 2.0.0

Type: Incident report Priority: Blocker
Reporter: richlv Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: graphs, lld, trivial
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File r23752-mysqldump.txt     File r23752-objdump.txt.gz     File r23752-zabbix-server.log    

 Description   

create a graph prototype on an lld rule. try to select an item prototype for y axis min/max - that is not available. this makes creating graphs like "used diskspace projected on a graph going from 0 to total diskspace" impossible



 Comments   
Comment by Pavels Jelisejevs (Inactive) [ 2011 Nov 21 ]

GUI RESOLVED.

Comment by dimir [ 2011 Dec 02 ]

(2) Crash in unreachable poller when processing discovery rule. Attached files:

r23752-mysqldump.txt
r23752-objdump.txt
r23752-zabbix-server.log

Steps to reproduce:
1) create template "lld" in a new group "lld"
2) create discovery rule "vfs.fs.discovery" in that template
3) create 2 item prototypes for the rule:
vfs.fs.size[{#FSNAME},free]
vfs.fs.size[{#FSNAME},total]
4) create a graph prototype for the rule:

  • use "free" prototype as an item for the graph
  • use "total" prototype as "Y axis MAX value"

5) link an enabled host to the "lld" template
6) restart zabbix server and agent
7) observe the zabbix_server.log file for a crash

Frequency: 2/2 (the second time from clean database).

<sasha> Thanks for objdump! RESOLVED

<dimir> Perfect! CLOSED

Comment by dimir [ 2011 Dec 05 ]

Tested successfully.

Comment by Alexander Vladishev [ 2011 Dec 05 ]

Available in version pre 1.9.9, revision 23781.





[ZBX-4024] SQL statements for database initialisation should be run in a transaction Created: 2011 Aug 08  Updated: 2017 May 30  Resolved: 2011 Nov 30

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 1.8.6
Fix Version/s: 1.9.9 (beta)

Type: Incident report Priority: Blocker
Reporter: Christoph Haas Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: database, installation
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian GNU/Linux "testing". Linux kernel 2.6.38. 64 bit (aka "amd64"). SQLite version 3.7.7. Laptop with Core i7 system with 4 GB of RAM and 128 GB SSD.


Issue Links:
Duplicate
is duplicated by ZBX-2881 Wrong Item in AIX Template Closed
is duplicated by ZBX-2883 Wrong Item in AIX Template Closed
is duplicated by ZBX-2884 Wrong Item in AIX Template - Applicat... Closed
is duplicated by ZBX-4126 Wrong items in AIX Template Closed
is duplicated by ZBX-4127 Wrong items in HP-UX template Closed

 Description   

The SQL statements used to initialise the database should be encapsulated in a transaction. Current situation:

In create/schema/sqlite.sql you use a transaction:

BEGIN TRANSACTION;
...create schema...
COMMIT;

In create/data/data.sql you don't use a transaction:
INSERT INTO...
INSERT INTO...
INSERT INTO...
... (>12k lines of INSERTs)

When using SQLite3 this leads to an fsync() call after each INSERT statement. On a decent server with a non-decent file system (ext3 does not handle fsync() correctly) and a decent write-cache this may work. But on my development system I use ext4 with an SSD and populating the SQLite3 database takes over an hour! With ext3 it takes half an hour. When using strace I see that the system is stuck time and again on fsync() calls. Besides this has surely worn down my SSD a lot.

I investigated a little and it appears like SQLite is doing the right thing using fsync after each write action. In fact they recommend to use transactions so that only one fsync is happening after the transaction is committed. I changed the create/data/data.sql as follows:

BEGIN;
INSERT INTO...
INSERT INTO...
INSERT INTO...
... (>12k lines of INSERTs)
COMMIT;

This initialises the database within just one second.

I researched MySQL and PostgreSQL, and both understand BEGIN and COMMIT (but not "BEGIN TRANSACTION"), so it should be safe to use. I cannot test DB2 and Oracle, though.

I flagged this issue as an "improvement", but in fact it's a serious problem that wastes a lot of time during installation and damages disks. Since you just have to add BEGIN and COMMIT, this should be simple to fix.
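The effect described above is easy to reproduce with Python's built-in sqlite3 module: in autocommit mode each INSERT is its own implicit transaction (and thus its own sync), while a single BEGIN/COMMIT around the whole batch commits once. A minimal sketch; the table name and row contents are made up:

```python
import os
import sqlite3
import tempfile

def populate(path, rows, use_transaction):
    # isolation_level=None puts the connection in autocommit mode, so each
    # INSERT outside an explicit transaction is committed (and synced) alone
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS data (id INTEGER, v TEXT)")
    if use_transaction:
        conn.execute("BEGIN")          # one transaction around the batch
    for i in range(rows):
        conn.execute("INSERT INTO data VALUES (?, ?)", (i, "x"))
    if use_transaction:
        conn.execute("COMMIT")         # single commit, single sync
    n = conn.execute("SELECT count(*) FROM data").fetchone()[0]
    conn.close()
    return n
```

On a file-backed database the transactioned variant is dramatically faster, because the journal is synced once per COMMIT instead of once per INSERT.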



 Comments   
Comment by dimir [ 2011 Aug 22 ]

The problem is basically Oracle. It doesn't support "BEGIN". So if we want a transactioned data import in SQLite, we would have to maintain separate data.sql files, and we don't want that. Basically, the only solution we see is to note somewhere in the instructions that SQLite users must wrap data.sql in "BEGIN/COMMIT" manually.

<dimir> Here is our suggestion (thanks for an idea to sasha):

Here is the debian/rules section to create SQLite db:

cat create/schema/sqlite.sql create/data/data.sql create/data/images_sqlite3.sql > $(TMP_SERVER_SQLITE3)/usr/share/dbconfig-common/data/$(PKG_SERVER_SQLITE3)/install/sqlite3

How about just modifying it to:

(cat create/schema/sqlite.sql; echo 'BEGIN;'; cat create/data/data.sql create/data/images_sqlite3.sql; echo 'COMMIT;') > $(TMP_SERVER_SQLITE3)/usr/share/dbconfig-common/data/$(PKG_SERVER_SQLITE3)/install/sqlite3

?

<richlv> this will work, but you will probably see higher load from shell (should be ok in this case, but with larger datasets it can be a huge problem)

Comment by Christoph Haas [ 2011 Aug 22 ]

Your call. That would mean I'd have to permanently ship a patch with Debian packages. Without that fix an installation time of up to an hour will probably not be considered acceptable for the SQLite3 packages.

Comment by Alexei Vladishev [ 2011 Oct 26 ]

Fixed in development branch for Zabbix 2.0. Ready to test.

Comment by Aleksandrs Saveljevs [ 2011 Oct 28 ]

(1) Directory "database" is missing proper svn:ignore settings.

<alexei> RESOLVED.

<asaveljevs> CLOSED

Comment by Aleksandrs Saveljevs [ 2011 Oct 28 ]

(2) File "do" in the root directory is hopelessly outdated. Shall we remove it?

<asaveljevs> Let's clean up "go" script, too.

<alexei> removed both files. RESOLVED.

<asaveljevs> CLOSED

Comment by Aleksandrs Saveljevs [ 2011 Oct 28 ]

(3) Script gen_php.php does not work anymore:

$ php gen_php.php
File does not exist: "/home/asaveljevs/zabbix-svn/branches/dev/ZBX-4024/create/bin/schema.tmpl"

<alexei> RESOLVED

<asaveljevs> CLOSED

Comment by Aleksandrs Saveljevs [ 2011 Oct 28 ]

(4) Please review r22787 and r22791. The commits are mostly about fixing typos and style inconsistencies.

<asaveljevs> In particular, it fixes a typo in ChangeLog: "db data file uses transations now". However, I did not understand what kind of translations are meant. Could you please explain that? Perhaps it is something worth documenting.

<richlv> i'd guess it was supposed to be "transactions"... unless this was a clever sarcasm that i fell for, it shows the importance of (not having) typos

<alexei> it shows importance of having spell checker built-in vi editor. RESOLVED.

<asaveljevs> CLOSED

Comment by Aleksandrs Saveljevs [ 2011 Oct 28 ]

(5) Script export_data.sh can only be run from create/src, but it would be nice to be able to run it from any directory. Property svn:ignore would then have to be set up to ignore data.tmpl.new, wherever that is generated.

<alexei> Now data.tmpl.new is ignored under create/bin. RESOLVED

<asaveljevs> It is still not possible to run export_data.sh from any directory.

<asaveljevs> Also, if data.tmpl is located in create/src, why not place data.tmpl.new in that directory, too?

<alexei> The file will write data to stdout like other utilities do. Also added help string, proper exit code. It is possible to run it from any directory. RESOLVED

<asaveljevs> Still not possible to run from any directory:

$ create/bin/export_data.sh issue_zbx_4024 > /dev/null
grep: ../src/schema.tmpl: No such file or directory

<alexei> I wrongly tested it running from src, bad test. RESOLVED

<asaveljevs> Still not possible. For instance, from a directory that contains spaces:

$ ZBX\ 4024/create/bin/export_data.sh issue_zbx_4024 | head -10
dirname: extra operand `4024/create/bin/export_data.sh'
Try `dirname --help' for more information.
basedir: ''
schema: '/../src/schema.tmpl'

<alexei> now it works with spaces and hopefully with other special characters as well. RESOLVED

<asaveljevs> No, it does not:

$ ZBX\ 4024/create/bin/export_data.sh issue_zbx_4024 | head -10
dirname: extra operand `4024/create/bin/export_data.sh'
Try `dirname --help' for more information.
...
grep: /../src/schema.tmpl: No such file or directory

You have not fixed all the places.

<alexei> it's a shame! RESOLVED

<asaveljevs> Hurray! CLOSED.

Comment by Aleksandrs Saveljevs [ 2011 Oct 28 ]

(6) Pipe-delimited approach for data may fail if data itself contains pipes. For instance, this can happen with triggers that use OR in expressions or item keys that use regular expressions. Maybe not now, but in future we will have to take that into account.

<richlv> appliance templates already use this (to detect various possible apache or syslog process names) - i'd say that's pretty critical

<alexei> RESOLVED

<asaveljevs> The following code is wrong:

local $line = $_[0];
+ $line =~ s/&pipe;/|/;
@array = split(/|/, $line);
$first = 1;

Replacing of "&pipe;" should be done after splitting $line by pipe characters.

<alexei> :-0 RESOLVED

<asaveljevs> Looks good. Please review additional changes in r22883.

<asaveljevs> Please also take a look at r22893 - it fixes start of transactions in SQLite3. Statements for transactions in IBM DB2 and Oracle will remain empty.

<alexei> I like changes in the Perl code, SQLite3 fix is also great. CLOSED.
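The ordering fix agreed above, splitting the line on the pipe delimiter first and only then unescaping "&pipe;" within each field, can be illustrated in Python (the sample line is hypothetical; the real code is Perl):

```python
def split_fields(line):
    """Correct order: split on the delimiter first, then unescape
    '&pipe;' inside each individual field."""
    return [f.replace("&pipe;", "|") for f in line.split("|")]

def split_fields_wrong(line):
    """Reviewed (wrong) order: unescaping first turns escaped pipes
    into real delimiters and corrupts the split."""
    return line.replace("&pipe;", "|").split("|")
```

For a field containing an escaped pipe, e.g. a trigger expression with OR, only the first variant keeps the field intact.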

Comment by Alexei Vladishev [ 2011 Nov 01 ]

Fixed issues found during code review and testing. Ready to retest.

Comment by Aleksandrs Saveljevs [ 2011 Nov 02 ]

(7) I have updated instructions for Oracle at http://www.zabbix.com/documentation/2.0/manual/installation/install?&#zabbix_database . Since I do not know the current state of ZBXNEXT-675, I used "old_images" in the example. Please review.

<alexei> I found no issues, Oracle instructions look good to me. CLOSED

Comment by Aleksandrs Saveljevs [ 2011 Nov 15 ]

(8) Please review the list of tables that should be included by export_data.sh in data.tmpl.

<alexei> RESOLVED

<asaveljevs> After these changes (actually, before these changes, too), empty tables might appear in data.tmpl. Do we want that?

<asaveljevs> I have exported a database with a slide show in it. When trying to import it into a new database, I get the following error:

$ mysql -uroot issue_zbx_4024 < database/mysql/data.sql
ERROR 1452 (23000) at line 11: Cannot add or update a child row: a foreign key constraint fails (`issue_zbx_4024`.`slides`, CONSTRAINT `c_slides_2` FOREIGN KEY (`screenid`) REFERENCES `screens` (`screenid`) ON DELETE CASCADE)

<asaveljevs> Please also review changes in r23222 and r23223.

<alexei> The changes are ok to me. I ordered all tables so they match referential integrity constraints. Note that no handling of loop constraints (table1->table2->table1) and self-references (table1->table1) is implemented currently. Now empty tables do not appear in both data.tmpl and schema.tmpl. RESOLVED.

<asaveljevs> It would be nice to group the tables in schema.tmpl more logically, so that semantically similar tables are closer together. For instance, there is no need to keep the "trigger_depends" table so far from the "triggers" table, and the "maintenances_*" tables are very far from the "maintenances" table.

<alexei> Fixed trigger_depends & triggers. I am afraid of reviewing and moving the other tables (look at maintenances*, there are so many constraints attached!); it may break something if not done very carefully. I would leave it as it is; it's really hard to come up with a single "placement principle" of which table comes first, etc.

<asaveljevs> We have a comment in schema.tmpl that says "-- History tables". Tables "dhosts" and "dservices" are not in that section, although they might be considered history tables. Same for "autoreg_host" and "proxy_autoreg_host".

<alexei> RESOLVED

<asaveljevs> Also, table "ids" references table "nodes", but comes before it. Same for tables that reference "images" table.

<alexei> RESOLVED

<asaveljevs> OK, CLOSED for all.

<asaveljevs> Please review my changes in r23329.

<alexei> I like it. CLOSED.

Comment by Alexei Vladishev [ 2011 Nov 29 ]

Resolved in 23584.

Comment by richlv [ 2011 Nov 30 ]

reopen to set more precise "fix for" version

Comment by richlv [ 2011 Nov 30 ]

see ZBX-4408 for some issues with export_data.sh script

Comment by Aleksandrs Saveljevs [ 2012 Dec 04 ]

SQLite FAQ has a high-level explanation for the reason it took so long without transactions: http://www.sqlite.org/faq.html#q19 .





[ZBX-3832] mysterios periods_cnt option Created: 2011 May 23  Updated: 2017 May 30  Resolved: 2011 Nov 30

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Documentation (D), Frontend (F), Installation (I), Server (S)
Affects Version/s: None
Fix Version/s: 1.9.9 (beta), 2.0.0

Type: Incident report Priority: Blocker
Reporter: richlv Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: codequality, graphs, trivial
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

for custom graph items, there's a periods_cnt db field/option. this option is not documented anywhere and nobody knows what it does.

it should either be nuked or properly documented/exposed.



 Comments   
Comment by Pavels Jelisejevs (Inactive) [ 2011 Nov 25 ]

It has been decided to remove aggregate graph items altogether.

Please review the update for the trunk GUI in /branches/dev/ZBX-3832. RESOLVED.

Comment by richlv [ 2011 Nov 25 ]

(1) doesn't server also need some updates for graphs it creates from network discovery, active agent auto-registration and low level discovery ?
edit: it must. server still refers to periods_cnt

<pavels> Ah, of course it does, sorry.
<sasha> RESOLVED

<dimir> CLOSED

Comment by richlv [ 2011 Nov 26 ]

(4) documentation must be updated :
http://www.zabbix.com/documentation/2.0/manual/appendix/api/graphitem
http://www.zabbix.com/documentation/2.0/manual/appendix/api/graph/create
http://www.zabbix.com/documentation/2.0/manual/appendix/api/graphitem/get
anywhere else ?

it should also be noted that this change will have to be reflected when transferring xml docs and possibly others

<pavels> RESOLVED.
<sasha> REOPENED http://www.zabbix.com/documentation/2.0/manual/config/visualisation/graphs/custom
<sasha> RESOLVED

<pavels> CLOSED.

Comment by Alexander Vladishev [ 2011 Nov 30 ]

Available in version pre1.9.9, r23639.





[ZBX-3475] Compilation error --with-sqlite3 on FreeBSD Created: 2011 Jan 28  Updated: 2017 May 30  Resolved: 2012 Jan 30

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.9.2 (alpha)
Fix Version/s: 1.8.11, 1.9.9 (beta)

Type: Incident report Priority: Minor
Reporter: Oleksii Zagorskyi Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: freebsd
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

FreeBSD8.1
pkg_info | grep sqlite
php5-pdo_sqlite-5.3.5 The pdo_sqlite shared extension for php
php5-sqlite-5.3.5 The sqlite shared extension for php
py26-pysqlite-2.3.5 A DB-API v2 Python library for the SQLite 3 embedded SQL en
sqlite3-3.7.4 An SQL database engine in a C library


Attachments: File ax_lib_sqlite3.m4.diff    
Issue Links:
Duplicate
is duplicated by ZBX-4535 zabbix - FTBFS with ld --as-needed Closed

 Description   

If i configure --with-sqlite3 then:
checking for SQLite3 library >= 3.0.0... no
configure: error: SQLite3 library not found

If i configure --with-sqlite3=/usr/local then:
checking for SQLite3 library >= 3.0.0... yes
checking for function sqlite3_open_v2() in sqlite3.h... yes

It seems the SQLite3 check needs to be improved for different OSes.

P.S. zabbix_server (--with-sqlite3=/usr/local) starts and works normally with the SQLite3 DB backend.
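The attached ax_lib_sqlite3.m4.diff makes the configure check probe additional common install prefixes instead of only the default paths. A rough Python sketch of that probing logic; the prefix list and function name are assumptions for illustration, not the actual macro's contents:

```python
import os

# Hypothetical prefix list; the real m4 macro's search paths may differ.
DEFAULT_PREFIXES = ["/usr", "/usr/local", "/opt/local"]

def find_sqlite3(prefix=None, prefixes=DEFAULT_PREFIXES):
    """Mimic the patched check: an explicit --with-sqlite3=PREFIX probes
    only that prefix; a bare --with-sqlite3 walks common prefixes."""
    for p in ([prefix] if prefix else prefixes):
        if os.path.isfile(os.path.join(p, "include", "sqlite3.h")):
            return p
    return None  # configure would then report "SQLite3 library not found"
```

On FreeBSD, where ports install under /usr/local, such a walk finds the header without requiring the user to pass the prefix explicitly.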



 Comments   
Comment by Alexei Vladishev [ 2011 Jan 28 ]

Sorry, but I do not see any issues here. Why blocker?

Check for the SQLite3 library is done in exactly the same way as, for example, check for PostgreSQL library. We do not look at any OS specific directories.

I am closing it.

Comment by Oleksii Zagorskyi [ 2011 Jan 30 ]

He-he, the game is NOT over.
Why blocker? Because of a compilation error.
Probably my knowledge is not as great as yours, Alexei, but I'm single-minded and partially crazy.

I don't agree that "/usr/local" is an OS-specific directory. It's the prefix path where all software is installed under FreeBSD, and maybe not only there - I don't know.
And, for example, you're not right about "We do not look at any OS specific directories" - look at "libssh2.m4". That macro tries to check several paths, and its authors are two Zabbix developers.

Okay, enough talk - take my very simple patch, "ax_lib_sqlite3.m4.diff".

Source info here:
http://git.savannah.gnu.org/gitweb/?p=autoconf-archive.git;a=commitdiff;h=cf19f0cc2e2b52c025063354c9cfd91ea0bf8fdf;hp=05f057531dfd0308c9892df4913c92101196ad18

Well, after digging through the source code and the manuals of the auto* tools, I see that static linking is not supported - and indeed I cannot compile with the --enable-static flag. Check it out; apparently that should be fixed too.
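For reference, the idea behind probing several common installation prefixes (as the attached ax_lib_sqlite3.m4 patch does at configure time) can be sketched in a small illustrative model. This is a hypothetical Python sketch, not the actual m4 macro; the function name and prefix list are made up:

```python
import os

def find_sqlite3_prefix(prefixes):
    """Return the first prefix containing include/sqlite3.h, else None.

    Illustrative model of a configure-time probe over common install
    locations, e.g. /usr and /usr/local (the default ports prefix on
    FreeBSD, which is where the reporter's SQLite3 lives).
    """
    for prefix in prefixes:
        if os.path.isfile(os.path.join(prefix, "include", "sqlite3.h")):
            return prefix
    return None
```

With such a probe, a FreeBSD build would pick up /usr/local automatically, without the user having to pass --with-sqlite3=/usr/local explicitly.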

Comment by Alexei Vladishev [ 2011 Jan 31 ]

All right, I agree that it shouldn't be closed. Let's keep it open; the priority has been changed to Minor.

Comment by Alexander Vladishev [ 2012 Jan 30 ]

Fixed in versions pre-1.8.11 r25085 and pre-1.9.9 r25087. See ZBX-4535 for more details.





[ZBX-3286] Ability to automatically/manually delete discovered resources (low level discovery). Created: 2010 Dec 13  Updated: 2017 May 30  Resolved: 2011 Dec 21

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Frontend (F), Server (S)
Affects Version/s: 1.9.1 (alpha)
Fix Version/s: 1.9.9 (beta)

Type: Incident report Priority: Blocker
Reporter: Oleksii Zagorskyi Assignee: Unassigned
Resolution: Fixed Votes: 13
Labels: discovery, lld
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

latest trunk


Attachments: Text File zabbix_server_demo_R23852.log    
Issue Links:
Duplicate
is duplicated by ZBX-4592 When using low level discovery in a t... Closed

 Description   

Currently the auto-added resources (items, triggers) cannot be deleted. Only graphs can be manually deleted - I think this is a bug.

We need to come up with a mechanism for automatic/manual removal of obsolete resources.
By obsolete, I mean:
resources that belong to the prototype but are no longer discovered (for example, we delete a VLAN);
resources that no longer pass through the macro filter (various examples).
There should be an option in the rule settings for whether to delete obsolete resources automatically.

I do not have a clear vision of how it should look, but I think it should be possible to manually remove these obsolete resources; for example, they should be made available for manual deletion (resources that are not obsolete should stay unavailable for deletion). This applies when automatic deletion is not enabled.

As discussed with Aleksandrs Saveljevs, I mark this issue as a "blocker".



 Comments   
Comment by Paxos [ 2011 Apr 27 ]

Having the same issue with 1.9.3.

I think the current logic of keeping auto-generated resources around after they disappear from discovery is the way to go.

However, it must be possible to remove these items manually; they should not be auto-removed.

Comment by Marcin Gapiński [ 2011 Oct 14 ]

It would be great if there were an option to choose what to do when a discovered item is no longer present.

I can think of two possible scenarios: delete that item automatically, or disable it and allow manual removal. That would certainly decrease resource usage in a rapidly changing monitoring environment.

Comment by Yoav Steinberg [ 2011 Nov 08 ]

I'd really like to see an option to auto-remove discovered items. For things like file systems or network interfaces this might not be critical, and we might want to continue seeing those items after they disappear. But for other use cases - like monitoring instances of a process on a system that spawns new processes based on some external input (a rapidly changing system) - there should be an option to automatically delete the discovered items after they disappear.

Comment by Alexei Vladishev [ 2011 Nov 16 ]

A new global (Administration->General->???) configuration parameter will define the lifetime of lost items. If an item exceeds the period, it will be removed, and a record for the housekeeper will be added as well.

Comment by Alexey Pustovalov [ 2011 Nov 21 ]

Maybe it would be better not to make it global? Host level, or discovery rule level?

Comment by richlv [ 2011 Nov 21 ]

lld rule level would seem to be most appropriate. for example, we might have a host whose network interfaces change quite rapidly, so we know that an interface going down means we can mostly safely remove it after 3 days. on the other hand, disk volumes going away would be less frequent, and their data would be more important in the long term, so we would want those items removed only after 2 weeks have passed

Comment by Alexei Vladishev [ 2011 Nov 23 ]

Sure, it's better to keep it on lld level with support of user macros (template, host, global).

Comment by Alexey Fukalov [ 2011 Dec 05 ]

A new field, "lifetime varchar(64)", has been added to the "items" table. It allows a user macro or a number of days.
A new input field has been added to the discovery rule form, before "status".
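Since the field accepts either a user macro or a plain day count, the validation rule can be sketched as follows. This is an illustrative Python sketch, not the frontend code; the 0–3650 day limits are assumed from the error messages quoted later in this issue, and the macro pattern is an assumption:

```python
import re

# Assumed limits, taken from the frontend error text
# "(min: 0, max: 3650, user macro allowed)".
LIFETIME_MIN, LIFETIME_MAX = 0, 3650

def lifetime_is_valid(value):
    """Accept a user macro like {$LIFETIME} or an in-range integer day count."""
    # user macro form: {$NAME} (character set assumed)
    if re.fullmatch(r"\{\$[A-Z0-9_.]+\}", value):
        return True
    if not value.isdigit():
        return False  # empty string, sign, or non-numeric input
    return LIFETIME_MIN <= int(value) <= LIFETIME_MAX
```

Note that under such a rule an empty string is rejected, which matches the cloning bug reported below where lifetime arrived as "".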

Comment by Pavels Jelisejevs (Inactive) [ 2011 Dec 06 ]

(1) GUI
I think we should replace the maximum (and, maybe, minimum) lifetime with constants in the CDiscoveryRule::checkSpecificFields() and CDiscoveryRule::validateLifetime() methods and try to avoid using literal constants in the future.

Same for validatePortNumber()

<Vedmak> RESOLVED

<pavels> CLOSED.

Comment by Pavels Jelisejevs (Inactive) [ 2011 Dec 06 ]

(2) GUI
Shouldn't the clearValues() method in CItemGeneral::checkInput() and CHostInterface::checkInput() be called before the validation is performed? What's the point of cleaning up values, if they are already OK?

<Vedmak> RESOLVED

<pavels> CLOSED.

Comment by Pavels Jelisejevs (Inactive) [ 2011 Dec 06 ]

(3) GUI
Please review my changes in r23788 and r23789.

<Vedmak> CLOSED

Comment by Pavels Jelisejevs (Inactive) [ 2011 Dec 06 ]

(4) DOC
The docs should also be updated:
http://www.zabbix.com/documentation/internal/database_2.0/items
http://www.zabbix.com/documentation/2.0/manual/discovery/low_level_discovery

<Vedmak> I'll update db and api docs, and put separate paper on the board for it.

<pavels> Db and API docs are ok. CLOSED.

Comment by Alexander Vladishev [ 2011 Dec 06 ]

(5) [DB] DB patches
upgrades/dbpatches/2.0/postgresql/patch/hosts.sql:64 - incorrect SQL statement delimiter
upgrades/dbpatches/2.0/oracle/patch/hosts.sql:64 - useless constraint 'NOT NULL'

<Vedmak> RESOLVED
<sasha> CLOSED

Comment by Pavels Jelisejevs (Inactive) [ 2011 Dec 06 ]

(6) GUI
When cloning a discovery rule or full cloning a host an error occurs:
"Discovery rule "lld22-2:lld2" has incorrect lifetime: "". (min: 0, max: 3650, user macro allowed)"

<Vedmak> RESOLVED

<pavels> CLOSED.

Comment by Alexander Vladishev [ 2011 Dec 06 ]

(7) [GUI] when creating/updating a discovery rule with 'Keep lost resources period (in days)' = 0 an error occurs:
Discovery rule "vfs.fs.discovery:vfs.fs.discovery" has incorrect lifetime: "". (min: 0, max: 3650, user macro allowed)

<Vedmak> RESOLVED

<pavels> CLOSED.

Comment by richlv [ 2011 Dec 06 ]

(8) api docs (to avoid overloading (4))
a) having http://www.zabbix.com/documentation/2.0/manual/appendix/api/discoveryrule is great, but to be consistent i guess there shouldn't be space in the page title (not sure about capitalisation)

b) also should be listed at http://www.zabbix.com/documentation/2.0/manual/appendix/api/changes_1.8_-_2.0

<Vedmak> RESOLVED

<richlv> do we have a rule on class naming ? because now we have "TemplateScreen" and "Discoveryrule" - would be nice to settle on either camelcase or something.
hmm, changes page actually says "DiscoveryRule" - which one is correct then ?

<zalex> last question moved to ZBXNEXT-1058
CLOSED

Comment by Pavels Jelisejevs (Inactive) [ 2011 Dec 07 ]

GUI is TESTED.

Comment by Alexander Vladishev [ 2011 Dec 07 ]

The server side is ready to test!

<dimir> Great! Please review my changes that allow it to work without graph prototypes in r23965 .

Comment by Oleksii Zagorskyi [ 2011 Dec 08 ]

I couldn't resist testing the dev branch ASAP.
Server side tested. It seems to work.
A part of the debug log is attached.

>>> 15557:20111208:010525.923 poller #1 spent 2.178640 seconds while updating 1 values
it's OK, because the tested SNMP device is rather slow:

$ time snmpwalk -c scarysecret -v 1 10.20.0.5 IF-MIB::ifAlias
.....~40 result records ....

real 0m1.965s
user 0m0.344s
sys 0m0.016s

Comment by Oleksii Zagorskyi [ 2011 Dec 08 ]

(9) An item that no longer passes through the filter, but has not been deleted yet, still continues to be monitored:

mysql> select i.lastclock, d.lastcheck from items i, item_discovery d where i.itemid=25905 and i.itemid=d.itemid;
+------------+------------+
| lastclock  | lastcheck  |
+------------+------------+
| 1323302205 | 1323298285 |
+------------+------------+

Is it OK by design?

Added:
For instance, how can a user understand that, after he changed the regexp, previously discovered and added resources no longer pass the filter?
If gathering of the data were stopped, it could serve as an "indicator" that the resources are not discovered anymore and are going to be deleted in N days.

How else can the user understand that some resources are not discovered anymore?

<zalex> some part moved to the ZBXNEXT-1058, some to the ZBX-4475
CLOSED

Comment by Oleksii Zagorskyi [ 2011 Dec 08 ]

(11) I would suggest changing the default value of 0 days for "Keep lost resources period (in days)" to, for instance, 1 day when a user creates a new discovery rule.
Imagine the case where the user creates a discovery rule and forgets to change 0 to a value >= 1.
Later he or someone else wrongly changes the regexp of the discovery rule, and all historical data can be unexpectedly lost in one moment (the next time discovery is performed by zabbix_server).

This is dangerous behavior. The user should set 0 days only when he understands what he is doing.

<zalex> moved to ZBXNEXT-1058
CLOSED

Comment by Ghozlane TOUMI [ 2011 Dec 09 ]

Hi.
The method I suggested in ZBXNEXT-925 could be an answer:
you could add a 'layer' between discovery and actual item creation, where the user could see, with some kind of status:

  • the currently discovered resources;
  • the previously discovered resources that are no longer present, both with the date of last discovery;
  • and eventually the manually added resources (which was the main request of ZBXNEXT-925).

That way the user can choose between automatic and manual creation/deletion,
see what's going on with the discovery, and manage the resources by disabling/removing them.

That could allow some new use cases, like:
deactivating the discovery after an initial run while keeping its resources,
using the LLD structure with manually entered resources, etc.

Comment by dimir [ 2011 Dec 14 ]

Note, until ZBX-4425 is merged into trunk the server side will not work in this development branch. Meanwhile you can apply this patch:

svn diff svn://svn.zabbix.com/branches/dev/ZBX-4425 -c r23945

in order to test it.

Comment by Alexey Fukalov [ 2011 Dec 21 ]

merged: svn://svn.zabbix.com/trunk 24129

Comment by Alexey Fukalov [ 2011 Dec 21 ]

Additional improvements will be implemented in ZBXNEXT-1058

Comment by Oleksii Zagorskyi [ 2011 Dec 21 ]

Reopened to close (move) some comments and to remove 2.0 from "Fix Version".

Comment by Oleksii Zagorskyi [ 2011 Dec 21 ]

Closed again.

Comment by Nelson Rotunno [ 2014 Jul 29 ]

I've been looking around on the forums and whatnot and it seems like there is still no way to manually delete discovered resources, it would be great to see that added.

At least in my case, many (nearly 30) unnecessary network interfaces are added via the discovery rule, and even though it is possible to disable the new items as a whole, I find this rather "polluting" to my setup.

I see no reason to leave those items there and I can't just recreate the host since I want to keep its historical data. Besides, those items won't get scheduled for auto cleanup because they're still discovered, I simply don't want them around anymore.





[ZBX-2806] Double notification messages (double-generated alert) Created: 2010 Aug 03  Updated: 2017 May 30  Resolved: 2011 Dec 19

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: None
Fix Version/s: 1.9.9 (beta)

Type: Incident report Priority: Blocker
Reporter: Oleksii Zagorskyi Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: actions
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

last trunk revision 13781


Attachments: PNG File 1_double_message_#1.png     PNG File 2_double_message_#2.png     Text File 2action.log     PNG File 3_double_message_#3.png     File ZBX-2806-trunk.patch     PNG File rev24046_improvement+DB_usage.png     PNG File rev24046_improvement.png     File zabbix_server_patched.log.gz    

 Description   

See attached screen shot.
I can not understand when and why this happens (there is an assumption that it is connected with the moment when the server updates the cache configuration).
Server debuglog exists and if you want I can send directly to the developers.

My config (easy to understand that in the picture, the problem is not relevant):
DEBUG {HOSTNAME}: {STATUS} | {ITEM.NAME} | EVENT.ID: {EVENT.ID} | Server timestamp = {DATE} {TIME} | TRIGGER.NAME: {TRIGGER.NAME}
***************************************************
ITEM.VALUE1: {ITEM.VALUE1}
ITEM.LOG.EVENTID1: {ITEM.LOG.EVENTID1}
ITEM.LOG.SOURCE1: {ITEM.LOG.SOURCE1}
ITEM.LOG.AGE1: {ITEM.LOG.AGE1}
***************************************************
ITEM.LASTVALUE1: {ITEM.LASTVALUE1}

 Comments   
Comment by Oleksii Zagorskyi [ 2010 Aug 19 ]

Sorry, the rest is in Russian, because there is a lot of logic involved and I could make mistakes in translation.

[Translated:] Now I manage to reproduce the problem more reliably, with almost 100% reproducibility.
It is very important that the time on the server and the agent is synchronized; then the problem is easy to reproduce consistently.
Steps:
Start the server at around second 55, so that it begins working exactly at the start of a minute! That is, so that the "DCsync_configuration" procedure happens exactly at the start of a minute - we will use that as a reference point.
Wait until ~second 57-58 on the Zabbix agent, then start a script that quickly generates 100 events into the event log. The agent is configured to start sending events immediately (1-second buffer).
The server processes escalations in the interval of ~0-10 seconds at the start of the minute. Somewhere in this period "DCsync_configuration" occurs - or maybe some other procedure causes the problem, I am not entirely sure.
In the audit log we see not 100, but 101 messages.

The preliminary conclusion is this: when the "process_escalations" procedure finishes its work after the "DCsync_configuration" procedure has completed, an alert is mistakenly generated again for the last already-processed event on the next call of "process_escalations". Sorry if this sounds muddled, but that is roughly it.

I have prepared log excerpts where the sequence leading to the problem is visible, with descriptions between the paragraphs.
In the log example (I prepared it yesterday) the server was NOT started at the beginning of a minute, so do not pay attention to the timestamps.

Comment by Daniel Poßmann [ 2011 Oct 14 ]

Could this issue also apply to Zabbix 1.8.5?
We have seen double-activated actions and also double-generated events (so far always 3 seconds between those events). The last time it happened with a trigger that had 8 conditions (8 items; 4 hosts with 2 items each) connected with a logical OR.
As far as I understood the Google translation of the first comment, it looks like the same issue.

Comment by Oleksii Zagorskyi [ 2011 Oct 17 ]

Yes, I managed to reproduce it in the latest trunk revision:
Zabbix Server v1.9.7 (revision 22419) (9 September 2011)
Compilation time: Oct 14 2011 18:05:51

And I'm almost sure this problem is connected to the call of the DCsync_configuration() function.
The new "-R config_cache_reload" feature helped me a lot to reproduce this problem on a new, much faster server than the one I used earlier.
I.e. I had almost lost hope of reproducing it, but "-R config_cache_reload" finally helped.

Give me some time to figure out some details in the debug log.

Comment by Oleksii Zagorskyi [ 2011 Oct 24 ]

How to reproduce:
While zabbix_server is massively sending alerts (for values received from an active agent, though maybe that is not so important), execute a configuration cache reload.

1. The agent bulk-sends 50 values to the server (one item of type "Zabbix agent (active)").
2. I wait ~8-10 seconds and the server starts massively sending alerts (50 mails).
3. I execute "config_cache_reload" many, many times (at 200-400 millisecond intervals) over a short period of time (3-5 seconds).
4. The problem is reproduced -> I have 51 mails in the mailbox.

Comment by dimir [ 2011 Nov 01 ]

Which database?

Comment by Oleksii Zagorskyi [ 2011 Nov 01 ]

> Which database?
MySQL 5.5.8 (FreeBSD), MySQL 5.1.49 (Debian).

Comment by dimir [ 2011 Nov 02 ]

This happens when alerts are created and put into the DB, before they are handled. When we receive 50 values, 51 alerts are spawned and 51 alert entries are put into the DB. So steps 2 and 3 can be skipped. I am continuing to work on it.

Comment by dimir [ 2011 Nov 03 ]

This happens when escalations are handled. The conflict is between the "escalator" and "history syncer" processes when they access the "escalations" table. Continuing to investigate.

Comment by dimir [ 2011 Nov 03 ]

I have added some logging and here's the difference (both examples of handling escalation ID 2):

NO DUPLICATE EVENTS:
escalator#1:20111103:154956.268 process_escalations() escalid:2 eventid:8451 escstatus:0
escalator#1:20111103:154956.268 process_escalations() escalid:2 status:ACTIVE send mail
escalator#1:20111103:154956.275 process_escalations() escalid:2 status:0 FAILED, set nextcheck
histsyncr#1:20111103:154956.297 process_actions(eventid:8452) add escalation
histsyncr#1:20111103:154956.297 DBstart_escalation(remove older active escalations) actionid:3 and status not in (1,4,5)

DUPLICATE EVENTS:
escalator#1:20111103:123153.535 process_escalations() escalid:2 eventid:8195 escstatus:0
escalator#1:20111103:123153.535 process_escalations() escalid:2 status:ACTIVE send mail
histsyncr#3:20111103:123153.558 process_actions(eventid:8196) add escalation
histsyncr#3:20111103:123153.558 DBstart_escalation(remove older active escalations) actionid:3 and status not in (1,4,5)
escalator#1:20111103:123153.558 process_escalations() escalid:2 status:0 FAILED, set nextcheck

The problem happens when a history syncer "intrudes" while the escalator is processing an escalation.

Comment by dimir [ 2011 Nov 07 ]

So, the problem is that the escalator selects 51 escalations to process from a table containing 50 records. Here is how it happens:

  • [history syncer] is adding 50 escalations one by one, in two steps:
    • change the current ACTIVE escalation to SUPERSEDED_ACTIVE
    • insert a new escalation with status ACTIVE
  • [escalator] selects escalations to process while the history syncer is adding escalations; 51 rows are selected:
    • two records are returned with the same escalation ID and event ID:
      • one with status ACTIVE
      • another with status SUPERSEDED_ACTIVE

For the record, this also happens with just 1 dbsyncer, so it is not a conflict between dbsyncers.
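A toy model of that over-selection (illustrative Python only - not Zabbix code; the IDs are made up to mirror the log excerpts above): if the escalator's result set contains both the superseded row and its freshly inserted replacement for the same escalation, naive per-row processing fires one alert too many.

```python
# Row tuples: (escalationid, eventid, status); statuses are illustrative.
ACTIVE, SUPERSEDED_ACTIVE = "ACTIVE", "SUPERSEDED_ACTIVE"

def alerts_for(snapshot):
    """Naive per-row processing: one alert per selected row."""
    return [eventid for (_escalationid, eventid, _status) in snapshot]

# A snapshot taken while the syncer is between its two steps for
# escalation 2: the superseded row and its replacement are both visible.
snapshot = [
    (1, 8194, ACTIVE),
    (2, 8195, SUPERSEDED_ACTIVE),  # old row, already superseded
    (2, 8195, ACTIVE),             # freshly inserted replacement
]

duplicate_alerts = alerts_for(snapshot)  # 3 alerts for only 2 distinct events
```

Scaled up to 50 escalations, this is exactly the "51 alerts from 50 values" symptom: one escalation is counted twice.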

Comment by dimir [ 2011 Nov 07 ]

From MySQL manual:

http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html#isolevel_repeatable-read

The default isolation level of a transaction is REPEATABLE READ, which causes this behavior. If I add code to set the transaction isolation level to READ COMMITTED, so that only committed data is read:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED

the problem goes away.

Comment by dimir [ 2011 Nov 07 ]

We decided to control transaction isolation explicitly, so that data from other in-progress transactions is not read. Related documentation:

http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html
http://www.postgresql.org/docs/9.0/static/transaction-iso.html
http://www.sqlite.org/sharedcache.html (Section "2.2.1 Read-Uncommitted Isolation Mode")

Comment by dimir [ 2011 Nov 07 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-2806 .

How I tested it:

  • create a Zabbix trapper item "test.garbage" (default values)
  • create a trigger that fires when that item receives a non-zero value, with event generation "Normal + Multiple PROBLEM events"
  • create an action that sends an e-mail when the trigger value is PROBLEM

From the console, feed the item with values (you can increase 50 to something bigger):

$ for ((i=0; i<50; i++)); do rnd=$[ $RANDOM % 100 + 1 ]; bin/zabbix_sender -z 127.0.0.1 -s "Zabbix server" -k test.garbage -o $rnd; done

You should get exactly 50 e-mails.

Comment by dimir [ 2011 Nov 07 ]

Patch for current trunk (r23016).

Comment by dimir [ 2011 Nov 07 ]

Oleksiy,

I have attached the patch for current trunk, would it be possible for you to test that and confirm that it fixes the problem?

Comment by Oleksii Zagorskyi [ 2011 Nov 08 ]

Vladimir,
unfortunately I have reproduced the bug even with the patched version.
Please give me some time to prepare a detailed answer and attach a debug log.

Comment by Oleksii Zagorskyi [ 2011 Nov 08 ]

For now I must note that it seems easier to reproduce the bug with DebugLevel=4 than with 3.
I recall this issue was originally reported, and several times reproduced in the past, with DebugLevel=4 only.

I tried the patched version of the latest trunk.
I used the command suggested by Vladimir (50 values sent).
I managed to reproduce the problem in ~30-60% of attempts.
When I sent 200 values, I managed to reproduce it in 99% of attempts.

But just now I tried my "old" method, and it works "even better".
I have a slight impression that it is easier to reproduce the bug using my method of bulk-sending values via an active agent than using zabbix_sender in a loop.

When I sent 200 values, I managed to reproduce the problem in seemingly 100% of attempts (sometimes I got 202 alerts).

Huh, sorry if this comment is not clear - I wrote it over the course of the last day, between several looks at this issue.

See the attached debug log.
Duplicated alerts - event ID 671109.

Vladimir, why is the issue marked as resolved?

Comment by dimir [ 2011 Nov 09 ]

Reopening based on comment https://support.zabbix.com/browse/ZBX-2806?focusedCommentId=50012&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-50012

Comment by dimir [ 2011 Nov 17 ]

Move escalations handling to escalator process.

RESOLVED in r23283

Comment by dimir [ 2011 Nov 17 ]

This is a fix for 1.8, so Oleksiy, please wait until it is reviewed; then I will create a patch for trunk. Merging the fix into trunk produces conflicts, and I do not want to resolve them multiple times.

<zalex> I can test any version, trunk or the 1.8 branch. Currently I do not need the patch for trunk specifically. If it makes sense to test the 1.8 branch before the code review, please let me know.

<dimir> It would be great if you could try reproducing the bug with the current 1.8 branch, because I could not do it, despite hitting the bug in 90% of cases when I used trunk. It is weird, considering that I could not find any differences between the escalation-handling code in 1.8 and trunk.

Comment by Oleksii Zagorskyi [ 2011 Nov 21 ]

<zalex>
trunk branch:
Zabbix server v1.9.8 (revision 23322) (27 October 2011)
Compilation time: Nov 21 2011 02:56:07
Reproducible as usual.

--->>>> 1.8 branch:
Zabbix Server v1.8.9rc2 (revision 23324) (15 November 2011)
Compilation time: Nov 21 2011 01:36:57
Cannot reproduce for 2000 alerts.

Dev (this) branch:
Zabbix Server v1.8.9rc1 (revision 23283) (11 November 2011)
Compilation time: Nov 21 2011 01:01:30
Cannot reproduce for 2000 alerts.

Vladimir, as you can see, I cannot reproduce it even in the 1.8 branch.

Comment by dimir [ 2011 Nov 22 ]

(5) Fix handling of the triggerid and r_eventid fields in process_escalations(). These fields are NULL by default in trunk.

<dimir> RESOLVED in r23428, r23431, please check
<sasha> CLOSED

Comment by dimir [ 2011 Nov 22 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-2806-trunk

Comment by dimir [ 2011 Nov 22 ]

Because the fix requires database changes, and it is not reproducible in 1.8 anyway, the fix will be provided only for trunk.

Oleksiy, it would be great if you could test svn://svn.zabbix.com/branches/dev/ZBX-2806-trunk .

<zalex>
Dev branch tested.
Debuglog:
21893:20111122:231849.864 In process_escalations()
21893:20111122:231849.864 query [txnlev:0] [select escalationid,actionid,triggerid,eventid,r_eventid,esc_step,status from escalations where (status=0 and nextcheck<=1321996729 or status=2 and r_eventid<>0) ·‡ order by actionid,triggerid,escalationid]
21893:20111122:231849.865 [Z3005] query failed: [1064] You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '·‡ order by actionid,triggerid,escalationid' at line 1 [select escalationid,actionid,triggerid,eventid,r_eventid,esc_step,status from escalations where (status=0 and nextcheck<=1321996729 or status=2 and r_eventid<>0) ·‡ order by actionid,triggerid,escalationid]
21893:20111122:231849.865 End of process_escalations()

Yes, it seems some characters are not UTF-8. I don't know what they are.
A part in hex:
00000100 30 29 20 B7 │ 87 8F 15 7F │ 20 6F 72 64 │ 65 72 20 62 0) .. order b

Need to fix and then I'll retest it again.

Comment by dimir [ 2011 Nov 23 ]

(6) As zalex noted, there was a bug in the SQL query.

<dimir> RESOLVED in r23450, please retest.

<zalex> r 23450, still error:
1017:20111123:095534.861 In process_escalations()
1017:20111123:095534.861 query [txnlev:0] [select escalationid,actionid,triggerid,eventid,r_eventid,esc_step,status from escalations where (status=0 and nextcheck<=1322034934 or status=2 and r_eventid not null) and escalationid between 000000000000000 and 099999999999999 order by actionid,triggerid,escalationid]
1017:20111123:095534.861 [Z3005] query failed: [1064] You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'null) and escalationid between 000000000000000 and 099999999999999 order by acti' at line 1 [select escalationid,actionid,triggerid,eventid,r_eventid,esc_step,status from escalations where (status=0 and nextcheck<=1322034934 or status=2 and r_eventid not null) and escalationid between 000000000000000 and 099999999999999 order by actionid,triggerid,escalationid]
1017:20111123:095534.862 End of process_escalations()
1017:20111123:095534.862 escalator #1 spent 0.000156 seconds while processing escalations

<dimir> RESOLVED in r23455, sorry, my bad, I should have tested it better.

<zalex> r23455. After 10,000 events the problem is NOT reproducible. It seems ready for testing/review by the developers.

<dimir> Thank you, zalex.
<sasha> CLOSED

Comment by Alexander Vladishev [ 2011 Dec 05 ]

(7) The escalation is always deleted after the first step; the following steps are never executed.

<dimir> That's right, fixed delete escalation condition. RESOLVED in r23813.
<sasha> CLOSED

Comment by Alexander Vladishev [ 2011 Dec 05 ]

(8) Please review my changes in rev 23773.

<dimir> Perfect! CLOSED.

Comment by Alexander Vladishev [ 2011 Dec 07 ]

(9) src/zabbix_server/escalator/escalator.c:1243
I don't like this SQL statement. When I generated 950 events, notifications appeared for only 511 of them.

<dimir> RESOLVED in r23905
<Sasha> CLOSED

Comment by Alexander Vladishev [ 2011 Dec 15 ]

(10) Statement "SET DEFINE OFF" doesn't need the line-terminator ';' or '/'. Please remove it from gen_data.pl and help_items.sql.

<dimir> RESOLVED in r24018
<Sasha> CLOSED

Comment by dimir [ 2011 Dec 16 ]

Just a summary of what was done.

Previously, escalations were processed by both the dbsyncer and the escalator. The same records in the escalations table were modified by both processes. This caused a mess, and it was not possible to fix it without a locking mechanism, which we don't have/use in cases like these. So instead, the dbsyncer now only inserts records, which are later modified/removed by the escalator process.

This also reduces the workload of the dbsyncer, which was an issue as well.

<richlv> thanks for the description. does http://www.zabbix.com/documentation/2.0/manual/introduction/whatsnew200?&#improved_history_db_syncer_performance summarise it correctly ?
this is something i'd love to test empirically...

<Sasha> Excellent! Thanks!

<zalex> See the attached picture "rev24046_improvement.png". I suppose the mentioned documentation note has to be reviewed (it seems the escalator's load noticeably decreased rather than increased) -> REOPENED

<richlv> updated manual according to the testing results (which imply that both escalator & history syncers are more efficient now)

<zalex> Reviewed. I like it, thanks. CLOSED
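The division of labour dimir summarized above (the dbsyncer only inserts escalation records; the escalator alone modifies and removes them) can be sketched as a toy model. This is illustrative Python, not the actual server code; the class and method names are made up:

```python
class EscalationsTable:
    """Toy model of the ownership rule: the history syncer only ever
    appends rows, and the escalator is the sole process that reads,
    updates, and deletes them - so the two can no longer conflict."""

    def __init__(self):
        self.rows = []
        self._next_id = 1

    def syncer_insert(self, eventid):
        # history syncer: append-only, never touches existing rows
        self.rows.append({"escalationid": self._next_id, "eventid": eventid})
        self._next_id += 1

    def escalator_process(self):
        # escalator: send one alert per pending row, then remove the rows
        sent = [row["eventid"] for row in self.rows]
        self.rows = []
        return sent
```

With a single mutating owner, 50 inserted escalations always yield exactly 50 processed alerts, never 51.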

Comment by Alexander Vladishev [ 2011 Dec 16 ]

Available in version pre-1.9.9, r24046.

Comment by Oleksii Zagorskyi [ 2011 Dec 18 ]

Just in case, the latest trunk revision has been tested with a big number of alerts (40,000). ALL IS FINE with the server.

Reopened to remove 2.0 from "Fix versions" and to add some interesting results of my experiments.
<zalex> CLOSED

Comment by Oleksii Zagorskyi [ 2011 Dec 19 ]

Reopened to add additional statistics requested by Sasha.
See attached "rev24046_improvement+DB_usage.png"





Generated at Sat Apr 20 00:36:03 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.