-
Incident report
-
Resolution: Duplicate
-
Critical
-
None
-
2.2.0
-
None
-
Linux 3.10.7 (Gentoo, amd64)
Database: PostgreSQL 9.2
We have a setup with 3 nodes (1 master and 2 child nodes), all 64-bit Gentoo machines running the same kernel.
Master Node ID 1
--> Child Node ID 3
--> Child Node ID 4
Since we updated from 2.0.8 to 2.2.0, events are no longer synced from the child nodes to the master. Excerpt from the master server log:
26998:20131114:163443.962 query [txnlev:0] [select nodeid from nodes where nodeid=3 and masterid=1]
26998:20131114:163443.963 query [txnlev:0] [select nodeid from nodes where masterid=1]
26998:20131114:163443.963 query [txnlev:0] [select nodeid from nodes where masterid=4]
26998:20131114:163443.963 query [txnlev:0] [select max(eventid) from events where eventid between 300000000000000 and 399999999999999]
26998:20131114:163443.963 NODE 1: sending [0] to Node [3]
26998:20131114:163443.963 trapper #1 [processed data in 0.000865 sec, waiting for connection]
27016:20131114:163443.967 trapper #16 [processing data]
27016:20131114:163443.967 Trapper got [History<AD>3<AD>3<AD>events
300000000000001<AD>1<AD>1<AD>300300000000013<AD>1384418585<AD>0<AD>0<AD>0
300000000000002<AD>1<AD>2<AD>300300000000088<AD>1384418639<AD>0<AD>0<AD>0
300000000000003<AD>1<AD>1<AD>300300000000014<AD>1384418637<AD>0<AD>0<AD>0
300000000000004<AD>1<AD>2<AD>300300000000089<AD>1384418691<AD>0<AD>0<AD>0
300000000000005<AD>1<AD>1<AD>300300000000015<AD>1384418689<AD>0<AD>0<AD>0
300000000000006<AD>1<AD>2<AD>300300000000090<AD>1384418744<AD>0<AD>0<AD>0
300000000000007<AD>1<AD>1<AD>300300000000016<AD>1384418741<AD>0<AD>0<AD>0
300000000000008<AD>1<AD>2<AD>300300000000091<AD>1384418796<AD>0<AD>0<AD>0
300000000000406<AD>1<AD>2<AD>300300000000105<AD>1384443196<AD>0<AD>0<AD>0
300000000000407<AD>1<AD>1<AD>300300000000055<AD>1384443195<AD>0<AD>0<AD>0] len 21621
27016:20131114:163443.967 In node_history()
27016:20131114:163443.967 query [txnlev:1] [begin;]
27016:20131114:163443.967 query [txnlev:1] [select nodeid from nodes where nodeid=3 and masterid=1]
27016:20131114:163443.967 query [txnlev:1] [select nodeid from nodes where masterid=1]
27016:20131114:163443.968 query [txnlev:1] [select nodeid from nodes where masterid=4]
27016:20131114:163443.968 NODE 1: received events from node 3 for node 3 datalen 21621
27016:20131114:163443.968 In process_record_event()
27016:20131114:163443.968 End of process_record_event():SUCCEED
27016:20131114:163443.968 In process_record_event()
27016:20131114:163443.968 End of process_record_event():SUCCEED
27016:20131114:163443.968 In process_record_event()
27016:20131114:163443.968 End of process_record_event():SUCCEED
27016:20131114:163443.968 In process_record_event()
27016:20131114:163443.968 End of process_record_event():SUCCEED
27016:20131114:163443.968 In process_record_event()
27016:20131114:163443.968 End of process_record_event():SUCCEED
27016:20131114:163443.968 In process_record_event()
27016:20131114:163443.980 query [txnlev:1] [select description,expression,priority,type from triggers where triggerid=300300000013446]
27016:20131114:163443.980 End of process_record_event():SUCCEED
27016:20131114:163443.980 In process_record_event()
27016:20131114:163443.980 End of process_record_event():SUCCEED
27016:20131114:163443.980 In process_record_event()
27016:20131114:163443.980 End of process_record_event():SUCCEED
27016:20131114:163443.980 query [txnlev:1] [commit;]
27016:20131114:163443.980 trapper #16 [processed data in 0.013332 sec, waiting for connection]
27013:20131114:163443.982 trapper #13 [processing data]
27013:20131114:163443.982 Trapper got [ZBX_GET_HISTORY_LAST_ID<AD>3<AD>3
acknowledges<AD>acknowledgeid] len 54
27013:20131114:163443.982 In send_history_last_id()
According to the log files the sync appears to work, but no event from a child node has ever reached the master database:
zabbix=# select * from events;
     eventid     | source | object |    objectid     |   clock    | value | acknowledged |    ns
-----------------+--------+--------+-----------------+------------+-------+--------------+-----------
 100000000000001 |      3 |      0 | 100100000015089 | 1384419270 |     1 |            0 | 606683896
 100000000000002 |      3 |      0 | 100100000015089 | 1384419270 |     0 |            0 | 461607041
(2 rows)
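To make the mismatch concrete, here is a minimal Python sketch that parses a few of the rows from the "Trapper got" payload in the log and checks them against the event IDs present on the master. It is only an illustration, not Zabbix code: the helper name `parse_event_rows` is hypothetical, the delimiter is taken as the literal text `<AD>` as it appears in the pasted log, and the column order is assumed to match the `events` table header shown by psql.

```python
# Delimiter exactly as it appears in the pasted log (assumption: the raw
# protocol uses a single byte that the log viewer rendered as "<AD>").
DELIM = "<AD>"

# Assumed column order, copied from the psql header of the events table.
COLUMNS = ["eventid", "source", "object", "objectid",
           "clock", "value", "acknowledged", "ns"]


def parse_event_rows(payload_lines):
    """Turn delimiter-separated payload lines into dicts of ints.

    Lines whose field count differs from COLUMNS (for example the
    "History<AD>3<AD>3<AD>events" header) are skipped.
    """
    rows = []
    for line in payload_lines:
        fields = line.strip().split(DELIM)
        if len(fields) != len(COLUMNS):
            continue
        rows.append(dict(zip(COLUMNS, (int(f) for f in fields))))
    return rows


# Two rows copied from the log above, preceded by the payload header.
received = parse_event_rows([
    "History<AD>3<AD>3<AD>events",
    "300000000000001<AD>1<AD>1<AD>300300000000013<AD>1384418585<AD>0<AD>0<AD>0",
    "300000000000002<AD>1<AD>2<AD>300300000000088<AD>1384418639<AD>0<AD>0<AD>0",
])

# Event IDs actually present on the master, from the psql output above.
master_eventids = {100000000000001, 100000000000002}

# Events the master logged as received but that are absent from its table.
missing = [r["eventid"] for r in received if r["eventid"] not in master_eventids]
print(missing)
```

Every received event ID ends up in `missing`, which matches the observation that the trapper reports SUCCEED for each record while the master table contains only its own two events.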
Before we updated to 2.2.0, we always saw event IDs beginning with 3000xxx and 4000xxx in the master database.
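The leading digit of an event ID identifies the node that generated it. The sketch below shows that relationship in Python; the factor of 10^14 is inferred from the query in the log (`eventid between 300000000000000 and 399999999999999`), and the function names `node_of` and `eventid_range` are hypothetical helpers, not part of Zabbix.

```python
# Inferred from the logged range 300000000000000..399999999999999 for node 3.
NODE_FACTOR = 10**14


def node_of(eventid):
    """Return the node ID encoded in the leading digits of an event ID."""
    return eventid // NODE_FACTOR


def eventid_range(nodeid):
    """Return the inclusive (low, high) event ID range owned by a node."""
    low = nodeid * NODE_FACTOR
    return low, low + NODE_FACTOR - 1


# Event IDs found in the master's events table (from the psql output).
on_master = [100000000000001, 100000000000002]
# Event IDs the master logged as received from node 3.
from_child = [300000000000001, 300000000000407]

print({node_of(e) for e in on_master})   # nodes represented on the master
print({node_of(e) for e in from_child})  # node that sent the events
print(eventid_range(3))                  # range used in the logged query
```

With these helpers, a query on the master restricted to `eventid_range(3)` or `eventid_range(4)` would be expected to return rows if synchronization worked; in the state described here it returns none.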
- duplicates
-
ZBX-7452 Event synchronization is broken in multinode DM case
- Closed