Ticket #281 (accepted defect)
MAS sometimes goes out of sync
| Reported by: | bertil.engelholm@… | Owned by: | mathi |
|---|---|---|---|
| Priority: | major | Milestone: | PL 2.0.2 |
| Component: | MASv | Version: | 2.0.0 |
| Keywords: | | Cc: | |
| patch waiting for maintainer: | no | | |
Description
MAS synchronization of its data between active and standby is not working properly. Sometimes it goes out of sync, which will cause a cluster restart if a failover occurs in this situation. A warm sync made once every minute will correct the problem, but that leaves up to a minute where you basically run without a synced standby.
The problem is in the mas_init_register function, where the active node sometimes increases its data change counter (async_count) but the standby fails to do the same, which means the standby is out of sync.
In more detail, it happens when the standby calls mab_fltrid_list_get and finds that o_fltr_id == reg_req->fltr_id. In this branch the async_count is never increased. The question is whether the active node should send a SYNC_DONE at all in this case.
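To make the mismatch concrete, here is a minimal, self-contained sketch of the counter drift. It is not the MAS code: node_t, active_register, standby_register and the count_duplicates switch are hypothetical names invented for this illustration, and the real mas_init_register logic is reduced to a single "filter id already known" check. Only async_count and the duplicate filter id condition come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one node; not the real MAS structures. */
typedef struct {
    uint32_t async_count;   /* number of data changes this node has counted */
    uint32_t known_fltr_id; /* filter id already registered, 0 = none */
} node_t;

/* Active side (assumption): every register request is counted as a change. */
static void active_register(node_t *n, uint32_t fltr_id)
{
    assert(n != NULL);
    n->known_fltr_id = fltr_id;
    n->async_count++;
}

/* Standby side: when the filter id is already present, the branch described
 * in the ticket returns without touching async_count. Passing
 * count_duplicates = true models one possible fix (count it anyway). */
static void standby_register(node_t *n, uint32_t fltr_id, bool count_duplicates)
{
    assert(n != NULL);
    if (n->known_fltr_id == fltr_id) {
        if (count_duplicates)
            n->async_count++;
        return;
    }
    n->known_fltr_id = fltr_id;
    n->async_count++;
}

/* The nodes are considered in sync when their counters match. */
static bool in_sync(const node_t *active, const node_t *standby)
{
    return active->async_count == standby->async_count;
}

/* Registers the same (hypothetical) filter id twice on both nodes and
 * reports whether the counters still agree afterwards. */
static bool duplicate_register_stays_in_sync(bool count_duplicates)
{
    node_t active = {0, 0};
    node_t standby = {0, 0};

    active_register(&active, 7);
    standby_register(&standby, 7, count_duplicates);

    active_register(&active, 7); /* same filter id again */
    standby_register(&standby, 7, count_duplicates);

    return in_sync(&active, &standby);
}
```

Calling duplicate_register_stays_in_sync(false) models the reported behaviour and leaves the counters different; with true the standby counts the duplicate and the counters match. The alternative raised in the question, where the active node neither counts the duplicate nor sends SYNC_DONE for it, would restore agreement from the other side. Which of the two is right for MAS is left open here.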
