Hi Peter,
There are two different scenarios where snapshots come into play, each requiring a different process to re-sync.
Scenario 1 - user loses their connection to NRDP
In this scenario NRDP is still receiving data from Darwin but the user has lost their connection to NRDP. The consumer should be recording the SequenceNumber (included in the header of each message) and the timestamp (included in the body of the message) of every message they process, so they should know the last message they received. When they reconnect, if they have used a durable subscription to their topic and they reconnect within 5 minutes, the next message through the topic should have the next SequenceNumber. If it does, they have not lost any messages and can carry on as normal. If the next SequenceNumber does not directly follow the SequenceNumber of their last processed message, then they have lost messages and should go to the NRDP FTP site and:
- clear their database;
- download and process the snapshot;
- download and process the pPort log file entries until they reach the timestamp of the first message they received through their topic after reconnection;
- resume processing messages from their topic
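
As a rough illustration of the check and replay described above, here is a minimal Python sketch. The Message class, the function names and the optional modulus argument are hypothetical and are not part of NRDP or Darwin. It assumes the log entries are read in timestamp order and that replay stops just before the timestamp of the first message received after reconnection.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import takewhile
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Message:
    """Hypothetical, simplified view of a message."""
    sequence_number: int  # taken from the message header
    timestamp: datetime   # taken from the message body


def is_contiguous(last_seq: int, next_seq: int,
                  modulus: Optional[int] = None) -> bool:
    """True if next_seq directly follows last_seq.

    modulus is an assumption: pass it only if the SequenceNumber is
    known to wrap around at a fixed value.
    """
    expected = last_seq + 1
    if modulus is not None:
        expected %= modulus
    return next_seq == expected


def entries_to_replay(log_entries: Iterable[Message],
                      first_live: Message) -> Iterator[Message]:
    """Yield log entries until the timestamp of the first message
    received through the topic after reconnection is reached.

    Assumes log_entries are in timestamp order. Entries with the same
    timestamp as first_live are not replayed (a choice of this sketch).
    """
    return takewhile(lambda e: e.timestamp < first_live.timestamp,
                     log_entries)
```

If is_contiguous returns False for the first message after reconnection, the consumer would clear their database, process the snapshot, process the entries returned by entries_to_replay, and then resume from the topic.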
Scenario 2 - NRDP loses connection to Darwin
In this scenario NRDP loses connection to Darwin and NRDP users would see a pause in the provision of messages through their topics whilst NRDP tries to re-establish connection. When NRDP manages to reconnect it will check the PushPortSequence number and if this is contiguous then no messages have been lost and NRDP will start publishing the messages to the topics and everyone just continues.
If the PushPortSequence number isn't contiguous* then NRDP will request a snapshot from Darwin, which it will process and pass through the topics to NRDP users. In the background, NRDP should be monitoring its own direct connection to Darwin, looking for the snapshot marker you mention and discarding messages until the relevant marker is received. At that point NRDP is synchronised and should publish subsequent <uR> messages to the topics.
* For clarity, NRDP doesn’t publish all messages it receives from Darwin to the topics so it is normal for gaps to appear in the PushPortSequence in the messages received through NRDP and this doesn’t indicate NRDP is losing messages.
So the sequence of events from an NRDP consumer perspective would be:
- A pause in the provision of <uR> messages through their topic
- A series of <sR> messages published through their topic
- Resumption of <uR> messages through their topic
For those using Darwin data for its primary purpose of realtime information, this process enables consumers to resync with the latest information by discarding their local data when they detect a snapshot (either by detecting <sR> messages in their topic or by monitoring the status topic) and processing the snapshot, at which point they will be in sync again and can process any subsequent update records as normal.
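
For a realtime consumer, the handling described above could be sketched in Python roughly as follows. This is only an illustration: the RealtimeStore class and the reduction of each message to a kind ("sR" or "uR"), a service id and a state string are assumptions made for the example. Real messages carry far more structure.

```python
from typing import Dict


class RealtimeStore:
    """Hypothetical in-memory store for a realtime consumer."""

    def __init__(self) -> None:
        self.services: Dict[str, str] = {}
        self.in_snapshot = False

    def handle(self, kind: str, service_id: str, state: str) -> None:
        if kind == "sR":
            if not self.in_snapshot:
                # First snapshot record seen: discard local data.
                self.services.clear()
                self.in_snapshot = True
            self.services[service_id] = state
        elif kind == "uR":
            # Update records resume: the snapshot is complete.
            self.in_snapshot = False
            self.services[service_id] = state
        else:
            raise ValueError(f"unexpected message kind: {kind!r}")
```

Clearing the store only on the first <sR> message of a run means the remaining snapshot records build on a clean state rather than wiping each other out.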
For those using NRDP topics to record what happened to services (i.e. not in realtime), local data should be retained, as the snapshot only contains information on ‘active’ services. If any services become ‘deactivated’ during the outage they won’t be included in the snapshot and no information will be provided on what happened to them (so this data is lost). Note: this would be equally true whether a user is connected directly to Darwin or receives Darwin data through NRDP.
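
For a consumer keeping a historical record, one possible approach is to merge the snapshot into the retained data rather than replace it. The Python sketch below is illustrative only: merge_snapshot and the dictionaries of service id to state are hypothetical. Listing the services that are missing from the snapshot is a suggestion of this sketch, not something NRDP provides.

```python
from typing import Dict, Set, Tuple


def merge_snapshot(local: Dict[str, str],
                   snapshot: Dict[str, str]) -> Tuple[Dict[str, str], Set[str]]:
    """Merge snapshot records into retained local data.

    Returns the merged data and the ids of locally known services that
    are absent from the snapshot. Those services may have been
    deactivated during the outage (or earlier), so their final state
    cannot be recovered from the snapshot.
    """
    merged = dict(local)      # keep everything already recorded
    merged.update(snapshot)   # the snapshot wins for active services
    missing = set(local) - set(snapshot)
    return merged, missing
```

For example, if "B" is held locally but does not appear in the snapshot, it stays in the merged data with its last recorded state and is returned in the missing set.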
Regards,
RDG