Darwin journey and schedule elements

Chris

19 Mar 2018, 6:02:27
to A gathering place for the Open Rail Data community
Hi

Looking at the files available from the Darwin FTP site, I'm a little confused about the timetable and snapshot files. The timetable file has Journey elements, and the snapshot has schedule elements. According to the schemas these are different things, but they seem to be mostly the same. Is a journey the same as a schedule but with a different name? And if they are different, is there any documentation for the Journey elements?

Cheers !

Rail Delivery Group

19 Mar 2018, 8:02:46
to A gathering place for the Open Rail Data community
The Darwin timetable file represents a fresh baseline of schedules for approximately the next 48 hours. Schedules are only published as <Schedule> records via the update (<uR>) or snapshot (<sR>) messages if they are active. If your application needs to know about a schedule which has not yet been 'activated', the timetable file can be used to retrieve and show this schedule data. This data should not be used for a schedule which has gone through an active/deactivated cycle, as the schedule content may have changed.

There are subtle differences between <Journey> and <Schedule> messages (e.g. the <isActive> attribute mentioned above; there are others), which may be the historic reason for the different name (it's been that way for 10+ years). Users need to account for these differences and ensure that a <Schedule> message takes precedence over a <Journey> message for the same service. A <Schedule> message is always a full refresh, so the latest <Schedule> message takes precedence over any <Journey> or previous <Schedule> message for the same service. Once that's done, they can essentially be treated as the same thing.
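
A minimal sketch of that precedence rule, for illustration only. It assumes each service can be identified by a single key (called `rid` here, which is an assumption and not something stated in this thread), and the `ServiceStore` class and its method names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Service:
    rid: str      # assumed unique key for a service (hypothetical name)
    source: str   # "journey" (timetable file) or "schedule" (sR/uR message)
    data: dict    # whatever fields the application keeps


class ServiceStore:
    """Keeps one record per service, preferring Schedule over Journey."""

    def __init__(self):
        self._services = {}

    def apply_journey(self, rid, data):
        """Timetable baseline: must never overwrite a Schedule."""
        current = self._services.get(rid)
        if current is not None and current.source == "schedule":
            return False  # a Schedule already exists and takes precedence
        self._services[rid] = Service(rid, "journey", data)
        return True

    def apply_schedule(self, rid, data):
        """Full refresh: replaces any Journey or earlier Schedule."""
        self._services[rid] = Service(rid, "schedule", data)
        return True

    def get(self, rid):
        return self._services.get(rid)
```

With this shape, re-reading the timetable file later cannot clobber a service that has already been refreshed by a Schedule message.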

Chris

19 Mar 2018, 14:11:54
to A gathering place for the Open Rail Data community
Thanks very much for clearing this up! So just to confirm, to go from nothing to a current schedule:
- read the timetable file and insert Journey elements
- read the snapshot and replace journeys with schedule elements where they already exist, and add them where they do not
- read the push port log files and apply updates to schedules as required
- apply updates from the real-time queue from that point onwards.
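
The first two steps both come down to pulling elements out of a large XML file. A rough sketch of that part, assuming the timetable and snapshot files are gzip-compressed XML and that matching on the element's local name (ignoring the XML namespace) is good enough. The function names are hypothetical, and which attributes to keep is left to the caller:

```python
import gzip
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def iter_elements(path, wanted):
    """Stream (attributes, children) for every element named `wanted`.

    `children` is a list of (local tag name, attributes) pairs for the
    element's direct children. Elements are cleared once they have been
    yielded so that memory use stays flat on large files.
    """
    with gzip.open(path, "rb") as fh:
        for _event, elem in ET.iterparse(fh, events=("end",)):
            if local_name(elem.tag) == wanted:
                children = [(local_name(c.tag), dict(c.attrib)) for c in elem]
                yield dict(elem.attrib), children
                elem.clear()
```

The same function could be pointed at the timetable file with `wanted="Journey"` and at the snapshot with `wanted="schedule"`, matching whatever element names actually appear in the files.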

Is there a need to reimport the timetable after that point, or will all new schedules appear in the real-time updates from that point onwards?

Thanks !

WantStuff

21 Mar 2018, 7:52:26
to A gathering place for the Open Rail Data community
Hi Chris,

I'm going through a similar process myself so am also learning, but it seems there are a couple of synchronisation hoops before switching from files to the real-time queue.
After loading all the files you request the latest snapshot via STOMP/OpenWire which would cause Darwin to:
1) send another file to the snapshot FTP folder containing changes since the last file you loaded,
2) and start queuing the real-time message updates. 

You then have 5 minutes (I assume) to consume this snapshot and start reading messages from the queue.


Please anyone feel free to jump in and correct me or add to this.


Regards
Martin

Chris

21 Mar 2018, 15:58:47
to A gathering place for the Open Rail Data community
Hey,

I thought that the queue on the broker already contained the last 5 minutes of unconsumed messages? Thus, you'd need to run the logs from the snapshot up to at most 5 minutes ago, then start consuming messages from your (STOMP) queue? And hopefully also deal with duplicate messages (that are both in your queue and the logs).   

So in order:
Load ..._v8.xml.gz
Load snapshot/...xml.gz
Load pPortLog.*.log
And then start consuming the queue
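
One way the duplicate handling might look, as a sketch only. It assumes a message that appears in both the logs and the queue is byte-for-byte identical in each, which I have not verified; if that does not hold, a different key (e.g. something taken from the message itself) would be needed. The function names are hypothetical:

```python
import hashlib


def message_key(raw: bytes) -> str:
    """Fingerprint a raw message; identical bytes give an identical key."""
    return hashlib.sha256(raw).hexdigest()


def replay_then_consume(log_messages, queue_messages, apply):
    """Apply every log message, then queue messages not already seen."""
    seen = set()
    for raw in log_messages:
        seen.add(message_key(raw))
        apply(raw)
    for raw in queue_messages:
        key = message_key(raw)
        if key in seen:
            continue  # already applied from the logs (or earlier in the queue)
        seen.add(key)
        apply(raw)
```

In a long-running consumer the `seen` set would only need to cover the overlap window, so it could be dropped once the queue has moved past the end of the logs.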

WantStuff

21 Mar 2018, 19:17:48
to A gathering place for the Open Rail Data community
I don't believe the queue automatically holds the last 5 minutes; you have to trigger it.
So without requesting the snapshot, the messages won't be queued and you could miss a few messages in between the switch-over.

Happy to be proven wrong; this process seems way too over-engineered.

Rail Delivery Group

22 Mar 2018, 10:13:19
to A gathering place for the Open Rail Data community
@Chris

If you are only looking at schedules then fundamentally yes, that's the sequence. If you are looking at real-time running as well, then you do of course also need to apply the <TS> messages in the right order (by timestamp) to get the current status of the train itself (actual movements, forecasts, etc.). The queue should hold the last 10 minutes of updates, so there is a bit of processing to be done around timestamps to determine when to stop using the logs and start using the queue when you first get going. We are looking to improve this with the update to NRDP discussed at last week's developer forum, which should hopefully simplify the process.
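
A sketch of the timestamp processing described above, under stated assumptions: each message has already been reduced to a `(timestamp, payload)` pair with comparable timestamps, and anything in the logs at or after the first queued timestamp is assumed to be present in the queue as well. `plan_cutover` is a hypothetical helper, not part of any Darwin or NRDP tooling:

```python
def plan_cutover(log_msgs, queue_msgs):
    """Order messages by timestamp and switch from logs to queue.

    Returns (ordered, overlap_seen). `ordered` holds the log messages
    strictly before the first queued timestamp, followed by all queued
    messages. `overlap_seen` is True when the logs reach the first queued
    timestamp; if it is False there may be a gap between the two sources.
    """
    log_msgs = sorted(log_msgs, key=lambda m: m[0])
    queue_msgs = sorted(queue_msgs, key=lambda m: m[0])
    if not queue_msgs:
        return log_msgs, False
    first_queue_ts = queue_msgs[0][0]
    from_logs = [m for m in log_msgs if m[0] < first_queue_ts]
    overlap_seen = bool(log_msgs) and log_msgs[-1][0] >= first_queue_ts
    return from_logs + queue_msgs, overlap_seen
```

If `overlap_seen` comes back False, the safer choice is probably to fetch more log files before trusting the queue.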

Peter Hicks (Poggs)

22 Mar 2018, 13:39:26
to WantStuff, A gathering place for the Open Rail Data community
Hi Martin


On 21 Mar 2018, at 11:52, WantStuff <wants...@gmail.com> wrote:

I'm going through a similar process myself so am also learning, but it seems there are a couple of synchronisation hoops before switching from files to the real-time queue.
After loading all the files you request the latest snapshot via STOMP/OpenWire which would cause Darwin to:
1) send another file to the snapshot FTP folder containing changes since the last file you loaded,
2) and start queuing the real-time message updates. 

You then have 5 minutes (I assume) to consume this snapshot and start reading messages from the queue.

Please anyone feel free to jump in and correct me or add to this.

The Darwin snapshot is usually from the start of the day around 0200.  Unfortunately, unless NRDP itself disconnects from Darwin, a new snapshot won’t be generated.

You either have to FTP down the log files that you missed (which can be a lot if it’s later in the day) and replay them, or try to cope with missing schedules and/or cancellations.

Ideally (and this is the way the Push Port proper does it), you’d be able to request a snapshot through NRDP at any point in the day, start off with that and then replay messages as the day goes on.


Peter




WantStuff

22 Mar 2018, 14:01:24
to A gathering place for the Open Rail Data community
That's an interesting point for both myself and Chris.

So are you saying we can ignore the standard pPort logs, and instead request a snapshot (which will cover everything up to that point in time), load it, and then consume the messages?
If so, that makes far more sense.

Peter Hicks

22 Mar 2018, 14:57:04
to WantStuff, A gathering place for the Open Rail Data community
Hello

First, apologies - I sent that email from the tube yesterday and clearly it didn't send until I opened my laptop today!

I'll assume you're replying to my message since you quoted it.  On the current NRDP, you can't request a snapshot on demand - there's no capability to do so.  FTP-ing down the snapshot from ~0200 just gets you the snapshot from then, and you have to replay all messages afterwards.

Your message queue will hold up to 10 minutes of messages - if you consume them, they're gone.

As Andrew pointed out above, the next release of the NRDP will hopefully make this easier - but trying to come up with an architecture that will scale *and* enable fast recovery from outages is not as simple as it seems!


Peter




WantStuff

23 Mar 2018, 4:47:13
to A gathering place for the Open Rail Data community
Hi Peter,

As per the Push Port specifications document section 4.3 Data Phase, we can request a snapshot (although it doesn't have to be via FTP).

As indicated in Figure 1, Clients must follow these steps when communicating with Push Ports Server in the Data Phase: 
1. The client waits for an HBOK status message. 
2. If the client does not require the Darwin sourced timetable, go to step 5. 
3. The client requests the timetable ID. 
4. The client downloads the timetable from the Push Ports FTP server if the ID is different to the timetable the client already has, using the supplied file names. 
5. The client requests a snapshot; either back over the Push Ports connection or made available on the FTP server (see section 6.1.2.3). 
6. The client processes the snapshot. 
7. The client sends a start update request (see section 6.1.2.4).  Push Ports will start holding updates for the client from the moment the snapshot request is received.  These buffered updates are sent as soon as the client sends the start update request. 
8. The client processes updates until a status message with either HBFAIL or HBINIT is received, the client requests the cessation of updates or the TCP/IP connection fails. 
9. If the TCP/IP connection has not failed, go to step 1. 
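
To make the flow of those steps easier to follow, here is a sketch of one pass through steps 1 to 8. The `client` object and every method on it are hypothetical placeholders for illustration; none of them come from the specification or from any real library:

```python
def run_data_phase(client, want_timetable=True, known_timetable_id=None):
    """One pass through steps 1-8 of the quoted Data Phase.

    Returns (timetable_id, reason), where reason says why updates stopped.
    """
    client.wait_for_status("HBOK")                        # step 1
    if want_timetable:                                    # step 2
        timetable_id = client.request_timetable_id()      # step 3
        if timetable_id != known_timetable_id:            # step 4
            client.download_timetable(timetable_id)
            known_timetable_id = timetable_id
    snapshot = client.request_snapshot()                  # step 5
    client.process_snapshot(snapshot)                     # step 6
    client.start_updates()                                # step 7
    for kind, payload in client.updates():                # step 8
        if kind == "status":
            if payload in ("HBFAIL", "HBINIT"):
                return known_timetable_id, payload
            continue  # other status messages carry no update to apply
        client.process_update(payload)
    return known_timetable_id, "stopped"
```

Step 9 would then be the caller's job: call `run_data_phase` again for as long as the TCP/IP connection is still up.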


It is possible we don't have to request a snapshot, but as we have to send a "start update" request before any updates get queued, we would have to continue loading from the logs until we see a data overlap.

I genuinely could be misunderstanding this so do correct me if I'm wrong.
And sorry that I have hijacked your thread Chris, but hopefully this will be helping you too.

Martin

Peter Hicks

23 Mar 2018, 5:04:45
to WantStuff, A gathering place for the Open Rail Data community
Hi Martin

That's for the Push Port itself - direct.  It probably needs to be clearer on the Wiki that the Push Port specification contains more than NRDP supports - such as snapshot functionality!

I'll see if I can find some time this weekend to update the wiki.


Peter

WantStuff

23 Mar 2018, 5:12:19
to A gathering place for the Open Rail Data community
Thanks Peter, that would be very helpful.