Grouping trains from CIF SCHEDULE data into GTFS-style routes

Carl Partridge

Feb 26, 2014, 7:21:08 AM
to openrail...@googlegroups.com
Hello everyone,

I've recently joined the group so please forgive any inexperience regarding the various data sources etc.

I've just dived into the Network Rail CIF schedule download, as I am interested in adding train schedule information to my transit apps. This will involve importing the schedules into our own internal databases, which have a similar structure to GTFS, i.e.:

Routes -> Trips along routes -> StopTimes on each trip

However, unless I've gotten confused, it seems that the schedules are grouped 'per train' - I wondered what a sensible way would be of re-grouping them into 'routes', i.e. what would be a good primary key to use for this?  One idea I had was perhaps to use 'Origin-Destination' pairs?  Is this something that anyone could advise me on?  Is there a 'route UID' I've missed somewhere?

I am aware there are some existing GTFS exports, etc, of the data, but for SLA reasons we need to use the Network Rail data.

Best wishes,


Carl

_______________________
Carl Partridge
FatAttitude Ltd

Chris Northwood

Feb 26, 2014, 10:45:11 AM
to openraildata-talk
Origin-destination pairs might suit your purposes, but are likely to be quite naive. I guess the biggest question is, "what's a route". For example, between Manchester and London there are typically 3 trains per hour, but the stopping patterns of those trains vary. Some stop at Macclesfield, some stop at Milton Keynes, but not all, etc, so would grouping all of those services together be acceptable for your purposes?

In my experiments, I've gone for a model of "routes" and "variations"; however, even with this model, using origin-destination pairs to define routes isn't satisfactory. For example, I would want to model stopping and express services between the same two places as two routes (e.g. on Oxford-London, the stopping service is frequently overtaken by the express service, so I would not want to advertise them as the same route, since catching a later train might actually get you to your destination sooner). Automatically determining all of this from the schedule data is something I've not quite done yet; I've mainly been forming a mental model that fits it.
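
A minimal sketch of the "routes" and "variations" idea, assuming each trip has already been parsed into an ordered list of calling points. The trip IDs, the station names and the build_variations helper are illustrative only, not taken from the CIF data or any library. It groups by origin-destination pair first and then by exact stopping pattern, so on its own it does not solve the stopping-versus-express problem described above:

```python
from collections import defaultdict


def build_variations(trips):
    """Map (origin, destination) -> {calling pattern -> [trip ids]}."""
    routes = defaultdict(lambda: defaultdict(list))
    for trip_id, stops in trips.items():
        route = (stops[0], stops[-1])   # naive route key: the end points
        pattern = tuple(stops)          # variation key: the full stopping pattern
        routes[route][pattern].append(trip_id)
    return routes


# Hypothetical trips, loosely based on the Manchester-London example
trips = {
    "T1": ["Manchester", "Stockport", "London"],
    "T2": ["Manchester", "Macclesfield", "Milton Keynes", "London"],
    "T3": ["Manchester", "Stockport", "London"],
}

routes = build_variations(trips)
variations = routes[("Manchester", "London")]
for pattern, trip_ids in variations.items():
    print(" > ".join(pattern), trip_ids)
```

Splitting a route into separate stopping and express routes would need an extra rule on top of this, for example a threshold on the number of calling points, and that threshold would be a judgment call.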



Peter Hicks

Feb 26, 2014, 11:00:53 AM
to Carl Partridge, openrail...@googlegroups.com
Hi Carl

On 26 Feb 2014, at 12:21, Carl Partridge <webs...@fatattitude.com> wrote:

> I've just dived in to the Network Rail schedule data Cif download as I am interested in adding train schedule information to my transit apps. This which will involve importing the schedules into our own internal databases, which have a similar structure to GTFS, i.e:
>
> Routes -> Trips along routes -> StopTimes on each trip
>
> However, unless I've gotten confused, it seems that the schedules are grouped 'per train' - I wondered what a sensible way would be of re-grouping them into 'routes', i.e. what would be a good primary key to use for this? One idea I had was perhaps to use 'Origin-Destination' pairs? Is this something that anyone could advise me on? Is there a 'route UID' I've missed somewhere?


I know where you’re coming from, having wrangled with GTFS and done this a few months ago.

How about using a composite key of TOC code, origin TIPLOC and destination TIPLOC?
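
A minimal sketch of that composite key, assuming the schedules have already been parsed into simple records. The field names (uid, toc, stops), the sample values and both helper functions are hypothetical and not part of the CIF format itself:

```python
from collections import defaultdict

# Hypothetical, already-parsed schedule records; all values are placeholders
schedules = [
    {"uid": "C10001", "toc": "VT", "stops": ["MNCRPIC", "STKP", "EUSTON"]},
    {"uid": "C10002", "toc": "VT", "stops": ["MNCRPIC", "MACLSFD", "MKNSCEN", "EUSTON"]},
    {"uid": "C20001", "toc": "NT", "stops": ["MNCRPIC", "STKP"]},
]


def route_key(schedule):
    """Composite key: (TOC code, origin TIPLOC, destination TIPLOC)."""
    stops = schedule["stops"]
    return (schedule["toc"], stops[0], stops[-1])


def group_into_routes(schedules):
    """Collect schedule UIDs under their composite route key."""
    routes = defaultdict(list)
    for schedule in schedules:
        routes[route_key(schedule)].append(schedule["uid"])
    return dict(routes)


routes = group_into_routes(schedules)
for key, uids in routes.items():
    print(key, uids)
```

With this key, both sample services between the same end points for the same operator land in one route even though their stopping patterns differ.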


Peter

Carl Partridge

Feb 26, 2014, 11:16:52 AM
to openrail...@googlegroups.com, Carl Partridge
Thanks for the replies, everyone.

I guess that the concept of a 'route', when applied to railways, is only public-facing if timetables (or line maps) are to be generated.

So, for the purposes of live arrivals/departures boards, I'm satisfied with a solution that roughly groups together relevant collections of trains. I take your point about express services though, Chris; if I were generating full timetables, the task would be much harder.

Your suggestion for a PK seems pretty sensible, Peter, I'll look at something similar to that.  And I love the OpenTrainTimes site by the way!

Carl

Carl Partridge

Feb 26, 2014, 11:23:23 AM
to openrail...@googlegroups.com, Carl Partridge
How did you deal with split and join associations, Peter? Did you create 3 separate trips (i.e. a 'Y' shape, with each line in the Y being a separate GTFS-style trip)?



Peter Hicks

Feb 26, 2014, 2:28:39 PM
to openrail...@googlegroups.com
On 26/02/14 16:23, Carl Partridge wrote:
> How did you deal with split and join associations, Peter? Did you
> create 3 separate trips (i.e. a 'Y' shape, with each line in the Y
> being a separate GTFS-style-trip)
In the crudest way possible - I didn't... I ended up with two trains.

I'm not sure there's a way of modelling a splitting or joining train in
GTFS, is there?


Peter

Carl Partridge

Feb 26, 2014, 5:05:38 PM
to openrail...@googlegroups.com
Ah, I see - yes, that's what I meant - i.e. one train stops, splits and then continues as two trains.  No, as far as I know, GTFS doesn't allow you to model details of a trip that splits.  Although, musing on this, I wonder if you could perhaps do something using the block_id value in the trips.txt file?  In other words, if a train is joined onto, then continues onwards, you may have to model this as 2 separate trips, but each one could have the same block_id to indicate to a user that they can remain on the vehicle to continue their journey.

Whether or not the splits/joins can be modelled shouldn't affect my own use case.

Nathan

Feb 26, 2014, 5:55:46 PM
to openrail...@googlegroups.com
According to the GTFS Reference, "a block consists of two or more sequential trips" -- i.e., one of the trips must end before the other begins. Trying to use a block_id to associate joining/splitting trains, you'll end up with two trips in the same block running at the same time, which isn't valid GTFS. Copying trips is the only solution I can think of that doesn't break with the spec.
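
The overlap problem can be checked mechanically. A rough sketch, assuming each trip in a block is reduced to a (trip_id, start, end) tuple in seconds since midnight; the overlapping_trips helper and the sample times are hypothetical:

```python
def overlapping_trips(block_trips):
    """Return pairs of trip ids in one block whose time spans overlap.

    block_trips: list of (trip_id, start_seconds, end_seconds).
    """
    ordered = sorted(block_trips, key=lambda trip: trip[1])
    clashes = []
    for i, (id_a, _, end_a) in enumerate(ordered):
        for id_b, start_b, _ in ordered[i + 1:]:
            # A later-starting trip that begins before this one ends overlaps it
            if start_b < end_a:
                clashes.append((id_a, id_b))
    return clashes


# Two joining portions sharing a block_id: both are running at the same time
joined = [
    ("portion_1", 8 * 3600, 11 * 3600),
    ("portion_2", 6 * 3600, 11 * 3600),
]

# Two sequential trips: the second starts when the first ends
sequential = [
    ("trip_a", 8 * 3600, 9 * 3600),
    ("trip_b", 9 * 3600, 10 * 3600),
]

print(overlapping_trips(joined))
print(overlapping_trips(sequential))
```

Any non-empty result means the trips cannot be sequential within the block, which is the situation described above for joining or splitting portions.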

Carl Partridge

Feb 26, 2014, 6:10:18 PM
to openrail...@googlegroups.com
Yes, you're right, I hadn't thought that through at all - I suppose you could mark one of the halves of the join using a block_id, but not both.   If we take the (rather poor) example of:

- Train A runs from Blackpool to London.
- Train B runs from Glasgow to London.
- At Birmingham, both trains are joined together and continue onwards.

The choices are to map this onto GTFS as:

CHOICE (A)
Trip 1, from Blackpool to Birmingham
Trip 2, from Glasgow to Birmingham
Trip 3, from Birmingham to London (optionally match block_id to that of trip 1)

or

CHOICE (B)
Trip 1, from Blackpool to London
Trip 2, from Glasgow to London


Is that a fair summary of the situation? The disadvantage of (A) is that the trains appear, incorrectly, to terminate at Birmingham; the disadvantage of (B) is that all station stops between Birmingham and London are duplicated.

I assume that you're talking about (B), Nathan/Peter, which seems like the lesser of two evils.  In which case, I guess some logic could be inserted into my departure board displays to match together the duplicate trains and display only one of them?

Nathan

Feb 26, 2014, 6:56:49 PM
to openrail...@googlegroups.com
Yes, I'd go with choice B. From a journey planning perspective, choice A would be very bad -- travel time and number of changes (perhaps along with other factors) are used to compute optimal journeys, so counting a join as changing trains would mess that up. Also, if a journey planner thought you were changing at Birmingham, it could possibly conclude that you won't have enough time to make the train to London, meaning it wouldn't even suggest the (direct) train to London. Finally, passengers using the journey planner (at least on trip 2) would be expecting to change trains at Birmingham. However, yes, for departure boards you'd need to make one of the trains vanish at Birmingham.
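
One possible sketch of that departure-board logic under choice B, assuming each board row is a dict with time, platform, destination and trip fields. These field names and the dedupe_departures helper are hypothetical, and matching on time, platform and destination is only a heuristic, not a rule from GTFS or the schedule data:

```python
def dedupe_departures(departures):
    """Collapse board rows that share time, platform and destination.

    The first row seen for each key is kept; later matching rows are dropped.
    """
    seen = {}
    for row in departures:
        key = (row["time"], row["platform"], row["destination"])
        seen.setdefault(key, row)
    return sorted(seen.values(), key=lambda row: row["time"])


# Hypothetical board at Birmingham: trips 1 and 2 are now one physical train
board = [
    {"time": "10:05", "platform": "3", "destination": "London", "trip": "trip_1"},
    {"time": "10:05", "platform": "3", "destination": "London", "trip": "trip_2"},
    {"time": "10:12", "platform": "1", "destination": "Glasgow", "trip": "trip_9"},
]

shown = dedupe_departures(board)
for row in shown:
    print(row["time"], row["platform"], row["destination"])
```

Two genuinely separate trains would only be merged by mistake if they shared a departure time, platform and destination, which seems unlikely in practice but is worth keeping in mind.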