Naming things for a database

ciaran haines

Jan 6, 2026, 6:34:01 AM
to A gathering place for the Open Rail Data community
Hi,
I have a database of past passenger train data. The hope is to build some services for rail companies. I'm unsure about some of my language, as I think some of these terms mean something different in the wider world of rail. Each item below gives my current name, the concept, and some notes. I'm worrying in advance about confusion, so any help would be appreciated before these terms get fully stuck in my head.

train - I think of this as a service instance, running uniquely once. Against each 'train' in our data I have GEMINI capacity data, DARWIN times & status (cancellations & reroutes), and passenger counts as published by train operators on the rail data marketplace. Are there any problems here?

line - I have one ID per unique ordered list of stops. In my usage, the Chiltern services from London Marylebone to Birmingham Moor Street get multiple line IDs, one for each variation of stopping or not at Hatton, Snow Hill, etc. Multiple services usually share a line ID. Would 'route' be a better description?
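
To make that definition concrete, here is a minimal Python sketch of "one ID per unique ordered list of stops". The function name, the service keys and the shortened stop lists are all made up for illustration; they are not taken from the real database or from any timetable.

```python
from typing import Dict, Sequence, Tuple


def assign_pattern_ids(services: Dict[str, Sequence[str]]) -> Dict[str, int]:
    """Give every service an ID shared by all services with the same ordered stops."""
    pattern_ids: Dict[Tuple[str, ...], int] = {}
    result: Dict[str, int] = {}
    for service_id, stops in services.items():
        key = tuple(stops)  # order matters, so the tuple itself is the key
        if key not in pattern_ids:
            pattern_ids[key] = len(pattern_ids) + 1  # next unused ID
        result[service_id] = pattern_ids[key]
    return result


# Hypothetical, heavily shortened stop lists (not real calling patterns).
services = {
    "svc_a": ["London Marylebone", "Birmingham Moor Street"],
    "svc_b": ["London Marylebone", "Hatton", "Birmingham Moor Street"],
    "svc_c": ["London Marylebone", "Birmingham Moor Street"],
}
ids = assign_pattern_ids(services)
```

Here svc_a and svc_c share an ID, while svc_b gets its own because it also calls at Hatton.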

Occupancy/crowding/seat availability - Obviously this is something passengers care about. Is there a common term used for this? I'm aware of PiXC, but I'm talking about something different: the "how busy does it feel" kind of thing. I intend to use passengers/seating capacity, or its complement, remaining seats/seating capacity (which might be negative for trains with standing passengers). I include seat availability because several of the TOCs cap their reported passenger numbers at their stated max capacity for the train, and because it is a concrete idea. Sadly it is horribly clunky as an attribute name.
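
As a rough sketch of the two ratios described above (the function names are placeholders I picked for illustration, not established rail terms):

```python
from typing import Optional


def occupancy_ratio(passengers: int, seats: int) -> Optional[float]:
    """passengers / seating capacity; above 1.0 means some passengers are standing."""
    if seats <= 0:
        return None  # no usable capacity figure, so the ratio is undefined
    return passengers / seats


def seat_availability_ratio(passengers: int, seats: int) -> Optional[float]:
    """remaining seats / seating capacity; negative when passengers exceed seats."""
    if seats <= 0:
        return None
    return (seats - passengers) / seats


quiet = seat_availability_ratio(150, 200)     # 0.25 of seats still free
standing = seat_availability_ratio(250, 200)  # -0.25, i.e. more passengers than seats
```

The two values always sum to 1, so storing either one is enough. Note that where an operator caps the reported count at capacity, the availability figure bottoms out at 0.0 and a full train cannot be told apart from an overfull one.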

Thanks in advance!

Gaelan Steele

Jan 6, 2026, 7:30:55 AM
to openrail...@googlegroups.com

On Jan 6, 2026, at 11:34, 'ciaran haines' via A gathering place for the Open Rail Data community <openrail...@googlegroups.com> wrote:



train - I think of this as a service instance, running uniquely once. Against each 'train' in our data I have GEMINI capacity data, DARWIN times & status (cancellations & reroutes), and passenger counts as published by train operators on the rail data marketplace. Are there any problems here?

I’d avoid “train” because of the potential for ambiguity: do you mean a service or a unit of rolling stock? “Service” is probably a pretty good term for this.

line - I have one ID per unique ordered list of stops. In my usage, the Chiltern services from London Marylebone to Birmingham Moor Street get multiple line IDs, one for each variation of stopping or not at Hatton, Snow Hill, etc. Multiple services usually share a line ID. Would 'route' be a better description?

I guess my question is what you’re trying to achieve by assigning line IDs here. Because of variations in stopping patterns, lines as you define them won’t really correspond to any meaningful concept for a passenger, or probably even for a rail employee. If you do need this exact concept for some reason, I’d be inclined to call it a “route” or a “stopping pattern”.

Occupancy/crowding/seat availability - Obviously this is something passengers care about. Is there a common term used for this? I'm aware of PiXC, but I'm talking about something different: the "how busy does it feel" kind of thing. I intend to use passengers/seating capacity, or its complement, remaining seats/seating capacity (which might be negative for trains with standing passengers). I include seat availability because several of the TOCs cap their reported passenger numbers at their stated max capacity for the train, and because it is a concrete idea. Sadly it is horribly clunky as an attribute name.

This isn’t an area I’ve looked at much, but I suspect it’s likely to be tricky because different sources provide different kinds of data. If you want to store all your input data exactly, you’ll probably need different fields for different sources; if you just want a rough impression for passenger purposes, then something like a “crowding” percentage might be sufficient.

Best wishes,
Gaelan

ciaran haines

9:11 AM
to A gathering place for the Open Rail Data community
Thanks Gaelan, that's really useful, and kinda ties in with the bits that were driving my doubts. Cheers!