GTFS Fare Model Proposal


Brian Ferris

Mar 11, 2013, 8:11:13 PM
to gtfs-f...@googlegroups.com
Hey everyone,

I've been thinking long and hard about GTFS fare modeling based on all the different transit fare systems of the world we were able to uncover in our work so far.  I've got some ideas on how we can move the spec forward to model some of these systems and I'd like to see what you think.

Specifically, I've written up my thoughts in:


There is a lot in there, but let me give a quick summary of my high-level motivations.

First, I'm focusing primarily on getting the base cash fare correct for as many agencies as possible.  I've got ideas on different fare products and eligibility, but I think that can wait for a follow-up discussion.

Second, I'm focusing a lot on the general semantics of the spec, in order to make it clear how the spec should work today and how it will work as we add new fields to various files going forward.

Third, I'm focusing a fair amount of attention on transfers.  I think this is one of the places the current spec doesn't do so well and if we can nail a general model for this going forward, I think we'll be doing ourselves a favor.

Ok, a lot to digest, but I appreciate your thoughts!  You should be able to comment on the document directly and I'm happy to give people edit privileges if they want to change anything.

Thanks,
Brian

Nicholas Albion

Mar 12, 2013, 7:10:20 PM
to gtfs-f...@googlegroups.com
So fare_ids with a transfer_count are additional to the base fare, but fare_ids with min_transfer_count and max_transfer_count are used as restrictions?

Nicholas Albion

Mar 12, 2013, 7:33:10 PM
to gtfs-f...@googlegroups.com
If transfer_duration is specified, but there is no transfer_count, min_transfer_count or max_transfer_count, are unlimited transfers implied?

Is this correct usage for the fares listed below?

fare_rules.txt
--------------------
fare_id, route_type, transfer_duration, included_trips, fare_name
NC_1HR_BUS, 3, 3600, 1, "1 hour bus"
NC_4HR_BUS, 3, 14400, 1, "4 hour bus"
NC_23HR_BF, 3|4, 82800, 1, "23 hour multi"
NC_TTM_BUS, 3, 3600, 10, "TimeTen Multi Ride"
  • 1-Hour & 4-Hour tickets for multiple rides on buses
  • 23-Hour tickets for multiple rides on buses & ferries
  • TimeTen Multi Ride - multiple bus trips within the hour - 10 trips
 Notes:

- transfer_duration probably isn't right for Newcastle (AU): the time begins at the start of the first leg, not at the start of the transfer leg as in your draft.
- For NC_23HR_BF I've used a pipe ("|") to indicate that multiple route_types (or fare_class) values may match.
- I've added "included_trips" - not to be confused with transfers, this is a multi-use card which can be used again on different days or on the same day after the expiration of the transfer_duration.
- I've also added "fare_name" so that people know what to ask for.
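
As a rough sketch of how a feed consumer might read the table above, the following parses two of the rows and expands the pipe-separated route_type into a set. The pipe separator, included_trips and fare_name are proposals from this message, not published GTFS fields, and the function names are made up for illustration.

```python
import csv
import io

# Two rows copied from the table above. A "|" inside route_type is the
# proposed multi-value separator; it is not part of the published spec.
FARE_RULES = '''fare_id, route_type, transfer_duration, included_trips, fare_name
NC_1HR_BUS, 3, 3600, 1, "1 hour bus"
NC_23HR_BF, 3|4, 82800, 1, "23 hour multi"
'''

def load_fare_rules(text):
    """Parse fare rules, expanding the pipe-separated route_type list."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rules = []
    for row in reader:
        row["route_type"] = {int(v) for v in row["route_type"].split("|")}
        row["transfer_duration"] = int(row["transfer_duration"])
        row["included_trips"] = int(row["included_trips"])
        rules.append(row)
    return rules

def rule_matches_route_type(rule, route_type):
    """A rule matches when the leg's route_type is any of the listed values."""
    return route_type in rule["route_type"]
```

With this reading, NC_23HR_BF matches both bus (3) and ferry (4) legs, while NC_1HR_BUS matches bus legs only.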

Nicholas Albion

Mar 12, 2013, 7:48:39 PM
to gtfs-f...@googlegroups.com
Would it be better to use the pipe ("|") I proposed for route_type and fare_class with contains_id, rather than have an undefined number of "contains_idx" and "contains_route_idx" columns?
Alternatively, you could wrap the values in quotes and use a comma delimiter within the field.

We would also want to be able to indicate that some tickets are valid for:

- the entire day (until 4am)
- weekly (7 days until 4am)
- monthly (28 days until 4am)
- quarterly (90 days until 4am)
- yearly (365 days until 4am)

Can I propose 2 additional fields:

- included_days
- end_time  (in our case 28:00:00)

Nicholas Albion

Mar 12, 2013, 8:00:52 PM
to gtfs-f...@googlegroups.com
You have a TODO in Multi-Agency and Multi-Feed.

We currently have multiple feeds for bus and trains in Sydney, but share some fares:
  1. special "link tickets" exist for specific route combinations, eg: train to Bondi Junction, bus to Bondi Beach
  2. MyMulti:
    • unlimited trips on buses, trains, light rail, ferries of multiple agencies in Sydney & Newcastle
    • Adult fare: Weekly, Monthly (28 days until 4am), Quarterly (90 days until 4am), Yearly (365 days)
    • Concession fare: Weekly only
    • MyMulti 1, MyMulti 2, MyMulti 3
    • tickets are transferable
For the "link tickets", it might help if we were able to (optionally) scope "fare_id", "stop_id", "origin_id", "destination_id" and "contains_id" fields to a feed namespace - eg: "au.gov.nsw.FARE_X", "au.gov.nsw.bus.BB", "au.gov.nsw.rail.ZONE_1"

(you mention origin_id and destination_id in the "Logical Equivalence" section, but don't document it anywhere.  We would also probably use "stop_id", eg for the Airport and Bondi)

Nicholas Albion

Mar 12, 2013, 8:20:34 PM
to gtfs-f...@googlegroups.com
To simplify Eligibility issues, could we use a hierarchical system as we do for platform/station?

parent_fare
----------------
- * could indicate a surcharge (or discount) that applies to all other fares.  
- "price" could be positive, negative or a percentage.   ("50%", "0%" or "free"?)
- Values in other fields could be used to restrict which fares this belongs to.  
- Would negation of other fields add too much complication?  eg service_id: "!SUNDAY"

eg:

fare_rules.txt
-------------------
fare_id, parent_fare, min_age, max_age, concession_type
CHILD, *, , 4,
PENSIONER, *, , , 1

fare_id, parent_fare, route_type, with_bike
BIKE, *, !3

fare_attributes.txt
-------------------------
fare_id, price
CHILD, 0%
PENSIONER, 50%
BIKE, 2.00

concession_type: similar to route_type: pensioner, student, war_veteran...
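
One possible reading of the "price" column for these modifier fares, sketched in Python. It assumes a percentage scales the base fare, "free" means the same as "0%", and a plain number is added to the base fare as a surcharge (or discount, if negative). The function name and the base fare are hypothetical.

```python
from decimal import Decimal

def apply_modifier(base_price, modifier_price):
    """Apply a parent_fare="*" style modifier to a base fare.

    Assumed reading of the proposal: "50%" scales the base fare, "free" is
    the same as "0%", and a plain number is a surcharge (or, if negative,
    a discount) added to the base fare.
    """
    text = modifier_price.strip().lower()
    if text == "free":
        return Decimal("0.00")
    if text.endswith("%"):
        factor = Decimal(text[:-1]) / Decimal(100)
        return (base_price * factor).quantize(Decimal("0.01"))
    return (base_price + Decimal(text)).quantize(Decimal("0.01"))

base = Decimal("3.00")  # hypothetical adult fare
child = apply_modifier(base, "0%")        # CHILD row above
pensioner = apply_modifier(base, "50%")   # PENSIONER row above
with_bike = apply_modifier(base, "2.00")  # BIKE row above
```

Under these assumptions a 3.00 base fare becomes 0.00 for a child, 1.50 for a pensioner and 5.00 with a bike.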

Nicholas Albion

Mar 12, 2013, 8:27:47 PM
to gtfs-f...@googlegroups.com
We also have a "Family Funday Sunday":
  • all day travel on a Sunday
  • group must be related and include at least one child (under 16 or school student 16-18) and one adult
  • purchase at train station, vending machine, ferry or bus driver
I think it would probably be too hard to incorporate the eligibility criteria into the GTFS spec; it would probably be enough for Google Transit to present this as an option with the description above (on Sundays).

Nicholas Albion

Mar 12, 2013, 8:39:46 PM
to gtfs-f...@googlegroups.com
"distance_mode" = 1 would nearly suffice for the Sydney bus fares, but as discussed in this post we would probably use per-zone pricing and need to add a zone_mapping.txt file to relate our sections (zones) for each trip.

 Do different cities have different policies regarding stops that lie on the boundary of two zones?

Brian Ferris

Mar 14, 2013, 4:37:31 AM
to gtfs-f...@googlegroups.com
All three rules would be used to specify the cost of the additional post-transfer leg in a multi-leg itinerary.  The base fare would still apply to the initial non-transfer leg.

That said, if you specified a min_transfer_count rule with a value of 0, that rule would apply to any leg, including non-transfer legs.  By the same token, if you specified ONLY a max_transfer_count rule with a value of 5, that would also apply to transfer and non-transfer legs.
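
To make the matching concrete, here is a minimal sketch of the transfer-field check for one leg. The function name is made up, and the handling of a rule with no transfer fields at all (treated as a base fare that covers only the initial leg) is an assumption rather than something the draft states.

```python
def matches_transfer_fields(rule, transfers_so_far):
    """Check a rule's transfer fields against one leg of an itinerary.

    transfers_so_far is 0 for the initial leg, 1 for the leg after the
    first transfer, and so on.
    """
    exact = rule.get("transfer_count")
    low = rule.get("min_transfer_count")
    high = rule.get("max_transfer_count")
    if exact is None and low is None and high is None:
        # Assumption: a rule with no transfer fields is a base fare and
        # only covers the initial, non-transfer leg.
        return transfers_so_far == 0
    if exact is not None and transfers_so_far != exact:
        return False
    if low is not None and transfers_so_far < low:
        return False
    if high is not None and transfers_so_far > high:
        return False
    return True
```

A rule with only min_transfer_count=0, or only max_transfer_count=5, matches the initial leg as well as transfer legs, which is the edge case described above.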

Brian


On Wed, Mar 13, 2013 at 12:10 AM, Nicholas Albion <nal...@gmail.com> wrote:
> So fare_ids with a transfer_count are additional to the base fare, but fare_ids with min_transfer_count and max_transfer_count are used as restrictions?


Brian Ferris

Mar 14, 2013, 4:45:17 AM
to gtfs-f...@googlegroups.com
It's not clear in the definition as currently written, but I agree that if transfer_duration is specified without any other transfer restrictions, then it implies unlimited transfers within the time window.  Which is to say, a transfer_duration rule may match any transfer leg or sequence of transfer legs as long as the start time of each transfer leg is within the specified time window from the start of the original non-transfer leg.

Also, it was an error in my original definition: time should be relative to the start of the initial NON-transfer leg.  Should be fixed now.
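
A small sketch of that window check, assuming leg start times are given in seconds and that a transfer leg starting exactly at the end of the window still counts (the draft does not say whether the boundary is inclusive). The function name is hypothetical.

```python
def within_transfer_window(leg_start_times, transfer_duration):
    """Return True if every transfer leg starts inside the time window.

    leg_start_times: start time of each leg in seconds, in travel order;
    the first entry is the initial non-transfer leg.
    transfer_duration: window length in seconds, measured from the start
    of that initial leg.
    """
    first_start = leg_start_times[0]
    return all(
        start - first_start <= transfer_duration
        for start in leg_start_times[1:]
    )
```

With a 3600-second window, legs starting at 0, 1800 and 3500 seconds all qualify, while a third leg starting at 3700 seconds does not.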

Brian



Brian Ferris

Mar 14, 2013, 4:53:58 AM
to gtfs-f...@googlegroups.com
Technically speaking, both | and , could be used in an identifier, which makes it tricky to pick a separator.  But restricting fare_id to not use | characters might potentially be better than the contains_idx proposal.  My theory is that there aren't many feeds that would need more than 3-4 contains clauses, but that might not be a valid assumption.



Brian Ferris

Mar 14, 2013, 4:55:09 AM
to gtfs-f...@googlegroups.com
Can you elaborate on your included_days and end_time proposals?



Nicholas Albion

Mar 14, 2013, 5:18:25 AM
to gtfs-f...@googlegroups.com
How would you indicate that the $0.25 was an additional fare, and not an alternative fare option?  I imagine consuming applications listing various options as applicable to number of transfers, the user and day of week, eg:

- $2.00 one way
- $3.00 return
- $5 Family Funday Sunday (terms)
- $10 weekly (terms)
- more fares

Nicholas Albion

Mar 14, 2013, 5:26:18 AM
to gtfs-f...@googlegroups.com
It would only be the zone_id, route_id and route_type that I imagine would require a pipe or comma delimiter to provide a list.  What about a space character?  That's not a valid character in an identifier, is it?

Brian Ferris

Mar 14, 2013, 5:27:12 AM
to gtfs-f...@googlegroups.com
It seems that many of your proposals concern different fare products (cash fare vs tickets vs passes).  To be clear, I haven't tackled specifying different fare products with this proposal.  I'm all for giving this a shot at some point, but I'm curious to see what people think about these base semantics first.

Brian Ferris

Mar 14, 2013, 5:27:56 AM
to gtfs-f...@googlegroups.com
There is no restriction on the set of characters that can be used in an identifier.

Nicholas Albion

Mar 14, 2013, 5:35:23 AM
to gtfs-f...@googlegroups.com
"included_days" could be used to indicate the number of days included in a weekly, monthly, yearly ticket etc.   Perhaps it would be more flexible to change the name to "valid_period" and explicitly specify days/months/years:

1d, 7d, 30d, 1m, 90d, 3m, 365d, 1y

You'd assume that a multi-day ticket would expire at midnight, but in Sydney they expire at 4am, so we would need to specify end_time=28:00:00.

start_time/end_time might be an alternative way of handling the RUSH hour fare you described.
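
A sketch of how valid_period and end_time could combine into an expiry instant. It assumes the ticket covers a number of service days starting on the first day of use, and that the last service day ends at end_time counted from that day's midnight, so "28:00:00" means 4am on the following calendar day. Only day-based periods are computed here; month and year arithmetic is left out, and all function names are hypothetical.

```python
import re
from datetime import date, datetime, timedelta

def parse_gtfs_time(value):
    """Convert "HH:MM:SS" to a timedelta; hours may exceed 24."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

def parse_valid_period(value):
    """Split "7d" / "1m" / "1y" into (count, unit)."""
    match = re.fullmatch(r"(\d+)([dmy])", value)
    if match is None:
        raise ValueError(f"bad valid_period: {value!r}")
    return int(match.group(1)), match.group(2)

def expiry(first_service_day, valid_period, end_time="24:00:00"):
    """Expiry instant for a day-based period.

    Assumption: the ticket covers `count` service days starting on
    first_service_day, and the last service day ends at end_time counted
    from that day's midnight.
    """
    count, unit = parse_valid_period(valid_period)
    if unit != "d":
        raise NotImplementedError("only day-based periods in this sketch")
    last_day = first_service_day + timedelta(days=count - 1)
    midnight = datetime.combine(last_day, datetime.min.time())
    return midnight + parse_gtfs_time(end_time)
```

Under these assumptions, a "7d" ticket first used on 14 March 2013 with end_time=28:00:00 would expire at 4am on 21 March.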

Brian Ferris

Mar 14, 2013, 5:38:27 AM
to gtfs-f...@googlegroups.com
One obvious issue with GTFS is that there is no easy way to have references across two GTFS feeds.  This is a problem for inter-agency transfers and for inter-agency fares.  There have been a number of proposals over the years to deal with this but nothing has really stuck.  I'm not opposed to some sort of namespace mechanism.  The trick is figuring out which remote feed you are referring to in a consistent way.



Nicholas Albion

Mar 14, 2013, 5:43:20 AM
to gtfs-f...@googlegroups.com
After thinking about this some more, we would probably add an extra column to stop_times.txt named "zone_id".  It would make an already large file even larger, but because GTFS is missing the concept of JourneyPattern in TransXChange, I think this is the best we could do.

The zone_mapping.txt idea would be more efficient - but it might make things more complicated?

Brian Ferris

Mar 14, 2013, 5:46:53 AM
to gtfs-f...@googlegroups.com
Yeah, I had a similar idea of adding zone_id to stop_times.txt.  At the end of the day, I don't sweat the size of stop_times.txt too much ; )  Yeah, it's a bit inefficient to have this info normalized into stop_times.txt, but it is straight-forward.
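
For illustration, a consumer could recover the per-trip zone sequence from such a column roughly as follows. The sample data is invented, and zone_id in stop_times.txt is the proposed optional column discussed here, not a published field.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stop_times.txt excerpt; zone_id is the proposed optional
# column, not a field in the published spec.
STOP_TIMES = """trip_id,stop_sequence,stop_id,zone_id
T1,1,S1,A
T1,2,S2,A
T1,3,S3,B
T2,1,S3,B
T2,2,S4,
"""

def zones_by_trip(text):
    """Return {trip_id: [zone_id, ...]} ordered by stop_sequence."""
    rows = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        zone = row.get("zone_id") or None  # missing column or blank value
        rows[row["trip_id"]].append((int(row["stop_sequence"]), zone))
    return {trip: [zone for _, zone in sorted(stops, key=lambda s: s[0])]
            for trip, stops in rows.items()}
```

A blank or absent zone_id comes back as None, so a feed that omits the column still loads.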



Nicholas Albion

Mar 14, 2013, 6:00:54 AM
to gtfs-f...@googlegroups.com
Can we have zone_id added as an optional field of stop_times.txt?

http://www.skedgo.com/ are currently making their own fare estimates for Sydney, and by their own admission the estimates are not currently very reliable.

Brian Ferris

Apr 23, 2013, 4:32:25 AM
to gtfs-f...@googlegroups.com
Just to bump this to the top of your inbox, I wanted to know if anyone had any additional opinions about my proposal?  Aaron and David, I'm looking at you specifically.

David Turner

Apr 25, 2013, 10:54:43 AM
to gtfs-f...@googlegroups.com
I'm out of town this week, but here are some preliminary thoughts:

(1) We still need to distinguish between transfer durations relative to the start of a leg (Portland, NYC), and relative to the end (SF)

(2) Also, we need to distinguish between transfers inside fare control vs outside.  NYC has a couple of transfers where this matters.  Perhaps transfers.txt or the parent_stop_id relation could be repurposed for this.

(3) zone_count should specify how repeating zones are counted.  For instance, a trip on SEPTA from Merion to Wallingford.  Of course, SEPTA should code zones on the different lines differently, but the spec should still be clear.

I'm still trying to figure out if NYC is codeable.  One thing that would be nice would be transfer_from_fare_id, which would basically turn the fare system into a simple finite state automaton.  That would make NYC much easier.
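
A toy sketch of the finite-state idea: the state is the fare_id paid on the previous leg, and a rule applies only if its transfer_from_fare_id equals that state. The rules, prices and fallback behaviour below are invented for illustration and do not describe NYC's actual fares.

```python
from decimal import Decimal

# Hypothetical rules. transfer_from_fare_id is None for a fare that can
# start an itinerary; otherwise it names the fare paid on the previous leg.
RULES = [
    {"fare_id": "BUS", "transfer_from_fare_id": None,
     "route_type": 3, "price": Decimal("2.00")},
    {"fare_id": "SUBWAY", "transfer_from_fare_id": None,
     "route_type": 1, "price": Decimal("2.00")},
    {"fare_id": "BUS_TO_SUBWAY", "transfer_from_fare_id": "BUS",
     "route_type": 1, "price": Decimal("0.00")},
]

def price_itinerary(route_types, rules=RULES):
    """Walk the legs, using the previous leg's fare_id as the state."""
    state = None  # no fare paid yet
    total = Decimal("0.00")
    for route_type in route_types:
        candidates = [r for r in rules
                      if r["route_type"] == route_type
                      and r["transfer_from_fare_id"] == state]
        if not candidates:
            # No transfer rule from the current state: start a new fare.
            candidates = [r for r in rules
                          if r["route_type"] == route_type
                          and r["transfer_from_fare_id"] is None]
        if not candidates:
            raise ValueError(f"no fare rule for route_type {route_type}")
        rule = min(candidates, key=lambda r: r["price"])
        total += rule["price"]
        state = rule["fare_id"]
    return total
```

In this toy data a bus-then-subway trip costs 2.00 because the BUS_TO_SUBWAY rule fires, while subway-then-bus costs 4.00 because no transfer rule leads out of the SUBWAY state.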





Aaron Antrim

Apr 29, 2013, 3:51:00 PM
to gtfs-f...@googlegroups.com
Hi Brian,

I'll block out some time to take a look this week and get back before Friday.

Brian Ferris

Apr 29, 2013, 4:08:58 PM
to gtfs-f...@googlegroups.com
Responses inline:

On Thu, Apr 25, 2013 at 4:54 PM, David Turner <nov...@novalis.org> wrote:
> (1) We still need to distinguish between transfer durations relative to the start of a leg (Portland, NYC), and relative to the end (SF)

It seems like there are a couple of ways to model this.  Would you say the distinction is "start vs end of leg" or is it just "transfer duration relative to time of payment" and it happens in SF that you pay on exit?

> (2) Also, we need to distinguish between transfers inside fare control vs outside.  NYC has a couple of transfers where this matters.  Perhaps transfers.txt or the parent_stop_id relation could be repurposed for this.

Could this be achieved by properly structuring zone ids?  Or with a "transfer_from_stop_id" rule perhaps?

> (3) zone_count should specify how repeating zones are counted.  For instance, a trip on SEPTA from Merion to Wallingford.  Of course, SEPTA should code zones on the different lines differently, but the spec should still be clear.

How about "zone_count" vs "unique_zone_count", where "zone_count" double-counts the same zone if visited twice, whereas "unique_zone_count" counts each zone only once, even if visited multiple times?
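
The difference between the two counts can be sketched as follows, assuming the trip's zones are listed one per stop in travel order. The zone labels are hypothetical, not SEPTA's.

```python
from itertools import groupby

def zone_count(zones):
    """Count zone visits: re-entering a zone counts it again."""
    return sum(1 for _ in groupby(zones))

def unique_zone_count(zones):
    """Count each zone once, however often it is visited."""
    return len(set(zones))

# Hypothetical trip that rides in through zones C, B, A and back out
# through B and C; one entry per stop, in travel order.
trip_zones = ["C", "C", "B", "A", "A", "B", "C"]
```

For this trip, zone_count gives 5 (C, B, A, B, C) and unique_zone_count gives 3.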