> The second bit in service_id:
> 1: This trip will operate on its normal schedule for all dates, ONLY
> EXCEPTION is holidays
> 2: This trip will operate on its normal schedule EXCEPT holidays and
> also for a certain date(s)
> 3 and up: This trip will operate on the date in which #2 will not run
> OR it is an extra trip that will run on the date of #2
This is a good idea. There's no reason you can't have two otherwise identical rows in calendar.txt with different service_ids, one of which is removed on holidays (via calendar_dates.txt) and one of which isn't.
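A minimal sketch of how such a pair of rows might resolve on a given date. The service_ids `WKD_ALL` and `WKD_NO_HOL`, the date range, and the holiday date are made up for illustration; only the exception_type meanings (1 adds service, 2 removes it) come from the GTFS spec.

```python
from datetime import date

# Hypothetical calendar.txt rows: two weekday services, identical except for service_id.
# "days" holds weekday numbers as returned by date.weekday() (Monday = 0).
CALENDAR = {
    "WKD_ALL": {"days": {0, 1, 2, 3, 4}, "start": date(2011, 9, 4), "end": date(2012, 1, 7)},
    "WKD_NO_HOL": {"days": {0, 1, 2, 3, 4}, "start": date(2011, 9, 4), "end": date(2012, 1, 7)},
}

# Hypothetical calendar_dates.txt rows: (service_id, date, exception_type).
# Only WKD_NO_HOL is removed on the holiday.
CALENDAR_DATES = [("WKD_NO_HOL", date(2011, 11, 24), 2)]


def active_services(day):
    """Return the set of service_ids that run on the given date."""
    active = {
        service_id
        for service_id, row in CALENDAR.items()
        if row["start"] <= day <= row["end"] and day.weekday() in row["days"]
    }
    for service_id, exception_day, exception_type in CALENDAR_DATES:
        if exception_day != day:
            continue
        if exception_type == 1:    # service added on this date
            active.add(service_id)
        elif exception_type == 2:  # service removed on this date
            active.discard(service_id)
    return active
```

On an ordinary weekday both service_ids come back; on the holiday only `WKD_ALL` does, so trips attached to `WKD_NO_HOL` drop out without any change to the trips themselves.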
> Now for the other elephant in the room regarding bus GTFS size, some
> of the MTA's conventions drive up the file size further and I feel
> unnecessarily. In particular, the service_id and trip_id in trips.txt,
> stop_times.txt and all related files.
...
> As you can see, the service_id is 10 bits long, and is for, let's say
> 15 unique service patterns. If you have 15 service patterns, you don't
> need 10 bits, just 1 or at most 2.
This seems less necessary. When I put the feed in a mobile app, I load it into a SQLite database and convert all internal textual identifiers to ints anyway, which keeps the stored data compact no matter how long the original IDs are. Have you considered doing the same?
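A rough sketch of that kind of conversion, using Python's sqlite3 module. The table layout and the `load_trips` helper are assumptions made for illustration, not the schema of any particular app.

```python
import sqlite3


def load_trips(conn, rows):
    """Store (trip_id, service_id) string pairs using integer keys internally.

    Each textual GTFS identifier is kept once in a lookup column; the trips
    table refers to services by small integer primary key instead of repeating
    the long string on every row.
    """
    conn.executescript(
        """
        CREATE TABLE services (id INTEGER PRIMARY KEY, gtfs_id TEXT UNIQUE);
        CREATE TABLE trips (
            id INTEGER PRIMARY KEY,
            gtfs_id TEXT UNIQUE,
            service_id INTEGER REFERENCES services(id)
        );
        """
    )
    for trip_id, service_id in rows:
        # Insert the service only the first time its textual id is seen.
        conn.execute("INSERT OR IGNORE INTO services (gtfs_id) VALUES (?)", (service_id,))
        (service_pk,) = conn.execute(
            "SELECT id FROM services WHERE gtfs_id = ?", (service_id,)
        ).fetchone()
        conn.execute(
            "INSERT INTO trips (gtfs_id, service_id) VALUES (?, ?)", (trip_id, service_pk)
        )
    conn.commit()
```

The same pattern extends to stop_times.txt, where the savings matter most because trip_id is repeated on every row.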
Adam
Assuming I understand correctly, have you looked at all of the schedule data to confirm that this is in fact the case? I suspect that it's not, but that's just a hunch.
Also, one piece of context as to why the NYCT bus GTFS is so big -- GTFS doesn't support trips that start before midnight of their own service day (that is, on the previous calendar day). But our operation and fancy scheduling systems do, and we take advantage of that to optimize the cost basis for bus service. This means that, for example, the Friday service has to be defined separately in the GTFS from the Monday-Thursday service, since the "Saturday" trips that start before midnight (i.e. on Friday) need to be part of the Friday service in the GTFS. Similar complexity for transitions between schooldays and not schooldays, between any kind of day and a holiday, from holidays back to non-holidays, etc. Make sense?
Thanks,
Mike
> This means that, for example, the Friday service has to be defined separately in the GTFS from the Monday-Thursday service, since the "Saturday" trips that start before midnight (i.e. on Friday) need to be part of the Friday service in the GTFS. Similar complexity for transitions between schooldays and not schooldays, between any kind of day and a holiday, from holidays back to non-holidays, etc. Make sense?
It does make sense. That explains a lot.
I think the ideal would be to have three service_ids: one for buses that run M-F regardless, one for buses that run M-Th night, and one for Friday night.
The M-F service_id might only contain buses through, say, 10pm (or whenever the schedules diverge). This way we avoid duplicating the entire schedule for Friday just to change the buses after 10pm.
There's nothing wrong with the current way of doing it. Is it worth the extra effort for the MTA to make the feed as elegant and simple as possible? Probably not. But if anyone at MTA decides to redo the bus feed, do take John's suggestions into account.
Adam
I don't even think it would need to be the Friday night trips that start after 10pm. It just would need to be the "Saturday" trips that actually start on Friday.
Just to be clear, your suggestion (and indeed John's suggestion) hinges on the fact that you can actually have multiple service_id's active at any given time/on the same date, even for the same route, correct? I guess I hadn't realized that such a thing was supported in GTFS. Would very much appreciate a definitive answer on this.
GTFS is a critical input into MTA Bus Time, so we will keep this discussion in mind if/when we revisit how GTFS is produced in the context of the Bus Time work.
Thanks,
Mike
-----Original Message-----
From: mtadevelop...@googlegroups.com [mailto:mtadevelop...@googlegroups.com] On Behalf Of Adam Ernst
Sent: Wednesday, September 28, 2011 9:59 AM
To: mtadevelop...@googlegroups.com
Subject: Re: [MTAdev] Bus GTFS size & implementation
Right. 10pm was just an example.
> Just to be clear, your suggestion (and indeed John's suggestion) hinges on the fact that you can actually have multiple service_id's active at any given time/on the same date, even for the same route, correct? I guess I hadn't realized that such a thing was supported in GTFS. Would very much appreciate a definitive answer on this.
That's definitely right, and it's the best way to do it. Just avoid accidentally duplicating service (i.e. having the same bus running in two service ids that are active together). It should make the feed much simpler.
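One way to check for that kind of accidental duplication, as a sketch only: the tuple layout, the `find_duplicated_service` helper, and the idea of keying a trip by route and start time are assumptions for illustration, and a real check would likely compare full stop_times.

```python
from collections import defaultdict
from itertools import combinations


def find_duplicated_service(trips, active_together):
    """Find trips that appear in two service_ids which can run on the same date.

    trips: iterable of (service_id, route_id, start_time) tuples.
    active_together: set of frozensets, each holding two service_ids that
        can be active on the same date.
    Returns a list of (route_id, start_time, service_a, service_b) tuples.
    """
    services_by_trip = defaultdict(set)
    for service_id, route_id, start_time in trips:
        services_by_trip[(route_id, start_time)].add(service_id)

    duplicates = []
    for (route_id, start_time), services in sorted(services_by_trip.items()):
        for service_a, service_b in combinations(sorted(services), 2):
            if frozenset((service_a, service_b)) in active_together:
                duplicates.append((route_id, start_time, service_a, service_b))
    return duplicates
```

The same trip appearing under two service_ids that never overlap (say, a M-Th night service and a Friday night service) is not reported, since those are never active together.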
> GTFS is a critical input into MTA Bus Time, so we will keep this discussion in mind if/when we revisit how GTFS is produced in the context of the Bus Time work.
Great!
Adam
Guys,
I use my own proprietary App to convert the GTFS to my format just fine.
I think the only change I would recommend would be to use LIRR's approach and use only calendar_dates.txt to show which service_ids run on a specific day (every row has exception_type set to 1). You may end up with more service_ids and entries, but that's perfectly OK. A lot of other agencies use this approach as well: MTS, NJT, LIRR, etc.
Do the Metro North people talk with the LIRR people who create their GTFS file? I think it's more robust and accurate to abandon calendar.txt and use calendar_dates.txt 100%.
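For anyone weighing this, a minimal sketch of flattening calendar.txt plus calendar_dates.txt into calendar_dates-only rows. The `expand_calendar` helper and the in-memory row layout are assumptions for illustration; this is not how LIRR or any other agency actually produces its feed.

```python
from datetime import date, timedelta


def expand_calendar(calendar_rows, exceptions):
    """Flatten weekly patterns and exceptions into explicit per-date rows.

    calendar_rows: list of dicts with keys service_id, days (set of weekday
        numbers, Monday = 0), start and end (inclusive dates).
    exceptions: list of (service_id, date, exception_type) tuples, where
        1 adds service and 2 removes it.
    Returns sorted (service_id, 'YYYYMMDD', 1) rows for calendar_dates.txt.
    """
    running = set()
    for row in calendar_rows:
        day = row["start"]
        while day <= row["end"]:
            if day.weekday() in row["days"]:
                running.add((row["service_id"], day))
            day += timedelta(days=1)

    for service_id, day, exception_type in exceptions:
        if exception_type == 1:
            running.add((service_id, day))
        elif exception_type == 2:
            running.discard((service_id, day))

    return [(service_id, day.strftime("%Y%m%d"), 1) for service_id, day in sorted(running)]
```

The output has one row per service_id per operating date, which is where the extra entries mentioned above come from.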
Any opinions on this?