Latest MTA Metro North GTFS File - 9/21

Jason iTB

Sep 21, 2011, 9:45:51 AM
to mtadeveloperresources
Looking at the latest Metro North GTFS File the calendar.txt file
doesn't seem right. Service_id=1 has trains running on Monday thru
Thursday but not Friday. Service_id=2 has the trains running Monday
and Sunday?

Can someone please check this out, every Service_id doesn't seem
correct.

John L

Sep 21, 2011, 10:01:51 PM
to mtadevelop...@googlegroups.com
I believe it should read
 
1,1,1,1,1,1,0,0,20110829,20111016
2,1,0,0,0,0,0,1,20110829,20111016
3,0,0,0,0,0,1,0,20110829,20111016
4,0,0,0,0,0,1,1,20110829,20111016
5,0,0,0,0,0,0,1,20110829,20111016
 
This will be corrected tomorrow.

John L

Sep 21, 2011, 10:10:47 PM
to mtadevelop...@googlegroups.com
There is one train that does run on Sunday mornings and Monday mornings with the same train name, 6870 from South Norwalk to Danbury. This is the connection after midnight for train 6570 that runs on Saturday and Sunday. It is the only one of its kind and hence gets its own service ID :P

Timmy Douglas

Sep 26, 2011, 10:59:25 PM
to mtadeveloperresources
the calendar.txt file I just downloaded looks like this:

service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
1,1,1,1,1,0,0,0,20110829,20111016
2,1,0,0,0,0,0,1,20110829,20111016
3,0,0,0,0,0,1,0,20110829,20111016
4,0,0,0,0,0,1,1,20110829,20111016
5,0,0,0,0,0,0,1,20110829,20111016
2754,0,0,0,0,0,0,0,20110829,20111016
2774,0,0,0,0,0,0,0,20110829,20111016
2775,0,0,0,0,0,0,0,20110829,20111016
2776,0,0,0,0,0,0,0,20110829,20111016
2779,0,0,0,0,0,0,0,20110829,20111016
2801,0,0,0,0,0,0,0,20110829,20111016
2802,0,0,0,0,0,0,0,20110829,20111016
2803,0,0,0,0,0,0,0,20110829,20111016
2804,0,0,0,0,0,0,0,20110829,20111016

this means those service ids 27xx-28xx are never running because 0 is
marked for monday-sunday
basically no trains are running on friday according to this? as well
as only like 1 or two services for mon-thur?

people are telling me my app doesn't work, but it's more like the
schedule says no trains are running....
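
A minimal sketch of the check described in this post, assuming the calendar.txt rows quoted above are the whole file. It lists which service_ids calendar.txt turns on for each weekday and deliberately ignores calendar_dates.txt. The names `services_by_weekday`, `uncovered_days`, and `never_in_base` are illustrative only and not part of any GTFS tooling.

```python
import csv
import io

DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]

# Rows copied from the calendar.txt quoted in this post.
CALENDAR_TXT = """\
service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
1,1,1,1,1,0,0,0,20110829,20111016
2,1,0,0,0,0,0,1,20110829,20111016
3,0,0,0,0,0,1,0,20110829,20111016
4,0,0,0,0,0,1,1,20110829,20111016
5,0,0,0,0,0,0,1,20110829,20111016
2754,0,0,0,0,0,0,0,20110829,20111016
2774,0,0,0,0,0,0,0,20110829,20111016
2775,0,0,0,0,0,0,0,20110829,20111016
2776,0,0,0,0,0,0,0,20110829,20111016
2779,0,0,0,0,0,0,0,20110829,20111016
2801,0,0,0,0,0,0,0,20110829,20111016
2802,0,0,0,0,0,0,0,20110829,20111016
2803,0,0,0,0,0,0,0,20110829,20111016
2804,0,0,0,0,0,0,0,20110829,20111016
"""


def services_by_weekday(calendar_text):
    """Return (weekday -> service_ids flagged 1, service_ids with all zeros)."""
    coverage = {day: [] for day in DAYS}
    all_zero = []
    for row in csv.DictReader(io.StringIO(calendar_text)):
        active_days = [day for day in DAYS if row[day] == "1"]
        for day in active_days:
            coverage[day].append(row["service_id"])
        if not active_days:
            all_zero.append(row["service_id"])
    return coverage, all_zero


coverage, never_in_base = services_by_weekday(CALENDAR_TXT)
# Weekdays on which no service_id is switched on by calendar.txt alone.
uncovered_days = [day for day in DAYS if not coverage[day]]

print("No base service on:", uncovered_days)
print("All-zero service_ids:", never_in_base)
```

With these rows, Friday is the only weekday that no service_id covers. The 27xx-28xx IDs show up in `never_in_base`, which only says they are absent from the weekly pattern; they can still be switched on for individual dates in calendar_dates.txt.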

On Sep 21, 10:01 pm, John L <larse...@gmail.com> wrote:
> I believe it should read
>
> 1,1,1,1,1,1,0,0,20110829,20111016
> 2,1,0,0,0,0,0,1,20110829,20111016
> 3,0,0,0,0,0,1,0,20110829,20111016
> 4,0,0,0,0,0,1,1,20110829,20111016
> 5,0,0,0,0,0,0,1,20110829,20111016
>
> This will be corrected tomorrow.

iTransitBuddy Support

Sep 27, 2011, 8:51:44 AM
to mtadevelop...@googlegroups.com
I raised this issue last week and it apparently has not been resolved. The Metro North GTFS calendar file is completely out of whack. Can someone from the MTA recreate the file from scratch? The current GTFS is invalid and creating confusion amongst riders who use Apps that rely on those files.

Thanks

Sent from my iPhone

John L

Sep 27, 2011, 8:27:29 PM
to mtadevelop...@googlegroups.com

If you look at calendar_dates you will see that these schedules are turned on for the various dates on which they are active, and that the base schedules (those with single-digit service IDs) are turned off for those dates.

This follows the gtfs spec and it is not broken.
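
A rough sketch of how a consumer could combine the two files for a single date, assuming the usual GTFS meaning of exception_type in calendar_dates.txt (1 adds a service for that date, 2 removes it). The calendar.txt rows are taken from this thread; the calendar_dates.txt rows are invented purely for illustration and do not come from the actual Metro-North feed. The function name `active_services` is hypothetical.

```python
import csv
import io
from datetime import date

DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]

# Three rows from the calendar.txt posted in this thread.
CALENDAR_TXT = """\
service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
3,0,0,0,0,0,1,0,20110829,20111016
4,0,0,0,0,0,1,1,20110829,20111016
2754,0,0,0,0,0,0,0,20110829,20111016
"""

# ASSUMPTION: made-up exceptions, only to show the add/remove mechanics.
CALENDAR_DATES_TXT = """\
service_id,date,exception_type
2754,20110910,1
3,20110910,2
"""


def active_services(calendar_text, calendar_dates_text, day):
    """Return the set of service_ids running on `day`."""
    ymd = day.strftime("%Y%m%d")
    weekday = DAYS[day.weekday()]  # date.weekday(): Monday is 0
    active = set()
    # Step 1: weekly pattern from calendar.txt, within the date range.
    for row in csv.DictReader(io.StringIO(calendar_text)):
        in_range = row["start_date"] <= ymd <= row["end_date"]
        if in_range and row[weekday] == "1":
            active.add(row["service_id"])
    # Step 2: per-date exceptions from calendar_dates.txt.
    for row in csv.DictReader(io.StringIO(calendar_dates_text)):
        if row["date"] != ymd:
            continue
        if row["exception_type"] == "1":
            active.add(row["service_id"])
        elif row["exception_type"] == "2":
            active.discard(row["service_id"])
    return active


exception_saturday = active_services(CALENDAR_TXT, CALENDAR_DATES_TXT, date(2011, 9, 10))
plain_saturday = active_services(CALENDAR_TXT, CALENDAR_DATES_TXT, date(2011, 9, 17))
```

On the Saturday with the invented exceptions, the all-zero service 2754 is switched on and base service 3 is switched off; on an ordinary Saturday only the base services 3 and 4 apply.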

iTransitBuddy Support

Sep 27, 2011, 8:36:01 PM
to mtadevelop...@googlegroups.com
The calendar file is broken and if you understood weekday and weekend schedules and the actual Metro North train schedule you would understand that it has actual monday trains running on weekends as well as other errors if you went by the current file.

I think we fully understand the calendar dates file which has the 1/2 setting as to whether to add/delete that train for a given date.  That does not apply here.

It is broken and if someone from Metro North / MTA could respond it would be appreciated.

Aaron Donovan

Sep 27, 2011, 9:20:56 PM
to mtadeveloperresources
Let's please try to keep the tone/discourse on this board as collegial
as possible. John works for Metro-North, and takes time out of a very
busy job in order to answer questions on this board. He's very
familiar with the train schedules on weekdays and weekends.

-Aaron

On Sep 27, 8:36 pm, iTransitBuddy Support <supp...@itransitbuddy.com>
wrote:
> The calendar file is broken and if you understood weekday and weekend
> schedules and the actual Metro North train schedule you would understand
> that it has actual monday trains running on weekends as well as other errors
> if you went by the current file.
>
> I think we fully understand the calendar dates file which has the 1/2
> setting as to whether to add/delete that train for a given date.  That does
> not apply here.
>
> It is broken and if someone from Metro North / MTA could respond it would be
> appreciated.
>
>
>
>
>
>
>
> On Tue, Sep 27, 2011 at 8:27 PM, John L <larse...@gmail.com> wrote:
> > If you look at calendar_dates you will see that these schedules are turned
> > on for various dates that they are active and turn off the base schedules,
> > those with single digits.
>
> > This follows the gtfs spec and it is not broken.
> > On Sep 27, 2011 8:52 AM, "iTransitBuddy Support" <
> > supp...@itransitbuddy.com> wrote:
> > > I raised this issue last week and it apparently has not been resolved.
> > The Metro North GTFS calendar file is completely out of whack. Can someone
> > from the MTA recreate the file from scratch? The current GTFS is invalid and
> > creating confusion amongst riders who use Apps that rely on those files.
>
> > > Thanks
>
> > > Sent from my iPhone
>

Ken Anderson

Sep 27, 2011, 9:44:46 PM
to mtadevelop...@googlegroups.com
I totally agree with you Aaron, we must maintain professional decorum here.

While I don't agree with Mr. Buddy's approach, I too feel that the files are broken. According to the current data set, no trains are running on Fridays, even if you account for the exceptions in the calendar_dates.txt file.

Ken

Michael Dannenbring

Sep 27, 2011, 9:50:29 PM
to mtadevelop...@googlegroups.com

I will look into this and post my findings on the forums tomorrow.

In addition there will be new base schedules being published hopefully by the end of this week that extend past October's base.

Mike Dannenbring
Assistant Director IT
MTA Metro-North Railroad

John L

Sep 27, 2011, 10:32:30 PM
to mtadevelop...@googlegroups.com
I was addressing Timmy Douglas' comment
 
"this means those service ids 27xx-28xx are never running because 0 is
marked for monday-sunday"
 
when I talked about the calendar dates file.
 
Regarding the Friday omission in the calendar file, as I pointed out on 9/21, service ID 1 should be marked as a 1 for Friday; that value was mistakenly omitted. Manually editing the file would correct this, but I do understand the frustration with it being omitted in the first place. We will try to do better next time.
 
Regarding iTransitBuddy Support's comment
 
"The calendar file is broken and if you understood weekday and weekend schedules and the actual Metro North train schedule you would understand that it has actual monday trains running on weekends as well as other errors if you went by the current file."
 
There is a train that runs on Sunday and Monday from my post on 9/21
 
"There is one train that does run on Sunday mornings and Monday mornings with the same train name, 6870 from South Norwalk to Danbury. This is the connection after midnight for train 6570 that runs on Saturday and Sunday. It is the only one of its kind and hence gets its own service ID"
 
I believe that there was only the one error, which was addressed above. Were there additional errors beyond the Friday omission in the calendar file?

Wayne

Sep 27, 2011, 10:33:54 PM
to mtadevelop...@googlegroups.com
Thanks Mike.

I don't believe the files are broken - the MTA has simply selected a way to implement the GTFS standard as per their interpretation of that standard. 

That said, I do believe the MTA has inadvertently crossed the line from providing raw data and stepping into an implementation pattern. Here's what I mean:

-  Prior to January 2011, data was tagged to one of the service IDs 1-5 and no/few exception trains were coded.
-  After January 2011 it appears that exception trains were added for holidays, true exceptions, and eventually Yankee Games. This has had the following impact:
1.  the change dramatically increased the data set size provided to your developer community
2.  service_ids in the 2xxx range were added w/o a documented explanation for the developers, or an algorithm to support their addition

Suggestion/ask:
1.  Only code true exception trains as exceptions in calendar_dates.txt; for example TRAIN DOES NOT RUN ON 7/1
2.  Code trains that only run on a single week day, as such; for example, Trains that only run on Saturday have a service_id=3
3.  Code the Yankee trips as a YANKEE Schedule, similar to Harlem, Hudson, and New Haven
4.  DO NOT code holidays as exceptions, let the developers handle this in their code; Holidays are already accounted for in the Saturday, Sunday, and Holiday schedule.

Number 4 (above) took a long time for us to figure out the coding. We looked at July 4th. It appears that because July 4th fell on a Monday it was coded as an exception, in that, ALL of the weekday trips were tagged for removal on 070411, then  ALL of the weekend trips were added back in for 070411. This is what I meant earlier by stepping into the implementation pattern side of things. It's easier for us doing the development work to account for July 4th by simply checking the day of the week and displaying the proper schedule, than it is to trudge through the thousands of extra lines of data.
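
The July 4th pattern described above (remove the weekday service for that date, then add the weekend service back in) can be sketched from the feed producer's side as follows. This is only an illustration: the service IDs `WEEKDAY` and `SUNDAY` are placeholders, not the IDs used in the actual feed, and `holiday_exception_rows` is a hypothetical helper.

```python
from datetime import date


def holiday_exception_rows(holiday, weekday_service_ids, holiday_service_ids):
    """Build calendar_dates.txt rows that swap one schedule for another.

    exception_type 2 removes a service on that date, 1 adds one.
    """
    ymd = holiday.strftime("%Y%m%d")
    rows = [(sid, ymd, 2) for sid in weekday_service_ids]
    rows += [(sid, ymd, 1) for sid in holiday_service_ids]
    return rows


# ASSUMPTION: placeholder service IDs, for illustration only.
rows = holiday_exception_rows(date(2011, 7, 4), ["WEEKDAY"], ["SUNDAY"])
lines = ["service_id,date,exception_type"]
lines += [f"{sid},{ymd},{kind}" for sid, ymd, kind in rows]
print("\n".join(lines))
```

A consumer that applies calendar_dates.txt exceptions for each date then needs no holiday logic of its own, at the cost of the extra rows in the feed.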

Please take no offense to my post. I'm not criticizing, just a little frustrated because, like most, we do this work in our spare time. It's a hobby, not an income, and the volume of hours we had to invest to understand this new format has been enormous.

Open to others' opinions.

Thank you.

Wayne

Ken Anderson

Sep 27, 2011, 10:44:32 PM
to mtadevelop...@googlegroups.com
Wayne,

A while back, John L had asked for input around how to code the Yankee schedules - the discussion then landed on the 2xxx style, where Mon-Sun were all zeros and the trains were added in as exceptions.  I thought this was an excellent approach, and fits the GTFS spec nicely.

I don't see how the Yankees trains could be a different schedule, since many of the trips are combinations with other trains - but I'd be interested to hear what John has to say.

For your #4, I'm not sure I follow you.  When you say let the developers handle this in their code, do you mean a calendar_dates.txt file that removes the weekday schedule and substitutes Sunday?  That would be fine - and it's worked that way in the past, but I have a feeling that's not what you meant.  Can you elaborate?  I have to say, the published schedule should be accurate, without assumptions that developers will code around something.

In any case, I agree the files should be smaller - but this is not an easy task, and I'm sure some train scheduler at the MTA will continue to throw monkey wrenches into the plan :)

Ken

Adam Ernst

Sep 27, 2011, 10:54:21 PM
to mtadevelop...@googlegroups.com
Hi Wayne,

I'm afraid your interpretation of the purpose of the GTFS spec is incorrect. The MTA's implementation is spot-on (recent errors aside).

In particular:

> service_ids in the 2xxx range were added w/o a documented explanation

GTFS does not require (or even suggest) the purpose of service_ids should be documented.

> the change dramatically increased the data set size

Irrelevant. What matters is if the data is correct if the rules are applied correctly. Your code did not follow the rules, so it got incorrect results.

> DO NOT code holidays as exceptions, let the developers handle this in their code

This runs directly counter to the GTFS spec. You're not supposed to code for holidays at all; just follow the GTFS rules and correctly coded feeds will always give you the right times, as MTA's does.

> ALL of the weekday trips were tagged for removal on 070411, then ALL of the weekend trips were added back in for 070411

This is exactly the pattern GTFS recommends feed creators use for holidays.

If the MTA should choose to create a different service_id and set of trips for every single day, that would be legal according to GTFS. There is no requirement, or even recommendation, to keep feed sizes small, however much I appreciate that as a mobile developer.

Adam


John Larsen

Sep 28, 2011, 12:22:06 AM
to mtadeveloperresources
I take no offense to constructive criticism or otherwise, just trying
to correct issues with the product :)

Wayne, The following link is a lengthy blurb about what you are
discussing.

http://groups.google.com/group/mtadeveloperresources/browse_thread/thread/6a9034b5d6bef1d9/177966a216206a0d?hl=en&lnk=gst&q=john+larsen#177966a216206a0d

To address your points:
1. A funny thing happens for the Yankee schedules. The regularly
scheduled Poughkeepsie express trains from Croton Harmon to GCT that
do not normally make the Yankee stop do make it on those days. The
problem is that they retain the same train name. The ability to turn
off just those trains and not the whole schedule is needed for these
cases, but the gtfs spec does not account for this. We can identify
the adds but have to include the stopping pattern changes as well. If
we change the train name, we have to, contractually, adjust other
staffing issues. So we just add the stop and don't rename the train.
It is one of many ways to deal with it and that one was chosen.
If I did not get this right, please give me some further detail.

2. The service IDs change and there is no meaning associated with
them. 1 happens to be the M-F only because we start with those
schedules first. 2 could be a Friday only in one gtfs file and a
Sat-Sun in another. Regarding trains that run on one day only, the
application we have written takes that into account; as an example,
there are some Friday-only trains during some of our schedule programs. The
only issue is we don't put meaning to the service ID for this as we
would not be following the specification in spirit.
If I did not get this right, please give me some further detail.

3. The Yankees-153rd Street station is a normal stop on the Hudson
line on non game days, primarily on the locals. This is related to 1.
because the extra service is the easy thing to separate and enable for
these days, turning off the regularly scheduled trains from the base
cannot be done and you would wind up with duplicate trains if we added
it to the exceptions.
If I did not get this right, please give me some further detail.

4. Holidays, inherently, are exceptions and we print amended
timetables to support this. For instance, Rosh Hashanah and Yom Kippur
are modified weekday schedules with extra service NOT a Saturday or
Sunday schedule. Same goes for the day before major holidays (our
getaway schedules) and our shopper special schedules. These exceptions
and extra trains are simply not in the normal schedules.
http://www.mta.info/mta/news/releases/?agency=mnr&en=110901-MNR45
shows an example of the Labor Day getaway schedule for Friday. If you
look at the printed timetable for the Weekend Harlem line
http://www.mta.info/mnr/html/planning/schedules/pdf/HAR_SS_627_2011.pdf
you will quickly see the issues that I am talking about. Note train
9573 operates on Saturday only and ran on 7/4. The schedule that ran
on 7/4 was a Sunday plus this train and 4673. Also note that 9960 runs
on Sundays and all holidays. The same is true for the New Haven and
Hudson to a much lesser degree though.

However, to your point, we could just add these trains for this day
and say they run only on this day. We would still have to publish a
full Yankee Schedule because of the added stop on the expresses. We
would still wind up with all 0s for the days in the calendar file and the
exceptions just turned on in the calendar dates, but only for a
handful of trains. Priority is always a developer's downfall. When we
can get to it, we will.

There is a whole planning team for schedules from multiple departments
to come up with these schedules. Simply put, they are amazing at what
they do and I don't pretend to even understand the puzzle that is the
schedule. We get the information out according to their process and
requirements. We have the task of fitting their requirements into the
gtfs specification and the outcome is what it is. Not always the best
fit, but not implemented incorrectly either (when we don't make PEBKAC
mistakes :P)

I hope this information helps!

Wayne

Sep 28, 2011, 12:35:25 AM
to mtadevelop...@googlegroups.com
John,

Thank you for this information, it is very helpful in understanding the underlying logic.

As of last Saturday, Brett and I designed a preprocessor to proliferate the schedule out to an atomic form, then merge the trips back into a normalized state. The result is a set of .csv files that contain all trips and exceptions in a much smaller data set format. These files are then run through our importer and populated into our data store. Our biggest challenge had been checking the integrity and accuracy of the data store so that we haven't lost or misrepresented any trips.

Again, thank you for taking the time to provide this explanation. It makes a lot of sense once you see the rules for which the data was gen'd.

Wayne

John Larsen

Sep 28, 2011, 12:41:29 AM
to mtadeveloperresources
As an example of the Yankee issue:

http://www.mta.info/mnr/html/yankees/pdf/HUDWKD627.pdf is the game day
schedule.
http://www.mta.info/mnr/html/planning/schedules/pdf/HUD_MF_627_2011.pdf
is the regular schedule.

Train 848 is in both files but has a different stopping pattern.

iTransitBuddy Support

Sep 27, 2011, 10:35:38 PM
to mtadevelop...@googlegroups.com
When I made Friday = 1 for service id = 1, my users told me that it was displaying weekend trains during the week, which makes me believe service id = 1 is not a weekday service.

John,

I truly appreciate your help and time.

Thanks!

Sent from my iPhone

iTransitBuddy Support

Sep 27, 2011, 9:57:28 PM
to mtadevelop...@googlegroups.com
Mike,

Thanks!  My main concern in raising the issue was that if there were issues in your process for creating GTFS files it could be corrected before the new files for the new schedule are published. 

Thanks again. 

Sent from my iPhone

iTransitBuddy Support

Sep 27, 2011, 10:45:55 PM
to mtadevelop...@googlegroups.com
The easiest thing to do is what LIRR does: use the calendar dates file exclusively and do not use the calendar file. That way each service ID is explicitly stated as to which dates it runs. If you have trains that only run on, for example, 10/1, then they would have their own service ID.

I'd recommend that way as it's much easier to feel confident in the data quality. 
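
As a rough illustration of that approach, the sketch below expands a calendar.txt weekly pattern into one explicit calendar_dates.txt-style row per running date. It assumes the Saturday-only row (service_id 3) posted in this thread and does not merge in any exceptions that already exist in calendar_dates.txt; `expand_to_calendar_dates` is a hypothetical name.

```python
import csv
import io
from datetime import datetime, timedelta

DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]

# The Saturday-only row from the calendar.txt posted in this thread.
CALENDAR_TXT = """\
service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
3,0,0,0,0,0,1,0,20110829,20111016
"""


def expand_to_calendar_dates(calendar_text):
    """Turn weekly patterns into (service_id, date, exception_type) rows."""
    rows = []
    for row in csv.DictReader(io.StringIO(calendar_text)):
        current = datetime.strptime(row["start_date"], "%Y%m%d").date()
        end = datetime.strptime(row["end_date"], "%Y%m%d").date()
        while current <= end:
            # date.weekday(): Monday is 0, matching the order of DAYS.
            if row[DAYS[current.weekday()]] == "1":
                # exception_type 1 means "service added on this date".
                rows.append((row["service_id"], current.strftime("%Y%m%d"), "1"))
            current += timedelta(days=1)
    return rows


explicit_rows = expand_to_calendar_dates(CALENDAR_TXT)
for service_id, ymd, exception_type in explicit_rows:
    print(f"{service_id},{ymd},{exception_type}")
```

Each running date becomes its own row, which makes the feed easier to audit by eye but larger.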

My opinion only. 

Thanks!

Sent from my iPhone

iTransitBuddy Support

Sep 27, 2011, 9:50:39 PM
to mtadevelop...@googlegroups.com
Everyone, I'm sorry if I responded a bit hastily. I reported the issue over a week ago and explained the issue in detail. If someone from the MTA can look at the details they will see there is an issue.

I'm only pushing for them to be corrected so Metro North riders that use GTFS apps have correct data.

Again, I apologize for any misunderstanding, and I appreciate the MTA's help on this board as well as all participants.

Thanks!

Sent from my iPhone

iTransitBuddy Support

Sep 27, 2011, 11:27:10 PM
to mtadevelop...@googlegroups.com
I agree with Adam, size of the data doesn't matter. It needs to be correct and that's all that matters and whatever the size that's what we have to accept...but these files are relatively small.

Only change I'd recommend is doing entirely calendar_dates instead. LIRR does this.

Thanks.

Sent from my iPhone

iTransitBuddy Support

Sep 27, 2011, 9:28:56 PM
to mtadevelop...@googlegroups.com
Aaron/John, sorry if the tone of my email was misinterpreted. I've reported an error with my users saying that Weekend trains were showing on Weekdays as has another user. I reported this issue over a week ago.

There really is an issue and the GTFS should probably be recreated.

I appreciate everyone's support of this board and open data.

Thanks!

Sent from my iPhone

Timmy Douglas

Sep 27, 2011, 9:47:43 PM
to mtadevelop...@googlegroups.com
Yeah, I'm seeing the same thing. If you take Friday September 30th for instance, there are no entries
in calendar_dates.txt with that date, and all the entries in calendar.txt have friday=0... so there are no matches.

Yuriy Yakimenko

Sep 27, 2011, 10:21:47 PM
to mtadevelop...@googlegroups.com
I can confirm that the latest (dated Sept 15) Metro North GTFS does not have any trains for this coming Friday, according to what I see.

Yuriy

John Paul N.

Sep 28, 2011, 9:20:29 AM
to mtadeveloperresources
Regarding the Sunday-Monday morning train, wouldn't a time hour value
of greater than 24 solve that? (use the service id for Saturday-
Sunday.)
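
For context, GTFS lets stop times exceed 24:00:00 for trips that run past midnight of their service day. A minimal sketch of how a consumer might turn such a value into a real clock time, assuming a simple midnight-based offset and ignoring daylight-saving changes; the 24:35:00 value is made up and is not train 6870's actual time, and `gtfs_time_to_datetime` is a hypothetical helper.

```python
from datetime import date, datetime, time, timedelta


def gtfs_time_to_datetime(service_date, gtfs_time):
    """Convert an HH:MM:SS stop time (hours may be 24 or more) to a datetime."""
    hours, minutes, seconds = (int(part) for part in gtfs_time.split(":"))
    midnight = datetime.combine(service_date, time(0, 0))
    return midnight + timedelta(hours=hours, minutes=minutes, seconds=seconds)


# ASSUMPTION: illustrative time on a Sunday service day (2011-09-25).
after_midnight = gtfs_time_to_datetime(date(2011, 9, 25), "24:35:00")
print(after_midnight)            # falls on the following calendar day
print(after_midnight.weekday())  # 0 means Monday
```

Under this convention a trip stays attached to the Sunday service_id even though riders see it in the early hours of Monday.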

John Larsen

Sep 28, 2011, 2:36:02 PM
to mtadeveloperresources
John Paul N.,

It would if MNR classified that as a Sunday train, which we do not.
Train schedules run from 0000 to 2359. If the train starts before 0000
and ends afterwards, we do handle it this way.

In this particular case, and the only case, because it is a connection
of a weekend train that does not run for the remainder of the week, we
keep the weekend train name but have it run on Monday.

The reason for the 0000 to 2359 is that On Time Performance would be a
mess to calculate if we used a 0200 to 0159 or other operational day.
This makes it very clean.

John Paul N.

Sep 28, 2011, 4:27:52 PM
to mtadeveloperresources
Hi John,

I see what you're saying and it's unfortunate. You (meaning the MNRR)
are bound to the guidelines set by its board and executives, and they
define a service day as being between 0000 to 2359, if that's what
you're saying. I thought Metro-North has a period in which trains
don't operate in the early morning (from GCT) past around 0100 or 0200
for a few hours. The LIRR, however, does, and that complicates things
a bit for them. But in general, that brief period of non-service, I
would think, would be the logical separation of a service day. As an
analogy, television schedules for the early morning are usually
considered part of the previous day, e.g. Monday late night shows air
on Tuesday morning and are considered Monday shows for ratings
purposes. It's hard to imagine why the MTA organization does not
consider that here.

To the casual observer of the schedule, there is no difference time-
wise, so it is frustrating to hear that, but I understand.

There is a similar situation in the Subway GTFS. On Monday mornings,
the J train does not run south of Chambers Street. The current GTFS
does not have a separate calendar.txt service value for Mondays.
Operationally, the GTFS should say those trips are part of the
Saturday (for Sunday Morning) or Sunday (for Monday morning) schedule
but the subway GTFS does not; instead they are part of the Sunday and
weekday schedule. This results in duplicate trips in the weekday early
morning; the only difference being Fulton and Broad Street stations.
In my implementation, I hard-code the exception and remove the extra
trips. But for people using Google Maps, it would show trips from
Fulton or Broad Streets during times when it should not be. OK, I have
not checked this recently, but I wouldn't be surprised if it does
(unless Google personally knows about this particular exception and
codes its servers to recognize the exception, which I would be
surprised to hear if it does for particular agencies.)

Have you checked out the thread I started about the NYCT Bus GTFS? It
also applies here. In fact, I brought the topic up after reading this
thread.

Your response here, however, gives me an ominous feeling for that
thread. But I'll wait to hear what you and your colleagues think.

Thanks,
John Paul

Michael Dannenbring

Sep 28, 2011, 7:03:12 PM
to mtadevelop...@googlegroups.com

All,

I have corrected the issue with the calendar.txt.  This was a clerical issue, not a systematic problem.

I have also included the new base schedules that go live on October 16th.  I created one day exceptions for every day until October 16th.

Friday's Yankee schedule is also included.  I will publish additional Yankee schedules as they become available.

Thank you for your patience,

Mike Dannenbring
Assistant Director IT
MTA Metro-North Railroad

iTransitBuddy Support

Sep 28, 2011, 7:14:24 PM
to mtadevelop...@googlegroups.com
Thanks!

Have you ever considered all service id's a part of calendar_dates ?

Sent from my iPhone

John L

Sep 30, 2011, 12:23:06 PM
to mtadevelop...@googlegroups.com

We had, but the file would be much larger than it is, since we have 3 schedules for weekday service (Monday, Tuesday thru Thursday, and Friday). The reason why you have 5 or 6 smallish service IDs is that we combined schedules from our 6 base schedules. It is also the reason for the single trip on Sunday and Monday as well as some select Friday-only trains. You would wind up with 14 or so service IDs with over 700 trips instead of just the 8 Yankee schedules.
