Proper Querying of Transit Data from MySQL

646 views
Skip to first unread message

iSk1nny

unread,
Aug 7, 2009, 10:52:07 AM8/7/09
to Transit Developers
Hey everyone! For the past year I have been working on an iPhone based
Transit application that runs off a Restful XML based web service
written in PHP.

I have a little trouble properly querying GTFS Data from my MySQL
Database. I have imported the feed files exactly into the Database as
they are in .cvs form (including Field Names).

Does anyone have proper SQL Queries to be used to calculate Departure
Times for a specified stop_id

Thanx!

Brandon Martin-Anderson

unread,
Aug 7, 2009, 2:22:51 PM8/7/09
to transit-d...@googlegroups.com
You could get all departure times for a given stop_id with:

SELECT departure_time FROM stop_times WHERE stop_id='your_stop_id';

However, this is going to get every departure on every day for every
route that visits that stop.

To limit the stops you get to a particular service day:

SELECT stop_times.departure_time FROM stop_times, trips WHERE
stop_times.stop_id='your_stop_id' AND
trips.service_id='your_service_id' AND
stop_times.trip_id=trips.trip_id;

Further, you could limit the departures to a particular route on a
particular service day:

SELECT stop_times.departure_time FROM stop_times, trips WHERE
stop_times.stop_id='your_stop_id' AND
trips.service_id='your_service_id' AND trips.route_id='your_route_id'
AND stop_times.trip_id=trips.trip_id;

-B

iSk1nny

unread,
Aug 18, 2009, 8:46:55 PM8/18/09
to Transit Developers
Thanx for the Reply!

Thats about the same query I already have. Im looking for a more
'advanced' query that will query for:

1. A particular stop
2. Take into account calendar_dates table
3. Get the departure times for all in service routes

I just cant seem to get the query right... Anymore help would be very
helpfull!

Thanx in advance

On Aug 7, 3:22 pm, Brandon Martin-Anderson <badh...@gmail.com> wrote:
> You could get all departure times for a given stop_id with:
>
> SELECT departure_time FROM stop_times WHERE stop_id='your_stop_id';
>
> However, this is going to get every departure on every day for every
> route that visits that stop.
>
> To limit the stops you get to a particular service day:
>
> SELECT stop_times.departure_time FROM stop_times, trips WHERE
> stop_times.stop_id='your_stop_id' AND
> trips.service_id='your_service_id' AND
> stop_times.trip_id=trips.trip_id;
>
> Further, you could limit the departures to a particular route on a
> particular service day:
>
> SELECT stop_times.departure_time FROM stop_times, trips WHERE
> stop_times.stop_id='your_stop_id' AND
> trips.service_id='your_service_id' AND trips.route_id='your_route_id'
> AND stop_times.trip_id=trips.trip_id;
>
> -B
>

Tim McHugh

unread,
Aug 19, 2009, 5:26:15 PM8/19/09
to Transit Developers
It depends on which method you are using for the calendar and
calendar_dates. If you use the "alternate" method of bypassing the
calendar table it's pretty simple.

select trips.route_id, stop_times.departure_time
from calendar_dates, stop_times, trips
where calendar_dates.date='your_date'
and calendar_dates.exception_type=1
and stop_times.stop_id='your_stop_id'
and calendar_dates.service_id=trips.service_id
and stop_times.trip_id=trips.trip_id;

If you use the "recommended" method combining the calendar and
calendar_dates as exceptions it's more complicated.

select trips.route_id, stop_times.departure_time
from calendar, stop_times, trips
where 'your_date' between calendar.start_date and calendar.end_date
and stop_times.stop_id='your_stop_id'
and calendar.service_id=trips.service_id
and stop_times.trip_id=trips.trip_id
and ( (dayofweek('your_date')=1 and calendar.sunday=1)
or (dayofweek('your_date')=2 and calendar.monday=1)
or (dayofweek('your_date')=3 and calendar.tuesday=1)
or (dayofweek('your_date')=4 and calendar.wednesday=1)
or (dayofweek('your_date')=5 and calendar.thursday=1)
or (dayofweek('your_date')=6 and calendar.friday=1)
or (dayofweek('your_date')=7 and calendar.saturday=1))
and not exists (select 1
from calendar_dates
where calendar_dates.date='your_date'
and calendar_dates.date between
calendar.start_date and calendar.end_date
and
calendar.service_id=calendar_dates.service_id
and calendar_dates.exception_type=2)
union
select trips.route_id, stop_times.departure_time
from calendar_dates, stop_times, trips
where calendar_dates.date='your_date'
and calendar_dates.exception_type = 1
and stop_times.stop_id='your_stop_id'
and calendar_dates.service_id=trips.service_id
and stop_times.trip_id=trips.trip_id;

There may be a more elegant way for the dayofweek part but this should
work (I'm not as familiar with mysql). FYI, TriMet uses the alternate
method because of the simplicity of these types of queries. Hope this
helps.

iSk1nny

unread,
Aug 19, 2009, 5:44:37 PM8/19/09
to Transit Developers
I did the opposite and bypassed 'calendar_dates'. What do you think of
this? for the big downtown terminal (during rush hour - so 15 routes)
it takes about ~200ms of total query time, resulting in my XML Web
Service taking about ~500ms to finish. Total time from querying to
displaying on my iPhone is about ~800ms (on 3G).

SELECT
routes.route_short_name,
routes.route_long_name,
routes.route_type,
MIN(stop_times.departure_time) as departure_time,
MIN(TIMEDIFF(stop_times.departure_time, '{PHP Dynamical Generated 24
Hour Time}')) as departure_time_in
FROM
calendar
Inner Join trips ON trips.service_id = calendar.service_id
Inner Join stop_times ON stop_times.trip_id = trips.trip_id
Inner Join routes ON trips.route_id = routes.route_id
WHERE
calendar.{PHP Dynamical Generated Day} = '1'
AND
'{PHP Dynamical Generated Date}' BETWEEN calendar.start_date AND
calendar.end_date
AND
stop_times.stop_id = '{My Stop ID - Passed from a PHP Function}'
AND
stop_times.departure_time >= '{PHP Dynamical Generated 24 Hour Time}'
GROUP BY
routes.route_short_name
ORDER BY
routes.route_short_name+0 ASC,
stop_times.departure_time ASC

Tim McHugh

unread,
Aug 19, 2009, 6:13:16 PM8/19/09
to Transit Developers
I guess it depends on the scope of agencies that you are planning to
support with your app. Ignoring the calendar_dates file will yield
incorrect results from those agencies that use it. You would have to
either incorporate the calendar_dates into your query or do some
converting of the calendar_dates data into your calendar table (lots
of chopped up start/end date periods for the service_ids).

iSk1nny

unread,
Aug 19, 2009, 6:16:38 PM8/19/09
to Transit Developers
How many rows does calendar_dates usually have?

iSk1nny

unread,
Aug 19, 2009, 6:35:59 PM8/19/09
to Transit Developers
My Query and your Recommend Query give the same results right now, but
that may change depending on the date. You know what your talking
about so Ian gonna take your recommendation, and hope that it is ok if
I proceed to use your query?

Also thank your for your help!
Reply all
Reply to author
Forward
0 new messages