GTFS, Geolocation and Mapping


Paul Harrington

Jul 12, 2018, 11:51:12 AM
to Transit Developers
Hi,

Within an implementation covering multiple GTFS feeds, I'm looking to present the closest stops to a user on a map. Thinking about the overall design, a few questions have come up:

1) To initially determine if a feed is relevant for a user I am looking to see if the user's GPS coordinates lie within the rectangular bounds of the feed. To find the rectangular bounds of a feed you cycle through every stop in stops.txt to determine the minimum and maximum latitude and longitude. 

- Is there a better or easier way of doing this?

- Would this work well in practice for most feeds? (If, for example, a feed contained mostly local buses but a couple of routes went much further than the rest, the model wouldn't be great.)

- Is this box information already available online for feeds? If not, has anybody written software to calculate the extremities that could be run against stops.txt?

- Is it correct to assume that once you have worked out the box coordinates once for a given feed that they are unlikely to change much over time ?
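
For reference, the bounding-box calculation described in 1) can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the standard `stop_lat` and `stop_lon` columns of stops.txt, the helper names `feed_bounds` and `in_bounds` are made up for illustration, and it does not handle feeds that straddle the antimeridian.

```python
import csv
import io

def feed_bounds(stops_file):
    """Return (min_lat, min_lon, max_lat, max_lon) over all stops with coordinates."""
    min_lat = min_lon = float("inf")
    max_lat = max_lon = float("-inf")
    for row in csv.DictReader(stops_file):
        try:
            lat = float(row["stop_lat"])
            lon = float(row["stop_lon"])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing or malformed coordinates
        min_lat, max_lat = min(min_lat, lat), max(max_lat, lat)
        min_lon, max_lon = min(min_lon, lon), max(max_lon, lon)
    if min_lat == float("inf"):
        raise ValueError("no usable stop coordinates found")
    return (min_lat, min_lon, max_lat, max_lon)

def in_bounds(bounds, lat, lon):
    """True if the point lies inside the rectangle (edges included)."""
    min_lat, min_lon, max_lat, max_lon = bounds
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon

# Tiny made-up sample standing in for a real stops.txt file.
SAMPLE = """stop_id,stop_name,stop_lat,stop_lon
A,First St,53.35,-6.26
B,Second St,53.30,-6.20
C,Depot,,
"""
bounds = feed_bounds(io.StringIO(SAMPLE))
print(bounds)
```

With a real feed you would pass `open("stops.txt", newline="", encoding="utf-8-sig")` instead of the in-memory sample.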


2) When displaying the stops closest to a user, do you focus on a number of stops to display, on stops within a certain distance, or on a combination of both?

- There is no point displaying only 10 stops if there are 20 within half a mile, as all are reachable.
- If a nearby stop covers a route, there is no point displaying a stop further away unless it covers additional routes.

Anybody have a good algorithm for determining what to show ?
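
One possible way to express the two rules in 2) is a greedy pass over the stops in order of distance. The sketch below is only an illustration, not an established algorithm: `select_stops` and the 0.8 km threshold (roughly half a mile) are assumptions, and the set of routes per stop is assumed to have been derived beforehand (in GTFS that means joining stop_times.txt, trips.txt and routes.txt).

```python
def select_stops(candidates, always_within_km=0.8):
    """Pick stops to display.

    candidates: list of (stop_id, distance_km, set_of_route_ids).
    Every stop within always_within_km is kept; a stop further away
    is kept only if it serves a route not already covered.
    """
    chosen = []
    seen_routes = set()
    for stop_id, dist, routes in sorted(candidates, key=lambda c: c[1]):
        adds_route = not routes <= seen_routes  # any route we have not seen yet?
        if dist <= always_within_km or adds_route:
            chosen.append(stop_id)
            seen_routes |= routes
    return chosen

# Made-up example data.
candidates = [
    ("far_new", 2.0, {"2"}),
    ("near1", 0.2, {"1"}),
    ("far_same", 1.5, {"1"}),
    ("near2", 0.5, {"1"}),
]
print(select_stops(candidates))
```

Here `far_same` is dropped because route 1 is already served by closer stops, while `far_new` is kept because it adds route 2.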


3) Any suggestions regarding mapping APIs ?

- Google is powerful and easy to use, but it now requires sign-up and is free up to a certain quota (I think). Going forward, everything may be charged.
- Has OpenStreetMap become easier to use for this type of application?
- Are there other good ones?

Thanks Paul.

Paul Harrington

Jul 13, 2018, 4:44:21 AM
to Transit Developers
Anybody ? 

Andrew Byrd

Jul 13, 2018, 5:09:17 AM
to transit-d...@googlegroups.com

> On 12 Jul 2018, at 23:51, Paul Harrington <harri...@gmail.com> wrote:
> Within an implementation covering multiple GTFS feeds, I'm looking to present the closest stops to a user on a map.

In order to select all stops within a certain radius, you will probably want to use a spatial index of some kind (for example an R-tree) to efficiently extract the relevant objects. Spatial databases will have this capability built in. I have found though that for simple operations it’s just as effective to use a "non-spatial" database and build standard indexes on the latitude and longitude columns. Selecting all stops within a bounding box using simple inequalities is very fast on indexed columns.
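
As an illustration of the non-spatial approach, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, index names and the `stops_in_box` helper are assumptions made for the example; the point is only that the bounding-box filter is a pair of range conditions on ordinarily indexed columns, and that stops from several feeds can share one table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stops (feed_id TEXT, stop_id TEXT, stop_lat REAL, stop_lon REAL)"
)
# Ordinary (non-spatial) indexes on the coordinate columns.
conn.execute("CREATE INDEX idx_stops_lat ON stops (stop_lat)")
conn.execute("CREATE INDEX idx_stops_lon ON stops (stop_lon)")

# Made-up stops from two feeds loaded into the same table.
conn.executemany(
    "INSERT INTO stops VALUES (?, ?, ?, ?)",
    [
        ("dublin", "A", 53.35, -6.26),
        ("dublin", "B", 53.30, -6.20),
        ("cork", "C", 51.90, -8.47),
    ],
)

def stops_in_box(conn, min_lat, min_lon, max_lat, max_lon):
    """Return (feed_id, stop_id) for every stop inside the rectangle."""
    rows = conn.execute(
        "SELECT feed_id, stop_id FROM stops "
        "WHERE stop_lat BETWEEN ? AND ? AND stop_lon BETWEEN ? AND ? "
        "ORDER BY feed_id, stop_id",
        (min_lat, max_lat, min_lon, max_lon),
    )
    return rows.fetchall()

print(stops_in_box(conn, 53.2, -6.4, 53.4, -6.1))
```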

If you want the N closest, you will then need to calculate the distance to the pre-filtered stops and sort them. This can also be done well enough even on non-spatial databases using crude projections (longitude * cos(latitude)) and some straightforward SQL.
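
A rough sketch of that second step, in Python rather than SQL for readability: the crude projection scales the longitude difference by the cosine of the latitude, and the pre-filtered stops are then sorted by the resulting distance. The function names and the 111.32 km-per-degree constant are assumptions for the example, and the approximation degrades near the poles and over long distances.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude

def crude_distance_km(lat1, lon1, lat2, lon2):
    """Approximate distance using an equirectangular projection."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = (lon2 - lon1) * math.cos(mean_lat) * KM_PER_DEGREE
    dy = (lat2 - lat1) * KM_PER_DEGREE
    return math.hypot(dx, dy)

def n_closest(stops, lat, lon, n):
    """stops: list of (stop_id, stop_lat, stop_lon); returns the n nearest stop_ids."""
    ranked = sorted(stops, key=lambda s: crude_distance_km(lat, lon, s[1], s[2]))
    return [stop_id for stop_id, _, _ in ranked[:n]]

# Made-up stops, already pre-filtered by a bounding box.
stops = [("A", 53.35, -6.26), ("B", 53.30, -6.20), ("C", 53.40, -6.30)]
print(n_closest(stops, 53.349, -6.26, 2))
```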

> 1) To initially determine if a feed is relevant for a user I am looking to see if the user's GPS coordinates lie within the rectangular bounds of the feed. To find the rectangular bounds of a feed you cycle through every stop in stops.txt to determine the minimum and maximum latitude and longitude.

If you have multiple feeds, is there any reason you can't just load them all into a single database / data structure, with a single spatial index?

Regards,
Andrew

Sean Barbeau

Jul 13, 2018, 9:36:07 AM
to Transit Developers

> On 12 Jul 2018, at 23:51, Paul Harrington <harri...@gmail.com> wrote:
> Within an implementation covering multiple GTFS feeds, I'm looking to present the closest stops to a user on a map.

In order to select all stops within a certain radius, you will probably want to use a spatial index of some kind (for example an R-tree) to efficiently extract the relevant objects. Spatial databases will have this capability built in. 

l...@frachet.ca

Jul 19, 2018, 1:42:37 PM
to Transit Developers
Hi Paul,

1) A bounding box will do as a fast filter, but you'll always be inside Amtrak in the US, inside NR in the UK or inside VIA Rail in Canada. What you could do as a second level of filtering would be to have an "envelope" of the feed, say the polygons covering all the area less than 10 km from the lat/lon of any stop in the GTFS. You can compute it once per GTFS (using a library like Turf, turfjs.org) and then check whether you are inside the polygon every time you check coverage.
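
To make the envelope idea concrete without a geometry library, here is a deliberately simplified Python sketch. It is not the polygon buffer you would get from Turf: it swaps the polygon for a coarse grid of cells and uses a square of roughly 10 km around each stop rather than a disc. The function names, the 0.05 degree cell size and the 111.32 km-per-degree constant are all assumptions for illustration.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude

def coverage_cells(stops, buffer_km=10.0, cell_deg=0.05):
    """Return the set of grid cells within roughly buffer_km of any stop.

    stops: iterable of (lat, lon). Computed once per feed.
    """
    cells = set()
    dlat = buffer_km / KM_PER_DEGREE
    for lat, lon in stops:
        # Degrees of longitude shrink with latitude; clamp to avoid division by ~0.
        dlon = buffer_km / (KM_PER_DEGREE * max(math.cos(math.radians(lat)), 0.01))
        row_min = math.floor((lat - dlat) / cell_deg)
        row_max = math.floor((lat + dlat) / cell_deg)
        col_min = math.floor((lon - dlon) / cell_deg)
        col_max = math.floor((lon + dlon) / cell_deg)
        for row in range(row_min, row_max + 1):
            for col in range(col_min, col_max + 1):
                cells.add((row, col))
    return cells

def covers(cells, lat, lon, cell_deg=0.05):
    """Cheap per-request check: is the user's cell in the precomputed set?"""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg)) in cells

# Made-up feed with two distant stops (think of a long-distance rail feed).
feed_stops = [(53.35, -6.26), (51.90, -8.47)]
cells = coverage_cells(feed_stops)
print(covers(cells, 53.36, -6.25))  # near the first stop
print(covers(cells, 52.60, -7.40))  # inside the bounding box, but far from any stop
```

The second point is the case the bounding box gets wrong: it lies inside the feed's rectangle but nowhere near a stop.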

Online you can use transit.land, which has an API offering bounding boxes and "Operator service area" GeoJSON. But this project is only half-alive, so it might not be up to date.

It will change over time if new services are added, or if the scope of the feed producer changes. That may or may not be an issue for you, depending on what you're trying to achieve.

2) Depends on what you are looking for.

3) Mapzen (parent project of transit.land) used to host Valhalla; I'm not sure it is still up to date either.

Leo Frachet

Paul Harrington

Jul 20, 2018, 6:43:36 AM
to Transit Developers
Thanks Guys,

Some interesting algorithms mentioned here, the feed area "envelope" referenced by Leo is something I may look at going forward. 

For now though I'm going to start with the simple crude approach and see how it works out in practice.

I now calculate the bounding rectangular box for each feed and this will be used to determine whether or not to pull down feed stop data to the user's device.

I'm also now calculating the straight-line coordinate distance from the user to each stop in the pulled-down feeds using a simple canvas model approach and recording those closest. I'm not using the Haversine algorithm as I want it to run quickly and all I am interested in is finding the closest stops: a crude calculation will suffice as long as you are not dealing with feeds at very high or very low latitudes.

I will aim to return a configurable target number of stops, but can override this target by including all stops within one configurable coordinate distance and excluding those beyond another. I've no idea yet what the values should be for these 3 configurable parameters; I guess it will be a trial and error process. If anybody has any suggestions based on their own experience, I'd be keen to hear them.
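
A minimal Python sketch of how these three parameters could interact, assuming the stops have already been paired with a distance from the user. The function name, parameter names and default values are placeholders for illustration, not recommendations, and the distances can be in whatever unit the caller uses.

```python
def pick_stops(stops_with_distance, target_count=10,
               include_within=0.005, exclude_beyond=0.05):
    """Choose stops to return.

    stops_with_distance: list of (stop_id, distance).
    - Stops beyond exclude_beyond are never returned.
    - Stops within include_within are always returned.
    - Otherwise stops are added, nearest first, until target_count is reached.
    """
    result = []
    for stop_id, dist in sorted(stops_with_distance, key=lambda s: s[1]):
        if dist > exclude_beyond:
            break  # everything from here on is too far away
        if len(result) >= target_count and dist > include_within:
            break  # target met and no more "always include" stops remain
        result.append(stop_id)
    return result

# Made-up distances in coordinate units.
nearby = [("a", 0.001), ("b", 0.002), ("c", 0.003), ("d", 0.01), ("e", 0.1)]
print(pick_stops(nearby, target_count=2))
print(pick_stops(nearby, target_count=4))
```

With a target of 2 the result still contains three stops, because all three lie within the always-include distance; with a target of 4, stop `e` is still dropped for being beyond the cut-off.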

One of my original questions asked for suggested mapping providers. I'm looking for something that's free (free at low volumes will do, as long as it's likely to stay that way), easy to develop with, and offers a decent user experience. Any suggestions?