A crowd-sourced public transport app (sanity check my idea)


Paul Robinson

Jul 2, 2010, 1:51:29 PM
to opendatam...@googlegroups.com
Hi all,

Over the last few weeks it's become clear to me personally that there is a need for a different approach to obtaining - and sharing - data around public transport.

It's something many of us care about, and whilst TfL are heading in the right direction, I think we can - and should - help more to happen.

Whilst the NaPTAN database is pretty decent, it has some holes - here in Manchester at least, the lack of Metrolink data is pretty key - and also there are of course a couple of commercial use license issues that some may not be particularly happy with.

We have also found there is some resistance to releasing data around timetables, and it's not even clear if the changes rumoured to happen around Traveline data will actually result in useful data any time soon.

Lastly, there is critical data around fares - particularly bus fares - that is operator-specific and might be deemed "commercially confidential" despite having huge utility for the public at large. It is my understanding there is no central authority we can go to for comprehensive fare data at this time, and operators are likely to be reluctant, for operational reasons, to make this data available cost-effectively.

Julian and I have thrown around some ideas and I am sketching out an idea for a tool I think can help us overcome all of this and more through the use of crowd-sourced data.

For some of this to work, I think I'm going to need to obtain NaPTAN under a commercial license, which I have only just started to investigate, but I'm hoping they will agree that what I outline below is of sufficient public utility that any license fees will be waived. Here's how the app works:

1. A "web app" runs on the user's device. Initially, I'm thinking GPS-capable phones with sophisticated browsers with geo-location JS hooks such as iPhone and Android. In future, well, we'll see.

2. The user is able to select which local public transport way-point they are currently waiting at. This cross-references NaPTAN data based on geo-location info if available, or a user can search based on postcode, street/area, etc.

3. They press a button to log boarding a bus/train/tram/etc. This records the journey "start time".

4. They are then asked to quickly input route number, select operator, and the fare they have paid (or use of a multi-journey ticket, e.g. MegaRider, etc.)

5. They press another button to mark the end of the journey and confirm which transport waypoint they have disembarked at. This records the journey "end time".

6. Optionally, they could press a button to mark other stops/stations they pass. Something to do on the bus, eh? :-)
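
To make steps 2 to 5 a bit more concrete, here is a rough Python sketch of what one logged journey might look like. Everything in it is an assumption for illustration: the `Stop` and `Journey` classes, the stop codes, the operator name and the coordinates are all made up, and a real version would look stops up in NaPTAN (or similar) rather than a hard-coded list.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Stop:
    code: str   # stop identifier (invented here, not a real NaPTAN code)
    name: str
    lat: float
    lon: float


@dataclass
class Journey:
    board_stop: str
    start_time: datetime
    route: str = ""
    operator: str = ""
    fare_pence: Optional[int] = None   # None when a multi-journey ticket was used
    alight_stop: Optional[str] = None
    end_time: Optional[datetime] = None


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_stop(stops, lat, lon):
    """Step 2: pick the stop closest to the phone's reported position."""
    return min(stops, key=lambda s: distance_m(lat, lon, s.lat, s.lon))


# Invented sample stops; coordinates are approximate and for illustration only.
stops = [
    Stop("STOP_A", "Piccadilly Gardens", 53.4808, -2.2369),
    Stop("STOP_B", "Chorlton", 53.4427, -2.2773),
]

here = nearest_stop(stops, 53.4810, -2.2370)             # step 2
trip = Journey(here.code, datetime(2010, 7, 2, 8, 30))   # step 3
trip.route, trip.operator, trip.fare_pence = "86", "ExampleBus", 180  # step 4
trip.alight_stop = "STOP_B"                               # step 5
trip.end_time = datetime(2010, 7, 2, 8, 52)
print(here.name, trip.end_time - trip.start_time)
```

In practice the user would confirm the suggested stop rather than have it picked silently, since GPS fixes near clusters of stops can be ambiguous.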

This - if used by enough people over a sufficient period of time - will build a significant data resource. Eventually we will be able to work out fare stages, average costs of journeys and even implied timetables based on average journey times.
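
As a very rough idea of the kind of roll-up I mean, the sketch below groups logged journeys by route and stop pair and averages the times and fares. The sample rows and the `summarise` function are invented for illustration; real data would need outlier handling and far more samples before anything like fare stages could be inferred.

```python
from collections import defaultdict
from statistics import mean

# Invented samples: (route, board_stop, alight_stop, minutes, fare_pence)
samples = [
    ("86", "A", "B", 22, 180),
    ("86", "A", "B", 26, 180),
    ("86", "A", "C", 35, 220),
    ("86", "A", "B", 24, None),   # multi-journey ticket, no fare recorded
]


def summarise(samples):
    """Average journey time and fare for each (route, from, to) combination."""
    groups = defaultdict(list)
    for route, start, end, minutes, fare in samples:
        groups[(route, start, end)].append((minutes, fare))

    summary = {}
    for key, rows in groups.items():
        # Ignore journeys where no cash fare was recorded.
        fares = [f for _, f in rows if f is not None]
        summary[key] = {
            "n": len(rows),
            "avg_minutes": mean(m for m, _ in rows),
            "avg_fare_pence": mean(fares) if fares else None,
        }
    return summary


summary = summarise(samples)
print(summary[("86", "A", "B")])
```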

I'm sure you can all imagine other uses this data - which we could release under a very liberal license - could be put to.

If Traveline and/or others do release timetable data, we will be able to do some really interesting analysis based on what transport authorities are timetabling against what is actually happening, for starters. 

There will also over time be significant value in recording fares being charged for different routes and if these change by operator or even time of day.

The next iteration would allow you to indicate where you wanted to get to and the app would be able to suggest a likely journey, including likely journey time and cost based on the sampled data previously input.
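
A minimal sketch of how that suggestion step could work, assuming direct journeys only and an invented sample log. The `suggest` function and its field names are hypothetical; it simply takes the median time and fare per route for the requested stop pair and lists the quickest route first.

```python
from statistics import median

# Invented sample log: (origin, destination, route, minutes, fare_pence)
log = [
    ("A", "B", "86", 22, 180),
    ("A", "B", "86", 26, 180),
    ("A", "B", "85", 30, 160),
    ("A", "B", "85", 34, 160),
    ("A", "C", "86", 35, 220),
]


def suggest(log, origin, destination):
    """Return direct options for origin -> destination, quickest first."""
    by_route = {}
    for o, d, route, minutes, fare in log:
        if (o, d) == (origin, destination):
            by_route.setdefault(route, []).append((minutes, fare))

    options = [
        {
            "route": route,
            "samples": len(rows),
            "likely_minutes": median(m for m, _ in rows),
            "likely_fare_pence": median(f for _, f in rows),
        }
        for route, rows in by_route.items()
    ]
    return sorted(options, key=lambda opt: opt["likely_minutes"])


for option in suggest(log, "A", "B"):
    print(option)
```

Journeys needing a change would need some kind of route graph on top of this, which is where it starts getting more involved.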

The iteration after that, I would envisage being able to include other transport options not based around NaPTAN data at all (e.g. minicab/black cab fares and journey times), and over time the community would have ownership of a data resource that shows how transport flows, where, and at what cost to the user.

This sounds complex (and to be honest, part of it is - figuring out how to make this work for things like tube journeys is fun, but doable), but I think I could have a very, very basic proof of concept app up and running by the end of next week if people think it has legs, with the source up on GitHub if people want to contribute and the data being made available through a variety of means.

Thoughts? Ideas? Input? Anything?

--
Paul Robinson


Vagueware Limited is registered in England/Wales, number 05700421
Registered Office: 3 Tivoli Place, Ilkley, W. Yorkshire, LS29 8SU
Correspondence: 13 Crossland Road, Manchester, M21 9DU

Signup now for white papers, trends, etc.: http://eepurl.com/tVNx

Julian Tait

Jul 2, 2010, 6:12:31 PM
to opendatam...@googlegroups.com
Here is a great blog post on data.gov.uk about why the Asborometer was developed. It was basically to get work and it worked.

http://data.gov.uk/blog/guest-post-how-i-built-asborometer-jeff-gilfelt

Cheers

Julian

Steven Flower

Jul 3, 2010, 5:36:13 AM
to opendatam...@googlegroups.com
I can see the logic of the argument but still prefer the Awesome-o-meter - which was a response app (at a hackday) that told you how many *good* things were in your area.  I'm hoping we get beyond sensationalist reuses of open data like Asborometer, and think context before code.  Stuff like this scares and confuses people, and also reaffirms negative perceptions long held.

I've just checked and Manchester has a 34% "score" and is tenth on the national "leaderboard".  I'm having second thoughts about going to the market now...

As you highlight, the post shows that the developer saw an opportunity and struck lucky. Fair play to him in relation to that.  I'm nervous about the media bandwagon that followed, without proper scrutiny of the context of the original data.

Blimey I sound like some kind of #opendata nimby! I think I will go to the market to reassure myself that Moss Side is actually OK
--
====================
Steven Flower
substance.
3rd Floor
Fourways House
57 Hilton Street
Manchester
M1 2EJ

skype/twitter/identi.ca:stevieflow
0161 244 5440
http://www.substance.coop

Short email? I'm trying to stick to this - http://five.sentenc.es/

If I express an opinion in this email, then it'll be my own.

Sam Smith

Jul 3, 2010, 7:27:02 AM
to opendatam...@googlegroups.com

did you read the discussion about this a few days ago on
mysociety developers-public?


Sam

--
The committed person brings an energy, passion, and
excitement that cannot be generated if you are only
compliant, even genuinely compliant. The committed person
doesn't play by the 'rules of the game.' He is responsible
for the game. If the rules of the game stand in the way of
achieving the vision, he will find ways to change the rules.
A group of people truly committed to a common vision is an
awesome force. They can accomplish the seemingly impossible.
-- Peter M. Senge

Paul Robinson

Jul 3, 2010, 9:17:26 AM
to opendatam...@googlegroups.com
On 3 Jul 2010, at 12:27, Sam Smith wrote:

> did you read the discussion about this a few days ago on
> mysociety developers-public?


No, because I wasn't aware said list existed :-) Like many things inside mySociety it seems to have been cloaked behind a mild veil of clique-ish-ness (or at least not very well publicised outside of the mySociety core), so it's not immediately obvious to most of us on the outside what is available, open and of use.

I've gone through the archives for the last month and can't see the thread you're referring to. I have seen a short cryptic message about FixMyTrans which seems to be a bit like Derek Harvie's Metrolink project but at a larger scale.

Can't see anything about crowd-sourcing real-time public transport usage data and extrapolating that into data that fills the hole left by Traveline's inaction... can you point me to the thread you mean?

Sam Smith

Jul 3, 2010, 10:35:21 AM
to opendatam...@googlegroups.com
On Sat, 3 Jul 2010, Paul Robinson wrote:
>> did you read the discussion about this a few days ago on
>> mysociety developers-public?
>
> No, because I wasn't aware said list existed :-) Like many
> things inside mySociety it seems to have been cloaked
> behind a mild veil of clique-ish-ness (or at least not
> very well publicised outside of the mySociety core), and
> therefore not immediately obvious to most of us outside
> what is available, open and of use.

Fair enough; it is somewhat cunningly hidden behind a
description on the "Get Involved" page linked from the top
of every page of the mySociety website.


> I've gone through the archives for the last month and
> can't see the thread you're referring to. I have seen a
> short cryptic message about FixMyTrans which seems to be
> a bit like Derek Harvie's Metrolink project but at a
> larger scale.

Google hasn't helped inform me what that is. Is there a
link?

FixMyTransport is something else, slightly orthogonal to
that.

> Can't see anything about crowd-sourcing real-time public transport usage data and extrapolating that into data that fills the hole left by Traveline's inaction... can you point me to the thread you mean?

the thread forks a bit immediately after, but there is a
strong discussion after this:
https://secure.mysociety.org/admin/lists/pipermail/developers-public/2010-June/006697.html

Regards
Sam

--
The early bird may get the worm,
but the second mouse gets the cheese.

Derek Harvie

Jul 3, 2010, 11:06:03 AM
to opendatam...@googlegroups.com
>> I've gone through the archives for the last month and
>> can't see the thread you're referring to. I have seen a
>> short cryptic message about FixMyTrans which seems to be
>> a bit like Derek Harvie's Metrolink project but at a
>> larger scale.
>
> Google hasn't helped inform me what that is. Is there a
> link?
>

The idea was born out of frustration from people using the Metrolink
to get to work and not being able to find out how the trams were
running, so it was about attempting to see if there was a way to
crowdsource this data instead...

It hasn't progressed very far, we were mainly toying with ideas of how
to gather the data - then worry about presenting it, etc:

http://metrostink.co.uk/

All feedback welcome!

Simon Ward

Jul 3, 2010, 11:09:37 AM
to opendatam...@googlegroups.com
On Fri, Jul 02, 2010 at 06:51:29PM +0100, Paul Robinson wrote:
> Whilst the NaPTAN database is pretty decent, it has some holes - here
> in Manchester at least, the lack of Metrolink data is pretty key - and
> also there are of course a couple of commercial use license issues
> that some may not be particularly happy with.

If you don’t mind slightly outdated NaPTAN data subsequently “corrected”
by OSMers, Open Street Map imported a snapshot of NaPTAN data some time
back. See the wiki page[1] and those linked from it for all the gory
details. The licensing[2] is CC‐by‐SA (and potentially ODbL).

[1]: http://wiki.openstreetmap.org/wiki/NaPTAN
[2]: http://wiki.openstreetmap.org/wiki/NaPTAN#Licensing

Simon
--
A complex system that works is invariably found to have evolved from a
simple system that works.—John Gall

Sam Smith

Jul 3, 2010, 11:54:48 AM
to opendatam...@googlegroups.com

That's very cute.

geotag tweets with start/stop, and use the hashtag?

You know where the tramlines are, they don't move (and
neither, sometimes, do the trams).


If people tweet (a lot) on the tram each day, and have
geotagging turned on anyway, you don't need them to tweet
about it explicitly, you can just have a bot follow their
tweets and see where they seem to get stuck/delayed etc (ie
tweets over time without the location changing). The bonus
here is that it takes no actual work to provide 90% of the
data each day from the most active users.
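
A rough sketch of the "tweets over time without the location changing" check, in Python. It assumes you already have one user's geotagged points as (timestamp, lat, lon) tuples; the `find_stalls` function, the 100 m and 10 minute thresholds and the sample points are all invented for illustration, and nothing here talks to Twitter. It also can't tell a stuck tram from someone sitting in a cafe, so you'd want to filter to points near the tramlines first.

```python
import math
from datetime import datetime, timedelta


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def find_stalls(points, max_move_m=100, min_duration=timedelta(minutes=10)):
    """Find periods where consecutive points stay near the same place.

    points: list of (timestamp, lat, lon) for one user, sorted by time.
    Returns a list of (start, end) timestamps for each suspected stall.
    """
    stalls = []
    i = 0
    while i < len(points):
        t0, lat0, lon0 = points[i]
        j = i
        # Extend the run while the next point is still close to the first one.
        while (j + 1 < len(points)
               and distance_m(lat0, lon0, points[j + 1][1], points[j + 1][2]) <= max_move_m):
            j += 1
        if points[j][0] - t0 >= min_duration:
            stalls.append((t0, points[j][0]))
        i = j + 1
    return stalls


# Invented sample: three points within a few metres, then a jump of about 1 km.
points = [
    (datetime(2010, 7, 3, 8, 0), 53.4800, -2.2400),
    (datetime(2010, 7, 3, 8, 5), 53.4801, -2.2400),
    (datetime(2010, 7, 3, 8, 14), 53.4800, -2.2401),
    (datetime(2010, 7, 3, 8, 20), 53.4700, -2.2500),
]
stalls = find_stalls(points)
print(stalls)
```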


#metrostink can then be used to add info as to why they're
not travelling based on platform announcements etc (assuming
they do those; they did when I travelled by tram) rather
than a pure data gathering exercise.


And there should be something, e.g. stickers, that people
can print out and, erm, share with others so more people
find out about it... :)


Nice idea. I'd offer to help if I caught the tram.

Cheers
Sam

--
Genius is 1% inspiration and 99% perspiration, which is why
engineers sometimes smell really bad.

sarah hartley

Jul 5, 2010, 4:54:39 AM
to opendatam...@googlegroups.com
Hi Paul et al,
As I think you know, we work closely with mySoc and they have now made this project public: http://www.mysociety.org/2010/03/15/mysocietys-next-12-months-fixmytransport-and-project-fosbury/

I've spoken to Louise Crow there and she's very interested in what you're planning. Her contact address is lou...@mysociety.org

Sarah
--



Sarah Hartley.
Digital editor | Media trainer | Journalist
www.sarahhartley.wordpress.com | @foodiesarah