Open Transit Data - Delhi


Sarath Guttikunda

23 Nov 2018, 08:53:53
to data...@googlegroups.com
https://twitter.com/suraiya95/status/1065921782325145600
Open data platform for public transit in Delhi is live, with route and bus shelter details and live GPS of cluster buses. Many thanks to @IIITDelhi team led by @pravesh who made it for us! http://otd.delhi.gov.in  We will add DTC buses when the new GPS vendor is in place.

Yogada Joshi

26 Nov 2018, 13:03:58
to datameet
Hi,

Do you have any idea where I could get transit data for Bangalore ?

Arun Ganesh

26 Nov 2018, 15:20:03
to data...@googlegroups.com
Was anyone able to convert the GTFS feed into a geojson? 

Tried https://github.com/BlinkTagInc/gtfs-to-geojson but for some reason does not produce any route lines.

Nikhil VJ

29 Nov 2018, 06:47:39
to data...@googlegroups.com
Hi Arun,

This data doesn't include a shapes.txt file, which that script probably requires. shapes.txt is not mandatory in GTFS; the routes are defined as sequences of stops in stop_times.txt (multiplied by the number of trips in a day, that is).
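For anyone else stuck at the gtfs-to-geojson step: since there is no shapes.txt, route lines can be stitched together from the stop sequences instead. A minimal sketch in plain Python (my own idea, not anything from the project team), assuming the rows come from csv.DictReader over stops.txt, trips.txt and stop_times.txt, and taking one representative trip per route:

```python
from collections import defaultdict

def route_linestrings(stops, trips, stop_times):
    """Build one GeoJSON LineString per route by joining stop coordinates
    along the stop sequence of one representative trip of that route.
    Each argument is a list of dicts (csv.DictReader rows from stops.txt,
    trips.txt and stop_times.txt respectively)."""
    coords = {r['stop_id']: (float(r['stop_lon']), float(r['stop_lat']))
              for r in stops}
    rep = {}                          # route_id -> first trip_id seen
    for r in trips:
        rep.setdefault(r['route_id'], r['trip_id'])
    wanted = {trip: route for route, trip in rep.items()}
    seqs = defaultdict(list)          # route_id -> [(seq, stop_id), ...]
    for r in stop_times:
        if r['trip_id'] in wanted:
            seqs[wanted[r['trip_id']]].append(
                (int(r['stop_sequence']), r['stop_id']))
    features = []
    for route_id, seq in sorted(seqs.items()):
        line = [coords[s] for _, s in sorted(seq) if s in coords]
        features.append({'type': 'Feature',
                         'properties': {'route_id': route_id},
                         'geometry': {'type': 'LineString',
                                      'coordinates': line}})
    return {'type': 'FeatureCollection', 'features': features}
```

Dump the result with json.dump and it should open in QGIS or geojson.io, with the caveat that the lines cut straight between stops since there is no real shape data.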

There's room for improvement here. Here's a full GTFS validator output for the Delhi data: http://nikhilvj.co.in/files/delhi_gtfs/delhi-gtfs-2.html

One peculiarity: the routes have been split up into separate onward and return journey routes.

If anybody knows someone on the technical team of this, kindly connect me with them. The project leads are probably too busy handling realtime data access requests and won't take kindly to feedback about what could be improved on the static side, but I might be able to put something across to the technical folks.

You can zip up and import the static GTFS files into the static GTFS Manager tool. If someone wants to draw the shapefiles of the routes and add them in, the "Default Sequence" page will help you do that.

--
Cheers,
Nikhil VJ
+91-966-583-1250
Pune, India
http://nikhilvj.co.in



Nikhil VJ

13 Mar 2019, 11:08:28
to datameet
Hi Folks,

Sometime last month my API key for the realtime feed of Delhi bus data started working. (link to register for yours).

Here's an "unboxing" of one GTFS-realtime VehiclePositions feed file from there:


Note: I'm guessing this is not DTC but other bus services operating in Delhi.


- Nikhil VJ
Pune, India




Johnson Chetty

14 Mar 2019, 02:20:24
to data...@googlegroups.com

This is helpful.
Thanks for updating this, Nikhil.
This thread was from November and you still took the trouble to find it and post an update.

Thanks, 




--
Warm Regards,
Johnson Chetty




Nikhil VJ

21 Mar 2019, 03:54:48
to datameet
Hi folks,

I have been archiving Delhi's bus realtime GTFS feeds on my server for a month now and collating the data into a flat CSV. Sharing that data at this link for download and analysis: https://server.nikhilvj.co.in/place1/

Hoping some folks can do some analysis or visualization with it - I don't have time to delve too much into that right now. It's been a great learning experience arranging the scripts and structures on my DigitalOcean server to make this long-term continuous archival possible.

Disclaimer: my default reply to every piece of sage advice that starts with "Why don't you..." is: "Sounds good, please do it and get back with the results." I'm satisfied at my end and am sharing the data wealth here for others to take forward, so don't bug me, just take it and go! ;)


Some more notes:
1. Get 7-Zip portable / p7zip-full to uncompress it. Uncompressed it's around 6 GB; compressed, about half a GB.
2. There may be many repetitions in the data, since the feeds came in every 30 secs, and as the analysis in my last email showed, there were repetitions within a single feed too. So there's a data-cleaning challenge here for you: removing the repetitions. (Do it - don't expect things to be already done for you unless you're paying a fortune for it!)
3. If there is too much traffic on my server, I'll lock it all down with username-password restrictions. So don't do silly things like telling a whole class of students to download it online only. Use a pen drive or your LAN.
4. There is an accompanying reports CSV that gives file-level summaries.
5. Timestamps are in epoch format in the UTC timezone (as per the GTFS-RT spec). Look up "epoch converter". In the reports file I've added 5.5 hrs to get human-readable times in IST.
6. The data covers all dates from 19 Feb onwards. Going forward I might make the scripts store things month-wise or week-wise; here it was important to start ASAP.
7. Early every morning my scripts place a fresh version of the data there and remove the previous day's, so don't be downloading stuff from there at 5 am.
8. Tip: Python? Wanna map? Check out folium.
9. Tip: folium? Wanna share the ipynb notebook? Check out kyso.io
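The epoch-to-IST conversion from note 5 and the de-duplication challenge from note 2 are both a few lines of standard-library Python. A sketch - the 'vehicle_id' and 'timestamp' keys here are hypothetical, adjust them to the actual CSV headers:

```python
from datetime import datetime, timezone, timedelta

IST = timezone(timedelta(hours=5, minutes=30))  # UTC+5:30

def epoch_to_ist(ts):
    """Convert a GTFS-RT epoch timestamp (seconds, UTC) to an IST string."""
    return datetime.fromtimestamp(int(ts), tz=IST).strftime('%Y-%m-%d %H:%M:%S')

def drop_duplicate_positions(rows):
    """Keep only the first occurrence of each (vehicle, timestamp) pair.
    `rows` is an iterable of dicts, e.g. csv.DictReader over the flat CSV."""
    seen, out = set(), []
    for r in rows:
        key = (r['vehicle_id'], r['timestamp'])
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out
```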

PS: Thanks, JohnsonC, for the kind words. But that's just because I use datameet via Google Groups instead of my mailbox, unless it's an immediate follow-up. So it works like StackExchange for me, and it saves me time and effort.

Cheers
Nikhil VJ
Pune, India

Devdatta Tengshe

22 Mar 2019, 00:23:06
to data...@googlegroups.com
Hi Nikhil,

Thanks for sharing this data.
I had a question about the 'delhi_vehicle_reports.csv' file.

Is there any documentation about the fields in this file?

I see the following headers:
file,num_vehicles,feed_timestamp,feed_time,incrementality,bad_count,earliest,diff1,latest,diff2
& I'm wondering what they are

Regards,
Devdatta

Nikhil VJ

22 Mar 2019, 01:42:57
to datameet
Hi Devdatta,

Thanks for asking, happy to provide the documentation here:

file : name of the GTFS-RT feed file that was processed (there's one every 30 secs, so named by epoch timestamp of downloading)
num_vehicles : number of vehicle locations (entities) in the feed
feed_timestamp : timestamp of the feed from its header. If you get the same value as the last feed, it's a duplicate. (Whoops, I just found out there were some 13k dupes in my data!)
feed_time : human, IST form of timestamp
incrementality : that's another field in the feed's header
bad_count : number of entities that had missing or flawed lat-longs like 0,0
earliest : timestamp of the earliest (farthest back in time) vehicle entity in the file (apart from feed timestamp, each vehicle location 'entity' must carry a timestamp of its own, because we can't assume all 700-odd vehicles sent in their lat-longs at exactly the same time. We're doing asynchronous business here.)
diff1 : seconds gap between earliest and feed_timestamp
latest : time of latest entity
diff2 : seconds gap between latest and feed_timestamp

What diff1 and diff2 tell me: how "dated" and how "recent" the information in the feed is. If diff2 were consistently too large, I wouldn't bother downloading a fresh feed every 30 secs, which is the standard minimum refresh interval as per the GTFS-RT spec.
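Those two columns are easy to reproduce once the entity timestamps have been pulled out of a feed as integers - a sketch:

```python
def feed_latency(feed_timestamp, entity_timestamps):
    """Reproduce the earliest/diff1/latest/diff2 columns: how stale the
    oldest vehicle report is and how fresh the newest one is, relative
    to the feed header's timestamp. All inputs are epoch seconds."""
    earliest, latest = min(entity_timestamps), max(entity_timestamps)
    return {'earliest': earliest, 'diff1': feed_timestamp - earliest,
            'latest': latest, 'diff2': feed_timestamp - latest}
```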

For reference, Sample json output from a feed:

{'header': {'gtfsRealtimeVersion': '2.0',
  'incrementality': 'FULL_DATASET',
  'timestamp': '1550596818'},
 'entity': [{'id': 'vehicle',
   'vehicle': {'trip': {'tripId': '6255', 'routeId': '225'},
    'position': {'latitude': 28.610946655273438,
     'longitude': 76.980224609375,
     'speed': 0.0},
    'timestamp': '1550596773',
    'vehicle': {'id': 'DL1PD0716', 'label': 'DL1PD0716'}}},
  ...]}
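Once the protobuf feed has been decoded into a dict of this shape (e.g. with the gtfs-realtime-bindings package plus protobuf's MessageToDict, which produces the camelCase keys seen above), flattening it into one CSV row per vehicle is a short function. A sketch over that dict shape:

```python
def flatten_feed(feed):
    """Turn a decoded GTFS-RT VehiclePositions dict into flat rows,
    one per vehicle entity, ready for csv.DictWriter."""
    rows = []
    header_ts = feed['header']['timestamp']
    for ent in feed.get('entity', []):
        v = ent.get('vehicle', {})
        trip, pos = v.get('trip', {}), v.get('position', {})
        rows.append({'feed_timestamp': header_ts,
                     'trip_id': trip.get('tripId'),
                     'route_id': trip.get('routeId'),
                     'vehicle_id': v.get('vehicle', {}).get('id'),
                     'lat': pos.get('latitude'),
                     'lon': pos.get('longitude'),
                     'speed': pos.get('speed'),
                     'timestamp': v.get('timestamp')})
    return rows
```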


--
Cheers,
Nikhil VJ, Pune, India
http://nikhilvj.co.in

Nikhil VJ

28 Apr 2019, 14:00:45
to datameet
Hi Folks,

Early this month the realtime GTFS feed of Delhi transit had become inflated with duplicate/redundant entries and buggy - they announced a downtime to make fixes.

Maybe it's the election code-of-conduct or some other pending work for which no announcement has been made, but I checked again today and the Delhi realtime GTFS is now awesome! A clean and crisp feed: no repetitions, very good latency, lat-longs that seem accurate, and clear route and trip ids.

I've deployed the live feed on a map visualization called OneBusAway visualizer

I also downloaded their static GTFS data zip, and it's visualized here using Transitfeed's ScheduleViewer:


I'm not getting too deep into analysing these datasets right now, but I am restarting the long-term archival of the realtime feed, which I had discontinued last month owing to the aforementioned issues.


Note: the viz programs behind these tend to crash after prolonged uptime (hey, I just use 'em - this is some pretty complicated programming, so I'm glad they work and are open-source!), so I've set them to restart every hour, and they take a couple of minutes to rev up. If you find a link not working, please check back in 5-10 mins; if it's still down, let me know. You can email me at nikhil.js [at] gmail.com.


Regards
Nikhil VJ
Pune, India

Nikhil VJ

21 Jun 2019, 05:51:05
to datameet
Hi Folks,

I caught up with this project over the last week, revamped my code and methods, and now have a workflow in place where:

1. The GTFS realtime VehiclePositions data of buses released by Open Transit Data Delhi is archived once every minute.
2. A clearing-house system processes the data files at regular intervals and appends them to a CSV.
3. Duplicate and bad data are detected and skipped.
4. Summary data with per-file stats is logged to accompany the raw data collection.
5. Subsets of the summary data are prepared for visualisation.
6. Two time-series visualisations are now up, showing the metrics of the realtime data and its latency.
7. The full data collected since 28 April is compressed into a .7z archive every morning and posted online for download.

You'll find everything here:

Also, just to repeat a link shared earlier, visualisation of current vehicle locations on a map: http://server.nikhilvj.co.in:8080/

The .7z has full raw data - inviting you to dig into it.
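For anyone wanting to run a similar archive: the once-a-minute capture in step 1 reduces to a small polling loop. A stdlib-only sketch - the feed URL shown in the comment is a placeholder, not the real Open Transit Data Delhi endpoint; substitute your registered URL and API key:

```python
import pathlib
import time
import urllib.request

def snapshot_path(directory, ts):
    """Name each snapshot by its epoch download time, so files sort
    chronologically and gaps in the archive are easy to spot."""
    return pathlib.Path(directory) / f'{int(ts)}.pb'

def archive_forever(url, directory='archive', interval=60):
    """Poll the realtime feed every `interval` seconds and save each
    snapshot. `url` must include your registered API key."""
    pathlib.Path(directory).mkdir(exist_ok=True)
    while True:
        try:
            data = urllib.request.urlopen(url, timeout=30).read()
            snapshot_path(directory, time.time()).write_bytes(data)
        except Exception as e:      # keep archiving through transient errors
            print('fetch failed:', e)
        time.sleep(interval)

# archive_forever('https://example.org/VehiclePositions.pb?key=YOUR_KEY')
```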


Cheers
Nikhil VJ, Pune, India

Guneet Narula

22 Jun 2019, 05:48:31
to datameet
Amazing! Thanks, Nikhil. Just downloaded the raw data; will explore it.