I am looking to develop a transit app using GTFS static data. One of the constraints I've set for myself is that the app should use minimal mobile data transfers. Therefore, I would like to embed all the data in the app.
My issue is that GTFS data sets are usually quite large (85MB uncompressed for the city of Sydney for example). I've done a bit of reverse engineering on other apps out there and found out that some of them have managed to compress all that data into a much smaller file (I'm talking about a few MB at most).
Using 7zip, I've managed to compress my 85MB data set down to 5MB, which is acceptable for me. The next step is to use that 7z file in my app, and that's where I'm stuck. There's no way I'm going to uncompress it and put it in a SQL database, as that would use too much space on the phone. So I was wondering what my other options are.
Thanks
Hi Thomas,
Consider that the largest table in a GTFS feed is usually stop_times.txt.
First, departure and arrival times can be represented as a number of
seconds (or minutes, 1/2, 1/4, or 1/8 minutes...) after midnight and
stored directly as integers rather than text. Next, consider that the
time difference between an arrival and a departure time is generally
much smaller than the offset of the arrival time from midnight, so the
dwell time can fit into a narrower data type.
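
To make the integer encoding concrete, here is a minimal sketch in Python. The function names and the 6-byte layout (a 32-bit arrival time in seconds plus a 16-bit dwell time) are my own assumptions for illustration, not something GTFS prescribes; a coarser time resolution would let you use narrower fields still.

```python
import struct

def gtfs_time_to_seconds(text):
    """Convert a GTFS 'HH:MM:SS' string to seconds after midnight.

    GTFS allows hours >= 24 for trips that run past midnight,
    so '25:10:00' is valid input.
    """
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def pack_stop_time(arrival_text, departure_text):
    """Pack one arrival/departure pair into 6 bytes (assumed layout)."""
    arrival = gtfs_time_to_seconds(arrival_text)
    dwell = gtfs_time_to_seconds(departure_text) - arrival
    if not 0 <= dwell <= 0xFFFF:
        raise ValueError("dwell time does not fit in 16 bits")
    # "<IH" = little-endian unsigned 32-bit arrival + unsigned 16-bit dwell
    return struct.pack("<IH", arrival, dwell)

def unpack_stop_time(blob):
    """Return (arrival_seconds, departure_seconds) from a packed record."""
    arrival, dwell = struct.unpack("<IH", blob)
    return arrival, arrival + dwell
```

For comparison, the two time columns as text ("08:15:00,08:15:30") take 17 characters, while the packed record takes 6 bytes before any further compression.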
Also, consider that schedule data are often redundant in a way that
general-purpose compression algorithms cannot perceive. Because you have
some insight into the semantics of the data you can spot and exploit
these patterns. Many of the trips in your input GTFS feeds may be exact
copies of other trips shifted in time, or you may be able to shift one
trip in time and subtract the other, storing only small residuals.
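
As a rough sketch of that idea (Python again; the data layout and function names are hypothetical, and a real feed would also need to account for service calendars, headsigns and so on): normalise each trip by subtracting its first time, then group trips whose stop sequence and offsets are identical. Each group stores its pattern once plus a list of start times. For trips that are only nearly identical, the second function computes the residuals against a reference trip.

```python
from collections import defaultdict

def group_shifted_trips(trips):
    """Group trips that are exact copies of each other shifted in time.

    trips: dict mapping trip_id -> list of (stop_id, seconds_after_midnight).
    Returns a dict mapping each distinct pattern, a tuple of
    (stop_id, offset_from_first_stop), to a list of (trip_id, start_time).
    """
    patterns = defaultdict(list)
    for trip_id, stops in trips.items():
        start = stops[0][1]
        pattern = tuple((stop_id, t - start) for stop_id, t in stops)
        patterns[pattern].append((trip_id, start))
    return dict(patterns)

def residuals(reference, other):
    """Shift both trips to start at zero and subtract stop by stop.

    Both arguments are lists of seconds after midnight for the same
    stop sequence. Small residuals can be stored in a narrow data type.
    """
    return [(o - other[0]) - (r - reference[0])
            for r, o in zip(reference, other)]
```

With a feed where most trips on a route repeat the same pattern through the day, the number of distinct patterns is typically far smaller than the number of trips, which is where the savings come from.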