Proposal: Specify that files within the Zip file should have at least readable permissions


Mark Stosberg

May 23, 2023, 11:37:50 AM
to GTFS Changes
The ZIP file format preserves file permissions, but the GTFS spec is currently silent on what they should be. Normally, the *.txt files within GTFS ZIP files have at least readable permissions and often writable permissions as well, which is useful if you need to pre-process the files in place after downloading the ZIP file.

According to the current GTFS spec, it's valid to publish a ZIP file where there is no permission to read the files it contains. For example, here's the ls output of such a zip file:

```
      406      0 drwxr-xr-x   2 mark     mark          280 May 22 14:01 .
      417      4 ----------   1 mark     mark          292 Apr 26 00:16 ./feed_info.txt
      416      4 ----------   1 mark     mark          350 Apr 26 00:16 ./directions.txt
      415      4 ----------   1 mark     mark           89 Apr 26 00:16 ./realtime_routes.txt
      414      4 ----------   1 mark     mark           57 Apr 26 00:16 ./calendar_attributes.txt
      413    268 ----------   1 mark     mark       272141 Apr 26 00:16 ./shapes.txt
      412      4 ----------   1 mark     mark          156 Apr 26 00:16 ./calendar.txt
      411    304 ----------   1 mark     mark       309381 Apr 26 00:16 ./stop_times.txt
      410     12 ----------   1 mark     mark        11132 Apr 26 00:16 ./trips.txt
      409      4 ----------   1 mark     mark          815 Apr 26 00:16 ./routes.txt
      408     16 ----------   1 mark     mark        14618 Apr 26 00:16 ./stops.txt
      407      4 ----------   1 mark     mark          245 Apr 26 00:16 ./agency.txt
```

Notice the complete lack of permissions on the files!

Most GTFS ZIP files already have at least "user readable" permission on the files they contain. The spec could clarify that rare files like the one above are not valid. (Indeed, I only spotted the issue after a ZIP file crashed our automation, which had run for years across ~500 feeds without encountering this problem.)

We pre-process the files in place, so for our case it's useful if the permissions of the ".txt" files allow updating them. I realize it could be confusing if the spec required these files to be "writable", but requiring them to be at least readable seems reasonable!
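
For what it's worth, a workaround on the consumer side can be sketched in a few lines of Python. This is only a sketch, assuming the archive has already been extracted into a directory that the current user owns; `ensure_user_rw` is a hypothetical helper name, not part of any GTFS tooling.

```python
import os
import stat
from pathlib import Path


def ensure_user_rw(directory):
    """Add owner read/write bits to every regular file under `directory`.

    Returns the sorted names of the files whose mode had to be changed.
    """
    fixed = []
    for path in Path(directory).rglob("*"):
        if not path.is_file():
            continue
        mode = stat.S_IMODE(path.stat().st_mode)
        wanted = mode | stat.S_IRUSR | stat.S_IWUSR
        if wanted != mode:
            # The owner of a file may change its mode even when the
            # current mode grants no permissions at all.
            os.chmod(path, wanted)
            fixed.append(path.name)
    return sorted(fixed)
```

Running something like this right after extraction, before any in-place processing, would leave already-readable feeds untouched and only adjust the rare ones.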

Mark Stosberg

Jun 16, 2023, 9:17:03 AM
to GTFS Changes
I dug into this area some more.

The GTFS spec does not specify /which/ ZIP standard must be used. The original spec came from PKWARE and is entitled "APPNOTE.TXT - .ZIP File Format Specification". It can be found here:


This was eventually made into an ISO standard, but there's a fee just to get a copy of it. However, Wikipedia describes it as a subset of the PKWARE standard, lacking features that are not used in GTFS anyway, like encryption.

In practice, the most popular CLI tool for Mac and Linux is "Info-ZIP". It provides the familiar `zip` and `unzip` binaries.

The detail I'm interested in, file permissions, is I believe what the spec refers to as "external file attributes", which are considered host-dependent. In other words, I think for Unix file permissions to work, you may need to both create and unzip a ZIP archive on Unix.
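
To illustrate where these bits live, here is a minimal sketch using Python's standard `zipfile` module. It assumes the common convention that archives made on Unix (host code 3 in the "version made by" field) keep the Unix mode in the upper 16 bits of the external file attributes; `entry_modes` is a hypothetical helper name.

```python
import stat
import zipfile

UNIX_HOST = 3  # "version made by" host code for Unix


def entry_modes(zip_path):
    """Map each entry name to its stored Unix permission bits.

    Entries not marked as created on Unix map to None, because their
    external attributes do not follow the Unix convention.
    """
    modes = {}
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.create_system != UNIX_HOST:
                modes[info.filename] = None
                continue
            # Upper 16 bits hold st_mode; keep only the permission bits.
            modes[info.filename] = stat.S_IMODE(info.external_attr >> 16)
    return modes
```

A feed like the one in my first message would show a mode of 0 for every `.txt` entry, while a typical feed would show something like `0o644`.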

I reviewed the `zip` and `unzip` tools from the Info-ZIP project and couldn't find a way to tell them to ignore file permissions. I was hoping that if an archive was created with "read-only" file permissions, I could ignore them when extracting and fall back to my Unix user's default permissions.
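
One possible way around this, outside of Info-ZIP, is to extract with Python's standard `zipfile` module. As far as I know, it does not apply the stored Unix permission bits when extracting, so new files get the permissions implied by the current user's umask. That behaviour is an assumption worth checking against your Python version, and `extract_without_stored_modes` is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def extract_without_stored_modes(zip_path, dest_dir):
    """Extract every entry into `dest_dir` and return the file names.

    Relies on zipfile creating files with default permissions rather
    than the modes recorded in the archive (an assumption, see above).
    """
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        # Directory entries end with "/"; report regular files only.
        return sorted(n for n in zf.namelist() if not n.endswith("/"))
```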

In summary:

* I don't think it's appropriate to add to the spec that the files should have "at least readable permissions", because it turns out the permissions are host-dependent and, I believe, could be completely ignored depending on what tools are involved.
* It could be useful to clarify which "ZIP" file format specification must be followed, but this is not as clear-cut as I might hope. It's difficult to recommend a standard that people have to pay just to read. Meanwhile, one of the most popular tools, Info-ZIP, has its own extensions, although they might not matter for practical use. So is it worth explicitly mentioning the PKZIP standard? Maybe.

In the end, I'm less sure that there's a great way to avoid rare permission problems using the spec, so I withdraw this proposal.

Thanks,

Elias Gino Cripotos

Jun 16, 2023, 10:17:19 AM
to GTFS Changes
Thank you for notifying the community of the proposal withdrawal.

Have a good weekend,
Elias 
