Catching up on email ...
> On Jul 8, 2013, at 6:38 AM, John Szakmeister <jo...@szakmeister.net> wrote:
>> Is there an issue with someone monkeying with parse tables to
>> make something happen that shouldn't happen--a security problem of
>> sorts?
On Jul 8, 2013, at 1:58 PM, David Beazley wrote:
> Hard to say. The new approach wouldn't be using 'import' to load parsing data however. As such, it could be encoded in a different format such as JSON. Maybe it's safer than what's done now.
This sounds like a bad security hole.
On a multiuser system, if I know that user X is using grammar Y located at position Z (assuming those are the three factors that go into making the unique name), then I can use the same algorithm to determine what the filename will be. Let's suppose it's $PARSETAB.
I then write a file to $PARSETAB and make it unreadable. This prevents user X from running the program.
Or, taking Alex's code as an example implementation:
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            data = json.load(f)
        if self.data_is_valid(g, data):
            table = LRTable.from_cache(g, data)
    if table is None:
        table = LRTable.from_grammar(g)
        with open(cache_file, "w") as f:
            json.dump(self.serialize_table(table), f)
I make $PARSETAB a readable named pipe. When someone starts to read from the pipe, my writer process renames $PARSETAB to $PARSETAB.old, then makes a symbolic link from $PARSETAB to an arbitrary file in X's account. The process feeding the named pipe then returns an invalid table, so as to trigger the cache write. Remember, the cache write occurs in a process owned by X.
The cache write occurs, saving to $PARSETAB, but because of the symlink the write actually goes to some other file of X's, of my choosing.
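One way to blunt the read side of this trick is to refuse anything at $PARSETAB that isn't a plain file owned by the current user. The following is only a sketch under POSIX assumptions; open_cache_for_read() is a hypothetical helper, not part of PLY or of Alex's code, and it does nothing about the write side.

```python
import os
import stat


def open_cache_for_read(path):
    """Open `path` only if it is a regular file owned by the current user.

    Returns a text-mode file object, or None if the cache should be ignored.
    Hypothetical helper; relies on POSIX-only calls such as os.getuid().
    """
    # O_NOFOLLOW makes open() fail if the final path component is a symlink.
    # O_NONBLOCK keeps open() from hanging on a named pipe with no writer.
    flags = os.O_RDONLY
    flags |= getattr(os, "O_NOFOLLOW", 0)
    flags |= getattr(os, "O_NONBLOCK", 0)
    try:
        fd = os.open(path, flags)
    except OSError:
        # Missing, unreadable, or a symlink: treat it as "no cache".
        return None
    # Inspect the descriptor we actually opened, not the path, so the
    # file cannot be swapped between the check and the read.
    st = os.fstat(fd)
    if not stat.S_ISREG(st.st_mode) or st.st_uid != os.getuid():
        os.close(fd)
        return None
    return os.fdopen(fd, "r")
```

A caller would treat a None result the same as a missing cache and rebuild the table from the grammar.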
Even more fun, I can make that file a readable file, but with a different grammar than what's expected. Imagine a command-oriented language:
    open "filename"
    list 10-30
    quit
which also has a command:
    unlink "filename"
I, being who I am, might provide an alternate parsetab grammar which maps the "open" token to the unlink rule. Now when X's program reads the Y command to "open" it actually deletes the file.
Even if there's nothing so obviously security-sensitive as that in the grammar, it's still plenty easy for me to introduce an alternate parser definition which can mess things up for X, like swapping "+" and "-".
In short, I don't see any way to get what you want with /tmp and still be secure in a multi-user system with possible malicious users. Your safer bet is to default to a $HOME/.ply-cache directory.
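As a rough illustration of the $HOME/.ply-cache idea, something like the following could create and sanity-check a private cache directory. It is a sketch under POSIX assumptions; get_cache_dir() and its `home` parameter are hypothetical names, not an existing PLY API.

```python
import os
import stat


def get_cache_dir(home=None):
    """Return a private per-user cache directory, creating it if needed.

    Hypothetical helper; `home` defaults to the current user's home directory.
    """
    if home is None:
        home = os.path.expanduser("~")
    cache_dir = os.path.join(home, ".ply-cache")
    try:
        # Mode 0o700: only the owner may list, read, or write entries.
        os.mkdir(cache_dir, 0o700)
    except FileExistsError:
        pass
    # lstat() does not follow symlinks, so a planted link fails the checks.
    st = os.lstat(cache_dir)
    if not stat.S_ISDIR(st.st_mode):
        raise RuntimeError("%s is not a directory" % cache_dir)
    if st.st_uid != os.getuid():
        raise RuntimeError("%s is owned by another user" % cache_dir)
    if st.st_mode & 0o022:
        raise RuntimeError("%s is writable by other users" % cache_dir)
    return cache_dir
```

Because other users cannot create entries in that directory, they cannot pre-plant a pipe, a symlink, or an alternate table there.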
Even then, with non-malicious cases, there are still timing problems. What happens when two instances start at the same time and try to cache the parsetab file? In Alex's code it may produce a ValueError if one process has only managed to write part of the cache when the other process tries to read it. So at the very least it needs to be more robust against odd sorts of timing failures.
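A common way to handle that race is to write the cache to a temporary file in the same directory and rename it into place, and to treat an unparseable cache as a miss rather than an error. This is a minimal sketch, assuming Python 3.3+ for os.replace(); both function names are hypothetical and not taken from Alex's code.

```python
import json
import os
import tempfile


def write_cache_atomically(cache_file, data):
    """Write `data` as JSON so readers never see a partially written file."""
    directory = os.path.dirname(os.path.abspath(cache_file))
    # The temporary file must live in the same directory (same filesystem)
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        # Replaces the directory entry in one step. If cache_file is a
        # symlink, the link itself is replaced, not the file it points to.
        os.replace(tmp_path, cache_file)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise


def read_cache(cache_file):
    """Return the cached data, or None if it is missing or unparseable."""
    try:
        with open(cache_file) as f:
            return json.load(f)
    except (OSError, ValueError):
        # json.load() raises ValueError (JSONDecodeError) on bad input.
        return None
```

With this arrangement two instances starting together may both rebuild and both write, but each reader sees either the old complete file or the new complete file.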
I would still want some way to override where the parsetab comes from. I don't like assuming that I have a writeable disk. get_table_data() and set_table_data() seem like they would be fine.
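To make the override idea concrete, here is one possible shape for it. Only the names get_table_data() and set_table_data() come from this thread; the signatures, the `key` argument, the classes, and load_table() are all my assumptions for illustration.

```python
class TableStore:
    """Hypothetical base class: the default store caches nothing."""

    def get_table_data(self, key):
        # Return previously stored table data, or None if there is none.
        return None

    def set_table_data(self, key, data):
        # Default: discard, so no writable disk is ever required.
        pass


class MemoryTableStore(TableStore):
    """Keeps table data in memory for the lifetime of the process."""

    def __init__(self):
        self._tables = {}

    def get_table_data(self, key):
        return self._tables.get(key)

    def set_table_data(self, key, data):
        self._tables[key] = data


def load_table(store, key, build):
    """Fetch table data from `store`, calling build() only on a miss."""
    data = store.get_table_data(key)
    if data is None:
        data = build()
        store.set_table_data(key, data)
    return data
```

A subclass could just as well read from a read-only install directory, a database, or a package resource, and the parser generator would not need to know which.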
Cheers,
Andrew
da...@dalkescientific.com