Unicode filenames

Marty Alchin

Jan 15, 2009, 10:33:05 AM
to django-d...@googlegroups.com
While working again with files for model validation, I realized (and
confirmed with Russ, Honza and Alex in IRC) that the tests put in as a
fix for #6009[1] don't actually prove all the behavior that ticket
refers to. They prove that Unicode filenames come through fine from
uploads, but that test doesn't actually save the file at all. The
changes I'm working on for validation cause it to save the file now,
and the second part of that ticket (dropping too many characters from
the filename) comes back to bite us.

I'm not sure if get_valid_filename() needs to be as aggressive as it
currently is, because a bit of light reading suggests that we should
be able to work with Unicode filenames, now that Django passes around
Unicode strings properly throughout. Alas, I'm not an expert in
Unicode or file systems, so I don't know if I'm overlooking some
obvious problem with opening the range up to include Unicode
characters.
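
To make the concern concrete, here is a rough sketch in plain Python (not
Django's actual code) of how an ASCII-only whitelist compares with a
Unicode-aware one. Both function names are hypothetical, and the ASCII
pattern is only an assumption about what an "aggressive" cleaner looks
like; the real get_valid_filename() may differ in its details.

```python
import re


def ascii_only_filename(name):
    # Hypothetical aggressive cleaner: keep only ASCII letters, digits,
    # dash, underscore and dot. Everything else is dropped.
    name = name.strip().replace(" ", "_")
    return re.sub(r"[^-A-Za-z0-9_.]", "", name)


def unicode_aware_filename(name):
    # Hypothetical relaxed cleaner: \w matches Unicode word characters
    # for str patterns, so accented letters survive.
    name = name.strip().replace(" ", "_")
    return re.sub(r"[^-\w.]", "", name)


# Precomposed e-acute (U+00E9) written as an escape to avoid encoding issues.
original = "r\u00e9sum\u00e9 caf\u00e9.txt"

print(ascii_only_filename(original))     # rsum_caf.txt
print(unicode_aware_filename(original))  # résumé_café.txt
```

The relaxed version still strips path separators and other punctuation.
Whether the resulting names are safe on every file system is exactly the
open question, and this sketch does not settle it.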

I'd like to get this fixed one way or another before putting my file
patch into model validation, because it introduces a mostly unrelated
failure that I'd rather avoid. Does anybody with more experience in
this area have any thoughts on how we can go about it? For reference,
you can spot the failure by applying a simple patch to the test.[2]

-Gul

[1] http://code.djangoproject.com/ticket/6009
[2] http://media.martyalchin.com/6009-test.diff

Marty Alchin

Jan 15, 2009, 7:53:16 PM
to django-d...@googlegroups.com
On Thu, Jan 15, 2009 at 10:33 AM, Marty Alchin <gulo...@gamemusic.org> wrote:
> While working again with files for model validation, I realized (and
> confirmed with Russ, Honza and Alex in IRC) that the tests put in as a
> fix for #6009[1] don't actually prove all the behavior that ticket
> refers to. They prove that Unicode filenames come through fine from
> uploads, but that test doesn't actually save the file at all. The
> changes I'm working on for validation cause it to save the file now,
> and the second part of that ticket (dropping too many characters from
> the filename) comes back to bite us.

Correction: the failure was actually caused by an error in the test
itself; the problem just wasn't evident until my changes went in. I've
filed ticket #10041 to address the problem with the test.

> I'm not sure if get_valid_filename() needs to be as aggressive as it
> currently is, because a bit of light reading suggests that we should
> be able to work with Unicode filenames, now that Django passes around
> Unicode strings properly throughout. Alas, I'm not an expert in
> Unicode or file systems, so I don't know if I'm overlooking some
> obvious problem with opening the range up to include Unicode
> characters.

This is still a problem, though it's not the cause of any test
failures. The second part of #6009 is still a potential issue, and I'm
still not sure how best to address it. I'm not going to reopen that
ticket, though, nor will I open a new one until I know a bit more
about how to continue.

-Gul
