How can I get boto to stop urlencoding the Content-Disposition header incorrectly?

dpapathanasiou

Mar 7, 2013, 9:16:25 AM
to boto-users
I have some pdf files I'm saving to an S3 bucket.

Internally, the filenames are uuids, but since I want to have them
human-readable by the people who download them from the bucket, I'm
setting a 'Content-Disposition' value when I upload them.

The value of the 'Content-Disposition' key looks like this, e.g.,

"attachment; filename*=UTF-8''My Human-Readable Filename.pdf"

But when I view the metadata information on the S3 console, it has
been urlencoded like this:

"attachment%3B+filename%2A%3DUTF-8%27%27My+Human-Readable+Filename.pdf"

So when I try to download the file (it's public) directly from the S3
link, it ignores the human-readable filename and presents me with
"[uuid].pdf" as the filename.

If I edit the metadata in the S3 console back to this:

"attachment; filename*=UTF-8''My+Human-Readable+Filename.pdf"

then when I download it from the S3 link, it gives me
"My Human-Readable Filename.pdf" as the filename, which is what I want.

I've tried this two different ways in my code.

First I tried creating the content disposition string value and sending
it as part of the headers when I push the file:

k = Key(bucket)
k.key = file_base_name(filename)
# set the content-disposition so that when downloaded it has the
# human-readable title instead of the article_id
content_disposition = u''.join(["attachment; filename*=UTF-8''",
                                human_readable_filename,
                                file_extension(filename)])
k.set_contents_from_filename(filename,
                             headers={'Content-Disposition': content_disposition})

I also tried posting it as metadata before calling
set_contents_from_filename() like this:

# set the content-disposition so that when downloaded it has the
# human-readable title instead of the article_id
content_disposition = u''.join(["attachment; filename*=UTF-8''",
                                human_readable_filename,
                                file_extension(filename)])
k.set_metadata('Content-Disposition', content_disposition)
k.set_contents_from_filename(filename)

But in both cases, the result is the same; boto adds unwanted
urlencoding to the value of the 'Content-Disposition' and the download
later from the bucket doesn't work as expected.

Any way around this?
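
For reference, the encoded value reported above is exactly what form-style URL encoding produces for the intended header. The snippet below is only a minimal sketch that reproduces the observation using Python 3's standard library (`urllib.parse.quote_plus`; in the Python 2 of this thread the same function lives at `urllib.quote_plus`). That boto applies this particular function internally is an assumption, not something confirmed in this thread.

```python
from urllib.parse import quote_plus

# The header value the poster intended to send.
intended = "attachment; filename*=UTF-8''My Human-Readable Filename.pdf"

# Form-style encoding: spaces become "+", and ";", "*", "=", "'" are
# percent-escaped, which destroys the header's syntax.
encoded = quote_plus(intended)
print(encoded)
```

Because the separators themselves get escaped, a browser can no longer parse the value as a Content-Disposition header, which would explain the fallback to "[uuid].pdf".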

to...@post.fm

Mar 14, 2013, 2:24:29 PM
to Dpapathanasiou, Boto-users
I just ran into this issue myself. After a quick dig in the code, the simplest way around it seems to be to ensure that you pass a str for the header and not a unicode object:
headers['Content-Disposition'] = "attachment; filename*=UTF-8''" + os.path.basename(filename).encode('utf-8','ignore')
Hope that helps.

Denis Papathanasiou

Mar 15, 2013, 10:18:20 AM
to Boto-users
Hi Tony,

On Thu, Mar 14, 2013 at 2:24 PM, to...@post.fm <to...@post.fm> wrote:
> On Mar 07, 13 at 02:16 PM, Dpapathanasiou <denis.pap...@gmail.com> wrote:
[snip]
> I just ran into this issue myself. After a quick dig in the code, simplest way around it seems to be to ensure that you pass a str for the header and not a unicode object:
> headers['Content-Disposition'] = "attachment; filename*=UTF-8''" + os.path.basename(filename).encode('utf-8','ignore')
> Hope that helps.

Thanks, that certainly solves the problem when the file names are
strings that can be represented in ASCII.
I do, however, have plenty of files whose names are foreign-language
text such as Japanese, so it remains unsolved in that context.

Anthony Morgan

Mar 25, 2013, 1:25:11 PM
to Dpapathanasiou, Boto-users
On Mar 15, 13 at 02:18 PM, Denis Papathanasiou wrote:
> [snip]
> I do, however, have plenty of files which correspond to foreign
> language text such as Japanese, and so it remains unsolved in that
> context.

In which case you'll want to encode the header per RFC 6266 yourself first. There is an rfc6266 package for this:

headers['Content-Disposition'] = rfc6266.build_header(os.path.basename(filename))

Should probably note this only works on modern browsers though (IE > 8 IIRC).
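
If adding a dependency is not an option, the `filename*=UTF-8''...` form can also be built by hand by percent-encoding the UTF-8 bytes of the filename, which is the extended-parameter encoding (RFC 5987) that RFC 6266 refers to. The following is a rough sketch in Python 3 using only the standard library; the helper name `content_disposition` is hypothetical, and it is not meant as a substitute for everything the rfc6266 package does (for example, it adds no plain `filename=` fallback for older browsers).

```python
from urllib.parse import quote

def content_disposition(filename):
    """Build an attachment header with a percent-encoded UTF-8 filename.

    Hypothetical helper, not part of boto or the rfc6266 package.
    """
    # safe="" leaves only letters, digits and "_.-~" unescaped; every
    # other character is emitted as %XX escapes of its UTF-8 bytes.
    encoded = quote(filename, safe="")
    return "attachment; filename*=UTF-8''" + encoded

print(content_disposition("My Human-Readable Filename.pdf"))
print(content_disposition("日本語.pdf"))
```

The result contains only ASCII characters, so it should be possible to pass it as a plain str header as suggested earlier in the thread, though that combination was not tested here.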