I have some scenarios where I need to do processing
(envelope decryption) on a file from S3 before letting the user download it, and this is how I handle it as well.
1. Download giant file from S3 to temporary file.
2. Process file to another temporary file.
3. Return a FileResponse wrapping the temporary file.
4. Rely on the WSGI iterator protocol: the server invokes the close() method on the iterator when the request is cleaned up, which bubbles up and deletes the wrapped temporary file.
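In Django, FileResponse wires up step 4 for you, but the underlying idea is simple enough to sketch with just the standard library. This is a minimal illustration of mine (the class name and chunk size are my own, not from any framework): an iterator that streams a temporary file in chunks and deletes it when the server calls close() during request cleanup.

```python
import os
import tempfile


class TempFileIterator:
    """Stream a temp file in chunks; delete it when the WSGI
    server calls close() during request cleanup."""

    def __init__(self, path, chunk_size=64 * 1024):
        self.path = path
        self.file = open(path, "rb")
        self.chunk_size = chunk_size

    def __iter__(self):
        return self

    def __next__(self):
        chunk = self.file.read(self.chunk_size)
        if not chunk:
            raise StopIteration
        return chunk

    def close(self):
        # Invoked by the server when the response is done;
        # this is where the temp file cleanup "bubbles up".
        self.file.close()
        os.unlink(self.path)


# Demo: stand in for the processed temp file from step 2.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"decrypted payload")
tmp.close()

stream = TempFileIterator(tmp.name, chunk_size=4)
body = b"".join(stream)  # what the server would send to the client
stream.close()           # what the server calls at cleanup time
```

After close() returns, the temporary file no longer exists on disk, which is exactly the cleanup behavior step 4 relies on.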
My solution is not ideal: there is a lag while the file is downloaded from storage into a temporary file and processed before I return the iterator. In practice, though, it doesn't blow out memory (yay), and since the throughput between S3 and EC2 is great, it handles >1 GB files with only a slight lag (a couple of seconds, IIRC).
I would say that if you don't need to do any processing, Theron's S3 suggestion is definitely better, assuming you can expose those endpoint details to clients.