Pyramid Response with File-like .write() interface


Mikko Ohtamaa

Aug 9, 2022, 12:39:12 PM
to pylons-...@googlegroups.com
Hi all,

I'd like to stream dynamically generated Parquet files from a Pyramid server. The Parquet library itself supports writing to any file-like object. I am aware of the app_iter and FileResponse interfaces in Pyramid. However, does Pyramid (or any example or utility class) offer a Python file-like interface that I could just .write() to dynamically?

Br,
Mikko

Bert JW Regeer

Aug 9, 2022, 1:59:17 PM
to pylons-...@googlegroups.com
Some WSGI servers (wsgiref, for instance) pass you the raw file descriptor as wsgi.input, but this is not guaranteed. Instead, you should return an iterator that can be read incrementally so that your WSGI server can chunk the response.
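
A minimal sketch of the iterator approach, written as a plain WSGI application using only the standard library. The names generate_chunks and app are illustrative, and the chunk contents are a stand-in for real data:

```python
def generate_chunks():
    # Stand-in for real work: each yield hands one chunk to the server.
    for i in range(3):
        yield f"chunk {i}\n".encode("ascii")


def app(environ, start_response):
    # No Content-Length header, since the total size is not known up front.
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    # The server pulls from this generator one chunk at a time.
    return generate_chunks()
```

In a Pyramid view, the same generator could be handed to the response as Response(app_iter=generate_chunks()).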




Michael Merickel

Aug 9, 2022, 2:28:15 PM
to pylons-...@googlegroups.com
This harks back to the discouraged write() callable that the start_response() invocation returns in WSGI (PEP 3333). Both the PEP and Pyramid as a framework would encourage you to map the logic into an app_iter, as Bert suggested.

I think you'll want to define a file-like object that you can write to and set as the app_iter. The question is whether you write to it from a separate thread or in some other way: once you return control to the WSGI server to iterate over your app_iter, you are no longer in control, so you'll need some buffer between the code generating your Parquet file and what the app_iter returns. I don't think a simple tempfile is good enough, because you want the app_iter to wait when it hits EOF rather than stop, unless you know that you've really reached the end of the buffer.
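
The tempfile concern can be illustrated with a small sketch using only the standard library. The file name and chunk contents are made up for the example. A reader that treats an empty read as the end of the stream would stop too early:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "buffer.bin")
    with open(path, "wb") as writer, open(path, "rb") as reader:
        writer.write(b"first")
        writer.flush()
        first = reader.read()      # everything written so far
        early_eof = reader.read()  # empty: looks like EOF, but the writer is not done
        writer.write(b"second")
        writer.flush()
        late = reader.read()       # data a naive "stop at EOF" reader would miss
```

The reader cannot tell "no data yet" from "finished" by looking at the file alone, which is why some explicit end-of-stream signal between writer and reader is needed.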



--

Michael

Theron Luhn

Aug 9, 2022, 3:15:04 PM
to pylons-...@googlegroups.com
Could probably jury rig it together with a Queue.  (Will need to run Parquet in a separate thread.)

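from queue import Queue

class FileLikeIter:
    _EOF = object()  # sentinel marking the end of the stream

    def __init__(self):
        self.q = Queue(1)  # write() blocks while one chunk is still waiting

    def write(self, data):
        self.q.put(data)

    def finish(self):
        # Called by the writer thread when done. Not named close(), because
        # WSGI servers call close() on the app_iter themselves.
        self.q.put(self._EOF)

    def __iter__(self):
        # A blocking get() never raises Empty, so stop on the sentinel.
        while True:
            data = self.q.get()
            if data is self._EOF:
                return
            yield data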

— Theron



Mikko Ohtamaa

Aug 10, 2022, 4:00:29 AM
to pylons-...@googlegroups.com
Hi Michael, others,

Thank you for the good commentary.

This is very good insight; I hadn't thought about the threading issue. I think I will try the approach Theron mentioned: use a buffer, and have the library write to it from another thread.

Br,
Mikko
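
A minimal end-to-end sketch of the buffer-plus-thread approach discussed in this thread, using only the standard library. QueueWriter, produce and streamed_body are hypothetical names, and produce() is a stand-in for the Parquet writer. Whether a particular Parquet library needs more than write() on its target (tell() or flush(), for example) is not covered here:

```python
import queue
import threading


class QueueWriter:
    """Hypothetical file-like object whose written chunks can be iterated."""

    _EOF = object()  # sentinel marking the end of the stream

    def __init__(self, maxsize=8):
        self._q = queue.Queue(maxsize)

    def write(self, data):
        self._q.put(bytes(data))  # blocks while the queue is full
        return len(data)

    def finish(self):
        # Signals end of stream. Deliberately not named close(), since a
        # WSGI server calls close() on the app_iter if that method exists.
        self._q.put(self._EOF)

    def __iter__(self):
        while True:
            chunk = self._q.get()  # blocks until the writer produces something
            if chunk is self._EOF:
                return
            yield chunk


def produce(fileobj):
    # Stand-in for the library call that writes the Parquet file.
    try:
        for i in range(3):
            fileobj.write(f"part {i};".encode("ascii"))
    finally:
        fileobj.finish()  # always signal the end, even if writing fails


def streamed_body():
    out = QueueWriter()
    threading.Thread(target=produce, args=(out,), daemon=True).start()
    return out  # iterable of bytes, usable as an app_iter
```

One caveat with this sketch: if the client disconnects and the server stops iterating, the writer thread stays blocked in write() once the queue fills up, so a real implementation would need some way to cancel it.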