How can I upload files asynchronously using aiohttp


yxr...@gmail.com

Apr 25, 2020, 10:05:41 AM
to aio-libs
I have some experience with other asynchronous web frameworks, such as OpenResty, but I am new to aiohttp.
I am building an HTTP server that lets users upload files asynchronously.
Imagine a scenario where user A starts uploading a large 10GB file and, at the same time, user B uploads a small 100KB file.
In a synchronous server, user B is blocked until user A's request has been processed, that is, until the large file has been uploaded completely.
I expect aiohttp to mitigate this: user B's request should get a chance to be processed while the time-consuming request from user A is still in progress.

I wrote a simple demo:

server.py
import os
import time

import asyncio
from aiohttp import web


async def upload(request):
    print('upload...')
    reader = await request.multipart()
    # reader.next() will `yield` the fields of your form
    field = await reader.next()
    assert field.name == 'mp3'
    filename = field.filename
    # You cannot rely on Content-Length if transfer is chunked.
    size = 0
    with open(os.path.join('upload_'+filename), 'wb') as f:
        while True:
            chunk = await field.read_chunk()  # 8192 bytes by default.
            if not chunk:
                break
            size += len(chunk)
            time.sleep(1) # simulate the time consuming ops
            # await asyncio.sleep(1) # this works but I don't want to yield manually
            print('writing %s ...' % filename)
            f.write(chunk)
    return web.Response(text='{} sized of {} successfully stored'
                             ''.format(filename, size))


app = web.Application()
app.add_routes([
    web.static('/', './'),
    web.post('/upload', upload),
    ])

if __name__ == '__main__':
    web.run_app(app)


index.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Document</title>
</head>
<body>
    <form action="upload" method="post" accept-charset="utf-8"
      enctype="multipart/form-data">

        <label for="mp3">Mp3</label>
        <input id="mp3" name="mp3" type="file" value=""/>

        <input type="submit" value="submit"/>
    </form>
</body>
</html>


I opened two Chrome tabs to simulate two users.
I uploaded 1.jpg in tab 1 and 2.jpg in tab 2.
The output of server.py is always:
$ python server.py
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
upload...
writing 1.jpg ...
writing 1.jpg ...
writing 1.jpg ...
writing 1.jpg ...
writing 1.jpg ...
writing 1.jpg ...
upload...
writing 2.jpg ...
writing 2.jpg ...
writing 2.jpg ...
writing 2.jpg ...
writing 2.jpg ...
writing 2.jpg ...
writing 2.jpg ...

Within the upload handler, coroutines such as multipart, next, and read_chunk do not yield as expected.
I expect the second request to start being processed while the first is still in progress, even though the total time for the two requests would still be the sum of both, because time.sleep() blocks the thread.
Can somebody help me understand what is happening and what the right way to do this is?
Many thanks.
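
One common way to keep blocking work from stalling other requests is to hand it to a thread pool and await the result. The following is only a minimal sketch using the standard library, not aiohttp's own recommendation: save_chunks and fake_upload are hypothetical names, and fake_upload merely stands in for repeatedly awaiting field.read_chunk().

```python
import asyncio
import os
import tempfile


async def save_chunks(chunks, path):
    """Write an async stream of byte chunks to path, offloading each write."""
    loop = asyncio.get_running_loop()
    size = 0
    with open(path, "wb") as f:
        async for chunk in chunks:
            size += len(chunk)
            # f.write blocks, so run it in the default thread pool; the
            # handler suspends here and other requests can make progress.
            await loop.run_in_executor(None, f.write, chunk)
    return size


async def fake_upload(n_chunks, chunk_size):
    # Hypothetical stand-in for a loop over `await field.read_chunk()`.
    for _ in range(n_chunks):
        yield b"x" * chunk_size


async def main():
    with tempfile.TemporaryDirectory() as tmp:
        big = os.path.join(tmp, "upload_big.bin")
        small = os.path.join(tmp, "upload_small.bin")
        # Both "uploads" are written concurrently on one event loop.
        sizes = await asyncio.gather(
            save_chunks(fake_upload(20, 8192), big),
            save_chunks(fake_upload(2, 8192), small),
        )
        return sizes, os.path.getsize(big), os.path.getsize(small)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In the real handler, the chunks would come from field.read_chunk() instead of fake_upload. On Python 3.9 and later, asyncio.to_thread(f.write, chunk) is a shorter spelling of the same idea.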

yxr...@gmail.com

Apr 25, 2020, 10:16:36 AM
to aio-libs
In my last experiment, 1.jpg was 45KB and 2.jpg was 51KB.
If I replace 1.jpg with a 4GB file and pass a chunk size of 1MB to read_chunk, it works as expected.
Though I don't understand why read_chunk did not yield in my first try, the code works now. Thanks.
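
A likely explanation for the difference: awaiting a coroutine only hands control back to the event loop if that coroutine actually has to wait. With a file of a few tens of KB, the body has probably already arrived by the time the handler reads it, so each read returns immediately and the handler never suspends. With a 4GB file, reads have to wait for the network, and that wait is where the other request gets its turn. The sketch below imitates both cases with plain asyncio; buffered_read and network_read are hypothetical stand-ins for read_chunk, not aiohttp code.

```python
import asyncio
import time


async def buffered_read():
    # Data is already in the buffer: returns at once and never suspends,
    # so the event loop gets no chance to run another task.
    return b"x" * 8192


async def network_read():
    # Data has not arrived yet: the coroutine really suspends while waiting.
    await asyncio.sleep(0.01)
    return b"x" * 8192


async def handler(name, read, log):
    for _ in range(3):
        await read()
        time.sleep(0.02)  # blocking work, as in the demo
        log.append(name)


async def run(read):
    log = []
    await asyncio.gather(handler("1.jpg", read, log),
                         handler("2.jpg", read, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(run(buffered_read)))  # all of 1.jpg, then all of 2.jpg
    print(asyncio.run(run(network_read)))   # entries for the two files interleave
```

In both runs time.sleep still blocks the whole loop while it executes, so the total time does not shrink; only the ordering changes.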