[Django] #35415: Adding content_type to StreamingHttpResponse on Linux causes memory error after streaming around 1GB-2GB of data.


Django

Apr 30, 2024, 5:57:55 AM
to django-...@googlegroups.com
#35415: Adding content_type to StreamingHttpResponse on Linux causes memory error
after streaming around 1GB-2GB of data.
-----------------------------------------+------------------------
Reporter: LouisB12345 | Owner: nobody
Type: Bug | Status: new
Component: HTTP handling | Version: 5.0
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-----------------------------------------+------------------------
This bug took a few days to work out and was extremely annoying.
I'm running Django under ASGI and was trying to stream an on-the-fly zip
file using StreamingHttpResponse. Note: I don't know if this occurs under
WSGI.
I'm developing on Windows, and after I deemed the code to be functional I
tried it on the Linux VM I have set up.
I noticed that the download would fail almost every time. The cause was
that memory usage kept increasing after some time, usually after around
1-2GB had been streamed. After eliminating multiple factors, I came to the
conclusion that this bug occurs when I pass content_type= to
StreamingHttpResponse.

You can replicate the bug on Linux with the code below. If you remove the
content_type it works as expected, but with it the bug occurs.
{{{
import asyncio
import logging
import threading
from os.path import basename

import aiofiles
from django.contrib.auth.mixins import LoginRequiredMixin
from django.http import StreamingHttpResponse
from django.views import View
from guppy import hpy

H = hpy()

LOGGER = logging.getLogger(__name__)


class DownloadSelectedFiles(LoginRequiredMixin, View):
    def get(self, request) -> StreamingHttpResponse:
        file_name = "f.txt"
        response = StreamingHttpResponse(
            file_data(file_name), content_type="application/octet-stream"
        )
        response["Content-Disposition"] = (
            f'attachment; filename="{basename(file_name)}"'
        )
        return response


async def file_data(file_path):
    async with aiofiles.open(file_path, "rb") as f:
        LOGGER.info(
            f"Current threads are {threading.active_count()} "
            f"opening file {file_path}\n{H.heap()}"
        )
        teller = 0
        while chunk := await f.read(65536):
            teller += 1
            # Yield control to the event loop between chunks.
            await asyncio.sleep(0)
            if teller % 1000 == 0:
                LOGGER.info(
                    f"Current threads are {threading.active_count()} "
                    f"yielding chunk nr.{teller}\n{H.heap()}"
                )
            yield chunk
}}}

I have some images of the output of the Logs to show the difference.
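
One way to narrow down where the memory goes is to drain a chunked async generator in a plain asyncio loop, outside Django and the ASGI server, while tracking peak allocations. The following is only a minimal sketch under stated assumptions: it uses the standard-library `tracemalloc` instead of guppy and `asyncio.to_thread` instead of aiofiles, and `read_chunks` and `drain` are hypothetical names that do not appear in the report.

```python
import asyncio
import tracemalloc


async def read_chunks(path, chunk_size=65536):
    """Yield the file at `path` in fixed-size chunks.

    Assumption: the blocking read is pushed to a worker thread with
    asyncio.to_thread, standing in for aiofiles from the report.
    """
    with open(path, "rb") as f:
        while chunk := await asyncio.to_thread(f.read, chunk_size):
            yield chunk


async def drain(path):
    """Consume the generator, discarding each chunk.

    Returns (total_bytes_seen, peak_traced_bytes).
    """
    tracemalloc.start()
    total = 0
    async for chunk in read_chunks(path):
        total += len(chunk)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return total, peak
```

Calling `asyncio.run(drain("f.txt"))` returns the number of bytes read and the peak traced allocation. If the peak stays near the chunk size here but memory still grows when the file is served, the growth is more likely to come from a later stage (response handling, the server, or the client connection) than from the generator itself.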
--
Ticket URL: <https://code.djangoproject.com/ticket/35415>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Apr 30, 2024, 6:10:21 AM
to django-...@googlegroups.com
Comment (by LouisB12345):

I have some images that I want to attach, but for some reason I can't
upload them: according to SpamBayes there is an 80+% chance that they are
spam.
--
Ticket URL: <https://code.djangoproject.com/ticket/35415#comment:1>

Django

Apr 30, 2024, 11:08:50 AM
to django-...@googlegroups.com
Changes (by Sarah Boyce):

* resolution: => needsinfo
* status: new => closed

Comment:

Hi LouisB12345, this looks a little unusual to me as you have a sync view
calling an async function.
Maybe because of the context switching between sync and async it's waiting
for the data to accumulate before sending? What server are you running
here?

I recommend you post on
[https://forum.djangoproject.com/c/users/async-channels/23 the forum],
verify that `StreamingHttpResponse` is being used as expected, and confirm
that Django is at fault here.
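
As background for the sync/async point: in plain Python, calling an async generator function from synchronous code only creates the generator object, and none of its body runs until something iterates it with `async for`. The sketch below illustrates just that language behaviour with hypothetical names (`sync_view`, `consume`) and made-up chunk contents. It does not show how Django or any particular server consumes the iterator.

```python
import asyncio
import inspect


async def file_data():
    """Stand-in async generator producing three small chunks."""
    for i in range(3):
        await asyncio.sleep(0)  # give control back to the event loop
        yield b"chunk-%d" % i


def sync_view():
    """Synchronous function that returns an async generator object.

    No chunk has been produced yet when this returns.
    """
    return file_data()


async def consume(agen):
    """Iterate the async generator to completion inside an event loop."""
    return [chunk async for chunk in agen]
```

Here `inspect.isasyncgen(sync_view())` is true, and the chunks only come into existence once `consume` iterates the object inside a running event loop.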
--
Ticket URL: <https://code.djangoproject.com/ticket/35415#comment:2>

Django

May 1, 2024, 4:48:03 AM
to django-...@googlegroups.com
Comment (by LouisB12345):

Replying to [comment:2 Sarah Boyce]:[[br]][[br]]Hello Sarah,[[br]][[br]]I
know for sure that the data is not accumulating before sending, because
the download starts immediately. If I were not to call an async function,
you would notice the delay and see that it loads the entire file into
memory. Also, this would not explain why the memory error does not happen
when I leave out the content_type.[[br]][[br]]The server I am running is a
'''''Proxmox''''' VM running '''''Debian 12''''' with '''''4 cores''''' and
'''''4GB RAM''''', '''''Intel Core i5-6500T'''''.
--
Ticket URL: <https://code.djangoproject.com/ticket/35415#comment:3>

Django

May 2, 2024, 9:47:36 AM
to django-...@googlegroups.com
Changes (by Natalia Bidart):

* resolution: needsinfo => invalid

Comment:

Hello LouisB12345! Thank you for your report. As Sarah mentioned, the best
course of action at this point is to reach out to the community in the
[https://forum.djangoproject.com/c/internals/async/8 Django Forum (async
category)] to get help debugging your view, since we are not able to
reproduce. See below for the full details of the reproducer that I set up
locally, streaming a 3.3G ISO image, without any memory usage increase or
memory error.

Since the goal of this issue tracker is to track issues about Django
itself, and your issue seems, at first, to be located in your custom code,
I'll be closing this ticket as invalid following the
[https://docs.djangoproject.com/en/dev/internals/contributing/triaging-tickets/#closing-tickets ticket triaging process].
If, after debugging,
you find out that this is indeed a bug in Django, please re-open with the
specific details and please be sure to include a small but complete Django
project to reproduce or a failing test case.

The reproducer I used looks as follows:
* A local Django project (`projectfromrepo`) with an app for this ticket
(`ticket_35415`)
* `uvicorn` installed and serving Django with `python -Wall -m uvicorn
projectfromrepo.asgi:application --reload`
* A views.py with this (slightly simplified) content:
{{{#!python
import aiofiles
import logging
import os
import threading

from django.http import StreamingHttpResponse


logger = logging.getLogger(__name__)


def debug(msg):
    logger.info(msg)
    print(msg)


def file_download(request):
    file_name = "/home/nessita/debian-live-12.2.0-amd64-kde.iso"
    assert os.path.exists(file_name)
    debug(f"Requested {file_name} which stats {os.stat(file_name)=}.")
    response = StreamingHttpResponse(
        file_data(file_name), content_type="application/octet-stream"
    )
    response["Content-Disposition"] = f'attachment; filename="{file_name}"'
    return response


async def file_data(file_path, chunk_size=65536):
    debug(
        f"Current threads are {threading.active_count()} "
        f"opening file {file_path}."
    )
    async with aiofiles.open(file_path, mode="rb") as f:
        teller = 0
        while chunk := await f.read(chunk_size):
            teller += 1
            if teller % 1000 == 0:
                debug(
                    f"Current threads are {threading.active_count()} "
                    f"yielding chunk nr.{teller}."
                )
            yield chunk
}}}
* Included the following in the main urls.py `path("streaming/",
ticket_35415.views.file_download)`
* Visiting http://localhost:8000/streaming/ works without any issues and
the downloaded file matches the hash of the source file. What's printed in
the terminal:
{{{
(djangodev) [nessita@socrates projectfromrepo]$ python -Wall -m uvicorn projectfromrepo.asgi:application --reload
INFO: Will watch for changes in these directories: ['/home/nessita/fellowship/projectfromrepo']
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [435237] using StatReload
Requested /home/nessita/debian-live-12.2.0-amd64-kde.iso which stats os.stat(file_name)=os.stat_result(st_mode=33188, st_ino=8093957, st_dev=66306, st_nlink=1, st_uid=1001, st_gid=1001, st_size=3492741120, st_atime=1698110826, st_mtime=1698112190, st_ctime=1698112191).
Current threads are 2 opening file /home/nessita/debian-live-12.2.0-amd64-kde.iso.
Current threads are 4 yielding chunk nr.1000.
Current threads are 4 yielding chunk nr.2000.
Current threads are 4 yielding chunk nr.3000.
Current threads are 4 yielding chunk nr.4000.
Current threads are 4 yielding chunk nr.5000.
Current threads are 4 yielding chunk nr.6000.
Current threads are 4 yielding chunk nr.7000.
Current threads are 4 yielding chunk nr.8000.
Current threads are 4 yielding chunk nr.9000.
Current threads are 4 yielding chunk nr.10000.
Current threads are 4 yielding chunk nr.11000.
Current threads are 4 yielding chunk nr.12000.
Current threads are 4 yielding chunk nr.13000.
Current threads are 4 yielding chunk nr.14000.
Current threads are 4 yielding chunk nr.15000.
Current threads are 4 yielding chunk nr.16000.
Current threads are 4 yielding chunk nr.17000.
Current threads are 4 yielding chunk nr.18000.
Current threads are 4 yielding chunk nr.19000.
Current threads are 4 yielding chunk nr.20000.
Current threads are 4 yielding chunk nr.21000.
Current threads are 4 yielding chunk nr.22000.
Current threads are 4 yielding chunk nr.23000.
Current threads are 4 yielding chunk nr.24000.
Current threads are 4 yielding chunk nr.25000.
Current threads are 4 yielding chunk nr.26000.
Current threads are 4 yielding chunk nr.27000.
Current threads are 4 yielding chunk nr.28000.
Current threads are 4 yielding chunk nr.29000.
Current threads are 4 yielding chunk nr.30000.
Current threads are 4 yielding chunk nr.31000.
Current threads are 4 yielding chunk nr.32000.
Current threads are 4 yielding chunk nr.33000.
Current threads are 4 yielding chunk nr.34000.
Current threads are 4 yielding chunk nr.35000.
Current threads are 4 yielding chunk nr.36000.
Current threads are 4 yielding chunk nr.37000.
Current threads are 4 yielding chunk nr.38000.
Current threads are 4 yielding chunk nr.39000.
Current threads are 4 yielding chunk nr.40000.
Current threads are 4 yielding chunk nr.41000.
Current threads are 4 yielding chunk nr.42000.
Current threads are 4 yielding chunk nr.43000.
Current threads are 4 yielding chunk nr.44000.
Current threads are 4 yielding chunk nr.45000.
Current threads are 4 yielding chunk nr.46000.
Current threads are 4 yielding chunk nr.47000.
Current threads are 4 yielding chunk nr.48000.
Current threads are 4 yielding chunk nr.49000.
Current threads are 4 yielding chunk nr.50000.
Current threads are 4 yielding chunk nr.51000.
Current threads are 4 yielding chunk nr.52000.
}}}
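
The hash comparison mentioned above can be done without loading either file into memory. This is a minimal sketch, assuming SHA-256 as the digest (the comment does not say which hash was used). The helper names `sha256_of` and `same_content` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(source_path, downloaded_path):
    """True if both files hash to the same SHA-256 digest."""
    return sha256_of(source_path) == sha256_of(downloaded_path)
```

For example, `same_content(source, download)` with the path of the served file and the path of the saved download returns `True` when the two files match.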
--
Ticket URL: <https://code.djangoproject.com/ticket/35415#comment:4>