#35728: Lazily compute assertion messages
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Type: Cleanup/optimization
Status: new | Component: Testing framework
Version: dev | Severity: Normal
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Django’s custom test assertions have some rich failure messages.
Typically, assertions pass, so most computation of these messages is
wasted.
This overhead is quite noticeable for messages based on large strings. For
example, `assertContains` calls `repr()` on the whole response content,
typically thousands of bytes.
To mitigate this, I propose that all custom assert methods pass lazily
computed message objects as unittest’s `msg` argument.
[https://github.com/python/cpython/blob/f95fc4de115ae03d7aa6dece678240df085cb4f6/Lib/unittest/case.py#L755-L774 unittest’s _formatMessage()]
effectively calls `str()` on the given object when displaying, so this
should work well.
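As a rough illustration of the idea, here is a minimal sketch using plain `unittest`. `LazyMessage` and `demo()` are hypothetical names invented for this example, not an existing Django or unittest API, and the message text only imitates what `assertContains` produces.

```python
import unittest


class LazyMessage:
    """Hypothetical helper: builds the message only when str() is called."""

    def __init__(self, build):
        self._build = build  # zero-argument callable returning the message

    def __str__(self):
        return self._build()


def demo():
    case = unittest.TestCase()
    content = b"Apple\n" * 1_000
    calls = []

    def build():
        calls.append(1)  # record each time the message is actually built
        return f"Couldn't find 'Pear' in the following response\n{content!r}"

    # Passing assertion: the message is never built.
    case.assertTrue(b"Apple" in content, LazyMessage(build))
    built_on_pass = len(calls)

    # Failing assertion: unittest formats the message, which triggers str().
    try:
        case.assertTrue(b"Pear" in content, LazyMessage(build))
    except AssertionError as exc:
        failure_text = str(exc)
    return built_on_pass, len(calls), failure_text
```

With this shape, the `repr()` of the response content would only be paid for when an assertion actually fails.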
To measure the overhead, I created this test script, named `benchmark.py`:
{{{
import unittest

from django.conf import settings
from django.http import HttpResponse
from django.test import SimpleTestCase

if not settings.configured:
    settings.configure()


class ExampleTests(SimpleTestCase):
    def test_example(self):
        response = HttpResponse("Apple\n" * 1_000)
        for _ in range(100_000):
            self.assertContains(response, "Apple")


if __name__ == '__main__':
    unittest.main(module='benchmark')
}}}
I ran it under cProfile with:
{{{
$ python -m cProfile -o profile -m benchmark
}}}
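For reference, a saved profile like this can be inspected with the standard library's `pstats` module. The sketch below is self-contained and so profiles a small stand-in workload rather than the benchmark itself. `work()` and `profile_and_summarize()` are hypothetical names for this example.

```python
import cProfile
import io
import os
import pstats
import tempfile


def work():
    # Stand-in workload: repr() of a large bytes object, similar in spirit
    # to what assertContains does when building its message.
    content = b"Apple\n" * 1_000
    return sum(len(repr(content)) for _ in range(1_000))


def profile_and_summarize():
    profiler = cProfile.Profile()
    profiler.enable()
    work()
    profiler.disable()

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "profile")
        profiler.dump_stats(path)  # same file format as `-o profile`
        out = io.StringIO()
        stats = pstats.Stats(path, stream=out)
        # Show the five entries with the highest cumulative time.
        stats.sort_stats("cumulative").print_stats(5)
    return stats.total_calls, out.getvalue()
```

Pointing `pstats.Stats` at the `profile` file written by the command above should give the same kind of summary for the real benchmark.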
I tried it without and with the below patch, which disables the custom
message for `assertContains`:
{{{
diff --git django/test/testcases.py django/test/testcases.py
index cd7e7b45d6..de86cb55ec 100644
--- django/test/testcases.py
+++ django/test/testcases.py
@@ -580,13 +580,13 @@ def _assert_contains(self, response, text, status_code, msg_prefix, html):
             content = b"".join(response.streaming_content)
         else:
             content = response.content
-        content_repr = safe_repr(content)
+        content_repr = ""
         if not isinstance(text, bytes) or html:
             text = str(text)
             content = content.decode(response.charset)
-            text_repr = "'%s'" % text
+            text_repr = ""
         else:
-            text_repr = repr(text)
+            text_repr = ""
         if html:
             content = assert_and_parse_html(
                 self, content, None, "Response's content is not valid HTML:"
@@ -623,10 +623,10 @@ def assertContains(
         else:
             self.assertTrue(
                 real_count != 0,
-                (
-                    f"{msg_prefix}Couldn't find {text_repr} in the following response\n"
-                    f"{content_repr}"
-                ),
+                # (
+                #     f"{msg_prefix}Couldn't find {text_repr} in the following response\n"
+                #     f"{content_repr}"
+                # ),
             )

     def assertNotContains(
}}}
Here are the stats.
Before:
* 2,724,770 function calls in 2.280 seconds
* 2.119 seconds spent in `assertContains` calls
After:
* 2,524,805 function calls in 1.117 seconds
* 0.949 seconds spent in `assertContains` calls
It looks like roughly half of the cost of calling `assertContains` (about
55%: 1.170 of 2.119 seconds) is forming the message.
--
Ticket URL: <https://code.djangoproject.com/ticket/35728>