[Django] #36339: BrokenLinkEmailsMiddleware fires when Referer is invalid and URL redirects

Django

Apr 19, 2025, 6:24:16 PM
to django-...@googlegroups.com
#36339: BrokenLinkEmailsMiddleware fires when Referer is invalid and URL redirects
--------------------------+-----------------------------------------
Reporter: Xeekoo4u | Type: Uncategorized
Status: new | Component: Uncategorized
Version: 5.2 | Severity: Normal
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------+-----------------------------------------
Hi all,

I discovered some behavior in the BrokenLinkEmailsMiddleware that I don't
fully understand. Our website is scraped by bots on a regular basis, and
one of the bots likes to use ''https://my.web.site/wp-admin'' as the
referrer. The bot requests several URLs that do not exist (e.g.
''/wp-content''), which match the ''IGNORABLE_404_URLS'' and are ignored
properly, but it also requests the ''/login'' URL, which returns a 301
redirect to ''/login/''. For some reason, this request results in an email
being sent if the referrer is an invalid URL (even though the requested URL
itself does not return a 404).

It confuses me that I cannot find the part of the code that checks
whether the referrer is valid. I found tickets from almost a decade ago
(https://code.djangoproject.com/ticket/26059,
https://code.djangoproject.com/ticket/25971) that describe a similar
problem, but in their case the referrer was (almost) equal to the URL.

Is there any advice on how I could handle this issue? Why is the referrer
validated at all, and why doesn't the validation respect the
''IGNORABLE_404_URLS''? I'd be glad to remove some email spam :)
--
Ticket URL: <https://code.djangoproject.com/ticket/36339>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Apr 21, 2025, 2:20:38 PM
to django-...@googlegroups.com
#36339: BrokenLinkEmailsMiddleware fires when Referer is invalid and URL redirects
-------------------------------+--------------------------------------
Reporter: Xeekoo4u | Owner: (none)
Type: Bug | Status: closed
Component: HTTP handling | Version: 5.2
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Changes (by Natalia Bidart):

* component: Uncategorized => HTTP handling
* resolution: => invalid
* status: new => closed
* type: Uncategorized => Bug

Comment:

Hello Xeekoo4u, thank you for taking the time to create this ticket. This
report is borderline a support request (which is usually better handled in
the [https://forum.djangoproject.com/c/users/6 Django Forum]), but before
redirecting you there, I invested some time and created a test case
showcasing your scenario. This does not necessarily mean the behavior is a
bug, since:
1. You can customize the behavior for `BrokenLinkEmailsMiddleware` by
providing your own middleware and overriding `is_ignorable_request`.
2. If your project uses `django.middleware.common.CommonMiddleware` (which
I believe it should), any URL that needs the slash appended will
get it appended and the redirect is returned, never hitting the
`BrokenLinkEmailsMiddleware.process_response` method.
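
As a rough sketch of the first option, the check below decides whether a referrer looks like bot noise by matching its path against a list of patterns. The names `IGNORABLE_REFERER_PATHS` and `referer_is_ignorable` are hypothetical and not part of Django; only the standard library is used, so the logic can be tried outside a project.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns, written in the same style as IGNORABLE_404_URLS
# entries. These are illustrative assumptions, not Django settings.
IGNORABLE_REFERER_PATHS = [
    re.compile(r"^/wp-admin"),
    re.compile(r"^/wp-content"),
]


def referer_is_ignorable(referer, patterns=IGNORABLE_REFERER_PATHS):
    """Return True if the path part of the referrer URL matches any pattern."""
    path = urlsplit(referer).path
    return any(pattern.search(path) for pattern in patterns)


print(referer_is_ignorable("https://my.web.site/wp-admin"))
print(referer_is_ignorable("https://my.web.site/login/"))
```

In a project, a subclass of `BrokenLinkEmailsMiddleware` could presumably call such a helper from an overridden `is_ignorable_request(self, request, uri, domain, referer)`, returning `True` on a match and deferring to `super()` otherwise. Check the method signature against the Django version in use before relying on it.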

To illustrate what I mean, I've created three tests. The first one is the
failing test for the scenario that you described:
`test_referer_invalid_url_redirects`. The following two tests,
`test_referer_invalid_url_redirects_full_request` and
`test_referer_invalid_url_redirects_incomplete_middleware`, showcase what
I mean regarding the `CommonMiddleware`:
{{{#!diff
diff --git a/tests/middleware/tests.py b/tests/middleware/tests.py
index 2e796ecfc7..a7f3e703e8 100644
--- a/tests/middleware/tests.py
+++ b/tests/middleware/tests.py
@@ -485,6 +485,33 @@ class BrokenLinkEmailsMiddlewareTest(SimpleTestCase):
         BrokenLinkEmailsMiddleware(self.get_response)(self.req)
         self.assertEqual(len(mail.outbox), 1)
 
+    @override_settings(APPEND_SLASH=True)
+    def test_referer_invalid_url_redirects(self):
+        self.req.path = self.req.path_info = "/login"
+        self.req.META["HTTP_REFERER"] = "https://my.web.site/wp-admin"
+        BrokenLinkEmailsMiddleware(self.get_response)(self.req)
+        self.assertEqual(len(mail.outbox), 0)
+
+    @override_settings(APPEND_SLASH=True, ROOT_URLCONF="middleware.urls")
+    def test_referer_invalid_url_redirects_full_request(self):
+        referer = "https://my.web.site/wp-admin"
+        for url, status_code in [("/slash/", 200), ("/slash", 301)]:
+            with self.subTest(url=url, status_code=status_code):
+                response = self.client.get(url, HTTP_REFERER=referer)
+                self.assertEqual(len(mail.outbox), 0)
+                self.assertEqual(response.status_code, status_code)
+
+    @override_settings(
+        APPEND_SLASH=True,
+        ROOT_URLCONF="middleware.urls",
+        MIDDLEWARE=["django.middleware.common.BrokenLinkEmailsMiddleware"],
+    )
+    def test_referer_invalid_url_redirects_incomplete_middleware(self):
+        referer = "https://my.web.site/wp-admin"
+        response = self.client.get("/slash", HTTP_REFERER=referer)
+        self.assertEqual(len(mail.outbox), 0)
+        self.assertEqual(response.status_code, 301)
+
 
 @override_settings(ROOT_URLCONF="middleware.cond_get_urls")
 class ConditionalGetMiddlewareTest(SimpleTestCase):
}}}

As expected, `test_referer_invalid_url_redirects` and
`test_referer_invalid_url_redirects_incomplete_middleware` fail but
`test_referer_invalid_url_redirects_full_request` passes. Given this, I
will close this ticket as `invalid` since I'm not sure we want to support
the (potentially) niche use case of using `BrokenLinkEmailsMiddleware`
**without** `CommonMiddleware`. OTOH, if your project is using
`CommonMiddleware` properly, and you are still affected by this issue,
please reopen providing a way to reproduce (either a test case, or a
minimal Django test project).
--
Ticket URL: <https://code.djangoproject.com/ticket/36339#comment:1>

Django

Apr 21, 2025, 2:23:26 PM
to django-...@googlegroups.com
Changes (by Natalia Bidart):

* cc: Claude Paroz, Jake Howard, Florian Apolloner, Aymeric Augustin, HARI
KRISHNA KANCHI (added)

Comment:

Adding some folks as CC to get their input.
--
Ticket URL: <https://code.djangoproject.com/ticket/36339#comment:2>

Django

Apr 22, 2025, 4:25:27 AM
to django-...@googlegroups.com
Comment (by Jake Howard):

I also find this issue strange. If you can reliably reproduce the issue,
Xeekoo4u, it'd be great if you could try to diagnose what's going on with
`BrokenLinkEmailsMiddleware` yourself to point us in the right direction.
As Natalia says, the forum might also be a good place to get help on this
before circling back to this issue with some more context. Alternatively,
if you can reproduce the issue in a minimal project, that'd be helpful
too.

The part which makes me think there's something else going on here (rather
than it being merely a bug with `BrokenLinkEmailsMiddleware`) is the `if
response.status_code == 404` on the first line of the middleware, which
ought to avoid any issue like this.
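
To make that reasoning concrete, here is a deliberately simplified model of the guard quoted above. It is not Django's actual source, and `passes_status_guard` is a hypothetical name. It only shows that a 301 response should be filtered out before any referrer logic runs.

```python
def passes_status_guard(status_code):
    """Simplified model of the quoted check: only 404 responses continue."""
    return status_code == 404


# A redirect (301) and a normal page (200) stop at the guard; only a 404
# would go on to the referrer and ignorable-URL checks.
for status in (200, 301, 404):
    print(status, passes_status_guard(status))
```

If an email is still sent for ''/login'', one plausible reading is that the middleware saw a 404 for that request rather than the 301, which would be worth confirming in a minimal project.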
--
Ticket URL: <https://code.djangoproject.com/ticket/36339#comment:3>